Proceedings
Volume: 36 | Article ID: AVM-113
Efficient Fault Tolerant Architecture for Neural Network Compute
DOI: 10.2352/EI.2024.36.17.AVM-113 | Published Online: January 2024
Abstract

With artificial intelligence (AI) becoming the mainstream approach to solving a myriad of problems across industrial, automotive, medical, military, wearable and cloud applications, the need for high-performance, low-power embedded devices is stronger than ever. Innovations in designing efficient hardware accelerators for AI tasks also involve making them fault-tolerant so that they work reliably under stressful environmental conditions. These embedded devices may be deployed under varying thermal and electromagnetic-interference conditions, which requires both the processing blocks and the on-device memories to recover from faults and provide a reliable quality of service. Particularly in the automotive context, ASIL-B compliant AI systems typically implement error-correction code (ECC) that provides single-error correction, double-error detection (SECDED). ASIL-D based AI systems implement dual lock-step compute blocks and build in processing redundancy to reinforce prediction certainty, on top of protecting their memories. Fault-tolerant systems take this one level higher by tripling the processing blocks, so that a fault detected in one processing element is corrected and reinforced by the other two. This adds significant silicon area and makes the solution expensive. In this paper we propose novel techniques that can be applied to a typical deep-learning based embedded solution with many processing stages, such as memory load, matrix multiply, accumulate, activation functions and others, to build a robust fault-tolerant system without linearly tripling the compute area and hence the cost of the solution.
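To make the baseline redundancy schemes mentioned in the abstract concrete, the following is a minimal Python sketch of dual lock-step comparison and triple-redundant majority voting around a single multiply-accumulate stage. It illustrates only the conventional approaches the abstract contrasts against, not the techniques proposed in the paper. All function names (`mac`, `lockstep`, `tmr`, `faulty_mac`) and the injected bit-flip are hypothetical and chosen purely for illustration.

```python
from collections import Counter


def mac(weights, inputs):
    """Multiply-accumulate: dot product of two equal-length integer vectors."""
    return sum(w * x for w, x in zip(weights, inputs))


def faulty_mac(weights, inputs):
    """Hypothetical faulty replica: flips bit 3 of the correct result."""
    return mac(weights, inputs) ^ (1 << 3)


def lockstep(replica_a, replica_b, weights, inputs):
    """Dual lock-step: run two replicas and compare their outputs.

    A mismatch detects a fault but cannot tell which replica is wrong.
    Returns (result of replica_a, whether both replicas agreed).
    """
    a = replica_a(weights, inputs)
    b = replica_b(weights, inputs)
    return a, a == b


def tmr(replicas, weights, inputs):
    """Triple redundancy: majority vote over the outputs of three replicas.

    A single faulty replica is outvoted by the other two.
    """
    results = [replica(weights, inputs) for replica in replicas]
    value, votes = Counter(results).most_common(1)[0]
    if votes < 2:
        raise RuntimeError("no majority: more than one replica disagrees")
    return value


if __name__ == "__main__":
    w, x = [1, 2, 3], [4, 5, 6]
    # Lock-step flags the disagreement but cannot correct it.
    print(lockstep(mac, faulty_mac, w, x))
    # Voting over three replicas masks the single faulty one.
    print(tmr([mac, faulty_mac, mac], w, x))
```

The sketch shows why tripling is costly: every protected stage runs three times in order to mask one fault, which is the linear area growth the abstract aims to avoid.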

  Cite this article 

Shyam Jagannathan, Mihir Mody, Prithvi Shankar, Villarreal Jesse, JuneChul Roh, Kumar Desappan, Deepak Poddar, Pramod Swami, "Efficient Fault Tolerant Architecture for Neural Network Compute" in Electronic Imaging, 2024, pp 113-1 - 113-5, https://doi.org/10.2352/EI.2024.36.17.AVM-113

  Copyright statement 
Copyright © 2024, Society for Imaging Science and Technology
Electronic Imaging
2470-1173
Society for Imaging Science and Technology
IS&T 7003 Kilworth Lane, Springfield, VA 22151 USA