As DNNs become common in mission-critical applications, ensuring their reliable operation has become crucial. Conventional resilience techniques do not account for the unique characteristics of DNN algorithms and accelerators, so they are infeasible or ineffective for them.
Our paper https://www.researchgate.net/public...eliability_of_DNN_Algorithms_and_Accelerators surveys techniques for studying and optimizing the reliability of DNN accelerators and architectures. The reliability issues we cover include soft/hard errors arising from process variation, voltage scaling, timing errors, DRAM errors due to refresh rate scaling and thermal effects, etc. The survey reviews ~80 papers and has been accepted in the Journal of Systems Architecture, 2019.