Tensor Processing Unit Reliability Dependence on Temperature and Radiation Source

Pablo R. Bodmann, Paolo Rech
DOI: https://doi.org/10.1109/tns.2024.3359524
IF: 1.703
2024-01-01
IEEE Transactions on Nuclear Science
Abstract: Several applications that need autonomous capabilities, such as unmanned probes used for space exploration, demand high reliability but impose strict power budget limits. To enable autonomy, Convolutional Neural Networks (CNNs) are heavily employed to detect objects in images or frames. Recently, to adapt the computing power to the application needs, the market has made available several low-power and low-cost Commercial-Off-The-Shelf (COTS) computing solutions, called EdgeAI accelerators, that are very attractive for executing neural networks with limited power requirements. Since autonomous vehicles may operate over a wide range of temperatures, it is fundamental to understand how the error rate of EdgeAI devices depends on temperature. In this work, we consider Google’s Coral Edge Tensor Processing Unit (TPU) and measure its reliability under atmospheric neutrons at temperatures ranging from -40°C to +90°C. We show a decrease in the FIT rate of almost 4× as temperature increases. Moreover, we compare the thermal neutron and heavy-ion cross-sections of the TPU at room temperature. We report a difference of up to ~31× between the atmospheric and thermal neutron cross-sections of the TPU. The TPU’s cross-section for heavy ions is ~20× higher than for atmospheric neutrons and ~187× higher than for thermal neutrons.
engineering, electrical & electronic; nuclear science & technology
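
The abstract reports results as cross-sections and FIT rates. As a rough illustration of how these two quantities are usually related in radiation testing, the sketch below computes a cross-section from an error count and a particle fluence, then scales it to a FIT rate (failures per 10^9 device-hours). The function names, the error count, and the fluence are hypothetical placeholders, not values from the paper, and the reference flux of 13 n/(cm²·h) is an assumed, commonly cited sea-level figure rather than the flux the authors used.

```python
# Illustrative sketch only: names and inputs are hypothetical, not taken from the paper.

HOURS_PER_FIT = 1e9      # one FIT is one failure per 10^9 device-hours
REFERENCE_FLUX = 13.0    # n/(cm^2*h); assumed sea-level reference neutron flux


def cross_section(errors: int, fluence: float) -> float:
    """Return the cross-section in cm^2: observed errors per unit fluence (particles/cm^2)."""
    if fluence <= 0:
        raise ValueError("fluence must be positive")
    return errors / fluence


def fit_rate(sigma: float, flux: float = REFERENCE_FLUX) -> float:
    """Scale a cross-section (cm^2) to a FIT rate at the given flux in particles/(cm^2*h)."""
    return sigma * flux * HOURS_PER_FIT


def ratio(sigma_a: float, sigma_b: float) -> float:
    """Return how many times larger sigma_a is than sigma_b."""
    return sigma_a / sigma_b


if __name__ == "__main__":
    # Placeholder beam-test numbers chosen only to exercise the functions.
    sigma = cross_section(errors=50, fluence=1e10)
    print(f"cross-section: {sigma:.3e} cm^2")
    print(f"FIT rate at reference flux: {fit_rate(sigma):.1f}")
```

Ratios such as the ~31×, ~20×, and ~187× figures quoted in the abstract correspond to `ratio` applied to cross-sections measured with different radiation sources.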
What problem does this paper attempt to address?