Neural Networks on Chip: from CMOS Accelerators to In-Memory-Computing

Yu Wang
DOI: https://doi.org/10.1109/socc.2018.8618496
2018-01-01
Abstract: Artificial neural networks, which dominate artificial intelligence applications such as object recognition and speech recognition, are still evolving. To apply neural networks to a wider range of applications, customized hardware is necessary because CPUs and GPUs are not efficient enough. Numerous architectures have been proposed in the past 4 years to boost the energy efficiency of deep learning inference, including efforts by Tsinghua and Deephi. In this talk, we discuss different architectures based on CMOS technologies, including 200 GOPS/W FPGA accelerators, chips with DDR subsystems at about 1-5 TOPS/W, and chips with everything on chip at over 50 TOPS/W. The possibilities and trends of adopting emerging NVM technology for efficient learning systems, i.e., in-memory computing, will also be discussed as one of the most promising ways to improve energy efficiency.
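
To put the efficiency figures quoted in the abstract on a common scale, the sketch below converts operations per second per watt into energy per operation. It relies only on the identity 1 W = 1 J/s, so OPS/W equals operations per joule. The function name `picojoules_per_op` and the labels are illustrative assumptions, not taken from the talk, and what counts as one "operation" (for example, whether a multiply-accumulate is one or two operations) is not stated in the abstract.

```python
def picojoules_per_op(ops_per_second_per_watt: float) -> float:
    """Energy per operation in picojoules.

    1 W = 1 J/s, so (ops/s)/W is simply ops/J; invert and scale to pJ.
    """
    return 1e12 / ops_per_second_per_watt


GIGA = 1e9
TERA = 1e12

# Efficiency figures as quoted in the abstract; labels are illustrative.
figures = {
    "FPGA accelerator, 200 GOPS/W": 200 * GIGA,
    "chip with DDR subsystem, 1 TOPS/W (low end)": 1 * TERA,
    "chip with DDR subsystem, 5 TOPS/W (high end)": 5 * TERA,
    "everything on chip, 50 TOPS/W (quoted lower bound)": 50 * TERA,
}

for label, efficiency in figures.items():
    print(f"{label}: {picojoules_per_op(efficiency):.2f} pJ/op")
```

Under these assumptions the quoted figures span roughly 5 pJ per operation for the FPGA designs down to 0.02 pJ or less for the fully on-chip designs, a gap of about 250x.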