Abstract: With the proliferation of ultrahigh-speed mobile networks and internet-connected devices, along with the rise of artificial intelligence (AI)<a href="/articles/s41586-020-03070-1#ref-CR1">1</a>, the world is generating exponentially increasing amounts of data that need to be processed in a fast and efficient way. Highly parallelized, fast and scalable hardware is therefore becoming progressively more important<a href="/articles/s41586-020-03070-1#ref-CR2">2</a>. Here we demonstrate a computationally specific integrated photonic hardware accelerator (tensor core) that is capable of operating at speeds of trillions of multiply-accumulate operations per second (10^12 MAC operations per second or tera-MACs per second). The tensor core can be considered as the optical analogue of an application-specific integrated circuit (ASIC). It achieves parallelized photonic in-memory computing using phase-change-material memory arrays and photonic chip-based optical frequency combs (soliton microcombs<a href="/articles/s41586-020-03070-1#ref-CR3">3</a>). The computation is reduced to measuring the optical transmission of reconfigurable and non-resonant passive components and can operate at a bandwidth exceeding 14 gigahertz, limited only by the speed of the modulators and photodetectors. Given recent advances in hybrid integration of soliton microcombs at microwave line rates<a href="#ref-CR3">3</a>,<a href="#ref-CR4">4</a>,<a href="/articles/s41586-020-03070-1#ref-CR5">5</a>, ultralow-loss silicon nitride waveguides<a href="/articles/s41586-020-03070-1#ref-CR6">6</a>,<a href="/articles/s41586-020-03070-1#ref-CR7">7</a>, and high-speed on-chip detectors and modulators, our approach provides a path towards full complementary metal–oxide–semiconductor (CMOS) wafer-scale integration of the photonic tensor core.
Although we focus on convolutional processing, more generally our results indicate the potential of integrated photonics for parallel, fast, and efficient computational hardware in data-heavy AI applications such as autonomous driving, live video processing, and next-generation cloud computing services.
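
As a purely numerical illustration of why convolutional processing maps naturally onto a fixed matrix of stored weights, the sketch below rewrites a small 2D convolution (in the cross-correlation sense used in neural networks) as a matrix multiplication and counts the multiply-accumulate (MAC) operations involved. It does not model the optics, the phase-change memory cells, or the frequency comb. The function names (`im2col`, `conv_as_matmul`), the 4×4 image, and the two 2×2 kernels are hypothetical and chosen only for this example.

```python
def im2col(image, k):
    """Unroll every k x k patch of a 2D image (list of rows) into one flat row."""
    rows, cols = len(image), len(image[0])
    patches = []
    for r in range(rows - k + 1):
        for c in range(cols - k + 1):
            patches.append([image[r + i][c + j] for i in range(k) for j in range(k)])
    return patches


def conv_as_matmul(image, kernels):
    """Apply several k x k kernels to an image as one patches-by-weights product.

    Each flattened kernel acts as one column of a fixed weight matrix.
    Returns (outputs, macs), where outputs[p][n] is kernel n applied to patch p
    and macs is the total number of multiply-accumulate operations performed.
    """
    k = len(kernels[0])
    weights = [[w for row in kern for w in row] for kern in kernels]
    outputs = []
    macs = 0
    for patch in im2col(image, k):
        out_row = []
        for w in weights:
            acc = 0.0
            for x, wv in zip(patch, w):
                acc += x * wv  # one multiply-accumulate
                macs += 1
            out_row.append(acc)
        outputs.append(out_row)
    return outputs, macs


# Illustrative data only (not taken from the paper).
image = [[1, 2, 3, 0],
         [4, 5, 6, 1],
         [7, 8, 9, 2],
         [0, 1, 2, 3]]
kernels = [
    [[1, 0], [0, 1]],  # sums the main diagonal of each patch
    [[0, 1], [1, 0]],  # sums the anti-diagonal of each patch
]
outputs, macs = conv_as_matmul(image, kernels)
print(len(outputs), "patches,", macs, "MAC operations")
```

In this toy case the weight matrix stays fixed while the image patches stream through it, which is the property that makes an in-memory approach attractive: the weights are written once and only the inputs change.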

ChatGPT at the Speed of Light: Optical Comb-Based Monolithic Photonic-Electronic Linear-Algebra Accelerators

Monolithic Silicon-Photonics Linear-Algebra Accelerators Enabling Next-Gen Massive MIMO

Parallel Photonic Acceleration Processor for Matrix-Matrix Multiplication

Parallel convolutional processing using an integrated photonic tensor core

Photonic-Electronic Integrated Circuits for High-Performance Computing and AI Accelerators

Silicon Photonic Network-on-chip and Enabling Components

A Parallel Photonic Chip for Nano-Second End-to-end Image Processing, Transmission, and Reconstruction

High-coherence parallelization in integrated photonics

Scalable and Versatile Linear Computation with Minimalistic Photonic Matrix Processor

PIXEL: Photonic Neural Network Accelerator

Photonic matrix multiplication lights up photonic accelerator and beyond

Parallel Photonic Convolutional Processing On-Chip with Cross-Connect Architecture and Cyclic AWGs

Scaling Up Silicon Photonic-based Accelerators: Challenges and Opportunities

Cross-Layer Design for AI Acceleration with Non-Coherent Optical Computing

Microcomb-based integrated photonic processing unit

Artificial intelligence accelerator using photonic computing

Integrated Photonic FFT for Optical Convolutions towards Efficient and High-Speed Neural Networks

All-analog photoelectronic chip for high-speed vision tasks

Emerging Devices and Packaging Strategies for Electronic-Photonic AI Accelerators

Photonic optical accelerators: The future engine for the era of modern AI?

High Throughput Multi-Channel Parallelized Diffraction Convolutional Neural Network Accelerator