Towards real-time photorealistic 3D holography with deep neural networks

Liang Shi, Beichen Li, Changil Kim, Petr Kellnhofer, Wojciech Matusik
DOI: https://doi.org/10.1038/s41586-020-03152-0
IF: 64.8
2021-03-10
Nature
Abstract: The ability to present three-dimensional (3D) scenes with continuous depth sensation has a profound impact on virtual and augmented reality, human–computer interaction, education and training. Computer-generated holography (CGH) enables high-spatio-angular-resolution 3D projection via numerical simulation of diffraction and interference [1]. Yet, existing physically based methods fail to produce holograms with both per-pixel focal control and accurate occlusion [2,3]. The computationally taxing Fresnel diffraction simulation further places an explicit trade-off between image quality and runtime, making dynamic holography impractical [4]. Here we demonstrate a deep-learning-based CGH pipeline capable of synthesizing a photorealistic colour 3D hologram from a single RGB-depth image in real time. Our convolutional neural network (CNN) is extremely memory efficient (below 620 kilobytes) and runs at 60 hertz for a resolution of 1,920 × 1,080 pixels on a single consumer-grade graphics processing unit. Leveraging low-power on-device artificial intelligence acceleration chips, our CNN also runs interactively on mobile (iPhone 11 Pro at 1.1 hertz) and edge (Google Edge TPU at 2.0 hertz) devices, promising real-time performance in future-generation virtual and augmented-reality mobile headsets. We enable this pipeline by introducing a large-scale CGH dataset (MIT-CGH-4K) with 4,000 pairs of RGB-depth images and corresponding 3D holograms. Our CNN is trained with differentiable wave-based loss functions [5] and physically approximates Fresnel diffraction. With an anti-aliasing phase-only encoding method, we experimentally demonstrate speckle-free, natural-looking, high-resolution 3D holograms. Our learning-based approach and the Fresnel hologram dataset will help to unlock the full potential of holography and enable applications in metasurface design [6,7], optical and acoustic tweezer-based microscopic manipulation [8-10], holographic microscopy [11] and single-exposure volumetric 3D printing [12,13].
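For orientation, the Fresnel diffraction step that the abstract describes as computationally taxing can be sketched with the classical FFT-based transfer-function method. This is a minimal illustration of the physics only, not the authors' CNN pipeline; the function name `fresnel_propagate` and the wavelength, pixel pitch and propagation distance used below are assumptions chosen for the example, not values taken from the paper.

```python
import numpy as np


def fresnel_propagate(field, wavelength, pitch, z):
    """Propagate a complex 2D field by distance z (metres) using the
    Fresnel transfer function. Hypothetical helper for illustration."""
    ny, nx = field.shape
    # Spatial frequencies (cycles per metre) on the FFT grid.
    fx = np.fft.fftfreq(nx, d=pitch)
    fy = np.fft.fftfreq(ny, d=pitch)
    FX, FY = np.meshgrid(fx, fy)
    k = 2.0 * np.pi / wavelength
    # Paraxial (Fresnel) transfer function; it has unit modulus,
    # so propagation conserves total energy.
    H = np.exp(1j * k * z) * np.exp(-1j * np.pi * wavelength * z * (FX**2 + FY**2))
    return np.fft.ifft2(np.fft.fft2(field) * H)


# Assumed example values: a single point emitter, green light,
# 8 micrometre pixels, 5 mm propagation distance.
point = np.zeros((32, 32), dtype=complex)
point[16, 16] = 1.0
hologram_plane = fresnel_propagate(point, 520e-9, 8e-6, 5e-3)
```

Each call costs two 2D FFTs over the whole frame, and a physically based hologram for a scene with many depth levels needs many such evaluations, which gives a rough sense of why a learned approximation is attractive for real-time use.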
multidisciplinary sciences
What problem does this paper attempt to address?