Abstract: Point cloud frame interpolation is a challenging task that requires accurate scene flow estimation across frames while preserving geometric structure. Prevailing techniques often rely on pre-trained motion estimators or intensive test-time optimization, resulting in compromised interpolation accuracy or prolonged inference. This work presents FastPCI, which introduces a Pyramid Convolution-Transformer architecture for point cloud frame interpolation. The hybrid Convolution-Transformer improves local and long-range feature learning, while the pyramid network provides multilevel features and reduces computation. In addition, FastPCI proposes a Dual-Direction Motion-Structure block for more accurate scene flow estimation. The design is motivated by two facts: (1) accurate scene flow preserves 3D structure, and (2) the point cloud at the previous timestep should be reconstructable using reverse motion from the future timestep. Extensive experiments show that FastPCI significantly outperforms the state-of-the-art PointINet and NeuralPCI (e.g., 26.6% and 18.3% reductions in Chamfer Distance on KITTI), while being more than 10x and 600x faster, respectively. Code is available at https://github.com/genuszty/FastPCI
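
The second motivating fact in the abstract (reverse motion from the later frame should reconstruct the earlier one) can be illustrated with a toy example. This is only a minimal sketch under strong assumptions: per-point flow is given rather than estimated by a network, motion is linear between frames, and the helper `warp` and all variable names are hypothetical, not taken from the FastPCI code.

```python
def warp(points, flow, scale=1.0):
    """Move every point along its own flow vector, scaled by `scale`."""
    return [
        tuple(p + scale * f for p, f in zip(point, vec))
        for point, vec in zip(points, flow)
    ]


# Toy frame at t=0 and an assumed forward flow (a rigid translation).
frame0 = [(0.0, 0.0, 0.0), (1.0, 2.0, 0.0), (0.0, 1.0, 1.0)]
forward_flow = [(1.0, 0.0, 0.5)] * len(frame0)

# Frame at t=1 and the matching reverse (backward) flow.
frame1 = warp(frame0, forward_flow)
backward_flow = [tuple(-c for c in vec) for vec in forward_flow]

# Intermediate frame at time t, reached from either direction
# under the linear-motion assumption.
t = 0.25
mid_from_0 = warp(frame0, forward_flow, t)
mid_from_1 = warp(frame1, backward_flow, 1.0 - t)

# Reverse motion applied to the later frame should recover the earlier one.
recon0 = warp(frame1, backward_flow)
```

With exact flow the two intermediate estimates coincide and `recon0` matches `frame0`; with estimated flow, the gap between them is the kind of signal a bidirectional constraint could penalize.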

What problem does this paper attempt to address?

### Problems Addressed by the Paper

The paper addresses several key challenges in Point Cloud Frame Interpolation (PCI):

1. **Accurate Scene Flow Estimation**: PCI requires accurate estimation of scene flow between frames so that the generated intermediate frames maintain the geometric structure of objects.
2. **Preservation of Geometric Structure**: Interpolated frames must retain the 3D structure of the scene, yet existing methods rely on pre-trained motion estimators or intensive test-time optimization, which reduces interpolation accuracy or prolongs inference.
3. **Efficient Real-Time Processing**: In fields such as autonomous driving, virtual/augmented reality, and robotics, real-time processing of point cloud data is crucial, so the interpolation method must be both efficient and accurate.

### Solution

To address these challenges, the paper proposes FastPCI, a fast point cloud frame interpolation method based on a pyramid convolution-transformer architecture. The main contributions of FastPCI include:

1. **Dual-Direction (Bidirectional) Motion-Structure Transformer Block**: Estimates motion in a structure-aware manner by mixing information from forward and backward point features.
2. **Pyramid Convolution-Transformer Architecture**: Combines the advantages of convolution and transformers to achieve fast and accurate point cloud frame interpolation.
3. **Optimized Loss Function**: Introduces a reconstruction loss, a multi-scale loss, and a bidirectional loss to further enhance performance.

### Experimental Results

The paper conducts extensive experiments on three large-scale autonomous driving datasets (KITTI, Argoverse 2, and nuScenes), showing that FastPCI significantly outperforms existing state-of-the-art methods (such as PointINet and NeuralPCI) in both accuracy and speed. Specifically:

- **Accuracy**: On the KITTI dataset, FastPCI reduces Chamfer Distance by 26.6% relative to PointINet and by 18.3% relative to NeuralPCI.
- **Speed**: FastPCI's inference is more than 10 times faster than PointINet and more than 600 times faster than NeuralPCI.

### Conclusion

By introducing the bidirectional motion-structure transformer block and the pyramid convolution-transformer architecture, FastPCI addresses both the accuracy and the efficiency issues in point cloud frame interpolation, making it suitable for real-time applications.
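
For readers unfamiliar with the accuracy metric quoted above, the following is a minimal sketch of Chamfer Distance and of how a percentage reduction is computed. It assumes one common convention (mean nearest-neighbour Euclidean distance, summed over both directions); the paper's exact variant may differ, for example by using squared distances. The function names and the numbers in the usage lines are hypothetical and are not results from the paper.

```python
import math


def chamfer_distance(a, b):
    """Symmetric Chamfer Distance between two point sets (brute force)."""

    def one_way(src, dst):
        # Mean distance from each source point to its nearest target point.
        return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)

    return one_way(a, b) + one_way(b, a)


def relative_reduction(baseline, ours):
    """Percentage by which `ours` is lower than `baseline`."""
    return 100.0 * (baseline - ours) / baseline


# Toy point sets, for illustration only.
predicted = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
ground_truth = [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0)]
cd = chamfer_distance(predicted, ground_truth)

# Hypothetical baseline and improved errors, not values from the paper.
reduction = relative_reduction(2.0, 1.5)
```

The brute-force search is quadratic in the number of points, so a spatial index such as a k-d tree would be the usual choice for full LiDAR frames.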

FastPCI: Motion-Structure Guided Fast Point Cloud Frame Interpolation

A Multiscale-Contour-based Interpolation Framework for Generating a Time-Varying Quasi-Dense Point Cloud Sequence

NeuralPCI: Spatio-temporal Neural Field for 3D Point Cloud Multi-frame Non-linear Interpolation

Fast Point Cloud Sampling Network

Frame Interpolation Using Phase and Amplitude Feature Pyramids

FINet: Fast Point Cloud Interpolation Network Via Distance Transform

Dynamic Point Cloud Interpolation

High-quality and real-time frame interpolation on heterogeneous computing system

SPINet: Self-Supervised Point Cloud Frame Interpolation Network

A Point Cloud Video Recognition Acceleration Framework Based on Tempo-Spatial Information

Fast Point Cloud Geometry Compression with Context-based Residual Coding and INR-based Refinement

NeuroGauss4D-PCI: 4D Neural Fields and Gaussian Deformation Fields for Point Cloud Interpolation

FusionArch: A Fusion-Based Accelerator for Point-Based Point Cloud Neural Networks

A Point Transformer Accelerator with Fine-Grained Pipelines and Distribution-Aware Dynamic FPS

RAI-Net: Range-Adaptive LiDAR Point Cloud Frame Interpolation Network

IC-FPS: Instance-Centroid Faster Point Sampling Module for 3D Point-base Object Detection

Pseudo-LiDAR Point Cloud Interpolation Based on 3D Motion Representation and Spatial Supervision

Feature Interpolation Convolution for Point Cloud Analysis

Inter-Frame Compression for Dynamic Point Cloud Geometry Coding

Bi-Directional Inter-Prediction For Geometry-Based Point Cloud Compression

Learning Dynamic Point Cloud Compression via Hierarchical Inter-frame Block Matching