BTD-RF: 3D scene reconstruction using block-term tensor decomposition

Kim, Seon Bin; Kim, Sangwon; Ahn, Dasom
DOI: https://doi.org/10.1007/s10489-024-05476-0
IF: 5.3
2024-05-10
Applied Intelligence
Abstract: The Neural Radiance Field (NeRF) performs well on view synthesis tasks, but it requires a large amount of memory and many model parameters during three-dimensional (3D) scene reconstruction. This paper proposes the block-term tensor decomposition radiance field (BTD-RF), a novel approach that achieves significant model compression while preserving reconstruction quality. BTD-RF decomposes high-dimensional radiance fields into low-dimensional tensor blocks, resulting in a model 2.21 times smaller than the baseline method. Decomposing the model into low-dimensional tensor blocks also allows the standard multi-head attention of transformers to be replaced with a lightweight multi-linear attention mechanism that uses element-wise products and shared parameters. This significantly reduces model complexity without compromising performance. Extensive evaluations on various datasets demonstrate that BTD-RF achieves superior image reconstruction quality compared to prior methods. Quantitative metrics and qualitative assessments confirm that BTD-RF generates images that are structurally and perceptually close to ground truth, despite its lightweight design. BTD-RF thus offers a compelling trade-off between model size and reconstruction quality for 3D scene reconstruction. Its efficient design makes it suitable for resource-constrained applications while delivering high-fidelity results, paving the way for broader NeRF utilization. The code is available at https://github.com/seonbin-kim/BTDRF
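As a rough illustration of the decomposition idea named in the abstract, the sketch below builds a 3D tensor as a sum of small Tucker-style blocks (a generic block-term decomposition) and compares parameter counts with the dense tensor. It is not the authors' implementation: the function names `btd_reconstruct` and `btd_param_count`, the grid size, the number of blocks, and the per-mode rank are all hypothetical values chosen for the example, and the resulting ratio has no relation to the 2.21 figure reported in the paper.

```python
import numpy as np

def btd_reconstruct(cores, factors):
    """Rebuild a dense 3D tensor from a block-term decomposition.

    Each block is a small core G of shape (l, m, n) multiplied along its
    three modes by factor matrices A (I, l), B (J, m), C (K, n).
    The full tensor is the sum of all blocks.
    """
    return sum(
        np.einsum("lmn,il,jm,kn->ijk", G, A, B, C)
        for G, (A, B, C) in zip(cores, factors)
    )

def btd_param_count(cores, factors):
    """Number of stored values in the decomposed representation."""
    core_params = sum(G.size for G in cores)
    factor_params = sum(M.size for block in factors for M in block)
    return core_params + factor_params

# Hypothetical sizes, chosen only for illustration.
I, J, K = 32, 32, 32   # dense grid resolution
R, rank = 4, 3         # number of blocks and per-mode rank of each block

rng = np.random.default_rng(0)
cores = [rng.standard_normal((rank, rank, rank)) for _ in range(R)]
factors = [
    tuple(rng.standard_normal((dim, rank)) for dim in (I, J, K))
    for _ in range(R)
]

dense = btd_reconstruct(cores, factors)
dense_params = dense.size                        # values in the dense grid
compact_params = btd_param_count(cores, factors) # values actually stored
print(dense.shape, dense_params, compact_params)
```

The storage saving comes from keeping only the small cores and thin factor matrices; the dense grid is materialized here only to show that the two representations describe the same tensor.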
computer science, artificial intelligence