Abstract:
Introduction: In the treatment of malocclusion, continuous monitoring of the three-dimensional relationship between dental roots and the surrounding alveolar bone is essential for preventing complications from orthodontic procedures. Cone-beam computed tomography (CBCT) provides detailed root and bone data, but its high radiation dose limits frequent use, so an alternative is needed for ongoing monitoring.
Objectives: We aimed to develop a deep learning-based cross-temporal multimodal image fusion system that acquires root and jawbone information without additional radiation, enhancing the ability of orthodontists to monitor risk.
Methods: Using CBCT and intraoral scans (IOSs) as cross-temporal modalities, we integrated deep learning with multimodal fusion technologies to develop a system that includes a CBCT segmentation model for teeth and jawbones. This model incorporates a dynamic kernel prior model, resolution restoration, and an IOS segmentation network optimized for dense point clouds. Additionally, a coarse-to-fine registration module was developed. The system integrates IOS and CBCT images across varying spatial and temporal dimensions, enabling comprehensive reconstruction of root and jawbone information throughout the orthodontic treatment process.
Results: The experimental results demonstrate that our system maintains the original high resolution and delivers strong segmentation performance on external testing datasets for both CBCT and IOSs. On CBCT, it achieved Dice coefficients of 94.1% for teeth and 94.4% for jawbones; on IOSs, it achieved a Dice coefficient of 91.7%. In real-world registration, the system achieved an average distance error (ADE) of 0.43 mm for teeth and 0.52 mm for jawbones while significantly reducing the processing time.
Conclusion: We developed the first deep learning-based cross-temporal multimodal fusion system, addressing the critical challenge of continuous risk monitoring in orthodontic treatments without additional radiation exposure. We hope that this study will catalyze transformative advancements in risk management strategies and treatment modalities, fundamentally reshaping the landscape of future orthodontic practice.
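
For readers unfamiliar with the two metrics reported in the Results, the following is a minimal sketch of how a Dice coefficient and an average distance error could be computed. It is not the authors' evaluation code. The function names `dice_coefficient` and `average_distance_error` are hypothetical, and the ADE definition used here (mean nearest-neighbour distance from registered source points to target points) is an assumption, since the abstract does not define it.

```python
import math


def dice_coefficient(pred, target):
    """Dice = 2 * |A and B| / (|A| + |B|) for two flat binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total


def average_distance_error(source_pts, target_pts):
    """Mean distance from each source point to its nearest target point.

    Assumed definition of ADE; brute force, so only suitable for small inputs.
    """
    if not source_pts or not target_pts:
        raise ValueError("both point sets must be non-empty")
    nearest = [min(math.dist(s, t) for t in target_pts) for s in source_pts]
    return sum(nearest) / len(nearest)


if __name__ == "__main__":
    # Toy data, not taken from the study.
    print(dice_coefficient([1, 1, 0, 0], [1, 0, 0, 0]))
    print(average_distance_error(
        [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
        [(0.0, 0.0, 0.5), (1.0, 0.0, 0.0)],
    ))
```

With coordinates expressed in millimetres, the second function returns a value in millimetres, which is the unit used for the reported ADE.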

A cross-temporal multimodal fusion system based on deep learning for orthodontic monitoring