MRgRT real-time target localization using foundation models for contour point tracking and promptable mask refinement
Tom Julius Blöcker, Elia Lombardo, Sebastian Marschner, Claus Belka, Stefanie Corradini, Miguel A Palacios, Marco Riboldi, Christopher Kurz, Guillaume Landry
DOI: https://doi.org/10.1088/1361-6560/ad9dad
IF: 3.5
2024-12-13
Physics in Medicine and Biology
Abstract: Objective: This study aimed to evaluate two real-time target tracking approaches for magnetic resonance imaging (MRI) guided radiotherapy (MRgRT) based on foundation artificial intelligence (AI) models.
Approach: The first approach used a point-tracking model that propagates points from a reference contour. The second approach used a video-object-segmentation model, based on Segment Anything Model 2 (SAM2). Both approaches were evaluated and compared against each other, inter-observer variability, and a transformer-based image registration model, TransMorph, with and without patient-specific (PS) fine-tuning. The evaluation was carried out on 2D cine MRI datasets from two institutions, containing scans from 33 patients with 8060 labeled frames, with annotations from 2 to 5 observers per frame, totaling 29179 ground truth segmentations. The segmentations produced were assessed using the Dice similarity coefficient (DSC), 50% and 95% Hausdorff distances (HD50 / HD95), and the Euclidean center distance (ECD).
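The metrics named above can be illustrated with a minimal NumPy sketch for a pair of 2D binary masks. This is not the authors' evaluation code: the function names, the `spacing` parameter (assumed pixel size in mm), the 4-neighbour boundary definition, and the symmetric percentile form of the Hausdorff distance are all assumptions made for illustration, and both masks are assumed to be non-empty.

```python
import numpy as np


def dice(a, b):
    """Dice similarity coefficient between two binary masks."""
    a, b = a.astype(bool), b.astype(bool)
    total = a.sum() + b.sum()
    if total == 0:
        return 1.0  # convention chosen here: two empty masks agree perfectly
    return 2.0 * np.logical_and(a, b).sum() / total


def center_distance(a, b, spacing=(1.0, 1.0)):
    """Euclidean distance between mask centroids, in units of `spacing`."""
    sp = np.asarray(spacing, dtype=float)
    ca = np.argwhere(a.astype(bool)).mean(axis=0) * sp
    cb = np.argwhere(b.astype(bool)).mean(axis=0) * sp
    return float(np.linalg.norm(ca - cb))


def boundary_points(mask, spacing=(1.0, 1.0)):
    """Coordinates of mask pixels that have at least one background 4-neighbour."""
    m = mask.astype(bool)
    p = np.pad(m, 1, constant_values=False)
    interior = (
        p[1:-1, 1:-1] & p[:-2, 1:-1] & p[2:, 1:-1] & p[1:-1, :-2] & p[1:-1, 2:]
    )
    return np.argwhere(m & ~interior) * np.asarray(spacing, dtype=float)


def hausdorff_percentile(a, b, q, spacing=(1.0, 1.0)):
    """Symmetric q-th percentile Hausdorff distance between mask boundaries.

    q=50 and q=95 correspond to HD50 and HD95 under this (assumed) definition.
    """
    pa = boundary_points(a, spacing)
    pb = boundary_points(b, spacing)
    # Pairwise distances between all boundary points of a and b.
    d = np.linalg.norm(pa[:, None, :] - pb[None, :, :], axis=2)
    d_ab = d.min(axis=1)  # each point of a to its nearest point of b
    d_ba = d.min(axis=0)  # each point of b to its nearest point of a
    return float(max(np.percentile(d_ab, q), np.percentile(d_ba, q)))


# Toy example: a 4x4 square and the same square shifted by one pixel.
ref = np.zeros((10, 10), dtype=bool)
ref[2:6, 2:6] = True
pred = np.zeros((10, 10), dtype=bool)
pred[2:6, 3:7] = True

print(dice(ref, pred))
print(center_distance(ref, pred, spacing=(1.0, 1.0)))
print(hausdorff_percentile(ref, pred, 95))
```

Percentile Hausdorff distances are defined in slightly different ways across toolkits (for example, taking the percentile over the pooled distances rather than per direction), so values from this sketch are not guaranteed to match those of any particular evaluation pipeline.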
Main results: The results showed that the contour tracking (median DSC 0.92 ± 0.04 and ECD 1.9 ± 1.0 mm) and SAM2-based (median DSC 0.93 ± 0.03 and ECD 1.6 ± 1.1 mm) approaches produced target segmentations comparable or superior to TransMorph without PS fine-tuning (median DSC 0.91 ± 0.07 and ECD 2.6 ± 1.4 mm) and slightly inferior to TransMorph with PS fine-tuning (median DSC 0.94 ± 0.03 and ECD 1.4 ± 0.8 mm). Between the two novel approaches, the one based on SAM2 performed marginally better at a higher computational cost (inference times 92 ms for contour tracking and 109 ms for SAM2). Both approaches and TransMorph with PS fine-tuning exceeded inter-observer variability (median DSC 0.90 ± 0.06 and ECD 1.7 ± 0.7 mm).
Significance: This study demonstrates the potential of foundation models to achieve high-quality real-time target tracking in MRgRT, offering performance that matches state-of-the-art methods without requiring PS fine-tuning.
engineering, biomedical; radiology, nuclear medicine & medical imaging