Development of a video-based deep learning model for differentiation of malignant and benign lesions during staging laparoscopy: Is the machine better than the expert?

Francesca Tozzi, Seyed Amir Mousavi, Matthias Van Liefferinge, Dongin Moon, Homin Park, Sharareh Fadaei, Wim Ceelen, Wouter Willaert, Wesley De Neve, Nikdokht Rashidian
DOI: https://doi.org/10.1200/jco.2024.42.16_suppl.e13616
IF: 45.3
2024-05-31
Journal of Clinical Oncology
Abstract: e13616. Background: Laparoscopic staging of abdominal cancer is routinely performed to assess the presence of peritoneal metastasis (PM). One major challenge of laparoscopic staging is distinguishing malignant from benign peritoneal lesions, particularly scar tissue. Lesion recognition is subject to intra-case variability, the surgeon's experience, the type of primary tumor, and the response to systemic chemotherapy. The Peritoneal Regression Grading Score (PRGS) offers an objective histologic evaluation of the biopsied lesion by assessing the complete or partial presence of malignant cells (PRGS 2-3-4) or a fibrous or benign lesion (PRGS 1). Machine learning-based (ML-based) computer vision has various applications in the medical field and can be useful in deducing clinical information through surgical video analysis. The aim of this study was to develop an ML model that can aid the surgeon in the intra-operative assessment of PM. Methods: Retrospectively collected videos of laparoscopies for the staging of PM before cytoreductive surgery or PIPAC were screened from the institutional database for the presence of recorded biopsies. The PRGS from the pathology report of these biopsies was retrieved and reviewed by a pathologist. The surgical phase of the biopsy was annotated by systematic selection of 200 frames before the closure of the biopsy instrument on the tissue. One single frame was selected from each timeframe based on its representativeness. Two trained annotators performed the selection, and one surgical expert reviewed each annotation. Discrepancies were resolved by consensus. ML models based on multiple frames (ResEfficientnetV2_L) and on a single frame (Resnet50) per biopsy were trained on the PRGS using a one-versus-all binary classification. Five-fold cross-validation was applied. Two oncologic surgeons, experts in treating and assessing PM, blindly and independently scored the biopsies with PRGS.
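The labelling and validation scheme described in the Methods can be illustrated with a minimal sketch. It assumes that "one-versus-all" means one PRGS grade is treated as the positive class and all other grades as negative, and that folds are formed by shuffling biopsies; the abstract does not specify either detail. The function names, the seed, and the toy scores are hypothetical and are not study data.

```python
import random


def one_vs_all_labels(prgs_scores, positive_grade):
    """Return 1 where the biopsy has the positive PRGS grade, else 0."""
    return [1 if score == positive_grade else 0 for score in prgs_scores]


def five_fold_indices(n_items, n_folds=5, seed=0):
    """Shuffle item indices and deal them into n_folds validation folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    return [indices[k::n_folds] for k in range(n_folds)]


# Hypothetical toy PRGS scores (assumption, not taken from the study).
scores = [1, 2, 1, 3, 4, 2, 1, 2, 3, 1]

# Binary target for the "PRGS 1 versus all other grades" task.
labels_prgs1 = one_vs_all_labels(scores, positive_grade=1)

folds = five_fold_indices(len(scores))
for k, val_idx in enumerate(folds):
    # Every fold serves once as validation; the other four form the training set.
    train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
    print(f"fold {k}: {len(train_idx)} training items, {len(val_idx)} validation items")
```

This sketch splits at the biopsy level. Because the data set contains several videos per patient, grouping folds by patient would be one way to avoid the same patient appearing in both training and validation sets; whether the authors did so is not stated.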
Results: A total of 127 videos from 67 patients were identified, showing PM from gastric, colorectal, appendiceal, hepatic, gallbladder, and breast carcinomas, primary peritoneal tumors, and benign lesions. Annotation was performed on 463 biopsies (PRGS 1: 204, PRGS 2: 164, PRGS 3: 70, PRGS 4: 25). Timeframe annotation yielded 5214 images (PRGS 1: 2044, PRGS 2: 1814, PRGS 3: 929, PRGS 4: 427). The models trained with one-versus-all classification on single and multiple images showed accuracies of 52.4% (Precision 56.2, Recall 48.2) and 45.7% (Precision 76.8, Recall 36.0), respectively, in the validation set. The accuracy of the two experts was 36.0% and 30.5%. Conclusions: While the accuracy of computer vision models may not be clinically acceptable as a substitute for biopsies, the models outperformed the surgeons in differentiating PM from benign lesions. Computer vision models might aid the surgeon in recognizing these lesions.
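For readers less familiar with the reported metrics, the following minimal sketch shows how accuracy, precision, and recall are conventionally computed for a binary task from true and predicted labels. The function name and the toy labels are hypothetical; they do not reproduce the study's predictions or its exact evaluation procedure.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, and recall for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Guard against division by zero when no positives are predicted or present.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical toy labels (assumption, not study data).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 0, 0, 1, 0, 1, 0]

metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

A model can pair high precision with low recall, as reported for the multiple-image model: most lesions it flags as positive are correct, but it misses many true positives.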
oncology