Surgical instrument recognition for instrument usage documentation and surgical video library indexing

Bokai Zhang, Darrick Sturgeon, Arjun Ravi Shankar, Varun Kejriwal Goel, Jocelyn Barker, Amer Ghanem, Philip Lee, Meghan Milecky, Natalie Stottler, Svetlana Petculescu
DOI: https://doi.org/10.1080/21681163.2022.2152371
2022-12-07
Computer Methods in Biomechanics and Biomedical Engineering: Imaging and Visualization
Abstract: Temporally locating and classifying instruments in surgical video is useful for the analysis and comparison of surgical techniques. This paper aims to apply action segmentation techniques to temporally segment and classify surgical instruments, and to highlight the utility of this modelling approach through example applications. This paper shows that the action segmentation transformer (ASFormer) architecture with an EfficientNetV2 featurizer achieves significantly higher mean average precision than any previous approach to this task on the Cholec80 dataset. The ASFormer also outperforms Long Short-Term Memory (LSTM) and Multi-Stage Temporal Convolutional Network (MS-TCN) architectures with the same featurizer. This model reduces the need for costly human labelling of surgical video, driving the development of indexed surgical video libraries and instrument usage tracking applications. Examples of these applications are included after the results.
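To illustrate how frame-level instrument predictions could feed a video index, the following is a minimal sketch, not the authors' implementation. It assumes a model has already produced one binary presence label per sampled frame for a single instrument; the function name `frames_to_segments` and the `fps` sampling rate are hypothetical choices made for this example.

```python
from typing import List, Tuple


def frames_to_segments(presence: List[int], fps: float = 1.0) -> List[Tuple[float, float]]:
    """Collapse per-frame binary presence labels into (start_s, end_s) segments.

    `presence` holds 0/1 labels for one instrument, one per sampled frame.
    `fps` is the assumed sampling rate of those labels, in frames per second.
    """
    segments: List[Tuple[float, float]] = []
    start = None  # index of the first frame of the currently open segment
    for i, flag in enumerate(presence):
        if flag and start is None:
            # Instrument appears: open a new segment.
            start = i
        elif not flag and start is not None:
            # Instrument disappears: close the open segment (end is exclusive).
            segments.append((start / fps, i / fps))
            start = None
    if start is not None:
        # Instrument is still present at the end of the video.
        segments.append((start / fps, len(presence) / fps))
    return segments


if __name__ == "__main__":
    # Example: instrument visible in frames 1-2 and again in frame 5.
    print(frames_to_segments([0, 1, 1, 0, 0, 1]))
```

Segments of this form can be stored alongside a video identifier so that a library can be queried for the time ranges in which a given instrument is in use.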