EndoViT: pretraining vision transformers on a large collection of endoscopic images

Dominik Batić, Felix Holm, Ege Özsoy, Tobias Czempiel, Nassir Navab
DOI: https://doi.org/10.1007/s11548-024-03091-5
2024-04-05
International Journal of Computer Assisted Radiology and Surgery
Abstract: Automated endoscopy video analysis is essential for assisting surgeons during medical procedures, but it faces challenges due to complex surgical scenes and limited annotated data. Large-scale pretraining has shown great success in natural language processing and computer vision communities in recent years. These approaches reduce the need for annotated data, which is of great interest in the medical domain. In this work, we investigate endoscopy domain-specific self-supervised pretraining on large collections of data.
engineering, biomedical; radiology, nuclear medicine & medical imaging; surgery