Navigating Distribution Shifts in Medical Image Analysis: A Survey

Zixian Su, Jingwei Guo, Xi Yang, Qiufeng Wang, Frans Coenen, Kaizhu Huang
2024-11-05
Abstract: Medical Image Analysis (MedIA) has become indispensable in modern healthcare, enhancing clinical diagnostics and personalized treatment. Despite the remarkable advancements supported by deep learning (DL) technologies, their practical deployment faces challenges due to distribution shifts, where models trained on specific datasets underperform across others from varying hospitals, regions, or patient populations. To navigate this issue, researchers have been actively developing strategies to increase the adaptability and robustness of DL models, enabling their effective use in unfamiliar and diverse environments. This paper systematically reviews approaches that apply DL techniques to MedIA systems affected by distribution shifts. Unlike traditional categorizations based on technical specifications, our approach is grounded in the real-world operational constraints faced by healthcare institutions. Specifically, we categorize the existing body of work into Joint Training, Federated Learning, Fine-tuning, and Domain Generalization, with each method tailored to distinct scenarios caused by Data Accessibility, Privacy Concerns, and Collaborative Protocols. This perspective equips researchers with a nuanced understanding of how DL can be strategically deployed to address distribution shifts in MedIA, ensuring diverse and robust medical applications. By delving deeper into these topics, we highlight potential pathways for future research that not only address existing limitations but also push the boundaries of deployable MedIA technologies.
Image and Video Processing, Computer Vision and Pattern Recognition, Machine Learning
What problem does this paper attempt to address?
### Problems the paper attempts to solve

This paper explores how to deal with data distribution shifts in Medical Image Analysis (MedIA). Specifically, it focuses on:

1. **Challenges of data distribution shifts**:
   - **Background**: Although deep learning techniques have made remarkable progress in medical image analysis, the resulting models face a major problem in deployment: data distribution shifts. Training data usually comes from specific hospitals, regions, or patient groups, and performance often declines when the models are applied in different medical environments.
   - **Reasons**: Causes of data distribution shifts include differences in imaging modalities, changes in scanning protocols, differences in patient demographics, and temporal changes.
2. **Classification of existing solutions**:
   - **Methods**: The paper systematically reviews methods that use deep learning to deal with data distribution shifts and classifies them according to real-world operational constraints (data accessibility, privacy concerns, and collaborative protocols).
   - **Classification**:
     - **Joint Training**: When both the source data and the target data are accessible and there are no privacy issues, multiple medical institutions can share data and train jointly to improve the adaptability of the model.
     - **Federated Learning**: When multiple institutions wish to cooperate but do not want to expose their respective data, federated learning enables cooperation by training models locally and aggregating them, without storing data centrally.
     - **Fine-tuning**: When synchronous cooperation is not feasible, a pre-trained model can be fine-tuned on a new dataset to transfer the learned knowledge to a new domain.
     - **Domain Generalization**: When data from the new domain is inaccessible or unknown, the key is to train a model that generalizes well enough to withstand distribution shifts.
3. **Considerations in practical applications**:
   - **Operational constraints**: The paper particularly emphasizes the operational constraints of practical applications, such as data accessibility, privacy concerns, and collaborative protocols, which define the scenarios each method is tailored to.
   - **Future research directions**: By delving into these topics, the paper points out potential paths for future research that not only address existing limitations but also advance deployable MedIA technologies.

### Summary

The main purpose of this paper is to systematically review and classify existing deep learning techniques for dealing with data distribution shifts in medical image analysis. By considering the constraints of real-world operation, the paper provides researchers with strategies and methods for meeting this challenge, thus promoting the wider application of deep learning in medical image analysis.
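
The mapping from operational constraints to the four categories can be made concrete with a small decision rule. The following is a minimal sketch, not the paper's own procedure: the `Constraints` class, its field names, the `suggest_category` function, and the order in which the conditions are checked are all assumptions introduced here for illustration. Real deployments may combine categories or weigh constraints differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Constraints:
    """Hypothetical description of a deployment scenario (illustrative names)."""
    target_data_available: bool      # is any data from the new domain accessible?
    data_can_be_shared: bool         # may raw data leave the institution (no privacy barrier)?
    synchronous_collaboration: bool  # can institutions train together at the same time?


def suggest_category(c: Constraints) -> str:
    """Return the survey category that best matches the constraints.

    The check order is an assumption of this sketch, not taken from the paper.
    """
    if not c.target_data_available:
        # Target domain is inaccessible or unknown: rely on a generalizable model.
        return "Domain Generalization"
    if not c.synchronous_collaboration:
        # No joint training session is possible: adapt a pre-trained model locally.
        return "Fine-tuning"
    if c.data_can_be_shared:
        # Data is accessible and shareable: pool it and train together.
        return "Joint Training"
    # Institutions cooperate but keep data local: aggregate locally trained models.
    return "Federated Learning"


# Example: hospitals can cooperate in real time but cannot share raw images.
scenario = Constraints(
    target_data_available=True,
    data_can_be_shared=False,
    synchronous_collaboration=True,
)
print(suggest_category(scenario))  # -> Federated Learning
```

Encoding the constraints as explicit booleans keeps the rule easy to audit, at the cost of ignoring intermediate cases such as partially shareable data.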