PePR: Performance Per Resource Unit as a Metric to Promote Small-Scale Deep Learning in Medical Image Analysis

Raghavendra Selvan, Bob Pepin, Christian Igel, Gabrielle Samuel, Erik B Dam
2024-12-05
Abstract: The recent advances in deep learning (DL) have been accelerated by access to large-scale data and compute. These large-scale resources have been used to train progressively larger models which are resource intensive in terms of compute, data, energy, and carbon emissions. These costs are becoming a new type of entry barrier to researchers and practitioners with limited access to resources at such scale, particularly in the Global South. In this work, we take a comprehensive look at the landscape of existing DL models for medical image analysis tasks and demonstrate their usefulness in settings where resources are limited. To account for the resource consumption of DL models, we introduce a novel measure to estimate the performance per resource unit, which we call the PePR score. Using a diverse family of 131 unique DL architectures (spanning 1M to 130M trainable parameters) and three medical image datasets, we capture trends about the performance-resource trade-offs. In applications like medical image analysis, we argue that small-scale, specialized models are better than striving for large-scale models. Furthermore, we show that using existing pretrained models that are fine-tuned on new data can significantly reduce the computational resources and data required compared to training models from scratch. We hope this work will encourage the community to focus on improving AI equity by developing methods and models with smaller resource footprints.
Machine Learning,Artificial Intelligence
What problem does this paper attempt to address?
The main problem this paper addresses is the trade-off between resource consumption and performance of current deep learning (DL) models in medical image analysis. Training large-scale models consumes large amounts of data, compute, and energy, and produces corresponding carbon emissions. This burdens the environment and also raises a barrier to entry for researchers and practitioners, especially in regions with limited resources (such as the Global South). The paper therefore proposes a new metric, the performance per resource unit (PePR) score, which evaluates models in a way that accounts for the resources they consume. With the PePR score, the authors hope to promote small-scale deep learning models in medical image analysis, which consume fewer resources while still achieving good performance.

Specifically, the paper addresses the problem through the following points:

1. **Proposing the PePR score**: A new metric that combines performance and resource consumption in a single number. It is defined as: \[ \text{PePR}(R, P)=\frac{P}{1 + R} \] where \(P\) is the normalized performance indicator and \(R\) is the normalized resource cost. For a fixed \(P\), the score decreases as \(R\) grows.
2. **Experimental verification**: The paper uses 131 different deep learning architectures (spanning 1M to 130M trainable parameters) and runs experiments on three medical image datasets to measure the trade-off between resource consumption and performance across models.
3. **Advantages of pretrained models**: Fine-tuning existing pretrained models on new data can significantly reduce the computational resources and data required compared to training models from scratch.
4. **Superiority of small-scale models**: In resource-constrained environments, small-scale models usually offer a better balance between performance and resource consumption. The paper demonstrates this through the PePR score: large-scale models with high resource consumption receive low PePR scores, indicating poor performance relative to their cost.
5. **Promoting AI equity**: By promoting small-scale deep learning models, the paper hopes to lower the barrier to research and practice, especially in regions with limited resources, and thereby improve AI equity in medicine and healthcare.

In conclusion, by introducing the PePR score, the paper systematically analyzes the relationship between performance and resource consumption of deep learning models under resource constraints, providing a theoretical basis and empirical support for the wider use of small-scale deep learning models.
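As a rough illustration of the PePR definition quoted in this summary, the Python sketch below computes PePR(R, P) = P / (1 + R) for a few models. The summary only says that R and P are normalized, so the min-max scaling used for R is an assumption. The helper names `pepr` and `min_max_normalize` are hypothetical, and the three models with their parameter counts and accuracies are made-up values, not results from the paper.

```python
from typing import Sequence


def pepr(resource: float, performance: float) -> float:
    """PePR(R, P) = P / (1 + R), with R and P both assumed to lie in [0, 1]."""
    if not (0.0 <= resource <= 1.0 and 0.0 <= performance <= 1.0):
        raise ValueError("R and P are expected to lie in [0, 1]")
    return performance / (1.0 + resource)


def min_max_normalize(values: Sequence[float]) -> list[float]:
    """Scale raw resource costs to [0, 1].

    Assumption: min-max scaling over the models being compared; the summary
    only states that R is a 'normalized resource cost'.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # All models cost the same, so none is penalized.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical models: (name, trainable parameters in millions, test accuracy).
# These numbers are illustrative only and are not taken from the paper.
models = [("small", 1.0, 0.80), ("medium", 25.0, 0.84), ("large", 130.0, 0.85)]

# Use parameter count as the resource; energy or memory could be used instead.
costs = min_max_normalize([params for _, params, _ in models])
scores = {name: pepr(r, acc) for (name, _, acc), r in zip(models, costs)}
best = max(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: PePR = {score:.3f}")
print("highest PePR:", best)
```

Because R lies in [0, 1] under this normalization, the score always falls between P/2 (the most expensive model) and P (the cheapest one). In the toy data, the large model's small accuracy gain does not offset its cost, which is the kind of trade-off the score is meant to surface.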