A Study of Data Augmentation Techniques to Overcome Data Scarcity in Wound Classification using Deep Learning

Harini Narayanan, Sindhu Ghanta
2024-11-04
Abstract: Chronic wounds are a significant burden on individuals and the healthcare system, affecting millions of people and incurring high costs. Wound classification using deep learning techniques is a promising approach for faster diagnosis and treatment initiation. However, the lack of high-quality data to train ML models is a major obstacle to realizing the potential of ML in wound care. In fact, data limitations are the biggest challenge in studies using medical or forensic imaging today. We study data augmentation techniques that can be used to overcome data scarcity and unlock the potential of deep-learning-based solutions. We explore a range of techniques, from geometric transformations of wound images to advanced GANs, to enrich and expand datasets. Using the Keras, TensorFlow, and Pandas libraries, we implemented data augmentation techniques that can generate realistic wound images. We show that geometric data augmentation can improve classification performance, measured by F1 score, by up to 11% on top of state-of-the-art models across several key classes of wounds. Our experiments with GAN-based augmentation demonstrate the viability of using DE-GANs to generate wound images with richer variations. Our study and results show that data augmentation is a valuable privacy-preserving tool with great potential to overcome data scarcity, and we believe it will be part of any real-world ML-based wound care system.
Computer Vision and Pattern Recognition, Image and Video Processing
What problem does this paper attempt to address?
This paper addresses the challenge of data scarcity in chronic wound classification. Chronic wounds (such as diabetic ulcers, pressure ulcers, surgical ulcers, and venous ulcers) place a significant burden on individuals and the healthcare system, affecting millions of people and incurring high treatment costs. Deep learning has great potential for wound classification and can speed up diagnosis and treatment, but the lack of high-quality data for training these models is a major obstacle to realizing that potential.

### Specific manifestations of the problem

1. **Data scarcity**: Collecting a large number of real patient wound images is often not feasible because of privacy and legal issues.
2. **Data imbalance**: Some wound categories have far more images than others, which leads to problems such as overfitting during model training.
3. **Limitations of existing research**: Many studies show that existing deep-learning models need more labeled data to improve accuracy and generalization, but sufficient data is very difficult to obtain in practice.

### Goals of the paper

The paper aims to overcome data scarcity by studying data augmentation techniques, especially geometric transformations and generative adversarial networks (GANs), and thereby improve the performance of deep-learning models on the wound classification task. Specific goals include:

- **Improving classification accuracy**: Increase the diversity of the training data through data augmentation, thereby improving the classification performance of the model.
- **Verifying the effectiveness of data augmentation**: Demonstrate the effectiveness of geometric and GAN-based data augmentation in improving the F1 score and other evaluation metrics.
- **Exploring the application of DE-GAN**: Use generative adversarial networks with decoder-encoder output noise (DE-GAN) to generate richer wound images and further improve the performance of the model.

### Main contributions

- **Geometric data augmentation**: Significantly improved the F1 score of some key categories (by up to 11%) through transformations such as rotation and brightness adjustment.
- **DE-GAN-generated images**: Demonstrated the feasibility of using DE-GAN to generate more complex and diverse wound images, although further optimization is still required to improve the quality of the generated images and their effect on classification.

In summary, this paper aims to solve the data scarcity problem in chronic wound classification through data augmentation and to provide more effective solutions for deep-learning models in practical applications.
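To make the idea of geometric augmentation for an imbalanced dataset concrete, here is a minimal sketch. It is not the authors' pipeline: it uses plain NumPy instead of Keras or TensorFlow, and the function names (`augment`, `balance_by_augmentation`), the brightness range, the restriction to 90-degree rotations, and the class labels in the demo are all assumptions chosen for illustration. It also assumes square images, so that rotation does not change the array shape.

```python
import numpy as np


def augment(image, rng):
    """Return one randomly transformed copy of a square H x W x C uint8 image."""
    # Rotate by a random multiple of 90 degrees (hypothetical choice of angles).
    out = np.rot90(image, k=int(rng.integers(0, 4)))
    # Flip horizontally with probability 0.5.
    if rng.random() < 0.5:
        out = np.fliplr(out)
    # Scale brightness by a random factor; the 0.8-1.2 range is an assumption.
    factor = rng.uniform(0.8, 1.2)
    out = np.clip(out.astype(np.float32) * factor, 0, 255)
    return out.astype(np.uint8)


def balance_by_augmentation(images, labels, seed=0):
    """Oversample each class with augmented copies up to the largest class size."""
    rng = np.random.default_rng(seed)
    images, labels = list(images), list(labels)
    # Group the original images by class label.
    by_class = {}
    for img, lab in zip(images, labels):
        by_class.setdefault(lab, []).append(img)
    target = max(len(members) for members in by_class.values())
    # Add augmented copies of randomly chosen originals to the smaller classes.
    for lab, members in by_class.items():
        for _ in range(target - len(members)):
            source = members[int(rng.integers(0, len(members)))]
            images.append(augment(source, rng))
            labels.append(lab)
    return images, labels


# Toy demo: three "venous" images but only one each of the other classes.
demo_images = [np.full((8, 8, 3), 100, dtype=np.uint8) for _ in range(5)]
demo_labels = ["venous", "venous", "venous", "diabetic", "pressure"]
aug_images, aug_labels = balance_by_augmentation(demo_images, demo_labels)
```

In the toy demo, every class ends up with three images. In a Keras-based workflow, comparable transformations are available as built-in preprocessing layers such as `RandomFlip` and `RandomRotation`; which specific transformations and parameters the paper used beyond rotation and brightness adjustment is not stated in this summary.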