The PV-ALE Dataset: Enhancing Apple Leaf Disease Classification Through Transfer Learning with Convolutional Neural Networks

Joseph Damilola Akinyemi, Kolawole John Adebayo
2024-10-30
Abstract: As the global food security landscape continues to evolve, the need for accurate and reliable crop disease diagnosis has never been more pressing. To address global food security concerns, we extend the widely used PlantVillage dataset with additional apple leaf disease classes, enhancing diversity and complexity. Experimental evaluations on both original and extended datasets reveal that existing models struggle with the new additions, highlighting the need for more robust and generalizable computer vision models. Test F1 scores of 99.63% and 97.87% were obtained on the original and extended datasets, respectively. Our study provides a more challenging and diverse benchmark, paving the way for the development of accurate and reliable models for identifying apple leaf diseases under varying imaging conditions. The expanded dataset is available at https://www.kaggle.com/datasets/akinyemijoseph/apple-leaf-disease-dataset-6-classes-v2, enabling future research to build upon our findings.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper aims to improve the accuracy and reliability of apple leaf disease classification, motivated by global food security concerns. Specifically:

1. **Expand the existing dataset**: The paper extends the widely used PlantVillage dataset with additional apple leaf disease categories (such as Alternaria leaf spot and powdery mildew), increasing the dataset's diversity and complexity. This supports training more general-purpose and robust computer vision models.
2. **Improve model performance**: Once the new disease categories were introduced, existing models performed worse on them, which highlights the need for more powerful and generalizable deep learning models. The reported test F1 score is 99.63% on the original dataset and 97.87% on the extended dataset.
3. **Provide a more challenging benchmark**: The extended dataset (PV-ALE) offers a more challenging and diverse benchmark, encouraging the development of more accurate and reliable apple leaf disease identification models, especially under varying imaging conditions.
4. **Promote future research**: Because the extended dataset is publicly released, researchers can build on it and advance the field of apple leaf disease detection.

### Core contributions of the paper

- **Comprehensive apple leaf image dataset**: High-quality annotated images with multiple apple leaf disease labels.
- **Efficient CNN architecture**: A convolutional neural network architecture designed for multi-class classification tasks.
- **Rigorous evaluation metrics**: The model's performance is evaluated in different scenarios, including cases of class imbalance.
- **Superior accuracy**: The approach demonstrates higher accuracy than existing methods, indicating its potential in practical applications.
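
The F1 scores and the class-imbalance concern mentioned above can be made concrete with a small sketch. It computes per-class and macro-averaged F1 in plain Python on a toy set of labels. The label names, the toy predictions, and the choice of macro averaging are assumptions for illustration only; the summary does not state which F1 averaging the paper used.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return a dict mapping each class label to its F1 score."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = {}
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN); defined as 0.0 when the class never appears
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so rare classes count as much as common ones."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Toy, imbalanced example (hypothetical labels and predictions, not data from the paper)
y_true = ["healthy"] * 4 + ["powdery_mildew"] * 2
y_pred = ["healthy", "healthy", "healthy", "powdery_mildew",
          "powdery_mildew", "healthy"]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(per_class_f1(y_true, y_pred))
print(f"accuracy={accuracy:.3f}, macro F1={macro_f1(y_true, y_pred):.3f}")
```

In this toy case the minority class scores lower than the majority class, so the macro F1 falls below the plain accuracy. That gap is why F1 is often preferred over accuracy when classes are imbalanced.
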
### Problems solved

The paper mainly addresses the following problems:

- **Limitations of existing datasets**: Most existing datasets are small, class-imbalanced, and lacking in diversity.
- **Insufficient model generalization ability**: Existing models perform poorly when facing new categories or complex backgrounds.
- **Lack of publicly available large-scale datasets**: This hinders model training and validation and limits research progress.

By addressing these problems, the paper provides new tools and methods for apple leaf disease detection and advances related technologies.