iCassava 2019 Fine-Grained Visual Categorization Challenge

Ernest Mwebaze, Timnit Gebru, Andrea Frome, Solomon Nsumba, Jeremy Tusubira
DOI: https://doi.org/10.48550/arXiv.1908.02900
2019-12-24
Abstract: Viral diseases are major sources of poor yields for cassava, the 2nd largest provider of carbohydrates in Africa. At least 80% of small-holder farmer households in Sub-Saharan Africa grow cassava. Since many of these farmers have smart phones, they can easily obtain photos of diseased and healthy cassava leaves in their farms, allowing the opportunity to use computer vision techniques to monitor the disease type and severity and increase yields. However, annotating these images is extremely difficult as experts who are able to distinguish between highly similar diseases need to be employed. We provide a dataset of labeled and unlabeled cassava leaves and formulate a Kaggle challenge to encourage participants to improve the performance of their algorithms using semi-supervised approaches. This paper describes our dataset and challenge which is part of the Fine-Grained Visual Categorization workshop at CVPR 2019.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper addresses how to monitor and diagnose cassava diseases with computer vision in order to increase yields and reduce disease-related losses. It focuses on three aspects:

1. **Cassava Disease Monitoring and Diagnosis**: Cassava is the second-largest source of carbohydrates in Africa, and viral diseases are a major cause of poor cassava yields. Because many small-scale farmers cannot identify these diseases, government agricultural experts must inspect farms in person. This is time-consuming and labor-intensive, and there are too few experts to cover all areas. The paper therefore proposes running computer vision algorithms on smartphones to help farmers monitor and diagnose cassava diseases.
2. **Difficulty in Data Labeling**: The symptoms of cassava diseases are very similar, and multiple diseases may be present on a single leaf, which makes data labeling very difficult. To address this, the paper provides a data set containing both labeled and unlabeled images and encourages participants to use semi-supervised methods to improve algorithm performance.
3. **Application on Low-Resource Devices**: To run on farmers' smartphones, the algorithm needs to be lightweight, efficient, and fast on low-resource devices.

### Paper Background

- **Importance of Cassava**: Cassava is a key food crop in Africa and is crucial for food security. However, due to the impact of diseases, the annual production loss is estimated to be between $12 million and $23 million.
- **Limitations of Current Diagnostic Methods**: Diagnosis currently relies mainly on on-site inspections by government agricultural experts. This method is highly subjective and difficult to implement on a large scale.
- **Potential of Computer Vision**: Computer vision makes it possible to develop automated tools that help farmers monitor and diagnose cassava diseases remotely, thereby increasing yield.

### Data Set and Challenges

- **Data Set**: The paper provides 9,436 labeled images and 12,595 unlabeled images, covering healthy cassava leaves and four common cassava diseases: Cassava Mosaic Disease (CMD), Cassava Brown Streak Disease (CBSD), Cassava Bacterial Blight (CBB), and Cassava Green Mite (CGM).
- **Kaggle Challenge**: The authors organized a Kaggle competition to encourage participants to develop algorithms that accurately classify leaves as healthy or as affected by one of these four diseases. The competition uses overall accuracy as its evaluation metric, and the top few contestants achieved an accuracy of about 93%.

### Main Contributions

- **Real-World Data Set**: The data set reflects the complexity of practical use, including varied backgrounds, lighting conditions, co-occurrence of multiple diseases, and image blur.
- **Semi-supervised Method**: Participants are encouraged to use the large pool of unlabeled data to improve model performance, which is particularly important in practical applications.
- **Lightweight Model**: The challenge emphasizes lightweight, efficient models that can run on low-resource devices.

Through these efforts, the paper aims to encourage the computer vision community to pay attention to and solve real-world food security problems, especially by helping small-scale farmers increase cassava production.
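
To make the evaluation metric and the semi-supervised idea above concrete, here is a minimal Python sketch. The summary does not give the integer encoding of the five classes or say which semi-supervised technique any participant used, so the class order, the function names `overall_accuracy` and `select_pseudo_labels`, and the 0.9 confidence threshold are all illustrative assumptions. Confidence-thresholded pseudo-labeling is shown only as one common way to use unlabeled images, not as the method from the paper.

```python
from typing import List, Sequence, Tuple

# Class names come from the data set description; this ordering is an assumption.
CLASSES = ["healthy", "CMD", "CBSD", "CBB", "CGM"]


def overall_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of images whose predicted class equals the labeled class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def select_pseudo_labels(
    probs: Sequence[Sequence[float]], threshold: float = 0.9
) -> List[Tuple[int, int]]:
    """Pick unlabeled images whose top predicted probability meets the threshold.

    Returns (image_index, predicted_class) pairs that could be added to the
    labeled training set in a later round (hypothetical workflow).
    """
    selected = []
    for i, p in enumerate(probs):
        best = max(range(len(p)), key=p.__getitem__)  # index of the largest probability
        if p[best] >= threshold:
            selected.append((i, best))
    return selected


if __name__ == "__main__":
    # Four of five predictions match the labels.
    print(overall_accuracy([0, 1, 2, 3, 4], [0, 1, 2, 3, 0]))  # 0.8

    # Made-up class probabilities for three unlabeled images.
    unlabeled_probs = [
        [0.95, 0.05, 0.0, 0.0, 0.0],
        [0.30, 0.30, 0.2, 0.1, 0.1],
        [0.02, 0.03, 0.0, 0.0, 0.95],
    ]
    for idx, cls in select_pseudo_labels(unlabeled_probs):
        print(idx, CLASSES[cls])
```

Overall accuracy weights every image equally, so on a data set where some classes are much rarer than others it can hide weak performance on those classes. A per-class breakdown alongside it is a reasonable sanity check.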