CongNaMul: A Dataset for Advanced Image Processing of Soybean Sprouts

Byunghyun Ban, Donghun Ryu, Su-won Hwang
DOI: https://doi.org/10.1109/ICTC58733.2023.10393217
2023-08-31
Abstract: We present 'CongNaMul', a comprehensive dataset designed for various tasks in soybean sprouts image analysis. The CongNaMul dataset is curated to facilitate tasks such as image classification, semantic segmentation, decomposition, and measurement of length and weight. The classification task provides four classes to determine the quality of soybean sprouts: normal, broken, spotted, and broken and spotted, for the development of AI-aided automatic quality inspection technology. For semantic segmentation, images with varying complexity, from single sprout images to images with multiple sprouts, along with human-labelled mask images, are included. The label has 4 different classes: background, head, body, tail. The dataset also provides images and masks for the image decomposition task, including two separate sprout images and their combined form. Lastly, 5 physical features of sprouts (head length, body length, body thickness, tail length, weight) are provided for image-based measurement tasks. This dataset is expected to be a valuable resource for a wide range of research and applications in the advanced analysis of images of soybean sprouts. Also, we hope that this dataset can assist researchers studying classification, semantic segmentation, decomposition, and physical feature measurement in other industrial fields, in evaluating their models. The dataset is available at the authors' repository (https://bhban.kr/data).
Computer Vision and Pattern Recognition, Artificial Intelligence, Machine Learning, Image and Video Processing
What problem does this paper attempt to address?
The paper addresses the lack of high-quality datasets for advanced processing of soybean sprout images. Specifically, the paper mentions:

1. **Need for Quality Control**: Because soybean sprouts are produced and consumed on a large scale, better product quality control has become a key competitive factor. However, measuring each sprout individually on an industrial production line is almost impossible, and even sampling surveys take considerable time. Technology that can automatically measure the growth results of soybean sprouts is therefore needed.
2. **Lack of Datasets**: Although soybeans themselves have been studied extensively, there are very few data-driven analyses or available datasets for soybean sprouts. This makes AI-based research on soybean sprout production more difficult.
3. **Need for Advanced Analysis**: Given the large demand for soybean sprouts in Korea and the similar cultivation methods used for mung bean sprouts globally, applying advanced analysis to soybean sprout images offers significant advantages. A soybean sprout image dataset that supports complex tasks is therefore crucial.

To address these issues, the authors constructed a comprehensive dataset named "CongNaMul," aimed at supporting the following tasks:

- **Image Classification**: Distinguishing the quality of soybean sprouts (normal, broken, spotted, broken and spotted) for developing AI-assisted automatic quality inspection technology.
- **Semantic Segmentation**: Images of varying complexity, from single-sprout images to images with multiple sprouts, together with manually annotated mask images whose labels fall into four categories: background, head, body, and tail.
- **Image Decomposition**: Images and masks for decomposing images of multiple soybean sprouts into individual sprout images.
- **Physical Feature Measurement**: Measurements of 5 physical features (head length, body length, body thickness, tail length, weight) for image-based measurement tasks.

Through these tasks, the CongNaMul dataset provides a resource for advanced analysis of soybean sprout images and can contribute to the development of automated quality control technology.
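To make the label sets listed above concrete, here is a minimal Python sketch of how they might be represented in code. It is an illustration only. The class names, the integer ids, the field names, and both helper functions (`quality_from_flags`, `part_pixel_counts`) are assumptions made for this example and are not taken from the dataset's actual file layout or encoding, which should be checked against the authors' repository.

```python
from collections import Counter
from dataclasses import dataclass
from enum import IntEnum


class Quality(IntEnum):
    """The four classification classes; the integer ids are assumed."""
    NORMAL = 0
    BROKEN = 1
    SPOTTED = 2
    BROKEN_AND_SPOTTED = 3


class Part(IntEnum):
    """The four segmentation classes; the integer ids are assumed."""
    BACKGROUND = 0
    HEAD = 1
    BODY = 2
    TAIL = 3


@dataclass(frozen=True)
class Measurement:
    """The five physical features; units are not specified here."""
    head_length: float
    body_length: float
    body_thickness: float
    tail_length: float
    weight: float


def quality_from_flags(broken: bool, spotted: bool) -> Quality:
    """Map two defect flags onto one of the four quality classes."""
    return Quality(int(broken) + 2 * int(spotted))


def part_pixel_counts(mask):
    """Count pixels per part in a 2-D mask given as rows of integer ids."""
    counts = Counter(Part(value) for row in mask for value in row)
    return {part: counts.get(part, 0) for part in Part}
```

Treating "broken" and "spotted" as two independent flags is one possible reading of the four-class scheme; a model could equally predict the four classes directly.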