Deep Learning Based Mineral Image Classification Combined With Visual Attention Mechanism

Yang Liu, Zelin Zhang, Xiang Liu, Wang Lei, Xuhui Xia
DOI: https://doi.org/10.1109/access.2021.3095368
IF: 3.9
2021-01-01
IEEE Access
Abstract: Mineral image classification based on machine vision is an efficient approach to ore sorting. With the development of artificial intelligence and computer technology, deep learning-based mineral image classification systems are gradually being applied to ore sorting. However, classification accuracy has hit a bottleneck, and the feature extraction ability of CNN models is relatively limited for multi-category mineral image classification tasks. Therefore, four visual attention blocks are designed and embedded in existing CNN models, and new mineral image classification models based on the visual attention mechanism and CNNs are proposed. Then, following the building strategies of ResNets of different depths, we build various CNN models with embedded attention blocks for mineral image classification and visualize the models with Grad-CAM to observe the changes in classification weight distributions and classification weight values. Finally, using confusion matrices, the experiment systematically evaluates the classification performance of the proposed models and analyzes the misjudgment rate.
Computer Science, Information Systems; Telecommunications; Engineering, Electrical & Electronic
What problem does this paper attempt to address?
The paper addresses two key issues in mineral image classification:

1. **Improving the accuracy of multi-class mineral image classification**: Existing deep learning-based mineral image classification systems hit a bottleneck in classification accuracy when handling multi-class (more than 2 classes) tasks.
2. **Optimizing feature extraction capability**: Convolutional Neural Network (CNN) models have relatively limited feature extraction capability on multi-class mineral image classification tasks. In particular, during training the model may be affected by irrelevant information in the images (such as reflections, dust, and noise points), which interferes with feature extraction and lowers classification accuracy.

To address these issues, the paper proposes the following:

- Four types of visual attention modules are designed and embedded into existing CNN models to construct new mineral image classification models based on the visual attention mechanism. These attention modules help the model focus on key areas of the images, thereby improving classification performance.
- Following the construction strategies of ResNets of different depths, various CNN models with attention modules are built, and these models are visualized with Grad-CAM to observe changes in the distribution and values of the classification weights.
- The proposed models are systematically evaluated for classification performance using confusion matrices, and the misclassification rates are analyzed.
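This summary does not specify how the paper's four attention blocks are built, so the following is only a minimal sketch of one common design: squeeze-and-excitation-style channel attention, written with NumPy. The function name `channel_attention`, the weight matrices `w_reduce` and `w_expand`, and the bottleneck layout are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(feature_map, w_reduce, w_expand):
    """Reweight the channels of one feature map (assumed SE-style block).

    feature_map: array of shape (C, H, W)
    w_reduce:    array of shape (C_hidden, C), hypothetical bottleneck weights
    w_expand:    array of shape (C, C_hidden), hypothetical expansion weights
    Returns the reweighted feature map and the per-channel weights.
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = feature_map.mean(axis=(1, 2))            # shape (C,)
    # Excitation: bottleneck with ReLU, then expand back to C channels.
    hidden = np.maximum(w_reduce @ squeezed, 0.0)       # shape (C_hidden,)
    weights = sigmoid(w_expand @ hidden)                # shape (C,)
    # Scale: channels with small weights are suppressed.
    return feature_map * weights[:, None, None], weights


# Usage with random stand-in data (not real mineral image features).
rng = np.random.default_rng(0)
features = rng.normal(size=(8, 4, 4))
attended, channel_weights = channel_attention(
    features,
    rng.normal(size=(2, 8)),
    rng.normal(size=(8, 2)),
)
print(attended.shape, channel_weights.shape)
```

In a ResNet-style network, a block like this would typically be applied to the output of a residual branch before it is added to the skip connection; whether the paper places its blocks there is not stated in this summary.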
The paper focuses on how to construct and embed visual attention modules, on comparing the performance of different attention modules in multi-class mineral image classification, and on comparing conventional CNN models with CNN models that embed attention modules. It also studies how attention modules affect the distribution and values of the model's classification weights.
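The confusion-matrix evaluation mentioned above can be illustrated with a small pure-Python sketch. The labels below are made-up toy data, not results from the paper, and the per-class "misjudgment rate" is assumed here to mean the fraction of samples of a true class that are assigned to any other class; the paper may define it differently.

```python
def confusion_matrix(true_labels, predicted_labels, num_classes):
    """Count predictions: rows are true classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for true, predicted in zip(true_labels, predicted_labels):
        matrix[true][predicted] += 1
    return matrix


def misjudgment_rates(matrix):
    """Per-class fraction of samples assigned to a wrong class (assumed definition)."""
    rates = []
    for class_index, row in enumerate(matrix):
        total = sum(row)
        wrong = total - row[class_index]
        rates.append(wrong / total if total else 0.0)
    return rates


def overall_accuracy(matrix):
    """Fraction of all samples that lie on the diagonal."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


# Toy example with three hypothetical mineral classes (0, 1, 2).
true_labels = [0, 0, 0, 1, 1, 2, 2, 2]
predicted_labels = [0, 0, 1, 1, 1, 2, 0, 2]

matrix = confusion_matrix(true_labels, predicted_labels, num_classes=3)
for row in matrix:
    print(row)
print("misjudgment rates:", misjudgment_rates(matrix))
print("accuracy:", overall_accuracy(matrix))
```

Reading the off-diagonal entries row by row shows which classes are confused with which, which is the kind of analysis a per-class misjudgment rate summarizes.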