Improving Mineral Classification Using Multimodal Hyperspectral Point Cloud Data and Multi-Stream Neural Network

Aldino Rizaldy, Ahmed Jamal Afifi, Pedram Ghamisi, Richard Gloaguen
DOI: https://doi.org/10.3390/rs16132336
IF: 5
2024-06-27
Remote Sensing
Abstract: In this paper, we leverage multimodal data to classify minerals using a multi-stream neural network. In a previous study on the Tinto dataset, which consisted of a 3D hyperspectral point cloud from the open-pit mine Corta Atalaya in Spain, we successfully identified mineral classes by employing various deep learning models. However, this prior work solely relied on hyperspectral data as input for the deep learning models. In this study, we aim to enhance accuracy by incorporating multimodal data, which includes hyperspectral images, RGB images, and a 3D point cloud. To achieve this, we have adopted a graph-based neural network, known for its efficiency in aggregating local information, based on our past observations where it consistently performed well across different hyperspectral sensors. Subsequently, we constructed a multi-stream neural network tailored to handle multimodality. Additionally, we employed a channel attention module on the hyperspectral stream to fully exploit the spectral information within the hyperspectral data. Through the integration of multimodal data and a multi-stream neural network, we achieved a notable improvement in mineral classification accuracy: 19.2%, 4.4%, and 5.6% on the LWIR, SWIR, and VNIR datasets, respectively.
environmental sciences, geosciences, multidisciplinary, imaging science & photographic technology, remote sensing
What problem does this paper attempt to address?
This paper aims to improve accuracy in the mineral classification task by combining multimodal hyperspectral point cloud data with a multi-stream neural network. In a previous study, the authors used a 3D hyperspectral point cloud dataset (the Tinto dataset) from the Corta Atalaya open-pit mine in Spain and identified mineral classes with various deep learning models. That work, however, relied on hyperspectral data alone as model input. In this study, the authors aim to improve classification accuracy by combining multimodal data: hyperspectral images, RGB images, and 3D point clouds.

### Main Methods and Contributions

1. **Multimodal Data Fusion**:
   - The authors adopted a graph convolutional neural network (Graph-CNN), which aggregates local information efficiently and, in their earlier observations, performed consistently well across different hyperspectral sensors.
   - A multi-stream neural network was constructed, with each stream processing one modality (hyperspectral, RGB, or geometric).
   - A channel attention module was added to the hyperspectral stream to make full use of the spectral information in the hyperspectral data.
2. **Network Architecture**:
   - The network consists of two parts: the first contains multiple branches, each processing a different modality; the second is a single-stream network that aggregates the features learned in the first part.
   - The EdgeConv operator was used as the feature encoder; it performs well on hyperspectral data and is lightweight and efficient.
   - 3D convolutional neural networks (3D CNNs) were introduced in the hyperspectral stream to capture spectral patterns, thereby improving segmentation performance.
3. **Experimental Results**:
   - By integrating multimodal data and a multi-stream neural network, the authors increased mineral classification accuracy by 19.2%, 4.4%, and 5.6% on the LWIR, SWIR, and VNIR datasets, respectively.
   - Point cloud classification was compared experimentally with image classification, and point cloud classification was significantly more accurate.

### Key Contributions

1. **Advantages of Multimodal Data**:
   - Using multimodal data was shown to yield higher accuracy in mineral classification than using a single modality.
2. **Design of Multi-stream Network**:
   - The network was shown to learn the different modalities in 3D hyperspectral point clouds by being extended into a multi-stream architecture.
3. **Channel Attention Mechanism**:
   - A method of using 3D convolutional neural networks in the hyperspectral stream was introduced to capture spectral patterns, thereby improving segmentation performance.
4. **Comparison between Point Cloud and Image Classification**:
   - Point cloud classification was compared directly with image classification, highlighting the superior performance of point cloud classification.

In conclusion, this paper significantly improves the accuracy of mineral classification by combining multimodal data with a multi-stream neural network, providing new ideas and technical solutions for research in related fields.
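
The two-stage design summarized above (one EdgeConv branch per modality, channel attention on the hyperspectral branch, then a single fused stream) can be illustrated with a small pure-Python sketch. This is not the authors' implementation. It assumes untrained random weights, a static k-nearest-neighbour graph built from the XYZ coordinates, a squeeze-and-excitation style gate standing in for the channel attention module, and toy layer widths; the 3D CNN component is omitted. All function names (`knn`, `edge_conv`, `channel_attention`, `multi_stream_forward`) are hypothetical.

```python
import math
import random


def linear(vec, weight):
    """Multiply a feature vector by a weight matrix stored as a list of output rows."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weight]


def random_matrix(rng, n_out, n_in):
    """Untrained placeholder weights (assumption: a real model would learn these)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]


def knn(xyz, k):
    """Indices of the k nearest neighbours of every point, excluding the point itself."""
    neighbours = []
    for i, p in enumerate(xyz):
        others = [j for j in range(len(xyz)) if j != i]
        others.sort(key=lambda j: math.dist(p, xyz[j]))
        neighbours.append(others[:k])
    return neighbours


def edge_conv(feats, neighbours, weight):
    """EdgeConv: ReLU(W [x_i, x_j - x_i]) per edge, then channel-wise max over neighbours."""
    out = []
    for i, x_i in enumerate(feats):
        edges = []
        for j in neighbours[i]:
            diff = [b - a for a, b in zip(x_i, feats[j])]
            edges.append([max(0.0, h) for h in linear(list(x_i) + diff, weight)])
        out.append([max(channel) for channel in zip(*edges)])
    return out


def channel_attention(feats, w_reduce, w_expand):
    """Squeeze-and-excitation style gate (assumed form): one weight in (0, 1) per channel."""
    n = len(feats)
    squeezed = [sum(channel) / n for channel in zip(*feats)]  # mean over points
    hidden = [max(0.0, h) for h in linear(squeezed, w_reduce)]
    gate = [1.0 / (1.0 + math.exp(-g)) for g in linear(hidden, w_expand)]
    return [[g * v for g, v in zip(gate, row)] for row in feats], gate


def multi_stream_forward(xyz, rgb, spectra, n_classes=3, k=3, width=4, seed=0):
    """Per-point class scores from geometry, colour, and hyperspectral features."""
    rng = random.Random(seed)
    neighbours = knn(xyz, k)
    # Part 1: one EdgeConv branch per modality.
    geo = edge_conv(xyz, neighbours, random_matrix(rng, width, 2 * len(xyz[0])))
    col = edge_conv(rgb, neighbours, random_matrix(rng, width, 2 * len(rgb[0])))
    hsi = edge_conv(spectra, neighbours, random_matrix(rng, width, 2 * len(spectra[0])))
    # Channel attention is applied to the hyperspectral branch only.
    hsi, gate = channel_attention(
        hsi, random_matrix(rng, 2, width), random_matrix(rng, width, 2)
    )
    # Part 2: concatenate the branch features and process them in a single stream.
    fused = [g + c + h for g, c, h in zip(geo, col, hsi)]
    fused = edge_conv(fused, neighbours, random_matrix(rng, width, 2 * 3 * width))
    w_cls = random_matrix(rng, n_classes, width)
    return [linear(f, w_cls) for f in fused], gate


# Toy input: 8 points with XYZ coordinates, RGB colour, and a 6-band spectrum each.
data_rng = random.Random(42)
n_points = 8
xyz = [[data_rng.random() for _ in range(3)] for _ in range(n_points)]
rgb = [[data_rng.random() for _ in range(3)] for _ in range(n_points)]
spectra = [[data_rng.random() for _ in range(6)] for _ in range(n_points)]

logits, gate = multi_stream_forward(xyz, rgb, spectra)
predicted = [row.index(max(row)) for row in logits]  # arg-max class per point
print(predicted)
```

Because the weights are random, the predicted labels carry no meaning; the sketch only shows how per-modality features could be encoded separately, re-weighted per channel on the hyperspectral branch, and then concatenated for a shared classifier.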