Spectral Image Data Fusion for Multisource Data Augmentation

Roberta Iuliana Luca, Alexandra Baicoianu, Ioana Cristina Plajer
2024-04-05
Abstract: Multispectral and hyperspectral images are increasingly popular in different research fields, such as remote sensing, astronomical imaging, or precision agriculture. However, the amount of free data available to perform machine learning tasks is relatively small. Moreover, artificial intelligence models developed in the area of spectral imaging require input images with a fixed spectral signature, expecting the data to have the same number of spectral bands or the same spectral resolution. This requirement significantly reduces the number of sources that can be used for a given model. The scope of this study is to introduce a methodology for spectral image data fusion, in order to allow machine learning models to be trained and/or used on data from a larger number of sources, thus providing better generalization. For this purpose, we propose different interpolation techniques to make multisource spectral data compatible with each other. The interpolation outcomes are evaluated through various approaches. These include direct assessments using surface plots and metrics such as a Custom Mean Squared Error (CMSE) and the Normalized Difference Vegetation Index (NDVI). Additionally, indirect evaluation is done by estimating the impact of the interpolation on machine learning model training, particularly for semantic segmentation.
Computer Vision and Pattern Recognition, Numerical Analysis
What problem does this paper attempt to address?
This paper addresses the fusion of multisource spectral image data in order to increase the amount of data available for machine learning. Although multispectral and hyperspectral images are widely used in remote sensing, astronomical imaging, and precision agriculture, the free data available for machine learning tasks is relatively limited. In addition, artificial intelligence models for spectral imaging usually require input images with a fixed spectral signature, that is, the same number of spectral bands or the same spectral resolution, which restricts the data sources a given model can use.
The objective of the research is to propose a methodology for spectral image data fusion that allows machine learning models to be trained and applied on data from multiple sources, thereby improving generalization. To this end, the paper proposes different interpolation techniques that make spectral data from different sources compatible with each other. The interpolation results are evaluated directly, using surface plots and metrics such as a Custom Mean Squared Error (CMSE) and the Normalized Difference Vegetation Index (NDVI), and indirectly, by estimating their impact on machine learning model training, especially for semantic segmentation.
The authors use several publicly available multispectral and hyperspectral image datasets and fuse them through interpolation. The fused datasets are then used to train and test neural networks for semantic segmentation in order to validate the approach. The results show that the proposed fusion method can be used to create larger image collections for different machine learning algorithms and that it has a positive impact on model performance. In short, the paper addresses the scarcity of compatible spectral image data by expanding the usable dataset through data fusion, with the aim of improving the generalization and performance of machine learning models in tasks such as semantic segmentation.
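To make the idea of interpolation-based compatibility concrete, the following is a minimal sketch rather than the authors' implementation. It assumes plain linear interpolation along the spectral axis (the summary does not say which interpolation techniques the paper uses) and computes NDVI with its standard definition, (NIR - Red) / (NIR + Red). The function names `resample_bands` and `ndvi`, the `(height, width, bands)` array layout, and the default red and near-infrared wavelengths of 660 nm and 850 nm are all hypothetical choices made for illustration.

```python
import numpy as np


def resample_bands(cube, src_wavelengths, dst_wavelengths):
    """Linearly interpolate every pixel spectrum onto a new set of wavelengths.

    cube: array of shape (height, width, n_src_bands).
    src_wavelengths: increasing band centers of the source sensor (nm).
    dst_wavelengths: band centers expected by the target model (nm).
    Note: np.interp clamps to the edge values outside the source range,
    so target wavelengths should lie within the source coverage.
    """
    src = np.asarray(src_wavelengths, dtype=float)
    dst = np.asarray(dst_wavelengths, dtype=float)
    height, width, n_bands = cube.shape
    if n_bands != src.size:
        raise ValueError("cube band count must match src_wavelengths")

    # Treat the image as a list of per-pixel spectra.
    spectra = cube.reshape(-1, n_bands).astype(float)
    out = np.empty((spectra.shape[0], dst.size))
    for i, spectrum in enumerate(spectra):
        out[i] = np.interp(dst, src, spectrum)
    return out.reshape(height, width, dst.size)


def ndvi(cube, wavelengths, red_nm=660.0, nir_nm=850.0, eps=1e-12):
    """NDVI = (NIR - Red) / (NIR + Red), using the bands nearest to
    the assumed red and near-infrared wavelengths."""
    wl = np.asarray(wavelengths, dtype=float)
    red = cube[..., np.argmin(np.abs(wl - red_nm))]
    nir = cube[..., np.argmin(np.abs(wl - nir_nm))]
    # eps guards against division by zero on dark pixels.
    return (nir - red) / (nir + red + eps)


if __name__ == "__main__":
    # Synthetic 5-band image resampled to a hypothetical 3-band layout.
    source_bands = [450.0, 550.0, 650.0, 750.0, 850.0]
    target_bands = [500.0, 660.0, 850.0]
    image = np.random.default_rng(0).random((4, 4, len(source_bands)))

    fused = resample_bands(image, source_bands, target_bands)
    print(fused.shape)                      # (4, 4, 3)
    print(ndvi(fused, target_bands).shape)  # (4, 4)
```

Comparing NDVI maps computed before and after resampling is one way such an index could serve as a direct check on interpolation quality. The per-pixel loop keeps the sketch readable; a production version would likely vectorize the interpolation.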