Revealing Geochemical Patterns Associated with Mineralization Using t-Distributed Stochastic Neighbor Embedding and Random Forest
Zixian Shi, Renguang Zuo, Yihui Xiong, Siquan Sun, Bao Zhou
DOI: https://doi.org/10.1007/s11004-022-10024-y
2022-10-06
Mathematical Geosciences
Abstract: The identification of multivariate geochemical anomalies is critical in mineral exploration. Machine learning algorithms have been successfully employed to recognize multivariate geochemical anomalies in support of mineral exploration, owing to their strong ability to learn the complex relationship between geochemical characteristics and mineralization. However, applications of machine learning algorithms suffer from data redundancy and the curse of dimensionality. In this study, a hybrid model combining t-distributed stochastic neighbor embedding (t-SNE) and random forest (RF) was used to address these problems in geochemical mapping for gold exploration in northwestern Hubei Province, China. Specifically, t-SNE was used for dimension reduction and feature extraction from the major and trace elements of geochemical survey data, and RF was used for probabilistic classification of geochemical patterns related to gold deposits. A comparative study demonstrated that the hybrid t-SNE + RF model generalizes better than principal component analysis (PCA) + RF and pure RF: over 15 experiments, the mean area under the receiver operating characteristic curve (AUC) values of t-SNE + RF, PCA + RF, and pure RF were 0.83, 0.65, and 0.75, respectively. These results suggest that the hybrid model combining t-SNE and RF can more efficiently recognize geochemical anomalies associated with gold mineralization. Compared with PCA, t-SNE can more effectively identify hidden information in complex and nonlinear geochemical survey data. In addition, it can reduce information redundancy and further improve the efficiency of RF in processing multidimensional geochemical survey data. The high-probability areas obtained by t-SNE + RF showed a strong spatial correlation with known gold deposits, which can provide critical clues for further prospecting in the study area.
geosciences, multidisciplinary; mathematics, interdisciplinary applications
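
The two-stage workflow described in the abstract (t-SNE for dimension reduction, then RF for probabilistic classification, scored by mean AUC over repeated experiments) could be sketched roughly as follows with scikit-learn. This is a minimal illustration under stated assumptions, not the authors' implementation: the synthetic data, the log-transform and standardization step, the perplexity, the number of trees, the 70/30 split, and the helper names `make_synthetic_survey` and `tsne_rf_mean_auc` are all hypothetical and do not come from the paper.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.manifold import TSNE
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler


def make_synthetic_survey(n_samples=300, n_elements=20, seed=0):
    """Hypothetical stand-in for a geochemical survey table (not the paper's data)."""
    rng = np.random.default_rng(seed)
    # Strictly positive, skewed "concentrations" for each element.
    X = rng.lognormal(mean=0.0, sigma=1.0, size=(n_samples, n_elements))
    y = np.zeros(n_samples, dtype=int)
    y[: n_samples // 3] = 1      # label the first third as "mineralized"
    X[y == 1, :5] *= 3.0         # enrich five elements in the labelled samples
    return X, y


def tsne_rf_mean_auc(X, y, n_components=2, n_runs=15, seed=0):
    """Embed with t-SNE, then train and score an RF on repeated random splits."""
    # Assumed preprocessing: log-transform, then standardize each element.
    Z = StandardScaler().fit_transform(np.log(X))
    # scikit-learn's TSNE cannot transform unseen samples, so all samples are
    # embedded together (without labels) before any train/test split.
    embedding = TSNE(
        n_components=n_components, perplexity=30.0, init="pca", random_state=seed
    ).fit_transform(Z)
    aucs = []
    for run in range(n_runs):
        X_tr, X_te, y_tr, y_te = train_test_split(
            embedding, y, test_size=0.3, stratify=y, random_state=seed + run
        )
        rf = RandomForestClassifier(n_estimators=200, random_state=seed + run)
        rf.fit(X_tr, y_tr)
        # Probability of the positive ("mineralized") class for each test sample.
        prob = rf.predict_proba(X_te)[:, 1]
        aucs.append(roc_auc_score(y_te, prob))
    return float(np.mean(aucs)), embedding
```

Calling `tsne_rf_mean_auc(*make_synthetic_survey())` returns the mean AUC over 15 splits together with the two-dimensional embedding. Swapping `TSNE` for `sklearn.decomposition.PCA`, or passing `Z` straight to the forest, would give the PCA + RF and pure RF baselines that the abstract compares against.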