Quantitative Analysis of Coal Properties Using Laser-Induced Breakdown Spectroscopy and Semi-Supervised Learning

Wang An, Cui Jia-cheng, Song Wei-ran, Hou Zong-yu, Chen Xiang, Chen Fei
DOI: https://doi.org/10.3964/j.issn.1000-0593(2024)07-1940-06
2024-01-01
Spectroscopy and Spectral Analysis
Abstract: Laser-induced breakdown spectroscopy (LIBS) is an emerging atomic spectroscopy technique that requires little sample pretreatment and enables rapid, in situ, simultaneous multi-element measurement. LIBS shows good prospects for coal analysis. In recent years, chemometric and machine learning models have been widely used to improve the quantitative accuracy of LIBS in coal analysis. These models generally rely on a sufficient number of training samples to ensure reliable predictions. However, obtaining the certified content (label information) of the coal samples used for model training requires traditional chemical analysis, which is complex and time-consuming. This can lead to insufficient training samples and poor model performance. To tackle the small-sample problem in LIBS-based coal analysis, this work proposes a semi-supervised learning method based on an ensemble of multiple models. Five baseline models are first established on the initial training set: multiple linear regression (MLR), partial least squares regression (PLSR), locally weighted partial least squares regression (LW-PLSR), support vector regression (SVR), and kernel extreme learning machine (K-ELM). The unlabelled data are processed with the five models, giving five predicted values for each unlabelled sample. The standard deviation of the five predictions is calculated for each unlabelled sample, and the sample with the smallest standard deviation is added to the training set, with the average of the five predictions as its pseudo label. As the training set is iteratively expanded, the model trained on it is updated. The final model is optimized and used to analyse the test samples. The proposed method is tested on a coal dataset containing 20 training samples, 39 test samples, and 280 unlabelled samples. The results show that the proposed method improves the coefficient of determination (R²) for predicting fixed carbon, ash, and volatile matter content by 0.033, 0.102, and 0.118, respectively. Therefore, when the number of training samples is insufficient, semi-supervised learning can effectively improve the accuracy and reliability of LIBS quantification.
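
The pseudo-labelling loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The three toy one-dimensional models (`fit_mean`, `fit_line`, `fit_nearest`) are hypothetical stand-ins for MLR, PLSR, LW-PLSR, SVR, and K-ELM. The function name `self_train` and the fixed round count `n_rounds` are also assumptions, since the abstract does not state a stopping rule. The population standard deviation is used here, which is another assumption, although it ranks samples the same way as the sample standard deviation when every sample has the same number of predictions.

```python
from statistics import mean, pstdev


def fit_mean(X, y):
    """Toy model (assumption): always predicts the mean training label."""
    m = mean(y)
    return lambda x: m


def fit_line(X, y):
    """Toy model (assumption): least-squares line through 1-D features."""
    mx, my = mean(X), mean(y)
    sxx = sum((x - mx) ** 2 for x in X)
    sxy = sum((x - mx) * (t - my) for x, t in zip(X, y))
    slope = sxy / sxx if sxx else 0.0
    return lambda x: my + slope * (x - mx)


def fit_nearest(X, y):
    """Toy model (assumption): label of the nearest training sample."""
    pairs = list(zip(X, y))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


def self_train(X_train, y_train, X_unlabelled, fitters, n_rounds):
    """Grow the training set one pseudo-labelled sample per round."""
    X, y, pool = list(X_train), list(y_train), list(X_unlabelled)
    for _ in range(min(n_rounds, len(pool))):
        # Refit every model on the current (expanded) training set.
        models = [fit(X, y) for fit in fitters]
        # One list of predictions (one per model) for each unlabelled sample.
        preds = [[model(x) for model in models] for x in pool]
        # Disagreement between models, measured by the standard deviation.
        spreads = [pstdev(p) for p in preds]
        best = min(range(len(pool)), key=spreads.__getitem__)
        # Move the most agreed-upon sample into the training set;
        # its pseudo label is the average of the model predictions.
        X.append(pool.pop(best))
        y.append(mean(preds[best]))
    return X, y


if __name__ == "__main__":
    X_new, y_new = self_train(
        [0.0, 1.0, 2.0], [0.0, 2.0, 4.0],   # labelled samples
        [1.0, 10.0],                         # unlabelled samples
        [fit_mean, fit_line, fit_nearest],
        n_rounds=1,
    )
    print(X_new, y_new)
```

In this toy run the unlabelled sample at 1.0 is selected first because all three models agree on it, whereas they disagree strongly on the sample at 10.0. The final optimization step mentioned in the abstract is not covered by the sketch.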