Application of Genetic Algorithm in Eigenvalue Extraction of Infrared Spectrum and Model Optimization

Li Yujun, Tang Xiaojun, Liu Junhua
DOI: https://doi.org/10.1109/icemi.2011.6037886
2011-01-01
Abstract: Because the infrared absorption bands of the components of a gas mixture overlap severely and the spectrum data contain a very large number of variables, the least squares support vector machine (LS-SVM) is introduced to build quantitative analysis models for gas mixtures. The mixtures consist of methane, ethane, and propane, with concentration ranges of 1% to 10%, 1% to 10%, and 1% to 12.5%, respectively. To avoid the saturated region of the absorption spectrum, 551 spectral variables close to the secondary absorption wavelengths are selected. To reduce the computational complexity and improve the accuracy of the prediction models, a genetic algorithm (GA) is proposed to extract the eigenvalues of the infrared spectrum data, and this method is compared with principal component analysis (PCA). The eigenvalues extracted by each of the two methods are used as the model inputs, and the calibrated concentration of each component gas is used as the expected output; a separate quantitative analysis model is built with LS-SVM for each component. Because the hyper-parameters of LS-SVM, i.e. the penalty factor and the kernel parameter, have a great impact on the accuracy of the models, GA is applied a second time to optimize them. The optimal regression models are then built with the optimal hyper-parameters. The experimental results show that, when the 12 eigenvalues extracted by GA are used as the input variables of each component's optimal regression model, the average mean square error (MSE) of the three prediction models is about 3.57E-6, compared with about 1.13E-5 for the models built on the 5 eigenvalues extracted by PCA, a reduction of roughly a factor of three. Combining GA with LS-SVM to extract the eigenvalues of infrared spectrum data is therefore feasible; it improves the accuracy of the prediction models and leaves room for further development.
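The workflow summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the kernel, the chromosome encoding, or the GA operators, so the RBF kernel, the binary feature mask, tournament selection, single-point crossover, and bit-flip mutation used here are all assumptions, as are the function names and the synthetic data. The hyper-parameters `gamma` (penalty factor) and `sigma` (kernel width) are held fixed for brevity, whereas the paper tunes them with a second GA pass.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    # Pairwise squared Euclidean distances between rows of A and rows of B.
    d2 = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma**2))

def lssvm_fit(X, y, gamma, sigma):
    # LS-SVM regression reduces to one linear system:
    # [[0, 1^T], [1, K + I/gamma]] [b, alpha]^T = [0, y]^T
    n = len(y)
    M = np.zeros((n + 1, n + 1))
    M[0, 1:] = 1.0
    M[1:, 0] = 1.0
    M[1:, 1:] = rbf_kernel(X, X, sigma) + np.eye(n) / gamma
    sol = np.linalg.solve(M, np.concatenate(([0.0], y)))
    return sol[0], sol[1:]  # bias b, dual weights alpha

def lssvm_predict(X_train, b, alpha, X_new, sigma):
    return rbf_kernel(X_new, X_train, sigma) @ alpha + b

def fitness(mask, Xtr, ytr, Xva, yva, gamma, sigma):
    # Validation MSE of an LS-SVM trained on the selected columns only.
    if not mask.any():
        return np.inf  # an empty feature set is never acceptable
    b, alpha = lssvm_fit(Xtr[:, mask], ytr, gamma, sigma)
    pred = lssvm_predict(Xtr[:, mask], b, alpha, Xva[:, mask], sigma)
    return float(np.mean((pred - yva) ** 2))

def ga_select(Xtr, ytr, Xva, yva, gamma=100.0, sigma=1.0,
              pop_size=20, generations=15, p_mut=0.05, seed=0):
    rng = np.random.default_rng(seed)
    n_feat = Xtr.shape[1]
    pop = rng.random((pop_size, n_feat)) < 0.5  # random binary masks
    history = []
    for g in range(generations):
        scores = np.array([fitness(m, Xtr, ytr, Xva, yva, gamma, sigma)
                           for m in pop])
        pop = pop[np.argsort(scores)]  # fittest (lowest MSE) first
        history.append(float(scores.min()))
        if g == generations - 1:
            break
        children = [pop[0].copy()]  # elitism: keep the best mask unchanged
        while len(children) < pop_size:
            # Tournament of size 2; the population is sorted, so the
            # smaller index is the fitter individual.
            p1 = pop[rng.integers(0, pop_size, 2).min()]
            p2 = pop[rng.integers(0, pop_size, 2).min()]
            cut = rng.integers(1, n_feat)  # single-point crossover
            child = np.concatenate((p1[:cut], p2[cut:]))
            child ^= rng.random(n_feat) < p_mut  # bit-flip mutation
            children.append(child)
        pop = np.array(children)
    return pop[0], history

# Synthetic stand-in for spectral data: only columns 0 and 3 carry signal.
data_rng = np.random.default_rng(1)
X = data_rng.normal(size=(60, 8))
y = np.sin(X[:, 0]) + 0.5 * X[:, 3]
best_mask, history = ga_select(X[:40], y[:40], X[40:], y[40:])
print("selected columns:", np.flatnonzero(best_mask))
print("best validation MSE:", history[-1])
```

Because the fittest mask is carried over unchanged in each generation, the best validation MSE recorded in `history` cannot increase from one generation to the next. With real spectra, the 551 selected variables would take the place of the 8 synthetic columns, and one model would be fitted per component gas.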