CN-121997005-A - LIBS (laser induced breakdown spectroscopy) spectral feature optimization and machine learning based method for predicting element content in coal
Abstract
The invention relates to a method for predicting element content in coal based on LIBS spectral feature optimization and machine learning. A preprocessed spectrum is converted into a two-dimensional spectral matrix; the peak wavelength points of this matrix are matched against the NIST standard atomic spectrum database to identify spectral lines belonging to the target elements, and the set of characteristic wavelengths that appear stably in the spectra of all coal samples is retained as candidate features. Several feature selection methods are then applied to screen out a core characteristic wavelength subset that is most relevant to the target element content and has minimum redundancy, and several machine learning regression algorithms are trained and compared to select the optimal model combination. Through this two-stage screening strategy, which combines physical spectral line identification (NIST matching) with statistical feature selection, irrelevant and redundant spectral information is removed as far as possible, core features that are directly related to the target element content and resistant to interference are retained, and the quality of the model input data is improved at the source.
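The first screening stage described in the abstract (matching peaks to reference lines, then keeping only lines found in every sample) might be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the 0.05 nm matching tolerance, and the synthetic peak lists are all assumptions, and the small reference table is hand-entered for illustration and should be replaced by values looked up in the NIST database.

```python
# Illustrative reference lines in nm (assumed, hand-entered table);
# a real pipeline would query the NIST atomic spectrum database instead.
REFERENCE_LINES_NM = {
    "C I": 247.856,
    "Si I": 288.158,
    "Ca II": 393.366,
    "Al I": 396.152,
}


def match_peaks(peaks_nm, reference=REFERENCE_LINES_NM, tol_nm=0.05):
    """Map each reference line to its nearest detected peak, if within tol_nm."""
    matched = {}
    for label, ref_nm in reference.items():
        # Nearest detected peak to this reference line (None if no peaks).
        nearest = min(peaks_nm, key=lambda p: abs(p - ref_nm), default=None)
        if nearest is not None and abs(nearest - ref_nm) <= tol_nm:
            matched[label] = nearest
    return matched


def stable_lines(per_sample_peaks, **kwargs):
    """Return the line labels that were matched in every sample's spectrum."""
    matched_sets = [set(match_peaks(p, **kwargs)) for p in per_sample_peaks]
    if not matched_sets:
        return []
    return sorted(set.intersection(*matched_sets))


# Synthetic peak positions (nm) for three hypothetical coal samples.
samples = [
    [247.86, 288.15, 393.37, 396.16, 500.00],
    [247.85, 288.17, 393.36, 410.20],  # no peak near the Al I line
    [247.88, 288.16, 393.40, 396.15],
]
candidates = stable_lines(samples)
print(candidates)  # ['C I', 'Ca II', 'Si I']
```

The Al I line is dropped because the second sample has no peak within tolerance of it, which is the "appears stably in all coal sample spectra" criterion in its simplest form. The intensities at the surviving wavelengths would then feed the second, statistical selection stage.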
Inventors
- LIU CONG
- QIU JINBO
- DAI JIANPING
- SHI CHUNXIANG
- JIANG ZIYONG
- LU JIANGFENG
- CHEN SHAOHUA
- ZHAO ZONGPING
- PENG ANZHAI
- HE YONG
- WU HAOKUN
- MAO JIHUA
- ZHU SHENGQIANG
- WANG ZHIHUA
- WENG WUBIN
- LIU SIYU
- ZHONG RUIYING
Assignees
- 中煤科工集团上海有限公司
- 浙江大学
Dates
- Publication Date
- 20260508
- Application Date
- 20260410
Claims (10)
- 1. A method for predicting element content in coal based on LIBS spectral feature optimization and machine learning, the method comprising the steps of: collecting an original LIBS spectrum of a coal sample to be measured, preprocessing the original spectrum to obtain a preprocessed spectrum, and converting the preprocessed spectrum into a two-dimensional spectral matrix; matching the peak wavelength points of the two-dimensional spectral matrix against the NIST standard atomic spectrum database, extracting the wavelength features of key elements, identifying spectral lines belonging to target elements, and screening out the set of characteristic wavelengths that appear stably in the spectra of all coal samples as candidate features; taking the intensity data of the candidate characteristic wavelength set as input variables and the corresponding known target element concentrations as output variables, and applying a plurality of feature selection methods to further screen out a core characteristic wavelength subset that is most relevant to the target element content and has minimum redundancy; and training a plurality of machine learning regression algorithms, each on the screened core characteristic wavelength subsets, to establish target element content prediction models, optimizing hyperparameters, comparing the performance indexes of each model combination, and selecting the model combination with the best prediction accuracy and robustness.
- 2. The method of claim 1, wherein the target element is an element in coal, the method further comprising obtaining a reference value for the target element content by a chemical analysis method, the reference value being used for model training and performance evaluation; and/or the coal samples to be measured are a plurality of coal samples with different target element contents, obtained by grinding and sieving a plurality of different coals and mixing them in different proportions.
- 3. The method according to claim 1, wherein the preprocessing comprises the steps of: denoising the spectrum with a wavelet transform algorithm; performing baseline correction on the denoised spectrum using the adaptive iteratively reweighted penalized least squares (airPLS) method; and applying Z-score standardization to the baseline-corrected spectrum, so that each feature dimension has a mean of 0 and a standard deviation of 1.
- 4. The method of claim 3, wherein the optimal parameters of the wavelet transform algorithm are a decomposition level of 1 and the Db5 basis function.
- 5. The method of claim 1, wherein the feature selection method comprises an L1 regularization algorithm, a successive projections algorithm, or a minimum redundancy maximum relevance algorithm.
- 6. The method of claim 5, wherein the machine learning regression algorithm comprises an artificial neural network, support vector machine regression, random forest, gradient boosting decision tree, partial least squares regression, or Gaussian process regression.
- 7. The method of claim 6, wherein the hyperparameters are optimized using cross-validation, and wherein each model is optimized and selected in combination with each feature selection method.
- 8. The method of claim 7, wherein model performance is assessed using at least one of the coefficient of determination, the cross-validation root mean square error, the prediction root mean square error, or the mean absolute error.
- 9. The method of claim 1, further comprising, before model training with the machine learning regression algorithms, dividing the coal samples into a training set and a test set according to the concentration gradient of the target element.
- 10. The method according to claim 1, further comprising, for the LIBS spectrum of a sample to be measured with unknown target element concentration, extracting the intensity information of the core characteristic wavelength subset corresponding to the optimal model and inputting it into the trained optimal model, which outputs a predicted value of the target element content.
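The claims name a concentration-gradient split (claim 9) and a set of evaluation metrics (claim 8) without giving formulas. The sketch below shows one plausible reading, under stated assumptions: the helper names `gradient_split` and `regression_metrics` are hypothetical, the "every fourth ranked sample goes to the test set" rule is only one way to follow a concentration gradient, and the metrics use their standard textbook definitions (coefficient of determination, root mean square error, mean absolute error). The numbers are synthetic.

```python
import math


def gradient_split(concentrations, test_every=4):
    """Rank samples by concentration; every `test_every`-th one is held out.

    Assumed interpretation of a concentration-gradient split: both sets
    then span the full concentration range. Returns (train, test) indices.
    """
    order = sorted(range(len(concentrations)), key=lambda i: concentrations[i])
    train, test = [], []
    for rank, idx in enumerate(order):
        (test if rank % test_every == test_every - 1 else train).append(idx)
    return train, test


def regression_metrics(y_true, y_pred):
    """Standard R^2, RMSE and MAE for paired reference/predicted values."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n,
    }


# Synthetic target-element concentrations for eight hypothetical samples.
conc = [5.0, 1.0, 3.0, 7.0, 2.0, 8.0, 4.0, 6.0]
train_idx, test_idx = gradient_split(conc)
print(train_idx, test_idx)  # [1, 4, 2, 0, 7, 3] [6, 5]

# Synthetic reference values and model predictions.
scores = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 2.5, 4.0])
print(scores)
```

Computing RMSE on cross-validation folds gives the cross-validation error of claim 8, and computing it on the held-out test set gives the prediction error. Repeating this for each pairing of feature selection method and regression algorithm yields the comparison from which the best combination would be chosen.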
Description
LIBS (laser induced breakdown spectroscopy) spectral feature optimization and machine learning based method for predicting element content in coal
Technical Field
The invention relates to the technical field of spectral analysis, chemometrics, and machine learning applications, and in particular to a method for predicting element content in coal based on LIBS spectral feature optimization and machine learning.
Background
Coal deposits in China are widely distributed, the coal types are complex and variable, and coals from different producing areas differ markedly in element composition, component content, and other properties. Many power station boilers cannot burn their design coal consistently, and more and more thermal power generation enterprises need to burn blended coal or change coal types frequently, which seriously affects the safe and stable operation of the boilers. Rapid and accurate information on the content of the various elements in coal is therefore of great significance for clean and efficient operation in industries such as coal-fired power generation and the coal chemical industry. At present, elemental analysis of coal relies mainly on standard laboratory methods such as X-ray fluorescence spectroscopy (XRF), atomic absorption spectroscopy (AAS), and inductively coupled plasma mass spectrometry (ICP-MS). Although these methods are highly accurate, they have inherent drawbacks: complex sample preparation, long analysis cycles (from hours to days), high cost, and no capability for on-site real-time analysis. They therefore struggle to meet modern industry's real-time requirements for process control and rapid screening of raw materials.
Laser-induced breakdown spectroscopy (LIBS) is an ideal choice for industrial online detection because it offers simple sample preparation, rapid detection, in situ measurement, and simultaneous analysis of all elements. However, differences in the physical properties (e.g., particle size, density, surface morphology) and chemical composition (e.g., volatile matter, ash, and fixed carbon content) of coal samples significantly affect the coupling efficiency between the laser and the sample as well as the formation and evolution of the plasma. As a result, the same element content can produce very different spectral signal intensities in different coal samples, and this strong matrix effect causes quantitative analysis errors. In addition, a full LIBS spectrum contains thousands of wavelength data points, but only a few characteristic wavelengths are both truly related to the content of a specific target element and stable. How to automatically and accurately mine, from massive high-dimensional spectral data, the set of characteristic wavelengths that is most relevant to the content of each target element and most robust has therefore become a core challenge in building a high-precision quantitative model. Although some feature selection methods have been studied for optimizing LIBS quantitative models, they were mostly developed on standard samples with relatively simple composition or controllable matrix effects, and they do not adequately cope with the complex and variable sample matrices found in actual coal production. Models built this way therefore suffer marked performance degradation and insufficient generalization when applied in industry across batches and coal types.
In view of the foregoing, there is a strong need in the art for a data analysis method that can effectively cope with the complex matrix effects of actual coal samples. Such a method should screen out characteristic wavelengths with strong correlation and high robustness from the LIBS spectra of actual coal samples and establish a matching prediction model with high accuracy and generalization capability, so as to move LIBS technology from the laboratory to the industrial field.
Disclosure of Invention
In view of the above, the present invention provides a method for predicting element content in coal based on LIBS spectral feature optimization and machine learning, which solves or at least alleviates one or more of the above problems and other problems of the prior art. To achieve this purpose, the invention adopts the following technical scheme: a method for predicting element content in coal based on LIBS spectral feature optimization and machine learning, comprising the following steps:
(1) Collecting an original LIBS spectrum of a coal sample to be measured, preprocessing the original spectrum to obtain a preprocessed spectrum, and converting the preprocessed spectrum into a two-dimensional spectral matrix;
(2) Matching the peak wavelength point of the two-dimensional spectrum matr