CN-121997011-A - Microplastic infrared characteristic spectrum extraction and efficient and accurate identification method

Abstract

The invention belongs to the field of environmental protection and microplastic identification and discloses a method for extracting the infrared characteristic spectrum of microplastics and identifying them efficiently and accurately. The method comprises the following steps. Step 1: acquire infrared spectrum data of microplastic samples and construct a microplastic spectrum database. Step 2: extract the characteristic spectrum from the infrared spectrum data of step 1 with a progressive two-step method that combines equal interval sampling with the competitive adaptive reweighted sampling algorithm. Step 3: apply a standard normal transformation to the transmittance of the characteristic spectrum from step 2. Step 4: construct a feature training set and a feature test set, taking the transformed infrared transmittance from step 3 as input and the microplastic type as output. Step 5: for a microplastic to be identified, acquire its spectral data as in step 1, obtain its characteristic transmittance at the characteristic wavenumbers extracted in step 2, apply the standard normal transformation of step 3, and input the transformed infrared transmittance into the microplastic identification artificial neural network model, which outputs the microplastic type and thus identifies the microplastic. The method avoids traversal sampling of the whole spectral data set, which improves the efficiency of characteristic spectrum extraction. It also uses a genetic algorithm to optimize the hyperparameters of the artificial neural network model, which effectively prevents overfitting and improves the generalization ability of the model.

Inventors

  • ZHANG SHUANGSHENG
  • QIANG JING
  • LIU HANHU
  • ZHOU JUNJIE

Assignees

  • Xuzhou University of Technology (徐州工程学院)

Dates

Publication Date
20260508
Application Date
20240402

Claims (6)

  1. A method for extracting the infrared characteristic spectrum of microplastics and identifying them efficiently and accurately, characterized by comprising the following steps:
     Step 1: acquiring infrared spectrum data of microplastic samples and constructing a microplastic spectrum database;
     Step 2: extracting the characteristic spectrum from the infrared spectrum data of step 1 with a progressive two-step method that combines equal interval sampling (EIS) with the competitive adaptive reweighted sampling (CARS) algorithm;
     Step 3: applying a standard normal transformation to the transmittance of the characteristic spectrum from step 2;
     Step 4: constructing a feature training set and a feature test set, taking the transformed infrared transmittance from step 3 as input and the microplastic type as output; training an artificial neural network (ANN) model with the feature training set, and optimizing the hyperparameters of the model with a genetic algorithm on the feature test set, to form the final microplastic identification ANN model;
     Step 5: for a microplastic to be identified, acquiring its spectral data as in step 1, obtaining its characteristic transmittance at the characteristic wavenumbers extracted in step 2, applying the standard normal transformation of step 3, and inputting the transformed infrared transmittance into the microplastic identification ANN model, which gives the microplastic type and thus identifies the microplastic.
  2. The method for extracting the infrared characteristic spectrum of microplastics and identifying them efficiently and accurately according to claim 1, characterized in that, in step 1, the infrared spectrum data of the microplastics are acquired with a Fourier transform infrared (FTIR) spectrometer, and the number of scans for each microplastic sample is determined through literature review and preliminary experiments.
  3. The method for extracting the infrared characteristic spectrum of microplastics and identifying them efficiently and accurately according to claim 1, characterized in that the characteristic spectrum is extracted in step 2 in two stages: (1) EIS of the original spectrum, and (2) characteristic spectrum extraction based on the CARS algorithm.
     (1) The EIS of the original spectrum proceeds as follows. The abscissa wavenumber vector of the original infrared spectrum is denoted r and the ordinate transmittance vector is denoted x, where each element r_i of r is a wavenumber point on the abscissa of the infrared spectrum (cm^-1) and x_i is the transmittance corresponding to r_i (%). When wavenumbers are taken at intervals, let the interval value be N: the first wavenumber point r_1 is the starting point, the point N positions after it, r_(1+N), is the second wavenumber point, and points are taken in this way in sequence to construct a new wavenumber vector r'. To determine the optimal interval value N, a partial least squares (PLS) model is established on the microplastic spectrum database constructed in step 1 (c groups of samples), with the transmittance corresponding to r' as input and the microplastic type as output. k-fold cross-validation is performed, and the optimal interval value is determined with the root mean square error of cross-validation (RMSECV) as the evaluation index. The interval transmittance vector at the RMSECV minimum is denoted x' = [x'_1, x'_2, …, x'_q].
     k-fold cross-validation means that the c groups of sample data are randomly and evenly divided into k subsets (denoted C_1, C_2, …, C_k); each subset in turn is selected as the test set and the remaining subsets serve as the training set, so that k PLS models are built. The RMSECV of the k-fold cross-validation is computed from the model outputs and the correct outputs over the k test sets: when the test set is C_i (i = 1, 2, …, k), the model output is denoted (u_1^i, u_2^i, …, u_(c/k)^i)^T and the correct output is denoted (d_1^i, d_2^i, …, d_(c/k)^i)^T, where u_j^i and d_j^i are both m-dimensional column vectors.
     (2) The characteristic spectrum extraction based on the CARS algorithm proceeds as follows:
       ① Weight initialization: based on the transmittance x' determined in (1), a PLS model is built, and the absolute values of the regression coefficients of the transmittance variables in x' are taken as the initial weights.
       ② Feature selection loop: in each cycle, the following operations are performed:
          a. EDF sampling: for the current weights, an exponentially decreasing function (EDF) is used to eliminate the transmittance variables whose weights have small absolute values.
          b. ARS sampling: on the EDF sampling result, characteristic transmittances are selected with adaptive reweighted sampling (ARS) based on Monte Carlo sampling (MCS).
          c. PLS modeling: a PLS model is built with the characteristic transmittances after ARS sampling as input and the microplastic type as output.
       ③ Result output: the number of iterations is set, steps a, b and c are repeated, and the transmittance combination of the model with the minimum RMSECV is taken as the final characteristic transmittance variables. The characteristic wavenumbers at the RMSECV minimum are denoted r'' = [r''_1, r''_2, …, r''_n], and the characteristic transmittance vector is denoted x'' = [x''_1, x''_2, …, x''_n].
  4. The method for extracting the infrared characteristic spectrum of microplastics and identifying them efficiently and accurately according to claim 1, characterized in that the standard normal transformation of the characteristic transmittance vector x'' = [x''_1, x''_2, …, x''_n] in step 3 is given by:
     x = (x'' − μ)/σ
     where x is the transformed characteristic transmittance vector, μ is the mean of x'', and σ is the standard deviation of x''. After the transformation, x has a mean of 0 and a standard deviation of 1. After this treatment, the characteristic transmittances of different samples have similar distribution characteristics, which facilitates comparison and analysis.
  5. The method for extracting the infrared characteristic spectrum of microplastics and identifying them efficiently and accurately according to claim 1, characterized in that the ratio of the number of samples in the feature training set to that in the feature test set in step 4 is generally between 1:2 and 8:2 and is determined according to the amount of sample data. The ANN model includes a single-layer neural network (SNN) model, a multi-layer neural network (MNN) model, and a convolutional neural network (CNN) model. The hyperparameters of the ANN model are optimized with a genetic algorithm in the following steps:
     ① screening a feature training set and a feature test set from the original training set and test set, respectively, based on the characteristic wavenumbers r'', and applying the standard normal transformation;
     ② building an ANN model with the feature training set;
     ③ determining the model hyperparameters to be optimized according to the ANN model, taking them as the optimization variable z, and giving a value range [A, B];
     ④ establishing the optimization model: max f(z), subject to z ∈ [A, B], where f(z) is the correct identification rate of the model on the feature test set when the ANN model hyperparameters take the value z;
     ⑤ solving the optimization model of ④ with a genetic algorithm to obtain the optimal solution z*;
     ⑥ taking z* as the hyperparameters of the ANN model and building the optimized ANN model with the feature training set.
  6. The method for extracting the infrared characteristic spectrum of microplastics and identifying them efficiently and accurately according to claim 1, characterized in that the identification performance evaluation indices of the ANN model in step 5 comprise the root mean square error (RMSE) and the correct identification rate (Accuracy). On both the training set and the test set, RMSE is used to evaluate the precision of the ANN model, and Accuracy is used to evaluate its identification performance. Taking the training set as an example, the indices are calculated from the following quantities: Y = (y_1, y_2, …, y_a)^T denotes the model outputs for the a samples in the training set and D = (d_1, d_2, …, d_a)^T denotes the correct outputs of the training set, where y_i and d_i are both m-dimensional column vectors; TP and FP denote the number of correctly identified samples and the number of incorrectly identified samples in the training set, respectively.
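
The equal interval sampling of claim 3, part (1), can be illustrated with a short sketch. This is not the patented implementation: ridge regression is swapped in for the PLS model so that the example needs only NumPy, the squared error is averaged over the number of samples, and the function names (`equal_interval_indices`, `rmsecv`, `best_interval`) and the synthetic data are hypothetical.

```python
import numpy as np


def equal_interval_indices(n_points, interval):
    """Indices kept by equal interval sampling: 0, N, 2N, ... (0-based)."""
    return np.arange(0, n_points, interval)


def rmsecv(X, Y, k=5, ridge=1e-3, seed=0):
    """k-fold cross-validated RMSE of a ridge-regularised linear model.

    ASSUMPTION: ridge regression stands in for the PLS model of claim 3,
    and the squared error is averaged over the number of samples.
    """
    n = X.shape[0]
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), k)  # random, near-even split
    squared_error = 0.0
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(n), test_idx)
        # Append a column of ones so the model has an intercept
        # (the intercept is lightly penalised too, for simplicity).
        A = np.hstack([X[train_idx], np.ones((train_idx.size, 1))])
        coef = np.linalg.solve(A.T @ A + ridge * np.eye(A.shape[1]),
                               A.T @ Y[train_idx])
        A_test = np.hstack([X[test_idx], np.ones((test_idx.size, 1))])
        squared_error += np.sum((A_test @ coef - Y[test_idx]) ** 2)
    return float(np.sqrt(squared_error / n))


def best_interval(X, Y, candidates, k=5):
    """Return the interval N with the smallest RMSECV, plus all scores."""
    scores = {}
    for N in candidates:
        cols = equal_interval_indices(X.shape[1], N)
        scores[N] = rmsecv(X[:, cols], Y, k=k)
    return min(scores, key=scores.get), scores


# Synthetic stand-in data: 30 spectra, 40 wavenumber points, 3 plastic types.
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(30, 40))
Y_demo = np.eye(3)[rng.integers(0, 3, size=30)]  # one-hot type labels
N_best, all_scores = best_interval(X_demo, Y_demo, candidates=[1, 2, 4, 8])
print(N_best, all_scores)
```

With real spectra, `X_demo` would hold the measured transmittances and the candidate list would cover the interval values under consideration; the columns selected for `N_best` correspond to r' and x' in claim 3.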
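
The RMSECV of claim 3 and the RMSE and Accuracy of claim 6 can be written out from the quantities defined there. The following is one plausible form, not a quotation of the patent: it assumes squared Euclidean error averaged over the number of samples (whether the vector dimension m also enters the denominator is not stated), and it assumes Accuracy is the share of correctly identified samples.

```latex
\mathrm{RMSECV} = \sqrt{\frac{1}{c}\sum_{i=1}^{k}\sum_{j=1}^{c/k}
    \left\lVert u_j^{i} - d_j^{i} \right\rVert_2^{2}}

\mathrm{RMSE} = \sqrt{\frac{1}{a}\sum_{i=1}^{a}
    \left\lVert y_i - d_i \right\rVert_2^{2}}

\mathrm{Accuracy} = \frac{TP}{TP + FP} \times 100\%
```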
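
The CARS loop of claim 3, part (2), can be sketched as follows. This is a simplified illustration under stated assumptions rather than the patented procedure: ridge regression replaces PLS for both the weights and the cross-validated error, the data are assumed to be mean-centred, the EDF retention ratio follows the commonly used form r_i = a·exp(−k·i) with a = (p/2)^(1/(T−1)) and k = ln(p/2)/(T−1) for T iterations, and all function names are hypothetical.

```python
import numpy as np


def edf_ratio(i, n_iter, p):
    """Fraction of the p variables retained at iteration i (1-based)."""
    a = (p / 2) ** (1 / (n_iter - 1))
    k = np.log(p / 2) / (n_iter - 1)
    return a * np.exp(-k * i)


def ridge_fit(X, y, lam=1e-3):
    """Ridge coefficients; ASSUMPTION: stand-in for PLS coefficients."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)


def cv_rmse(X, y, k=5, seed=0):
    """k-fold cross-validated RMSE, averaged over samples."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    sq = 0.0
    for test in folds:
        train = np.setdiff1d(np.arange(len(y)), test)
        sq += np.sum((X[test] @ ridge_fit(X[train], y[train]) - y[test]) ** 2)
    return float(np.sqrt(sq / len(y)))


def cars(X, y, n_iter=20, sample_frac=0.8, seed=0):
    """Return the variable subset with the lowest cross-validated RMSE."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    selected = np.arange(p)
    best_vars, best_score = selected, np.inf
    for i in range(1, n_iter + 1):
        # Monte Carlo sampling: fit on a random subset of the samples.
        rows = rng.choice(n, size=int(sample_frac * n), replace=False)
        coef = ridge_fit(X[np.ix_(rows, selected)], y[rows])
        # Weight of a variable = absolute regression coefficient
        # (summed over outputs when y has several columns).
        weights = np.abs(coef).reshape(selected.size, -1).sum(axis=1)
        # EDF step: keep only the highest-weight variables.
        n_keep = max(1, int(round(edf_ratio(i, n_iter, p) * p)))
        n_keep = min(n_keep, selected.size)
        order = np.argsort(weights)[::-1][:n_keep]
        kept, w = selected[order], weights[order]
        # ARS step: weighted sampling with replacement, keep unique draws.
        probs = w / w.sum() if w.sum() > 0 else np.full(w.size, 1 / w.size)
        draws = rng.choice(kept.size, size=kept.size, replace=True, p=probs)
        selected = kept[np.unique(draws)]
        # Modelling step: score the surviving subset by cross-validation.
        score = cv_rmse(X[:, selected], y)
        if score < best_score:
            best_vars, best_score = selected.copy(), score
    return np.sort(best_vars), best_score


# Synthetic demo: only columns 4 and 17 carry signal.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(60, 30))
y_demo = 3 * X_demo[:, 4] - 2 * X_demo[:, 17] + 0.05 * rng.normal(size=60)
chosen, score = cars(X_demo, y_demo, n_iter=15)
print(chosen, score)
```

The indices in `chosen` play the role of the characteristic wavenumbers r''. Because the procedure is stochastic, the selected subset depends on the seed.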
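
The standard normal transformation of claim 4 is a per-spectrum standardization. A minimal sketch follows, assuming the population standard deviation (divisor n), since the claim does not say which form is used; the function name is hypothetical.

```python
import numpy as np


def standard_normal_transform(x):
    """Apply x = (x'' - mu) / sigma to each spectrum (last axis).

    ASSUMPTION: sigma is the population standard deviation (ddof=0).
    """
    x = np.asarray(x, dtype=float)
    mu = x.mean(axis=-1, keepdims=True)
    sigma = x.std(axis=-1, keepdims=True)
    return (x - mu) / sigma


# Characteristic transmittances (%) of one hypothetical sample.
transformed = standard_normal_transform([80.0, 60.0, 40.0])
print(transformed, transformed.mean(), transformed.std())
```

The same function accepts a two-dimensional array with one spectrum per row, so the training set and a spectrum to be identified can be transformed in the same way.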
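
The hyperparameter search of claim 5, steps ③ to ⑤, maximizes f(z) over the range [A, B] with a genetic algorithm. The sketch below shows one simple real-coded variant (tournament selection, arithmetic crossover, Gaussian mutation, elitism); the claim does not specify the operators, so these choices are assumptions. The toy objective `toy_recognition_rate` is a hypothetical stand-in for training the ANN with hyperparameters z and measuring its correct identification rate.

```python
import numpy as np


def genetic_maximize(f, low, high, pop_size=20, generations=30,
                     mutation_scale=0.1, seed=0):
    """Maximise f(z) subject to low <= z <= high (element-wise)."""
    rng = np.random.default_rng(seed)
    low = np.atleast_1d(np.asarray(low, dtype=float))
    high = np.atleast_1d(np.asarray(high, dtype=float))
    pop = rng.uniform(low, high, size=(pop_size, low.size))
    history = []  # best fitness seen in each generation
    for _ in range(generations):
        fitness = np.array([f(z) for z in pop])
        order = np.argsort(fitness)[::-1]  # fittest first
        pop, fitness = pop[order], fitness[order]
        history.append(float(fitness[0]))
        elite = pop[0].copy()
        # Tournament selection: of two random individuals keep the fitter
        # (the lower index, because the population is sorted).
        a, b = rng.integers(0, pop_size, size=(2, pop_size))
        parents = pop[np.minimum(a, b)]
        # Arithmetic crossover with a randomly shuffled mate.
        mates = parents[rng.permutation(pop_size)]
        alpha = rng.uniform(size=(pop_size, 1))
        children = alpha * parents + (1 - alpha) * mates
        # Gaussian mutation, then clip back into [A, B].
        children += rng.normal(scale=mutation_scale * (high - low),
                               size=children.shape)
        pop = np.clip(children, low, high)
        pop[0] = elite  # elitism: never lose the best solution
    fitness = np.array([f(z) for z in pop])
    best = int(np.argmax(fitness))
    history.append(float(fitness[best]))
    return pop[best], history


def toy_recognition_rate(z):
    """Hypothetical smooth objective peaking at z = (0.3, 8)."""
    return 1.0 - ((z[0] - 0.3) ** 2 + ((z[1] - 8.0) / 10.0) ** 2)


A, B = [0.0, 1.0], [1.0, 16.0]
z_star, history = genetic_maximize(toy_recognition_rate, A, B)
print(z_star, history[-1])
```

Integer hyperparameters such as the number of hidden units would need rounding inside the objective; that detail is left out here.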

Description

Microplastic infrared characteristic spectrum extraction and efficient and accurate identification method

Technical Field

The invention relates to the technical field of environmental protection and microplastic identification, and in particular to a method for extracting the infrared characteristic spectrum of microplastics and identifying them efficiently and accurately.

Background

Plastic products are widely used in cosmetics and personal care products, textiles and clothing, packaging materials, agriculture, construction and building materials, and other fields because they are light and durable and offer outstanding advantages in color, touch and shape. Most consumer plastics are disposable, and most accumulate in natural systems. Under natural forces such as light, oxidation and weathering, large plastics that enter the environment gradually break down into fine plastic contaminants (those with diameters below 5 mm are generally defined as microplastics). Experiments have shown that exposure to microplastics causes a variety of toxic effects, including oxidative stress, metabolic disorders, immune responses, neurotoxicity, and reproductive and developmental toxicity. The environmental and health risks posed by plastics must therefore be taken seriously, and microplastic pollution is one of the most serious environmental challenges of the 21st century. Establishing an efficient and accurate plastic identification and detection method is the basis for plastic recycling and for formulating microplastic pollution prevention measures.

Existing identification and detection methods for microplastics mainly comprise visual methods, thermal analysis and spectroscopy. Visual methods are highly subjective, and thermal analysis determines the chemical composition of a sample by measuring the thermal properties of the material at a given temperature, which can easily damage the sample. With the development of computer science, the combination of spectroscopic techniques and machine learning has been widely applied to the identification and detection of microplastics. Infrared spectral analysis is simple to operate, highly sensitive, accurate in wavenumber and well repeatable, and is widely used for this purpose. However, identification models built on spectra often use the full spectrum, which makes the computation heavy and the modeling inefficient. In addition, how to prevent overfitting and improve the generalization ability of the model remains a problem to be solved in the identification and detection of microplastics.

Disclosure of Invention

The invention aims to provide a method for extracting the infrared characteristic spectrum of microplastics and identifying them efficiently and accurately. The method extracts the characteristic spectrum while retaining the main information of the infrared spectrum, and identifies microplastics efficiently and accurately by constructing an artificial neural network (ANN) model. The method can prevent the loss of the characteristic spectrum and improve the modeling efficiency of the ANN model.

To achieve this purpose, the invention provides the following technical scheme: a method for extracting the infrared characteristic spectrum of microplastics and identifying them efficiently and accurately, comprising the following steps: Step 1: acquiring infrared spectrum data of microplastic samples and constructing a microplastic spectrum database; Step 2: extracting the characteristic spectrum from the infrared spectrum data of step 1 with a progressive two-step method that combines equal interval sampling (EIS) with the competitive adaptive reweighted sampling (CARS) algorithm; Step 3: applying a standard normal transformation to the transmittance of the characteristic spectrum from step 2; Step 4: constructing a feature training set and a feature test set, taking the transformed infrared transmittance from step 3 as input and the microplastic type as output, training the ANN model with the feature training set, and optimizing the hyperparameters of the model with a genetic algorithm on the feature test set, to form the final microplastic identification ANN model; Step 5: for a microplastic to be identified, acquiring its spectral data as in step 1, obtaining its characteristic transmittance at the characteristic wavenumbers extracted in step 2, applying the standard normal transformation of step 3, and inputting the transformed infrared spectral transmit