CN-121207914-B - Rapid nondestructive testing method and system for lipid content and degradation degree of Korean pine nut based on hyperspectral imaging and deep learning

Abstract

The invention discloses a rapid nondestructive testing method and system for the lipid content and degradation degree of Korean pine nuts based on hyperspectral imaging and deep learning, and belongs to the technical field of nondestructive quality and safety testing of agricultural and forestry products. It addresses the problems that existing methods for detecting lipid content and oxidation degree in Korean pine nuts generally involve complex procedures, are costly for batch testing, are time-consuming, and are difficult to apply throughout storage and transportation. The key technical points are as follows. Raw near-infrared spectral data of pine nut samples are obtained from the acquired hyperspectral images; the reference (ground-truth) lipid content and the reference degree of oxidative degradation of the samples are measured at different sampling times, and a database is built from these values. The raw near-infrared spectral data are preprocessed, and the sample data collected in different batches are divided into a training set and a validation set. An improved one-dimensional dilated (atrous) convolutional network based on a dynamic weight distribution module is designed to construct a deep learning model suited to analyzing the spectral features of Korean pine nuts. The model is trained with the backpropagation algorithm, which updates the network weight parameters by minimizing the loss function. Once the performance of the model meets the requirement for rapid detection, it is used for efficient, nondestructive, simultaneous detection of the lipid content and the degree of oxidative rancidity of Korean pine nuts.

Inventors

LI HONGBO
NAN YUE
ZHANG TIANLE
ZHANG XIHAI
JIANG DAPENG
HUANG JUNJIE
DONG WANJING
WANG HAO
WANG KAILI
YIN XUPENG
ZHANG QIUYING

Assignees

Northeast Agricultural University (东北农业大学)

Dates

Publication Date: 2026-05-05
Application Date: 2025-10-29

Claims (10)

1. A rapid nondestructive testing method for the lipid content and degradation degree of Korean pine nuts based on hyperspectral imaging and deep learning, characterized by comprising the following steps:
S1, placing pine nut samples of the same harvest batch in a sample chamber for proper storage, ensuring that the degradation degree of the samples at any given sampling time point remains in an equilibrium state;
S2, sampling the pine nut samples at intervals of 1-3 days: taking the samples out, placing them on an experiment table with no stacking or mutual occlusion, and performing hyperspectral scanning acquisition;
S3, obtaining raw near-infrared spectral data of the pine nut samples from the acquired hyperspectral images, measuring the reference (ground-truth) lipid content and the reference degree of oxidative degradation of the samples at different sampling times, and building a database from these values, ensuring that they correspond to the raw spectral data measured in S2;
S4, preprocessing the collected raw near-infrared spectral data of the Korean pine nuts, and dividing the sample data collected in different batches into a training set and a validation set, the division following the principle of random sampling and the principles for setting up data sets in machine learning modeling;
S5, designing, on the basis of a one-dimensional convolutional neural network, an improved one-dimensional dilated (atrous) convolutional network based on a dynamic weight distribution module, so as to construct a deep learning model suited to analyzing the spectral features of Korean pine nuts, wherein the dilated convolution mechanism enhances the modeling of long-range dependencies in the spectral sequence and the dynamic weight mechanism provides adaptive adjustment of the feature channels;
S6, training the constructed deep learning model with the backpropagation algorithm, updating the network weight parameters by minimizing a loss function; evaluating the model on the validation set divided in step S4 and optimizing the key hyperparameters of the model based on the evaluation results so as to improve prediction accuracy and robustness; obtaining a stably converged final detection model through multiple rounds of cross-validation and parameter optimization; the performance of the constructed model is considered to meet the requirement for rapid detection when its coefficient of determination R² on the validation set is greater than or equal to 0.75 and its root mean square error RMSE is less than or equal to 0.2;
S7, at intervals, taking out a pine nut sample to be tested from the sample preparation chamber, scanning a hyperspectral image of the sample and obtaining its raw spectral data, and then carrying out a Soxhlet extraction experiment and an acid value/peroxide value test;
wherein the dynamic weight distribution module is intended to automatically identify key absorption peaks in the input spectrum and to map them to the band response positions corresponding to each output task, ultimately establishing a correspondence between the absorption peaks and wavelength positions in the output characteristic bands of the pine nut and those in the input characteristic bands, so that during training the model automatically focuses on spectral band positions associated with characteristic functional-group vibrations or molecular absorption peaks; to this end, first, a feature encoding mechanism is designed to extract the spatial structure features in the near-infrared spectrum and improve the model's perception of the spectral spatial distribution; second, a fully connected network establishes a nonlinear mapping between the spatial structure features and the internal substance components of the pine nut, and a network generation module is introduced which, conditioned on the spatial structure features, realizes an effective correspondence between the input space and the spectral feature space; finally, a spectrum resampling mechanism is designed.
2. The rapid nondestructive testing method for the lipid content and degradation degree of Korean pine nuts based on hyperspectral imaging and deep learning according to claim 1, wherein:
in step S1, light treatment and environmental control are carried out, specifically: Korean pine nuts of the same harvest batch are shelled, the pine nut samples are placed in an environmentally controllable sample chamber and left to stand under sunlight; throughout the experiment the temperature is controlled at 25 ± 2 °C and the humidity at 45 ± 5%, and environmental parameter data are recorded every 15 to 30 minutes with a HOBO U-001 temperature and humidity recorder;
in step S2, the sample collection array is constructed and hyperspectral acquisition is performed, specifically: the selected samples are arranged uniformly on a black plate with holes, which prevents the samples from shaking when the platform moves during acquisition; the regularity of the hyperspectral scan is optimized according to the plate size and arrangement; hyperspectral images of the pine nut samples are scanned, acquired, exported, and stored; and the samples are then placed in sealed bags and stored properly away from light.
3. The rapid nondestructive testing method for the lipid content and degradation degree of Korean pine nuts based on hyperspectral imaging and deep learning according to claim 1 or 2, wherein in step S3 a region of interest (ROI) is selected from the exported hyperspectral image of the pine nut samples, and the near-infrared band spectral data of that region are acquired; physicochemical analysis experiments are carried out on a portion of the samples immediately after hyperspectral acquisition in order to construct training labels; each plate is randomly and thoroughly sampled and divided into two parts: the first part is crushed and n-hexane is added to extract the lipids, and after standing for 10-15 minutes, colorimetric detection is performed with acid value/peroxide value test strips from mlbio, the acid value test strip being read within a reaction time of 5 minutes, forming a mapping data set between the spectra and the lipid oxidative rancidity indices; the second part is subjected to a Soxhlet extraction experiment according to the national standard method, forming a mapping data set between the spectra and the lipid content index; the degradation degree and lipid content of pine nut samples at different oxidation times are measured to build a database, ensuring that the lipid content and oxidation level data obtained by physicochemical testing correspond one-to-one with the raw spectral data measured in S2; and the corresponding hyperspectral image data are annotated and matched, thereby constructing a complete training data set.
4. The rapid nondestructive testing method for the lipid content and degradation degree of Korean pine nuts based on hyperspectral imaging and deep learning according to claim 1, wherein step S4 completes data organization and modeling preparation, specifically: the acquired raw near-infrared spectral data are preprocessed to eliminate background interference, improve the signal-to-noise ratio, and enhance the model's ability to extract key features; the preprocessing algorithm is selected flexibly according to the spectral characteristics and sample type, so as to extract key information and suppress redundant information in the spectral data and to improve the accuracy and stability of subsequent modeling; the data preprocessing process comprises the following steps:
S4.1, collecting and cleaning the data;
S4.2, normalizing the data to improve the stability and convergence rate of model training, using the Z-Score normalization method:
$$x' = \frac{x - \mu}{\sigma}$$
where $x$ is the original data, $\mu$ is the mean of the data, and $\sigma$ is the standard deviation of the data; the normalized data have zero mean and unit variance, which prevents unstable training caused by features with an excessively large numerical range;
S4.3, in near-infrared spectral analysis, using a sliding window technique to segment the sequence data: assuming the input data are $X = \{x_1, x_2, \ldots, x_T\}$, the sliding window operation partitions the data into multiple overlapping sub-sequences:
$$W_1 = \{x_1, x_2, \ldots, x_w\}, \quad W_2 = \{x_2, x_3, \ldots, x_{w+1}\}, \quad \ldots$$
and so on, where $w$ is the window size, and the data of each window serve as input to the model;
the preprocessed raw near-infrared spectral data of the pine nut samples are randomly divided into a training set and a validation set, which provide the basic data for subsequent deep learning modeling and for identifying the lipid content and oxidative degradation degree of the pine nut kernel.
5. The rapid nondestructive testing method for the lipid content and degradation degree of Korean pine nuts based on hyperspectral imaging and deep learning according to claim 1 or 4, wherein step S5 specifically comprises:
S5.1, randomly dividing the samples into a training set and a validation set using the SPXY algorithm;
S5.2, establishing a regression model based on deep learning;
S5.3, automatically extracting spectral features from the preprocessed training set with the dilated convolutional neural network, expanding the receptive field of the model and improving its multi-scale spectral feature extraction capability;
S5.4, introducing the dynamic weight distribution module to perform feedback regulation, strengthening the model's balance between dynamic selection and fine adjustment, and finally outputting the predicted values of pine nut lipid content and degradation degree;
S5.5, predicting the pine nut components based on multi-task learning, wherein a dynamic weight mechanism and an absorption peak position mapping mechanism are introduced to achieve accurate modeling and collaborative prediction from multiple band features in the input spectrum to multiple output tasks;
the constructed deep learning model uses a multi-task network: the near-infrared spectral data and lipid content data of the training-set pine nut samples are used to establish multi-task detection model I for pine nut lipid content, and the near-infrared spectral data and degradation degree grade data of the training-set pine nut samples are used to establish multi-task detection model II for the degradation degree in pine nuts.
6. The rapid nondestructive testing method for the lipid content and degradation degree of Korean pine nuts based on hyperspectral imaging and deep learning according to claim 5, wherein in step S5.3 the receptive field of the convolution kernel is expanded by introducing a dilation rate, so that dependencies over a larger range are captured effectively without increasing the amount of computation, and a one-dimensional dilated convolution (Dilated Convolution) operation is used to process the near-infrared spectral signal; the specific process of step S5.3 is as follows:
S5.3.1, constructing the one-dimensional dilated convolutional network model: in the design of the dilated convolution layer, the dilated convolution uses a dilation rate $r$ to expand the receptive field of the convolution kernel and capture dependencies over longer or larger ranges;
S5.3.2, for one-dimensional dilated convolution, the input signal is assumed to be $x$ and the convolution kernel size to be $K$, and the output of the dilated convolution is computed as:
$$y(i) = \sum_{k=0}^{K-1} w(k)\, x(i + r \cdot k) + b$$
where $r$ is the dilation rate, $w(k)$ are the convolution kernel weights, $b$ is the bias term, $i$ is the output index, and $r \cdot k$ represents the spacing between convolution kernel elements; choosing an appropriate dilation rate $r$ effectively enlarges the receptive field without increasing the kernel size, thereby avoiding a surge in computational complexity;
S5.3.3, constructing multiple dilated convolution layers: several dilated convolution layers are stacked to form a deep feature extraction network that enlarges the receptive field; each convolution layer extracts feature information at a different scale according to its dilation rate, and the receptive field expands multiplicatively as the dilation rate increases, so that longer-distance dependencies are captured;
S5.3.4, selecting the activation function: each convolution layer is followed by the nonlinear activation function ReLU, defined as:
$$\mathrm{ReLU}(x) = \max(0, x)$$
the ReLU function introduces nonlinearity into the network so that the model can learn more complex patterns, and it is used to alleviate vanishing gradients and accelerate the training of the deep network;
the specific process of step S5.4 is as follows:
S5.4.1, designing the characteristic-spectrum localization network structure:
S5.4.1.1, defining the output characteristic spectrum, which represents the wavelength positions after feature encoding; the feature encoding is mapped by a multi-layer perceptron (MLP), from which the corresponding input characteristic spectrum is calculated;
S5.4.1.2, designing a local feature extractor (Localization Network) based on multi-channel feature encoding: the module receives the input spectrum, extracts the characteristic band information related to pine nut mildew, and returns the feature encoding of the spectral information produced by the multi-layer perceptron, in the form of a learnable function that represents feature encoding of the spectral bands by the multi-layer perceptron;
S5.4.1.3, when the lengths of the different channels of the MLP network are consistent, obtaining the encoding from the concatenated MLP outputs, where GAP is the abbreviation of the global average pooling layer (Global Average Pooling); the global average pooling layer screens characteristic bands from the output bands, and a batch normalization (Batch Normalization) layer merges the pooled characteristic bands;
S5.4.1.4, when the lengths of the different channels of the MLP network are not consistent, each branch provides its features independently;
S5.4.2, designing the characteristic-spectrum fusion structure:
S5.4.2.1, judging whether the dimensions of the characteristic bands are consistent and, if not, performing dimension alignment using a convolution;
S5.4.2.2, weighting the fused spectral bands using gating weights;
S5.4.3, designing the generator structure: for each near-infrared spectral output, its sampling position in the input spectrum is calculated, that is, a sampling spectrum network is generated over the set of target output characteristic spectra;
S5.4.4, designing a multi-channel characteristic-spectrum sampler (Sampler) structure: the input characteristic spectrum is sampled at the calculated positions using bilinear interpolation; the output for each near-infrared spectral band is a sum, over bands and channels, of the input characteristic spectrum weighted by a kernel function, where H is the maximum number of bands, W is the maximum number of channels, n is the current band, m is the current channel, and the kernel function is a bilinear interpolation kernel;
the specific process of step S5.5 is as follows:
S5.5.1, task-decoupled modeling: two task-specific sub-networks are set up, one for lipid content and one for degradation degree; each task sub-network has an independent parameter structure for learning the feature representation required by its task, and the output features of the dynamic weight distribution module are sent to the task-specific sub-networks to realize multi-task parallel modeling;
S5.5.2, shared feature extraction: the raw spectral data of the pine nut are input to a shared feature extraction sub-network, which extracts a unified latent feature representation through several convolution layers, fully connected layers, or transform encoding structures, capturing the global spectral information and the distribution of the main absorption peaks of the input bands;
S5.5.3, modeling the correspondence of absorption peak positions: on the basis of task decoupling, an absorption peak correspondence mechanism is introduced, and the correlation of absorption peak positions between the input bands and the significant response bands of each output task is calculated, so as to map bands to characteristic functional-group vibrations and improve the band-level interpretability of the model;
S5.5.4, joint loss optimization: the loss between the predictions output by each task sub-network and the true labels is calculated, an overall joint loss function is constructed using the dynamically allocated task weights, and the shared layers, task layers, and dynamic weight module are optimized simultaneously through end-to-end joint training, realizing cooperative learning and feature fusion across the tasks.
7. The rapid nondestructive testing method for the lipid content and degradation degree of Korean pine nuts based on hyperspectral imaging and deep learning according to claim 6, wherein in step S6 the model is trained iteratively with the backpropagation algorithm, the error is calculated with a mean square error loss function, and the model weight parameters are updated with an adaptive moment estimation (Adam) optimizer to improve convergence efficiency and stability; the model training and optimization process specifically comprises:
S6.1, selecting a loss function for optimizing the model parameters during training;
for the regression problem, the mean square error (MSE) is used, defined as:
$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$
for classification problems, a cross-entropy loss function is used:
$$L = -\sum_{c=1}^{C} y_c \log(p_c)$$
where $y_c$ is the true label, $p_c$ is the class probability predicted by the model, and $C$ is the number of classes;
S6.2, in the model training stage, updating the model parameters with the Adam optimizer to optimize the network parameters efficiently; the Adam optimization algorithm combines the momentum method (Momentum) with an adaptive learning rate (Adaptive Learning Rate) and dynamically adjusts the parameter update step size by simultaneously estimating the first and second moments of the gradient; the parameter update rule is as follows:
$$g_t = \nabla_\theta J(\theta_{t-1}), \qquad m_t = \beta_1 m_{t-1} + (1 - \beta_1)\, g_t$$
which computes the first moment of the gradient of the model at time $t$, where $J$ is the target loss function (objective loss function), $g_t$ is the gradient of the loss function with respect to the model parameters, and $m_t$ is the first moment of the gradient, obtained as an exponentially weighted moving average of the gradient;
$$v_t = \beta_2 v_{t-1} + (1 - \beta_2)\, g_t^2$$
which computes the second moment of the gradient of the model at time $t$, where $\beta_1$ and $\beta_2$ are the exponential decay coefficients of the first and second moments, respectively, and $v_t$ is the second moment of the gradient, obtained as an exponentially weighted moving average of the squared gradient;
$$\theta_t = \theta_{t-1} - \eta\, \frac{m_t}{\sqrt{v_t} + \epsilon}$$
which uses the first and second moments of the gradient at time $t$ to update the parameters of the model at time $t$, where $\theta_t$ is the model parameter vector at the $t$-th iteration, $\theta_{t-1}$ is the model parameter vector of the previous iteration, $\eta$ is the learning rate controlling the step size of each parameter update, and $\epsilon$ is a small constant, set to a fixed value, that prevents the denominator from being zero;
through this optimization process, the Adam optimizer dynamically adjusts the learning rate of each parameter while keeping the gradient smooth, improving the convergence stability and training efficiency of the model in a complex non-convex loss space.
8. A rapid nondestructive testing system for the lipid content and degradation degree of Korean pine nuts based on hyperspectral imaging and deep learning, implemented on the basis of the testing method of any one of claims 1-7, comprising:
a pine nut sample raw data acquisition unit, used for acquiring hyperspectral data of the pine nut samples at different times, extracting the raw near-infrared spectral data, the degradation degree, and the lipid content of the pine nut samples, and building a database from these data;
a data preprocessing unit, used for preprocessing the acquired raw near-infrared spectral data of the pine nut samples, extracting features from them, and dividing them into a training set and a validation set;
a deep learning model building unit, used for building a model as a multi-task neural network that introduces the dilated convolution mechanism and the dynamic weight distribution module into a convolutional neural network, fitting the near-infrared spectral data and lipid content data of the training-set pine nut samples, and building a multi-task learning model of the lipid content in pine nuts;
a model verification and optimization unit, used for minimizing the loss function through the backpropagation algorithm, updating the weight parameters of the multi-task learning model for detecting the lipid content in pine nuts and of the multi-task learning model for detecting the degradation degree in pine nuts, verifying the model and optimizing its hyperparameters on the validation set to improve prediction accuracy and model performance, and obtaining the final detection model through multiple rounds of verification and optimization;
a pine nut sample rapid detection unit, used for obtaining the raw spectral data of a sample to be tested, inputting them into the multi-task learning model of the lipid content and degradation degree in pine nuts, and outputting the predicted values of the lipid content and degradation degree in the pine nuts;
wherein the dynamic weight distribution module is intended to automatically identify key absorption peaks in the input spectrum and to map them to the band response positions corresponding to each output task, ultimately establishing a correspondence between the absorption peaks and wavelength positions in the output characteristic bands of the pine nut and those in the input characteristic bands, so that during training the model automatically focuses on spectral band positions associated with characteristic functional-group vibrations or molecular absorption peaks; to this end, first, a feature encoding mechanism is designed to extract the spatial structure features in the near-infrared spectrum and improve the model's perception of the spectral spatial distribution; second, a fully connected network establishes a nonlinear mapping between the spatial structure features and the internal substance components of the pine nut, and a network generation module is introduced which, conditioned on the spatial structure features, realizes an effective correspondence between the input space and the spectral feature space; finally, a spectrum resampling mechanism is designed.
9. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program configured to implement, when called by a processor, the corresponding steps of the rapid nondestructive testing method for the lipid content and degradation degree of Korean pine nuts based on hyperspectral imaging and deep learning according to any one of claims 1-7.
10. A rapid nondestructive testing device for the lipid content and degradation degree of Korean pine nuts, characterized by comprising at least one processor, a memory communicatively connected to the at least one processor, and a device for acquiring the raw near-infrared spectral data of pine nut samples by hyperspectral imaging, wherein the memory stores instructions executable by the at least one processor so that the at least one processor can perform the corresponding steps of the rapid nondestructive testing method for the lipid content and degradation degree of Korean pine nuts based on hyperspectral imaging and deep learning according to any one of claims 1-7, thereby detecting the lipid content and degradation degree of Korean pine nuts.
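
The acceptance criterion in step S6 of claim 1 (coefficient of determination R² of at least 0.75 and RMSE of at most 0.2 on the validation set) can be illustrated with a minimal Python sketch. The function names `r2_and_rmse` and `meets_rapid_detection_requirement` are hypothetical and do not appear in the patent; only the two thresholds are taken from the claim, and the sketch assumes the reference values are not all identical.

```python
import math


def r2_and_rmse(y_true, y_pred):
    """Return (R^2, RMSE) for paired reference and predicted values."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    # Residual and total sums of squares.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    # Assumption: y_true is not constant, so ss_tot is non-zero.
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    return r2, rmse


def meets_rapid_detection_requirement(y_true, y_pred, r2_min=0.75, rmse_max=0.2):
    """Check the thresholds stated in step S6 (R^2 >= 0.75, RMSE <= 0.2)."""
    r2, rmse = r2_and_rmse(y_true, y_pred)
    return r2 >= r2_min and rmse <= rmse_max
```

Both conditions must hold at once, so a model with a high R² but a large absolute error on the validation set would still be rejected.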
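
The preprocessing of steps S4.2 and S4.3 in claim 4 (Z-Score normalization followed by sliding-window segmentation) might look roughly like the sketch below. It is an illustration under stated assumptions, not the patented implementation: the population standard deviation is used, the window stride defaults to 1, and the names `z_score` and `sliding_windows` are hypothetical.

```python
import math


def z_score(values):
    """Normalize a spectrum to zero mean and unit variance (step S4.2)."""
    n = len(values)
    mean = sum(values) / n
    # Assumption: population standard deviation (divide by n); the claim
    # does not say whether n or n - 1 is used.
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0:
        raise ValueError("constant input cannot be Z-Score normalized")
    return [(v - mean) / std for v in values]


def sliding_windows(sequence, window, stride=1):
    """Split a sequence into overlapping sub-sequences (step S4.3).

    Assumption: a stride of 1 by default, so consecutive windows are
    shifted by one element.
    """
    last_start = len(sequence) - window
    return [sequence[i:i + window] for i in range(0, last_start + 1, stride)]
```

For example, `sliding_windows([1, 2, 3, 4, 5], 3)` yields the three windows `[1, 2, 3]`, `[2, 3, 4]`, and `[3, 4, 5]`, each of which would be one model input.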
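
The one-dimensional dilated convolution of step S5.3 in claim 6 can be sketched in plain Python to show how the dilation rate widens the receptive field without adding kernel weights. This is a simplified, single-channel illustration that assumes no padding ("valid" convolution); the function names and the receptive-field helper are hypothetical and not taken from the patent.

```python
def dilated_conv1d(x, weights, bias=0.0, dilation=1):
    """Compute y[i] = bias + sum_k weights[k] * x[i + dilation * k].

    Assumption: no padding, so the output is shorter than the input by
    dilation * (len(weights) - 1) samples.
    """
    k = len(weights)
    out_len = len(x) - dilation * (k - 1)
    if out_len <= 0:
        raise ValueError("input is too short for this kernel and dilation")
    return [
        bias + sum(weights[j] * x[i + dilation * j] for j in range(k))
        for i in range(out_len)
    ]


def relu(values):
    """Element-wise ReLU activation, max(0, x), as in step S5.3.4."""
    return [max(0.0, v) for v in values]


def receptive_field(kernel_size, dilations):
    """Receptive field of stacked stride-1 dilated convolution layers."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)
```

With a kernel of size 3, three stacked layers with dilation rates 1, 2, and 4 cover 15 input samples, whereas three undilated layers cover only 7. The rates 1, 2, 4 are an illustrative choice; the claim does not fix specific values.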
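
The Adam update described in step S6.2 of claim 7 can be written out as a single step in plain Python. The sketch follows the moments and update as described in the claim text; the default values for the learning rate, decay coefficients, and epsilon are the commonly used Adam defaults and are assumptions, since the claim text does not state them. The optional `bias_correction` flag adds the bias-corrected moments of the standard Adam algorithm, which the claim text does not mention. The name `adam_step` is hypothetical.

```python
import math


def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
              eps=1e-8, bias_correction=False):
    """Perform one Adam update and return (new_theta, new_m, new_v).

    theta, grad, m, v are equal-length lists; t is the 1-based step index.
    Assumption: lr, beta1, beta2, and eps use common default values.
    """
    # First moment: exponentially weighted moving average of the gradient.
    new_m = [beta1 * mi + (1 - beta1) * g for mi, g in zip(m, grad)]
    # Second moment: exponentially weighted moving average of the squared gradient.
    new_v = [beta2 * vi + (1 - beta2) * g * g for vi, g in zip(v, grad)]
    if bias_correction:
        # Standard Adam bias correction (not described in the claim text).
        m_use = [mi / (1 - beta1 ** t) for mi in new_m]
        v_use = [vi / (1 - beta2 ** t) for vi in new_v]
    else:
        m_use, v_use = new_m, new_v
    # Parameter update: step scaled per parameter by the second moment.
    new_theta = [
        p - lr * mh / (math.sqrt(vh) + eps)
        for p, mh, vh in zip(theta, m_use, v_use)
    ]
    return new_theta, new_m, new_v
```

In practice a deep learning framework's built-in optimizer would be used instead; the explicit form is only meant to make the roles of the first moment, second moment, learning rate, and epsilon visible.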

Description

Rapid nondestructive testing method and system for lipid content and degradation degree of Korean pine nut based on hyperspectral imaging and deep learning

Technical Field

The invention belongs to the technical field of nondestructive quality and safety testing of agricultural and forestry products, and particularly relates to a method and system for the simultaneous, rapid, nondestructive testing of the lipid content and oxidative degradation degree of Korean pine nut kernels based on hyperspectral imaging and deep learning.

Background

As the public places increasing emphasis on food quality and safety, precise control of nutritional ingredients, and sustainable development, the lipid content and the degree of acidification and degradation of the Korean pine nut, an economically important agricultural and forestry product that is rich in oil and nutrients, have become key indicators demanding attention across the industry chain. Lipids are an important substrate of oxidative metabolism, and their content falls as the products of oxidative metabolism (rancidity) accumulate. The Korean pine nut is rich in unsaturated fatty acids, vitamin E, other functional lipid substances, and trace elements; it has high nutritional value and market potential and is widely used in food processing, vegetable oil extraction, and related fields. However, it is susceptible to oxidative decomposition during storage and processing, which produces peroxides and, in turn, strongly carcinogenic substances that seriously threaten food safety and consumer health. Establishing an efficient and accurate method for detecting the lipid content and oxidative degradation degree of Korean pine nut kernels is therefore of practical significance and application value for improving product quality control, promoting deep processing of Korean pine nuts, and ensuring the sustainable development of the industry.
The pressing question is how to integrate spectral analysis with deep-learning-based intelligent modeling so as to detect the lipid content and oxidation level of Korean pine nuts simultaneously and nondestructively, meeting the urgent needs of modern agricultural product and food safety supervision and of high-throughput quality evaluation. Existing methods for detecting the lipid content and oxidative rancidity level of Korean pine nuts mainly include Soxhlet extraction, acid hydrolysis, the peroxide value (POV) method, the TBARS method, carbonyl value determination, colorimetry, and gas chromatography-mass spectrometry (GC-MS). Although these methods are accurate, they generally suffer from long detection cycles, complex operation, difficulty in performing frequent large-batch testing, heavy dependence on organic solvents, and destruction of the tested samples. These drawbacks limit their use for rapid batch grading and screening of pine nut raw materials during storage and transportation, and they cannot meet the practical need for efficient, nondestructive, high-coverage, and green detection. Near-infrared spectroscopy (Near-Infrared Spectroscopy, NIRS) is a nondestructive optical analysis method based on the molecular-level vibrational characteristics of the sample. It offers high detection speed and simple operation, requires no chemical pretreatment, and is suitable for online real-time monitoring, and it is widely used for the quantitative and qualitative analysis of lipids, proteins, moisture, toxins, and other indicators in agricultural products and foods.
However, the spectral response sites of lipids in the Korean pine nut kernel are complex and resemble those of oxidation marker products, and overlapping and crossing spectral bands between samples, background interference, and other factors markedly reduce detection accuracy. In other words, existing detection techniques have not effectively solved the strong nonlinearity, weak signals, and complex background encountered when detecting the lipid content and oxidation marker levels of Korean pine nuts. Moreover, near-infrared spectral models built on one-dimensional convolutional networks suffer from a limited receptive field and insufficient multi-scale spectral feature extraction, and the mapping between the high-dimensional spatial structure of the near-infrared spectrum and the substance components is difficult to model, which makes near-infrared spectral modeling a technical bottleneck. Therefore, how to construct a neural network model with a multi-layer structure through an algorithm, realize the auto