CN-122021366-A - Ginseng seed emergence rate prediction method and system, and model construction method and system

Abstract

A ginseng seed emergence rate prediction method and system, and a model construction method and system, belong to the technical field of ginseng seed emergence rate prediction. They address the technical problems that, in the prior art, the ginseng seed emergence rate is low and that directly applying an existing spectral model to predict it yields an unstable model, low prediction precision and insufficient generalization capability. Hyperspectral data of ginseng seeds is collected and preprocessed with each of a plurality of preprocessing methods; the optimal preprocessing method and the corresponding preprocessed spectral data are screened out; a plurality of spectral feature selection methods are applied to the screened data to obtain the corresponding characteristic bands; and a plurality of regression models are evaluated on those bands to obtain the optimal spectral feature selection method and the corresponding optimal regression model. The invention enables scientific, nondestructive prediction of the emergence rate, and determination of the optimal laser irradiation method, for different types of ginseng seeds.
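
To make the preprocessing stage more concrete, the block below sketches two of the methods named in claim 2, standard normal variate (SNV) transformation and multiplicative scatter correction (MSC), in their common textbook forms. It is a minimal illustration, not the patent's implementation: the function names `snv` and `msc`, the use of the mean spectrum as the MSC reference, the sample standard deviation in SNV, and the toy data are all assumptions made for this example.

```python
import numpy as np

def snv(spectra):
    """SNV: center and scale each spectrum (row) by its own mean and std.

    Assumption: the sample standard deviation (ddof=1) is used; some
    implementations use the population form instead.
    """
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std

def msc(spectra, reference=None):
    """MSC: regress each spectrum on a reference and remove offset and slope.

    Assumption: the reference defaults to the mean spectrum of the batch.
    """
    spectra = np.asarray(spectra, dtype=float)
    ref = spectra.mean(axis=0) if reference is None else np.asarray(reference, dtype=float)
    corrected = np.empty_like(spectra)
    for i, row in enumerate(spectra):
        # Least-squares fit: row ~= intercept + slope * ref
        slope, intercept = np.polyfit(ref, row, 1)
        corrected[i] = (row - intercept) / slope
    return corrected

# Hypothetical toy data: 3 spectra x 5 bands, each a scaled and shifted
# copy of one base spectrum (pure multiplicative and additive scatter).
base = np.array([0.10, 0.30, 0.55, 0.40, 0.20])
toy = np.vstack([1.0 * base + 0.00, 1.5 * base + 0.05, 0.8 * base - 0.02])

toy_snv = snv(toy)  # every row now has mean 0 and unit sample std
toy_msc = msc(toy)  # every row collapses onto the mean spectrum
```

Because the toy spectra differ only by a scale and an offset, both corrections map them onto a single curve. Real seed spectra would not collapse this cleanly, which is why the abstract compares several preprocessing methods rather than fixing one in advance.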

Inventors

  • YU HELONG
  • GUO JING
  • XUE MINGXUAN
  • ZHAO YUXIN
  • LI JINGYU
  • LIANG SHUANG
  • YAN WEI
  • FU QIYUAN
  • ZHOU LEIJINYU

Assignees

  • Jilin Agricultural University (吉林农业大学)

Dates

Publication Date
20260512
Application Date
20260415

Claims (10)

  1. A method for constructing a ginseng seed emergence rate prediction model, characterized by comprising the following steps:
     Step 1, collecting hyperspectral data of ginseng seeds, and preprocessing the hyperspectral data with each of a plurality of preprocessing methods to obtain corresponding preprocessed spectral data;
     Step 2, comparing the hyperspectral data with the preprocessed spectral data produced by each preprocessing method, using the coefficient of determination R², the root mean square error (RMSE) and the mean absolute error (MAE) as evaluation indexes, and screening out the optimal preprocessing method and the corresponding preprocessed spectral data by means of a random forest algorithm;
     Step 3, applying a plurality of spectral feature selection methods to the screened preprocessed spectral data to obtain the corresponding characteristic bands;
     Step 4, based on the characteristic bands corresponding to the plurality of spectral feature selection methods, evaluating each of a plurality of regression models using the coefficient of determination R², the root mean square error (RMSE) and the mean absolute error (MAE) as evaluation indexes, to obtain the optimal spectral feature selection method and the corresponding optimal regression model; and
     Step 5, combining the optimal preprocessing method, the optimal spectral feature selection method and the optimal regression model to obtain the ginseng seed emergence rate prediction model.
  2. The method according to claim 1, wherein the plurality of preprocessing methods in Step 1 include a smoothing technique, standard normal variate (SNV) transformation, multiplicative scatter correction (MSC), first derivative, second derivative, wavelet denoising, Savitzky-Golay smoothing (SG) combined with MSC, SG combined with the first derivative, SG combined with the second derivative, MSC combined with SG, and SNV combined with the first derivative.
  3. The method for constructing a ginseng seed emergence rate prediction model according to claim 1, wherein the plurality of spectral feature selection methods in Step 3 include UVE, CARS, XGBoost and IHO.
  4. The method according to claim 1, wherein the plurality of regression models in Step 4 include SVR, PLSR and RF.
  5. The method for constructing a ginseng seed emergence rate prediction model according to claim 1, wherein in Step 5 the optimal preprocessing method is the combination of SG and the second derivative, the optimal spectral feature selection method is IHO, and the optimal regression model is SVR.
  6. A ginseng seed emergence rate prediction method, implemented on the basis of the ginseng seed emergence rate prediction model construction method according to any one of claims 1 to 5, characterized by comprising the following steps: performing laser irradiation treatment on ginseng seeds, collecting hyperspectral data of the treated ginseng seeds, and predicting the emergence rate of the ginseng seeds with the ginseng seed emergence rate prediction model to obtain a ginseng seed emergence rate prediction result.
  7. The method for predicting emergence rate of ginseng seeds according to claim 6, wherein the optimal laser irradiation method for ginseng seeds of different types is obtained according to the prediction result, wherein the optimal laser irradiation method of ginseng under the forest is blue light irradiation for 5 minutes, and the optimal laser irradiation method of ginseng under the forest is red light irradiation for 30 minutes.
  8. A ginseng seed emergence rate prediction model construction system, constructed based on the method of claim 1, comprising the following modules:
     a preprocessing module, used for collecting hyperspectral data of ginseng seeds and preprocessing the hyperspectral data with each of a plurality of preprocessing methods to obtain corresponding preprocessed spectral data;
     a screening module, used for comparing the hyperspectral data with the preprocessed spectral data produced by each preprocessing method, using the coefficient of determination R², the root mean square error (RMSE) and the mean absolute error (MAE) as evaluation indexes, and screening out the optimal preprocessing method and the corresponding preprocessed spectral data by means of a random forest algorithm;
     a feature selection module, used for applying a plurality of spectral feature selection methods to the screened preprocessed spectral data to obtain the corresponding characteristic bands;
     an evaluation module, used for evaluating each of a plurality of regression models based on the characteristic bands corresponding to the plurality of spectral feature selection methods, using the coefficient of determination R², the root mean square error (RMSE) and the mean absolute error (MAE) as evaluation indexes, to obtain the optimal spectral feature selection method and the corresponding optimal regression model; and
     a construction module, used for combining the optimal preprocessing method, the optimal spectral feature selection method and the optimal regression model to obtain the ginseng seed emergence rate prediction model.
  9. A ginseng seed emergence rate prediction system, constructed based on the method of claim 6, comprising the following module: a prediction module, used for performing laser irradiation treatment on ginseng seeds, collecting hyperspectral data of the treated ginseng seeds, and predicting the emergence rate of the ginseng seeds with the ginseng seed emergence rate prediction model to obtain a ginseng seed emergence rate prediction result.
  10. A computer program product comprising a computer program or instructions which, when executed by a processor, implement the method for constructing a ginseng seed emergence rate prediction model according to any one of claims 1 to 5.
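
The three evaluation indexes named in claims 1 and 8 (R², RMSE and MAE) have standard definitions, which the sketch below computes for a set of measured and predicted emergence rates. This is an illustration under stated assumptions only: the function name `regression_metrics` and the example values are hypothetical and do not come from the patent.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return R^2, RMSE and MAE for paired measured and predicted values."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)           # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all measured values are equal")

    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(r) for r in residuals) / n,
    }

# Hypothetical emergence rates in percent (invented for illustration).
measured = [60.0, 70.0, 80.0, 90.0]
predicted = [62.0, 68.0, 81.0, 89.0]
scores = regression_metrics(measured, predicted)
print(scores)
```

A higher R² and lower RMSE and MAE indicate a better candidate, so the same function could be applied to each preprocessing method, feature selection method and regression model being compared.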

Description

Ginseng seed emergence rate prediction method and system, and model construction method and system

Technical Field

The invention relates to the technical field of ginseng seed emergence rate prediction, and in particular to a ginseng seed emergence rate prediction method and system, and a model construction method and system.

Background

Ginseng seeds are characterized by a long dormancy period, sensitivity to environmental conditions and a low emergence rate, which severely restricts artificial cultivation and industrial development. Improving the emergence rate of ginseng seeds efficiently and rapidly, and establishing a scientific, nondestructive method for predicting it, therefore has important theoretical and practical value.

At present, the use of laser to improve the emergence rate of ginseng seeds remains unexplored. Although some studies have attempted to regulate seed emergence by laser irradiation, existing schemes generally have the following shortcomings: (1) they focus on a single light source or a fixed irradiation duration, and lack systematic optimization and mechanistic study of light source combinations, spectral ratios and irradiation parameters; (2) most consider only the effect of laser on the germination rate, and lack a systematic evaluation of the emergence rate, which is the key agronomic index; (3) they mainly target staple grain crops such as corn, wheat and soybean, and no optimization research on the emergence rate of medicinal plant seeds such as ginseng has been reported.

In the field of ginseng seed emergence rate prediction, traditional methods such as germination tests, morphological observation and conventional physicochemical index measurement are mainly used.
These traditional techniques have obvious drawbacks: they are time-consuming and destructive, and cannot predict the emergence rate rapidly and nondestructively. Some technologies use hyperspectral imaging and machine learning models to predict seed emergence rate, but notable problems remain: (1) current research targets grain crops or cash crops, and because the structure and physiological state of ginseng seeds differ from those of other crops, existing spectral models cannot be applied directly, resulting in low prediction precision; (2) hyperspectral data is high-dimensional and noisy, and traditional modeling methods are easily affected by redundant variables and collinearity, leading to unstable models; (3) some models are not adapted to the spectral changes caused by laser treatment, so their generalization capability is insufficient and stable, reliable emergence prediction is difficult to achieve.

In the prior art, Chinese patent document CN107258149A discloses a method and system for measuring the germination rate of cotton seeds based on near-infrared spectroscopy, in which the germination rate is measured by a pre-established cotton seed germination rate measurement model. Chinese patent document CN104255118A discloses a rapid nondestructive testing method for the germination rate of rice seeds based on near-infrared spectroscopy, in which near-infrared spectral data of rice seeds with different aging durations are collected, band selection and preprocessing are performed on the data, and a near-infrared rice seed germination rate model is established by partial least squares.
However, the prior art mainly establishes near-infrared/hyperspectral germination rate prediction models for seeds of grain crops or cash crops. Its evaluation indexes are mostly based on germination criteria such as the radicle breaking through the seed coat, which differ in definition from the emergence rate (germination, emergence from the soil, seedling formation) that matters more in ginseng seedling production. Ginseng seeds have a long dormancy period and a complex dormancy release process, are sensitive to stratification conditions and to the temperature and humidity environment, and vary markedly within a batch, so the mapping from spectrum to vigor, germination and emergence is staged and nonlinear. In addition, ginseng seeds show large differences in morphological structure and in moisture and component characteristics, and their spectra exhibit strong scattering effects, heavy noise, high dimensionality and dispersed effective information. If the general preprocessing, band selection and linear modeling methods developed for existing crops are applied directly, unstable model parameters, low prediction precision and insufficient cross-batch/cross-variety generalization capability are ea