CN-121684738-B - Peanut seed quality online sorting method and system


Abstract

The application discloses an online sorting method and system for peanut seed quality, relating to the technical field of agricultural product sorting automation. It addresses the decline in sorting accuracy caused by feature drift: the system adaptively detects feature drift and updates its model, so that sorting accuracy is maintained automatically during continuous production, shutdown maintenance time is reduced, and production efficiency is improved.

Inventors

LIN XIANFENG
LI YU
ZHAO CHANGKUN
YOU YU
SHANG YUE
CAI LI
BAI ZISONG

Assignees

南充市农业科学院
南充市农业农村局
南充市种子质量监督检验站

Dates

Publication Date: 2026-05-12
Application Date: 2026-02-10

Claims (8)

1. An online sorting method for peanut seed quality, characterized by comprising the following steps:
acquiring an initial image set of peanut seeds, carrying out feature extraction processing on the initial image set to generate a first feature vector set, and calculating statistical parameters of the first feature vector set to generate first reference distribution data;
collecting an online image of a peanut to be sorted, and carrying out feature extraction processing on the online image to generate an online feature vector;
inputting the online feature vector into a current sorting model for prediction processing to generate a real-time quality prediction result and corresponding prediction confidence data;
after continuously running for a preset period, acquiring the online feature vectors corresponding to all peanuts sorted in the preset period, aggregating the online feature vectors to construct a current period feature set, and calculating statistical parameters of the current period feature set to generate current feature distribution data;
calculating the degree of difference between the first reference distribution data and the current feature distribution data to generate feature drift index data;
comparing the feature drift index data with a preset drift threshold, and triggering a model updating process if the feature drift index data exceeds the drift threshold, wherein the model updating process comprises the following steps:
  screening, based on the current period feature set and the corresponding prediction confidence data, to generate a high-confidence feature subset and corresponding predicted quality label data;
  updating the first reference distribution data based on the high-confidence feature subset to generate second reference distribution data;
  performing parameter fine-tuning on the current sorting model based on the high-confidence feature subset and the corresponding predicted quality label data to generate a first updated sorting model;
  applying the second reference distribution data and the first updated sorting model to the subsequent real-time sorting process;
wherein the step of screening, based on the current period feature set and the corresponding prediction confidence data, to generate a high-confidence feature subset and corresponding predicted quality label data comprises:
  inputting the online feature vectors in the current period feature set into a pre-trained screening model to generate a membership degree corresponding to each online feature vector;
  checking the membership degree and the prediction confidence data of the corresponding online feature vector, and retaining the online feature vectors whose membership degree is higher than a first screening threshold and whose prediction confidence data is higher than a second screening threshold to form a preliminary screening set;
  attaching to each online feature vector retained in the preliminary screening set its corresponding real-time quality prediction result as the predicted quality label of that online feature vector, so as to obtain the high-confidence feature subset and the corresponding predicted quality label data;
and wherein the step of inputting the online feature vectors in the current period feature set into the pre-trained screening model to generate the membership degree corresponding to each online feature vector comprises:
  taking the current period feature set as input data, and randomly extracting a subset of feature subspaces;
  on the extracted feature subspaces, recursively and randomly selecting features and their split values to construct a plurality of binary tree structures;
  for any online feature vector in the current period feature set, calculating the path length that the online feature vector traverses from the root node to a leaf node in each binary tree structure; and
  mapping the average path length of the online feature vector over all binary tree structures to the membership degree corresponding to that online feature vector.
2. The online sorting method for peanut seed quality of claim 1, wherein the step of carrying out feature extraction processing on the initial image set to generate a first feature vector set comprises:
for a single image in the initial image set, converting the color space of the image and calculating a pixel value histogram on a target color channel to generate color distribution feature data;
for a single image in the initial image set, traversing the pixels of the image and their neighborhoods with a local binary pattern operator, generating a binary pattern by comparing the gray values of the neighborhood pixels with that of the center pixel, and computing the histogram of binary patterns over all pixels to generate texture feature data;
for a single image in the initial image set, determining the contour of the seed by edge detection, and calculating the aspect ratio and the area ratio of the bounding rectangle based on the contour to generate shape feature data; and
concatenating the color distribution feature data, the texture feature data, and the shape feature data to form a multi-dimensional visual feature of the seed, thereby obtaining a first feature vector, and collecting the first feature vectors of all the seeds to obtain the first feature vector set.
3. The online sorting method for peanut seed quality of claim 1, wherein the step of calculating the degree of difference between the first reference distribution data and the current feature distribution data to generate feature drift index data comprises:
parsing the first reference distribution data to obtain a first reference mean vector and a first reference covariance matrix;
calculating the inverse of the first reference covariance matrix to generate first inverse matrix data; and
calculating the feature drift index data based on the first reference mean vector, the current mean vector, the first inverse matrix data, and the current covariance matrix.
4. The online sorting method for peanut seed quality of claim 1, wherein the step of updating the first reference distribution data based on the high-confidence feature subset to generate second reference distribution data comprises:
calculating the mean of all online feature vectors in the high-confidence feature subset to generate a candidate mean vector;
reading a first reference mean vector and a first reference covariance matrix from the first reference distribution data;
computing a weighted sum of the first reference mean vector and the candidate mean vector, wherein the weight of the candidate mean vector is a preset learning rate and the weight of the first reference mean vector is one minus the learning rate, the result of the summation being a second reference mean vector;
performing weighted fusion of the first reference covariance matrix and the candidate covariance matrix with the same weighting strategy to obtain a second reference covariance matrix; and
combining the second reference mean vector with the second reference covariance matrix to generate the second reference distribution data.
5. The online sorting method for peanut seed quality of claim 1, wherein the step of performing parameter fine-tuning on the current sorting model based on the high-confidence feature subset and the corresponding predicted quality label data to generate a first updated sorting model comprises:
inputting the high-confidence feature subset into the pre-trained current sorting model for one forward propagation to obtain importance weight distribution data describing how important each parameter of the model is for correctly predicting the high-confidence feature subset;
constructing a regularization constraint loss function based on the importance weight distribution data; and
taking the high-confidence feature subset and its corresponding predicted quality label data as training data, and iteratively updating the model parameters in combination with the regularization constraint loss function to generate the first updated sorting model.
6. The online sorting method for peanut seed quality of claim 5, wherein the step of inputting the high-confidence feature subset into the pre-trained current sorting model for one forward propagation to obtain importance weight distribution data describing how important each parameter of the model is for correctly predicting the high-confidence feature subset comprises:
for each online feature vector in the high-confidence feature subset, calculating the gradient of the loss function at the model output layer for that online feature vector after it has passed through the current sorting model;
for each model parameter, calculating the expected value of the squared gradient over all online feature vectors to obtain an importance estimate for that model parameter; and
normalizing the importance estimates of all the model parameters to obtain the importance weight distribution data.
7. The online sorting method for peanut seed quality of claim 1, wherein the step of generating a sorting control instruction based on the real-time quality prediction result to control a sorting actuator to perform a corresponding action comprises:
when the real-time quality prediction result is lower than a preset threshold, parsing the online feature vector corresponding to the current peanut to be sorted to obtain outline dimension data and surface smoothness estimation data;
dynamically calculating, based on a pre-stored empirical correspondence, the pressure value and the injection duration adjustment of the high-speed airflow from the outline dimension data and the surface smoothness estimation data; and
generating the sorting control instruction containing specific action parameters from the calculated pressure value and injection duration adjustment.
8. An online sorting system for peanut seed quality, characterized in that the system comprises a memory, a processor, and an online sorting program for peanut seed quality that is stored in the memory and executable on the processor, the program being configured to implement the steps of the online sorting method for peanut seed quality according to any one of claims 1 to 7.
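
As an illustration of the membership and dual-threshold screening steps recited in claim 1, the following is a minimal Python sketch, not the patented implementation. It builds random binary trees over a random feature subspace, averages each vector's root-to-leaf path length, and keeps only vectors that pass both thresholds. Several details are assumptions that the claims do not specify: the mapping from average path length to membership (here one minus the score commonly used with isolation forests), the depth limit, the number of trees, and all function names, thresholds, and toy data.

```python
import math
import random

EULER_GAMMA = 0.5772156649


def c_factor(n):
    """Average path length of an unsuccessful search in a binary tree of n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(points, dims, rng, depth=0, max_depth=8):
    """Recursively pick a random feature (from dims) and a random split value."""
    if len(points) <= 1 or depth >= max_depth:
        return {"size": len(points)}
    dim = rng.choice(dims)
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:  # a constant feature cannot be split further
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return {"dim": dim, "split": split,
            "left": build_tree(left, dims, rng, depth + 1, max_depth),
            "right": build_tree(right, dims, rng, depth + 1, max_depth)}


def build_forest(points, n_trees=100, subspace_size=1, seed=0):
    """Build n_trees trees, each restricted to a random subset of feature indices."""
    rng = random.Random(seed)
    n_features = len(points[0])
    forest = []
    for _ in range(n_trees):
        dims = rng.sample(range(n_features), min(subspace_size, n_features))
        forest.append(build_tree(points, dims, rng))
    return forest


def path_length(tree, x, depth=0):
    """Path length from the root to the leaf reached by x, adjusted for leaf size."""
    if "dim" not in tree:
        return depth + c_factor(tree["size"])
    child = tree["left"] if x[tree["dim"]] < tree["split"] else tree["right"]
    return path_length(child, x, depth + 1)


def membership(forest, x, n_points):
    """Assumed mapping: longer average paths give membership closer to 1.

    Requires n_points >= 2 so that the normalizing factor is nonzero.
    """
    avg = sum(path_length(t, x) for t in forest) / len(forest)
    return 1.0 - 2.0 ** (-avg / c_factor(n_points))


def screen(features, confidences, predictions, forest, t_member, t_conf):
    """Keep (vector, predicted label) pairs that exceed both screening thresholds."""
    n = len(features)
    kept = []
    for x, conf, label in zip(features, confidences, predictions):
        if membership(forest, x, n) > t_member and conf > t_conf:
            kept.append((x, label))
    return kept


# Toy period feature set (hypothetical): an 8x8 grid of typical vectors
# plus one vector far from the rest.
features = [(i / 10, j / 10) for i in range(8) for j in range(8)] + [(10.0, 10.0)]
confidences = [0.95] * len(features)
confidences[0] = 0.40  # a typical vector the sorting model was unsure about
predictions = [1] * len(features)

forest = build_forest(features, n_trees=100, subspace_size=1, seed=0)
m_center = membership(forest, (0.3, 0.3), len(features))
m_outlier = membership(forest, (10.0, 10.0), len(features))
kept = screen(features, confidences, predictions, forest,
              t_member=(m_center + m_outlier) / 2, t_conf=0.8)
```

In this sketch the far-off vector is isolated after very few splits, so its membership is low and it is screened out. The low-confidence vector is dropped by the second threshold even though its features look typical. In practice both thresholds would need to be tuned on real production data.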

Description

Peanut seed quality online sorting method and system

Technical Field

The application relates to the technical field of agricultural product sorting automation, and in particular to an online sorting method and system for peanut seed quality.

Background

Currently, peanut seed sorting relies mainly on machine vision to build a static classification model that judges quality by analyzing image features. In real industrial production, however, peanut seeds are natural biological materials whose appearance changes over time. Seeds of different varieties differ inherently in color saturation, surface texture roughness, and light reflection characteristics. In addition, the maturity distribution of the seeds changes with the seasons, humidity fluctuations during storage gradually change the color of the seed coat, and random fluctuations in the intensity and angle of ambient lighting on the production line further destabilize the image features. Together, these factors systematically shift the distribution of the visual feature data fed to the model over time, producing a continuously evolving feature drift.

A static model cannot adapt to this distribution change, so its classification decision boundary gradually deviates from the true quality boundary, and the proportion of high-quality seeds rejected by mistake, or of low-quality seeds retained by mistake, rises steadily. To cope with this problem, existing sorting equipment must interrupt continuous production, re-collect a large number of samples, rely on manual quality labeling, and then spend several hours on model retraining and deployment. Such periodic shutdown maintenance not only greatly shortens the effective operating time of the production line but also introduces new uncertainty through subjective differences in manual labeling, severely restricting the automated operation and long-term stability of the sorting system. Particularly in large-scale seed processing that spans multiple quarters, the cumulative effect of feature drift causes sorting accuracy to decline irreversibly, and the existing technical framework struggles to achieve adaptive quality control under unattended conditions.

The foregoing is provided merely to facilitate understanding of the technical solutions of the present application and does not constitute an admission that it is prior art.

Disclosure of Invention

The main purpose of the application is to provide an online sorting method and system for peanut seed quality that reduces shutdown maintenance time and improves production efficiency.
To achieve the above purpose, the application provides an online sorting method for peanut seed quality, which comprises the following steps:
acquiring an initial image set of peanut seeds, carrying out feature extraction processing on the initial image set to generate a first feature vector set, and calculating statistical parameters of the first feature vector set to generate first reference distribution data;
collecting an online image of a peanut to be sorted, and carrying out feature extraction processing on the online image to generate an online feature vector;
inputting the online feature vector into a current sorting model for prediction processing to generate a real-time quality prediction result and corresponding prediction confidence data;
after continuously running for a preset period, acquiring the online feature vectors corresponding to all peanuts sorted in the preset period, aggregating the online feature vectors to construct a current period feature set, and calculating statistical parameters of the current period feature set to generate current feature distribution data;
calculating the degree of difference between the first reference distribution data and the current feature distribution data to generate feature drift index data;
comparing the feature drift index data with a preset drift threshold, and triggering a model updating process if the feature drift index data exceeds the drift threshold, wherein the model updating process comprises the following steps:
  screening, based on the current period feature set and the corresponding prediction confidence data, to generate a high-confidence feature subset and corresponding predicted quality label data;
  updating the first reference distribution data based on the high-confidence feature subset to generate second reference distribution data;
  performing parameter fine-tuning on the current sorting model based on the high-confidence feature subset and the