CN-121479153-B - Germplasm screening method

Abstract

The application provides a germplasm screening method comprising: obtaining a preliminary screening sample set containing multi-phenotype data; normalizing the phenotype values to eliminate heterogeneity among traits; calculating a comprehensive score for each sample from the normalized values; and screening a preferred sample set according to the scores. The automated normalization and comprehensive scoring process addresses the difficulty of integrating multi-trait data that differ in measurement scale and genetic background, relieves breeders of tedious and subjective manual judgment, provides a standardized and efficient scheme for large-scale germplasm screening, and improves the accuracy and efficiency of mining excellent germplasm resources.

Inventors

  • GU JU
  • Jiao Chengzhi
  • XU FENGFENG
  • ZHANG RUIZHOU

Assignees

  • 青岛极智生物技术有限公司

Dates

Publication Date
2026-05-08
Application Date
2026-01-08

Claims (5)

  1. A germplasm screening method, comprising the steps of:
     acquiring a preliminary screening sample set, wherein the preliminary screening sample set comprises the sample serial numbers of all preliminary screening samples and the data sets corresponding to the sample serial numbers, and each data set comprises a plurality of phenotypes and the phenotype value corresponding to each phenotype;
     normalizing the phenotype value of each phenotype of each preliminary screening sample according to all the phenotype values of the same phenotype in the preliminary screening sample set, to obtain a normalized phenotype value for each phenotype of each preliminary screening sample;
     calculating a comprehensive score set from the normalized phenotype values, wherein the comprehensive score set comprises each preliminary screening sample and its corresponding comprehensive score value; and
     screening the comprehensive score set to obtain a preferred sample set, wherein the preferred sample set comprises the sample serial numbers of a plurality of preferred samples and the data sets corresponding to the sample serial numbers;
     wherein acquiring the preliminary screening sample set comprises: acquiring an initial sample set, wherein the initial sample set comprises the sample serial numbers of all initial samples and the data sets corresponding to the sample serial numbers, and each data set comprises a plurality of phenotypes and the phenotype value corresponding to each phenotype; obtaining screening conditions for each phenotype, wherein the screening conditions comprise basic conditions; determining whether all phenotypes of an initial sample simultaneously satisfy their corresponding basic conditions; if so, setting that initial sample as a preliminary screening sample; and obtaining the preliminary screening sample set from the preliminary screening samples and their corresponding data sets;
     wherein screening the comprehensive score set to obtain the preferred sample set comprises: setting a high-quality percentage; sorting the preliminary screening samples in the comprehensive score set by comprehensive score value; selecting the top-ranked preliminary screening samples as candidate samples according to the high-quality percentage, and constructing a candidate sample set, wherein the candidate sample set comprises the candidate samples and their corresponding data sets; and screening the candidate sample set to obtain the preferred sample set;
     wherein the screening conditions further comprise high-quality conditions, and screening the candidate sample set to obtain the preferred sample set comprises: judging whether each phenotype of a candidate sample meets the high-quality condition corresponding to that phenotype; if at least one phenotype meets its high-quality condition, setting the corresponding candidate sample as a preferred sample; and constructing the preferred sample set, wherein the preferred sample set comprises the preferred samples and their corresponding data sets, the preferred samples being samples that both rank at the top by comprehensive score and have at least one phenotype that reaches its high-quality condition;
     and wherein, after the comprehensive score set is screened to obtain the preferred sample set, the method further comprises: obtaining a phenotype level for each phenotype in each preferred sample, wherein the phenotype level is either acceptable or premium; and constructing a dynamic interaction graph from the preferred sample set, wherein the dynamic interaction graph visually displays the sample serial number, the phenotypes, the phenotype value corresponding to each phenotype, and the phenotype level corresponding to each phenotype for each preferred sample.
  2. The germplasm screening method according to claim 1, wherein the phenotype level of each phenotype in the preferred sample is obtained by: setting the phenotype level of each phenotype in the preferred sample that satisfies its high-quality condition to premium, and setting the phenotype level of each phenotype in the preferred sample that does not satisfy its high-quality condition to acceptable.
  3. The germplasm screening method according to claim 1, wherein the calculation of the comprehensive score set from the normalized phenotype values, the comprehensive score set comprising each preliminary screening sample and its corresponding comprehensive score value, is implemented by the following formula: Formula (1), wherein Score represents the comprehensive score value, i represents the i-th phenotype in the data set, n represents the number of phenotypes in the data set, the remaining term represents the normalized phenotype value of the i-th phenotype, W_i represents any value within the phenotype weight range of the i-th phenotype, and T_i represents the phenotype trend coefficient of the i-th phenotype.
  4. The germplasm screening method according to claim 1, wherein, after the comprehensive score set is calculated from the normalized phenotype values and before the comprehensive score set is screened, all comprehensive score values are corrected to be positive.
  5. The germplasm screening method according to claim 3, wherein the normalization of the phenotype value of each phenotype of each preliminary screening sample, according to all the phenotype values of the same phenotype in the preliminary screening sample set, is implemented by the following formula: Formula (2), wherein phe_i represents the phenotype value of the i-th phenotype in a given preliminary screening sample, phe_min represents the minimum value of that phenotype across all preliminary screening samples, and phe_max represents the maximum value of that phenotype across all preliminary screening samples.
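
The expressions for Formula (1) and Formula (2) do not appear in this text. Going only by the symbol definitions in claims 3 and 5, one plausible reading is min-max normalization followed by a weighted, trend-signed sum. The symbol norm_i and the exact form of both expressions below are assumptions made for illustration, not the patent's confirmed formulas.

```latex
% Assumed form of Formula (2): min-max normalization of the i-th phenotype
\mathrm{norm}_i = \frac{phe_i - phe_{\min}}{phe_{\max} - phe_{\min}}

% Assumed form of Formula (1): weighted, trend-signed sum over the n phenotypes
\mathrm{Score} = \sum_{i=1}^{n} \mathrm{norm}_i \, W_i \, T_i
```

Under this reading, a phenotype value of 15 with phe_min = 10 and phe_max = 20 would normalize to 0.5.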

Description

Germplasm screening method

Technical Field

The application relates to the technical field of germplasm screening, and in particular to a germplasm screening method.

Background

In crop genetic improvement and breeding practice, mining and evaluating excellent germplasm resources is a fundamental step. As breeding targets shift from single-trait optimization to multiple traits such as yield, resistance, quality, and adaptability, traditional screening methods that rely on the performance of a single trait can no longer meet practical needs. In modern agriculture in particular, breeders must evaluate and balance several complex traits that may trade off against one another in order to select and breed materials with advantages in multiple respects. In the prior art, screening germplasm for multiple traits often relies on breeders manually setting thresholds and assessing each trait independently, or on simple subjective weighted comparisons. Because traits differ markedly in genetic background, environmental sensitivity, and measurement scale, such methods struggle to account for interactions and trade-offs among traits.

Disclosure of Invention

In view of the above drawbacks or shortcomings of the prior art, it is desirable to provide a germplasm screening method comprising the steps of: acquiring a preliminary screening sample set, wherein the preliminary screening sample set comprises the sample serial numbers of all preliminary screening samples and the data sets corresponding to the sample serial numbers, and each data set comprises a plurality of phenotypes and the phenotype value corresponding to each phenotype; normalizing the phenotype value of each phenotype of each preliminary screening sample according to all the phenotype values of the same phenotype in the preliminary screening sample set, to obtain a normalized phenotype value for each phenotype of each preliminary screening sample; calculating a comprehensive score set from the normalized phenotype values, wherein the comprehensive score set comprises each preliminary screening sample and its corresponding comprehensive score value; and screening the comprehensive score set to obtain a preferred sample set, wherein the preferred sample set comprises the sample serial numbers of a plurality of preferred samples and the data sets corresponding to the sample serial numbers.
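
The normalization and scoring steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the dictionary layout, the function names, the min-max form of the normalization, the weighted-sum form of the score, and the shift used to make scores positive are all hypothetical choices, since the text here does not reproduce the formulas.

```python
from typing import Dict

# Hypothetical layout: sample serial number -> {phenotype name: phenotype value}
Samples = Dict[str, Dict[str, float]]


def min_max_normalize(samples: Samples) -> Samples:
    """Scale each phenotype to [0, 1] using its min and max across all samples."""
    if not samples:
        return {}
    phenotypes = list(next(iter(samples.values())).keys())
    normalized: Samples = {sid: {} for sid in samples}
    for p in phenotypes:
        values = [data[p] for data in samples.values()]
        lo, hi = min(values), max(values)
        span = hi - lo
        for sid, data in samples.items():
            # Assumption: a phenotype with no spread is mapped to 0.0
            normalized[sid][p] = (data[p] - lo) / span if span else 0.0
    return normalized


def composite_scores(normalized: Samples,
                     weights: Dict[str, float],
                     trends: Dict[str, float]) -> Dict[str, float]:
    """Assumed score: sum over phenotypes of normalized value * weight * trend."""
    return {
        sid: sum(vals[p] * weights[p] * trends[p] for p in vals)
        for sid, vals in normalized.items()
    }


def shift_positive(scores: Dict[str, float], eps: float = 1e-9) -> Dict[str, float]:
    """One possible positivity correction: shift so the lowest score equals eps."""
    lowest = min(scores.values())
    if lowest > 0:
        return dict(scores)
    return {sid: s - lowest + eps for sid, s in scores.items()}
```

A trend coefficient of -1.0 for a trait where lower values are better (for example a hypothetical lodging score) makes that trait reduce the total, which is why some correction toward positive scores may be needed before ranking.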
According to the technical scheme provided by the application, acquiring the preliminary screening sample set comprises the following steps: acquiring an initial sample set, wherein the initial sample set comprises the sample serial numbers of all initial samples and the data sets corresponding to the sample serial numbers, and each data set comprises a plurality of phenotypes and the phenotype value corresponding to each phenotype; obtaining screening conditions for each phenotype, wherein the screening conditions comprise basic conditions; determining whether each phenotype satisfies its corresponding basic condition; if so, setting the corresponding initial sample as a preliminary screening sample; and obtaining the preliminary screening sample set from the preliminary screening samples and their corresponding data sets.

According to the technical scheme provided by the application, screening the comprehensive score set to obtain a preferred sample set, wherein the preferred sample set comprises the sample serial numbers of a plurality of preferred samples and the data sets corresponding to the sample serial numbers, comprises the following steps: setting a high-quality percentage; sorting the preliminary screening samples in the comprehensive score set by comprehensive score value; selecting the top-ranked preliminary screening samples as candidate samples according to the high-quality percentage, and constructing a candidate sample set, wherein the candidate sample set comprises the candidate samples and their corresponding data sets; and screening the candidate sample set to obtain the preferred sample set.
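
The two procedures above, preliminary screening by basic conditions and candidate selection by a high-quality percentage, might look like the following Python sketch. The function names, the representation of a condition as a predicate on a phenotype value, and the choice to round the cutoff up while keeping at least one sample are assumptions for illustration; the text does not specify them.

```python
import math
from typing import Callable, Dict, List

# Hypothetical layout: sample serial number -> {phenotype name: phenotype value}
Samples = Dict[str, Dict[str, float]]
# Hypothetical representation of a screening condition for one phenotype
Condition = Callable[[float], bool]


def prescreen(initial: Samples, basic: Dict[str, Condition]) -> Samples:
    """Keep only the samples whose phenotypes all pass their basic conditions."""
    return {
        sid: data
        for sid, data in initial.items()
        if all(check(data[p]) for p, check in basic.items())
    }


def top_candidates(scores: Dict[str, float], percentage: float) -> List[str]:
    """Return serial numbers of the top `percentage` percent of samples by score."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    # Assumption: round the cutoff up and keep at least one sample
    keep = max(1, math.ceil(len(ranked) * percentage / 100))
    return ranked[:keep]
```

For instance, with four scored samples and a high-quality percentage of 50, `top_candidates` would return the two highest-scoring serial numbers, which would then go on to the final screening step.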
According to the technical scheme provided by the application, the screening conditions further comprise high-quality conditions, and screening the candidate sample set to obtain the preferred sample set comprises the following steps: judging whether each phenotype of a candidate sample meets the high-quality condition corresponding to that phenotype; if at least one phenotype meets its high-quality condition, setting the corresponding candidate sample as a preferred sample; and constructing the preferred sample set, wherein the preferred sample set comprises th