CN-121999872-A - Sugarcane parent detection and positioning system based on big data

Abstract

The invention discloses a big-data-based system for detecting and positioning sugarcane parents, in the technical field of agricultural biology. A data acquisition module adopts a spatiotemporal dynamic sampling strategy: it collects leaf samples for sequencing, adjusts the sampling frequency of a sensor network according to growth characteristics, and acquires gene expression change data, field growth data, soil composition data and climate parameters of the sugarcane parents at the seedling, tillering and maturity stages. A gene detection module constructs a gene detection model and establishes a gene database from the full-life-cycle gene expression data of multiple known sugarcane varieties. It compares the preprocessed gene data of the target sugarcane parent at each stage against this database and determines the variety type by calculating a stage-weighted genetic similarity, while also analyzing genetic purity and genetic relationship. The model is continuously optimized through a data verification and optimization method, with its parameters adjusted to maintain detection performance, which improves the accuracy of parent variety identification and genetic analysis.
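The stage-weighted similarity mentioned in the abstract can be illustrated with a minimal Python sketch. The formula itself is not reproduced in this text, so everything specific here is an assumption made for illustration: the normalization (matched genes divided by the size of the target's gene set), the stage weights, the gene and variety names, and the function names `stage_similarity`, `weighted_similarity` and `best_match`.

```python
from typing import Dict, Set, Tuple

STAGES = ("seedling", "tillering", "maturity")

# Hypothetical stage weights; the source does not state their values.
STAGE_WEIGHTS: Dict[str, float] = {"seedling": 0.2, "tillering": 0.3, "maturity": 0.5}

Profile = Dict[str, Set[str]]  # stage name -> set of characteristic gene IDs


def stage_similarity(target_genes: Set[str], known_genes: Set[str]) -> float:
    """Share of the target's characteristic genes also found in the known variety.

    Each gene contributes 1 on a match and 0 on a mismatch. Dividing by the
    size of the target set is an assumed normalization.
    """
    if not target_genes:
        return 0.0
    return len(target_genes & known_genes) / len(target_genes)


def weighted_similarity(target: Profile, known: Profile,
                        weights: Dict[str, float] = STAGE_WEIGHTS) -> float:
    """Combine the per-stage similarities using the stage weights."""
    total = sum(weights[s] for s in STAGES)
    weighted = sum(weights[s] * stage_similarity(target[s], known[s]) for s in STAGES)
    return weighted / total


def best_match(target: Profile, database: Dict[str, Profile]) -> Tuple[str, float]:
    """Return the known variety with the highest weighted similarity and its score."""
    scores = {name: weighted_similarity(target, profile)
              for name, profile in database.items()}
    name = max(scores, key=scores.get)
    return name, scores[name]
```

Under these assumptions, a variety that matches the target well at the heavily weighted maturity stage outranks one that matches only at the seedling stage, which is the effect that weighting by stage is meant to produce.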

Inventors

  • LI MING
  • TANG SHIYUN
  • He Panshan
  • JING YAN
  • FANG WEIKUAN
  • ZHOU HUI
  • LUO TING
  • YAN HAIFENG
  • WU XIAOQING

Assignees

  • Guangxi Academy of Agricultural Sciences (广西壮族自治区农业科学院)

Dates

Publication Date
2026-05-08
Application Date
2026-01-06

Claims (10)

  1. A sugarcane parent detection and positioning system based on big data, characterized by comprising: a data acquisition module, which adopts a spatiotemporal dynamic sampling strategy to collect leaf samples for sequencing, adjusts the sampling frequency of a sensor network according to growth characteristics, and acquires gene expression change data, field growth data, soil composition data and climate parameters of the sugarcane parent at the seedling stage, the tillering stage and the maturity stage; a big data storage and preprocessing module, which stores multi-source time-series data using a distributed architecture, cleans, de-duplicates and standardizes the data and extracts features, removes abnormal data, fills in missing stage data, filters high-frequency noise, and aligns data from different growth periods; a gene detection module, which constructs a gene detection model based on the preprocessed gene data, compares the data against a database of known varieties to determine the variety type, genetic purity and genetic relationship of the parent, optimizes the model parameters, and incrementally updates the database; a data association analysis module, which traces the parent back to its original planting plot through a dynamic weight association method, constructs a model based on gene data and historical environmental data to predict growth indices and disease and pest resistance, and assigns adaptability grades; and a visual display module, which presents detection, tracing and prediction results as charts and maps, supports gene expression-environment association heat maps, environmental data trajectory playback and plot tracing animation, and supports dual-terminal interaction, data query and export, and historical data comparison.
  2. The system for detecting and positioning sugarcane parents based on big data according to claim 1, characterized in that the data acquisition module adopts a spatiotemporal dynamic sampling strategy: it collects leaf samples at three key growth stages of the sugarcane parent (seedling, tillering and maturity) and performs transcriptome sequencing with gene sequencing equipment to obtain gene expression change data for each stage, including differentially expressed genes, gene methylation levels and transcriptome data; it deploys an Internet of Things sensor network in the sugarcane field, equipped with plant height sensors, stem diameter sensors and chlorophyll meters, to collect field growth data of the sugarcane parent at each growth stage in real time, including plant height, stem diameter, leaf number, chlorophyll content and growth cycle data, with the sampling frequency adjusted in step with the growth characteristics of each stage; and it collects soil composition data of the planting region periodically with soil sampling equipment, collects climate parameters in real time from a field weather station, and establishes a plot-level time-series database of environmental variables so that the environmental data and the parent data are associated along the time dimension.
  3. The sugarcane parent detection and positioning system based on big data, characterized in that the big data storage and preprocessing module uses a distributed file system to store massive time-series data, combined with a structured database that stores stage feature data including sensor IDs, sampling times and plot numbers, so that gene expression data, growth data and environmental data are accurately associated by timestamp; high-frequency environmental parameters are stored in a dedicated time-series database that supports efficient write and query operations to meet the system's data access speed requirements; in the data preprocessing stage, abnormal data, such as erroneous bases in gene data caused by equipment faults or sequencing errors and abnormal values collected by sensors, are identified and removed by an outlier identification technique; missing stage data are filled in by a time-series data completion technique so that data gaps do not distort the analysis results; high-frequency noise in the environmental data is removed by a noise filtering technique while the trend component of the data is retained; and the data of different individuals are aligned by a period-based data alignment technique, which eliminates analysis bias caused by differences in planting time and ensures that the data are consistent and comparable.
  4. The system for detecting and positioning sugarcane parents based on big data according to claim 1, wherein the gene detection module collects full-life-cycle gene expression data of a plurality of known sugarcane varieties, covering conventional varieties, hybrid varieties and stress-resistant varieties, stores the data classified by seedling stage, tillering stage and maturity stage, and establishes a time-labelled sugarcane parent gene database that supports incremental updating by growth stage; the module also constructs a gene detection model, inputs the preprocessed gene data of the target sugarcane parent at each stage into the model, performs multi-stage sequence alignment against the gene sequences of known varieties in the gene database, and calculates the gene similarity at each stage, while analyzing the genetic purity index and genetic relationship of the parent based on genetic marker data; and the gene detection model is continuously optimized through a data verification and optimization method.
  5. The system for detecting and positioning sugarcane parents based on big data according to claim 4, wherein the gene detection module calculates the similarity of genes in each stage by a specific formula [formula not reproduced in this text], whose terms are: the gene matching degree between the target parent and the known variety; the weight of the genes of a given stage; the characteristic gene set of the target parent at that stage; the characteristic gene set of the known variety at that stage; and the homology indicator for a pair of genes, one drawn from each set, which is 1 for a match and 0 for a mismatch.
  6. The sugarcane parent detection and positioning system based on big data according to claim 1, wherein the data association analysis module assigns dynamic weight coefficients to the environmental data of the different stages according to the parent's differing sensitivity to the environment at each growth stage, calculates the degree of association between the gene expression level at each stage and the soil components and climate parameters, screens out the highly associated key environmental factors and constructs a gene-environment association matrix, matches the time-series trajectory of the target parent's key environmental factors by similarity against a plot environment database containing historical environmental data of plots with known coordinates, and applies GPS reference coordinate correction to trace back the precise coordinates of the original planting plot.
  7. The system for detecting and positioning sugarcane parents based on big data according to claim 6, wherein the data association analysis module matches the time-series trajectory of the target parent's key environmental factors by similarity against a plot environment database containing historical environmental data of plots with known coordinates, using a matching degree formula [formula not reproduced in this text], whose terms are: the matching degree between the target parent and the candidate plot; the environmental weight of a given stage; the association weight of a given class of environmental factor; the target parent's observed value of that class of environmental factor at that stage; the candidate plot's historical value of that class of environmental factor at that stage; and the regional average of that class of environmental factor at that stage.
  8. The system for detecting and positioning sugarcane parents based on big data according to claim 1, wherein the data association analysis module constructs an adaptability prediction sub-model based on gene expression data, historical environmental data and a dynamic weight model; the model takes as input parameters the historical climate data of the target region, including long-term averages of temperature, precipitation and illumination, the soil type data, including pH value and nutrient content, and the parent's gene expression characteristics, including the number of differentially expressed genes and the expression level of stress-resistance genes; the model calculates and outputs the predicted growth indices and the disease and pest resistance results of the target parent at each growth stage in that region; a comprehensive adaptability score is then computed by combining yield, stress resistance and growth rate factors, and three adaptability grades (strong, medium and weak) are assigned, providing a scientific basis for the regional layout of varieties.
  9. The big-data-based sugarcane parent detection and positioning system according to claim 8, wherein the data association analysis module computes the comprehensive adaptability score by combining yield, stress resistance and growth rate factors, using an adaptability formula [formula not reproduced in this text], whose terms are: the comprehensive adaptability score; the adaptability weight of the k-th stage; the predicted yield index of the k-th stage; the predicted disease and pest risk index of the k-th stage; and the weight coefficients for yield and resistance, respectively.
  10. The big-data-based sugarcane parent detection and positioning system according to claim 9, wherein, for the comprehensive adaptability score: when the score is greater than 80 points, the adaptability is judged to be strong; when the score is from 60 to 80 points, the adaptability is judged to be medium; and when the score is less than 60 points, the adaptability is judged to be weak.
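The scoring and grading described in claims 9 and 10 can be sketched as follows. This is only an illustration under stated assumptions: the claim text here does not reproduce the formula, so the way the terms are combined (a stage-weighted sum of yield and resistance, with resistance taken as 100 minus the risk index), the 0 to 100 scale of the indices, the default coefficients and the function names are all hypothetical. Only the grade thresholds (above 80, 60 to 80, below 60) come from claim 10.

```python
from typing import Iterable, Tuple

# One entry per growth stage: (stage weight, predicted yield index,
# predicted disease and pest risk index). Indices are assumed to lie in 0-100.
StageInput = Tuple[float, float, float]


def composite_score(stages: Iterable[StageInput],
                    yield_coeff: float = 0.6,
                    resistance_coeff: float = 0.4) -> float:
    """Stage-weighted adaptability score on a 0-100 scale.

    Assumed combination, not the patent's formula: resistance is derived as
    (100 - risk), and the stage weights are normalized by their sum.
    """
    stages = list(stages)
    total_weight = sum(weight for weight, _, _ in stages)
    if total_weight <= 0:
        raise ValueError("stage weights must sum to a positive value")
    weighted = sum(
        weight * (yield_coeff * yield_idx + resistance_coeff * (100.0 - risk_idx))
        for weight, yield_idx, risk_idx in stages
    )
    return weighted / total_weight


def adaptability_grade(score: float) -> str:
    """Map a score to the three grades named in claim 10."""
    if score > 80:
        return "strong"
    if score >= 60:
        return "medium"
    return "weak"
```

With these assumptions, a parent whose weighted score comes out at 77 would be graded "medium", and a score of exactly 80 stays in the medium band because the strong grade requires a score above 80.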

Description

Sugarcane parent detection and positioning system based on big data

Technical Field

The invention relates to the technical field of agricultural biology, in particular to a sugarcane parent detection and positioning system based on big data.

Background

Sugarcane is a globally important sugar and energy crop and plays a key role in the agricultural economy. Sugarcane breeding is crucial for increasing yield, improving quality and enhancing stress resistance. With the rapid development of agricultural biotechnology and information technology, big data and gene sequencing technologies have advanced markedly and now provide strong technical support for innovative research in agriculture. In sugarcane breeding, how to make full use of these technologies to detect sugarcane parents efficiently and position them accurately is a key problem for the further development of the sugarcane industry.
Traditional technical systems for detecting and positioning sugarcane parents have many shortcomings. Traditional sugarcane parent detection relies mainly on morphological observation and simple molecular marker techniques. Morphological observation is strongly affected by subjective factors and cannot accurately reflect the genetic information of the sugarcane, so the detection results are inaccurate. Simple molecular marker techniques have a limited detection range, cannot fully capture the dynamic changes in sugarcane gene expression across growth stages, and have long detection cycles, which makes large-scale detection difficult. Existing positioning techniques can only record static coordinates and cannot trace a sugarcane parent back to its original planting plot from its genetic characteristics and growth data, which greatly hinders the management and use of parent resources. In addition, traditional techniques do not take the environmental factors of different regions into account and cannot scientifically predict the adaptability of sugarcane parents in each region. As a result, parent resources cannot be optimally allocated in practice, resource utilization is inefficient, and the breeding cycle is prolonged, which severely restricts the sustainable development of the sugarcane industry.
Disclosure of Invention

The invention aims to overcome the defects of the prior art and provides a sugarcane parent detection and positioning system based on big data. Through a data acquisition module that adopts a spatiotemporal dynamic sampling strategy, the system comprehensively collects gene expression change data, field growth data and environmental data of sugarcane parents at different growth stages, providing rich multidimensional time-series data for subsequent analysis. A distributed database architecture and data processing techniques provide efficient storage and high-quality preprocessing of the data. A gene detection module identifies parent varieties, analyzes genetic purity and determines genetic relationships. A data association analysis module builds a dynamic weight association method to trace back the precise coordinates of a parent's original planting plot, and constructs an adaptability prediction sub-model based on a dynamic weight model. Taken together, the system performs a deep fusion analysis of gene data, growth data and environmental data, provides full-process scientific support for parent screening, variety optimization and regional layout in sugarcane breeding, improves breeding efficiency and planting benefit, and shortens the sugarcane breeding cycle.
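The reverse tracing of the original planting plot rests on comparing the target parent's environmental trajectory with the historical records of candidate plots. A minimal Python sketch of such a comparison is given below. It is an assumed form, not the patent's formula, which is not reproduced in this text: the deviation between observed and historical values is scaled by the regional average and turned into a similarity clipped at zero, then weighted by stage and by environmental factor. The function names, factor names and weights are hypothetical, and the GPS reference coordinate correction step is not covered.

```python
from typing import Dict, Tuple

# stage name -> environmental factor name -> value
EnvTable = Dict[str, Dict[str, float]]


def plot_match(target: EnvTable, plot: EnvTable, regional_mean: EnvTable,
               stage_weights: Dict[str, float],
               factor_weights: Dict[str, float]) -> float:
    """Weighted similarity between a target parent and one candidate plot.

    Assumed scoring: 1 - |observed - historical| / regional average,
    clipped at 0, summed over stages and factors with their weights.
    """
    score = 0.0
    for stage, stage_w in stage_weights.items():
        for factor, factor_w in factor_weights.items():
            deviation = (abs(target[stage][factor] - plot[stage][factor])
                         / regional_mean[stage][factor])
            score += stage_w * factor_w * max(0.0, 1.0 - deviation)
    return score


def trace_plot(target: EnvTable, candidates: Dict[str, EnvTable],
               regional_mean: EnvTable, stage_weights: Dict[str, float],
               factor_weights: Dict[str, float]) -> Tuple[str, float]:
    """Return the candidate plot ID with the highest matching degree and its score."""
    scores = {
        plot_id: plot_match(target, env, regional_mean, stage_weights, factor_weights)
        for plot_id, env in candidates.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]
```

When the stage weights and the factor weights each sum to 1, a candidate plot whose history equals the target's observations scores 1.0, and the score falls as the deviations grow relative to the regional averages.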
The invention provides a sugarcane parent detection and positioning system based on big data, which comprises the following components: a data acquisition module, which adopts a spatiotemporal dynamic sampling strategy to collect leaf samples for sequencing, adjusts the sampling frequency of a sensor network according to growth characteristics, and acquires gene expression change data, field growth data, soil composition data and climate parameters of the sugarcane parent at the seedling stage, the tillering stage and the maturity stage; a big data storage and preprocessing module, which stores multi-source time-series data using a distributed architecture, cleans, de-duplicates and standardizes the data and extracts features, removes abnormal data, fills in missing stage data, filters high-frequency noise, and aligns data from different growth periods; a gene detection module, which constructs a detection model based on the preprocessed gene data, comparing the data against a database of known varieties, de