CN-121999870-A - Method and system for predicting breeding traits of Jinhua pigs based on knowledge graph
Abstract
The invention provides a method and a system for predicting the breeding traits of Jinhua pigs based on a knowledge graph, relating to the technical field of animal breeding. First, multi-source data covering genome, growth, pedigree, carcass, environment and three-dimensional morphology are acquired, and an initial knowledge graph that fuses the entities and their associations is constructed. Next, transmission chains running from an ancestral gene through a specific temperature environment to an offspring trait are searched in the graph to obtain cross-generation trait formation paths. The influence intensity of each path is then quantified, and the paths are competitively weighted and fused in the global graph to obtain a probability graph annotated with the contribution degree of each element. Finally, based on the probability graph, path subnetworks matching the future environment are screened for the individual to be predicted, and the predicted value of the target trait under the specified conditions is estimated by aggregating the joint contributions of the path subnetworks. The invention enables deep mining and quantification of complex breeding associations and improves the accuracy and interpretability of prediction.
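The first two stages described in the abstract (graph construction and the search for transmission chains) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the node naming scheme "type:id", the relation labels, the KnowledgeGraph class, the trait_paths function and the use of a plain depth-first search as the guided traversal.

```python
from collections import defaultdict


class KnowledgeGraph:
    """Tiny directed, weighted graph; node names follow a hypothetical 'type:id' scheme."""

    def __init__(self):
        # source node -> list of (target node, relation label, weight)
        self.edges = defaultdict(list)

    def add_edge(self, src, dst, relation, weight=1.0):
        self.edges[src].append((dst, relation, weight))


def node_type(node):
    """Return the part of the node name before the first colon."""
    return node.split(":", 1)[0]


def trait_paths(graph, start, target_type="trait", required_type="temp", max_depth=8):
    """Depth-first search for simple paths from `start` to any trait node.

    Only paths that pass through at least one temperature-interval node are kept.
    `max_depth` caps the number of nodes on a path.
    """
    found = []

    def dfs(node, path):
        if node_type(node) == target_type:
            if any(node_type(n) == required_type for n in path):
                found.append(list(path))
            return
        if len(path) >= max_depth:
            return
        for nxt, _relation, _weight in graph.edges.get(node, []):
            if nxt not in path:  # keep paths simple (no repeated nodes)
                path.append(nxt)
                dfs(nxt, path)
                path.pop()

    dfs(start, [start])
    return found


# Illustrative data only: one ancestor locus, two generations, two growth records.
kg = KnowledgeGraph()
kg.add_edge("locus:SNP1", "pig:sire", "carried_by")
kg.add_edge("pig:sire", "pig:offspring", "parent_of")
kg.add_edge("pig:offspring", "growth:day30", "recorded_on")
kg.add_edge("growth:day30", "growth:day60", "next_record")
kg.add_edge("growth:day60", "temp:hot", "experienced")
kg.add_edge("temp:hot", "trait:backfat_offspring", "modulates")
kg.add_edge("growth:day60", "trait:backfat_offspring", "precedes")

paths = trait_paths(kg, "locus:SNP1")
for p in paths:
    print(" -> ".join(p))
```

In this toy graph two chains reach the trait node, but only the one that crosses the temperature-interval node is returned, which mirrors the filtering for chains that pass through a preset temperature interval.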
Inventors
- ZHANG XIAOJUN
- HU XUJIN
- YE SHU
- FANG ZHENTAO
- LOU FANGFANG
- DU XIZHONG
- TU PINGGUANG
- ZHANG CHENGSAI
Assignees
- 金华市农业科学研究院(浙江省农业机械研究院)
Dates
- Publication Date: 2026-05-08
- Application Date: 2026-01-27
Claims (10)
- 1. A method for predicting the breeding traits of Jinhua pigs based on a knowledge graph, characterized by comprising the following steps:
  - acquiring basic data of individual Jinhua pigs, wherein the basic data comprise the whole genome sequencing sequence of each individual, daily weight and feed intake records from birth to delivery, the pedigree of parents and ancestors, carcass fat content and backfat thickness measured after slaughter, reproduction interval days recorded under a preset temperature interval, and whole-body three-dimensional point cloud scanning data;
  - constructing a graph of entities and relations based on the basic data, wherein each independent individual, each gene mutation site, each individual weight recording date, each preset temperature interval and each carcass measurement value is defined as a node, and directed, weighted edges connecting the nodes are constructed according to genetic logic, physiological time sequence, environmental influence and phenotype co-occurrence relations, to obtain an initial knowledge graph comprising entities, attributes and association relations;
  - performing a guided depth traversal of the initial knowledge graph, searching for transmission chains that start from an ancestor gene locus, pass through consecutive growth date nodes and finally connect to an offspring individual trait node, and identifying those chains that also pass through preset temperature interval nodes, to obtain a group of trait formation paths revealing cross-generation and cross-environment effects;
  - calculating, from the trait formation paths, the quantitative influence intensity of each path on the end-point trait value, and competitively weighting and superposing the influence intensities of different paths pointing to the same target trait in the global structure of the knowledge graph, to obtain a probability graph annotated with the contribution degree of each node and each edge to the trait;
  - performing trait prediction for the individual to be predicted according to the probability graph, screening and activating a path subnetwork matched with the individual to be predicted according to a preset environment node parameter, and estimating the predicted value of the target trait of the individual under the preset condition by aggregating the joint effect of all contribution degrees in the subnetwork.
- 2. The knowledge-graph-based method for predicting the breeding traits of Jinhua pigs according to claim 1, wherein constructing the graph of entities and relations based on the basic data comprises:
  - defining entity nodes and direct relations according to the basic data: the individual number of each Jinhua pig, each detected gene mutation site, the date of each growth record, the temperature interval of each reproduction event and each carcass measurement value are defined as independent nodes in the graph, and basic edges connecting the nodes are constructed according to the parent-offspring direction in the pedigree, the presence state of the gene sites in the whole genome sequencing sequence, and the attribution of carcass data to slaughtered individuals, to obtain a primary graph containing basic entities and direct relations;
  - integrating scenario-specific morphological data into the primary graph: the whole-body three-dimensional point cloud scanning data are analyzed to extract the coordinate set corresponding to a preset breeding evaluation, the coordinate set is taken as an attribute set describing the morphological structure of the corresponding individual node, a spatial mapping relation is established between the coordinate set and the carcass measurement value nodes, and the mapping relation is added to the primary graph as a new edge, to obtain an enhanced graph representing the three-dimensional morphological structure of each individual Jinhua pig;
  - encoding the time-series growth pattern in the enhanced graph: the weight and feed intake record nodes of the same individual are arranged in chronological order and each pair of consecutive nodes is connected, constructing time-series edges that describe the individual growth trajectory, and the time-series edges are associated with the specific temperature interval nodes experienced by the same individual, to obtain the initial knowledge graph containing time-series growth dynamics and key environmental event marks.
- 3. The knowledge-graph-based method for predicting the breeding traits of Jinhua pigs according to claim 1, wherein performing the guided depth traversal of the initial knowledge graph, searching for transmission chains that start from an ancestor gene locus, pass through consecutive growth date nodes and finally connect to an offspring individual trait node, and identifying those chains that also pass through preset temperature interval nodes comprises:
  - discovering basic inheritance and environment association paths in the initial knowledge graph: taking the gene locus nodes of known parent or ancestor individuals as starting points, a constrained depth traversal is performed along blood relationships and time sequence, and all candidate transmission paths that reach the carcass measurement value nodes of the parent or ancestor individuals are found, to obtain a group of basic association paths;
  - screening the basic association paths for key paths involving the preset temperature interval: the node sequence of each basic association path is checked, the paths containing preset temperature interval nodes are identified and retained, and the position of each preset temperature interval node in its path and the associated reproduction event are marked, to obtain a group of candidate paths with environmental marks;
  - constructing cross-generation continuous action paths from the candidate paths with environmental marks: all candidate paths that share a common gene locus starting point and end at the individual trait nodes of offspring of different generations are compared and connected, reconstructing multi-generation transmission chains in which an ancestor gene, through multi-generation inheritance and the preset temperature interval, is expressed as the same trait, to obtain the group of trait formation paths revealing cross-generation and cross-environment effects.
- 4. The knowledge-graph-based method for predicting the breeding traits of Jinhua pigs according to claim 1, wherein calculating the quantitative influence intensity of the trait formation paths on the end-point trait value, and competitively weighting and superposing the influence intensities of different paths pointing to the same target trait in the global structure of the knowledge graph, comprises:
  - calculating the influence intensity of each single path: for each trait formation path, the mutation state of the starting gene locus node, the weight record differences of the growth date nodes passed through and the specific value of the end-point carcass measurement value node are analyzed and, in combination with the time span of the path, the initial influence intensity of the path on the end-point trait value is quantified, to obtain a basic path set with initial intensity values;
  - modulating the path contributions for temperature exposure: the temperature interval nodes contained in each path of the basic path set are identified, the modulation effect of each such node on the physiological process along the path is evaluated according to the reproduction events associated with the node, and the initial influence intensity of the path is corrected for this scenario, to obtain a group of corrected path influence intensities;
  - fusing and annotating contributions on the global graph: the corrected intensities of all paths pointing to the same node are treated as contribution degrees that are both competitive and synergistic, and the contribution degrees are annotated back onto every node and edge the paths pass through, to obtain the probability graph annotated with the contribution degree of each node and each edge.
- 5. The knowledge-graph-based method for predicting the breeding traits of Jinhua pigs according to claim 1, wherein performing trait prediction for the individual to be predicted according to the probability graph comprises:
  - locating the elements related to the individual to be predicted in the probability graph: the independent individual node corresponding to the individual to be predicted is matched in the graph and all nodes associated with that node are located at the same time, to obtain the set of all related nodes of the individual to be predicted;
  - activating a path subnetwork according to the related node set: according to the preset temperature interval, all paths whose starting points belong to the related node set and which contain a node of the preset temperature interval are screened out of the probability graph, to obtain the path subnetwork matched with the individual to be predicted;
  - aggregating joint effects and deriving the trait value according to the path subnetwork: the contribution degrees of all paths in the subnetwork to the target trait node are aggregated along the graph topology, and the predicted value of the target trait of the individual to be predicted under the preset temperature interval is calculated based on a preset target trait predicted value calculation formula.
- 6. A system for predicting the breeding traits of Jinhua pigs based on a knowledge graph, characterized by comprising:
  - an acquisition unit, configured to acquire basic data of individual Jinhua pigs, wherein the basic data comprise the whole genome sequencing sequence of each individual, daily weight and feed intake records from birth to delivery, the pedigree of parents and ancestors, carcass fat content and backfat thickness measured after slaughter, reproduction interval days recorded under a preset temperature interval, and whole-body three-dimensional point cloud scanning data;
  - a construction unit, configured to construct a graph of entities and relations based on the basic data, wherein each independent individual, each gene mutation site, each individual weight recording date, each preset temperature interval and each carcass measurement value is defined as a node, and directed, weighted edges connecting the nodes are constructed according to genetic logic, physiological time sequence, environmental influence and phenotype co-occurrence relations, to obtain an initial knowledge graph comprising entities, attributes and association relations;
  - a traversal unit, configured to perform a guided depth traversal of the initial knowledge graph, search for transmission chains that start from an ancestor gene locus, pass through consecutive growth date nodes and finally connect to an offspring individual trait node, and identify those chains that also pass through preset temperature interval nodes, to obtain a group of trait formation paths revealing cross-generation and cross-environment effects;
  - a calculation unit, configured to calculate, from the trait formation paths, the quantitative influence intensity of each path on the end-point trait value, and to competitively weight and superpose the influence intensities of different paths pointing to the same target trait in the global structure of the knowledge graph, to obtain a probability graph annotated with the contribution degree of each node and each edge to the trait;
  - a prediction unit, configured to perform trait prediction for the individual to be predicted according to the probability graph, screen and activate a path subnetwork matched with the individual to be predicted according to a preset environment node parameter, and estimate the predicted value of the target trait of the individual under the preset condition by aggregating the joint effect of all contribution degrees in the subnetwork.
- 7. The knowledge-graph-based Jinhua pig breeding trait prediction system according to claim 6, wherein the construction unit comprises:
  - a first construction subunit, configured to define entity nodes and direct relations according to the basic data, wherein the individual number of each Jinhua pig, each detected gene mutation site, the date of each growth record, the temperature interval of each reproduction event and each carcass measurement value are defined as independent nodes in the graph, and basic edges connecting the nodes are constructed according to the parent-offspring direction in the pedigree, the presence state of the gene sites in the whole genome sequencing sequence, and the attribution of carcass data to slaughtered individuals, to obtain a primary graph containing basic entities and direct relations;
  - a second construction subunit, configured to integrate scenario-specific morphological data into the primary graph, wherein the whole-body three-dimensional point cloud scanning data are analyzed to extract the coordinate set corresponding to a preset breeding evaluation, the coordinate set is taken as an attribute set describing the morphological structure of the corresponding individual node, a spatial mapping relation is established between the coordinate set and the carcass measurement value nodes, and the mapping relation is added to the primary graph as a new edge, to obtain an enhanced graph representing the three-dimensional morphological structure of each individual Jinhua pig;
  - a third construction subunit, configured to encode the time-series growth pattern in the enhanced graph, wherein the weight and feed intake record nodes of the same individual are arranged in chronological order and each pair of consecutive nodes is connected, constructing time-series edges that describe the individual growth trajectory, and the time-series edges are associated with the specific temperature interval nodes experienced by the same individual, to obtain the initial knowledge graph containing time-series growth dynamics and key environmental event marks.
- 8. The knowledge-graph-based Jinhua pig breeding trait prediction system according to claim 6, wherein the traversal unit comprises:
  - a first traversal subunit, configured to discover basic inheritance and environment association paths in the initial knowledge graph, wherein, taking the gene locus nodes of known parent or ancestor individuals as starting points, a constrained depth traversal is performed along blood relationships and time sequence, and all candidate transmission paths that reach the carcass measurement value nodes of the parent or ancestor individuals are found, to obtain a group of basic association paths;
  - a second traversal subunit, configured to screen the basic association paths for key paths involving the preset temperature interval, wherein the node sequence of each basic association path is checked, the paths containing preset temperature interval nodes are identified and retained, and the position of each preset temperature interval node in its path and the associated reproduction event are marked, to obtain a group of candidate paths with environmental marks;
  - a third traversal subunit, configured to construct cross-generation continuous action paths from the candidate paths with environmental marks, wherein all candidate paths that share a common gene locus starting point and end at the individual trait nodes of offspring of different generations are compared and connected, reconstructing multi-generation transmission chains in which an ancestor gene, through multi-generation inheritance and the preset temperature interval, is expressed as the same trait, to obtain a group of trait formation paths revealing cross-generation and cross-environment effects.
- 9. The knowledge-graph-based Jinhua pig breeding trait prediction system according to claim 6, wherein the calculation unit comprises:
  - a first calculation subunit, configured to calculate the influence intensity of each single path, wherein, for each trait formation path, the mutation state of the starting gene locus node, the weight record differences of the growth date nodes passed through and the specific value of the end-point carcass measurement value node are analyzed and, in combination with the time span of the path, the initial influence intensity of the path on the end-point trait value is quantified, to obtain a basic path set with initial intensity values;
  - a second calculation subunit, configured to modulate the path contributions for temperature exposure, wherein the temperature interval nodes contained in each path of the basic path set are identified, the modulation effect of each such node on the physiological process along the path is evaluated according to the reproduction events associated with the node, and the initial influence intensity of the path is corrected for this scenario, to obtain a group of corrected path influence intensities;
  - a third calculation subunit, configured to fuse and annotate contributions on the global graph, wherein the corrected intensities of all paths pointing to the same node are treated as contribution degrees that are both competitive and synergistic, and the contribution degrees are annotated back onto every node and edge the paths pass through, to obtain a probability graph annotated with the contribution degree of each node and each edge.
- 10. The knowledge-graph-based Jinhua pig breeding trait prediction system according to claim 6, wherein the prediction unit comprises:
  - a first prediction subunit, configured to locate the elements related to the individual to be predicted in the probability graph, wherein the independent individual node corresponding to the individual to be predicted is matched in the graph and all nodes associated with that node are located at the same time, to obtain the set of all related nodes of the individual to be predicted;
  - a second prediction subunit, configured to activate a path subnetwork according to the related node set, wherein, according to the preset temperature interval, all paths whose starting points belong to the related node set and which contain a node of the preset temperature interval are screened out of the probability graph, to obtain the path subnetwork matched with the individual to be predicted;
  - a third prediction subunit, configured to aggregate joint effects and derive the trait value according to the path subnetwork, wherein the contribution degrees of all paths in the subnetwork to the target trait node are aggregated along the graph topology, and the predicted value of the target trait of the individual to be predicted under the preset temperature interval is calculated based on a preset target trait predicted value calculation formula.
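The claims refer to competitive weighting of path intensities, annotation of contribution degrees onto the graph, and a preset prediction formula, but this excerpt does not state any of the formulas. The sketch below is therefore only one possible reading under stated assumptions: a softmax stands in for the competitive weighting, contribution degrees are summed per node, and the prediction is a weighted mean over the paths that contain the requested temperature node. The function names, path identifiers and all numeric values are hypothetical.

```python
import math


def competitive_weights(intensities):
    """Softmax over corrected path intensities (an assumed form of 'competitive weighting').

    Paths pointing at the same target trait share a total weight of 1.0.
    """
    top = max(intensities.values())
    exps = {pid: math.exp(v - top) for pid, v in intensities.items()}
    total = sum(exps.values())
    return {pid: e / total for pid, e in exps.items()}


def annotate(paths, weights):
    """Add each path's weight to every node it passes through (contribution degree per node)."""
    contrib = {}
    for pid, nodes in paths.items():
        for node in nodes:
            contrib[node] = contrib.get(node, 0.0) + weights[pid]
    return contrib


def predict(paths, weights, path_values, temp_node):
    """Weighted mean of per-path trait values over paths containing `temp_node`.

    This stands in for the 'preset calculation formula', which the source does not give.
    """
    active = [pid for pid, nodes in paths.items() if temp_node in nodes]
    if not active:
        raise ValueError("no path matches the requested temperature interval")
    total = sum(weights[pid] for pid in active)
    return sum(weights[pid] * path_values[pid] for pid in active) / total


# Illustrative data only: three paths ending at the same backfat trait node.
paths = {
    "p1": ["locus:A", "pig:1", "temp:hot", "trait:backfat"],
    "p2": ["locus:B", "pig:1", "temp:hot", "trait:backfat"],
    "p3": ["locus:A", "pig:1", "temp:mild", "trait:backfat"],
}
corrected_intensity = {"p1": 1.0, "p2": 1.0, "p3": 0.0}  # after temperature modulation
path_values = {"p1": 30.0, "p2": 34.0, "p3": 28.0}  # hypothetical backfat values in mm

weights = competitive_weights(corrected_intensity)
contrib = annotate(paths, weights)
prediction = predict(paths, weights, path_values, "temp:hot")
print(round(prediction, 2))
```

Only p1 and p2 are activated for the hot interval, and because their intensities are equal the estimate is the midpoint of their two values. A different weighting rule or aggregation formula would slot into competitive_weights or predict without changing the rest of the sketch.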
Description
Method and system for predicting breeding traits of Jinhua pigs based on knowledge graph
Technical Field
In intelligent agriculture and animal breeding, data-driven trait prediction has become a key direction for improving breeding efficiency, in particular the integration of multidimensional trait data such as growth, reproduction and carcass data to achieve accurate breeding.
The breeding process of Jinhua pigs involves massive, multi-source, heterogeneous data with complex implicit spatio-temporal associations, such as genomes, continuous growth records, multi-generation pedigrees, slaughter measurements, environmental responses (for example, reproductive performance at specific temperatures) and three-dimensional body measurements. Deep network relations link cross-generation inheritance, gene-environment interaction, morphological development and the final economic traits in these data. Traditional quantitative genetics models or single machine learning methods usually split the data or establish associations through simplified statistical assumptions. Under a unified framework they have difficulty describing and tracing the complete, complex causal and non-causal chains that run from ancestor genes, through individual growth and development dynamics and modulation by specific environments, to specific phenotypes. It is especially difficult to quantify the staged influence of environmental factors in cross-generation transmission and the potential contribution of morphological structure details to the final traits, which limits the accuracy of target trait prediction for individuals under specified future conditions (such as specific temperature environments).
Therefore, a method and a system for predicting the breeding traits of Jinhua pigs based on a knowledge graph are needed to solve the above technical problems.
Disclosure of Invention
The invention aims to provide a method and a system for predicting the breeding traits of Jinhua pigs based on a knowledge graph, so as to solve the above problems. To achieve this purpose, the invention adopts the following technical scheme.
In a first aspect, the application provides a method for predicting the breeding traits of Jinhua pigs based on a knowledge graph, comprising the following steps:
acquiring basic data of individual Jinhua pigs, wherein the basic data comprise the whole genome sequencing sequence of each individual, daily weight and feed intake records from birth to delivery, the pedigree of parents and ancestors, carcass fat content and backfat thickness measured after slaughter, reproduction interval days recorded under a preset temperature interval, and whole-body three-dimensional point cloud scanning data;
constructing a graph of entities and relations based on the basic data, wherein each independent individual, each gene mutation site, each individual weight recording date, each preset temperature interval and each carcass measurement value is defined as a node, and directed, weighted edges connecting the nodes are constructed according to genetic logic, physiological time sequence, environmental influence and phenotype co-occurrence relations, to obtain an initial knowledge graph comprising entities, attributes and association relations;
performing a guided depth traversal of the initial knowledge graph, searching for transmission chains that start from an ancestor gene locus, pass through consecutive growth date nodes and finally connect to an offspring individual trait node, and identifying those chains that also pass through preset temperature interval nodes, to obtain a group of trait formation paths revealing cross-generation and cross-environment effects;
calculating, from the trait formation paths, the quantitative influence intensity of each path on the end-point trait value, and competitively weighting and superposing the influence intensities of different paths pointing to the same target trait in the global structure of the knowledge graph, to obtain a probability graph annotated with the contribution degree of each node and each edge to the trait;
performing trait prediction for the individual to be predicted according to the probability graph, screening and activating a path subnetwork matched with the individual to be predicted according to a preset environment node parameter, and estimating the predicted value of the target trait of the individual under the preset condition by aggregating the joint effect of all contribution degrees in the subnetwork.
In a second aspect, the application also provides a system for predicting the breeding traits of Jinhua pigs based on a knowledge graph, comprising: The ac