CN-122020580-A - Large model-based bid-bidding document knowledge graph construction and association analysis method
Abstract
The invention relates to the technical field of intelligent analysis of bidding documents and discloses a large-model-based method for constructing a knowledge graph of bidding documents and performing association analysis on it. The method comprises: obtaining multi-source heterogeneous data related to bidding documents; cleaning and standardizing the data to generate data to be analyzed; inputting the data to be analyzed into a pre-trained domain large model to construct an initial knowledge graph; periodically obtaining newly added bid-invitation data and bid data, preprocessing them, and inputting them into the domain large model to generate a newly added result; fusing the newly added result into the initial knowledge graph to generate a bidding knowledge graph; performing multi-mode association analysis on the bidding knowledge graph and outputting a risk mode set; and generating and outputting a risk report according to the risk mode set. The method thereby realizes automatic construction, dynamic updating, and intelligent risk analysis of bidding knowledge, and improves the efficiency and risk-control capability of bidding management.
Inventors
- LEI WENJUN
- ZHANG ZHIJIE
- XIANG GONGJIN
Assignees
- 华能招采数字科技有限公司
Dates
- Publication Date
- 20260512
- Application Date
- 20251201
Claims (10)
- 1. A large-model-based method for constructing a knowledge graph of bidding documents and performing association analysis, characterized by comprising the following steps: acquiring multi-source heterogeneous data related to a bidding document, and performing data cleaning and standardization on the multi-source heterogeneous data to generate data to be analyzed; inputting the data to be analyzed into a pre-trained domain large model, and constructing an initial knowledge graph; periodically acquiring newly added bid-invitation data and bid data, preprocessing the newly added bid-invitation data and bid data, inputting them into the domain large model, and generating a newly added result; fusing the newly added result into the initial knowledge graph to generate a bidding knowledge graph; and performing multi-mode association analysis on the bidding knowledge graph, outputting a risk mode set, and generating and outputting a risk report according to the risk mode set.
- 2. The method according to claim 1, wherein acquiring the multi-source heterogeneous data related to the bidding document and performing data cleaning and standardization on the multi-source heterogeneous data to generate the data to be analyzed comprises: performing data cleaning and standardization on the acquired multi-source heterogeneous data to generate intermediate data; periodically adding, after auditing, the new entities and new relations generated by the domain large model during entity recognition to a preset list of bidding-domain keywords and entities; extracting key information fragments from the intermediate data according to the updated list of bidding-domain keywords and entities; and structurally recombining the key information fragments to generate the data to be analyzed.
- 3. The method according to claim 2, wherein inputting the data to be analyzed into the pre-trained domain large model and constructing the initial knowledge graph comprises: inputting the data to be analyzed into the pre-trained domain large model; performing entity recognition and relation extraction with the domain large model, and outputting standardized triples composed of entities, relations, and attributes; and mapping the standardized triples to a graph database, and constructing the initial knowledge graph with each entity as a node, each relation as an edge, and each attribute as a feature of a node.
- 4. The method according to claim 3, wherein performing entity recognition and relation extraction with the domain large model and outputting standardized triples composed of entities, relations, and attributes comprises: performing, by the domain large model, entity recognition and relation extraction through multi-step reasoning to generate an initial triple set; verifying the initial triple set based on bidding process logic and legal knowledge, and identifying abnormal triples that contain conflicts; and correcting the identified abnormal triples through a conflict resolution sub-process to generate the standardized triples.
- 5. The method according to claim 4, wherein correcting the identified abnormal triples through the conflict resolution sub-process to generate the standardized triples comprises: tracing each abnormal triple back to its original text, and locating all associated evidence sentences; calculating a semantic importance score for each evidence sentence with the domain large model; sorting all the evidence sentences by semantic importance score to generate ordered evidence sentences; correcting the abnormal triple based on the ordered evidence sentences to generate a plurality of correction candidate triples; ranking the correction candidate triples by an evaluation score; and selecting the correction candidate triple with the highest evaluation score as the final correction result and updating the standardized triples; wherein the evaluation score is calculated by the following formula: [formula not reproduced in the source text]; wherein E is the evaluation score, C is the model confidence, S is the semantic consistency score, and lambda is a coefficient that balances the weights of the model confidence and the semantic consistency, with a value range of [0,1].
- 6. The method according to claim 5, wherein mapping the standardized triples to the graph database and constructing the initial knowledge graph with each entity as a node, each relation as an edge, and each attribute as a feature of a node comprises: converting the standardized triples into a graph data structure, wherein each entity corresponds to a node, each relation corresponds to an edge, and each attribute corresponds to a feature value of a node; importing the graph data structure into the graph database, and building an index in the graph database based on the entity types and relation attributes in the graph data to generate an indexed graph; and checking and correcting the indexed graph, and outputting the initial knowledge graph.
- 7. The method according to claim 6, wherein periodically acquiring the newly added bid-invitation data and bid data, preprocessing them, inputting them into the domain large model, and generating the newly added result comprises: performing data cleaning and standardization on the newly added bid-invitation data and bid data to generate newly added intermediate data; extracting key information fragments from the newly added intermediate data based on the updated list of bidding-domain keywords and entities; structurally recombining the key information fragments to generate a standardized newly added data batch; performing incremental data routing on the newly added data batch, assigning it to a corresponding extraction mode according to its data characteristics; inputting the routed data into the domain large model, and performing entity recognition and relation extraction matching the assigned mode to generate an initial triple set; and performing logical consistency checking and conflict resolution on the initial triple set, and outputting a standardized triple set composed of newly added entities and relations as the newly added result.
- 8. The method according to claim 7, wherein performing incremental data routing on the newly added data batch and assigning it to a corresponding extraction mode according to its data characteristics comprises: calculating the semantic similarity between the newly added data batch and the existing items in the initial knowledge graph to obtain an overall similarity score; comparing the overall similarity score with a preset threshold, and classifying each item as a high-similarity item or a low-similarity new item according to the comparison result; and, based on the classification result, assigning a first extraction mode to high-similarity items and a second extraction mode to low-similarity new items; wherein the first extraction mode is a verification extraction mode based on the existing graph, and the second extraction mode is an extraction mode oriented to unknown domain structures.
- 9. The method according to claim 8, wherein fusing the newly added result into the initial knowledge graph to generate the bidding knowledge graph comprises: identifying the newly added entities in the standardized triple set; performing entity alignment between the newly added entities and the existing entities in the initial knowledge graph, and detecting and resolving entity conflicts; merging the successfully aligned entities, and adding the unaligned newly added entities to the initial knowledge graph as new nodes; establishing the corresponding edges based on the newly added relations, and updating the initial knowledge graph; and checking the updated knowledge graph to generate the bidding knowledge graph.
- 10. The method according to claim 9, wherein performing multi-mode association analysis on the bidding knowledge graph, outputting the risk mode set, and generating and outputting the risk report according to the risk mode set comprises: performing multi-mode association analysis on the bidding knowledge graph, and identifying risk modes in the analysis; aggregating and de-duplicating the identified risk modes to generate the risk mode set; and generating and outputting a corresponding risk report based on the risk mode set.
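The incremental data splitting (routing) step of claim 8 can be illustrated with a short Python sketch. It is a minimal example under stated assumptions, not the claimed implementation: the claims do not say how semantic similarity is computed, so a token-overlap (Jaccard) score stands in for it; taking the maximum over existing items as the "overall similarity score", the `>=` comparison, the default threshold of 0.5, and all function and constant names are hypothetical.

```python
# Hypothetical labels for the two extraction modes named in claim 8.
FIRST_MODE = "verify_against_existing_graph"   # verification extraction based on the existing graph
SECOND_MODE = "open_structure_extraction"      # extraction oriented to unknown domain structures


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity in [0, 1]. ASSUMPTION: a simple stand-in for
    the semantic similarity that the claims leave unspecified."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def overall_similarity(new_item: str, existing_items: list[str]) -> float:
    """ASSUMPTION: the overall score is the best match against any existing item."""
    return max((jaccard(new_item, item) for item in existing_items), default=0.0)


def route(new_item: str, existing_items: list[str], threshold: float = 0.5) -> str:
    """Compare the overall similarity score with a preset threshold and
    return the extraction mode assigned to the new item."""
    score = overall_similarity(new_item, existing_items)
    return FIRST_MODE if score >= threshold else SECOND_MODE


if __name__ == "__main__":
    existing = ["wind farm turbine procurement tender"]
    for item in ["wind farm turbine procurement tender phase two",
                 "office catering service"]:
        print(item, "->", route(item, existing))
```

In practice the Jaccard function would likely be replaced by an embedding-based similarity, and the threshold would be tuned on historical data; the routing logic itself stays the same.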
Description
Large model-based bid-bidding document knowledge graph construction and association analysis method
Technical Field
The invention relates to the technical field of intelligent processing and risk analysis of bidding documents, and in particular to a large-model-based method for constructing a knowledge graph of bidding documents and performing association analysis.
Background
As bidding activities become increasingly digitized and complex, bidding documents and related data have become multi-source, heterogeneous, and massive. Traditional risk analysis relies mainly on manual expert review or on keyword matching based on simple rules, and has clear shortcomings. First, it is difficult to extract entities and their complex relations from unstructured text accurately and efficiently, so knowledge construction is incomplete and inefficient. Second, traditional methods lack the ability to dynamically fuse newly added data and resolve conflicts, so the knowledge base is difficult to update in real time and information silos form easily. There is therefore an urgent need in the art for a method that can automatically and accurately construct a bidding knowledge system, dynamically fuse new data, and perform deep association analysis to identify complex risk patterns.
Disclosure of Invention
The invention aims to provide a large-model-based method for constructing a knowledge graph of bidding documents and performing association analysis, so as to solve problems in the prior art such as low knowledge construction efficiency, weak dynamic updating capability, and the lack of deep risk analysis.
In some embodiments of the present application, a large-model-based method for constructing a knowledge graph of bidding documents and performing association analysis is provided, including: acquiring multi-source heterogeneous data related to a bidding document, and performing data cleaning and standardization on the multi-source heterogeneous data to generate data to be analyzed; inputting the data to be analyzed into a pre-trained domain large model, and constructing an initial knowledge graph; periodically acquiring newly added bid-invitation data and bid data, preprocessing them, inputting them into the domain large model, and generating a newly added result; fusing the newly added result into the initial knowledge graph to generate a bidding knowledge graph; and performing multi-mode association analysis on the bidding knowledge graph, outputting a risk mode set, and generating and outputting a risk report according to the risk mode set.
In some embodiments of the present application, acquiring the multi-source heterogeneous data related to the bidding document and performing data cleaning and standardization on it to generate the data to be analyzed includes: performing data cleaning and standardization on the acquired multi-source heterogeneous data to generate intermediate data; periodically adding, after auditing, the new entities and new relations generated by the domain large model during entity recognition to a preset list of bidding-domain keywords and entities; extracting key information fragments from the intermediate data according to the updated list of bidding-domain keywords and entities; and structurally recombining the key information fragments to generate the data to be analyzed.
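The preprocessing embodiment above (cleaning, updating the keyword and entity list, extracting key information fragments, and structural recombination) can be sketched in Python as follows. This is only an illustration under assumptions introduced here: the source does not define the fragment granularity, the matching rule, or the recombined structure, so sentence-level fragments, case-insensitive substring matching, grouping by matched term, and all function names and sample terms are hypothetical.

```python
import re


def clean(text: str) -> str:
    """Collapse runs of whitespace. ASSUMPTION: a stand-in for the
    unspecified data cleaning and standardization step."""
    return re.sub(r"\s+", " ", text).strip()


def update_terms(terms: list[str], audited_new_terms: list[str]) -> list[str]:
    """Append audited new terms to the keyword/entity list, skipping
    case-insensitive duplicates and preserving order."""
    seen = {t.lower() for t in terms}
    merged = list(terms)
    for term in audited_new_terms:
        if term.lower() not in seen:
            merged.append(term)
            seen.add(term.lower())
    return merged


def extract_fragments(text: str, terms: list[str]) -> list[dict]:
    """Keep only the sentences that mention at least one listed term."""
    sentences = re.split(r"(?<=[.;!?])\s+", clean(text))
    fragments = []
    for sentence in sentences:
        hits = [t for t in terms if t.lower() in sentence.lower()]
        if sentence and hits:
            fragments.append({"text": sentence, "matched_terms": hits})
    return fragments


def recombine(fragments: list[dict]) -> dict[str, list[str]]:
    """Structural recombination, here simply grouping fragments by the
    term they matched (ASSUMPTION: the source does not fix the schema)."""
    grouped: dict[str, list[str]] = {}
    for fragment in fragments:
        for term in fragment["matched_terms"]:
            grouped.setdefault(term, []).append(fragment["text"])
    return grouped


if __name__ == "__main__":
    terms = update_terms(["tenderer"], ["bid bond", "Tenderer"])
    raw = "The tenderer is Acme Power.  Lunch is at noon. The bid bond is 50000 yuan."
    print(recombine(extract_fragments(raw, terms)))
```

The grouped dictionary plays the role of the "data to be analyzed" that would then be passed to the domain large model.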
In some embodiments of the present application, inputting the data to be analyzed into the pre-trained domain large model and constructing the initial knowledge graph includes: inputting the data to be analyzed into the pre-trained domain large model; performing entity recognition and relation extraction with the domain large model, and outputting standardized triples composed of entities, relations, and attributes; and mapping the standardized triples to a graph database, and constructing the initial knowledge graph with each entity as a node, each relation as an edge, and each attribute as a feature of a node.
In some embodiments of the present application, performing entity recognition and relation extraction with the domain large model and outputting standardized triples composed of entities, relations, and attributes includes: performing, by the domain large model, entity recognition and relation extraction through multi-step reasoning to generate an initial triple set; verifying the initial triple set based on bidding process logic and legal knowledge, and identifying abnormal triples that contain conflicts; and correcting the identified abnormal triples through a conflict resolution sub-process to generate the standardized triples.
In some embodiments of the present application, for an identified abnormal triplet, the abnormal triplet is corrected by a conflict resolution sub-flow, and when generating a standardized t