
CN-121998060-A - Academic information conflict resolution method based on evidence intensity index grading


Abstract

The invention discloses an academic information conflict resolution method based on evidence intensity index grading. The method retains all conflicting views and their traceability chains under a full-retention strategy with no merging and no deletion. For each document it calculates a multidimensional weighted evidence intensity index that combines the journal impact factor, the citation count, and the academic influence of the authors. Conflict severity is graded by the difference between the evidence intensity indexes of the conflicting parties: the smaller the difference, the higher the severity. Each conflict cluster is modeled in a graph database as an auxiliary node layer that is overlaid on the core graph without interfering with the ontology framework, and the severity grading result drives a calibrated discount in downstream weight calculation. The invention can be used for information conflict management in academic knowledge graph construction and provides an interpretable conflict severity grading reference.

Inventors

  • ZHANG JUNLING
  • CHEN HUAYONG
  • ZHOU HAOYANG
  • XIAO BING
  • WANG HAO
  • WU CHAO

Assignees

  • Guangzhou Institute of Geochemistry, Chinese Academy of Sciences (中国科学院广州地球化学研究所)

Dates

Publication Date
2026-05-08
Application Date
2026-03-17

Claims (8)

  1. An academic information conflict resolution method based on evidence intensity index grading, characterized by comprising the following steps:
     S1, in the triple extraction stage of the knowledge graph, identifying contradictory descriptions of the same fact supported by multiple sources, storing all conflicting views under a full-retention strategy, prohibiting automatic merging or deletion of any conflicting view in the extraction stage, and attaching a complete document traceability chain to each conflicting view;
     S2, de-duplicating documents, wherein when a document has a DOI, the DOI is used as the unique primary key of the document and different language versions with the same DOI are treated as the same document, and when a document has no DOI, a document fingerprint generated from the combination of the document title, the publication year, and the first author is used as the de-duplication key;
     S3, calculating a multidimensional weighted evidence intensity index ESI for each document supporting a conflicting view, the ESI being a weighted sum over three dimensions: the normalized impact factor of the journal in which the document is published, the normalized total citation count of the document, and the normalized academic influence index of the first author and the corresponding author;
     S4, for the conflicting views on the same fact, calculating the difference between the evidence intensity indexes of the corresponding documents and judging the conflict severity level from that difference, wherein the smaller the difference, the higher the severity;
     S5, modeling each group of conflicting views as a conflict auxiliary node in the graph database, wherein each conflicting view is connected to the conflict auxiliary node through a disputed-view relationship, conflicting views are linked to each other bidirectionally through a contradiction relationship, and the conflict auxiliary node is overlaid on the core graph as an annotation layer and does not participate in the definition of the ontology TBox framework of the core graph;
     S6, using the conflict severity grading result to drive a calibrated discount in the downstream weight calculation.
  2. The academic information conflict resolution method based on evidence intensity index grading according to claim 1, wherein the full version of the evidence intensity index ESI in S3 is calculated by the following formula:
     ESI = α × IF_norm + β × Citation_norm + γ × H_norm    (1)
     where IF_norm is the normalized impact factor of the journal in which the document is published, Citation_norm is the normalized total citation count of the document, H_norm is the normalized H-index of the first author or corresponding author of the document, α, β and γ are weight coefficients, and α + β + γ = 1;
     when the non-zero fill rate of the H-index data is lower than a preset threshold, the calculation automatically switches to the simplified formula:
     ESI_simple = α' × IF_norm + β' × Citation_norm    (2)
     where α' and β' are renormalized from α and β of the full version in their original ratio, i.e., α' = α/(α+β) and β' = β/(α+β).
  3. The academic information conflict resolution method based on evidence intensity index grading according to claim 1, wherein calculating the difference between the evidence intensity indexes of the corresponding documents in S4 comprises obtaining the evidence intensity indexes of the documents supporting each of the two conflicting views, denoted ESI_A and ESI_B, and taking the absolute value of their difference as the ESI difference, namely ΔESI = |ESI_A - ESI_B|.
  4. The academic information conflict resolution method based on evidence intensity index grading according to claim 1, wherein the judgment of the conflict severity level in S4 follows a counterintuitive logic, with the following rules:
     S41, high level (high): ESI difference ≤ 0.30; the evidence of the two parties is of similar authority, the disagreement is difficult to settle automatically by evidence intensity, and manual expert intervention is needed;
     S42, medium level (medium): 0.30 < ESI difference ≤ 0.50; there is a certain gap between the evidence of the two parties, but the disagreement is still substantive, and manual review is recommended;
     S43, low level (low): ESI difference > 0.50; the evidence of one party is clearly more authoritative, the conflict can be resolved naturally through evidence grading, and the view of the party with the higher ESI can be used as the preferred reference.
  5. The academic information conflict resolution method of claim 1, wherein the graph modeling of conflict auxiliary nodes in S5 includes creating, for each conflict cluster, a dispute cluster auxiliary node in the graph database whose node attributes comprise a conflict fact description, a conflict severity level, an ESI difference value, and a conflict state.
  6. The academic information conflict resolution method based on evidence intensity index grading, characterized in that the disputed-view relationship in S5 is established as follows: in the graph database, a directed relationship creation operation is executed with the conflicting-view triple node as the start node and the dispute cluster auxiliary node of the conflict cluster to which the view belongs as the target node; the relationship type is designated as disputed view; and the evidence intensity index score, the evidence intensity grade, and the document traceability chain of the document corresponding to the view are written into the attribute fields of the relationship.
  7. The academic information conflict resolution method based on evidence intensity index grading, characterized in that the contradiction relationship in S5 is established as follows: for each pair of triple nodes holding contradictory views within the same dispute cluster, two symmetric relationship creation operations are executed in the graph database, the first creating a contradiction relationship from the first triple node (start node) to the second triple node (target node), and the second creating a contradiction relationship from the second triple node (start node) to the first triple node (target node); the attributes of both relationships include the contradiction type.
  8. The academic information conflict resolution method based on evidence intensity index grading according to claim 1, wherein the downstream influence of the calibrated discount in S6 is calculated by the following formula:
     IWI_final = FW × ConflictCalibration    (3)
     where FW is the base fusion weight of the entity node, ConflictCalibration is the conflict calibration factor, which takes 0.85 at the high level and 1.0 at the medium and low levels, and IWI_final is the final comprehensive weight index after calibration.
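
The scoring and graph-modeling steps in claims 2 to 8 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the applicant's implementation. The default weights (0.4, 0.4, 0.2), the relationship names `DISPUTED_VIEW` and `CONTRADICTS`, the `status` value, the example inputs, and the plain tuple/dict graph representation are all assumptions introduced here; the thresholds and calibration factors are taken from claims 4 and 8, reading the medium band as 0.30 < ΔESI ≤ 0.50.

```python
from itertools import combinations

# Thresholds (claim 4) and calibration factors (claim 8) as stated in the claims.
HIGH_MAX_DELTA, MEDIUM_MAX_DELTA = 0.30, 0.50
CALIBRATION = {"high": 0.85, "medium": 1.0, "low": 1.0}


def esi(if_norm, citation_norm, h_norm, alpha=0.4, beta=0.4, gamma=0.2,
        simplified=False):
    """Evidence intensity index; the default weights are illustrative only."""
    if simplified:
        # Formula (2): drop the H-index term and renormalise alpha and beta
        # in their original ratio.
        total = alpha + beta
        return (alpha / total) * if_norm + (beta / total) * citation_norm
    # Formula (1): full three-dimensional weighted sum.
    return alpha * if_norm + beta * citation_norm + gamma * h_norm


def severity(esi_a, esi_b):
    """Return (delta, level); a smaller ESI gap means a more severe conflict."""
    delta = abs(esi_a - esi_b)
    if delta <= HIGH_MAX_DELTA:
        return delta, "high"
    if delta <= MEDIUM_MAX_DELTA:
        return delta, "medium"
    return delta, "low"


def calibrated_weight(fw, level):
    """Formula (3): IWI_final = FW x ConflictCalibration."""
    return fw * CALIBRATION[level]


def build_conflict_cluster(fact, views):
    """Build the auxiliary node and its edges for one conflict cluster.

    `views` maps a view (triple) id to the ESI of its supporting document.
    As a simplification, the cluster is graded by the gap between its
    strongest and weakest view, which matches claim 3 for two parties.
    """
    delta, level = severity(max(views.values()), min(views.values()))
    node = {"fact": fact, "severity": level, "esi_delta": delta, "status": "open"}
    # One directed edge from every view to the auxiliary node.
    edges = [(v, "DISPUTED_VIEW", fact, {"esi": s}) for v, s in views.items()]
    for a, b in combinations(views, 2):
        # Two symmetric operations give a bidirectional contradiction link.
        edges.append((a, "CONTRADICTS", b, {}))
        edges.append((b, "CONTRADICTS", a, {}))
    return node, edges


views = {
    "view_A": esi(0.8, 0.6, 0.5),  # about 0.66
    "view_B": esi(0.2, 0.3, 0.1),  # about 0.22
}
node, edges = build_conflict_cluster("origin of deposit X", views)
print(node["severity"], round(node["esi_delta"], 2))
print(calibrated_weight(1.0, node["severity"]))
```

Keeping the auxiliary node and its edges separate from the view triples mirrors the overlay idea in S5: the core graph is never modified, and only the calibrated weight feeds downstream calculation.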

Description

Academic information conflict resolution method based on evidence intensity index grading

Technical Field

The invention relates to the technical field of knowledge graphs, and in particular to an academic information conflict resolution method based on evidence intensity index grading.

Background

With the development of knowledge graph technology in academic research and industrial applications, automatic extraction of structured knowledge from multi-source heterogeneous documents has become a core step of knowledge graph construction. In practice, however, different documents frequently contradict each other in describing the same fact. This is especially true in empirical sciences such as geology, medicine, and ecology, where academic consensus evolves over time and different research teams, working from different samples, methods, and theoretical frameworks, often reach sharply different judgments on the origin, attributes, or classification of the same geological body.

Existing knowledge graph conflict handling methods rely mainly on voting mechanisms and confidence ranking. A voting mechanism follows a majority-wins strategy and selects the view supported by the most sources, while confidence ranking takes the view of the source with the highest confidence as the result. In the academic field, neither approach is reasonable. Voting implicitly assumes that the majority is correct, yet the history of science repeatedly shows that a minority view eventually replacing the majority consensus is the norm of academic evolution; for example, plate tectonics was long rejected by the majority who held to geosynclinal theory. Confidence ranking ignores the fact that highly cited documents may have accumulated more citations simply because they were published earlier, while recent studies with fewer citations may contain more accurate insights.
In addition, methods based on knowledge graph embedding (link prediction models such as TransE) can implicitly judge the plausibility of triples in vector space. Although they can evaluate the consistency of knowledge to a certain extent, they cannot preserve the complete views of both conflicting parties and cannot give an interpretable judgment of conflict severity. Constraint validation tools such as SHACL aim at "correcting errors" and treat inconsistencies as defects to be fixed. In academic contexts, however, conflicts are not errors but expressions of cognitive uncertainty, and forced "correction" would remove important minority views.

The prior art therefore lacks a systematic method that both retains the full set of conflicting views and provides a differentiated grading reference based on evidence intensity, so as to support the reliable application of knowledge graphs in academic information management. To address these problems, the invention provides an academic information conflict resolution method based on evidence intensity index grading, which solves the technical problem that the prior art cannot combine full retention with differentiated graded management of multi-source document conflicts during knowledge graph construction.
Disclosure of Invention

The invention aims to provide an academic information conflict resolution method based on evidence intensity index grading, which comprises the following steps:

S1, in the triple extraction stage of the knowledge graph, identifying contradictory descriptions of the same fact supported by multiple sources, storing all conflicting views under a full-retention strategy, prohibiting automatic merging or deletion of any conflicting view in the extraction stage, and attaching a complete document traceability chain to each conflicting view, the chain comprising the document DOI, the de-duplication fingerprint, the location of the original paragraph in the document, the extraction timestamp, and the extraction model version;

S2, de-duplicating documents, wherein when a document has a DOI, the DOI is used as the unique primary key of the document and different language versions with the same DOI are treated as the same document, and when a document has no DOI, a document fingerprint generated from the combination of the document title, the publication year, and the first author is used as the de-duplication key;

S3, calculating a multidimensional weighted evidence intensity index ESI for each document supporting a conflicting view, by weighted summation over three dimensions: the normalized impact factor, the normalized total citation count of the document, and the normalized academic influence index of the first author or corresponding author of the document of the journal in whic