CN-121681812-B - Method and system for intelligent identification and assessment of contract clause risk

Abstract

The invention discloses a method and system for intelligently identifying and assessing contract clause risk, in the technical field of intelligent contract review. The method comprises: obtaining a target contract document; adaptively segmenting clause boundaries and identifying legal attribute categories using a language model pre-trained on the contract domain; identifying risk patterns by semantic similarity matching against a multi-dimensional risk feature library; detecting structural flaws and calculating a rights-obligations parity index; constructing a two-dimensional matrix of risk severity and occurrence probability to calculate clause risk level scores; searching a legal provision knowledge base for each high-risk clause and automatically generating modification suggestion texts; and outputting a contract risk assessment report, clause annotation labels, and a negotiation key-point prompt list. The invention addresses the following problems in the prior art: inaccurate segmentation of clause boundaries, missing analysis of rights-obligations parity, single-dimension risk assessment, and the inability to generate modification suggestions.
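
The scoring steps can be illustrated with a small sketch. The claims describe the rights-obligations index as a weighted ratio of a rights strength value and an obligation strength value, and the clause risk score as a weighted sum of the severity and probability levels with an interaction coefficient. Everything concrete below is an assumption made for illustration: the function names, the direction of the ratio (rights over obligations), summing element strengths, the product form of the interaction term, and all numeric weights and thresholds.

```python
def parity_index(right_strengths, obligation_strengths, w_right=1.0, w_obligation=1.0):
    """Weighted ratio of summed rights strength to summed obligation strength.

    The ratio direction, the summation, and the weights are illustrative
    assumptions; the patent text does not fix them.
    """
    rights = w_right * sum(right_strengths)
    obligations = w_obligation * sum(obligation_strengths)
    if obligations == 0:
        # No obligations found: the index can never fall below the threshold.
        return float("inf")
    return rights / obligations


def has_unequal_risk(index, parity_threshold=0.6):
    """Flag the clause when the index falls below the preset parity threshold."""
    return index < parity_threshold


def clause_risk_score(severity, probability, w_s=0.5, w_p=0.3, gamma=0.2):
    """Weighted sum of the two levels plus an assumed product interaction term."""
    return w_s * severity + w_p * probability + gamma * severity * probability


# Hypothetical clause: two weak rights against three strong obligations.
index = parity_index([1, 1], [2, 3, 3])   # 2 / 8 = 0.25
flagged = has_unequal_risk(index)          # True, since 0.25 < 0.6
score = clause_risk_score(severity=4, probability=3)
print(index, flagged, round(score, 2))
```

Keeping severity and probability as separate inputs, rather than collapsing them early, is what lets the same clause be ranked differently under different weightings.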

Inventors

  • GAO YUXIN
  • XU JUN
  • DU BOWEN

Assignees

  • Nanjing University of Information Science and Technology (南京信息工程大学)

Dates

Publication Date: 2026-05-12
Application Date: 2026-02-10

Claims (9)

  1. A method for intelligently identifying and assessing contract clause risk, characterized by comprising the following steps:
     S1, multi-granularity clause parsing: acquiring a target contract document to be reviewed; performing adaptive clause boundary segmentation on the target contract document using a language model pre-trained on the contract domain to obtain a clause boundary segmentation result; and identifying the legal attribute category of each clause text segment, wherein the legal attribute categories comprise at least one of party (main body) clauses, subject-matter (target) clauses, price clauses, performance clauses, liability-for-breach clauses, dispute resolution clauses, confidentiality clauses, and force majeure clauses;
     S2, risk identification: performing risk identification on each clause text segment in the clause boundary segmentation result based on a multi-dimensional risk feature library; identifying a risk pattern matching result for each clause text segment through semantic similarity matching, wherein the risk pattern matching result comprises at least one of an unequal rights and obligations clause, a unilateral disclaimer trap clause, an unlimited joint and several liability clause, an unfavorable jurisdiction clause, and an ambiguous intellectual property ownership clause; performing structural flaw detection on each clause text segment to obtain a structural flaw detection result; and calculating a rights-obligations parity index value for each clause text segment, the calculating comprising: extracting a rights element set and an obligations element set from each clause text segment; calculating a rights strength value for the rights element set and an obligation strength value for the obligations element set; calculating the rights-obligations parity index value from the weighted ratio of the rights strength value and the obligation strength value; and, when the rights-obligations parity index value is below a preset parity threshold, determining that the clause text segment carries a risk of unequal rights and obligations;
     S3, risk assessment: determining the risk severity level of each clause text segment based on the risk pattern matching result and the structural flaw detection result; determining the occurrence probability level of each clause text segment based on the reference probability corresponding to its legal attribute category and on its rights-obligations parity index value; constructing a two-dimensional risk assessment matrix from the risk severity level and the occurrence probability level; and calculating the clause risk level score of each clause text segment;
     S4, modification suggestion generation: selecting high-risk clauses whose clause risk level scores exceed a preset risk threshold; searching a legal provision knowledge base for each high-risk clause to obtain matching legal provision basis information; and automatically generating modification suggestion texts based on the legal provision basis information; and
     S5, result output: generating a contract risk assessment report based on the clause risk level scores of the clause text segments; generating clause annotation data based on the risk pattern matching results and the modification suggestion texts; and generating a negotiation key-point prompt list based on the high-risk clauses and the legal provision basis information.
  2. The method of claim 1, wherein in step S1 the adaptive clause boundary segmentation comprises: performing text extraction preprocessing on the target contract document to obtain a document text sequence; calculating semantic consistency scores between adjacent text segments of the document text sequence using a sliding window mechanism; determining each position whose semantic consistency score is below a preset consistency threshold to be a clause boundary candidate point; and evaluating the boundary confidence of the clause boundary candidate points using the contract-domain pre-trained language model, retaining the candidate points whose boundary confidence exceeds a preset confidence threshold as final clause boundaries, and generating the clause boundary segmentation result.
  3. The method of claim 2, wherein calculating the semantic consistency score comprises: inputting the preceding text segment and the following text segment within the sliding window separately into the contract-domain pre-trained language model to obtain a preceding semantic vector and a following semantic vector; and calculating the cosine similarity of the preceding semantic vector and the following semantic vector as the semantic consistency score, according to the formula:
     $$C(i) = \frac{\mathbf{v}_i^{\mathrm{pre}} \cdot \mathbf{v}_i^{\mathrm{post}}}{\lVert \mathbf{v}_i^{\mathrm{pre}} \rVert \, \lVert \mathbf{v}_i^{\mathrm{post}} \rVert}$$
     where $C(i)$ denotes the semantic consistency score at position $i$, with values in $[-1, 1]$, a higher value indicating stronger semantic relatedness between the text before and after the position; $\mathbf{v}_i^{\mathrm{pre}}$ denotes the semantic vector of the text segment before position $i$; $\mathbf{v}_i^{\mathrm{post}}$ denotes the semantic vector of the text segment after position $i$; and $\lVert \cdot \rVert$ denotes the L2 norm of a vector.
  4. The method of claim 1, wherein in step S2 the semantic similarity matching comprises: inputting each clause text segment $c_i$ into the contract-domain pre-trained language model to obtain a clause semantic vector $\mathbf{v}_{c_i}$; inputting each risk pattern template $r_j$ in the multi-dimensional risk feature library into the contract-domain pre-trained language model to obtain a risk pattern semantic vector $\mathbf{v}_{r_j}$; and calculating the semantic similarity value between the clause semantic vector and each risk pattern semantic vector, according to the formula:
     $$\mathrm{Sim}(c_i, r_j) = \frac{\mathbf{v}_{c_i} \cdot \mathbf{v}_{r_j}}{\lVert \mathbf{v}_{c_i} \rVert \, \lVert \mathbf{v}_{r_j} \rVert}$$
     where $\mathrm{Sim}(c_i, r_j)$ denotes the semantic similarity between clause text segment $c_i$ and risk pattern template $r_j$, with values in $[-1, 1]$; when the semantic similarity value exceeds a preset similarity threshold, the corresponding risk pattern is determined to be present in the clause text segment, and a risk pattern matching result is generated.
  5. The method of claim 4, wherein in step S3 calculating the clause risk level score comprises: determining the risk severity level $S_i$ based on the type and number of risk patterns in the risk pattern matching result and on the structural flaw detection result; determining the occurrence probability level $P_i$ according to the reference probability corresponding to the legal attribute category and the degree of deviation of the rights-obligations parity index value; taking the risk severity level and the occurrence probability level as the row and column indexes of the two-dimensional risk assessment matrix; and calculating the clause risk level score based on the two-dimensional risk assessment matrix, the clause risk level score being a weighted sum of the risk severity level and the occurrence probability level, according to the formula:
     $$R(c_i) = w_S S_i + w_P P_i + \gamma \, S_i P_i$$
     where $R(c_i)$ denotes the overall risk level score of clause text segment $c_i$, $w_S$ denotes the severity weight, $w_P$ denotes the probability weight, and $\gamma$ denotes the interaction term coefficient.
  6. The method of claim 1, wherein in step S2 the structural flaw detection comprises: detecting whether any requisite contract clause is missing, wherein the requisite contract clauses comprise a party information clause, a subject-matter (target) clause, a price clause, and a performance deadline clause; detecting whether the key elements in each clause text segment are expressed completely, wherein the key elements comprise time elements, monetary amount elements, party elements, and place elements; and taking at least one of a missing requisite clause and an incomplete key element expression as the structural flaw detection result.
  7. The method of claim 1, wherein in step S4 generating the modification suggestion comprises: determining a modification direction label based on the risk pattern matching result of the high-risk clause; retrieving matching legal provision basis information from the legal provision knowledge base according to the modification direction label; generating a modification suggestion text from the original text of the high-risk clause and the legal provision basis information using a text generation model; and annotating the modification suggestion text with its association to the legal provision basis information.
  8. The method of claim 1, wherein in step S5 generating the negotiation key-point prompt list comprises: extracting a core point of dispute from each high-risk clause; prioritizing the core points of dispute according to the clause risk level scores; generating a negotiation strategy suggestion for each core point of dispute based on the legal provision basis information; and integrating the core points of dispute, their priority order, and the negotiation strategy suggestions into the negotiation key-point prompt list.
  9. A system for intelligently identifying and assessing contract clause risk, for implementing the method of any of claims 1-8, comprising:
     a multi-granularity clause parsing module, used for acquiring a target contract document to be reviewed, performing adaptive clause boundary segmentation on the target contract document using a language model pre-trained on the contract domain to obtain a clause boundary segmentation result, and identifying the legal attribute category of each clause text segment;
     a risk identification module, used for performing risk identification on each clause text segment in the clause boundary segmentation result based on a multi-dimensional risk feature library, identifying a risk pattern matching result for each clause text segment through semantic similarity matching, performing structural flaw detection on each clause text segment to obtain a structural flaw detection result, and calculating the rights-obligations parity index value of each clause text segment;
     a risk assessment module, used for determining the risk severity level of each clause text segment based on the risk pattern matching result and the structural flaw detection result, determining the occurrence probability level of each clause text segment based on the legal attribute category and the rights-obligations parity index value, constructing a two-dimensional risk assessment matrix from the risk severity level and the occurrence probability level, and calculating the clause risk level score of each clause text segment;
     a modification suggestion generation module, used for selecting high-risk clauses whose clause risk level scores exceed a preset risk threshold, searching a legal provision knowledge base for each high-risk clause to obtain matching legal provision basis information, and automatically generating modification suggestion texts based on the legal provision basis information; and
     an output module, used for generating a contract risk assessment report based on the clause risk level scores of the clause text segments, generating clause annotation data based on the risk pattern matching results and the modification suggestion texts, and generating a negotiation key-point prompt list based on the high-risk clauses and the legal provision basis information.
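
The cosine-similarity steps in claims 2 to 4 can be sketched in a few lines. This is a minimal illustration, not the patented implementation: `embed` is a hypothetical bag-of-words stand-in for the contract-domain pre-trained language model, the function names and thresholds are invented for the example, and the model-based boundary confidence check of claim 2 is left out. With count vectors the similarity falls in [0, 1] rather than the full [-1, 1] range of real embeddings.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for the language model: a sparse word-count vector."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either one is empty."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def boundary_candidates(segments, window=1, threshold=0.2):
    """Positions i where the text before i and the text from i on score below threshold."""
    candidates = []
    for i in range(1, len(segments)):
        before = " ".join(segments[max(0, i - window):i])
        after = " ".join(segments[i:i + window])
        if cosine(embed(before), embed(after)) < threshold:
            candidates.append(i)
    return candidates


def match_risk_patterns(clause, templates, threshold=0.5):
    """Names of risk templates whose similarity to the clause exceeds the threshold."""
    clause_vec = embed(clause)
    return [name for name, text in templates.items()
            if cosine(clause_vec, embed(text)) > threshold]


segments = [
    "the buyer shall pay the price within thirty days",
    "the buyer shall pay the price by bank transfer",
    "disputes are resolved by arbitration in nanjing",
]
templates = {  # hypothetical entries of a risk feature library
    "unilateral_disclaimer": "the seller is not liable for any loss",
    "unfavorable_jurisdiction": "disputes are resolved by arbitration",
}
print(boundary_candidates(segments, threshold=0.3))
print(match_risk_patterns(segments[2], templates))
```

In this toy run the first two sentences share most of their words and stay in one clause, while the third scores low against the second and becomes a boundary candidate; the same similarity function then matches it to the jurisdiction template.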

Description

Method and system for intelligent identification and assessment of contract clause risk

Technical Field

The invention relates to the technical field of intelligent contract review, and in particular to a method and system for intelligently identifying and assessing contract clause risk.

Background

In today's business environment, contracts are the core legal documents governing economic activity between enterprises, and their clauses bear directly on an enterprise's protection of its rights and interests and on its risk management. As business scale and transaction complexity grow, the contract review workload that enterprises face has grown explosively, and the traditional approach of manual review by legal staff can no longer meet actual demand.

Chinese patent publication No. CN118761735B discloses an electronic contract management method and system based on a deep learning model. That scheme identifies the classification of each contract clause, performs risk identification with a clause risk information identification model matched to the clause classification so as to extract key risk information, determines a clause risk score with a risk scoring model, generates an audit interaction control when the score exceeds a threshold, and finally stores the contract in an archive database according to an archive label. The scheme achieves automatic classification and risk identification of contract clauses to a certain extent and raises the level of automation in contract review.
However, the prior art still has the following technical defects. First, at the clause classification level, the existing scheme extracts keywords with the TF-IDF and RAKE algorithms and then matches them against a preset keyword list. Keyword-matching classification of this kind struggles to identify clause boundaries accurately; when contract clauses are phrased in varied ways and their boundaries are blurred, clauses are easily segmented incorrectly, which degrades the accuracy of subsequent risk identification. Second, at the risk identification level, although the existing scheme introduces knowledge graph embedding to enhance semantic understanding, its risk identification is limited to extracting risk information from the clause text. It cannot analyze in depth whether the rights and obligations of the two contracting parties are balanced, and it struggles to identify imbalances hidden in the clauses that are unfavorable to one's own side. Third, at the risk assessment level, the existing scheme uses a single-dimension risk scoring model that takes key risk information and clause classification as input and generates a score. A single score cannot fully reflect the multi-dimensional character of risk and, in particular, cannot distinguish the two key dimensions of severity and occurrence probability, which limits the guidance value of the assessment result. In addition, at the output level, the existing scheme stops at generating audit controls and archival storage and cannot automatically generate legally grounded modification suggestions for the identified high-risk clauses, so legal staff must still spend considerable time working out how to revise risky clauses, which reduces the overall efficiency of contract review.
Therefore, there is a need for a technical scheme for intelligently identifying and assessing contract clause risk that achieves accurate multi-granularity clause segmentation, in-depth analysis of rights-obligations parity, two-dimensional assessment of risk severity and occurrence probability, and automatic generation of modification suggestions.

Disclosure of Invention

To address the technical problems in the prior art of inaccurate contract clause boundary segmentation, missing analysis of rights-obligations parity, single-dimension risk assessment, and the inability to generate modification suggestions automatically, the invention provides a method and system for intelligently identifying and assessing contract clause risk.

In a first aspect, the present invention provides a method for intelligently identifying and assessing contract clause risk, including: S1, multi-granularity clause parsing: acquiring a target contract document to be reviewed, performing adaptive clause boundary segmentation on the target contract document using a language model pre-trained on the contract domain to obtain a clause boundary segmentation result, and identifying the legal attribute category of each clause text segment, wherein the legal attribute categories comprise at least one of main clauses, tar