CN-118568246-B - Civil law article recommendation method based on two-stage case matching
Abstract
The invention discloses a civil law article recommendation method based on two-stage case matching. The method comprises: using the legal information cited by each civil judgment case as features to divide the cases into case subsets; identifying the feature words of each case subset and expanding them with synonyms; calculating the matching degree sim_g between a new civil dispute scenario and the feature words of each case subset, and selecting the case subset with the largest sim_g; calculating the similarity between the new civil dispute scenario and each case i in the selected subset to obtain a set of similar cases (Step4); obtaining a preliminary recommendation of the law articles applicable to the new scenario from the law articles cited by the cases in the similar case set; and correcting that preliminary result (Step6) to obtain the final law article recommendation. The method can quickly and effectively find similar cases among a large number of cases and recommend law articles more accurately on that basis, and its process is highly interpretable.
Inventors
- CUI SHIBO
- LI XINRAN
- WANG NING
- ZHANG YI
- WANG RUNZHE
- ZHANG JIELIN
- YAN ZIYU
- ZHANG LEI
- YE XIN
- SU XIAOYAN
Assignees
- Dalian University of Technology (大连理工大学)
Dates
- Publication Date
- 20260508
- Application Date
- 20240524
Claims (9)
- 1. A civil law article recommendation method based on two-stage case matching, characterized by comprising the following steps: Step1, using the legal information cited by each civil judgment case as features, clustering the cases with a spectral clustering algorithm to divide them into k case subsets; Step2, for each case subset $g$, identifying the feature words $KEY_g$ of the subset with the TGF-IGF algorithm, and expanding the feature words with synonyms through the Word2Vec algorithm to obtain the expanded feature word set $KEY\_MORE_g$; the TGF-IGF algorithm obtains, for each word $w$ in each class of cases, the final weight of the word in that class by simultaneously measuring the Term-Group Frequency (TGF) of the word within the class and its Inverse Group Frequency (IGF) over the other classes, wherein for a word $w$ the TGF value in the $g$-th class is calculated as: $TGF_{w,g} = n_{w,g} / N_g$, where $n_{w,g}$ is the number of cases in the $g$-th class in which the word $w$ appears and $N_g$ is the total number of cases in the $g$-th class; the IGF value of the word $w$ is obtained by taking the logarithm of the reciprocal of the average proportion of cases containing the word in the other classes, calculated as: $IGF_{w,g} = \log\left(\frac{K-1}{\sum_{j \neq g} n_{w,j}/N_j}\right)$, where $K$ is the total number of case classes, $K-1$ is the number of case classes other than the $g$-th class, and $n_{w,j}/N_j$ is the proportion of cases in class $j$ in which the word $w$ appears; the TGF-IGF value is then calculated as: $TGF\text{-}IGF_{w,g} = TGF_{w,g} \times IGF_{w,g}$; for the plaintiff's claims in the case subset $g$, the score of each word is obtained with the TGF-IGF algorithm, and the Top-N words are selected as the feature word set of that part; Step3, calculating the similarity $sim_g$ between the new civil dispute scenario $ak_{new}$ and the expanded feature word set $KEY\_MORE_g$ of each case subset, and selecting the case subset $AS$ with the largest $sim_g$ as the case set for subsequent similar-case matching; Step4, calculating the similarity $sim(new, i)$ between the new civil dispute scenario and each case $i$ in the case subset, and obtaining the final similar case set $AS_{match}$ according to the $sim(new, i)$ values; Step5, obtaining the preliminary recommendation result $P_{new}$ of the law articles applicable to the new civil dispute scenario from the law articles cited by the cases in the similar case set $AS_{match}$; Step6, using the Apriori algorithm to mine the law article citation association rule set $R$ in $AS$, and using the association rule set to correct $P_{new}$ to obtain the final law article recommendation result.
- 2. The civil law article recommendation method based on two-stage case matching according to claim 1, wherein Step1 specifically comprises: Step1-1, the legal information cited by a case comprises legal unit information and law article information, wherein the "legal unit" is a refinement of the concept of "law" and is specifically defined as follows: each chapter of the Civil Code is regarded as one legal unit, and every other law is regarded as one legal unit as a whole; that is, for a given law $l$, its corresponding legal unit $l'$ is a chapter of $l$ if $l$ is the Civil Code, and $l$ itself otherwise; each legal unit is composed of several law articles, so that the legal units cited by a case are expressed as $L = \{l_1', l_2', \dots, l_m'\}$ and the law articles cited are expressed as $P = \{p_1, p_2, \dots, p_y\}$; Step1-2, the legal unit similarity $J(L_i, L_j)$ and the law article similarity $J(P_i, P_j)$ between different cases $a_i$ and $a_j$ are measured with the Jaccard coefficient, namely: $J(L_i, L_j) = \frac{|L_i \cap L_j|}{|L_i \cup L_j|}$ and $J(P_i, P_j) = \frac{|P_i \cap P_j|}{|P_i \cup P_j|}$; on this basis, the overall similarity of the cases is expressed as: $J(a_i, a_j) = \alpha \cdot J(L_i, L_j) + \beta \cdot J(P_i, P_j)$, where $\alpha$ and $\beta$ are coefficients and $\alpha + \beta = 1$; Step1-3, on the basis of the Jaccard similarity, first-stage clustering is performed on the case corpus $A$ with a spectral clustering algorithm; at this stage the differences in legal units and the differences in law articles between cases are given equal attention, so $\alpha = 0.5$, $\beta = 0.5$ is set, namely: $J(a_i, a_j) = 0.5 \cdot J(L_i, L_j) + 0.5 \cdot J(P_i, P_j)$; the case corpus is clustered following the spectral clustering procedure, and the cases are divided into k_1 primary case subsets, using the silhouette coefficient as the evaluation index of the clustering quality under different numbers of clusters; Step1-4, second-stage clustering is performed with a spectral clustering algorithm on the cases in each primary case subset; at this stage the focus is on the differences in law articles between cases, so $\alpha = 0.2$, $\beta = 0.8$ is set, namely: $J(a_i, a_j) = 0.2 \cdot J(L_i, L_j) + 0.8 \cdot J(P_i, P_j)$; the cases in each primary case subset are clustered following the spectral clustering procedure, and each primary case subset is divided into k_2 secondary subsets, using the silhouette coefficient as the evaluation index of the clustering quality under different numbers of clusters, whereby the two-stage clustering divides the case corpus $A$ into $k$ case subsets.
- 3. The civil law article recommendation method based on two-stage case matching according to claim 1 or 2, wherein Step2 specifically comprises: Step2-1, extracting the plaintiff's claims and the court-found facts of each case with regular expressions, and obtaining the specific text of the law articles cited by each case from the article numbers cited in the case and published legal data; Step2-2, obtaining the plaintiff's-claim feature words of the case subset $g$ with the TGF-IGF algorithm; Step2-3, obtaining the court-found-fact feature words of the case subset $g$ with the TGF-IGF algorithm; Step2-4, obtaining the cited-article-text feature words of the case subset $g$ with the TGF-IGF algorithm; Step2-5, merging the three sets of feature words to obtain the final feature word set $KEY_g$; Step2-6, for each feature word in the feature word set $KEY_g$, obtaining the five most similar words with the Word2Vec model to form the expanded word set associated with that feature word; the expanded word sets of all feature words together form the expanded feature word set $KEY\_MORE_g$.
- 4. The civil law article recommendation method based on two-stage case matching according to claim 1 or 2, wherein Step3 specifically comprises: Step3-1, for the new civil dispute scenario $ak_{new}$, obtaining the new scenario word set after word segmentation and stop-word removal; Step3-2, for each case subset $g$, calculating the correlation between the new scenario word set and the expanded feature word set $KEY\_MORE_g$ by first calculating the similarity between the scenario word set and the expanded word set associated with each feature word in the feature word set $KEY_g$, this similarity being computed from the number of words shared by the scenario word set and that expanded word set; Step3-3, on this basis, calculating the similarity $sim_g$ between the scenario word set and the expanded word sets of all feature words, the result being the matching degree between the new civil dispute scenario $ak_{new}$ and the case subset $g$, and selecting the case subset $AS$ with the largest $sim_g$ as the dataset for subsequent similar-case matching.
- 5. The civil law article recommendation method based on two-stage case matching according to claim 3, wherein Step3 specifically comprises: Step3-1, for the new civil dispute scenario $ak_{new}$, obtaining the new scenario word set after word segmentation and stop-word removal; Step3-2, for each case subset $g$, calculating the correlation between the new scenario word set and the expanded feature word set $KEY\_MORE_g$ by first calculating the similarity between the scenario word set and the expanded word set associated with each feature word in the feature word set $KEY_g$, this similarity being computed from the number of words shared by the scenario word set and that expanded word set; Step3-3, on this basis, calculating the similarity $sim_g$ between the scenario word set and the expanded word sets of all feature words, the result being the matching degree between the new civil dispute scenario $ak_{new}$ and the case subset $g$, and selecting the case subset $AS$ with the largest $sim_g$ as the dataset for subsequent similar-case matching.
- 6. The civil law article recommendation method based on two-stage case matching according to claim 1, 2 or 5, wherein Step4 specifically comprises: Step4-1, for each case $i$ in the case subset $AS$, using RoFormer-Sim to measure the text similarity between the new input scenario and the corresponding part of case $i$ for each of the two parts, the plaintiff's claims and the court-found facts, thereby obtaining one similarity set for each part; Step4-2, following the entropy weight method, calculating the information entropy of each of the two similarity sets and from these the weight of each part, the similarity $sim(new, i)$ between the new scenario and case $i$ then being calculated from the two part similarities and their weights; Step4-3, a group of $sim(new, i)$ values having been obtained by calculation, setting the similarity threshold for second-stage matching and selecting the cases whose $sim(new, i)$ is greater than or equal to the threshold; if more than 10 cases meet the requirement, the 10 cases with the highest $sim(new, i)$ are incorporated into the final similar case set $AS_{match}$; otherwise all qualifying cases are incorporated.
- 7. The civil law article recommendation method based on two-stage case matching according to claim 1, 2 or 5, wherein Step5 specifically comprises: Step5-1, traversing the cases in $AS_{match}$ in descending order of $sim(new, i)$; for the first case traversed, the recommendation index of every law article in the set of articles it cites is recorded as 1 and the articles are added to the recommendation set; for the second case traversed, the law articles it cites are likewise added to the recommendation set, the recommendation indexes of the same law article being accumulated; and so on, with the recommendation index assigned per case decreasing by 0.1 each time but never falling below 0.1; Step5-2, after the traversal, screening out the law articles whose recommendation index is greater than or equal to 2; if the number of qualifying law articles exceeds the set maximum number, the law articles with the highest recommendation indexes, up to that number, are taken to form the preliminary law article recommendation set $P_{new}$.
- 8. The civil law article recommendation method based on two-stage case matching according to claim 1, 2 or 5, wherein Step6 specifically comprises: Step6-1, for the sets of law articles cited by the cases in the case subset $AS$, obtaining all qualifying law article association rules with an Apriori-based association rule algorithm and constructing the law article association rule set $R$; Step6-2, sorting the rules by confidence and traversing $R$; if the law articles in the antecedent of a rule are in $P_{new}$, determining whether the law articles in the consequent of the rule are in $P_{new}$, and if a law article in the consequent is not in $P_{new}$, adding it to the recommendation set; if the law articles in the antecedent are not in $P_{new}$, ignoring the rule; Step6-3, after the traversal, counting the law articles in the recommendation set; if their number exceeds the set maximum number, the law articles with the highest recommendation indexes, up to that number, are taken to form the final law article recommendation set, at which point the law article recommendation process is complete.
- 9. The civil law article recommendation method based on two-stage case matching according to claim 3 or 4, wherein Step6 specifically comprises: Step6-1, for the sets of law articles cited by the cases in the case subset $AS$, obtaining all qualifying law article association rules with an Apriori-based association rule algorithm and constructing the law article association rule set $R$; Step6-2, sorting the rules by confidence and traversing $R$; if the law articles in the antecedent of a rule are in $P_{new}$, determining whether the law articles in the consequent of the rule are in $P_{new}$, and if a law article in the consequent is not in $P_{new}$, adding it to the recommendation set; if the law articles in the antecedent are not in $P_{new}$, ignoring the rule; Step6-3, after the traversal, counting the law articles in the recommendation set; if their number exceeds the set maximum number, the law articles with the highest recommendation indexes, up to that number, are taken to form the final law article recommendation set, at which point the law article recommendation process is complete.
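As an illustration of Steps 5 and 6 in claims 7 to 9, the following Python sketch accumulates recommendation indexes over the similar cases and then applies association rules that have already been mined. It is a minimal sketch under stated assumptions, not the patented implementation: the function names, the `max_articles` parameter (standing in for the maximum number of articles named in the claims), the tie-breaking order, the choice to test rule antecedents against the growing recommendation list, and the labels `p1`, `p2`, ... are all hypothetical. The Apriori mining itself is not shown; rules are passed in as (antecedent, consequent, confidence) triples.

```python
from collections import defaultdict


def preliminary_recommendation(similar_cases, max_articles):
    """Step5 sketch: similar_cases is a list of (similarity, cited_articles)."""
    # Traverse cases from most to least similar.
    ranked = sorted(similar_cases, key=lambda case: case[0], reverse=True)
    index_tenths = defaultdict(int)  # recommendation index in tenths (exact ints)
    for rank, (_, articles) in enumerate(ranked):
        weight = max(10 - rank, 1)  # 1.0, 0.9, 0.8, ... floored at 0.1
        for article in set(articles):
            index_tenths[article] += weight
    # Keep only articles whose accumulated index is at least 2.
    kept = [(a, t) for a, t in index_tenths.items() if t >= 20]
    # Assumption: ties are broken by article label to keep output deterministic.
    kept.sort(key=lambda item: (-item[1], item[0]))
    return [(a, t / 10) for a, t in kept[:max_articles]]


def correct_with_rules(preliminary, rules, max_articles):
    """Step6 sketch: rules are (antecedent, consequent, confidence) triples."""
    recommended = [article for article, _ in preliminary]
    by_confidence = sorted(rules, key=lambda rule: rule[2], reverse=True)
    for antecedent, consequent, _ in by_confidence:
        # Assumption: the antecedent is tested against the current list,
        # including articles added by earlier (higher-confidence) rules.
        if set(antecedent).issubset(recommended):
            for article in sorted(consequent):
                if article not in recommended:
                    recommended.append(article)
        # Rules whose antecedent is not fully recommended are ignored.
    # Assumption: rule-added articles rank after the Step5 articles.
    return recommended[:max_articles]


cases = [
    (0.9, ["p1", "p2"]),
    (0.8, ["p1", "p3"]),
    (0.7, ["p1", "p2"]),
    (0.6, ["p3"]),
    (0.5, ["p2"]),
]
rules = [({"p1"}, {"p3"}, 0.8), ({"p9"}, {"p4"}, 0.95)]

preliminary = preliminary_recommendation(cases, max_articles=5)
final = correct_with_rules(preliminary, rules, max_articles=5)
print(preliminary)  # [('p1', 2.7), ('p2', 2.4)]
print(final)        # ['p1', 'p2', 'p3']
```

In this toy input `p3` only reaches an index of 1.6 and is dropped by the Step5 threshold, but the rule `p1 -> p3` brings it back in Step6, which is the kind of correction the association rules are meant to provide. Indexes are kept as integer tenths so that the threshold comparison is not affected by floating-point rounding.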
Description
Civil law article recommendation method based on two-stage case matching
Technical Field
The invention belongs to the technical field of legal intelligence and the field of case management, and relates to a civil law article recommendation method based on two-stage case matching.
Background
Against the broader background of smart justice and smart social governance, resolving civil disputes efficiently is of great value. Intelligently recommending the applicable law articles for a new dispute scenario can improve the parties' understanding of the law and encourage them to resolve the civil dispute on their own. For those disputes that have entered litigation and trial, intelligent law article recommendation can help judicial personnel decide cases more efficiently. Drawing on similar decided cases to assist the decision in a new scenario is a practice with strong interpretability. How to find cases similar to the new dispute scenario quickly and effectively among a large number of cases, and how to recommend law articles more accurately on the basis of those similar cases, are the core problems to be solved.
Disclosure of Invention
In view of the technical problems described in the background, the invention provides a civil law article recommendation method based on two-stage case matching.
For a new civil dispute scenario $ak_{new}$, the recommendation of applicable law articles is obtained through the following steps:
Step1, using the legal information cited by each civil judgment case as features, clustering the cases with a spectral clustering algorithm to divide a large number of cases into k case subsets;
Step2, for each case subset $g$, identifying the feature words $KEY_g$ of the subset with the TGF-IGF algorithm, and expanding the feature words with synonyms through the Word2Vec algorithm to obtain $KEY\_MORE_g$;
Step3, calculating the matching degree $sim_g$ between the new civil dispute scenario $ak_{new}$ and the feature words of each case subset, and selecting the case subset $AS$ with the largest $sim_g$ as the case set for subsequent similar-case matching;
Step4, calculating the similarity $sim(new, i)$ between the new civil dispute scenario and each case $i$ in the case subset, and obtaining the final similar case set $AS_{match}$ according to the $sim(new, i)$ values;
Step5, obtaining the preliminary recommendation result $P_{new}$ of the law articles applicable to the new civil dispute scenario from the law articles cited by the cases in the similar case set $AS_{match}$;
Step6, using the Apriori algorithm to mine the law article citation association rule set $R$ in $AS$, and using the association rule set to correct $P_{new}$ to obtain the final law article recommendation result.
Further, Step1 specifically includes: Step1-1, the legal information cited by a case comprises legal unit information and law article information, wherein the "legal unit" is a refinement of the concept of "law", whose purpose is to divide the more comprehensive laws and express them at a finer granularity.
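To make the clustering feature of Step1 concrete, the sketch below computes the weighted Jaccard similarity between two cases over the legal units and law articles they cite, J(a_i, a_j) = α·J(L_i, L_j) + β·J(P_i, P_j) with α + β = 1, as stated in claim 2. It is a minimal sketch, not the patented implementation: the dictionary layout, the function names, the labels `l1`, `p1`, ... and the convention that two empty sets have similarity 0 are assumptions made here for illustration. The spectral clustering that consumes these similarities is not shown.

```python
import math


def jaccard(a, b):
    """Jaccard coefficient |a ∩ b| / |a ∪ b| of two collections."""
    a, b = set(a), set(b)
    union = a | b
    # Assumption: two empty sets are treated as having similarity 0.
    return len(a & b) / len(union) if union else 0.0


def case_similarity(case_i, case_j, alpha=0.5, beta=0.5):
    """Weighted similarity alpha*J(L_i, L_j) + beta*J(P_i, P_j)."""
    if not math.isclose(alpha + beta, 1.0):
        raise ValueError("alpha and beta must sum to 1")
    unit_sim = jaccard(case_i["units"], case_j["units"])
    article_sim = jaccard(case_i["articles"], case_j["articles"])
    return alpha * unit_sim + beta * article_sim


# Hypothetical cases: cited legal units and cited law articles.
a_i = {"units": {"l1", "l2"}, "articles": {"p1", "p2", "p3"}}
a_j = {"units": {"l1"}, "articles": {"p1", "p4"}}

first_stage = case_similarity(a_i, a_j)             # alpha = beta = 0.5
second_stage = case_similarity(a_i, a_j, 0.2, 0.8)  # article-focused weights
```

Here the unit-level Jaccard is 1/2 and the article-level Jaccard is 1/4, so the equal-weight setting of the first clustering stage gives 0.375, while the article-focused setting of the second stage (α = 0.2, β = 0.8) gives about 0.3. Evaluating this function for every pair of cases yields a similarity matrix that a spectral clustering routine accepting a precomputed affinity could take as input.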
The legal unit is specifically defined as follows: each chapter of the Civil Code of the People's Republic of China (hereinafter, the Civil Code) is regarded as one legal unit, and every other law is regarded as one legal unit as a whole. That is, for a given law $l$, its corresponding legal unit $l'$ is a chapter of $l$ if $l$ is the Civil Code, and $l$ itself otherwise. Each legal unit is made up of several law articles. Thus, the legal units cited by a case can be expressed as $L = \{l_1', l_2', \dots, l_m'\}$, and the law articles cited can be expressed as $P = \{p_1, p_2, \dots, p_r, p_{r+1}, p_{r+2}, \dots, p_s, \dots, p_{t+1}, p_{t+2}, \dots, p_y\}$;
Step1-2, the legal unit similarity $J(L_i, L_j)$ and the law article similarity $J(P_i, P_j)$ between different cases $a_i$ and $a_j$ are measured with the Jaccard coefficient, namely: $J(L_i, L_j) = \frac{|L_i \cap L_j|}{|L_i \cup L_j|}$ and $J(P_i, P_j) = \frac{|P_i \cap P_j|}{|P_i \cup P_j|}$. On this basis, the overall similarity $J(a_i, a_j)$ of the cases can be expressed as: $J(a_i, a_j) = \alpha \cdot J(L_i, L_j) + \beta \cdot J(P_i, P_j)$, where $\alpha$ and $\beta$ are coefficients and $\alpha + \beta = 1$;
Step1-3, on the basis of the Jaccard similarity, first-stage clustering is performed on the case corpus $A$ with a spectral clustering algorithm. At this stage the differences in legal units and the differences in law articles between cases are given equal attention, so $\alpha = 0.5$, $\beta = 0.5$ is set, namely: $J(a_i, a_j) = 0.5 \cdot J(L_i, L_j) + 0.5 \cdot J(P_i, P_j)$. The case corpus is clustered following the spectral clustering procedure, and the cases are divided into k_1 primary case subsets, using the silhouette coefficient as the evaluation index of the clustering quality under different numbers of clusters;
Step1-4, second-stage clustering is performed with a spectral clustering algorithm on the cases in each primary case subset. At this stage the focus is on the differences in law articles between cases, so $\alpha = 0.2$, $\beta = 0.8$ is set, namely: $J(a_i, a_j) = 0.2 \cdot J(L_i, L_j) + 0.8 \cdot J(P_i, P_j)$, clustering the cases in each primary case subset according to the spectral clustering procedure, and dividing