CN-121210457-B - Method and system for structuring special appointment data in non-vehicle insurance policy


Abstract

The invention provides a method and a system for structuring special agreement data in a non-vehicle insurance policy, relating to the technical field of data processing. The method comprises: calling a preset partitioning mechanism to divide a target non-vehicle insurance policy into regions to obtain a division result, and clearing the non-text regions in the division result to obtain a target text region; collecting two-dimensional features of a first text connected domain in the target text region to obtain a first two-dimensional feature; traversing a preset non-vehicle insurance policy structure database with the first two-dimensional feature to obtain a first connected domain element; storing, with the first connected domain element as a reference, the first text content recognized from the first text connected domain into a preset tree memory; and displaying the special agreement data of the target non-vehicle insurance policy in a structured visual form. The invention addresses the technical problem in the prior art that the special agreement part of a non-vehicle insurance policy, which carries personalized clauses, is still commonly recorded as unstructured natural-language text, so that insurance companies cannot respond quickly to changing conditions, the execution efficiency of policy clauses suffers, and the claims settlement and underwriting processes are delayed.

Inventors

Wang Hui
Peng Yancheng

Assignees

凯泰铭科技（北京）有限公司

Dates

Publication Date: 20260505
Application Date: 20251030

Claims (8)

1. A method for structuring special agreement data in a non-vehicle insurance policy, the method comprising: calling a preset partitioning mechanism to divide a target non-vehicle insurance policy into regions to obtain a division result, and clearing a non-text region in the division result to obtain a target text region; performing two-dimensional feature collection on a first text connected domain in the target text region to obtain a first two-dimensional feature; traversing a preset non-vehicle insurance policy structure database with the first two-dimensional feature to obtain a first connected domain element; storing, with the first connected domain element as a reference, first text content obtained by recognizing the first text connected domain into a preset tree memory; and performing structured visual display of the special agreement data of the target non-vehicle insurance policy according to the preset tree memory; wherein traversing the preset non-vehicle insurance policy structure database with the first two-dimensional feature to obtain the first connected domain element comprises: extracting a first preset two-dimensional feature of a first preset connected domain in the preset non-vehicle insurance policy structure database; vectorizing the first two-dimensional feature and the first preset two-dimensional feature in turn to obtain a first vector and a first preset vector, respectively; and when the cosine similarity between the first vector and the first preset vector reaches a preset similarity threshold, taking a first preset element corresponding to the first preset connected domain as the first connected domain element; and wherein the preset non-vehicle insurance policy structure database includes at least a policy basic information element, a principal information element, an insurance scheme element, an insurance liability element, a liability exclusion element, a claims processing flow element, and a dispute handling element.
2. The method for structuring special agreement data in a non-vehicle insurance policy according to claim 1, wherein before calling the preset partitioning mechanism to divide the target non-vehicle insurance policy into regions to obtain the division result and clearing the non-text region in the division result, the method further comprises: rasterizing the target non-vehicle insurance policy based on a preset unit length to obtain a target grid set; introducing an edge direction discrimination mechanism to perform discriminant analysis on a first grid in the target grid set to obtain a first category; performing discriminant analysis on a second grid in the target grid set according to the edge direction discrimination mechanism to obtain a second category, wherein the second grid is any grid in the target grid set adjacent to the first grid; if the first category is consistent with the second category, constructing a first region based on the first grid and the second grid according to the preset partitioning mechanism; and if the first region meets a preset clearing condition, adding the first region to the non-text region.
3. The method for structuring special agreement data in a non-vehicle insurance policy according to claim 2, wherein a minimum font of the target non-vehicle insurance policy is obtained, and the font side length of the minimum font is used as the preset unit length.
4. The method for structuring special agreement data in a non-vehicle insurance policy according to claim 2, wherein the edge direction discrimination mechanism is a scheme for discriminating the content category of a grid based on features of the edge direction histogram of that grid, and text edge direction features, table edge direction features, formula edge direction features and picture edge direction features are prestored in the edge direction discrimination mechanism.
5. The method for structuring special agreement data in a non-vehicle insurance policy according to claim 2, wherein if the first category is inconsistent with the second category, a second region is constructed based on the first grid and a third region is constructed based on the second grid, and the division result is obtained based on the second region and the third region, including: if the second region meets the preset clearing condition, adding the second region to the non-text region, and if the second region does not meet the preset clearing condition, adding the second region to the target text region; and if the third region meets the preset clearing condition, adding the third region to the non-text region, and if the third region does not meet the preset clearing condition, adding the third region to the target text region.
6. The method for structuring special agreement data in a non-vehicle insurance policy according to claim 1, wherein the two-dimensional features include layout features and style features, and performing two-dimensional feature collection on the first text connected domain in the target text region to obtain the first two-dimensional feature comprises: acquiring a first position coordinate of the first text connected domain; acquiring a first style feature of the first text connected domain, wherein the first style feature comprises a first font, a first font size, a first indentation amount and a first line spacing; and constructing the first two-dimensional feature based on the first position coordinate, the first font size, the first indentation amount and the first line spacing.
7. The method for structuring special agreement data in a non-vehicle insurance policy according to claim 1, further comprising, after performing structured visual display of the special agreement data of the target non-vehicle insurance policy according to the preset tree memory: manually labeling the target non-vehicle insurance policy and storing the labeled content correspondingly into the preset tree memory to obtain a manual data structured visual view; performing color block area overlap analysis on the manual data structured visual view and the intelligent data structured visual view to obtain a color block area overlap rate; and characterizing the data structuring accuracy of the target non-vehicle insurance policy by the normalized color block area overlap rate.
8. A system for structuring special agreement data in a non-vehicle insurance policy, for implementing the method for structuring special agreement data in a non-vehicle insurance policy according to any one of claims 1 to 7, the system comprising: a region division module, configured to call a preset partitioning mechanism to divide a target non-vehicle insurance policy into regions to obtain a division result, and to clear a non-text region in the division result to obtain a target text region; a feature collection module, configured to perform two-dimensional feature collection on a first text connected domain in the target text region to obtain a first two-dimensional feature; a traversing module, configured to traverse a preset non-vehicle insurance policy structure database with the first two-dimensional feature to obtain a first connected domain element; a storage module, configured to store, with the first connected domain element as a reference, first text content obtained by recognizing the first text connected domain into a preset tree memory; and a structured visual display module, configured to perform structured visual display of the special agreement data of the target non-vehicle insurance policy according to the preset tree memory.
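
The matching and storage steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The function names (`cosine_similarity`, `match_element`, `store_text`), the representation of the structure database as a list of (element, vector) pairs, and the use of a plain dictionary as the tree memory are all assumptions made for illustration; the claims do not specify how features are vectorized or how the tree memory is laid out.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors.

    Returns 0.0 if either vector is all zeros (an assumption; the claims
    do not say how degenerate vectors are handled).
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_element(
    feature: Sequence[float],
    structure_db: Sequence[Tuple[str, Sequence[float]]],
    threshold: float,
) -> Optional[str]:
    """Traverse the (hypothetical) structure database and return the first
    preset element whose vector reaches the similarity threshold."""
    for element, preset_vector in structure_db:
        if cosine_similarity(feature, preset_vector) >= threshold:
            return element
    return None  # no preset connected domain is similar enough


def store_text(tree: Dict[str, List[str]], element: str, text: str) -> None:
    """Store recognized text under its matched element.

    A one-level dict stands in for the preset tree memory here.
    """
    tree.setdefault(element, []).append(text)
```

For example, with a hypothetical database `[("insurance liability element", [0.0, 1.0, 0.0]), ("liability exclusion element", [2.0, 0.0, 0.0])]` and a threshold of 0.95, the feature `[1.0, 0.0, 0.0]` matches the second entry, and the recognized text can then be filed under that element with `store_text`. Returning the first match rather than the best match follows the wording "reaches a preset similarity threshold"; a best-match variant would be an equally plausible reading.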

Description

Method and system for structuring special appointment data in non-vehicle insurance policy

Technical Field

The invention relates to the technical field of data processing, and in particular to a method and a system for structuring special agreement data in a non-vehicle insurance policy.

Background

Insurance claim settlement is the process in which an insurance company examines a claim submitted by an insured person or beneficiary against the terms of the insurance contract and pays the corresponding compensation once the claim conditions are confirmed to be met. An insurance policy is the formal written document of the insurance contract and records in detail the rights and obligations between the applicant and the insurance company. Non-vehicle insurance business (including financial insurance, liability insurance, agricultural insurance, health insurance and the like) is characterized by complex insured objects and diverse risk scenarios, and the special agreements in its policies have higher text complexity and data value than those in vehicle insurance. However, the "special agreement" part of the policy, which carries personalized clauses, is still commonly recorded as unstructured natural-language text, and this technical bottleneck severely restricts the calculation of loss amounts during loss assessment and the determination of insurance liability during claim settlement.

Existing technical schemes for converting unstructured special agreement data into structured data mainly comprise manual processing, regular-expression rule engines, and natural language processing (NLP) models. A regular expression library requires specific template rules for each clause, for example for numerical values or for terms such as "deductible" and "liability limit"; the diversity and complexity of policy clauses make the rule set huge and difficult to maintain. NLP technology can identify and extract key information to a certain extent, but NLP models often show a low recognition rate and a high error rate when parsing special agreement clauses. For example, policy clauses in non-vehicle insurance involve medical, legal, agricultural and other specialized knowledge, and in a medical liability insurance test the F1 score for recognizing specialized terms was only 58.3%. This deficiency leads to inaccurate recognition of a large number of clauses; in particular, when the content is complex and clauses are nested in multiple layers, the NLP model cannot process them effectively, which affects the accuracy of the final logic extraction and of the resulting contract rules.

Disclosure of Invention

The application provides a method and a system for structuring special agreement data in a non-vehicle insurance policy, aiming to solve the technical problem in the prior art that the special agreement part of a non-vehicle insurance policy, which carries personalized clauses, is still commonly recorded as unstructured natural-language text, so that insurance companies cannot respond quickly to changing conditions, the execution efficiency of policy clauses suffers, and the claims settlement and underwriting processes are delayed.

In a first aspect, the application provides a method for structuring special agreement data in a non-vehicle insurance policy, comprising: calling a preset partitioning mechanism to divide a target non-vehicle insurance policy into regions to obtain a division result, and clearing a non-text region in the division result to obtain a target text region; collecting two-dimensional features of a first text connected domain in the target text region to obtain a first two-dimensional feature; traversing a preset non-vehicle insurance policy structure database with the first two-dimensional feature to obtain a first connected domain element; storing, with the first connected domain element as a reference, first text content obtained by recognizing the first text connected domain into a preset tree memory; and performing structured visual display of the special agreement data of the target non-vehicle insurance policy according to the preset tree memory.

In a second aspect, the application provides a system for structuring special agreement data in a non-vehicle insurance policy, the system being used to implement the above method and comprising a region division module, a feature collection module, a traversing module and a storage module, wherein the region division module is used for calling a preset partitioning mechanism to divide a target non-vehicle insurance policy into regions to obtain a division result, clearing a non-text region in the division result to obtain a t