
CN-121980287-A - Intelligent matching method and related device for receipt file and credential file


Abstract

The application relates to an intelligent matching method and a related device for a receipt file and a credential file. The method comprises: obtaining a receipt file to be matched, and performing hierarchical parsing on the receipt file from the visual modality, through the text modality, to the semantic modality, to obtain the receipt data units corresponding to the receipt file; obtaining a credential file to be matched, and performing structured parsing on the credential file according to its table structure to obtain the credential data units corresponding to the credential file; matching the receipt data units against the credential data units at two levels, rule matching and vector matching; and integrating each pair of matched receipt and credential data units to obtain an archive file. The method reduces manual participation while improving matching accuracy and processing efficiency in scenarios with complex formats and semantic differences, thereby meeting the requirements of high-frequency service processing and compliance traceability.

Inventors

  • TANG JIN
  • ZHOU QIANG
  • LIN XIN
  • XIAO MENG

Assignees

  • 盛业信息科技服务(深圳)有限公司

Dates

Publication Date
20260505
Application Date
20260408

Claims (10)

  1. An intelligent matching method for a receipt file and a credential file, characterized by comprising: obtaining a receipt file to be matched, and performing hierarchical parsing on the receipt file from the visual modality, through the text modality, to the semantic modality, to obtain the receipt data units corresponding to the receipt file; obtaining a credential file to be matched, and performing structured parsing on the credential file according to the table structure of the credential file to obtain the credential data units corresponding to the credential file; and matching the receipt data units against the credential data units at two levels, rule matching and vector matching, and integrating each pair of matched receipt and credential data units to obtain an archive file.
  2. The method of claim 1, wherein performing hierarchical parsing on the receipt file from the visual modality, through the text modality, to the semantic modality to obtain the receipt data units corresponding to the receipt file comprises: visually slicing the receipt file according to the layout structure features of the receipt file to obtain the slice regions in the receipt file; performing layout analysis on the original image information in each slice region to obtain a layout analysis result, and performing text recognition on the corresponding original image information under the guidance of the layout analysis result to obtain the text information corresponding to each slice region; and performing contextual semantic reasoning and structured extraction on the text information according to the semantic association relations among the text information to obtain the receipt data units corresponding to the receipt file.
  3. The method of claim 2, wherein performing text recognition on the corresponding original image information under the guidance of the layout analysis result to obtain the text information corresponding to each slice region comprises: determining the feature focusing condition and the text arrangement feature corresponding to each slice region according to the layout analysis result; for slice regions whose text arrangement feature is abnormal, performing text recognition on the corresponding original image information according to the corresponding feature focusing condition, and performing contextual semantic error correction on the text recognition result to obtain the text information corresponding to each such slice region; and for slice regions whose text arrangement feature is normal, performing text recognition on the corresponding original image information according to the corresponding feature focusing condition, and performing contextual semantic enhancement on the text recognition result to obtain the text information corresponding to each such slice region.
  4. The method of claim 2, wherein performing contextual semantic reasoning and structured extraction on the text information according to the semantic association relations among the text information to obtain the receipt data units corresponding to the receipt file comprises: under the constraint of a preset dynamic prompt, performing contextual semantic reasoning on the text information according to the semantic association relations to obtain a semantic interpretation result for each piece of text information, wherein the dynamic prompt represents control information for constraining and rechecking the semantic attention range during semantic reasoning; and rechecking each semantic interpretation result according to the dynamic prompt, and performing structured extraction on the rechecked semantic interpretation results to obtain the receipt data units corresponding to the receipt file.
  5. The method of claim 1, wherein matching the receipt data units against the credential data units at two levels, rule matching and vector matching, comprises: screening the receipt data units and the credential data units according to preset service attributes, and calculating the service correlation between each screened receipt data unit and each screened credential data unit within a preset time window to obtain rule-matched unit pairs; converting each receipt data unit into a first vector unit and each credential data unit into a second vector unit, and calculating the semantic correlation between each first vector unit and each second vector unit to obtain vector-matched unit pairs; and obtaining the matched receipt data units and credential data units according to the unit pairs that appear in both the rule-matched unit pairs and the vector-matched unit pairs.
  6. The method of claim 5, wherein converting each receipt data unit into a first vector unit and each credential data unit into a second vector unit comprises: identifying each transaction scenario corresponding to the receipt file according to the transaction features of each data unit, and determining the scenario processing mode corresponding to each transaction scenario; and, according to the scenario processing mode, vectorizing and concatenating the transaction party information, institution category information and summary information corresponding to each data unit to obtain the first vector unit corresponding to each receipt data unit and the second vector unit corresponding to each credential data unit.
  7. The method of claim 1, wherein integrating each pair of matched receipt and credential data units to obtain an archive file comprises: generating corresponding log information for each pair of matched receipt and credential data units, and chaining the log information together to obtain a log sequence; packaging each pair of matched receipt and credential data units, and electronically signing the packaged result according to the log sequence to obtain an electronic credential file; and checking the electronic credential file according to a pre-configured detection action and obtaining the archive file according to the check result, wherein the detection action comprises at least one of authenticity detection, integrity detection, availability detection and security detection.
  8. An intelligent matching device for a receipt file and a credential file, the device comprising: a first analysis module, configured to obtain a receipt file to be matched, and perform hierarchical parsing on the receipt file from the visual modality, through the text modality, to the semantic modality, to obtain the receipt data units corresponding to the receipt file; a second analysis module, configured to obtain a credential file to be matched, and perform structured parsing on the credential file according to the table structure of the credential file to obtain the credential data units corresponding to the credential file; and a matching module, configured to match the receipt data units against the credential data units at two levels, rule matching and vector matching, and integrate each pair of matched receipt and credential data units to obtain an archive file.
  9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any one of claims 1 to 7 when executing the computer program.
  10. A computer readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the steps of the method of any one of claims 1 to 7.
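
The two-level matching recited in claim 5 can be illustrated with a minimal sketch. It is not the applicant's implementation: the `Unit` fields, the equal-amount rule, the three-day window, the 0.5 similarity threshold and the character-bigram `embed` function (a toy stand-in for whatever embedding model is actually used) are all assumptions chosen only to keep the example self-contained.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date, timedelta
from math import sqrt


@dataclass(frozen=True)
class Unit:
    """A parsed data unit; the field names are illustrative assumptions."""
    uid: str
    amount: float
    day: date
    counterparty: str
    summary: str


def rule_pairs(receipts, credentials, window=timedelta(days=3)):
    """Rule level: equal amounts and dates inside a preset time window."""
    return {
        (r.uid, c.uid)
        for r in receipts
        for c in credentials
        if abs(r.amount - c.amount) < 0.005 and abs(r.day - c.day) <= window
    }


def embed(unit):
    """Toy vectorization (assumption): character-bigram counts of the text fields."""
    text = f"{unit.counterparty} {unit.summary}".lower()
    return Counter(text[i:i + 2] for i in range(len(text) - 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(x * x for x in a.values())) * sqrt(sum(x * x for x in b.values()))
    return dot / norm if norm else 0.0


def vector_pairs(receipts, credentials, threshold=0.5):
    """Vector level: pairs whose semantic similarity reaches the threshold."""
    return {
        (r.uid, c.uid)
        for r in receipts
        for c in credentials
        if cosine(embed(r), embed(c)) >= threshold
    }


def match(receipts, credentials):
    """Keep only the pairs found at both levels (the repeated unit pairs)."""
    return rule_pairs(receipts, credentials) & vector_pairs(receipts, credentials)


receipts = [
    Unit("R1", 1200.00, date(2026, 1, 5), "Acme Trading Co", "payment for invoice 1001"),
    Unit("R2", 1200.00, date(2026, 1, 6), "Borealis Logistics", "freight charges january"),
]
credentials = [
    Unit("V1", 1200.00, date(2026, 1, 6), "Acme Trading Co.", "invoice 1001 payment"),
    Unit("V2", 1200.00, date(2026, 1, 6), "Borealis Logistics Ltd", "january freight charges"),
]

print(sorted(match(receipts, credentials)))  # [('R1', 'V1'), ('R2', 'V2')]
```

In this made-up data all four receipt/credential combinations pass the rule level, because the amounts are identical and the dates fall inside the window. Only the intersection with the vector level separates the two counterparties, which is the situation the two-level design is meant to handle.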

Description

Intelligent matching method and related device for receipt file and credential file

Technical Field

The application relates to the technical field of financial bill processing, in particular to an intelligent matching method and a related device for a receipt file and a credential file.

Background

In the technical field of financial bill processing, receipt data must be matched with credential data in order to check the consistency of accounts and archive them electronically. In related matching methods, receipts and credentials are matched either by manual checking or by OCR combined with rules. These methods depend on manual participation, have low processing efficiency, and lack matching accuracy in scenarios with complex formats and inconsistent semantics. As a result, mismatches and missed matches occur easily, and the requirements of high-frequency service processing and compliance tracing are difficult to meet.

Disclosure of Invention

Based on the foregoing, it is necessary to provide an intelligent matching method, device, computer device and computer readable storage medium for a receipt file and a credential file, so as to solve the problems that, when matching receipt files with credential files, mismatches and missed matches occur easily and the requirements of high-frequency service processing and compliance tracing are difficult to meet.
In a first aspect, the present application provides an intelligent matching method for a receipt file and a credential file, including: obtaining a receipt file to be matched, and performing hierarchical parsing on the receipt file from the visual modality, through the text modality, to the semantic modality, to obtain the receipt data units corresponding to the receipt file; obtaining a credential file to be matched, and performing structured parsing on the credential file according to the table structure of the credential file to obtain the credential data units corresponding to the credential file; and matching the receipt data units against the credential data units at two levels, rule matching and vector matching, and integrating each pair of matched receipt and credential data units to obtain an archive file.

In a second aspect, the present application further provides an intelligent matching device for a receipt file and a credential file, including: a first analysis module, configured to obtain a receipt file to be matched, and perform hierarchical parsing on the receipt file from the visual modality, through the text modality, to the semantic modality, to obtain the receipt data units corresponding to the receipt file; a second analysis module, configured to obtain a credential file to be matched, and perform structured parsing on the credential file according to the table structure of the credential file to obtain the credential data units corresponding to the credential file; and a matching module, configured to match the receipt data units against the credential data units at two levels, rule matching and vector matching, and integrate each pair of matched receipt and credential data units to obtain an archive file.

In a third aspect, the present application also provides a computer device comprising a memory and a processor, the memory storing a computer program, and the processor implementing the steps of the above method when executing the computer program.
In a fourth aspect, the present application also provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the above method.

According to the above intelligent matching method, device, computer device and computer readable storage medium for a receipt file and a credential file: on the one hand, the receipt file is parsed hierarchically from the visual modality, through the text modality, to the semantic modality to form a plurality of receipt data units, so that the scattered content of the original receipt file, which would otherwise have to be identified manually, is converted into data units with a uniform structure. On the other hand, the credential file is parsed structurally according to its table structure to form a plurality of credential data units, so that the credential file is converted directly from its table form into data units with a uniform structure. Furthermore, the receipt data units and the credential data units are matched at two levels, rule matching and vector matching, and the matching results are integrated to obtain an archive file, which reduces the mismatches that arise when only simple matching is relied on in complex scenarios. Overall, in the scenario of matching a receipt file with a credential file, the technical solution is based on file-specific parsing and level-by-level matching of the data units, the manual participation is reduced, and matching efficiency and multi-layer difference processing efficiency and accuracy in the multi-layer tracing and high-freque