CN-121982733-A - OCR recognition result correction method and system
Abstract
The invention relates to the technical field of text processing, and in particular to a method and a system for correcting OCR (optical character recognition) results. The correction method uses a large language model as the inference engine and dynamically equips it with a plurality of search agents and an iterative verification mechanism. By introducing dynamic, accurate domain knowledge and context, the method compensates for the knowledge gaps of the large language model, so that its correction decisions are grounded in evidence and erroneous corrections are greatly reduced. The iterative verification mechanism ensures that every correction is cross-verified, making the output more reliable. The method does not depend on fixed rules or model parameters; it can be quickly adapted to a new professional field simply by replacing or extending the data sources connected to the search agents, and is therefore highly general and flexible to deploy.
Inventors
- MA JIAN
- ZHU ZIMIN
- DUAN QINGXI
- ZAO WEIHONG
- ZHANG XINYU
- GUO QIBIN
- GUO JIAHAO
- SUN ANLE
Assignees
- 国网新疆电力有限公司电力科学研究院 (Electric Power Research Institute of State Grid Xinjiang Electric Power Co., Ltd.)
Dates
- Publication Date
- 20260505
- Application Date
- 20251217
Claims (10)
- 1. A method for correcting OCR recognition results, characterized by comprising the following steps: acquiring an OCR recognition text to be corrected, preprocessing the OCR recognition text, and acquiring corresponding confidence information; constructing one or more search agents, wherein each search agent is associated with a different private data source or public data source; retrieving, through the search agents, search results related to the OCR recognition text from the data sources associated with the search agents, and fusing the different search results to obtain enhanced context information related to the OCR recognition text; correcting the OCR recognition text using a large language model according to the enhanced context information to obtain a corrected text; and searching again through the search agents based on the corrected text, verifying according to the current search result whether the corrected text meets a preset output requirement, and if not, updating the corrected text according to the verification result until a preset termination condition is met, and then outputting the final corrected text.
- 2. The method for correcting OCR recognition results according to claim 1, wherein correcting the OCR recognition text using a large language model according to the enhanced context information to obtain the corrected text comprises the steps of: analyzing the OCR recognition text in combination with the enhanced context information, and outputting all suspected errors in the OCR recognition text together with the reasons for suspicion; for each suspected error, proposing one or more correction schemes based on the enhanced context information and giving the reasons for each correction; and selecting, according to a preset evaluation strategy, the correction scheme with the highest evaluation score as a candidate correction scheme, and applying the candidate correction scheme to the OCR recognition text to generate the corrected text.
- 3. The method for correcting OCR recognition results according to claim 1, wherein retrieving, through the search agents, search results related to the OCR recognition text from the data sources associated with the search agents comprises the steps of: parsing the OCR recognition text to extract a parsing result comprising keywords, low-confidence fragments and text topics; constructing, based on the parsing result, at least one query scheme with an explicit retrieval intention; and retrieving each query scheme through at least one search agent, wherein each search agent performs semantic search and keyword search on the query scheme simultaneously to obtain a plurality of different search results.
- 4. The method for correcting OCR recognition results according to claim 3, further comprising, before the enhanced context information is formed, the steps of: judging whether the current search results are sufficiently relevant to the OCR recognition text and contain sufficient information; if not, generating a new query scheme based on the current query scheme and the current search results; and retrieving again based on the new query scheme until the current search results are sufficient to form the enhanced context information.
- 5. The method for correcting OCR recognition results according to claim 1, wherein fusing the different search results to obtain enhanced context information related to the OCR recognition text comprises the step of: de-duplicating, sorting and formatting the plurality of returned search results to form a coherent passage of enhanced context information.
- 6. The method for correcting OCR recognition results according to claim 1, wherein verifying according to the search result whether the corrected text meets the preset output requirement comprises the steps of: acquiring verification context information related to the corrected text; verifying whether the corrected content in the corrected text is consistent with the verification context information; and verifying whether the corrected content conflicts with an unmodified high-confidence portion of the OCR recognition text.
- 7. The method for correcting OCR recognition results according to claim 6, further comprising, if verification against the current search result shows that the corrected text does not meet the preset output requirement, the step of: generating a conflict report, wherein the conflict report comprises at least: the corrected content that is in conflict; an identifier indicating the conflict type, the conflict type being either inconsistency with the verification context information or inconsistency with a high-confidence portion of the OCR recognition text; and the source of the evidence giving rise to the conflict.
- 8. The method for correcting OCR recognition results according to claim 1, wherein acquiring and preprocessing the OCR recognition text to be corrected comprises the steps of: structuring and chunking the OCR recognition text, and outputting logic blocks; extracting the confidence of each logic block and annotating the corresponding logic block with it; marking the coordinate position of each logic block on the original scanned image and establishing a mapping relation; and performing a preliminary check on the logic blocks based on preset rules, and marking text content that matches a preset error rule.
- 9. The method for correcting OCR recognition results according to claim 1, wherein the termination condition comprises: the corrected text meeting the preset output requirement; a preset maximum number of rounds being reached; and the corrected content of the corrected text causing an unresolvable conflict.
- 10. A system for correcting OCR recognition results, characterized by comprising: an input module, configured to acquire an original scanned image and an OCR recognition text to be corrected, and to preprocess the OCR recognition text to acquire corresponding confidence information; a context enhancement module, comprising a query constructor, a plurality of search agents associated with different private data sources or public data sources, and a result fuser, wherein the query constructor is configured to extract key query terms from the OCR recognition text, the search agents are configured to execute search tasks according to the key query terms, and the result fuser is configured to de-duplicate, sort and format the search results output by the different search agents to form a coherent passage of enhanced context information; a multi-round reasoning correction engine, comprising a potential error recognizer, a repair proposal generator and a text synthesizer, wherein the potential error recognizer is configured to analyze the OCR recognition text in combination with the enhanced context information and to output all suspected errors in the OCR recognition text together with the reasons for suspicion, and the text synthesizer is configured to select, according to a preset evaluation strategy, the correction scheme with the highest evaluation score as a candidate correction scheme and to apply the candidate correction scheme to the OCR recognition text to generate a corrected text; and an iterative verification module, comprising a secondary searcher, a conflict detector, a consistency detector and a judgment module, wherein the secondary searcher is configured to send the corrected text to the context enhancement module again to obtain verification context information based on the corrected text, the conflict detector is configured to verify whether the corrected content of the corrected text conflicts with an unmodified high-confidence portion of the OCR recognition text, the consistency detector is configured to verify whether the corrected content in the corrected text is consistent with the verification context information, and the judgment module is configured to output the verification results of the conflict detector and the consistency detector and, if either verification fails, to update the corrected text according to the verification results and send the updated corrected text to the multi-round reasoning correction engine again until verification passes, whereupon the final corrected text is output.
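As an informal illustration only (not part of the claims), the verification loop of claims 1, 6, 7 and 9 can be sketched in Python. Everything named here is an assumption made for the sketch: `ConflictReport`, `Termination`, `correct_with_verification`, and the `verify` and `revise` callables, which stand in for the secondary search plus consistency checks and for the large language model's revision step respectively.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List, Optional, Tuple


class ConflictType(Enum):
    CONTEXT = "inconsistent with verification context"
    HIGH_CONFIDENCE = "inconsistent with high-confidence OCR text"


@dataclass
class ConflictReport:
    corrected_content: str       # the correction that is in conflict
    conflict_type: ConflictType  # which of the two checks failed
    evidence_source: str         # where the contradicting evidence came from


class Termination(Enum):
    PASSED = "output requirement met"
    MAX_ROUNDS = "maximum rounds reached"
    UNRESOLVABLE = "unresolvable conflict"


def correct_with_verification(
    text: str,
    verify: Callable[[str], List[ConflictReport]],
    revise: Callable[[str, List[ConflictReport]], Optional[str]],
    max_rounds: int = 3,
) -> Tuple[str, Termination]:
    """Re-verify and revise until one of the three termination conditions holds."""
    for _ in range(max_rounds):
        reports = verify(text)  # hypothetical: re-search, then run both checks
        if not reports:
            return text, Termination.PASSED
        revised = revise(text, reports)  # hypothetical: LLM update from the reports
        if revised is None or revised == text:
            # No further change is possible, so the conflict cannot be resolved.
            return text, Termination.UNRESOLVABLE
        text = revised
    # Round budget exhausted; the last revision has not been re-verified.
    return text, Termination.MAX_ROUNDS


# Toy stand-ins, for demonstration only.
def demo_verify(text: str) -> List[ConflictReport]:
    if "110kY" in text:
        return [ConflictReport("110kY", ConflictType.CONTEXT,
                               "equipment ledger (hypothetical)")]
    return []


def demo_revise(text: str, reports: List[ConflictReport]) -> Optional[str]:
    return text.replace("110kY", "110kV")


final, reason = correct_with_verification(
    "Main transformer 110kY side", demo_verify, demo_revise
)
print(final, "|", reason.value)
```

Returning the termination reason alongside the text is a design choice of this sketch; it lets a caller route texts that ended on the round limit or on an unresolvable conflict to manual review.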
Description
OCR recognition result correction method and system
Technical Field
The invention relates to the technical field of text processing, and in particular to a method and a system for correcting OCR (optical character recognition) results.
Background
OCR (Optical Character Recognition) technology is now widely used. However, conventional OCR text correction methods rely mainly on rule engines, dictionary matching, or the generative capability of a single language model, and these methods have significant limitations. On the one hand, they lack the ability to actively and accurately retrieve relevant information from external multi-source databases, so they cope poorly with complex contexts and specialized texts; in particular, when processing professional documents, blurred characters or new terms, they cannot obtain the context needed to make a decision. On the other hand, correction is usually a one-way, single-pass operation with no effective closed verification loop, so a correction may introduce new semantic conflicts or contradict content that was originally recognized reliably. As a result, correction accuracy and reliability are insufficient to meet the growing demand for high-precision OCR text.
Disclosure of Invention
To solve the technical problem of insufficient accuracy and reliability of OCR text correction in the prior art, the invention provides a method and a system for correcting OCR recognition results.
The invention provides a method for correcting OCR recognition results, comprising the following steps: acquiring an OCR recognition text to be corrected, preprocessing the OCR recognition text, and acquiring corresponding confidence information; constructing one or more search agents, wherein each search agent is associated with a different private data source or public data source; retrieving, through the search agents, search results related to the OCR recognition text from the data sources associated with the search agents, and fusing the different search results to obtain enhanced context information related to the OCR recognition text; correcting the OCR recognition text using a large language model according to the enhanced context information to obtain a corrected text; and searching again through the search agents based on the corrected text, verifying according to the current search result whether the corrected text meets a preset output requirement, and if not, updating the corrected text according to the verification result until a preset termination condition is met, and then outputting the final corrected text.
Preferably, correcting the OCR recognition text using a large language model according to the enhanced context information to obtain the corrected text comprises the steps of: analyzing the OCR recognition text in combination with the enhanced context information, and outputting all suspected errors in the OCR recognition text together with the reasons for suspicion; for each suspected error, proposing one or more correction schemes based on the enhanced context information and giving the reasons for each correction; and selecting, according to a preset evaluation strategy, the correction scheme with the highest evaluation score as a candidate correction scheme, and applying the candidate correction scheme to the OCR recognition text to generate the corrected text.
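The selection step described above can be illustrated with a minimal Python sketch. It assumes that suspected errors arrive as non-overlapping character spans and that each correction scheme already carries a score from the preset evaluation strategy; the names `Candidate`, `SuspectedError` and `apply_best_candidates` are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    replacement: str  # proposed corrected string
    reason: str       # why this correction is proposed
    score: float      # assumed to come from the preset evaluation strategy


@dataclass
class SuspectedError:
    start: int        # character offset in the OCR text (inclusive)
    end: int          # character offset in the OCR text (exclusive)
    reason: str       # why the span is suspected
    candidates: List[Candidate]


def apply_best_candidates(ocr_text: str, errors: List[SuspectedError]) -> str:
    """Apply the highest-scoring candidate for each suspected error.

    Spans are assumed not to overlap. They are applied from right to left
    so that the offsets of spans not yet processed remain valid.
    """
    result = ocr_text
    for err in sorted(errors, key=lambda e: e.start, reverse=True):
        if not err.candidates:
            continue  # nothing proposed, so the span is left unchanged
        best = max(err.candidates, key=lambda c: c.score)
        result = result[:err.start] + best.replacement + result[err.end:]
    return result


# Toy example with made-up scores.
text = "The transfomer is rated 110kY."
s1 = text.index("transfomer")
s2 = text.index("110kY")
errors = [
    SuspectedError(s1, s1 + len("transfomer"), "not a dictionary word",
                   [Candidate("transformer", "matches domain term", 0.95)]),
    SuspectedError(s2, s2 + len("110kY"), "low OCR confidence",
                   [Candidate("110kW", "unit of power", 0.30),
                    Candidate("110kV", "voltage level found in context", 0.90)]),
]
corrected = apply_best_candidates(text, errors)
print(corrected)
```

Keeping the reason on both the suspected error and the candidate mirrors the text's requirement that suspicion and correction each be justified, which also makes the later verification step easier to audit.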
Preferably, retrieving, through the search agents, search results related to the OCR recognition text from the data sources associated with the search agents comprises the steps of: parsing the OCR recognition text to extract a parsing result comprising keywords, low-confidence fragments and text topics; constructing, based on the parsing result, at least one query scheme with an explicit retrieval intention; and retrieving each query scheme through at least one search agent, wherein each search agent performs semantic search and keyword search on the query scheme simultaneously to obtain a plurality of different search results.
Preferably, before the enhanced context information is formed, the method further comprises the steps of: judging whether the current search results are sufficiently relevant to the OCR recognition text and contain sufficient information; if not, generating a new query scheme based on the current query scheme and the current search results; and retrieving again based on the new query scheme until the current search results are sufficient to form the enhanced context information.
Preferably, fusing the different search results to obtain enhanced context information related to the OCR recognition text comprises the step of: de-duplicating, sorting and formatting the plurality of returned search results to form a coherent passage of enhanced context information.
Preferably, verifying whether the corrected text meets a preset output requirement according to a search result comprises the following