CN-122024254-A - Verification system and method suitable for license in bidding document
Abstract
The invention relates to the field of computers, and in particular to a verification system and method for a license in a bidding document. The method comprises: performing image quality assessment according to the definition, illumination uniformity and text region integrity of a license image to obtain an image quality score; dynamically selecting an OCR processing strategy according to the image quality score to obtain an OCR recognition result produced by the selected strategy; analyzing the result and content of a user's operation on the OCR recognition result to obtain an operation type, a modification content type and an influence weight; performing verification scene recognition according to the license type and business context of the license image to obtain a scene type and a complexity rating; and performing dynamic weight allocation and multi-dimensional information fusion according to the confidence distribution of the OCR recognition result, the influence weight and the complexity rating to obtain a final confidence for the license image, thereby improving the accuracy and efficiency of confidence assessment.
Inventors
- WANG RUI
- PU JINGJING
- HUANG JINSONG
Assignees
- 四川建设网有限责任公司
Dates
- Publication Date: 2026-05-12
- Application Date: 2026-04-09
Claims (10)
- 1. A verification system for a license in a bidding document, comprising: an image quality analysis module configured to perform image quality assessment according to the definition, illumination uniformity and text region integrity of a license image to obtain an image quality score; a dynamic OCR strategy selection module configured to dynamically select an OCR processing strategy according to the image quality score, selecting a first OCR processing strategy when the image quality score is higher than a first threshold, a second OCR processing strategy when the image quality score is between the first threshold and a second threshold, and a third OCR processing strategy when the image quality score is lower than the second threshold, to obtain an OCR recognition result produced by the selected strategy, wherein the second OCR processing strategy comprises a voting mechanism among multiple OCR engines; a user verification module configured to analyze the result and content of a user's verification operation on the OCR recognition result to obtain an operation type, a modification content type and an influence weight, wherein the operation type comprises confirmation, modification and marking as doubtful, the modification content type comprises format correction and content correction, and the influence weight is calculated from the user's historical accuracy and the degree of difference between the verification operation result and the OCR recognition result; a scene recognition module configured to perform verification scene recognition according to the license type and business context of the license image to obtain a scene type and a complexity rating; and an adaptive confidence fusion module configured to perform dynamic weight allocation and multi-dimensional information fusion according to the confidence distribution of the OCR recognition result, the influence weight and the complexity rating to obtain a final confidence for the license image.
- 2. The verification system for a license in a bidding document of claim 1, wherein the image quality analysis module is further configured to: perform a quantitative definition calculation according to the edge sharpness, text contrast and noise level of the license image to obtain a definition score; evaluate illumination uniformity according to the standard deviation of the image brightness distribution and the difference between local brightness extrema to obtain an illumination uniformity score; detect text integrity according to the text region area ratio and text region connectivity to obtain a text integrity score; and perform a comprehensive quality assessment as a weighted sum of the definition score, the illumination uniformity score and the text integrity score to obtain the image quality score.
- 3. The verification system for a license in a bidding document of claim 1, wherein, in the dynamic OCR strategy selection module: the first OCR processing strategy processes the license image with a single fast OCR engine; the second OCR processing strategy processes the license image with at least two OCR engines in parallel and votes on the results; and the third OCR processing strategy performs image restoration preprocessing on the license image, then processes it with at least two OCR engines in parallel and votes on the results.
- 4. The verification system for a license in a bidding document of claim 3, wherein processing with at least two OCR engines in parallel and voting on the results comprises: querying an engine performance database according to the license type of the license image and selecting at least two OCR engines whose historical accuracy exceeds a preset threshold and which are currently available, to obtain an engine selection list; calling the OCR engines in the engine selection list simultaneously, by multithreading or asynchronous calls, to obtain the recognition result of each OCR engine; matching corresponding fields across the recognition results of all OCR engines by position to establish field correspondences; and, for each field correspondence, adopting the field value on which a majority of the OCR engines agree, and otherwise adopting the field value recognized by the OCR engine with the highest credibility value in that field correspondence.
- 5. The verification system for a license in a bidding document of claim 1, wherein the user verification module is further configured to: obtain the operation type and the modification content type in response to the user's verification operation on the OCR recognition result; compile accuracy statistics from the user's historical verification records to obtain a user historical accuracy score; calculate the degree of difference from the text similarity between the user's operation content and the OCR recognition result to obtain an operation difference score; determine a base weight value according to the operation type; and perform a weighted calculation on the user historical accuracy score, the operation difference score and the base weight value to obtain the influence weight.
- 6. The verification system for a license in a bidding document of claim 1, wherein the scene recognition module is further configured to: compare the visual features and content features of the license image against a preset template library to obtain a standard license type identifier; acquire project information and bidding enterprise information for the current verification task from a bid evaluation system to obtain a project grade and an enterprise credit rating; perform scene matching and classification according to the standard license type identifier and the project grade to obtain the scene type; and perform a weighted calculation on the scene type, the image quality score and the enterprise credit rating to obtain the complexity rating.
- 7. The verification system for a license in a bidding document of claim 6, wherein the scene recognition module is further configured to: perform edge detection and region analysis on the license image and extract layout size proportions, color region distribution and visual element position features to obtain a visual feature vector; scan the OCR recognition result for keywords and extract the license name and issuing authority to obtain a content keyword list; perform similarity calculation and keyword matching between the visual feature vector and content keyword list and the features in the preset license template library to obtain a matched license type code and a matching confidence; and compare the matching confidence with a preset threshold, directly confirming the license type code as the standard license type identifier when the matching confidence is higher than the threshold, and otherwise marking the license type as to be specified manually.
- 8. The verification system for a license in a bidding document of claim 1, wherein the adaptive confidence fusion module is further configured to: determine a base weight allocation ratio according to the complexity rating, wherein the higher the complexity rating, the larger the share of weight given to the user operation influence; determine a confidence stability score according to the confidence distribution of the OCR recognition result, wherein the more concentrated the confidence distribution, the higher the stability score; dynamically adjust the weights according to the base weight allocation ratio and the confidence stability score to obtain a real-time weight allocation scheme; and perform weighted multi-dimensional information fusion according to the real-time weight allocation scheme to obtain the final confidence.
- 9. The verification system for a license in a bidding document of claim 1, further comprising a conflict detection and sample collection module configured to: perform conflict detection by comparing the OCR recognition result text with the text modified by the user; when a conflict is detected, classify the conflict type according to the similarity features and position features of the conflicting text to obtain a conflict type identifier; and collect samples according to the conflict detection result and the conflict type identifier, adding each conflict sample to the training data set of the corresponding type, wherein the training data set is used for OCR engine optimization.
- 10. A verification method for a license in a bidding document, comprising: performing image quality assessment according to the definition, illumination uniformity and text region integrity of a license image to obtain an image quality score; dynamically selecting an OCR processing strategy according to the image quality score, selecting a first OCR processing strategy when the image quality score is higher than a first threshold, a second OCR processing strategy when the image quality score is between the first threshold and a second threshold, and a third OCR processing strategy when the image quality score is lower than the second threshold, to obtain an OCR recognition result produced by the selected strategy, wherein the second OCR processing strategy comprises a voting mechanism among multiple OCR engines; analyzing the result and content of a user's operation on the OCR recognition result to obtain an operation type, a modification content type and an influence weight, wherein the operation type comprises confirmation, modification and marking as doubtful, the modification content type comprises format correction, content correction and major change, and the influence weight is calculated from the user's historical accuracy and the degree of difference between the operation result and the OCR recognition result; performing verification scene recognition according to the license type and business context of the license image to obtain a scene type and a complexity rating; and performing dynamic weight allocation and multi-dimensional information fusion according to the confidence distribution of the OCR recognition result, the influence weight and the complexity rating to obtain a final confidence for the license image.
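The quality scoring, strategy selection and field voting recited in claims 2, 3, 4 and 10 can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the numeric thresholds, the sub-score weights, the strategy labels, the `FieldReading` type and all function names are hypothetical, since the claims name these quantities without giving values. Two further assumptions are made: a score exactly equal to a threshold falls into the middle band, and "a majority of the OCR engines agree" is read as a strict majority.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical values: the claims name the thresholds and weights but give no numbers.
FIRST_THRESHOLD = 0.8
SECOND_THRESHOLD = 0.5
QUALITY_WEIGHTS = {"definition": 0.4, "illumination": 0.3, "integrity": 0.3}


def image_quality_score(definition: float, illumination: float, integrity: float) -> float:
    """Weighted sum of the three sub-scores (claim 2); inputs assumed to lie in [0, 1]."""
    return (
        QUALITY_WEIGHTS["definition"] * definition
        + QUALITY_WEIGHTS["illumination"] * illumination
        + QUALITY_WEIGHTS["integrity"] * integrity
    )


def select_strategy(score: float) -> str:
    """Map the quality score to one of the three strategies (claims 1, 3 and 10)."""
    if score > FIRST_THRESHOLD:
        return "single_fast_engine"
    # Assumption: a score equal to either threshold is treated as the middle band.
    if score >= SECOND_THRESHOLD:
        return "multi_engine_vote"
    return "restore_then_multi_engine_vote"


@dataclass
class FieldReading:
    engine: str         # hypothetical engine name
    text: str           # text this engine recognized for the field
    credibility: float  # assumed per-engine credibility value for this field


def vote_field(readings: list[FieldReading]) -> str:
    """Adopt the strict-majority text, else the highest-credibility reading (claim 4)."""
    if not readings:
        raise ValueError("at least one reading is required")
    counts = Counter(r.text for r in readings)
    text, votes = counts.most_common(1)[0]
    if votes * 2 > len(readings):
        return text
    return max(readings, key=lambda r: r.credibility).text
```

With these assumed values, a license image scoring between 0.5 and 0.8 is routed to the multi-engine voting strategy, and `vote_field` is then applied once per matched field correspondence. When two engines disagree on a field and no majority exists, the reading from the engine with the higher credibility value is kept.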
Description
Verification system and method suitable for license in bidding document
Technical Field
The invention relates to the field of computers, and in particular to a verification system and method for a license in a bidding document.
Background
In an electronic bidding system, verifying the authenticity of the various licenses in a bidding document (such as business licenses, qualification certificates and safe production licenses) is a core part of bid evaluation. These licenses are important evidence of a bidding enterprise's qualifications and capabilities, and their authenticity bears directly on the successful implementation of the bidding project and the efficient allocation of public resources. The traditional method relies mainly on manual visual inspection, which suffers from low verification efficiency, strong subjectivity and poor scalability. To address these problems, the prior art uses an OCR engine to recognize and extract the text in the license image, replacing most of the manual verification work. However, in existing bidding document license verification systems, OCR recognition accuracy depends heavily on the image quality of the license, the quality of images uploaded by bidding enterprises is uneven, and the system applies a single fixed OCR processing strategy, so the recognition results are unstable and frequent manual intervention is required.
Disclosure of Invention
The invention aims to provide a verification system and method for a license in a bidding document that solve the above problems in the prior art.
The invention is realized by the following technical scheme:
In a first aspect, an embodiment of the present invention provides a verification system for a license in a bidding document, comprising: an image quality analysis module configured to perform image quality assessment according to the definition, illumination uniformity and text region integrity of a license image to obtain an image quality score; a dynamic OCR strategy selection module configured to dynamically select an OCR processing strategy according to the image quality score, selecting a first OCR processing strategy when the image quality score is higher than a first threshold, a second OCR processing strategy when the image quality score is between the first threshold and a second threshold, and a third OCR processing strategy when the image quality score is lower than the second threshold, to obtain an OCR recognition result produced by the selected strategy, wherein the second OCR processing strategy comprises a voting mechanism among multiple OCR engines; a user verification module configured to analyze the result and content of a user's verification operation on the OCR recognition result to obtain an operation type, a modification content type and an influence weight, wherein the operation type comprises confirmation, modification and marking as doubtful, the modification content type comprises format correction and content correction, and the influence weight is calculated from the user's historical accuracy and the degree of difference between the verification operation result and the OCR recognition result; a scene recognition module configured to perform verification scene recognition according to the license type and business context of the license image to obtain a scene type and a complexity rating; and an adaptive confidence fusion module configured to perform dynamic weight allocation and multi-dimensional information fusion according to the confidence distribution of the OCR recognition result, the influence weight and the complexity rating to obtain a final confidence for the license image.
Preferably, the image quality analysis module is further configured to: perform a quantitative definition calculation according to the edge sharpness, text contrast and noise level of the license image to obtain a definition score; evaluate illumination uniformity according to the standard deviation of the image brightness distribution and the difference between local brightness extrema to obtain an illumination uniformity score; detect text integrity according to the text region area ratio and text region connectivity to obtain a text integrity score; and perform a comprehensive quality assessment as a weighted sum of the definition score, the illumination uniformity score and the text integrity score to obtain the image quality score.
Preferably, in the dynamic OCR strategy selection module: the first OCR processing strategy is to process with a single fast OCR engine; the second OCR processing strategy is to process with at least two OCR engines in parallel and vote on the results; and the third OCR processing strategy is to adopt at least two OCR engines to process