CN-121981923-A - Bill correction method based on multi-branch model
Abstract
The application discloses a bill correction method based on a multi-branch model, comprising: S1, obtaining an original bill image and preprocessing it; S2, extracting a high-dimensional feature map from the image data preprocessed in S1 with a DeiT trunk feature extractor; S3, estimating the geometric center position of the bill image based on the high-dimensional feature map extracted in S2; S4, predicting the two-dimensional offsets of the four corner points of the bill image relative to the geometric center position, based on the high-dimensional feature map extracted in S2 and the geometric center position estimated in S3; S5, applying sub-pixel fine adjustment to the geometric center position estimated in S3 to correct center-point deviation caused by reduced image quality, bill deformation, local occlusion or background interference; S6, applying a perspective transformation to the original bill image based on the geometric center position adjusted in S5 and the two-dimensional offsets predicted in S4, and finally outputting the corrected image.
Inventors
- ZHENG MINGMING
- LIU TIANYI
- YANG LIU
- XUAN WUJING
Assignees
- 中国人寿保险股份有限公司新疆维吾尔自治区分公司 (China Life Insurance Company Limited, Xinjiang Uygur Autonomous Region Branch)
Dates
- Publication Date: 2026-05-05
- Application Date: 2025-12-11
Claims (7)
- 1. A bill correction method based on a multi-branch model, characterized by comprising the following steps: S1, acquiring an original bill image and preprocessing it; S2, extracting a high-dimensional feature map from the image data preprocessed in S1 with a DeiT trunk feature extractor; S3, estimating the geometric center position of the bill image with a lightweight ResNet-18 decoder head, based on the high-dimensional feature map extracted in S2; S4, fusing the high-dimensional feature map extracted in S2 with the geometric center position of the bill image estimated in S3 to predict the two-dimensional offsets of the four corner points of the bill image relative to the geometric center position; S5, applying sub-pixel fine adjustment to the geometric center position estimated in S3 to correct center-point deviation caused by reduced image quality, bill deformation, local occlusion or background interference; S6, applying a perspective transformation to the original bill image based on the geometric center position adjusted in S5 and the two-dimensional offsets predicted in S4 to achieve geometric correction of the image, and finally outputting the corrected image.
- 2. The bill correction method based on the multi-branch model according to claim 1, wherein the preprocessing of the original bill image in S1 comprises: S1.1, performing standardization on the original image; S1.2, uniformly scaling the image standardized in S1.1 to a fixed resolution.
- 3. The bill correction method based on the multi-branch model according to claim 2, wherein the extraction of the high-dimensional feature map from the image data in S2 comprises: S2.1, dividing the input image into non-overlapping 16×16-pixel image blocks and linearly projecting each block into a vector of fixed dimension, so as to convert the two-dimensional spatial structure into a one-dimensional sequence; S2.2, introducing a learnable absolute position code that preserves the spatial relative relations between image blocks, so that the model can perceive the layout information of bill elements; S2.3, extracting the bill position and shape features of the image blocks; S2.4, rearranging the image blocks into a two-dimensional feature map so that the original spatial topology is preserved.
- 4. The bill correction method based on the multi-branch model according to claim 3, wherein the estimation of the geometric center position of the bill image in S3 comprises: S3.1, given the coordinates of the true center point, generating a corresponding heatmap label using a Gaussian kernel; S3.2, adopting mean square error or cross-entropy loss as the optimization target to measure the difference between the predicted heatmap and the real label; S3.3, selecting the pixel with the maximum response value in the heatmap as the predicted center-point coordinates via argmax; and S3.4, computing a weighted average of all pixel coordinates in the heatmap to obtain a smoother and more accurate center-point estimate.
- 5. The bill correction method based on the multi-branch model according to claim 4, wherein the prediction of the two-dimensional offsets of the four corner points of the bill image relative to the geometric center position in S4 comprises: S4.1, computing the offsets of the four corners relative to the center point from their real coordinates and converting them into corresponding ground-truth labels; S4.2, adopting smooth L1 loss or mean square error as the optimization target to measure the difference between the predicted offsets and the real offsets.
- 6. The bill correction method based on the multi-branch model according to claim 5, wherein the sub-pixel fine adjustment of the geometric center position of the bill image in S5 comprises: S5.1, taking the feature map from S2.4 and optionally fusing the center-point heatmap from S3 or the initial corner-offset result as auxiliary context, so as to enhance understanding of the local structure of the central area; S5.2, restoring the low-resolution feature map of S5.1 to the spatial scale of the original input image to obtain a two-channel offset map, wherein the first channel represents the correction of the center point in the x direction and the second channel represents the correction of the center point in the y direction.
- 7. The bill correction method based on the multi-branch model according to claim 6, wherein the perspective transformation of the original bill image in S6 comprises: S6.1, determining the positions of the four target corner points according to the actual size or the expected output size of the bill; S6.2, calling the cv2.getPerspectiveTransform() function provided by a computer vision library to compute the perspective transformation matrix from the four corner points detected in the original image to the standard rectangle defined above; and S6.3, performing the perspective transformation on the original image with the cv2.warpPerspective() function, whose input parameters comprise the original image, the transformation matrix M and the size of the output image, and finally returning the corrected image.
Description
Bill correction method based on multi-branch model
Technical Field
The application belongs to the technical field of computer vision and artificial intelligence, and in particular relates to a bill correction method based on a multi-branch model.
Background
In industries such as finance, government affairs, medical treatment and logistics, the automatic identification and verification of bills and certificates (such as invoices, identity cards, driving licenses and insurance policies) are key links in intelligent workflows. The prior art generally relies on OCR (optical character recognition) systems for text extraction from bill images, but this process is highly dependent on the quality of the input image. When the bill is tilted, bent, creased, perspectively distorted or non-rigidly deformed, OCR accuracy drops significantly, and key fields may not even be locatable. Research on bill correction has developed from manual correction to automatic correction techniques; apart from methods relying on expert experience, it is mainly realized through computer-graphics methods or deep-learning methods. The computer-graphics methods correct the bill by locating its edges, corners and center point as key points and then applying a projective transformation with OpenCV or similar tools. The deep-learning methods are mainly trained on correctly oriented sample bill images: the bill to be corrected is fed into a semantic segmentation model and corrected after a correction matrix is obtained, or the distortion parameters are obtained directly from a feature extraction network and correction is performed according to a direction-sensitive loss value.
In the prior art, the method that corrects by directly acquiring deformation parameters depends too heavily on the quality of the original image: the degree of deformation cannot be too large, otherwise the image may become unrecoverable, so this method suits only slightly deformed bill scenes. The method that extracts distortion parameters through a semantic segmentation model relies on a large number of training samples, and the training set must cover deformed data under various conditions; otherwise the model may produce an erroneous correction matrix during semantic segmentation and fail to complete the bill correction. Further research is therefore needed.
Disclosure of the Invention
The application aims to provide a bill correction method based on a multi-branch model; the specific technical scheme is as follows. A bill correction method based on a multi-branch model comprises: S1, obtaining an original bill image and preprocessing it; S2, extracting a high-dimensional feature map from the image data preprocessed in S1 with a DeiT trunk feature extractor; S3, estimating the geometric center position of the bill image with a lightweight ResNet-18 decoder head, based on the high-dimensional feature map extracted in S2; S4, predicting the two-dimensional offsets of the four corner points of the bill image relative to the geometric center position, based on the high-dimensional feature map extracted in S2 and the geometric center position estimated in S3; S5, applying sub-pixel fine adjustment to the geometric center position estimated in S3 to correct center-point deviation caused by reduced image quality, bill deformation, local occlusion or background interference; S6, applying a perspective transformation to the original bill image based on the geometric center position adjusted in S5 and the two-dimensional offsets predicted in S4 to achieve geometric correction of the image, and finally outputting the corrected image. The preprocessing of the original bill image in S1 comprises: S1.1, performing standardization on the original image; S1.2, uniformly scaling the image standardized in S1.1 to a fixed resolution. The extraction of the high-dimensional feature map in S2 comprises: S2.1, dividing the input image into non-overlapping 16×16-pixel image blocks and linearly projecting each block into a vector of fixed dimension, so as to convert the two-dimensional spatial structure into a one-dimensional sequence; S2.2, introducing a learnable absolute position code that preserves the spatial relative relations between image blocks, so that the model can perceive the layout information of bill elements; S2.3, extracting the bill position and shape features of the image blocks; S2.4, rearranging the image blocks into a two-dimensional feature map so that the original spatial topology is preserved. The step S3 of estimating the geometric center positi
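The heatmap-based center estimation of S3 (claim 4) can be sketched in NumPy: a Gaussian-kernel label around the true center (S3.1) and a coordinate-expectation readout (S3.4) that is smoother than hard argmax. This is a minimal sketch; the function names, the heatmap size and the sigma value are assumptions for illustration.

```python
import numpy as np

def gaussian_heatmap(h, w, cx, cy, sigma=2.0):
    """S3.1: Gaussian heatmap label centred on the true point (cx, cy).
    sigma is an illustrative choice, not specified by the patent."""
    ys, xs = np.mgrid[0:h, 0:w]
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * sigma ** 2))

def soft_argmax_center(heatmap):
    """S3.4: weighted average of all pixel coordinates, with the
    normalised heatmap as weights, giving a smooth sub-pixel estimate."""
    h, w = heatmap.shape
    p = heatmap / heatmap.sum()
    ys, xs = np.mgrid[0:h, 0:w]
    return float((p * xs).sum()), float((p * ys).sum())
```

A hard argmax (S3.3) would return only integer pixel coordinates; the weighted average recovers the fractional position, which is why the patent pairs it with the sub-pixel refinement of S5.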