CN-121999498-A - Image correction and paper identification method and system

Abstract

The invention discloses an image correction and paper identification method and system. At least two original images with adjacent shooting times are obtained, the original images being reflection imaging images, and are preprocessed to obtain preprocessed images. An initial correction module performs initial correction on the preprocessed images to obtain a first corrected image; a linkage correction module links the spatial information and temporal information of the first corrected image and adaptively corrects it a second time to obtain a second corrected image; and an image restoration module performs perspective restoration on the original image based on the second corrected image to obtain a restored paper image. The invention addresses four dimensions: function integration, environmental adaptability, terminal adaptability, and precision assurance. It comprehensively remedies the shortcomings of the prior art and provides an efficient, low-cost, and easily deployable technical solution for image processing on intelligent learning devices, with remarkable technical value and application prospects.

Inventors

ZHANG XIAO
ZHANG LI
ZHOU JIN
SONG CHUNWEN

Assignees

北京简单科技有限公司

Dates

Publication Date: 20260508
Application Date: 20260203

Claims (10)

1. An image correction and paper identification method, comprising: Step S1, obtaining at least two original images with adjacent shooting times, and preprocessing the original images to obtain preprocessed images, wherein the original images are reflection imaging images; Step S2, performing initial correction on the preprocessed images to obtain a first corrected image; Step S3, linking the spatial information and the temporal information of the first corrected image, and adaptively correcting the first corrected image a second time to obtain a second corrected image; and Step S4, performing perspective restoration on the original image based on the second corrected image to obtain a restored paper image.
2. The image correction and paper identification method according to claim 1, wherein step S2 comprises: performing binary segmentation on the preprocessed image to obtain a segmented image, wherein the segmented image comprises a plurality of first segmentation masks; performing connected-domain analysis on the plurality of first segmentation masks, and determining the first segmentation mask with the largest area as the paper main body area; performing a morphological opening operation on the paper main body area to remove isolated noise points, and performing a closing operation on the paper main body area to fill gaps, so that the boundary of the paper main body area is smoother and more continuous, thereby obtaining a morphologically optimized paper area; and performing edge correction and contour fitting on the morphologically optimized paper area, specifically: performing contour extraction on the morphologically optimized paper area to obtain a first paper area edge, performing edge smoothing on the first paper area edge to remove locally sharp regions to obtain a second paper area edge, and taking the image marked with the second paper area edge as the first corrected image.
3. The image correction and paper identification method according to claim 2, wherein the at least two frames of original images are each processed by step S2 to obtain at least two frames of first corrected images, and step S3 comprises: taking the frame with the later shooting time among the at least two frames of original images as the current frame image; center point offset monitoring, namely obtaining the center point offset between the second paper area edge of the current frame image and the second paper area edge of the previous frame image; if the offset is smaller than a set value, considering that the paper has not actually moved, and taking the edge detection result of the previous frame image as the current edge detection result; region overlap estimation, namely obtaining the intersection-over-union between the second paper area edge of the current frame image and the second paper area edge of the previous frame image, and evaluating, according to the intersection-over-union, whether the paper shapes in the two frame images are consistent; historical result reuse and state memory, namely taking the edge detection result of the previous frame image as the current edge detection result when the two frame images are consecutive observation images, wherein the current edge detection result comprises a current detected paper mask area; and extracting contour feature points of the current edge detection result.
4. The image correction and paper identification method according to claim 3, wherein extracting the contour feature points of the current edge detection result comprises: mask candidate region selection and geometric smoothing, namely performing morphological processing on the current edge detection result to obtain a smooth boundary image, performing contour extraction on the smooth boundary image to obtain a paper main boundary, computing a convex hull based on the main boundary, and reducing the influence of local concavities and noise points through geometric convergence to obtain a stable main boundary; and extracting feature points of the stable main boundary to obtain the contour feature points.
5. The image correction and paper identification method according to claim 4, wherein adaptively correcting the contour feature points comprises: filling the region enclosed by the contour feature points as a binary region to obtain an initial contour binary region; if the difference set is not empty, identifying the pixels in the difference set as uncovered pixels; if the ratio of the area of the difference set to the area of the current detected paper mask area exceeds a tolerance threshold, locally correcting the initial contour binary region based on an edge normal offset algorithm, specifically: offsetting, namely translating each edge of the initial contour binary region along the normal direction of that edge by a preset step length to obtain translated edges; correcting, namely obtaining the intersection points of adjacent translated edges to update the vertices of the initial contour binary region and obtain a corrected contour binary region; and repeating the offsetting and correcting process until the area ratio between the corrected contour binary region and the current detected paper mask area is lower than a threshold or a preset number of iterations is reached, then stopping the adaptive correction and obtaining the final contour feature points.
6. The image correction and paper identification method according to claim 2, wherein the binary segmentation of the preprocessed image to obtain the segmented image is implemented by a lightweight model; specifically, the preprocessed image is input into the lightweight model, and the lightweight model performs the binary segmentation on the preprocessed image.
7. The image correction and paper identification method according to claim 6, wherein the lightweight model comprises an adaptive processing module, an encoder, a decoder, and an edge-weighted loss optimization module; the input of the adaptive processing module is the preprocessed image of step S1, the input of the encoder is the output of the adaptive processing module, the input of the decoder is the output of the encoder, and the input of the edge-weighted loss optimization module is the output of the decoder; the decoder outputs the segmented image, which comprises one or more first segmentation masks; the adaptive processing module is used for adaptively processing the preprocessed image to obtain a fused image, specifically: performing graying on the preprocessed image to obtain a grayscale image; setting a block kernel of 8 × 8 pixels and dividing the grayscale image into non-overlapping blocks with the block kernel; computing the gray-level mean of each block and the global mean; performing adaptive contrast enhancement on the grayscale image to obtain an enhanced image according to a formula [formula and symbols not reproduced in the source text] whose terms are: an adaptive contrast enhancement factor; the value of the pixel point (i, j) in the grayscale image; the value of the pixel point (i, j) in the enhanced image, where i and j respectively denote the abscissa and ordinate of the pixel point; a numerical clipping operation on its argument [clipping bounds not reproduced in the source text]; and a maximum operation that takes the larger of its two arguments; performing bilinear interpolation fusion on the enhanced image to obtain the fused image; the encoder is used for performing depthwise convolution spatial feature extraction and ordinary pointwise convolution cross-channel fusion on the fused image to obtain an encoded image; the decoder is used for performing nearest-neighbor interpolation and edge feature compensation on the encoded image to obtain the output binarized segmented image; and the edge-weighted loss optimization module is used for calculating the loss value of the segmented image to train the lightweight model.
8. The image correction and paper identification method according to claim 7, wherein the edge-weighted loss optimization module calculates the loss value according to a formula [formula and symbols not reproduced in the source text] whose terms are: the loss value of the edge-weighted loss optimization module; the edge weight; the Sobel edge intensity value of the true mask, which is used to strengthen the edge loss; a balance parameter; the cross-entropy loss value between the segmented image and the true label image, the true label image being provided in advance; the region overlap coefficient; the edge weight of the k-th pixel in the segmented image; the edge intensity value of the k-th pixel; the pixel value of the k-th pixel in the segmented image; N, the number of pixels of the segmented image; and the pixel value of the k-th pixel in the true label image.
9. The image correction and paper identification method according to claim 7, wherein step S4 comprises: solving a projective transformation matrix T based on the feature points in the second corrected image, and performing perspective correction on the original image based on the projective transformation matrix T to obtain the restored paper image.
10. An image correction and paper identification system, comprising: a preprocessing module, used for obtaining at least two original images with adjacent shooting times and preprocessing the original images to obtain preprocessed images, wherein the original images are reflection imaging images; an initial correction module, used for performing initial correction on the preprocessed images to obtain a first corrected image; a linkage correction module, used for linking the spatial information and the temporal information of the first corrected image and adaptively correcting the first corrected image a second time to obtain a second corrected image; and an image restoration module, used for performing perspective restoration on the original image based on the second corrected image to obtain a restored paper image.
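
The frame-linking logic of claim 3 (center point offset, intersection-over-union, and reuse of the previous result) can be illustrated with a minimal Python sketch. It is not the patented implementation: masks are represented as sets of (x, y) pixel coordinates, the function names and the thresholds `max_offset` and `min_iou` are hypothetical, and combining the offset test and the overlap test into a single reuse decision is an assumption made for brevity.

```python
from math import hypot

def centroid(mask):
    """Mean (x, y) of a mask given as a non-empty set of (x, y) pixel tuples."""
    n = len(mask)
    return (sum(x for x, _ in mask) / n, sum(y for _, y in mask) / n)

def iou(a, b):
    """Intersection-over-union of two pixel sets (0.0 if both are empty)."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def link_frames(prev_mask, cur_mask, max_offset=2.0, min_iou=0.9):
    """Decide which paper mask to use for the current frame.

    Returns (chosen_mask, center_offset, overlap). The previous frame's mask
    is reused when the center barely moved and the shapes overlap strongly;
    both thresholds are illustrative values, not values from the source.
    """
    if not prev_mask or not cur_mask:
        return cur_mask, None, 0.0
    (px, py), (cx, cy) = centroid(prev_mask), centroid(cur_mask)
    offset = hypot(cx - px, cy - py)      # center point offset in pixels
    overlap = iou(prev_mask, cur_mask)    # region overlap estimate
    if offset < max_offset and overlap >= min_iou:
        return prev_mask, offset, overlap  # paper treated as not moved
    return cur_mask, offset, overlap       # paper moved or changed shape

def rect(x0, y0, w, h):
    """Helper for examples: a filled axis-aligned rectangle as a pixel set."""
    return {(x, y) for x in range(x0, x0 + w) for y in range(y0, y0 + h)}
```

For example, `link_frames(rect(10, 10, 20, 30), rect(11, 10, 20, 30))` keeps the previous mask because the one-pixel shift is treated as jitter, while a ten-pixel shift makes the current mask the new result.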

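The edge normal offset correction of claim 5 can likewise be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the contour is a convex polygon with no parallel adjacent edges, the paper mask is a set of (x, y) pixel coordinates, the difference set is taken to be the mask pixels not covered by the polygon, and the names `offset_polygon`, `uncovered_ratio`, and `adaptive_correct` as well as the default `step`, `tolerance`, and `max_iter` values are hypothetical.

```python
def _cross(ax, ay, bx, by):
    return ax * by - ay * bx

def _orientation(poly):
    """+1.0 if the shoelace area is positive (counter-clockwise), else -1.0."""
    n = len(poly)
    area2 = sum(_cross(*poly[i], *poly[(i + 1) % n]) for i in range(n))
    return 1.0 if area2 > 0 else -1.0

def offset_polygon(poly, step):
    """Translate every edge outward along its normal by `step`, then rebuild
    the vertices as intersections of adjacent translated edges."""
    n, sign = len(poly), _orientation(poly)
    shifted = []  # (point on translated edge, edge direction)
    for i in range(n):
        (x1, y1), (x2, y2) = poly[i], poly[(i + 1) % n]
        dx, dy = x2 - x1, y2 - y1
        length = (dx * dx + dy * dy) ** 0.5
        nx, ny = sign * dy / length, -sign * dx / length  # outward unit normal
        shifted.append(((x1 + nx * step, y1 + ny * step), (dx, dy)))
    new_poly = []
    for i in range(n):
        (p1, d1), (p2, d2) = shifted[i - 1], shifted[i]
        denom = _cross(*d1, *d2)  # zero only for parallel adjacent edges
        t = _cross(p2[0] - p1[0], p2[1] - p1[1], *d2) / denom
        new_poly.append((p1[0] + t * d1[0], p1[1] + t * d1[1]))
    return new_poly

def covers(poly, point, eps=1e-9):
    """True if `point` lies inside or on the boundary of convex `poly`."""
    n, sign = len(poly), _orientation(poly)
    for i in range(n):
        (x1, y1), (x2, y2) = poly[i], poly[(i + 1) % n]
        if sign * _cross(x2 - x1, y2 - y1, point[0] - x1, point[1] - y1) < -eps:
            return False
    return True

def uncovered_ratio(poly, mask):
    """Share of mask pixels not covered by the polygon (assumed difference set)."""
    return sum(1 for p in mask if not covers(poly, p)) / len(mask)

def adaptive_correct(poly, mask, step=1.0, tolerance=0.01, max_iter=20):
    """Repeat offset-and-correct until coverage is acceptable or max_iter is hit."""
    poly = list(poly)
    for _ in range(max_iter):
        if uncovered_ratio(poly, mask) <= tolerance:
            break
        poly = offset_polygon(poly, step)
    return poly
```

A uniform outward shift of every edge is the simplest reading of the translate-then-intersect step; a fuller version might shift only the edges adjacent to uncovered pixels.
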
Description

Image correction and paper identification method and system

Technical Field

The invention relates to the technical field of computer vision, in particular to an image correction and paper identification method and system.

Background

With the popularization of intelligent learning devices such as learning machines and student tablets, functions such as "photographing and searching questions", "homework correction", "learning referring", and "reading referring" have become common applications. To let a user shoot without moving the device, some learning machines (electronic devices for assisting learning) are equipped with front cameras and photograph books and papers on the desktop by means of a reflecting mirror attached over the camera. However, because of the unavoidable inclination angle between the reflecting mirror and the desktop, the camera introduces visual distortions such as perspective deformation, position deviation, and local stretching during shooting, which reduce the accuracy of subsequent homework correction and character recognition. Meanwhile, to facilitate subsequent processing, the area occupied by the paper needs to be identified accurately and useless background information in the picture removed.

To cope with this challenge, the main existing solutions are as follows:

1. Rear-camera scheme: the user lifts the device and shoots directly with the rear lens of the learning machine to avoid distortion. This scheme gives high imaging quality, but learning devices are large and heavy, and frequent lifting is cumbersome and places an extra burden on students, leaving them with sore arms and physical fatigue. It is difficult to adapt to long learning sessions that require continuous shooting, and it does not provide image semantic information about paper boundaries.

2. Traditional image geometric transformation scheme: the original form of the photographed plane is restored through techniques such as perspective transformation, transformation matrix solving, or edge detection. However, this method depends heavily on stable, sharp, regular paper boundaries. In a real, complex learning environment, randomly placed paper, uneven illumination, and clutter on the desk can make the technique extremely unreliable, which harms the robustness of the algorithm and makes it difficult to generalize.

3. 3D reconstruction technology based on a binocular camera: the pose and inclination angle of the camera are estimated from the binocular parallax of the binocular camera to obtain an image transformation matrix. Its advantages are high real-time performance and a good correction effect. Its drawbacks are also obvious: the hardware cost is high, which hinders product popularization and increases the burden on users. In addition, only the whole image can be restored, and there is no means of identifying the boundary semantic information of the image, so the camera cannot effectively distinguish paper from the background or identify the left and right boundaries of the paper.

4. Transformer-based 3D reconstruction model: an ultra-large-parameter deep learning model is trained directly on distorted images and corrected images to learn a correction mapping. Its advantage is that it can deal with a certain degree of page quality problems, including but not limited to creasing, bending, and blurring. Its drawback is that it is difficult to deploy on terminal devices: first, the model has a huge number of parameters and consumes a great deal of computing power at inference, making it hard to meet the low-precision and real-time requirements of edge-side devices; second, the required data are specialized, so preparing them often takes great effort.

In summary, the prior art has the following disadvantages:

Single function: most schemes focus on geometric correction and lack recognition of paper boundaries, whereas educational image processing applications need to perform both correction and paper identification.

The traditional geometric method is easily disturbed by environmental factors such as boundary blurring and illumination change, and is difficult to make work reliably in complex real scenes.

Cost and functional defects: the binocular camera scheme has a high hardware cost and cannot complete the paper identification task; the large model scheme cannot exploit edge-side computing power, has a high computing cost, and the cost of the specialized data it requires is also very high.

The root cause of these shortcomings is that the front shooting of the learning m