CN-121982729-A - Handwriting manifold field modeling-based handwritten character recognition method and system
Abstract
Embodiments of the invention provide a handwritten character recognition method and system based on handwriting manifold field modeling. The method comprises: performing handwriting form enhancement on an input image, generating a handwriting skeleton diagram representing the stroke center track through a deformable convolution network, and constructing, from the handwriting skeleton diagram, a handwriting manifold field containing direction, width, pressure, and speed. A dual-channel recognition structure then performs visual recognition and handwriting structure recognition with separate models, based on the input image and the handwriting-related features respectively, to obtain two character probability distributions; a fusion weight is generated from local quality information and the feature difference between the two distributions, and the distributions are fused by weighting. The final recognition result is input into a lightweight text error correction model, which corrects low-confidence characters. The handwriting manifold field accurately captures handwriting structure and writing dynamics, and the dual-channel dynamic fusion mechanism adapts to different scenes, so that the invention overcomes the difficulty of recognizing scrawled and cursive characters, resists complex background interference, and improves the accuracy and robustness of handwritten character recognition.
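As a minimal illustrative sketch, the weighted fusion and low-confidence flagging summarized above might look as follows in Python. The convex-combination fusion rule, the function names `fuse_distributions` and `decode_with_flags`, and the threshold value 0.6 are assumptions introduced for illustration only; the abstract does not specify them, and in the described method the fusion weight would come from the dynamic confidence adjustment network rather than being passed in by hand.

```python
from typing import Dict, List, Tuple


def fuse_distributions(
    p_visual: Dict[str, float],
    p_stroke: Dict[str, float],
    weight: float,
) -> Dict[str, float]:
    """Fuse two character probability distributions.

    Assumption: fusion is a convex combination,
    weight * p_visual + (1 - weight) * p_stroke, renormalized.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    chars = set(p_visual) | set(p_stroke)
    fused = {
        c: weight * p_visual.get(c, 0.0) + (1.0 - weight) * p_stroke.get(c, 0.0)
        for c in chars
    }
    total = sum(fused.values())
    return {c: v / total for c, v in fused.items()} if total > 0 else fused


def decode_with_flags(
    fused_seq: List[Dict[str, float]],
    threshold: float = 0.6,  # hypothetical preset threshold
) -> List[Tuple[str, float, bool]]:
    """Pick the most probable character at each position and flag
    positions whose confidence is below the threshold for correction."""
    result = []
    for dist in fused_seq:
        ch = max(dist, key=dist.get)
        conf = dist[ch]
        result.append((ch, conf, conf < threshold))
    return result


if __name__ == "__main__":
    fused = fuse_distributions({"a": 0.6, "o": 0.4}, {"a": 0.2, "o": 0.8}, 0.5)
    for ch, conf, needs_fix in decode_with_flags([fused]):
        print(ch, round(conf, 3), "correct" if needs_fix else "keep")
```

Only the flagged positions would be passed to the error correction model; positions at or above the threshold keep their original recognition result.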
Inventors
- SHEN ZONGREN
Assignees
- 璞华国际科技(武汉)有限公司
Dates
- Publication Date
- 20260505
- Application Date
- 20251215
Claims (10)
- 1. A handwritten character recognition method based on handwriting manifold field modeling, characterized by comprising the following steps: performing handwriting form enhancement on an input image, generating a handwriting skeleton diagram representing a stroke center track, and generating a handwriting manifold field describing writing dynamics characteristics based on the handwriting skeleton diagram; performing visual feature recognition through a first recognition model based on the input image to obtain a first character probability distribution, and performing handwriting structure recognition through a second recognition model based on the handwriting skeleton diagram and the handwriting manifold field to obtain a second character probability distribution; and, based on local quality information of the input image and the feature difference between the first character probability distribution and the second character probability distribution, generating a fusion weight through a dynamic confidence adjustment network, and performing weighted fusion of the first character probability distribution and the second character probability distribution using the fusion weight to obtain a final recognition result.
- 2. The handwritten character recognition method based on handwriting manifold field modeling according to claim 1, wherein performing handwriting form enhancement on the input image and generating a handwriting skeleton diagram representing a stroke center track comprises: performing multi-scale feature extraction on the input image using a deformable convolution network to obtain a multi-scale handwriting candidate region feature map; performing feature sparsification on the multi-scale handwriting candidate region feature map to obtain a pixel-level handwriting center line probability map; and optimizing the handwriting center line probability map based on a minimum cost path algorithm combined with a direction consistency constraint to generate the handwriting skeleton diagram, wherein the pixels of the handwriting skeleton diagram satisfy the conditions that the local gray gradient is at a maximum and the offsets of the deformable convolution network have clearly converged.
- 3. The handwritten character recognition method based on handwriting manifold field modeling according to claim 1, wherein generating a handwriting manifold field describing writing dynamics characteristics based on the handwriting skeleton diagram comprises: estimating local structure parameters of the handwriting at each skeleton point based on the handwriting skeleton diagram, wherein the local structure parameters comprise a direction vector, a handwriting width, a writing pressure, and a writing speed; wherein the direction vector is estimated from the principal direction of the offsets of the offset convolution in a deformable convolution network, the handwriting width is estimated from the full width at half maximum of the transverse gray-level distribution in a local region of the input image, the writing pressure is estimated by a regression model that combines stroke thickness variation and stroke color darkness, and the writing speed is inferred from the direction change rate of adjacent skeleton points; and integrating the local structure parameters of all skeleton points to form the handwriting manifold field describing writing dynamics characteristics.
- 4. The handwritten character recognition method based on handwriting manifold field modeling according to claim 1, wherein after performing handwriting form enhancement on the input image, the method further comprises: performing adaptive background noise separation and local text block super-resolution restoration on the input image, including: dividing the input image into local blocks, outputting four noise-class probabilities for each local block through a network model, and stitching them into a noise probability map; determining the noise type of each region according to the noise probability map and dynamically selecting a corresponding processing mode, wherein the four noise-class probabilities are the probabilities of shadow, watermark, texture, and non-noise; and restoring low-quality stroke regions using a locally adaptive super-resolution network, wherein the locally adaptive super-resolution network takes a low-quality stroke block as input, introduces the handwriting manifold field as a structural prior, and adaptively adjusts the convolution kernel direction according to the stroke direction.
- 5. The handwritten character recognition method based on handwriting manifold field modeling according to claim 1, wherein the first recognition model is an optical character recognition model based on a Transformer architecture, and the visual features extracted by the first recognition model comprise local texture features, spatial structure features, stroke weight features, and global context features; and the second recognition model is a stroke sequence modeling network based on a Transformer architecture, wherein the training data of the second recognition model comprise the handwriting skeleton diagrams and handwriting manifold fields of real handwriting samples, together with injected pseudo-handwriting sequence data exhibiting thickness variation, jitter, and speed noise, and the training objectives comprise a character category classification loss, a track direction consistency loss, and a manifold consistency loss.
- 6. The handwritten character recognition method based on handwriting manifold field modeling according to claim 1, wherein generating a fusion weight through a dynamic confidence adjustment network, based on the local quality information of the input image and the feature difference between the first character probability distribution and the second character probability distribution, comprises: inputting the local quality information of the input image and the feature difference between the first character probability distribution and the second character probability distribution into the dynamic confidence adjustment network, wherein the local quality information comprises a local noise score, a sharpness score, and a character continuity score; processing the inputs through the convolution layer and the attention module of the dynamic confidence adjustment network and outputting local fusion factors; and computing a position-weighted average of the local fusion factors to obtain the fusion weight.
- 7. The handwritten character recognition method based on handwriting manifold field modeling according to claim 1, wherein after the final recognition result is obtained, the method further comprises: inputting the character sequence in the final recognition result, the confidence score of each character, and a domain label into a lightweight text error correction model; wherein the lightweight text error correction model performs masked language correction on character positions whose confidence is lower than a preset threshold, and retains the original recognition result at character positions whose confidence is higher than or equal to the preset threshold.
- 8. A handwritten character recognition system based on handwriting manifold field modeling, for performing the handwritten character recognition method based on handwriting manifold field modeling according to any one of claims 1 to 7, the system comprising: a handwriting form enhancement module, configured to perform handwriting form enhancement on an input image, generate a handwriting skeleton diagram representing a stroke center track, and generate a handwriting manifold field describing writing dynamics characteristics based on the handwriting skeleton diagram; a dual-channel recognition module, configured to perform visual feature recognition through a first recognition model based on the input image to obtain a first character probability distribution; and a dynamic probability fusion module, configured to generate a fusion weight through a dynamic confidence adjustment network based on the local quality information of the input image and the feature difference between the first character probability distribution and the second character probability distribution, and to perform weighted fusion of the first character probability distribution and the second character probability distribution using the fusion weight to obtain a final recognition result.
- 9. An electronic device, characterized by comprising a memory and a processor, wherein the memory is used for storing a program; and the processor, coupled to the memory, is configured to execute the program stored in the memory to implement the steps of the handwritten character recognition method based on handwriting manifold field modeling according to any one of claims 1 to 7.
- 10. A computer-readable storage medium storing a computer-readable program or instructions which, when executed by a processor, implement the steps of the handwritten character recognition method based on handwriting manifold field modeling according to any one of claims 1 to 7.
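Two of the local structure parameters recited in claim 3 can be illustrated with a short Python sketch: the handwriting width as the full width at half maximum (FWHM) of a transverse profile, and the direction change rate between adjacent skeleton points. This is a sketch under stated assumptions, not the claimed implementation. The profile is assumed to hold ink darkness (for example 255 minus the gray value) sampled across the stroke, the skeleton is assumed to be an ordered list of (x, y) points, and the function `speed_proxy` is a purely hypothetical mapping, since the claims state only that speed is inferred from the direction change rate and do not give a formula.

```python
import math
from typing import List, Sequence, Tuple


def fwhm(profile: Sequence[float]) -> float:
    """Full width at half maximum of a 1-D ink-darkness profile, in samples.

    The half level is measured between the profile minimum (baseline)
    and its maximum; crossings are linearly interpolated.
    """
    base, peak = min(profile), max(profile)
    if peak == base:
        return 0.0
    half = base + (peak - base) / 2.0
    k = max(range(len(profile)), key=lambda i: profile[i])  # peak index

    # Walk left from the peak until the profile drops below the half level.
    i = k
    while i > 0 and profile[i - 1] >= half:
        i -= 1
    if i > 0:
        left = i - (profile[i] - half) / (profile[i] - profile[i - 1])
    else:
        left = 0.0

    # Walk right from the peak in the same way.
    j = k
    last = len(profile) - 1
    while j < last and profile[j + 1] >= half:
        j += 1
    if j < last:
        right = j + (profile[j] - half) / (profile[j] - profile[j + 1])
    else:
        right = float(last)
    return right - left


def direction_change_rates(points: List[Tuple[float, float]]) -> List[float]:
    """Absolute turning angle (radians) at each interior skeleton point."""
    rates = []
    for (x0, y0), (x1, y1), (x2, y2) in zip(points, points[1:], points[2:]):
        a_in = math.atan2(y1 - y0, x1 - x0)
        a_out = math.atan2(y2 - y1, x2 - x1)
        # Wrap the difference into [-pi, pi) before taking the magnitude.
        d = (a_out - a_in + math.pi) % (2.0 * math.pi) - math.pi
        rates.append(abs(d))
    return rates


def speed_proxy(rate: float) -> float:
    """Hypothetical mapping: sharper turns are assumed to be written slower."""
    return 1.0 / (1.0 + rate)


if __name__ == "__main__":
    print(fwhm([0, 0, 10, 10, 10, 0, 0]))
    print(direction_change_rates([(0, 0), (1, 0), (2, 0), (2, 1)]))
```

Interpolating the half-level crossings gives sub-pixel width estimates, which matters for thin strokes only a few pixels wide.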
Description
Handwriting manifold field modeling-based handwritten character recognition method and system
Technical Field
Embodiments of the invention relate to the technical field of optical character recognition, and in particular to a handwritten character recognition method and system based on handwriting manifold field modeling.
Background
AIOCR (AI-based Optical Character Recognition), an optical character recognition approach that integrates artificial intelligence, is widely applied in scenarios such as bill processing, contract archiving, and homework grading in education; its core requirement is to convert handwritten text images into digital text so as to improve business efficiency. On the one hand, handwriting styles vary widely across writers (neat, scrawled, cursive, and so on), and differences in writing instruments and paper further complicate handwriting morphology; on the other hand, captured images are often degraded by environmental factors such as shadows, watermarks, textures, and stains, which hinders accurate recognition. Existing AIOCR handwritten character recognition schemes rely mainly on end-to-end deep learning models, in which a neural network is constructed to learn pixel-level image features directly and thereby complete character recognition.
In such schemes, the preprocessing stage uses fixed parameters: noise is removed by fixed-threshold filtering and sharpness is improved by a globally uniform super-resolution technique, with no differentiated handling of different noise types or of local image quality. The recognition and inference stage relies mainly on a single model, does not separate visual features from handwriting structure features, and has no multi-model collaborative inference mechanism. The semantic correction module in some schemes uses a general-purpose language model to rewrite the full text, without character-level confidence judgment or domain adaptation. Conventional AIOCR handwritten character recognition schemes therefore have the following problems: handwriting structure features such as stroke curvature, thickness, and pressure are difficult to capture effectively, so adaptability to cursive and scrawled characters is poor; fixed preprocessing cannot distinguish noise types such as shadows, watermarks, and textures, which easily causes loss of stroke information or residual noise; a single model lacks complementary inference capability for cases such as blurred character boundaries and local occlusion; and full-text-rewriting semantic correction tends to over-correct and does not incorporate domain knowledge, which affects the accuracy and reliability of the final result.
Disclosure of Invention
Embodiments of the invention provide a handwritten character recognition method and system based on handwriting manifold field modeling, to solve the above problems in the prior art.
In a first aspect, an embodiment of the invention provides a handwritten character recognition method based on handwriting manifold field modeling, comprising: performing handwriting form enhancement on an input image, generating a handwriting skeleton diagram representing a stroke center track, and generating a handwriting manifold field describing writing dynamics characteristics based on the handwriting skeleton diagram; performing visual feature recognition through a first recognition model based on the input image to obtain a first character probability distribution, and performing handwriting structure recognition through a second recognition model based on the handwriting skeleton diagram and the handwriting manifold field to obtain a second character probability distribution; and, based on local quality information of the input image and the feature difference between the first character probability distribution and the second character probability distribution, generating a fusion weight through a dynamic confidence adjustment network, and performing weighted fusion of the first character probability distribution and the second character probability distribution using the fusion weight to obtain a final recognition result. Preferably, the step of performing handwriting form enhancement on the input image to generate a handwriting skeleton diagram representing a stroke center track includes: performing multi-scale feature extraction on the input image using a deformable convolution network to obtain a multi-scale handwriting candidate region feature map; performing feature sparsification on the multi-scale handwriting candidate region feature map to obtain a pixel-level handwriting center line probability map; and optimizing the handwriting center line probability map based on a minimum cost path algorithm and combining with a direction consi