CN-122024247-A - Power equipment fault case text detection and recognition method based on multi-scale convolution and attention mechanism

Abstract

The invention discloses a method for detecting and recognizing power equipment fault case text based on multi-scale convolution and an attention mechanism, relating to the technical fields of power equipment operation and maintenance and computer vision. The method comprises: constructing and training a text detection model that integrates a multi-scale convolution attention module in its backbone network; performing text region detection on a target image using the trained text detection model to generate text image blocks to be recognized; and inputting the text image blocks into an optical character recognition engine for recognition and outputting structured text information. The method effectively suppresses complex background interference and is more robust to messy handwriting and irregularly arranged text. The resulting end-to-end text information extraction system is more accurate and more reliable, and provides a powerful technical tool for knowledge mining and intelligent analysis in power equipment operation and maintenance.
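The step that turns a detected text region into a text image block is specified in the claims as size normalization and graying, optionally followed by histogram equalization. A minimal NumPy sketch of that step follows. It is an illustration only: the function names, the axis-aligned `(x0, y0, x1, y1)` box format, the target height of 32 pixels, the BT.601 luma weights and the nearest-neighbour resize are assumptions that the source does not state, and the affine correction of tilted regions is left out.

```python
import numpy as np

def to_gray(rgb):
    """Graying: luma-weighted sum of an (H, W, 3) uint8 image.
    ASSUMPTION: ITU-R BT.601 weights; the source does not name a formula."""
    weights = np.array([0.299, 0.587, 0.114])
    gray = rgb.astype(np.float64) @ weights
    return np.clip(np.rint(gray), 0, 255).astype(np.uint8)

def resize_to_height(gray, target_h=32):
    """Size normalization: nearest-neighbour resize to a fixed height,
    keeping the aspect ratio. ASSUMPTION: target_h=32 is illustrative."""
    h, w = gray.shape
    target_w = max(1, round(w * target_h / h))
    rows = np.minimum(np.arange(target_h) * h // target_h, h - 1)
    cols = np.minimum(np.arange(target_w) * w // target_w, w - 1)
    return gray[rows[:, None], cols[None, :]]

def equalize_hist(gray):
    """Histogram equalization of a uint8 grayscale image via its CDF."""
    hist = np.bincount(gray.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]
    denom = cdf[-1] - cdf_min
    if denom == 0:  # constant image: nothing to spread
        return gray.copy()
    lut = np.clip(np.rint((cdf - cdf_min) / denom * 255), 0, 255).astype(np.uint8)
    return lut[gray]

def crop_and_normalize(image, box, target_h=32, equalize=False):
    """Crop one detected region and standardize it for the OCR engine.
    ASSUMPTION: box is an axis-aligned (x0, y0, x1, y1) pixel rectangle."""
    x0, y0, x1, y1 = box
    patch = to_gray(image[y0:y1, x0:x1])
    patch = resize_to_height(patch, target_h)
    return equalize_hist(patch) if equalize else patch
```

A call such as `crop_and_normalize(image, (10, 0, 90, 40))` yields a single-channel block of height 32 whose width follows the aspect ratio of the crop.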

Inventors

  • Tell Songjiang card day
  • ZHANG YONGKANG
  • WU JIAHUI
  • ZHANG ZIWEI
  • XIE LIRONG
  • Yili Ham Almemti
  • LI ZHENEN
  • MA XIAOJING
  • ZHANG ZHONGWEN

Assignees

  • Xinjiang University (新疆大学)
  • Sichuan Energy Internet Research Institute, Tsinghua University (清华四川能源互联网研究院)

Dates

Publication Date
20260512
Application Date
20260123

Claims (10)

  1. A method for detecting and recognizing power equipment fault case text based on multi-scale convolution and an attention mechanism, characterized by comprising the following steps: S1, model construction and training: constructing and training a text detection model, wherein a multi-scale convolution attention module is integrated in the backbone network of the model to enhance the extraction of multi-scale text features and the perception of spatial position in power equipment fault case images; S2, text detection and preprocessing: performing text region detection on a target image using the trained text detection model, and standardizing the detected text regions to generate text image blocks to be recognized; and S3, text recognition and output: inputting the text image blocks into an optical character recognition engine for recognition, and outputting structured text information.
  2. The method according to claim 1, wherein the multi-scale convolution attention module in step S1 consists of a multi-scale convolution block, a channel attention block and a spatial attention block connected in sequence.
  3. The method according to claim 2, wherein the multi-scale convolution block performs the following steps: expanding the number of channels of the input feature map by point convolution; applying a plurality of parallel depthwise separable convolution layers with different kernel sizes to the expanded feature map to capture multi-scale context information; channel-shuffling the output features of the plurality of depthwise separable convolution layers; and restoring the number of channels by point convolution.
  4. The method according to claim 3, wherein the channel attention block performs the following steps: applying global average pooling and global max pooling to the input feature map to obtain two channel feature vectors; feeding each of the two channel feature vectors into a multi-layer perceptron formed by two convolution layers; adding the outputs of the two multi-layer perceptrons and generating a channel attention weight matrix through a Sigmoid activation function; and multiplying the channel attention weight matrix element by element with the original input feature map.
  5. The method according to claim 4, wherein the spatial attention block performs the following steps: applying average pooling and max pooling to the input feature map along the channel dimension to obtain two two-dimensional spatial feature maps; concatenating the two two-dimensional spatial feature maps; processing the concatenated feature map with a convolution layer having a large kernel to generate a spatial feature map; passing the spatial feature map through a Sigmoid activation function to generate a spatial attention weight matrix; and multiplying the spatial attention weight matrix element by element with the original input feature map.
  6. The method according to claim 5, wherein the standardization in step S2 comprises size normalization and graying.
  7. The method according to claim 6, wherein step S2 further comprises a preprocessing and enhancement operation after cropping, specifically: applying an affine transformation to text regions that are not horizontal so that they are arranged horizontally, or applying histogram equalization to the image blocks.
  8. A system for detecting and recognizing power equipment fault case text based on multi-scale convolution and an attention mechanism, implementing the method according to any one of claims 1-7 and characterized by comprising: a model construction module for constructing and training a text detection model, wherein a multi-scale convolution attention module is integrated in the backbone network of the model to enhance the extraction of multi-scale text features and the perception of spatial position in power equipment fault case images; a text detection module for performing text region detection on a target image using the trained text detection model and standardizing the detected text regions to generate text image blocks to be recognized; and a text recognition module for inputting the text image blocks into an optical character recognition engine for recognition and outputting structured text information.
  9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method according to any one of claims 1-7.
  10. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1-7.
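Claims 2 to 5 can be read as a forward pass over a single feature map. The following is a minimal NumPy sketch of that reading, not the patented implementation. The function and parameter names, the random weights, the branch kernel sizes (1, 3, 5), the 7×7 spatial kernel, the ReLU inside the perceptron, sharing one perceptron between the two pooled vectors, summing the parallel branches, and modelling each depthwise separable layer by its depthwise part only are all assumptions that the claims leave open.

```python
import numpy as np

def sigmoid(x):
    # Numerically stable form of 1 / (1 + exp(-x)).
    return 0.5 * (1.0 + np.tanh(0.5 * x))

def depthwise_conv(x, kernel):
    """Per-channel 'same' cross-correlation. x: (C, H, W); kernel: (C, k, k), k odd."""
    k = kernel.shape[-1]
    pad = k // 2
    padded = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    windows = np.lib.stride_tricks.sliding_window_view(padded, (k, k), axis=(1, 2))
    return np.einsum("chwij,cij->chw", windows, kernel)

def channel_shuffle(x, groups):
    """Interleave channels across groups; the channel count must divide by groups."""
    c, h, w = x.shape
    return x.reshape(groups, c // groups, h, w).transpose(1, 0, 2, 3).reshape(c, h, w)

def multi_scale_block(x, w_expand, dw_kernels, w_reduce, groups=2):
    """Claim 3: point conv (expand) -> parallel branches -> shuffle -> point conv (restore)."""
    expanded = np.einsum("oc,chw->ohw", w_expand, x)
    # ASSUMPTION: branch outputs are summed; the claim does not say how they are merged.
    merged = sum(depthwise_conv(expanded, k) for k in dw_kernels)
    return np.einsum("oc,chw->ohw", w_reduce, channel_shuffle(merged, groups))

def channel_attention(x, w1, w2):
    """Claim 4: global avg/max pooling -> two-layer perceptron -> add -> sigmoid -> rescale."""
    def mlp(v):
        # A 1x1 convolution on a pooled vector is a matrix product. ASSUMPTION: ReLU between layers.
        return w2 @ np.maximum(w1 @ v, 0.0)
    # ASSUMPTION: both pooled vectors pass through the same perceptron weights.
    weights = sigmoid(mlp(x.mean(axis=(1, 2))) + mlp(x.max(axis=(1, 2))))
    return x * weights[:, None, None]

def spatial_attention(x, kernel):
    """Claim 5: channel-wise avg/max maps -> concatenate -> large-kernel conv -> sigmoid -> rescale."""
    pooled = np.stack([x.mean(axis=0), x.max(axis=0)])   # (2, H, W)
    smap = depthwise_conv(pooled, kernel).sum(axis=0)    # 2-in, 1-out convolution
    return x * sigmoid(smap)[None, :, :]

def mscam(x, p):
    """Claim 2: the three blocks connected in sequence."""
    y = multi_scale_block(x, p["w_expand"], p["dw_kernels"], p["w_reduce"])
    y = channel_attention(y, p["w1"], p["w2"])
    return spatial_attention(y, p["spatial_kernel"])

def random_params(channels, expand=2, reduction=2, seed=0):
    """Hypothetical untrained weights, only to exercise the shapes."""
    rng = np.random.default_rng(seed)
    wide = channels * expand
    return {
        "w_expand": rng.normal(size=(wide, channels)),
        "dw_kernels": [rng.normal(size=(wide, k, k)) for k in (1, 3, 5)],
        "w_reduce": rng.normal(size=(channels, wide)),
        "w1": rng.normal(size=(channels // reduction, channels)),
        "w2": rng.normal(size=(channels, channels // reduction)),
        "spatial_kernel": rng.normal(size=(2, 7, 7)),
    }
```

Calling `mscam(x, random_params(4))` on a feature map `x` of shape `(4, H, W)` returns a map of the same shape. In a trained model the weights would be learned and the module would sit inside the backbone network of the detector.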

Description

Power equipment fault case text detection and recognition method based on multi-scale convolution and attention mechanism

Technical Field

The invention relates to the technical fields of power equipment operation and maintenance and computer vision, and in particular to a method for detecting and recognizing power equipment fault case text based on multi-scale convolution and an attention mechanism.

Background

The daily operation, maintenance and fault handling of a power system generates and accumulates a large amount of unstructured text data, such as fault case records, inspection reports and operation logs. This data usually exists as images or scans and contains valuable information on equipment operating state, fault symptoms and handling experience. Detecting and recognizing it automatically and with high precision, and converting it into structured data that a computer can process and analyze directly, is therefore of great significance for building power equipment knowledge graphs, for intelligent fault diagnosis and prediction, and for raising the level of intelligence in operation and maintenance management.
At present, text detection and recognition in natural scenes has made remarkable progress, but applying it directly to power equipment fault case text still faces several specific challenges. First, power text contains many technical terms, model codes, subscript symbols and small-size characters, which places very high demands on a model's multi-scale feature extraction capability. Second, because of the limits of field recording conditions, fault case text images often suffer from messy handwriting, irregular layout, complex background interference, and even distortion and occlusion, so conventional text detection models struggle to localize text accurately and recognition engines are prone to misjudgment. Third, existing general-purpose text detection models (for example, models based on a standard ResNet backbone network) have difficulty achieving both accurate detection of text of different sizes and robust handling of complex layouts, which limits the overall accuracy and practicality of the final text information extraction. Finally, existing technical schemes stop at the algorithm level and lack a visual, easy-to-operate platform for power operation and maintenance personnel. Operators usually need some programming ability to call the algorithms or dedicated OCR tools through an interface, and the interaction involves multiple cumbersome steps, so these schemes are difficult to apply in field operation. A technical scheme that solves these problems is urgently needed.

Disclosure of Invention

This section is intended to outline some aspects of embodiments of the application and to briefly introduce some preferred embodiments.
Some simplifications or omissions may be made in this section, as well as in the description and the title of the application, and these may not be used to limit the scope of the application.

The invention is proposed in view of the problems of existing methods for detecting and recognizing power equipment fault case text.

Therefore, the invention aims to provide a method for detecting and recognizing power equipment fault case text based on multi-scale convolution and an attention mechanism that addresses the following problems of existing text detection and recognition technology. First, the feature extraction structure of general-purpose models has difficulty coping with the multi-scale differences among technical terms, model codes and small-size characters in power text. Second, these models lack a targeted attention mechanism for suppressing complex background interference and perceiving spatial geometric transformations of text, so they are not robust to messy handwriting, irregular layout, tilt and distortion. Third, existing schemes are fragmented and do not form an end-to-end automated flow for the power field that integrates detection, preprocessing and recognition, so overall processing efficiency is low, the barrier to use is high, and the practical requirements of power operation and maintenance sites for high-precision, high-reliability text information extraction are hard to meet.

In order to solve the technical problems, the invention provides the following technical scheme: The embodiment of the invention provides a method for detecting and id