CN-121982408-A - Water leakage detection method and system based on multi-modal Transformer
Abstract
The invention relates to a water leakage detection method and system based on a multi-modal Transformer, and belongs to the technical field of water leakage detection. The method comprises: synchronously acquiring a visible light image and an infrared thermal image of a region to be detected, so as to obtain complementary appearance texture and temperature distribution information; designing an independent encoding network for each of the two modalities and combining it with a deformable attention mechanism to achieve deep extraction of multi-modal features; constructing a three-branch decoder comprising a visible light branch, an infrared branch and a fusion branch, and dynamically fusing the dual-modality information with a loosely coupled cross-modal attention mechanism to locate and identify the water leakage region; and introducing an instance-level dynamic weight adjustment mechanism during model training, so that the network adaptively evaluates the reliability of each modality in different scenes. Through these steps, the method achieves accurate and robust automatic detection of water leakage regions.
Inventors
- ZHAO JIANHUA
- Zhai Baogang
- HUANG ZHONG
- Gao Jiumian
- WANG XINGXING
- YE SIYI
- XIA SHUJUN
Assignees
- 四川华能太平驿水电有限责任公司
- 四川九洲北斗导航与位置服务有限公司
Dates
- Publication Date
- 20260505
- Application Date
- 20260127
Claims (10)
- 1. A water leakage detection method based on a multi-modal Transformer, the method comprising: collecting a paired visible light image and infrared image of a region to be detected, and extracting and encoding features of the visible light image and of the infrared image respectively through a convolutional neural network model and a Transformer encoder to obtain corresponding modal encoding features; taking the modal encoding features as the input of a trained multi-modal Transformer decoder, wherein the multi-modal Transformer decoder adopts a three-branch architecture comprising a visible light branch, an infrared branch and a fusion branch, and fuses and decodes the modal encoding features based on a loosely coupled cross-modal attention mechanism to obtain a water leakage prediction result for the region to be detected; and, during training of the multi-modal Transformer decoder, for training data containing water leakage instance labels, calculating a matching loss between the prediction result output by each of the three branches and the label information, generating a dynamic weight coefficient for each branch according to the matching loss, weighting the matching loss of each branch by its dynamic weight coefficient to obtain a decoding loss for each branch, and optimizing parameters of the multi-modal Transformer decoder based on the decoding losses of the three branches.
- 2. The water leakage detection method based on a multi-modal Transformer according to claim 1, wherein extracting and encoding the features of the visible light image and the infrared image respectively through a convolutional neural network model and a Transformer encoder specifically comprises: adopting two independent convolutional neural network backbones to extract features of the visible light image and the infrared image respectively, to obtain a visible light multi-scale feature map and an infrared multi-scale feature map; inputting the visible light multi-scale feature map into a visible-light-specific Transformer encoder for encoding to obtain visible light encoding features; and inputting the infrared multi-scale feature map into an infrared-specific Transformer encoder for encoding to obtain infrared encoding features.
- 3. The water leakage detection method based on a multi-modal Transformer according to claim 2, wherein each modality-specific Transformer encoder uses a deformable self-attention mechanism to aggregate cross-scale information in the multi-scale feature map.
- 4. The water leakage detection method based on a multi-modal Transformer according to claim 1, wherein, during decoding, the multi-modal Transformer decoder uses preset content query vectors and positional encodings to interact with the input modal encoding features.
- 5. The water leakage detection method based on a multi-modal Transformer according to claim 4, wherein fusing the modal encoding features based on the loosely coupled cross-modal attention mechanism comprises: for each query reference point represented by a content query vector, generating, through network learning, a set of sampling offsets and modal attention weights for the visible light modality and for the infrared modality; sampling the visible light encoding features and the infrared encoding features at positions given by the query reference point and the corresponding sampling offsets to obtain feature values; and computing a weighted sum of the sampled feature values with the corresponding modal attention weights to obtain the fused feature representation of the query reference point.
- 6. The water leakage detection method based on a multi-modal Transformer according to claim 1, wherein calculating the matching loss between the prediction result output by each of the three branches and the label information specifically comprises: for each branch, calculating, according to the matching relation, the sum of the classification loss and the bounding box regression loss of the prediction result of that branch as the matching loss of that branch.
- 7. The water leakage detection method based on a multi-modal Transformer according to claim 6, wherein the classification loss is a Focal loss and the bounding box regression loss is a weighted sum of an L1 loss and a generalized intersection-over-union (GIoU) loss.
- 8. The water leakage detection method based on a multi-modal Transformer according to claim 1, wherein generating the dynamic weight coefficient for each branch according to the matching loss specifically comprises: for each water leakage instance, obtaining the matching loss values of the visible light branch, the infrared branch and the fusion branch, inputting the three values into a Softmax function, and calculating the dynamic weight coefficients of the three branches respectively.
- 9. The water leakage detection method based on a multi-modal Transformer according to claim 1, wherein the water leakage prediction result comprises a position bounding box of the water leakage region, a presence confidence and a water leakage intensity level.
- 10. A water leakage detection system based on a multi-modal Transformer, the system comprising: a data acquisition module for acquiring a paired visible light image and infrared image of a region to be detected; a multi-modal feature extraction module for extracting features of the visible light image and of the infrared image respectively through a convolutional neural network model; a modal feature encoding module for encoding the extracted visible light features and infrared features respectively to obtain corresponding modal encoding features; a multi-modal Transformer decoding module containing a trained multi-modal Transformer decoder, wherein the multi-modal Transformer decoder adopts a three-branch architecture comprising a visible light branch, an infrared branch and a fusion branch, takes the modal encoding features as input, and fuses and decodes the modal encoding features based on a loosely coupled cross-modal attention mechanism to obtain a water leakage prediction result for the region to be detected; and an instance-level modal balance optimization module for, during training of the multi-modal Transformer decoder and for training data containing water leakage instance labels, calculating a matching loss between the prediction result output by each of the three branches and the label information, generating a dynamic weight coefficient for each branch according to the matching loss, weighting the matching loss of each branch by its dynamic weight coefficient to obtain a decoding loss for each branch, and optimizing parameters of the multi-modal Transformer decoder based on the decoding losses of the three branches.
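The fusion step recited in claim 5 (sample each modality at the reference point plus learned offsets, then take a weighted sum) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the patent: the function names `bilinear_sample` and `fuse_reference_point`, the use of single-channel feature maps, bilinear interpolation with border clamping, and attention weights that are already normalised across both modalities. In a real model the offsets and weights would be predicted by learned layers from the content query.

```python
import math


def bilinear_sample(feature_map, x, y):
    """Bilinearly interpolate a 2-D map (list of rows) at pixel position (x, y).

    Positions outside the map are clamped to the border (an assumption).
    """
    height, width = len(feature_map), len(feature_map[0])
    x = min(max(x, 0.0), width - 1.0)
    y = min(max(y, 0.0), height - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    fx, fy = x - x0, y - y0
    top = feature_map[y0][x0] * (1 - fx) + feature_map[y0][x1] * fx
    bottom = feature_map[y1][x0] * (1 - fx) + feature_map[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def fuse_reference_point(ref_point, features, offsets, weights):
    """Fuse visible and infrared features for one query reference point.

    features: {modality: 2-D map}
    offsets:  {modality: [(dx, dy), ...]}  sampling offsets per modality
    weights:  {modality: [w, ...]}         modal attention weights, assumed
              to sum to 1 over all modalities and sampling points
    """
    ref_x, ref_y = ref_point
    fused = 0.0
    for modality, feature_map in features.items():
        for (dx, dy), weight in zip(offsets[modality], weights[modality]):
            # Each modality has its own offsets, so the two sampling
            # locations need not coincide.
            fused += weight * bilinear_sample(feature_map, ref_x + dx, ref_y + dy)
    return fused


if __name__ == "__main__":
    features = {
        "visible": [[0.0, 1.0], [2.0, 3.0]],
        "infrared": [[10.0, 10.0], [10.0, 10.0]],
    }
    offsets = {"visible": [(0.0, 0.0)], "infrared": [(0.25, -0.25)]}
    weights = {"visible": [0.5], "infrared": [0.5]}
    print(fuse_reference_point((0.5, 0.5), features, offsets, weights))  # 5.75
```

Because each modality receives its own offsets, a query can look at slightly different locations in the visible and infrared maps, which is one plausible reading of "loosely coupled" when the two images are not perfectly registered.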
Description
Water leakage detection method and system based on multi-modal Transformer
Technical Field
The invention relates to a water leakage detection method and system based on a multi-modal Transformer, and belongs to the technical field of water leakage detection.
Background
Water leakage detection is a key link in ensuring the safety of buildings, industrial production and the normal operation of infrastructure. Traditional visual inspection is inefficient, relies heavily on manual experience, and can hardly provide all-weather, large-scale, real-time and accurate monitoring. With the development of computer vision technology, image-based automatic water leakage detection methods have emerged and become a focus of research and application. The prior art includes detection schemes based on traditional machine vision and feature engineering. For example, Chinese patent application publication No. CN107833221A discloses a water leakage monitoring method based on multi-channel feature fusion and machine learning. In that scheme, motion change regions are first extracted from a video by inter-frame differencing, the regions are then divided into blocks, channel features such as gradient, HOG and color are manually designed and fused, and finally a trained SVM classifier judges whether each image block shows water leakage. Although that scheme introduces a degree of automation, its core depends on hand-crafted features whose expressive power and generalization are limited, so it is difficult to fully learn the deep patterns of the complex texture, morphology and temperature distribution of water leakage in different scenes.
In addition, that scheme relies only on single-modality visible light video, so its detection performance is easily degraded when the light is dim, the background is complex, or the initial temperature difference caused by the leak is small, and its robustness is insufficient. Meanwhile, the serial processing flow of motion detection, blocking, feature extraction and classification is cumbersome and sensitive to motion changes, and it copes poorly with static or slow leakage, so the overall detection efficiency and degree of intelligence need improvement. In view of the foregoing, there is a need for a water leakage detection method that automatically learns complementary multi-modal features and makes end-to-end intelligent decisions, overcoming the prior art's dependence on hand-crafted feature design and the environmental sensitivity caused by its reliance on a single modality, thereby improving detection accuracy, robustness and practicality.
Disclosure of Invention
In order to solve the above problems in the prior art, the invention provides a water leakage detection method and system based on a multi-modal Transformer.
The technical scheme of the invention is as follows. In one aspect, the invention provides a water leakage detection method based on a multi-modal Transformer, which comprises the following steps: collecting a paired visible light image and infrared image of a region to be detected, and extracting and encoding features of the visible light image and of the infrared image respectively through a convolutional neural network model and a Transformer encoder to obtain corresponding modal encoding features; taking the modal encoding features as the input of a trained multi-modal Transformer decoder, wherein the multi-modal Transformer decoder adopts a three-branch architecture comprising a visible light branch, an infrared branch and a fusion branch, and fuses and decodes the modal encoding features based on a loosely coupled cross-modal attention mechanism to obtain a water leakage prediction result for the region to be detected; and, during training of the multi-modal Transformer decoder, for training data containing water leakage instance labels, calculating a matching loss between the prediction result output by each of the three branches and the label information, generating a dynamic weight coefficient for each branch according to the matching loss, weighting the matching loss of each branch by its dynamic weight coefficient to obtain a decoding loss for each branch, and optimizing parameters of the multi-modal Transformer decoder based on the decoding losses of the three branches.
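The training-time weighting described above (matching loss per branch, dynamic weight per branch, weighted decoding loss) can be sketched for a single water leakage instance as follows. This is a hedged illustration, not the patented implementation: the names `BRANCHES`, `softmax` and `weighted_decoding_losses` are hypothetical, the Softmax form follows the wording of claim 8, and the raw loss values are fed to Softmax exactly as that claim is worded. Whether the values are negated, temperature-scaled, or detached from the gradient before weighting is not stated in this section and is left open here.

```python
import math

# Branch order is an illustrative convention, not specified by the source.
BRANCHES = ("visible", "infrared", "fusion")


def softmax(values):
    """Numerically stable Softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_decoding_losses(matching_losses):
    """Turn per-branch matching losses for ONE instance into decoding losses.

    matching_losses: {branch: matching loss of that branch for the instance}
    Returns (weights, decoding_losses), both keyed by branch.
    """
    values = [matching_losses[branch] for branch in BRANCHES]
    # Assumption: raw loss values go into Softmax unchanged.
    coefficients = softmax(values)
    weights = dict(zip(BRANCHES, coefficients))
    decoding = {b: weights[b] * matching_losses[b] for b in BRANCHES}
    return weights, decoding


if __name__ == "__main__":
    # Hypothetical loss values for one instance in a dim scene, where the
    # visible light branch matches the label poorly.
    instance_losses = {"visible": 2.0, "infrared": 0.4, "fusion": 0.6}
    weights, decoding = weighted_decoding_losses(instance_losses)
    total_loss = sum(decoding.values())
    print(weights)
    print(decoding)
    print(total_loss)
```

Under this reading, the branch with the largest matching loss on an instance receives the largest coefficient; a design that instead favours the most reliable branch would pass the negated losses to `softmax`.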
Preferably, extracting and encoding the features of the visible light image and the infrared image respectively through a convolutional neural network model and a Transformer encoder specifically comprises: adopting two independent convolutional neural network backbones to extract features of the visible light image and the infrared image respectively, to obtain a visible light multi-scale feature map and an infrar