CN-121999375-A - Multi-mode large-model ground penetrating radar disease data intelligent interpretation method and device
Abstract
The invention discloses an intelligent interpretation method and device for ground penetrating radar disease data based on a multi-mode large model, belonging to the field of computational intelligence and information processing. The method runs on a computing device based on machine learning, artificial intelligence and a specific mathematical model, and integrates an inference model carrying ground-penetrating-radar-specific knowledge to process information. Underground disease data are collected with multi-frequency ground penetrating radar equipment, converted into a serialization format suited to the multi-mode large model, and then passed through detection, segmentation and image description tasks in sequence. A segmentation label mask is generated by a Bayesian optimization method based on the target detection result, an image description data set is constructed from the segmentation result, and finally the multi-mode large model interprets the data and generates an interpretation report. Through multi-task collaborative learning, the invention realizes fully automatic interpretation from disease detection to semantic description, improves interpretation efficiency and reliability, and provides an intelligent basis for road maintenance decisions.
Inventors
- YANG BINGXIN
- Hao Keran
- GUO CHEN
- JIN ZHAO
- HE ZHILI
- WANG SHUO
Assignees
- Chang'an University (长安大学)
Dates
- Publication Date
- 20260508
- Application Date
- 20260410
Claims (10)
- 1. The intelligent interpretation method for the ground penetrating radar disease data of the multi-mode large model is characterized by comprising the following steps: Step 1, performing differential data acquisition aiming at different types of underground diseases by using a plurality of ground penetrating radars with different center frequencies to acquire original radar data; Step 2, preprocessing the original radar data, cutting the preprocessed original radar data into images with preset sizes, marking disease targets in the images, and constructing a target detection data set comprising a plurality of image data; Step 3, converting the image data in the target detection data set into data adapting to a serialization format of the multi-mode large model; Step 4, performing first fine tuning on the multi-mode large model by utilizing the target detection data set; Step 5, according to the target detection result of the target detection data set by the multi-mode large model subjected to the first fine tuning, performing binarization processing on the image data by utilizing a threshold segmentation technology of adaptive threshold optimization, and generating a binarization label mask for a semantic segmentation task; Step 6, converting the image data and the corresponding binarization label mask into the serialization format, constructing a semantic segmentation data set, and performing second fine tuning on the multi-mode large model by using the semantic segmentation data set; Step 7, according to the semantic segmentation result of the multi-mode large model subjected to the second fine tuning, carrying out text description labeling on a disease target, constructing an image description data set, and carrying out third fine tuning on the multi-mode large model by utilizing the image description data set; Step 8, utilizing the multi-mode large model subjected to the three fine tunings to realize intelligent interpretation analysis of the underground diseases and generate a comprehensive interpretation report.
- 2. The intelligent interpretation method for the data of the ground penetrating radar diseases of the multi-mode large model according to claim 1, wherein the plurality of ground penetrating radars with different center frequencies comprise a ground penetrating radar with a center frequency of 400MHz and a ground penetrating radar with a center frequency of 900MHz, wherein the ground penetrating radar with the center frequency of 400MHz scans for cavity diseases, and the ground penetrating radar with the center frequency of 900MHz scans for crack diseases.
- 3. The intelligent interpretation method of the ground penetrating radar disease data of the multi-modal large model according to claim 1, wherein the step 2 includes: Step 2.1, sequentially performing direct wave removal, bandwidth filtering, signal gain and background filtering on the original radar data; Step 2.2, cutting the processed radar data into images with preset sizes; Step 2.3, marking disease targets in the images by using LabelImg software, and constructing the target detection data set.
- 4. The intelligent interpretation method of the ground penetrating radar disease data of the multi-modal large model according to claim 1, wherein in the step 3, the serialization format is the JSONL format, and the JSONL format includes three key fields, namely an image file name, a task prefix and task output content, wherein "image" represents the image file name, "prefix" represents the task prefix, and "suffix" represents the task output content; the task prefix comprises a target detection task prefix which is expressed as "< det >", a semantic segmentation task prefix which is expressed as "< segment >", and an image description task prefix which is expressed as "< cap >".
- 5. The intelligent interpretation method of ground penetrating radar disease data of a multi-modal large model according to claim 1, wherein the first fine tuning, the second fine tuning and the third fine tuning of the multi-modal large model all employ a parameter-efficient fine tuning strategy, the parameter-efficient fine tuning strategy comprising: during fine tuning, freezing all parameters of the visual encoder of the multi-mode large model and fine tuning the attention-related layers in the decoder of the multi-mode large model, wherein the attention-related layers comprise an attention output projection layer, a joint key-value projection layer and a query projection layer; the trainable layers use float32 precision, the frozen layers keep float16 precision, and the JAX framework is used to accelerate and optimize the fine tuning process.
- 6. The intelligent interpretation method of the ground penetrating radar disease data of the multi-modal large model according to claim 1, wherein the step 5 includes: Step 5.1, performing target detection on the image data in the target detection data set with the multi-mode large model subjected to the first fine tuning, outputting a target detection bounding box corresponding to each image data, and extracting a target area from the image data according to the target detection bounding box; Step 5.2, automatically optimizing key parameters of threshold segmentation by adopting a Bayesian optimization method based on a preset objective function and a Bayesian optimization termination condition; Step 5.3, generating a binarization mask of the target area by using an adaptive threshold segmentation method according to the key parameters obtained by optimization; Step 5.4, overlaying the binarization mask of the target area on the corresponding area in the image data, and taking the pixels of the non-target area in the image data as the background to obtain the binarization label mask.
- 7. The intelligent interpretation method of the ground penetrating radar disease data of the multi-modal large model according to claim 6, wherein the objective function is expressed as: ; in the formula, the terms are, in order: the objective function; the key parameters; the gray-scale contrast; the edge smoothness; the model connectivity; the weight of the gray-scale contrast; the weight of the edge smoothness; and the weight of the model connectivity.
- 8. The intelligent interpretation method of ground penetrating radar disease data of a multi-modal large model according to claim 1, wherein in the step 7, the text description includes a geometric feature description and a physical feature description; the geometric feature description comprises interpretation of the shape of the target hyperbola and characterization of the target position; the physical characteristic description comprises a signal distribution range, signal amplitude intensity, signal amplitude attenuation degree and the presence or absence of multiple waves.
- 9. The intelligent interpretation method of the ground penetrating radar disease data of the multi-modal large model according to claim 1, wherein the step 8 includes: Step 8.1, acquiring ground penetrating radar data to be detected; Step 8.2, sequentially performing direct wave removal, bandwidth filtering, signal gain and background filtering on the ground penetrating radar data to be detected, and then cutting the data into images with preset sizes; Step 8.3, converting the image data obtained in the step 8.2 into data adapting to the serialization format of the multi-mode large model; Step 8.4, inputting the data in the serialization format into the multi-mode large model subjected to the three fine tunings to perform intelligent interpretation analysis on the underground diseases, and generating the comprehensive interpretation report, wherein the comprehensive interpretation report comprises disease types, spatial positions, geometric feature descriptions and physical feature descriptions.
- 10. An intelligent interpretation device for the ground penetrating radar disease data of the multi-mode large model, characterized in that the device is adapted to the intelligent interpretation method for the ground penetrating radar disease data of the multi-mode large model according to any one of claims 1 to 9, and comprises: a data acquisition module, used for carrying out differential data acquisition aiming at different types of underground diseases by utilizing a plurality of ground penetrating radars with different center frequencies to acquire original radar data; a data preprocessing module, used for preprocessing the original radar data, cutting the preprocessed original radar data into images with preset sizes, marking disease targets in the images, and constructing a target detection data set comprising a plurality of image data; a model fine tuning module, used for performing first fine tuning on the multi-mode large model by utilizing the target detection data set; according to the target detection result of the target detection data set by the multi-mode large model subjected to the first fine tuning, performing binarization processing on the image data by utilizing a threshold segmentation technology of adaptive threshold optimization, and generating a binarization label mask for a semantic segmentation task; converting the image data and the corresponding binarization label mask into a serialization format, constructing a semantic segmentation data set, and performing second fine tuning on the multi-mode large model by using the semantic segmentation data set; performing text description labeling on a disease target according to the semantic segmentation result of the semantic segmentation data set by the multi-mode large model subjected to the second fine tuning, constructing an image description data set, and performing third fine tuning on the multi-mode large model by using the image description data set; and an interpretation analysis module, used for realizing intelligent interpretation analysis of the underground diseases by using the multi-mode large model subjected to the three fine tunings and generating a comprehensive interpretation report.
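The preprocessing chain of claim 3 can be illustrated for two of its operations, background filtering and cropping. This is a minimal sketch, not the patented implementation: it assumes a B-scan stored as rows of time samples by columns of traces, and it implements background filtering as mean-trace subtraction, which the claims do not specify. The function names `remove_background` and `crop_tiles` are hypothetical.

```python
from typing import List

BScan = List[List[float]]  # rows = time samples, columns = traces (assumed layout)


def remove_background(bscan: BScan) -> BScan:
    """Subtract each row's mean across traces (mean-trace background removal)."""
    out = []
    for row in bscan:
        mean = sum(row) / len(row)
        out.append([v - mean for v in row])
    return out


def crop_tiles(bscan: BScan, width: int) -> List[BScan]:
    """Cut the B-scan into non-overlapping tiles `width` traces wide.

    A trailing remainder narrower than `width` is dropped.
    """
    n_traces = len(bscan[0])
    tiles = []
    for start in range(0, n_traces - width + 1, width):
        tiles.append([row[start:start + width] for row in bscan])
    return tiles


# Toy data: a constant horizontal band (row 1) plus one localized reflection.
scan = [
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [5.0, 5.0, 5.0, 5.0, 5.0],
    [0.0, 0.0, 4.0, 0.0, 0.0],
]
filtered = remove_background(scan)
tiles = crop_tiles(filtered, width=2)
```

The constant band in row 1 is removed entirely, while the localized reflection in row 2 survives with a slightly reduced amplitude, which is the behavior background filtering is meant to provide.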
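The JSONL serialization of claim 4 can be sketched as follows. The three field names and the task prefixes are taken from the claim (prefix spacing copied verbatim); the file name, the contents of the "suffix" field and the helper `make_record` are hypothetical, since the claims do not define how detection boxes or descriptions are encoded.

```python
import json

# Task prefixes as written in claim 4 (copied verbatim, including spacing).
PREFIXES = {"detect": "< det >", "segment": "< segment >", "caption": "< cap >"}


def make_record(image: str, task: str, output: str) -> str:
    """Serialize one training sample as a single JSONL line."""
    record = {"image": image, "prefix": PREFIXES[task], "suffix": output}
    return json.dumps(record, ensure_ascii=False)


# Hypothetical file name and suffix contents, for illustration only.
lines = [
    make_record("scan_0001.png", "detect", "cavity 120 48 310 200"),
    make_record("scan_0001.png", "caption", "Hyperbolic reflection with strong amplitude."),
]
jsonl_text = "\n".join(lines) + "\n"
```

Each line is an independent JSON object, so the same image can appear once per task with a different prefix.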
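The layer selection of claim 5 can be sketched as a rule over parameter paths. The claim names the JAX framework; to stay dependency-free, this sketch uses plain Python and shows only the selection and precision logic. The parameter path names are hypothetical, and real names depend on the model checkpoint.

```python
from typing import Dict, Iterable

# Hypothetical flattened parameter paths; real names depend on the model checkpoint.
PARAM_PATHS = [
    "vision_encoder/block0/attn/q_proj",
    "vision_encoder/block0/mlp/fc1",
    "decoder/layer0/attn/q_proj",
    "decoder/layer0/attn/kv_proj",
    "decoder/layer0/attn/out_proj",
    "decoder/layer0/mlp/fc1",
]

# Decoder attention layers named in claim 5: query, joint key-value, output projection.
TRAINABLE_SUFFIXES = ("attn/q_proj", "attn/kv_proj", "attn/out_proj")


def is_trainable(path: str) -> bool:
    """Only decoder attention projections are trained; the visual encoder stays frozen."""
    return path.startswith("decoder/") and path.endswith(TRAINABLE_SUFFIXES)


def plan_dtypes(paths: Iterable[str]) -> Dict[str, str]:
    """Trainable parameters use float32; frozen parameters stay in float16."""
    return {p: ("float32" if is_trainable(p) else "float16") for p in paths}


plan = plan_dtypes(PARAM_PATHS)
trainable = sorted(p for p, d in plan.items() if d == "float32")
```

In a JAX setup, a boolean mask built by the same rule could be mapped over the parameter tree to cast dtypes and to suppress gradients for the frozen leaves.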
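Steps 5.1, 5.3 and 5.4 of claim 6 can be sketched on a toy image. This is an illustration under stated assumptions: the detection box is given by hand rather than produced by a model, boxes use an end-exclusive (row0, col0, row1, col1) convention, and the threshold is the region mean plus an `offset`, which stands in for the adaptive threshold method and its key parameters that the claims leave unspecified.

```python
from typing import List, Tuple

Image = List[List[int]]          # grayscale image, values 0..255
Box = Tuple[int, int, int, int]  # (row0, col0, row1, col1), end-exclusive (assumed convention)


def threshold_roi(image: Image, box: Box, offset: int) -> Image:
    """Binarize the boxed region: pixel -> 1 if it exceeds the region mean plus `offset`."""
    r0, c0, r1, c1 = box
    roi = [row[c0:c1] for row in image[r0:r1]]
    pixels = [v for row in roi for v in row]
    level = sum(pixels) / len(pixels) + offset
    return [[1 if v > level else 0 for v in row] for row in roi]


def paste_mask(image: Image, box: Box, roi_mask: Image) -> Image:
    """Place the region mask on an all-background (0) canvas of the full image size."""
    r0, c0, _, _ = box
    full = [[0] * len(image[0]) for _ in image]
    for i, row in enumerate(roi_mask):
        for j, v in enumerate(row):
            full[r0 + i][c0 + j] = v
    return full


image = [
    [10, 10, 10, 10, 10],
    [10, 200, 220, 10, 10],
    [10, 30, 210, 10, 250],  # the 250 pixel lies outside the detection box
    [10, 10, 10, 10, 10],
]
box = (1, 1, 3, 3)  # hypothetical detection box from the first fine-tuned model
label_mask = paste_mask(image, box, threshold_roi(image, box, offset=0))
```

Because thresholding is restricted to the detected region, the bright pixel outside the box stays background in the label mask, which matches the intent of step 5.4.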
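The objective function of claim 7 combines gray-scale contrast, edge smoothness and model connectivity with three weights, but its formula is not reproduced in this text. The sketch below therefore assumes a plain weighted sum and uses hypothetical definitions for all three terms. An exhaustive search over candidate thresholds stands in for the Bayesian optimization of step 5.2; it is not Bayesian optimization.

```python
from typing import List

Image = List[List[int]]


def contrast(img: Image, mask: Image) -> float:
    """Assumed term: normalized gap between foreground and background mean gray level."""
    fg = [v for r, m in zip(img, mask) for v, b in zip(r, m) if b]
    bg = [v for r, m in zip(img, mask) for v, b in zip(r, m) if not b]
    if not fg or not bg:
        return 0.0
    return abs(sum(fg) / len(fg) - sum(bg) / len(bg)) / 255.0


def smoothness(mask: Image) -> float:
    """Assumed term: fraction of adjacent pixel pairs that carry equal labels."""
    h, w = len(mask), len(mask[0])
    pairs = same = 0
    for i in range(h):
        for j in range(w):
            for ni, nj in ((i, j + 1), (i + 1, j)):
                if ni < h and nj < w:
                    pairs += 1
                    same += mask[i][j] == mask[ni][nj]
    return same / pairs if pairs else 1.0


def connectivity(mask: Image) -> float:
    """Assumed term: share of foreground pixels in the largest 4-connected component."""
    h, w = len(mask), len(mask[0])
    seen, best, total = set(), 0, sum(map(sum, mask))
    for i in range(h):
        for j in range(w):
            if mask[i][j] and (i, j) not in seen:
                stack, size = [(i, j)], 0
                seen.add((i, j))
                while stack:
                    y, x = stack.pop()
                    size += 1
                    for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and (ny, nx) not in seen:
                            seen.add((ny, nx))
                            stack.append((ny, nx))
                best = max(best, size)
    return best / total if total else 0.0


def objective(img: Image, threshold: int, weights=(0.5, 0.25, 0.25)) -> float:
    """Weighted sum of the three terms (the sum form and the weights are assumptions)."""
    mask = [[1 if v > threshold else 0 for v in row] for row in img]
    w_c, w_s, w_k = weights
    return w_c * contrast(img, mask) + w_s * smoothness(mask) + w_k * connectivity(mask)


region = [
    [20, 25, 30, 20],
    [25, 200, 210, 30],
    [20, 205, 220, 25],
    [30, 20, 25, 20],
]
# Exhaustive search over candidate thresholds, standing in for Bayesian optimization.
best_threshold = max(range(0, 256, 5), key=lambda t: objective(region, t))
```

On this toy region the search settles on a threshold that separates the bright 2x2 block from its dim surroundings. A Bayesian optimizer would pursue the same maximum with fewer objective evaluations, which matters when each evaluation is expensive.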
Description
Multi-mode large-model ground penetrating radar disease data intelligent interpretation method and device
Technical Field
The invention belongs to the technical field of computer intelligent computation and information processing, and particularly relates to an intelligent interpretation method and device for ground penetrating radar disease data based on a multi-mode large model.
Background
With the continuous expansion of China's transportation infrastructure, early identification and accurate treatment of underground road diseases have become key to guaranteeing public safety and prolonging the service life of facilities. Ground penetrating radar, a rapid and nondestructive underground detection technology, acquires a section image of the underground medium by analyzing radar wave reflection characteristics, and is widely applied in road disease detection, municipal engineering investigation and other fields. However, interpretation of ground penetrating radar data in the industry still relies mainly on manual interpretation, which is inefficient and strongly affected by personal experience, so that consistency and reproducibility are difficult to guarantee. In recent years, intelligent computing technologies, typified by machine learning and artificial intelligence, have opened new paths for automated interpretation. Most existing research adopts target detection or semantic segmentation models based on convolutional neural networks to achieve preliminary localization and contour recognition of diseases. Such methods are essentially single-task applications of a computing device based on a specific mathematical model to image processing; they focus on extracting and classifying low-level radar image features and do not form a collaborative intelligent system that integrates multi-level information. Although such systems can replace part of the manual work, their models are single in structure and isolated in task, and lack knowledge guidance and multi-modal information interaction. Furthermore, current methods are mostly limited to learning a mapping from data to labels, cannot effectively introduce industry prior knowledge and physical constraints, and do not construct a complete 'perception-reasoning-description' interpretation chain that conforms to human cognitive logic. In the prior art, joint inference among disease position, contour shape and physical parameters is difficult to realize, and structured description text meeting industry specifications cannot be generated automatically, which limits reliability and practicability in complex scenes.
Disclosure of Invention
In order to solve the problems in the prior art, the invention provides an intelligent interpretation method and device for ground penetrating radar disease data based on a multi-mode large model. By constructing a multi-task collaborative model integrating visual perception, knowledge reasoning and text generation, the invention not only performs feature learning and pattern recognition with machine learning and artificial intelligence techniques, but also introduces a model based on ground penetrating radar physical knowledge to encode and reason about the physical mechanism of diseases and industry specifications, so that high-precision and interpretable automatic interpretation from radar images to structured description information is realized on the basis of a specific computing model. This promotes the deep fusion and systematic application of intelligent computing technology in the field of professional detection.
The technical problems to be solved by the invention are realized by the following technical scheme: the invention provides an intelligent interpretation method for ground penetrating radar disease data of a multi-mode large model, which comprises the following steps: Step 1, performing differential data acquisition aiming at different types of underground diseases by using a plurality of ground penetrating radars with different center frequencies to acquire original radar data; Step 2, preprocessing the original radar data, cutting the preprocessed original radar data into images with preset sizes, marking disease targets in the images, and constructing a target detection data set comprising a plurality of image data; Step 3, converting the image data in the target detection data set into data adapting to a serialization format of the multi-mode large model; Step 4, performing first fine tuning on the multi-mode large model by utilizing the target detection data set; Step 5, according to the target detection result of the target detection data set by the multi-mode large model subjected to fine tuning for the first time, performing binarization processing on the image data by utilizing a threshold segmentation technology of adaptive threshold optim