CN-121982609-A - Patrol agent collaborative sensing system based on semantic driving
Abstract
The invention discloses a patrol agent cooperative sensing system based on semantic driving, relating to the technical field of intelligent patrol, and comprising a prototype mapping module, a semantic map module, a coupling mutual transmission module, a fusion and recalibration module, and a diagnosis view finding module. The prototype mapping module collects patrol target category information and monitoring indexes of the patrol targets and, through semantic conversion, generates a patrol task semantic prototype set and a patrol task semantic mapping table. The semantic map module collects patrol image data and patrol video data through the patrol agent, performs pixel-level visual feature matching based on the patrol task semantic prototype set, and generates a semantic confidence map and a semantic request map. The coupling mutual transmission module executes sparse selection and directional mutual transmission of semantic supply-demand coupling under the joint constraint of the semantic confidence map and the semantic request map, generating a multi-source sparse semantic feature queue. The fusion and recalibration module performs position-level semantic attention fusion and measurable sensitivity recalibration on the multi-source sparse semantic feature queue, generating a fusion probability map and a fusion instance table. The diagnosis view finding module performs equipment health state diagnosis and fault detection according to the fusion instance table, generating a diagnosis list and a patrol view indication.
Inventors
- GAO SHENGZHE
- YAN YUJIA
- ZHAO HENGXING
- YUAN ZHI
Assignees
- 西安圣瞳科技有限公司
Dates
- Publication Date: 2026-05-05
- Application Date: 2026-01-26
Claims (10)
- 1. A patrol agent cooperative sensing system based on semantic driving, characterized by comprising: a prototype mapping module, used for collecting patrol target category information and monitoring indexes of the patrol targets, and generating a patrol task semantic prototype set and a patrol task semantic mapping table through semantic conversion; a semantic map module, used for acquiring patrol image data and patrol video data through a patrol agent, and performing pixel-level visual feature matching based on the patrol task semantic prototype set to generate a semantic confidence map and a semantic request map; a coupling mutual transmission module, used for executing sparse selection and directional mutual transmission of semantic supply-demand coupling under the joint constraint of the semantic confidence map and the semantic request map, to generate a multi-source sparse semantic feature queue; a fusion and recalibration module, used for performing position-level semantic attention fusion and measurable sensitivity recalibration on the multi-source sparse semantic feature queue, to generate a fusion probability map and a fusion instance table; and a diagnosis view finding module, used for performing equipment health state diagnosis and fault detection according to the fusion instance table, and generating a diagnosis list and a patrol view indication.
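The five modules of claim 1 form a linear pipeline: prototypes feed the semantic maps, the maps drive the coupling transfer, its output is fused and recalibrated, and the instance table drives diagnosis. As an illustrative sketch only (the patent specifies no implementation; every function name and signature here is hypothetical), the data flow can be expressed as:

```python
from typing import Any, Callable, Tuple

def run_pipeline(raw_targets: Any,
                 prototype_mapping: Callable[[Any], Tuple[Any, Any]],
                 semantic_map: Callable[[Any], Tuple[Any, Any]],
                 coupling_transfer: Callable[[Any, Any], Any],
                 fusion_recalibration: Callable[[Any], Tuple[Any, Any]],
                 diagnosis_viewing: Callable[[Any], Any]) -> Any:
    """Wire claim 1's five modules as a pipeline (hypothetical interfaces).

    Each stage mirrors one module of the claim; the callables are
    placeholders for the actual module implementations.
    """
    # prototype mapping: category info + indexes -> prototype set + mapping table
    prototypes, mapping_table = prototype_mapping(raw_targets)
    # semantic map: pixel-level matching -> confidence map + request map
    conf_map, req_map = semantic_map(prototypes)
    # coupling mutual transmission: supply-demand coupling -> sparse feature queue
    feature_queue = coupling_transfer(conf_map, req_map)
    # fusion and recalibration: queue -> fusion probability map + instance table
    prob_map, instance_table = fusion_recalibration(feature_queue)
    # diagnosis view finding: instance table -> diagnosis list / view indication
    return diagnosis_viewing(instance_table)
```

Each stage only consumes what the previous stage produced, which matches the claim's one-directional module dependencies.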
- 2. The patrol agent cooperative sensing system based on semantic driving of claim 1, wherein the patrol target category information comprises equipment component class, defect type, surface quality problem and structural problem; each piece of patrol target category information comprises a unique category identifier and a complete text description; after collection is completed, the patrol target category information is standardized and integrated according to the unique category identifier to generate a patrol target category information set, and the monitoring indexes of the patrol targets are collected one by one based on the patrol target category information set; and a one-to-one correspondence between the patrol target category information and the patrol target monitoring indexes is established, generating a set corresponding to the patrol target categories and the patrol target monitoring indexes.
- 3. The patrol agent cooperative sensing system based on semantic driving of claim 2, wherein the semantic conversion comprises: encoding the text description of each piece of patrol target category information using a text encoder to generate a text semantic embedding vector; extracting samples corresponding to the patrol target category information from historical patrol image data and historical patrol video data, extracting visual features with a visual encoder, and generating a visual embedding vector; fusing the text semantic embedding vector and the visual embedding vector to generate a patrol target category semantic prototype vector; performing, on the monitoring indexes of the patrol targets, the same text encoding and visual encoding as for the patrol target category information, and fusing to generate a monitoring index semantic prototype vector; and summarizing the patrol target category semantic prototype vectors and the monitoring index semantic prototype vectors to generate the patrol task semantic prototype set, and matching according to the one-to-one correspondence in the set of patrol target categories and monitoring indexes to generate the patrol task semantic mapping table.
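Claim 3 does not specify how the text and visual embeddings are fused into a prototype. A minimal sketch, assuming a weighted average followed by L2 normalization (both the weighting scheme and the normalization are assumptions, not stated in the claim):

```python
import numpy as np

def fuse_prototype(text_emb: np.ndarray, vis_emb: np.ndarray,
                   alpha: float = 0.5) -> np.ndarray:
    """Fuse one text semantic embedding and one visual embedding into an
    L2-normalized semantic prototype vector (assumed: weighted average)."""
    proto = alpha * text_emb + (1.0 - alpha) * vis_emb
    norm = np.linalg.norm(proto)
    return proto / norm if norm > 0 else proto

def build_prototype_set(text_embs, vis_embs) -> np.ndarray:
    """Stack one fused prototype per patrol-target category / index."""
    return np.stack([fuse_prototype(t, v) for t, v in zip(text_embs, vis_embs)])

# toy example: 3 categories, 4-dimensional embeddings
rng = np.random.default_rng(0)
protos = build_prototype_set(rng.normal(size=(3, 4)), rng.normal(size=(3, 4)))
```

Normalizing the prototypes makes the cosine-similarity matching of claim 4 reduce to a dot product.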
- 4. The patrol agent cooperative sensing system based on semantic driving of claim 3, wherein performing pixel-level visual feature matching based on the patrol task semantic prototype set to generate the semantic confidence map and the semantic request map comprises: sequentially performing denoising, resolution standardization and contrast enhancement on the patrol image data and the patrol video data under a unified preprocessing specification; computing visual embedding vectors for the preprocessed pixels in the patrol image data and the patrol video data, taking the patrol task semantic prototype set as the semantic reference, computing cosine similarity between each pixel's visual embedding vector and the semantic prototype vectors in the patrol task semantic prototype set, and taking the maximum value to obtain the semantic similarity; performing optical character recognition confidence normalization on the pixel neighborhood to obtain a character readability score, and monotonically mapping a spatial gradient sharpness measure to obtain an imaging ambiguity score; performing forward accumulation of the semantic similarity and the character readability score, with ambiguity suppression by the imaging ambiguity score, to generate a comprehensive semantic confidence; and filling the comprehensive semantic confidence into a two-dimensional coordinate grid according to pixel coordinate positions to generate the semantic confidence map, and filling the same pixel coordinate positions with one minus the comprehensive semantic confidence to generate the semantic request map.
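The per-pixel computation of claim 4 can be sketched as follows. This is an illustrative reading only: the equal accumulation weights, the clipping, and the multiplicative ambiguity suppression are assumptions; the complement rule for the request map follows the claim.

```python
import numpy as np

def semantic_maps(pixel_embs: np.ndarray, prototypes: np.ndarray,
                  readability: np.ndarray, ambiguity: np.ndarray):
    """pixel_embs: (H, W, D) per-pixel visual embeddings, L2-normalized
    prototypes:  (K, D) task semantic prototype vectors, L2-normalized
    readability: (H, W) OCR-based character readability score in [0, 1]
    ambiguity:   (H, W) imaging ambiguity score in [0, 1]
    Returns (confidence_map, request_map), each (H, W) in [0, 1]."""
    # cosine similarity of every pixel to every prototype; keep the maximum
    sim = np.einsum('hwd,kd->hwk', pixel_embs, prototypes).max(axis=-1)
    sim = np.clip(sim, 0.0, 1.0)
    # forward accumulation of similarity and readability, damped by ambiguity
    confidence = (0.5 * sim + 0.5 * readability) * (1.0 - ambiguity)
    request = 1.0 - confidence  # demand is the complement of supply
    return confidence, request
```

A pixel that matches a prototype well, sits in readable text, and is sharply imaged gets high confidence and therefore a low request value, steering other agents away from it.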
- 5. The patrol agent cooperative sensing system based on semantic driving of claim 4, wherein the sparse selection and directional mutual transmission of semantic supply-demand coupling are performed as follows: determining supply-strong regions and demand-strong regions based on the semantic confidence map and the semantic request map; dividing candidate regions according to fixed grids and connectivity rules, and calculating the region confidence score, region request mean and transmission cost of each candidate region; performing value evaluation with a single score according to the region confidence score, region request mean and transmission cost, and calculating the benefit score of each candidate region; sorting the benefit scores from high to low and performing secondary screening to obtain the adopted candidate regions; cutting each adopted candidate region into minimum identifiable image blocks, binding each minimum identifiable image block with a compact embedding and a preliminary semantic category judgment, and attaching a timestamp, a pixel coordinate range and a source patrol agent identifier; for each adopted candidate region, intercepting the sub-region with the same pixel coordinate range on the semantic request map of every patrol agent other than the source patrol agent, and calculating its region request mean; forming a target patrol agent set from the patrol agents whose region request mean is greater than zero, sorting them from high to low by region request mean, and listing them in order, together with the region request mean and the upper limit of total transmission volume, into a transmission list according to the time window bandwidth budget; and performing directional mutual transmission according to the transmission list, generating a transmission record.
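Claim 5's single-score value evaluation is not given a formula. One plausible sketch, assuming a linear combination that rewards supply strength and demand strength while penalizing transmission cost (the weights and the linear form are hypothetical):

```python
from dataclasses import dataclass

@dataclass
class Region:
    region_id: str
    conf_score: float   # region confidence score (supply strength)
    req_mean: float     # region request mean (demand strength)
    cost: float         # transmission cost

def benefit(r: Region, w_conf: float = 1.0, w_req: float = 1.0,
            w_cost: float = 0.5) -> float:
    """Single-score value evaluation (assumed linear): reward supply and
    demand, penalize transmission cost."""
    return w_conf * r.conf_score + w_req * r.req_mean - w_cost * r.cost

def select_regions(regions, top_k: int = 2):
    """Sort benefit scores high to low, then keep top_k as the
    secondary screening step."""
    return sorted(regions, key=benefit, reverse=True)[:top_k]
```

Only the selected regions are cut into minimum identifiable image blocks and queued for directional transmission, which keeps the exchanged payload sparse.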
- 6. The system of claim 5, wherein the multi-source sparse semantic feature queue comprises the minimum identifiable image blocks, compact embeddings, preliminary semantic categories and transmission records.
- 7. The patrol agent cooperative sensing system based on semantic driving of claim 6, wherein the position-level semantic attention fusion and measurable sensitivity recalibration are performed as follows: spatially aligning the multi-source sparse semantic feature queue on a unified reference pixel grid; after spatial alignment, establishing a source patrol agent probability map and source patrol agent description information for each source patrol agent; generating a candidate instance mask based on the minimum identifiable image blocks in the multi-source sparse semantic feature queue; calculating each source patrol agent's average semantic confidence within the candidate instance mask, and generating prior attention weights by normalization; weighting and synthesizing the source patrol agent probability maps according to the prior attention weights to generate an initial fusion probability expression, and measuring the measurable sensitivity on the candidate instance mask pixel set by a source patrol agent sensitivity check; and contracting and normalizing the prior attention weights according to the measurable sensitivity to obtain recalibrated attention weights, and calculating the fusion probability value according to the recalibrated attention weights.
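The recalibration of claim 7 can be sketched as: normalize the per-source average confidences into prior attention weights, contract them by the measured sensitivities, renormalize, and take the weighted sum of the source probability maps. The contraction-by-multiplication rule is an assumption; the claim only says the weights are contracted and normalized according to sensitivity.

```python
import numpy as np

def fuse_probability(prob_maps: np.ndarray, conf_means: np.ndarray,
                     sensitivities: np.ndarray):
    """prob_maps:    (S, H, W) probability maps from S source patrol agents
    conf_means:     (S,) average semantic confidence inside the candidate mask
    sensitivities:  (S,) measurable sensitivity per source, in (0, 1]
    Returns (fused_map (H, W), recalibrated_weights (S,))."""
    prior = conf_means / conf_means.sum()             # prior attention weights
    shrunk = prior * sensitivities                    # contract by sensitivity
    weights = shrunk / shrunk.sum()                   # renormalize to sum to 1
    fused = np.tensordot(weights, prob_maps, axes=1)  # weighted synthesis
    return fused, weights
```

A source whose evidence turns out to be insensitive on the mask pixels loses attention weight, so the fused probability leans on the more reliable agents.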
- 8. The system of claim 7, wherein generating the fusion probability map and the fusion instance table comprises filling the fusion probability values into the fusion probability map according to the pixel coordinates and the reference pixel grid, and performing binarization, connectivity analysis and geometric refinement after the fusion probability map is generated, to form the fusion instance table.
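Binarization plus connectivity analysis (claim 8) can be sketched with a 4-connected flood fill. The threshold value, the connectivity order, the minimum-area filter and the record fields are illustrative assumptions:

```python
import numpy as np
from collections import deque

def instance_table(prob_map: np.ndarray, threshold: float = 0.5,
                   min_area: int = 1):
    """Binarize the fusion probability map, run 4-connected component
    analysis, and emit one instance record (bounding box, area) per blob."""
    mask = prob_map >= threshold            # binarization
    seen = np.zeros_like(mask, dtype=bool)
    h, w = mask.shape
    table = []
    for i in range(h):
        for j in range(w):
            if mask[i, j] and not seen[i, j]:
                pixels, q = [], deque([(i, j)])  # BFS over one component
                seen[i, j] = True
                while q:
                    y, x = q.popleft()
                    pixels.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny, nx] and not seen[ny, nx]):
                            seen[ny, nx] = True
                            q.append((ny, nx))
                if len(pixels) >= min_area:     # crude geometric refinement
                    ys = [p[0] for p in pixels]
                    xs = [p[1] for p in pixels]
                    table.append({"bbox": (min(ys), min(xs), max(ys), max(xs)),
                                  "area": len(pixels)})
    return table
```

Each record in the resulting table corresponds to one fused instance; downstream diagnosis (claim 9) tracks these records across time indexes.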
- 9. The system of claim 8, wherein performing the equipment health state diagnosis and fault detection comprises establishing instance time sequences based on the fusion instance tables at adjacent time indexes, calculating normalized diagnosis indexes and their uncertainty, and generating category labels and the diagnosis list according to the comparison results of the diagnosis indexes.
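Claim 9 gives no formulas for the normalized diagnosis index or its uncertainty. The sketch below assumes a growth-ratio index over one instance's area time series and a spread-of-changes uncertainty; both definitions, and the warning threshold, are purely illustrative:

```python
def diagnose(areas, warn: float = 1.5):
    """areas: pixel areas of the same fused instance at adjacent time
    indexes (e.g. a defect growing across patrols).
    Assumed definitions: normalized index = growth relative to the first
    observation; uncertainty = spread of the step-to-step changes."""
    base = areas[0]
    index = areas[-1] / base
    steps = [abs(b - a) / base for a, b in zip(areas, areas[1:])]
    uncertainty = max(steps) - min(steps) if len(steps) > 1 else 0.0
    label = "fault" if index >= warn else "normal"  # comparison -> class label
    return {"index": index, "uncertainty": uncertainty, "label": label}
```

A stable instance yields an index near 1 and a "normal" label; a fast-growing one crosses the threshold and enters the diagnosis list as a fault.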
- 10. The system of claim 9, wherein the patrol view indication comprises generating, for each patrol agent, a corresponding patrol view indication on the patrol agent's candidate view set based on the semantic request map and the diagnosis list.
Description
Patrol agent collaborative sensing system based on semantic driving

Technical Field

The invention relates to the technical field of intelligent patrol, in particular to a patrol agent cooperative sensing system based on semantic driving.

Background

In the field of industrial equipment operation, maintenance and inspection, patrol agents are widely applied to tasks such as equipment component state detection, defect identification and surface quality evaluation. The conventional method generally relies on a preset detection model and task list: the patrol agent acquires field image and video data, performs feature extraction and target identification after unified preprocessing, and completes state judgment by combining the category information in the task work order with detection standards. During information acquisition and processing, the conventional method mainly relies on static matching of equipment category, defect type and related monitoring indexes, and identifies and records the running state of the equipment through visual feature comparison, template detection and rule-based judgment, forming a patrol report to support maintenance decisions. However, the conventional method is limited in multi-agent collaborative patrol and cross-source information fusion. On the one hand, data matching is based on the local features and static rules of a single agent; unified coding and high-precision alignment of multi-source semantic information are lacking, so real-time semantic intercommunication and supplementation among multiple agents are limited. On the other hand, in task driving and framing guidance, the conventional method can hardly couple semantic demand intensity with evidence information intensity for dynamic view angle planning, so some high-value information is not acquired timely and accurately.
Disclosure of Invention

The present invention has been made in view of the above problems in the prior art. The invention therefore provides a patrol agent cooperative sensing system based on semantic driving, which solves the problems of insufficient semantic alignment precision and insufficient dynamic framing guidance among multiple patrol agents.

To solve the technical problems, the invention provides the following technical scheme. The invention provides a patrol agent cooperative sensing system based on semantic driving, comprising: a prototype mapping module, used for collecting patrol target category information and monitoring indexes of the patrol targets, and generating a patrol task semantic prototype set and a patrol task semantic mapping table through semantic conversion; a semantic map module, used for acquiring patrol image data and patrol video data through a patrol agent, and performing pixel-level visual feature matching based on the patrol task semantic prototype set to generate a semantic confidence map and a semantic request map; a coupling mutual transmission module, used for executing sparse selection and directional mutual transmission of semantic supply-demand coupling under the joint constraint of the semantic confidence map and the semantic request map, to generate a multi-source sparse semantic feature queue; a fusion and recalibration module, used for performing position-level semantic attention fusion and measurable sensitivity recalibration on the multi-source sparse semantic feature queue, to generate a fusion probability map and a fusion instance table; and a diagnosis view finding module, used for performing equipment health state diagnosis and fault detection according to the fusion instance table, and generating a diagnosis list and a patrol view indication.
In the semantic-driving-based patrol agent collaborative perception system of the invention, the patrol target category information comprises a plurality of pieces, wherein each piece comprises a unique category identifier and a complete text description; after collection is completed, the patrol target category information is standardized and integrated according to the unique category identifier to generate a patrol target category information set, and the monitoring indexes of the patrol targets are collected one by one based on the patrol target category information set; and a one-to-one correspondence between the patrol target category information and the patrol target monitoring indexes is established, generating a set corresponding to the patrol target categories and the patrol target monitoring indexes. The semantic conversion comprises encoding the text description of each piece of patrol target category information using a text encoder to generate a text semantic embedding vector; extracting a sample corresponding t