CN-121981420-A - Intelligent inspection task planning method with autonomous learning and dynamic policy optimization
Abstract
The invention discloses an intelligent patrol task planning method with autonomous learning and dynamic policy optimization, and relates to the technical field of industrial automation monitoring. The method comprises: first constructing a heterogeneous association map comprising equipment nodes, environment nodes, spatial adjacency edges and process coupling edges; identifying initial risk sources with an abnormal feature extraction algorithm; performing risk potential energy propagation calculation to generate a global dynamic risk distribution map; then generating a reference inspection path containing non-uniform sampling parameters, with maximization of information gain in high-risk areas as the objective; and, during inspection execution, triggering event-driven local re-planning in real time based on the degree of deviation between measured and predicted data, so as to generate a temporary verification path for the neighborhood associated with the anomaly. By quantifying risk propagation through both process and spatial relationships, the invention achieves early prediction of potential hidden dangers and real-time response to sudden anomalies, and effectively improves inspection efficiency and safety in complex industrial scenes.
Inventors
- CHEN SHUNQING
- BAI ZHAOYU
- TANG HAIFENG
- AN QI
- ZHAO ZHIYONG
- SHI KEQIN
Assignees
- 国能河北沧东发电有限责任公司
Dates
- Publication Date
- 20260505
- Application Date
- 20251203
Claims (10)
- 1. An intelligent patrol task planning method with autonomous learning and dynamic policy optimization, characterized by comprising the following steps: S1, constructing an equipment-environment heterogeneous association map of a patrol area, wherein the map comprises equipment nodes, environment nodes, and spatial adjacency edges and process coupling edges connecting the nodes; S2, collecting historical operation data and real-time monitoring data of each node, and identifying initial risk source nodes by using an abnormal feature extraction algorithm; S3, based on the heterogeneous association map, performing risk potential energy propagation calculation that simulates the spreading of the risk potential energy of the initial risk source nodes to associated nodes along the process coupling edges, and generating a global dynamic risk distribution map; S4, based on the global dynamic risk distribution map, generating a reference inspection path containing non-uniform sampling parameters, with maximization of the information gain of high-risk areas as the objective; S5, during execution of the inspection task, calculating in real time the degree of deviation between the measured data and the predicted data of the current inspection point, and, when the degree of deviation exceeds a preset threshold, triggering an event-driven local re-planning mechanism to generate a temporary verification path for the associated neighborhood of the current node and dynamically insert it into the execution sequence.
- 2. The method according to claim 1, wherein, in constructing the heterogeneous association map in step S1, the spatial association construction rule comprises: calculating the Euclidean distance between any two nodes and, if the distance is smaller than a preset neighborhood threshold, establishing a bidirectional spatial adjacency edge between the two nodes, with the inverse of the distance as the base weight of the edge; and the process association construction rule comprises: identifying the fluid flow directions, electrical drive relationships and control-signal dependencies among devices by parsing process flow data of the industrial control system, establishing one-way or two-way process coupling edges according to these relationships, and assigning each process coupling edge a corresponding propagation intensity coefficient based on a historical fault association library.
- 3. The method according to claim 1, wherein the initial risk source identification in step S2 specifically comprises: performing a frequency-domain transformation on the time-series data monitored in real time, and extracting frequency-band energy features; inputting the extracted features into a pre-trained random forest or long short-term memory (LSTM) network model; and, if the abnormality confidence output by the model exceeds a set threshold, marking the corresponding equipment node as a risk source node and using the normalized confidence as its initial risk potential energy value.
- 4. The method according to claim 1, wherein the specific calculation model of risk potential energy propagation in step S3 is given by a formula that is not reproduced in this text, and whose terms are, in order: the integrated risk value of node j at time t; the inherent risk of the node; the set of process-associated neighbors; the set of spatially associated neighbors; the process coupling strength; the spatial attenuation factor; and the normalized weight coefficients.
- 5. The method according to claim 1, wherein generating the reference inspection path in step S4 uses a modified genetic algorithm whose multi-objective fitness function is given by a formula that is not reproduced in this text, and whose terms are, in order: the information entropy gain of each inspection point on the path, which is proportional to the comprehensive risk value of the node; the total length of the path; and the expected time consumption.
- 6. The method according to claim 1, wherein the non-uniform sampling strategy in step S4 specifically comprises: establishing a mapping table between risk levels and patrol modes, wherein a patrol mode comprises a moving speed, a zoom factor and a combination of activated sensors; when the planned path passes through a region whose comprehensive risk value is higher than the high-risk threshold, automatically selecting a fine inspection mode, reducing the moving speed and activating the infrared thermal imaging and partial discharge detection sensors; and when the planned path is in a low-risk region, maintaining a high-speed cruise mode with only the basic obstacle-avoidance sensor turned on.
- 7. The method according to claim 1, wherein the deviation calculation model for triggering re-planning in step S5 uses the Kullback-Leibler (KL) divergence between two distributions (the formula is not reproduced in this text): the probability distribution of the data acquired in real time at the inspection point, and the predicted probability distribution generated from historical data.
- 8. The method according to claim 1, wherein the verification target screening policy of the local re-planning mechanism in step S5 comprises: suspending the robot's movement along the reference path; taking the node that currently triggered the deviation threshold as the center, performing a breadth-first search in the heterogeneous association map to obtain an association subgraph of depth k; calculating the potential information gain of each unvisited node in the subgraph, and retaining the top N nodes ranked by gain as verification target points; and controlling the robot to execute the verification task along a closed-loop sub-path, returning to the breakpoint after the task is completed, and updating the global dynamic risk distribution map with the new data acquired during the verification task.
- 9. The method of claim 1, further comprising a model adaptive correction step: after each inspection task is finished, comparing the predicted risk value of each node with the actually measured fault result; if a missed-report node exists, namely a node whose predicted risk is lower than the threshold but which is found to have a fault, increasing the propagation intensity coefficient of the process coupling edges directed into that node; and if a false-alarm node exists, namely a node whose predicted risk is higher than the threshold but which is found to be fault-free, reducing the propagation intensity coefficient of the process coupling edges directed into that node.
- 10. The method of claim 1, wherein the method operates in a cloud-edge collaborative architecture: the heterogeneous association map construction, the global risk propagation calculation and the reference path generation are executed on a cloud server, using the high computing power of the cloud to process the global topological relationships; and the real-time deviation calculation and the local re-planning are executed by an edge computing module deployed on the inspection robot.
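The spatial association rule of claim 2 can be illustrated with a minimal Python sketch. The node names, coordinates and threshold value are hypothetical, and the function name `build_spatial_edges` is an assumption introduced only for illustration; process coupling edges are not covered here.

```python
import math
from itertools import combinations

def build_spatial_edges(positions, neighborhood_threshold):
    """Return {(a, b): weight} holding a bidirectional edge for every node pair
    whose Euclidean distance is below the threshold; weight = 1 / distance."""
    edges = {}
    for (a, pa), (b, pb) in combinations(positions.items(), 2):
        d = math.dist(pa, pb)
        if 0 < d < neighborhood_threshold:  # coincident nodes are skipped (assumption)
            weight = 1.0 / d
            edges[(a, b)] = weight  # both directions share the same base weight
            edges[(b, a)] = weight
    return edges

# Hypothetical node coordinates in metres.
positions = {"crusher": (0.0, 0.0), "chute": (3.0, 4.0), "bunker": (30.0, 40.0)}
spatial_edges = build_spatial_edges(positions, neighborhood_threshold=10.0)
```

In this example only the crusher and the chute lie within the threshold, so the result contains one edge in each direction between them.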
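The frequency-band energy extraction of claim 3 might look roughly like the following sketch. It uses a plain discrete Fourier transform from the standard library and equal-width bands, both of which are assumptions; the claims do not fix the transform or the band layout. The pre-trained random forest or LSTM classifier is outside the scope of this sketch.

```python
import cmath
import math

def band_energies(signal, n_bands):
    """Split the one-sided DFT power spectrum into n_bands equal bands and sum each."""
    n = len(signal)
    power = []
    for k in range(1, n // 2 + 1):  # skip the DC bin (k = 0)
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(signal))
        power.append(abs(coeff) ** 2)
    size = len(power) // n_bands  # trailing bins are dropped if not divisible
    return [sum(power[b * size:(b + 1) * size]) for b in range(n_bands)]

# Hypothetical vibration trace: a pure tone at DFT bin 20 of 64 samples.
samples = [math.sin(2 * math.pi * 20 * i / 64) for i in range(64)]
energies = band_energies(samples, n_bands=4)
dominant_band = energies.index(max(energies))
```

The resulting feature vector `energies` is what would be passed to the classifier in step S2.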
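Because the formula of claim 4 is not available in this text, the following sketch only illustrates one plausible shape for a propagation model built from the listed terms. The assumed update is: integrated risk of node j equals a weighted sum of its inherent risk, the coupling-weighted risk of its process neighbors, and the attenuation-weighted risk of its spatial neighbors, with normalized weights summing to 1. The weights, the clamping to 1.0 and the fixed number of iterations are all assumptions, not the patented model.

```python
def propagate_risk(inherent, process_in, spatial_in,
                   weights=(0.5, 0.3, 0.2), steps=3):
    """Iterate an assumed risk update.

    inherent:   {node: inherent risk in [0, 1]}
    process_in: {node: {upstream node: process coupling strength}}
    spatial_in: {node: {neighbor node: spatial attenuation factor}}
    weights:    assumed normalized coefficients (inherent, process, spatial)
    """
    a, b, c = weights
    risk = dict(inherent)
    for _ in range(steps):
        updated = {}
        for j, r0 in inherent.items():
            process_term = sum(k * risk[i] for i, k in process_in.get(j, {}).items())
            spatial_term = sum(k * risk[i] for i, k in spatial_in.get(j, {}).items())
            updated[j] = min(1.0, a * r0 + b * process_term + c * spatial_term)
        risk = updated
    return risk

# Hypothetical chain: crusher -> chute -> belt, with the crusher as risk source.
inherent = {"crusher": 0.9, "chute": 0.1, "belt": 0.1}
process_in = {"chute": {"crusher": 0.8}, "belt": {"chute": 0.6}}
risk = propagate_risk(inherent, process_in, spatial_in={})
```

In this toy chain the chute, directly downstream of the risk source, ends up with a higher integrated risk than the belt one hop further away.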
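The risk-to-mode mapping of claim 6 can be sketched as a small lookup. Only the two modes named in the claim are modeled; the speeds, zoom factors, sensor identifiers and the threshold value are hypothetical placeholders.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PatrolMode:
    speed_m_s: float   # moving speed
    zoom: float        # camera zoom factor
    sensors: tuple     # sensors switched on in this mode

# Hypothetical parameter values; the claim fixes only the qualitative behavior.
FINE_INSPECTION = PatrolMode(
    0.2, 4.0, ("obstacle_avoidance", "infrared_thermal", "partial_discharge"))
HIGH_SPEED_CRUISE = PatrolMode(1.0, 1.0, ("obstacle_avoidance",))

def select_mode(risk_value, high_risk_threshold=0.7):
    """Pick the patrol mode for a path segment from its comprehensive risk value."""
    if risk_value > high_risk_threshold:
        return FINE_INSPECTION
    return HIGH_SPEED_CRUISE

segment_modes = [select_mode(r) for r in (0.9, 0.1)]
```

A fuller table could add intermediate risk levels; the claim text describes only the high-risk and low-risk cases.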
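The deviation test of claim 7 can be sketched with the standard discrete KL divergence, D(P || Q) = sum over x of P(x) ln(P(x) / Q(x)). Taking P as the measured distribution and Q as the predicted one is an assumption about the direction, and the histogram values and threshold below are hypothetical.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """Discrete KL divergence D(p || q) in nats; q is floored at eps to avoid log(0)."""
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)

def needs_replanning(measured, predicted, threshold):
    """Trigger local re-planning when measured data deviates too far from prediction."""
    return kl_divergence(measured, predicted) > threshold

# Hypothetical two-bin histograms (e.g. normal vs. elevated temperature readings).
measured = [0.9, 0.1]
predicted = [0.5, 0.5]
deviation = kl_divergence(measured, predicted)
trigger = needs_replanning(measured, predicted, threshold=0.2)
```

Identical distributions give a deviation of zero, so the trigger fires only when the live readings drift from the historical prediction.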
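The target screening of claim 8 (breadth-first search to depth k, then top N by information gain) can be sketched as follows. The adjacency-list representation, the gain values and the function name `select_verification_targets` are assumptions for illustration; construction of the closed-loop sub-path is not shown.

```python
from collections import deque

def select_verification_targets(graph, start, depth_k, gains, visited, top_n):
    """BFS from the triggering node to depth_k, then keep the top_n unvisited
    nodes ranked by potential information gain."""
    seen = {start}
    queue = deque([(start, 0)])
    candidates = []
    while queue:
        node, depth = queue.popleft()
        if depth == depth_k:
            continue  # do not expand beyond the subgraph depth
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, depth + 1))
                if neighbor not in visited:
                    candidates.append(neighbor)
    candidates.sort(key=lambda n: gains.get(n, 0.0), reverse=True)
    return candidates[:top_n]

# Hypothetical association graph; node "A" triggered the deviation threshold.
graph = {"A": ["B", "C"], "B": ["D"], "C": ["E"], "D": ["F"]}
gains = {"C": 0.4, "D": 0.9, "E": 0.2, "F": 1.0}
targets = select_verification_targets(
    graph, "A", depth_k=2, gains=gains, visited={"B"}, top_n=2)
```

Node "F" has the highest gain but lies at depth 3, so it is excluded from the depth-2 subgraph; "B" is excluded because it has already been visited.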
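The adaptive correction of claim 9 can be sketched as a simple update over the process coupling edges directed into each node. The fixed additive step and the clamping of coefficients to the range 0 to 1 are assumptions; the claim states only the direction of the adjustment.

```python
def correct_coefficients(coupling, predicted, faults, threshold=0.5, step=0.1):
    """Adjust propagation intensity coefficients after an inspection task.

    coupling:  {(src, dst): coefficient} for process coupling edges
    predicted: {node: predicted risk value}
    faults:    {node: True if a fault was actually measured}
    """
    updated = dict(coupling)
    for (src, dst), k in coupling.items():
        if faults[dst] and predicted[dst] < threshold:
            # Missed report: risk reaching dst was underestimated.
            updated[(src, dst)] = min(1.0, k + step)
        elif not faults[dst] and predicted[dst] > threshold:
            # False alarm: risk reaching dst was overestimated.
            updated[(src, dst)] = max(0.0, k - step)
    return updated

# Hypothetical post-task data: the chute was a missed report, the belt a false alarm.
coupling = {("crusher", "chute"): 0.5, ("chute", "belt"): 0.5, ("belt", "bin"): 0.5}
predicted = {"chute": 0.2, "belt": 0.8, "bin": 0.3}
faults = {"chute": True, "belt": False, "bin": False}
corrected = correct_coefficients(coupling, predicted, faults)
```

Edges into correctly predicted nodes, such as the one into "bin" here, are left unchanged.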
Description
Intelligent inspection task planning method with autonomous learning and dynamic policy optimization
Technical Field
The invention relates to the technical field of industrial automation monitoring, in particular to an intelligent patrol task planning method with autonomous learning and dynamic policy optimization.
Background
With the advancement of industrial automation and intelligent manufacturing, rail-mounted and wheeled intelligent inspection robots have been widely deployed in substations, chemical plants and large energy facilities. In the field of intelligent fuel management in thermal power plants in particular, the coal conveying system serves as the "main artery" of the plant and bears the key task of efficiently conveying coal from the unloading end to the raw coal bin. The system typically comprises a large number of electromechanical devices, such as belt conveyors, stacker-reclaimers, crushing equipment, screening equipment and transfer stations, distributed along enclosed trestles or underground corridors up to several kilometers long, and is characterized by heavy dust, high noise and long spans. At present, robot inspection of such long-distance conveying systems relies mainly on manually preset fixed routes and schedules: the robot follows a set track and performs infrared temperature measurement or audio acquisition on idlers, motors and speed reducers along the route, replacing manual labor in heavy routine inspection.
However, existing inspection task planning methods still have significant technical limitations in complex industrial scenarios with such strong process correlations.
First, the prior art ignores the upstream-downstream coupling that objectively exists in the process flow, so risk assessment is carried out on each device as an isolated island.
In continuous conveying processes such as coal conveying systems, strict sequential control interlocks and material transfer relationships exist between devices. For example, when a slight vibration abnormality occurs in upstream crushing equipment, the downstream blanking pipe is likely to become blocked, or the feed rate from the upstream stage is likely to fluctuate. Existing inspection systems generally treat each device as an independent monitoring point and only judge whether the temperature of a single motor or roller exceeds its limit, which severs the logical chain along which risk propagates in the direction of material flow. Because this evaluation mode splits physical association from process association, the system cannot predict the direction in which risk will spread; it tends to treat only the symptom where it appears, and cannot identify systemic hidden dangers at the early stage of a fault.
Moreover, existing path planning strategies are rigid and struggle to cope with the complex and changeable sudden conditions of an industrial site. The actual production environment is highly unstable: sudden material spillage and leakage, belt deviation, excessive local dust concentration or abnormal noise occur frequently. At present, most inspection paths are static scripts generated before the task starts, and the robot merely acts as a mechanical executor. When, during inspection, the robot detects an abnormal peak in the sound spectrum of a certain section or an infrared image shows a local hot spot, the existing control logic lacks an event-driven reactive decision mechanism and therefore cannot trigger real-time path reconstruction; the robot simply continues to the next preset check-in point according to the original plan.
This rigid mode causes the robot to miss the critical window for close-range, multi-angle re-examination of sudden risk points.
Existing systems also lack adaptive patrol strategies for critical devices. In large conveying systems, load conditions and wear rates often differ considerably between equipment in different sections. Existing systems usually adopt a uniform, constant-speed inspection mode and cannot dynamically adjust the allocation of inspection resources according to the health or importance of equipment, so low-risk areas are over-inspected, wasting computing power and battery charge, while key rotating parts in a high-wear state do not receive high-frequency, focused monitoring.
In summary, for complex industrial scenarios such as large coal conveying systems, there is a need for an intelligent planning method that can understand process topology relationships, quantify the risk propagation effect along the process chain, and reconstruct the inspection strategy in real time in a complex environment.
Disclosure of Invention
In order to overcome the defects existing at present, the