CN-121979259-A - Intelligent cleaning control method, system and equipment for unmanned aerial vehicle
Abstract
The invention discloses an intelligent cleaning control method, system and equipment for an unmanned aerial vehicle, relating to the technical field of cleaning control. The method comprises: constructing a cooperative control model according to a comprehensive optimization target; constructing and training a deep reinforcement learning agent based on the cooperative control model; during cleaning operation, having the agent decide and output action instructions based on real-time state information; controlling the unmanned aerial vehicle and its cleaning mechanism to operate cooperatively according to the action instructions; and continuously fine-tuning the decision strategy according to the updated state and reward information. The method solves the technical problems of low operation efficiency, high energy consumption and poor coordination that arise when existing cleaning unmanned aerial vehicles apply static or segmented control strategies in dynamic, complex environments. Through deep reinforcement learning it achieves holistic, self-adaptive cooperative control of unmanned aerial vehicle flight, cleaning parameters and tether cable management, thereby reducing energy consumption and improving cleaning effect and operation safety.
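The closed loop summarized above (observe state, decide a joint action, execute cooperatively, fine-tune from reward) can be sketched as follows. This is a minimal illustrative skeleton, not the patent's implementation; all class and field names (`Agent`, `DummyEnv`, `flight_velocity`, `tether_payout`, etc.) are hypothetical, and the policy and gradient update are stubbed out.

```python
from dataclasses import dataclass
import random

@dataclass
class Action:
    flight_velocity: float   # m/s along the planned path
    pump_pressure: float     # MPa at the high-pressure water pump
    tether_payout: float     # m/s cable payout rate

class Agent:
    """Placeholder policy: in the patent this is a trained deep RL network."""
    def decide(self, state):
        # A real policy network would map the full state vector to an action.
        return Action(flight_velocity=0.5, pump_pressure=8.0, tether_payout=0.1)

    def fine_tune(self, state, action, reward, next_state):
        pass  # gradient update on the policy network parameters (omitted)

class DummyEnv:
    """Stand-in for the UAV, cleaning mechanism and tether system."""
    def reset(self):
        return {"wind": 0.0, "battery": 1.0}

    def step(self, action):
        next_state = {"wind": random.random(), "battery": 0.99}
        reward = 1.0  # would come from the designed reward function
        return next_state, reward

def run_episode(agent, env, steps=100):
    state = env.reset()
    total_reward = 0.0
    for _ in range(steps):
        action = agent.decide(state)                        # decision analysis
        next_state, reward = env.step(action)               # cooperative execution
        agent.fine_tune(state, action, reward, next_state)  # continuous fine-tuning
        state, total_reward = next_state, total_reward + reward
    return total_reward
```

The essential point is that flight, cleaning and tether parameters are decided jointly in one action vector, rather than by separate segmented controllers.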
Inventors
- GAO YONGFU
Assignees
- Jiangsu Xinlite Technology (Group) Co., Ltd. (江苏新力特科技(集团)有限公司)
Dates
- Publication Date: 2026-05-05
- Application Date: 2026-01-26
Claims (10)
- 1. An intelligent cleaning control method for an unmanned aerial vehicle, characterized by comprising the following steps: constructing a cooperative control model according to a comprehensive optimization target, wherein the comprehensive optimization target at least comprises minimization of total operation energy consumption and optimization of cleaning effect per unit area, and the input variables of the cooperative control model comprise real-time environment state variables and operating equipment state variables; constructing and training a deep reinforcement learning agent based on the cooperative control model, wherein the state space of the agent maps the input variables, the action space of the agent comprises unmanned aerial vehicle flight control parameters, cleaning execution mechanism parameters and tether cable management parameters, and the reward function of the agent is designed based on the comprehensive optimization target and preset constraint conditions; during the cleaning operation, the deep reinforcement learning agent performing decision analysis according to state space information of the current moment acquired in real time and outputting a corresponding action space instruction; and, based on the action space instruction, controlling the unmanned aerial vehicle and its cleaning execution mechanism to execute cooperative operation, acquiring updated state space information and reward information after action execution, and continuously fine-tuning the decision strategy of the deep reinforcement learning agent.
- 2. The intelligent cleaning control method for an unmanned aerial vehicle of claim 1, wherein constructing the cooperative control model comprises: modeling the cooperative control model and establishing a Markov decision process framework guided by the comprehensive optimization target; and, under the Markov decision process framework, initializing and modeling the coupling relation between each input variable and the comprehensive optimization target by using historical operation data and simulation data.
- 3. The intelligent cleaning control method for an unmanned aerial vehicle of claim 1, wherein the real-time environment state variables at least comprise real-time wind speed and direction information acquired by an onboard sensor, three-dimensional geometric characteristics of the working surface and stain distribution information identified by a visual sensor, and real-time tension information of the tether cable acquired by a tension sensor; and the operating equipment state variables at least comprise the battery charge and motor temperature of the unmanned aerial vehicle, the remaining capacity of the water tank, and the working pressure and flow of the high-pressure water pump.
- 4. The intelligent cleaning control method for an unmanned aerial vehicle of claim 1, wherein designing and initializing the reward function based on the comprehensive optimization target and preset safety and performance constraints comprises: assigning a positive reward weight to cleaning performance indicators, the cleaning performance indicators comprising effective cleaning area per unit time and a stain removal rate based on visual feedback; assigning a positive reward weight to energy efficiency indicators, the energy efficiency indicators comprising flight trajectory smoothness, motor torque output stability and high-pressure water pump operating-point efficiency; and assigning negative rewards to behaviors that violate the preset safety and efficiency constraints, the behaviors comprising predicted collision risk, excessive tether tension, cleaning effect below a threshold, and excessive redundancy in operation path planning.
- 5. The intelligent cleaning control method for an unmanned aerial vehicle of claim 1, wherein the training of the deep reinforcement learning agent is performed in a high-fidelity digital twin simulation environment, comprising: constructing a digital twin environment comprising a three-dimensional model of the target operation scene, a fluid-dynamic wind field model and an equipment dynamics model; and, in the digital twin environment, performing interactive exploration and trial and error between the deep reinforcement learning agent and the simulation environment, and pre-training the policy network offline until the policy converges, to obtain an initial agent model.
- 6. The intelligent cleaning control method for an unmanned aerial vehicle of claim 1, wherein the method further comprises an initial coverage path generation step: before the operation starts, generating a collision-free initial coverage path that completely covers the target operation area, according to a three-dimensional point cloud model of the target operation area combined with a preset standard cleaning width; and converting the initial coverage path into an initial state space sequence as the reference state input for the first decision of the deep reinforcement learning agent.
- 7. The intelligent cleaning control method for an unmanned aerial vehicle of claim 6, wherein, during the cleaning operation, the deep reinforcement learning agent performing decision analysis according to the state space information of the current moment acquired in real time and outputting a corresponding action space instruction comprises: the deep reinforcement learning agent receiving and processing fused current-moment state space information, the current-moment state space information comprising environment state information, equipment state information and deviation information relative to the reference state; the deep reinforcement learning agent performing coupling relation inference and multi-step prediction according to the current-moment state space information and, taking minimization of the current comprehensive cost as the target, autonomously deciding and generating a global operation strategy, the global operation strategy comprising a set of real-time unmanned aerial vehicle flight control parameters, cleaning execution mechanism parameters and tether cable management parameters; and outputting the global operation strategy as the action space instruction.
- 8. The intelligent cleaning control method for an unmanned aerial vehicle of claim 1, wherein acquiring updated state space information and reward information after action execution and continuously fine-tuning the decision strategy of the deep reinforcement learning agent comprises: after the action is executed, the deep reinforcement learning agent collecting updated state space information and generating immediate reward information according to the reward function; storing the state space information and action space instruction of the current decision period, together with the obtained immediate reward information and the updated state space information of the next decision period, as one piece of experience data; and performing gradient updates on the policy network parameters of the deep reinforcement learning agent based on the experience data, so as to optimize the decision strategy for the next round of cleaning.
- 9. An intelligent cleaning control system for an unmanned aerial vehicle, for implementing the intelligent cleaning control method for an unmanned aerial vehicle according to any one of claims 1 to 8, the system comprising: a model construction module, configured to construct a cooperative control model according to a comprehensive optimization target, wherein the comprehensive optimization target at least comprises minimization of total operation energy consumption and optimization of cleaning effect per unit area, and the input variables of the cooperative control model comprise real-time environment state variables and operating equipment state variables; an agent training module, configured to construct and train a deep reinforcement learning agent based on the cooperative control model, wherein the state space of the agent maps the input variables, the action space of the agent comprises unmanned aerial vehicle flight control parameters, cleaning execution mechanism parameters and tether cable management parameters, and the reward function of the agent is designed based on the comprehensive optimization target and preset constraint conditions; a decision analysis module, configured to have the deep reinforcement learning agent perform decision analysis according to state space information of the current moment acquired in real time during the cleaning operation and output a corresponding action space instruction; and a cooperative operation module, configured to control the unmanned aerial vehicle and its cleaning execution mechanism to execute cooperative operation based on the action space instruction, acquire updated state space information and reward information after action execution, and continuously fine-tune the decision strategy of the deep reinforcement learning agent.
- 10. An electronic device, comprising: a memory for storing executable instructions; and a processor for implementing the intelligent cleaning control method for an unmanned aerial vehicle according to any one of claims 1 to 8 when executing the executable instructions stored in the memory.
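To make the reward design of claim 4 concrete, the following is an illustrative reward-shaping function: positive weights on cleaning performance and energy efficiency indicators, and negative rewards for constraint violations. All numeric weights, thresholds and the `metrics` keys are hypothetical and chosen only for illustration; the patent does not specify them.

```python
def reward(metrics):
    """Compute a shaped reward from one decision period's observations.

    metrics: dict of per-step measurements from sensors / the simulator.
    """
    r = 0.0
    # Positive weights on cleaning performance indicators (claim 4).
    r += 1.0 * metrics["cleaned_area"]         # effective cleaning area per unit time
    r += 0.5 * metrics["stain_removal_rate"]   # visual-feedback removal rate in [0, 1]
    # Positive weights on energy efficiency indicators.
    r += 0.2 * metrics["trajectory_smoothness"]
    r += 0.2 * metrics["pump_efficiency"]      # high-pressure pump operating-point efficiency
    # Negative rewards for violating safety / efficiency constraints.
    if metrics["collision_risk"] > 0.8:        # predicted collision risk
        r -= 10.0
    if metrics["tether_tension"] > 200.0:      # N, hypothetical tension limit
        r -= 5.0
    if metrics["stain_removal_rate"] < 0.6:    # cleaning effect below threshold
        r -= 1.0
    return r
```

In practice such weights would be tuned so that no single term dominates, since the agent optimizes exactly what the reward expresses.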
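The experience handling of claim 8, storing each decision period's (state, action, reward, next state) tuple and updating the policy from it, is commonly implemented with a replay buffer. A minimal sketch, with the framework-dependent gradient step omitted:

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-capacity store of (state, action, reward, next_state) experience."""

    def __init__(self, capacity=10000):
        self.buffer = deque(maxlen=capacity)  # oldest experience evicted first

    def store(self, state, action, reward, next_state):
        # One piece of experience data per decision period (claim 8).
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size):
        # Random minibatch for the gradient update of the policy network.
        return random.sample(self.buffer, min(batch_size, len(self.buffer)))

    def __len__(self):
        return len(self.buffer)
```

Sampling uniformly at random decorrelates consecutive decision periods, which is the usual reason such buffers are used in deep reinforcement learning training.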
Description
Intelligent cleaning control method, system and equipment for unmanned aerial vehicle

Technical Field

The invention relates to the technical field of cleaning control, and in particular to an intelligent cleaning control method, system and equipment for an unmanned aerial vehicle.

Background

With accelerating urbanization, demand for cleaning the exterior walls of high-rise buildings is growing rapidly. The traditional manual gondola (hanging basket) and rope-access ("spider-man") working modes are inefficient, costly and carry obvious safety risks. The development of unmanned aerial vehicle technology offers an innovative solution for work at height. In particular, cleaning unmanned aerial vehicles using tethered power supply can continuously receive power and cleaning media from a ground base station, can in theory sustain long-duration, large-area continuous operation, and hold great potential in efficiency and safety. However, the prior art still faces serious challenges in practical application. In current mainstream tethered cleaning unmanned aerial vehicle systems, flight control, high-pressure cleaning execution and tether cable management typically employ discrete or simple cascaded control strategies. For example, the flight path is often based on a preset static trajectory plan, and the cleaning parameters (such as water pressure and flow rate) are empirically set to fixed values or adjusted only in coarse segments. Such control is rigid and poorly adaptive in a dynamic, complex high-altitude facade work environment.
Interference factors such as abrupt changes in wind speed and direction, variations in facade geometry and uneven stain distribution cannot be handled cooperatively in real time. This causes a series of problems: high energy consumption, potential safety hazards from tether cable tension fluctuation, uneven cleaning results and lower-than-expected overall operation efficiency, making unexpected complex working conditions difficult to handle.

Disclosure of Invention

The application provides an intelligent cleaning control method, system and equipment for an unmanned aerial vehicle, which solve the technical problems of low operation efficiency, high energy consumption and poor coordination that arise when existing cleaning unmanned aerial vehicles apply static or segmented control strategies in dynamic, complex environments.

In a first aspect of the present application, there is provided an intelligent cleaning control method for an unmanned aerial vehicle, the method comprising: constructing a cooperative control model according to a comprehensive optimization target, wherein the comprehensive optimization target at least comprises minimization of total operation energy consumption and optimization of cleaning effect per unit area, and the input variables of the cooperative control model comprise real-time environment state variables and operating equipment state variables; constructing and training a deep reinforcement learning agent based on the cooperative control model, mapping the state space of the agent to the input variables, and designing the reward function based on the comprehensive optimization target and preset constraint conditions; during the cleaning operation, having the deep reinforcement learning agent perform decision analysis according to state space information of the current moment acquired in real time and output corresponding action space instructions; controlling the unmanned aerial vehicle and its cleaning execution mechanism to execute cooperative operation based on the action space instructions; and acquiring updated state space information and reward information after action execution, and continuously fine-tuning the decision strategy of the deep reinforcement learning agent.

In a second aspect of the application, there is provided an intelligent cleaning control system for an unmanned aerial vehicle, the system comprising: a model construction module, an agent training module, a decision analysis module and a cooperative operation module, wherein the model construction module is configured to construct a cooperative control model according to a comprehensive optimization target, the comprehensive optimization target at least comprises minimization of total operation energy consumption and optimization of cleaning effect per unit area, the input variables of the cooperative control model comprise real-time environment state variables and operating equipment state variables, the agent training module is configured to construct and train a deep reinforcement learning agent based on the cooperative control model, and the state space of the deep reinforcement learning agent is