CN-121984571-A - Satellite-assisted multi-unmanned aerial vehicle computation offloading method, device and program product


Abstract

The invention relates to the technical field of computing resource allocation and discloses a satellite-assisted multi-unmanned aerial vehicle (UAV) computation offloading method, device and program product. The method comprises: acquiring, from an experience replay buffer, a state sample set generated by the interaction between each UAV and a satellite mobile edge computing network; and, when the number of state samples in the state sample set is greater than or equal to a preset batch size, updating the initial computation offloading policy of each UAV using a proximal policy optimization algorithm until the satellite mobile edge computing network reaches a steady state, thereby obtaining the optimal computation offloading policy of the UAV. The method ensures low energy consumption and low computation latency for the UAVs in a highly dynamic network environment and reduces conflicts among UAVs when they acquire computing resources.
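The batch-size gate described in the abstract can be sketched as follows. This is a minimal illustration rather than the patented implementation: the names `Transition`, `ReplayBuffer`, `ready` and `drain`, the fields of a sample, and the batch size of 4 are all hypothetical, and clearing the buffer after each update is an assumption based on how on-policy methods are commonly run.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Transition:
    """One state sample from a UAV interacting with the network (hypothetical fields)."""
    state: tuple
    action: int
    reward: float


class ReplayBuffer:
    """Per-UAV experience replay buffer with a batch-size gate (illustrative only)."""

    def __init__(self, batch_size: int, capacity: int = 1024):
        self.batch_size = batch_size
        self.samples = deque(maxlen=capacity)

    def add(self, transition: Transition) -> None:
        self.samples.append(transition)

    def ready(self) -> bool:
        # The policy update is triggered only once enough samples are stored.
        return len(self.samples) >= self.batch_size

    def drain(self) -> list:
        # Assumption: samples are handed to the update step and then discarded.
        batch = list(self.samples)
        self.samples.clear()
        return batch


buffer = ReplayBuffer(batch_size=4)
for step in range(4):
    buffer.add(Transition(state=(step,), action=0, reward=0.0))
    if buffer.ready():
        batch = buffer.drain()  # a policy update would consume this batch
```

Keeping the gate in a separate `ready` check means the interaction loop can keep collecting samples without knowing anything about the update step.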

Inventors

LIANG HUI
YUAN RUNQI
MU GUOCAI
LI HAOMING

Assignees

Dongguan University of Technology (东莞理工学院)

Dates

Publication Date: 20260505
Application Date: 20260214

Claims (10)

1. A satellite-assisted multi-unmanned aerial vehicle computation offloading method, the method comprising: acquiring, from an experience replay buffer, a state sample set generated by the interaction between each unmanned aerial vehicle and a satellite mobile edge computing network; determining whether the number of state samples in each state sample set is greater than or equal to a preset batch size; and when the number of state samples in the state sample set is greater than or equal to the preset batch size, updating an initial computation offloading policy of each unmanned aerial vehicle using a proximal policy optimization algorithm until the satellite mobile edge computing network reaches a steady state, thereby obtaining an optimal computation offloading policy of the unmanned aerial vehicle.
2. The method of claim 1, wherein, when the number of state samples in the state sample set is greater than or equal to the preset batch size, updating the initial computation offloading policy of each unmanned aerial vehicle using the proximal policy optimization algorithm until the satellite mobile edge computing network reaches a steady state, thereby obtaining the optimal computation offloading policy of the unmanned aerial vehicle, comprises: obtaining a new, real-time computation offloading policy for each unmanned aerial vehicle from the old, initial computation offloading policy of that unmanned aerial vehicle, wherein the initial network weight parameters of the real-time computation offloading policy are identical to the network weight parameters of the initial computation offloading policy; constructing a total loss function for each unmanned aerial vehicle by means of the proximal policy optimization algorithm, based on the state sample set, the initial computation offloading policy and the real-time computation offloading policy of that unmanned aerial vehicle; optimizing the total loss function of each unmanned aerial vehicle using gradient clipping and the backpropagation algorithm, and iteratively updating the initial network weight parameters of the real-time computation offloading policy; and taking the updated real-time computation offloading policy as the new initial computation offloading policy, returning to the step of constructing the total loss function of each unmanned aerial vehicle, and iterating until the satellite mobile edge computing network reaches a steady state, thereby obtaining the optimal computation offloading policy of each unmanned aerial vehicle.
3. The method of claim 2, wherein constructing the total loss function for each unmanned aerial vehicle by means of the proximal policy optimization algorithm, based on the state sample set, the initial computation offloading policy and the real-time computation offloading policy of that unmanned aerial vehicle, comprises: inputting each state sample set into a value network of the proximal policy optimization algorithm to compute a plurality of advantage values, wherein each advantage value indicates how much better or worse the current state-action pair of an unmanned aerial vehicle is than the average value under the computation offloading policy; calculating the probability ratio between each initial computation offloading policy and each new real-time computation offloading policy for the same action; based on the plurality of advantage values, constraining each probability ratio to a preset interval using a preset clipping function, and constructing a clipped objective function for each unmanned aerial vehicle; obtaining a value loss function of a critic network of the proximal policy optimization algorithm; and constructing the total loss function of each unmanned aerial vehicle based on the clipped objective function and the value loss function of that unmanned aerial vehicle.
4. The method of claim 3, wherein inputting each state sample set into the value network of the proximal policy optimization algorithm to compute the plurality of advantage values comprises: calculating a policy utility value of the initial computation offloading policy of each unmanned aerial vehicle using a weighted utility function, according to the position of each unmanned aerial vehicle and the target beam selected by that unmanned aerial vehicle; calculating a beam load reward value for each target beam based on the load value of the target beam selected by each unmanned aerial vehicle; calculating a plurality of flight distance penalty values based on the distance between each unmanned aerial vehicle and each target beam; calculating a plurality of low-latency reward values based on the latency of the current computation offloading policy of each unmanned aerial vehicle and the latency of the local computation policy of that unmanned aerial vehicle; determining a plurality of immediate reward values for the unmanned aerial vehicles based on the plurality of policy utility values, the plurality of beam load reward values, the plurality of flight distance penalty values and the plurality of low-latency reward values; and computing, with the value network of the proximal policy optimization algorithm, the plurality of advantage values based on the plurality of state sample sets and the plurality of immediate reward values.
5. The method of claim 4, wherein calculating the policy utility value of the initial computation offloading policy of each unmanned aerial vehicle using the weighted utility function, according to the position of each unmanned aerial vehicle and the target beam selected by that unmanned aerial vehicle, comprises: acquiring the computing resource cost generated by the satellite mobile edge computing network; calculating the total latency and total energy consumption of each unmanned aerial vehicle in executing the initial computation offloading policy, according to the decision type of each initial computation offloading policy and the positional relationship between each unmanned aerial vehicle and each target beam; and calculating the policy utility value of each unmanned aerial vehicle using the weighted utility function, based on the computing resource cost and on the total latency and total energy consumption of that unmanned aerial vehicle.
6. The method of claim 1, further comprising: acquiring an initial upper limit of the load of a single beam and a real-time load value of each beam in the satellite mobile edge computing network; determining whether the real-time load values of all beams in the satellite mobile edge computing network are equal to the initial upper limit; and when the real-time load values of all beams are equal to the initial upper limit, determining that the optimal computation offloading policy is a local computation policy.
7. A satellite-assisted multi-unmanned aerial vehicle computation offloading device, the device comprising: an acquisition module configured to acquire, from an experience replay buffer, a state sample set generated by the interaction between a plurality of unmanned aerial vehicles and a satellite mobile edge computing network; a determination module configured to determine whether the number of state samples in the state sample set is greater than or equal to a preset batch size; and an update module configured to update, when the number of state samples in the state sample set is greater than or equal to the preset batch size, an initial computation offloading policy of each unmanned aerial vehicle using a proximal policy optimization algorithm until the satellite mobile edge computing network reaches a steady state, thereby obtaining an optimal computation offloading policy of the unmanned aerial vehicle.
8. An electronic device, comprising: a memory and a processor communicatively coupled to each other, the memory storing computer instructions which, when executed, perform the satellite-assisted multi-unmanned aerial vehicle computation offloading method of any one of claims 1 to 6.
9. A computer-readable storage medium having stored thereon computer instructions for causing a computer to perform the satellite-assisted multi-unmanned aerial vehicle computation offloading method of any one of claims 1 to 6.
10. A computer program product comprising computer instructions for causing a computer to perform the satellite-assisted multi-unmanned aerial vehicle computation offloading method of any one of claims 1 to 6.
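The reward terms, loss construction and beam-saturation fallback recited in claims 3 to 6 can be illustrated with the sketch below. It is a hedged example under stated assumptions, not the claimed implementation: the claims do not give the form of the weighted utility function, how the four reward terms are combined, or the clipping interval, so the negated weighted sum, the additive reward, `eps=0.2`, `value_coef=0.5` and every function name here are hypothetical. The clipping follows the standard proximal policy optimization form, which takes the ratio of the new policy to the old policy.

```python
import math


def weighted_utility(delay, energy, cost, w_delay=0.4, w_energy=0.4, w_cost=0.2):
    # Assumption: utility is the negated weighted sum, so lower delay,
    # energy and resource cost all yield a higher utility.
    return -(w_delay * delay + w_energy * energy + w_cost * cost)


def immediate_reward(utility, beam_load_reward, distance_penalty, low_delay_reward):
    # Assumption: the four terms are combined additively, with the
    # flight-distance term subtracted as a penalty.
    return utility + beam_load_reward - distance_penalty + low_delay_reward


def clipped_surrogate(logp_new, logp_old, advantage, eps=0.2):
    # Probability ratio of the new policy to the old policy for the same action.
    ratio = math.exp(logp_new - logp_old)
    # Constrain the ratio to the interval [1 - eps, 1 + eps].
    clipped = max(1.0 - eps, min(1.0 + eps, ratio))
    # Take the more pessimistic of the unclipped and clipped objectives.
    return min(ratio * advantage, clipped * advantage)


def total_loss(batch, value_coef=0.5, eps=0.2):
    """batch: list of (logp_new, logp_old, advantage, value_pred, value_target)."""
    n = len(batch)
    policy_obj = sum(clipped_surrogate(new, old, adv, eps)
                     for new, old, adv, _, _ in batch)
    value_loss = sum((pred - target) ** 2 for _, _, _, pred, target in batch)
    # Minimising this loss maximises the clipped objective and fits the critic.
    return -policy_obj / n + value_coef * value_loss / n


def choose_local(beam_loads, load_limit):
    # Fall back to local computation when every beam is at its load limit.
    # Assumption: ">=" is used instead of strict equality to tolerate overshoot.
    return all(load >= load_limit for load in beam_loads)
```

In a full training loop, the gradient of `total_loss` with respect to the policy and critic parameters would be computed by an automatic differentiation library and clipped before each update; that part is omitted here.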

Description

Satellite-assisted multi-unmanned aerial vehicle computation offloading method, device and program product

Technical Field

The invention relates to the technical field of computing resource allocation, and in particular to a satellite-assisted multi-unmanned aerial vehicle computation offloading method, device and program product.

Background

Satellite mobile edge computing (SatMEC) networks provide wide-area coverage and low-latency services for scenarios such as the Internet of Things (IoT) and real-time communications through cooperation between low Earth orbit (LEO) satellites and ground-based devices. However, as the number of IoT devices grows, ground base stations bear a heavy load when allocating resources to meet dynamic computing demands. In addition, the coverage of ground base stations is limited, so low-altitude areas (such as unmanned aerial vehicle operating airspace and high-altitude logistics corridors) are difficult to cover, even though unmanned aerial vehicles have developed rapidly in recent years and low-altitude airspace has become a key area of development. In this context, poorly allocated computing resources can increase device energy consumption.

Research on resource allocation in SatMEC networks is currently advancing in several directions, with the aim of making cross-domain cooperation efficient and resource allocation reasonable. These directions include the following.

1. Integration of emerging technologies. As technology continues to develop, emerging technologies such as artificial intelligence and edge computing are gradually being integrated into the satellite network security field.
Artificial intelligence and machine learning algorithms perform well in SatMEC network resource allocation: they can cope with the highly dynamic nature of the network through reinforcement learning or evolutionary game theory, or combine reinforcement learning with predictive scheduling algorithms to deploy computing resources to high-demand areas in advance. The introduction of edge computing allows data to be processed at edge nodes close to the data source or the user, which reduces data transmission delay and improves network response speed.

2. Optimization based on digital twins. By constructing a virtual mirror image synchronized with the physical network in real time, a digital twin provides a virtual environment for resource allocation and is a powerful tool for global optimization and algorithm verification. A resource allocation strategy is first tested, evaluated and optimized in the digital twin, and the optimal strategy is then issued to the physical network for execution. This addresses the difficulty of running high-risk or large-scale algorithm tests on a physical network.

However, the prior art has shortcomings. On the one hand, cross-domain resource coordination is insufficient: a SatMEC network involves the coordination of heterogeneous resources such as low Earth orbit satellites, ground base stations and unmanned aerial vehicles, and existing methods struggle to achieve global optimization. Resource allocation in scenarios with multiple conflict domains must take into account the dynamics of overall resources and the data transmission rate, and traditional game models have limited capacity for coordinated optimization across multiple conflict domains. On the other hand, conventional resource allocation schemes lack the flexibility to cope with dynamic changes in the network.
Service requests in a satellite network are random and volatile, but existing static resource allocation strategies cannot be adjusted flexibly according to real-time network conditions, which makes it difficult to improve resource utilization. In addition, in a complex satellite network environment, existing methods lack balance and robustness, and the stability of the game equilibrium is limited by network topology changes and external interference. For example, conventional resource allocation methods such as Q-learning and optimization algorithms are either infeasible in high-dimensional state and action spaces or slow to make decisions.

Disclosure of Invention

The invention provides a satellite-assisted multi-unmanned aerial vehicle computation offloading method, device and program product, which are used to solve the problem of optimizing highly dynamic, cross-domain resource allocation in existing SatMEC networks.

In a first aspect, the present invention provides a satellite-assisted multi-unmanned aerial vehicle computation offloading method, the method comprising: obtaining a state sample set generated by interaction of each unmanned aerial veh