CN-121585983-B - Post-disaster emergency communication system

Abstract

The invention discloses a post-disaster emergency communication system and relates to the field of emergency communication. The system provides a joint optimization strategy that comprehensively considers multidimensional decision variables, including unmanned aerial vehicle (UAV) flight path planning, STAR-RIS beamforming, the user task offloading ratio, and the allocation of communication and computing resources, so as to minimize the maximum processing delay of all user tasks and the UAV energy consumption. This extends the endurance of the UAV while meeting the timeliness requirements of emergency services. The system can meet the multiple post-disaster requirements of communication, computation and energy supply, significantly improves the communication recovery capability and emergency response efficiency in disaster areas, and offers rapid deployment, flexibility and high reliability.

Inventors

WU WENJIE
LUO ZHONGQIANG

Assignees

Sichuan University of Science & Engineering (四川轻化工大学)

Dates

Publication Date: 20260505
Application Date: 20260128

Claims (8)

1. A post-disaster emergency communication system, characterized by comprising a reconfigurable intelligent surface, an unmanned aerial vehicle (UAV) and a control module, wherein:

The UAV carries an edge server and is used for adjusting the spatial position of the edge server, and the edge server is used for communicating with the reconfigurable intelligent surface and/or users;

The reconfigurable intelligent surface is deployed on the ground and responds to the communication demands of the users and/or the UAV; by regulating the transmission and reflection of incident signals in the air-ground link, it realizes diffraction around or avoidance of obstacles in the post-disaster scene, thereby improving signal coverage;

The control module is used for acquiring the operating state of each device in the system and, on that basis, generating control actions for one or more devices with the aim of minimizing the maximum processing delay of all user tasks and the UAV energy consumption while guaranteeing the users' power requirements, so as to adapt to changes in the post-disaster environment, wherein the UAV energy consumption includes the energy consumption of the edge server;

The objective of minimizing the maximum processing delay of all user tasks and the UAV energy consumption is an optimization problem whose decision variables, parameters and constraints are defined as follows:
the ratio of tasks offloaded to the edge server in time slot k;
the position of the UAV in time slot k, i.e. the position of the edge server;
the flight speed of the UAV in time slot k, i.e. the flight speed of the edge server;
the flight angle of the UAV in time slot k, i.e. the flight angle of the edge server;
the reflection phase matrix and the transmission phase matrix of the reconfigurable intelligent surface, which represent the phase modulation applied by the reconfigurable intelligent surface to the reflected signal and the transmitted signal, respectively;
K is the total number of time slots in the whole communication period T, k is the time slot index, I is the total number of users, and each user has a user index;
the maximum processing delay of each user's task in time slot k;
the UAV energy consumption for processing each user's task in time slot k;
the constraints of the problem;
the initial position of the UAV;
the flight time;
the maximum flight speed of the UAV;
the farthest flight distances of the UAV along the two coordinate axes;
the position of each user in time slot k;
the farthest movement distances of each user along the two coordinate axes;
the flight energy consumption of the UAV in time slot k;
the energy consumed by the UAV for data transmission in time slot k;
the computing energy consumption of the edge server in time slot k;
the maximum energy of the UAV battery;
the minimum and maximum power limits of the users;
the power level of each user in time slot k;
the reflection phase shift in time slot k;
the transmission phase shift in time slot k;
the total task amount of each user; and D, which represents all tasks;

The specific method for generating control actions for one or more devices with the aim of minimizing the maximum processing delay of all user tasks and the UAV energy consumption comprises the following steps:
S1, constructing a decision model based on an Actor network and a double Q network, wherein the double Q network comprises two sets of Critic networks;
S2, acquiring physical constraints and setting the hyperparameters of the decision model, wherein the physical constraints comprise the spatial positions of all users, the position of the reconfigurable intelligent surface, the initial position of the UAV and the total duration, the maximum available energy of the UAV and the user power thresholds; the hyperparameters of the decision model comprise the reward function, the discount factor, the learning rates of the Actor network and the two sets of Critic networks, the target network soft update coefficient, the noise scale, the clipping bounds and the number of policy delay steps; and the target network comprises a target Actor network and target Critic networks;
S3, initializing the reinforcement learning agent, namely constructing an experience pool for storing the samples generated by interaction, wherein the experience pool and its priority array are either cleared entirely or set to uniform initial values;
S4, initializing the system state, including initializing the user positions, task queues, battery power and channel state, and resetting the UAV to its starting position;
S5, normalizing the operating states of all devices in the system according to preset minimum and maximum values to obtain normalized states;
S6, sampling the action at the current moment from the window state through the Actor network, and interacting with the environment using the action to obtain the next window state and the reward at the current moment;
S7, storing the sample obtained in the current time step in the experience pool, wherein the sample comprises the current window state, the current action, the reward at the current moment and the next window state;
S8, judging whether the number of samples in the experience pool has reached a set value; if so, proceeding to step S9, otherwise returning to step S5;
S9, sampling from the experience pool according to priority to obtain a training batch;
S10, computing importance sampling weights for the samples in the same training batch and normalizing them to obtain a set of normalized importance weights;
S11, constructing the target Q value of TD3 through the target network based on the samples in the training batch;
S12, training the two sets of Critic networks with a weighted MSE loss based on the target Q value of TD3, and obtaining the TD errors;
S13, updating the priorities of the samples in the experience pool using the TD errors;
S14, judging, based on the policy delay mechanism, whether to update the Actor network in the current learning step; if so, proceeding to step S15, otherwise returning to step S9;
S15, updating the parameters of the Actor network using the current first Critic network, and outputting the updated Actor network;
S16, performing a soft update on the target Actor network and the target Critic networks, and outputting the updated target Actor network and target Critic networks;
S17, judging whether the target Actor network and the target Critic networks have reached the training termination condition; if so, taking the latest control action obtained by the target Actor network as the control action generated for one or more devices with the aim of minimizing the maximum processing delay of all user tasks and the UAV energy consumption, otherwise returning to step S4;

wherein the expression of the reward is defined by two weight coefficients, a penalty term introduced for the users' power constraint, a penalty coefficient greater than 0, and a power threshold.
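Steps S9, S10 and S13 of claim 1 describe priority-based sampling, normalized importance sampling weights and priority updates from TD errors. The following is a minimal sketch of one common way to realize these steps (proportional prioritized replay). The function names, the exponents `alpha` and `beta`, the constant `eps` and the choice to normalize by the maximum weight are illustrative assumptions and are not specified in the claim.

```python
import numpy as np

def sample_prioritized(priorities, batch_size, alpha=0.6, beta=0.4, rng=None):
    """Sample indices with probability proportional to priority**alpha (S9)
    and return importance weights normalized by their maximum (S10).
    alpha and beta are assumed hyperparameters."""
    rng = np.random.default_rng() if rng is None else rng
    scaled = np.asarray(priorities, dtype=float) ** alpha
    probs = scaled / scaled.sum()
    idx = rng.choice(len(probs), size=batch_size, p=probs)
    weights = (len(probs) * probs[idx]) ** (-beta)
    weights = weights / weights.max()  # largest weight becomes 1
    return idx, weights

def update_priorities(priorities, idx, td_errors, eps=1e-6):
    """Return a copy of the priority array in which each sampled entry is set
    to |TD error| + eps (S13). eps keeps every priority strictly positive."""
    updated = np.asarray(priorities, dtype=float).copy()
    updated[idx] = np.abs(td_errors) + eps
    return updated
```

The normalized weights would then multiply the per-sample squared errors in the weighted MSE loss of step S12.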
2. The post-disaster emergency communication system according to claim 1, wherein the link formed when the user performs data transmission directly with the edge server is a direct link, and the link formed when the user performs data transmission with the edge server through the reconfigurable intelligent surface is an auxiliary link; the auxiliary link is divided into a reflective link and a transmissive link, wherein the auxiliary link formed when the edge server and the user are located on the same side of the reconfigurable intelligent surface is a reflective link, and the auxiliary link formed when the edge server and the user are located on opposite sides of the reconfigurable intelligent surface is a transmissive link.
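The link types in claim 2 can be illustrated with a short geometric sketch. It assumes the reconfigurable intelligent surface is modelled as a plane given by a point `ris_pos` and a normal vector `ris_normal`, and that whether the direct link is used is supplied as a flag. The function name, the parameter names and the plane model are hypothetical and do not come from the claim.

```python
import numpy as np

def classify_link(user_pos, server_pos, ris_pos, ris_normal, direct_available):
    """Return 'direct', 'reflective' or 'transmissive' for a user/edge-server pair.

    Assumption: the surface is an infinite plane through ris_pos with normal
    ris_normal; the sign of the projection onto the normal gives the side.
    """
    if direct_available:
        return "direct"
    origin = np.asarray(ris_pos, dtype=float)
    normal = np.asarray(ris_normal, dtype=float)
    user_side = np.dot(np.asarray(user_pos, dtype=float) - origin, normal)
    server_side = np.dot(np.asarray(server_pos, dtype=float) - origin, normal)
    # Same sign: both on one side of the surface, so the auxiliary link reflects.
    # Opposite signs (or a point exactly on the plane): treated as transmissive.
    return "reflective" if user_side * server_side > 0 else "transmissive"
```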
3. The post-disaster emergency communication system of claim 1, further comprising a simultaneous wireless information and power transfer module, which unifies data transmission and energy harvesting from the radio-frequency signal into the same physical process and provides wireless energy replenishment to one or more devices in the system.
4. The post-disaster emergency communication system according to claim 1, wherein the operating states of the devices in the system include: the UAV state, comprising three-dimensional position information, remaining energy, current load and flight speed; the ground terminal state, comprising coordinates, energy level and the amount of tasks to be processed; the reconfigurable intelligent surface state, comprising the configuration state and the activation status of the controllable units; and environmental parameters, including link quality, obstacle distribution and radio interference level.
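Step S5 of claim 1 normalizes these operating states using preset minimum and maximum values. A minimal sketch of such min-max normalization is given below. The function name, the clipping to [0, 1] and the handling of a zero-width range are assumptions added for robustness and are not stated in the claims.

```python
import numpy as np

def normalize_state(state, lower, upper):
    """Min-max normalize each state component to [0, 1] using preset bounds."""
    state = np.asarray(state, dtype=float)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    # Assumption: a zero-width range falls back to a span of 1 to avoid division by zero.
    span = np.where(upper > lower, upper - lower, 1.0)
    # Assumption: values outside the preset bounds are clipped.
    return np.clip((state - lower) / span, 0.0, 1.0)
```

For example, a state vector holding UAV altitude, remaining energy and pending task amount would be passed together with the preset bounds of each component.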
5. The post-disaster emergency communication system according to claim 1, wherein the Actor network and the target Actor network have the same structure, each comprising an embedding layer, a recurrent neural network, an improved KAN network and a fully connected layer connected in sequence, wherein: the embedding layer maps the input to a high-dimensional feature space; the recurrent neural network extracts the temporal features of the output of the embedding layer; the improved KAN network enhances the temporal features output by the recurrent neural network; the fully connected layer maps the enhanced temporal features to control actions; and when a control action interacts with the environment, the next window state and the reward at the current moment are obtained.
6. The post-disaster emergency communication system according to claim 1, wherein the specific method for constructing the target Q value of TD3 through the target network comprises: computing, through the target Actor network, a target action for each next window state in the current training batch, and adding clipped Gaussian noise to the target action to obtain a smoothed target action; and feeding the smoothed target actions into the two target Critic networks to compute the corresponding Q values, taking the element-wise minimum of the two Q values, and obtaining the target Q value of each sample through the Bellman equation, thereby obtaining the target Q value of TD3.
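The target construction of claim 6 can be sketched as follows, under stated assumptions. The networks are passed in as plain callables, and the function name, the default values of `gamma`, `noise_std` and `noise_clip`, the action bounds and the terminal-state mask `dones` are illustrative choices that the claim does not specify.

```python
import numpy as np

def td3_target(rewards, next_states, dones, target_actor, target_q1, target_q2,
               gamma=0.99, noise_std=0.2, noise_clip=0.5,
               action_low=-1.0, action_high=1.0, rng=None):
    """Compute TD3 target Q values for a batch.

    target_actor(states) -> actions; target_q1/target_q2(states, actions) -> Q values.
    """
    rng = np.random.default_rng() if rng is None else rng
    next_actions = target_actor(next_states)
    # Clipped Gaussian noise smooths the target action.
    noise = np.clip(rng.normal(0.0, noise_std, size=next_actions.shape),
                    -noise_clip, noise_clip)
    smoothed = np.clip(next_actions + noise, action_low, action_high)
    # Element-wise minimum of the two target Critic estimates.
    q_min = np.minimum(target_q1(next_states, smoothed),
                       target_q2(next_states, smoothed))
    # Bellman backup; the (1 - dones) mask is an assumption for terminal steps.
    return rewards + gamma * (1.0 - dones) * q_min
```

Taking the minimum of the two estimates is the usual TD3 device for limiting overestimation of the Q value.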
7. The post-disaster emergency communication system according to claim 1, wherein the method for updating the parameters of the Actor network by using the current first Critic network and outputting the updated Actor network comprises: acquiring the control action generated by the Actor network in the current window state, computing the Q value of the control action through the current first Critic network, taking the negative mean Q value as the loss of the Actor network, updating the Actor network parameters by gradient descent, and outputting the updated Actor network.
8. The post-disaster emergency communication system according to claim 1, wherein the specific method for performing the soft update on the target Actor network and the target Critic network comprises: for the target Actor network and the target Critic network, performing the soft update according to the following formula: $\theta'_{\text{new}} = \tau\,\theta + (1-\tau)\,\theta'_{\text{old}}$, where $\theta'_{\text{new}}$ denotes the updated parameters of the target network; $\tau$ is the update step size; $\theta'_{\text{old}}$ denotes the parameters of the target network before the update; and $\theta$ denotes the parameters of the network in the decision model that corresponds to the target network being updated.
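A minimal sketch of the soft update in claim 8 follows, assuming the standard interpolation between the target parameters and the corresponding online parameters with step size `tau`. The function name, the dictionary-of-arrays parameter layout and the default `tau` are hypothetical.

```python
import numpy as np

def soft_update(target_params, online_params, tau=0.005):
    """Return new target parameters: tau * online + (1 - tau) * old target.

    Both arguments are dicts mapping parameter names to NumPy arrays
    (an assumed layout for illustration).
    """
    return {name: tau * online_params[name] + (1.0 - tau) * target_params[name]
            for name in target_params}

# Usage: the same call would be applied to the target Actor network
# and to each target Critic network.
target = {"w": np.array([0.0, 0.0])}
online = {"w": np.array([1.0, 2.0])}
target = soft_update(target, online, tau=0.1)
```

A small `tau` makes the target networks track the online networks slowly, which stabilizes the target Q values used for Critic training.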

Description

Post-disaster emergency communication system

Technical Field

The invention relates to the field of emergency communication, in particular to a post-disaster emergency communication system.

Background

Given the urgency of post-disaster tasks and the complexity of the environment, emergency communication networks need to offer rapid deployment, flexibility and high reliability. Unmanned aerial vehicles (UAVs) are regarded as a key solution for post-disaster communication recovery because of their flexibility, fast deployment, low cost and strong line-of-sight links. At the same time, mobile edge computing can increase computing and communication capabilities at the network edge. By combining mobile edge computing with UAV communication, a UAV-assisted mobile edge computing (U-MEC) emergency architecture can be constructed, so that communication can be quickly restored and computing support provided in post-disaster scenarios where infrastructure is severely damaged.

Prior art 1 (Shah Z, Javed U, Naeem M, et al. Mobile edge computing (MEC)-enabled UAV placement and computation efficiency maximization in disaster scenario[J]. IEEE Transactions on Vehicular Technology, 2023, 72(10): 13406-13416) adopts a UAV-assisted mobile edge computing emergency architecture and aims to maximize computational efficiency. However, it only considers the direct link between the user and the UAV, and gives insufficient consideration to obstruction and the complex electromagnetic environment in post-disaster settings and to the channel degradation they cause.
Its optimization method relies on solution frameworks such as k-means clustering and a staged static interior point method, which are difficult to adapt to the rapidly time-varying user distribution, channel conditions and energy states of post-disaster scenarios, and it lacks real-time and adaptive capability. It mainly focuses on the joint allocation of communication and computing resources and does not further discuss mechanisms for improving the sustainable operation of the system when energy is scarce.

Prior art 2 (Luan Q, Cui H, Zhang L, et al. A hierarchical hybrid subtask scheduling algorithm in UAV-assisted MEC emergency network[J]. IEEE Internet of Things Journal, 2021, 9(14): 12737-12753) mainly focuses on topology reconstruction and subtask scheduling to reduce the average completion delay of computing tasks. However, its system model is still based on the traditional UAV-assisted mobile edge computing architecture and does not consider a wireless environment enhancement mechanism for the severe blockage, link interruption and energy limitation that are common in post-disaster communication. Moreover, the H-HSS algorithm it proposes depends on given channel conditions and a preset network topology, and lacks the capability for end-to-end wireless propagation enhancement, energy self-sustainability and cross-layer intelligent optimization in a highly dynamic post-disaster environment.

Prior art 3 (Khalid R, Shah Z, Naeem M, et al. Computational efficiency maximization for UAV-assisted MEC networks with energy harvesting in disaster scenarios[J]. IEEE Internet of Things Journal, 2023, 11(5): 9004-9018) constructs a UAV-assisted mobile edge computing emergency framework with radio-frequency energy harvesting capability, with the aim of maximizing computational efficiency.
It adopts a three-stage static solution framework of k-means clustering, staged linearization and the interior point method. However, the work is still based on a probabilistic-LoS direct link and does not fully characterize complex factors in post-disaster scenarios such as severe blockage, multipath fading, link interruption and the dynamic arrival of users. Its offline staged optimization makes it difficult to adapt the UAV position, energy state and offloading strategy in real time, and its support for system robustness and joint intelligent optimization of multi-dimensional resources is limited.

Prior art 4 (Li J, He Q, Wang X, et al. UAV-assisted Microservice Mobile Edge Computing Architecture: Addressing Post-Disaster Emergency Medical Rescue[J]. IEEE Transactions on Computers, 2025) optimizes a UAV-assisted mobile edge computing architecture through microservices and Transformer-based resource management to improve task scheduling and energy efficiency in post-disaster medical rescue, but the overall design of the UAV-assisted mobile edge computing architecture st