CN-122026959-A - Unmanned aerial vehicle RIS-assisted MU-MISO system for resisting malicious interference and joint optimization method thereof

Abstract

An unmanned aerial vehicle (UAV) RIS-assisted MU-MISO system for resisting malicious interference, and a joint optimization method thereof, belong to the technical field of MU-MISO communication systems. In particular, they relate to a UAV RIS-assisted MU-MISO system in a malicious-interference scenario, and to the joint optimization of base station precoding (beamforming), RIS passive phase shifts and the UAV trajectory. The system and method are applicable to fields that require dynamic and reliable wireless service, such as emergency communication, hotspot area coverage and anti-interference communication.

Inventors

ZHANG JIAYAN
Tan Zeou
WU SHAOCHUAN
SHA XUEJUN

Assignees

Harbin Institute of Technology (哈尔滨工业大学)

Dates

Publication Date: 20260512
Application Date: 20260413

Claims (10)

1. An unmanned aerial vehicle RIS-assisted MU-MISO system for resisting malicious interference, comprising: a base station for providing signal transmission service to a plurality of users through communication links; and an unmanned aerial vehicle carrying a reconfigurable intelligent surface having a plurality of reflective elements, for flying in a mission area and assisting communication between the base station and the plurality of users; wherein, in the system, the active beamforming of the base station, the passive phase shift matrix of the reconfigurable intelligent surface and the three-dimensional flight trajectory of the unmanned aerial vehicle are jointly optimized, so that, during its flight from a starting point to an end point, the unmanned aerial vehicle suppresses interference from an external malicious jammer while providing blind-area coverage for the users, thereby maximizing the average sum rate of the system.
2. A joint optimization method for the system according to claim 1, characterized in that the method comprises the following steps: a communication scene modeling step, namely establishing, based on the physical composition and the communication links of the system, an optimization problem model that aims to maximize the average sum rate of the system and takes the base station transmit power, the unmanned aerial vehicle kinematics and the constant-modulus phase shift of the reconfigurable intelligent surface reflective elements as constraints; and a deep reinforcement learning solving step, namely solving the optimization problem model with a proximal policy optimization (PPO) model, comprising: defining a state space, an action space and a reward function corresponding to the optimization problem model; initializing the PPO model and training it through interaction with the system communication environment, wherein each policy update is based on data collected by the old policy and the magnitude of the update is limited by a trust-region constraint or a clipping operation; and, through training, having the PPO model learn and output an optimized base station beamforming matrix, reconfigurable intelligent surface phase shift matrix and unmanned aerial vehicle trajectory sequence.
3. The joint optimization method according to claim 2, wherein the optimization problem model is: [formula not reproduced in the source text]; wherein T represents the task period, which is discretized into N time slots of equal length; n represents the time slot index, n = 0, 1, ...; K represents the total number of users; k represents the user index, k = 1, 2, ..., K; and the remaining quantities in the model are: the received signal-to-interference-plus-noise ratio of the k-th user in the n-th time slot; the active beamforming vector allocated by the base station to the k-th user in the n-th time slot, which is part of the base station beamforming matrix and is one of the decision variables of the optimization problem model; the passive phase shift matrix of the reconfigurable intelligent surface in the n-th time slot, which is one of the decision variables of the optimization problem model; M, the total number of reflective elements of the reconfigurable intelligent surface; the phase shift of the m-th reflective element of the reconfigurable intelligent surface in the n-th time slot, which satisfies the constant-modulus constraint; the horizontal position coordinates of the unmanned aerial vehicle in the n-th time slot, which are one of the decision variables of the optimization problem model; the position coordinates of the preset starting point of the unmanned aerial vehicle; the maximum transmit power of the base station; the maximum flight speed of the unmanned aerial vehicle; and s.t., which introduces the constraints of the optimization problem model.
4. The joint optimization method according to claim 2, wherein, in the deep reinforcement learning solving step: the state space comprises the direction vector from the current position of the unmanned aerial vehicle to the end point, the instantaneous velocity vector of the unmanned aerial vehicle, and the historical average communication rates of all users; the action space is a hybrid action space comprising continuous actions for controlling the speed of the unmanned aerial vehicle and discrete actions for selecting a base station beamforming vector from a discrete codebook and a reconfigurable intelligent surface reflection phase shift from a discrete phase set; and the reward function comprises an instantaneous reward proportional to the system sum rate at the current time and a guidance reward proportional to the change in the distance of the unmanned aerial vehicle toward the end point.
5. The joint optimization method according to claim 2, characterized in that, before the communication scene modeling step or the deep reinforcement learning solving step, the method further comprises an environmental parameter initialization step: setting the environment parameters of the system communication scene and determining the relevant information of the ground users, the base station, the jammer and the unmanned aerial vehicle RIS, comprising: determining the initial positions of the base station, the jammer, the unmanned aerial vehicle starting point and the users; setting the number of base station antennas N_t, the number of jammer antennas N_J, the number of reconfigurable intelligent surface reflective elements M and the maximum flight speed of the unmanned aerial vehicle V_max; and setting the number of codebooks used to discretize the base station beamforming vectors and the corresponding number of available phases (omega ULA), as well as the number of codebooks used to discretize the reconfigurable intelligent surface phase shifts and the corresponding number of available phases (qupa).
6. The joint optimization method of claim 2, wherein, in the deep reinforcement learning solving step, initializing and training the proximal policy optimization (PPO) model comprises: step S1, constructing a PPO framework comprising an Actor network and a Critic network, wherein, for the hybrid action space formed by discrete and continuous actions, the Actor network outputs a probability distribution over the discrete actions through a Softmax function, and the continuous actions are obtained by sampling from a Gaussian distribution output by the network; step S2, the model outputs an action a_t according to the current state s_t, obtains a reward r_t and the next state s_{t+1} after interacting with the system environment, forms the experience tuple (s_t, a_t, r_t, s_{t+1}), and calculates the advantage function; step S3, when the accumulated experience data reaches a preset amount, performing multiple rounds of iterative training on the accumulated experience data, and, in each round, calculating the policy loss based on the clipping operation and updating the model parameters θ; and step S4, after the parameter update stage is finished, updating the old policy network parameters with the new policy network parameters.
7. A joint optimization device for the system according to claim 1, characterized in that the device comprises the following modules: a communication scene modeling module for establishing, based on the physical composition and the communication links of the system, an optimization problem model that aims to maximize the average sum rate of the system and takes the base station transmit power, the unmanned aerial vehicle kinematics and the constant-modulus phase shift of the reconfigurable intelligent surface reflective elements as constraints; and a deep reinforcement learning solving module for solving the optimization problem model with a proximal policy optimization (PPO) model, which comprises: defining a state space, an action space and a reward function corresponding to the optimization problem model; initializing the PPO model and training it through interaction with the system communication environment, wherein each policy update is based on data collected by the old policy and the magnitude of the update is limited by a trust-region constraint or a clipping operation; and, through training, having the PPO model learn and output an optimized base station beamforming matrix, reconfigurable intelligent surface phase shift matrix and unmanned aerial vehicle trajectory sequence.
8. A computer device comprising a processor and a memory, wherein the memory is for storing executable instructions of the processor, the processor being configured to perform the joint optimization method of any of claims 2-6 via execution of the executable instructions.
9. A computer storage medium, characterized in that the storage medium has stored therein a computer program which, when run, performs the joint optimization method of any one of claims 2-6.
10. A computer program product comprising computer programs/instructions which, when executed by a processor, implement the steps of the joint optimization method of any one of claims 2-6.
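
The objective of claim 3 and the clipped policy loss of step S3 in claim 6 can be illustrated with a minimal NumPy sketch. It is not the applicant's implementation: the function names, argument names and array shapes are hypothetical, the per-slot rate is assumed to take the standard form of the sum over users of log2(1 + SINR), and the jammer is simplified to one scalar channel gain per user rather than a multi-antenna link that may also reflect off the RIS.

```python
import numpy as np


def sum_rate(H_br, h_ru, h_ju, W, theta, p_jam, noise_power):
    """Sum rate (bit/s/Hz) of K users in one time slot (illustrative only).

    H_br  : (M, Nt) base-station-to-RIS channel (hypothetical name)
    h_ru  : (K, M)  RIS-to-user channels, one row per user
    h_ju  : (K,)    jammer-to-user channel gain per user (simplifying assumption)
    W     : (Nt, K) beamforming matrix, column k serves user k
    theta : (M,)    RIS phase shifts in radians
    """
    phi = np.exp(1j * theta)               # constant-modulus reflection coefficients
    h_eff = (h_ru * phi) @ H_br            # (K, Nt) cascaded BS-RIS-user channels
    gains = np.abs(h_eff @ W) ** 2         # gains[k, j]: power of beam j at user k
    signal = np.diag(gains)                # desired-beam power per user
    multi_user = gains.sum(axis=1) - signal
    jamming = p_jam * np.abs(h_ju) ** 2
    sinr = signal / (multi_user + jamming + noise_power)
    return float(np.sum(np.log2(1.0 + sinr)))


def ppo_clip_loss(logp_new, logp_old, advantages, clip_eps=0.2):
    """Clipped PPO surrogate loss (to be minimized) for a batch of samples."""
    ratio = np.exp(logp_new - logp_old)    # new-policy / old-policy probability ratio
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    # The element-wise minimum removes any incentive to move the ratio
    # outside the interval [1 - clip_eps, 1 + clip_eps].
    return float(-np.mean(np.minimum(unclipped, clipped)))
```

Under these assumptions, averaging `sum_rate` over the time slots would give the average sum rate that the method maximizes, and `ppo_clip_loss` is the quantity minimized in each training round, with `clip_eps` (an assumed default of 0.2) limiting how far the new policy moves from the old one. For the hybrid action space of claim 4, the log-probability of a sample would be the sum of the discrete (Softmax) and continuous (Gaussian) log-probabilities.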

Description

Unmanned aerial vehicle RIS-assisted MU-MISO system for resisting malicious interference and joint optimization method thereof

Technical Field

The invention relates to the technical field of MU-MISO communication systems, and in particular to an unmanned aerial vehicle RIS-assisted MU-MISO system for malicious-interference scenarios and to the joint optimization of base station precoding (beamforming), RIS passive phase shifts and the unmanned aerial vehicle trajectory.

Background

In the field of wireless communication, Reconfigurable Intelligent Surface (RIS) technology has become a key means of improving spectral efficiency by dynamically controlling the electromagnetic propagation environment. A conventional fixed-location RIS deployed on the ground or on a building enhances signal coverage but lacks flexibility and adapts poorly to dynamic communication scenarios. In recent years, advances in unmanned aerial vehicle technology have given rise to the concept of the airborne reconfigurable intelligent surface (ARIS): an RIS carried by an unmanned aerial vehicle, whose high mobility enables on-demand coverage. The ARIS can adjust its position dynamically, avoid ground obstacles, and establish and maintain line-of-sight links, thereby extending the service range; it is particularly suited to emergency communication and blind-area enhancement scenarios. However, the prior art focuses mainly on coverage enhancement with the ARIS and has not yet effectively solved the communication problem in a malicious-interference environment. The openness of the wireless environment makes users vulnerable to non-cooperative jammers, which reduces the signal-to-interference-plus-noise ratio.
Although the ARIS has reconfigurable control capability and can reshape the interference channel, existing schemes have the following shortcomings: interference signals cannot be effectively suppressed while blind areas are covered, so the improvement in the system (average) sum rate is limited; the lack of a collaborative design of the base station, the RIS and the unmanned aerial vehicle trajectory leads to low resource utilization efficiency; and no solution is dedicated to the above scenario, so the requirements of high-reliability communication (such as military or emergency rescue) are difficult to meet. There is therefore an urgent need in the art for a system solution that exploits the flexibility of the unmanned aerial vehicle RIS to achieve the dual goals of blind-area coverage and interference suppression.

Disclosure of Invention

The invention provides an unmanned aerial vehicle RIS-assisted MU-MISO system and a joint optimization method thereof, which solve the following problems of the prior art: interference signals cannot be effectively suppressed while blind areas are covered, so the improvement in the system sum rate is limited; and the lack of a collaborative design of the base station, the RIS and the unmanned aerial vehicle trajectory leads to low resource utilization efficiency and makes high-reliability communication requirements (such as military or emergency rescue) difficult to meet.
The invention relates to an unmanned aerial vehicle RIS-assisted MU-MISO system for resisting malicious interference, which comprises: a base station for providing signal transmission service to a plurality of users through communication links; and an unmanned aerial vehicle carrying a reconfigurable intelligent surface having a plurality of reflective elements, for flying in a mission area and assisting communication between the base station and the plurality of users. In the system, the active beamforming of the base station, the passive phase shift matrix of the reconfigurable intelligent surface and the three-dimensional flight trajectory of the unmanned aerial vehicle are jointly optimized, so that, during its flight from a starting point to an end point, the unmanned aerial vehicle suppresses interference from an external malicious jammer while providing blind-area coverage for the users, thereby maximizing the average sum rate of the system. The invention also provides a joint optimization method for the system, which comprises the following steps: a communication scene modeling step, namely establishing, based on the physical composition and the communication links of the system, an optimization problem model that aims to maximize the average sum rate of the system and takes the base station transmit power, the unmanned aerial vehicle kinematics and the constant-modulus phase shift of the reconfigurable intelligent surface reflective elements as constraints; and a deep reinforcement learning solving step, namely solving the optimization problem model with a proximal policy optimization (PPO) model, comprising: defining a state space, an action space and a reward function corresponding to the optimization problem model; initializing and training a near