CN-121984552-A - Multi-mode millimeter wave beam prediction method based on multi-task learning


Abstract

The invention relates to a multi-mode millimeter wave beam prediction method based on multi-task learning, and belongs to the technical field of millimeter wave communication. The method comprises: converting the beam prediction task into a deep learning optimization task based on a geometric channel model in a dynamic communication scene; preprocessing the images, three-dimensional point clouds and user motion information acquired by a base station, which covers region-of-interest extraction, point cloud downsampling and spatial coordinate conversion; adaptively extracting, weighting and fusing the multi-modal features through a cross-modal gating fusion module; constructing a multi-task learning framework that jointly trains a beam prediction main task with blocking prediction and reflection intensity prediction auxiliary tasks; and correcting the optimal beam probability with the physical constraint information output by the auxiliary tasks before selecting the optimal beam. The invention significantly reduces beam training overhead and communication delay, overcomes the environmental limitations of single-modal sensing, and improves both beam prediction accuracy and robustness in complex dynamic scenes.
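
The gated weighting-and-fusion step can be illustrated with a minimal Python sketch. It follows what the claims state in words (projected features are spliced, passed through a gating network with learnable parameters, and normalized with a softmax so the weights are non-negative and sum to 1). Everything else is an assumption made for illustration: the names `softmax`, `gated_fusion`, `W` and `b`, the single linear gating layer, and the fusion rule as a weighted sum of the three projected features, since the original formulas are not reproduced in this text.

```python
import math


def softmax(scores):
    """Normalized exponential: outputs are non-negative and sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gated_fusion(f_v, f_p, f_m, W, b):
    """Fuse three projected feature vectors of equal length d.

    W (3 rows of length 3d) and b (length 3) stand in for the learnable
    gating parameters; these shapes are hypothetical.
    Returns (g, z): the gating weights and the fused feature.
    """
    spliced = f_v + f_p + f_m  # list concatenation plays the role of splicing
    scores = [sum(w * x for w, x in zip(row, spliced)) + bias
              for row, bias in zip(W, b)]
    g = softmax(scores)
    # Assumed fusion rule: weighted sum of the three modalities.
    z = [g[0] * v + g[1] * p + g[2] * m for v, p, m in zip(f_v, f_p, f_m)]
    return g, z


# Toy projected features (visual, point cloud, motion) with d = 2.
f_v, f_p, f_m = [1.0, 0.0], [0.0, 1.0], [0.5, 0.5]
W = [[0.0] * 6 for _ in range(3)]

# With zero weights and zero bias every modality gets the same weight.
g_equal, z_equal = gated_fusion(f_v, f_p, f_m, W, [0.0, 0.0, 0.0])

# A larger bias on the first gate shifts weight toward the visual feature.
g_visual, z_visual = gated_fusion(f_v, f_p, f_m, W, [5.0, 0.0, 0.0])
print(g_equal, z_equal)
print(g_visual, z_visual)
```

In a trained model the gate scores would depend on the input features, so a degraded modality (for example, a camera under low illumination) could receive a smaller weight; the fixed biases above only mimic that effect.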

Inventors

WANG HENG
HUANG ZHIBIN
XIE XIN

Assignees

Chongqing University of Posts and Telecommunications (重庆邮电大学)

Dates

Publication Date: 2026-05-05
Application Date: 2026-01-26

Claims (10)

1. A multi-mode millimeter wave beam prediction method based on multi-task learning, applied to a millimeter wave communication system comprising a base station and user equipment, wherein the base station is provided with a camera, a three-dimensional point cloud sensor and a positioning device and adopts a predefined beamforming codebook, characterized in that the method comprises: modeling the beam prediction task as a deep learning optimization task based on a geometric channel model in a dynamic communication scene; collecting multi-modal sensing data S = {I, P, M}, wherein I is image data, P is three-dimensional point cloud data and M is motion information; preprocessing the image data, the three-dimensional point cloud data and the motion information respectively and extracting features to obtain a visual feature vector, a point cloud geometric feature vector and a motion feature vector; and outputting a prediction result for the candidate beams based on a deep learning model and determining an optimal beam index from the beamforming codebook; wherein the method further comprises: projecting the visual feature vector, the point cloud geometric feature vector and the motion feature vector respectively into a common feature space to obtain three projected features; calculating normalized gating weights from the projected features through a cross-modal gating fusion network, and generating a multi-modal fusion feature based on the gating weights [formula not reproduced in this text]; based on the multi-modal fusion feature, constructing a multi-task learning framework that outputs in parallel a candidate beam probability distribution for the beam prediction main task, a blocking probability vector for the blocking prediction auxiliary task, and a global reflection intensity and a directional reflection intensity vector for the reflection intensity prediction auxiliary task; based on the blocking probability vector and the reflection intensity prediction results, performing physical constraint correction on the candidate beam probability distribution, in which the candidate beams are filtered according to a blocking probability threshold and a reflection intensity threshold to obtain a corrected intermediate probability, and the intermediate probability is then enhanced based on a reflection gain coefficient to obtain a corrected probability distribution [formula not reproduced in this text]; and selecting, according to the corrected probability distribution, the index with the highest probability as the optimal beam index.
2. The multi-mode millimeter wave beam prediction method based on multi-task learning according to claim 1, wherein the method comprises at least the following steps: S1, converting the beam prediction task into a deep learning optimization task based on a geometric channel model in a dynamic communication scene; S2, acquiring multi-modal data S = {I, P, M}, wherein I is image data, P is three-dimensional point cloud data and M is motion information, and preprocessing the data and extracting features to obtain a visual feature vector, a point cloud geometric feature vector and a motion feature vector, wherein the preprocessing comprises extracting a region of interest from the image data, downsampling the three-dimensional point cloud data, and performing spatial coordinate conversion on the motion information; S3, inputting the preprocessed multi-modal data into a cross-modal gating fusion module, which extracts the features of each modality, calculates an adaptive weight for each modality's features using a gating mechanism, and performs weighted fusion of the features based on the adaptive weights to obtain a multi-modal fusion feature; S4, constructing a multi-task learning framework and inputting the multi-modal fusion feature into it to jointly process a main task and auxiliary tasks, wherein the main task is beam prediction and outputs the probability distribution of the candidate beams, and the auxiliary tasks comprise blocking prediction and reflection intensity prediction and output physical constraint information; S5, designing a physical constraint correction mechanism that corrects the probability distribution of the candidate beams output by the main task using the physical constraint information output by the auxiliary tasks, and selecting the index with the maximum probability in the corrected probability distribution as the optimal beam index.
3. The multi-mode millimeter wave beam prediction method based on multi-task learning according to claim 1, wherein, in the geometric channel model, the received signal of the user equipment on the k-th subcarrier satisfies: [formula not reproduced in this text], wherein the quantities in the formula are the downlink channel vector; the beamforming vector; the transmitted signal, which satisfies an average transmit power constraint; and additive noise following a complex Gaussian distribution.
4. The multi-mode millimeter wave beam prediction method based on multi-task learning according to claim 3, wherein the downlink channel vector satisfies: [formula not reproduced in this text], wherein the quantities in the formula are the total number of propagation paths; the complex attenuation coefficient of each path; the azimuth and elevation angles of arrival; the azimuth and elevation angles of departure; and the array steering vectors of the base station and the user equipment, whose form, for a uniform linear array, is determined by the antenna spacing and the signal wavelength.
5. The multi-mode millimeter wave beam prediction method based on multi-task learning according to claim 1, wherein the preprocessing of the image data comprises identifying target users and potential obstructions with a target detection algorithm, cropping a region of interest according to the bounding boxes, resizing the region of interest to a uniform size, and inputting it into a deep residual network to obtain the visual feature vector.
6. The multi-mode millimeter wave beam prediction method based on multi-task learning according to claim 1, wherein the preprocessing of the three-dimensional point cloud data comprises standardizing the three-dimensional spatial coordinates by their standard deviation, downsampling with a voxel grid, normalizing the reflection intensity, and inputting the result into a hierarchical point cloud feature extraction network to obtain the point cloud geometric feature vector.
7. The multi-mode millimeter wave beam prediction method based on multi-task learning according to claim 1, wherein the preprocessing of the motion information comprises converting positioning data from the world geodetic coordinate system to a local horizontal (station-centered) coordinate system with the base station as the origin, calculating the relative displacement difference, concatenating the relative displacement difference with the instantaneous velocity to form a motion vector, and inputting the motion vector into a multi-layer perceptron to obtain the motion feature vector.
8. The multi-mode millimeter wave beam prediction method based on multi-task learning according to claim 1, wherein the gating weight is obtained by: [formula not reproduced in this text], wherein the formula uses a splicing (concatenation) operation on the projected features, the learnable parameters of the gating network, and a normalized exponential (softmax) activation function; the elements of the gating weight g sum to 1 and each element is non-negative.
9. The multi-mode millimeter wave beam prediction method based on multi-task learning according to claim 1, wherein the blocking filtering satisfies: [formula not reproduced in this text].
10. The multi-mode millimeter wave beam prediction method based on multi-task learning according to claim 1, wherein training uses a multi-task joint loss function: [formulas not reproduced in this text], wherein the total loss function combines a cross-entropy loss for the beam prediction task, a binary cross-entropy loss for the blocking prediction task, and a mean square error loss for the reflection intensity prediction task; the two dynamic weight coefficients vary with the training round and are each defined by an initial weight value, a growth rate coefficient and the total number of training rounds.
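
The physical constraint correction of claims 1 and 9 can be sketched in Python as follows. The claims state only that candidate beams are filtered using a blocking probability threshold and a reflection intensity threshold, enhanced with a reflection gain coefficient, and that the highest-probability index is selected. The concrete rules below are assumptions made for illustration, as are all names (`correct_beam_probabilities`, `tau_b`, `tau_r`, `alpha`): a beam is dropped when it is likely blocked and has no usable reflection, surviving beams are scaled by `1 + alpha * reflection`, one blocking value is assumed per beam, and the result is renormalized.

```python
def correct_beam_probabilities(p, block_prob, refl, tau_b, tau_r, alpha):
    """Apply an assumed filter-then-enhance correction to beam probabilities.

    p          : candidate beam probabilities from the main task
    block_prob : blocking probability per beam (assumed per-beam)
    refl       : directional reflection intensity per beam
    tau_b      : blocking probability threshold
    tau_r      : reflection intensity threshold
    alpha      : reflection gain coefficient
    Returns (corrected distribution, index of the selected beam).
    """
    # Filtering (assumed rule): drop beams that are likely blocked and
    # whose reflection is too weak to offer an alternative path.
    filtered = [
        0.0 if (b >= tau_b and r < tau_r) else pi
        for pi, b, r in zip(p, block_prob, refl)
    ]
    if sum(filtered) == 0.0:
        # Assumed fallback: if every beam was dropped, keep the original.
        filtered = list(p)

    # Enhancement (assumed rule): boost each survivor by its reflection.
    enhanced = [q * (1.0 + alpha * r) for q, r in zip(filtered, refl)]

    # Renormalize so the corrected values form a probability distribution.
    total = sum(enhanced)
    corrected = [e / total for e in enhanced]

    best = max(range(len(corrected)), key=corrected.__getitem__)
    return corrected, best


# Toy example with three candidate beams.
corrected, best = correct_beam_probabilities(
    p=[0.5, 0.3, 0.2],
    block_prob=[0.9, 0.1, 0.2],
    refl=[0.0, 0.1, 0.9],
    tau_b=0.5, tau_r=0.5, alpha=1.0,
)
print(corrected, best)
```

In this toy case the beam with the highest raw probability is removed because it is likely blocked and weakly reflected, and the strongly reflected third beam is selected instead. The global reflection intensity output named in claim 1 is not used in this sketch.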

Description

Multi-mode millimeter wave beam prediction method based on multi-task learning

Technical Field

The invention belongs to the technical field of millimeter wave communication, and relates to a multi-mode millimeter wave beam prediction method based on multi-task learning.

Background

Millimeter wave communication has become a key technology for ultra-wideband, low-delay transmission in 5G and 6G mobile communication systems because of its abundant spectrum resources and high transmission rate. However, millimeter wave signals have extremely short wavelengths and are highly sensitive to environmental dynamics. In complex communication scenarios involving high-speed mobile terminals and dynamic obstacles in particular, relative motion between the transceivers and random occlusion in the environment readily cause line-of-sight link interruption and severe signal attenuation. To overcome this challenge, using environment sensing information such as vision, lidar and location to assist millimeter wave beam prediction has become a research hotspot in the communications field. Existing methods introduce deep learning algorithms and predict the optimal beam direction by mining the mapping between environment sensing information and wireless channel characteristics, thereby aligning beams quickly without complicated signaling interaction and offering a potential solution to the link stability problem in highly dynamic scenes.

Although sensing-assisted beam prediction has advanced significantly, substantial challenges remain for deployment in practical, dynamic and complex environments. First, regarding multi-modal fusion, most prior-art schemes, whether based on sequence modeling or contrastive learning, rely on single-modal data. A single sensor, however, has inherent limitations in specific environments: a camera tends to fail under low illumination or severe weather, while lidar provides geometric information but lacks semantic texture, so the model is not robust enough in complex environments. Second, although some methods attempt to fuse multi-modal data, existing fusion strategies mostly use simple feature concatenation and static weighting, and cannot adaptively adjust the contribution of each modality as the scene changes. When the environment changes dynamically, for example when a user moves from an open area into an occluded area, existing multi-modal schemes cannot automatically adjust the modality weights, and noise from a failed modality may even reduce prediction accuracy. Finally, most existing prediction models are purely data-driven black-box networks. They ignore the physical constraints of wireless signal propagation, such as signal reflection intensity, diffraction loss and blocking probability, so they struggle to cope with abrupt channel changes, and their interpretability and generalization in real communication systems are limited.

Therefore, how to design a beam prediction method that adaptively fuses multi-modal sensing information and effectively incorporates the physical constraints of communication, so as to achieve high-precision prediction in dynamic complex scenes, has become a technical problem to be solved in the field.
Disclosure of Invention

In view of the above, the present invention aims to provide a multi-mode millimeter wave beam prediction method based on multi-task learning. To address the limited perception of a single modality and the lack of physical constraints in dynamic communication scenes, the method adopts cross-modal gating fusion and a multi-task collaborative training strategy, and corrects the prediction result using physical constraint information while adaptively fusing multi-modal features, thereby providing high-precision, highly robust beam prediction in complex communication environments.

To achieve the above purpose, the present invention provides the following technical solution:

A multi-mode millimeter wave beam prediction method based on multi-task learning is applied to a millimeter wave communication system comprising a base station and user equipment, wherein the base station is provided with a camera, a three-dimensional point cloud sensor and a positioning device, and adopts a predefined beamforming codebook. The method comprises the following steps: modeling the beam prediction task as a deep learning optimization task based on a geometric channel model in a dynamic communication scene; collecting multi-modal sensing data S = {I, P, M}, wherein I is image data, P is three-dimensional point cloud data and M is motion information; respectively preprocessing the image data, the three-dimensional point cloud data and the