CN-121985420-A - Spectrum resource allocation algorithm considering reliable transmission in cognitive Internet of vehicles


Abstract

The invention relates to the technical field of cognitive Internet of Vehicles communication, and in particular to a spectrum resource allocation algorithm that considers reliable transmission. The scheme first collects historical channel transmission data, including reliability indexes such as bandwidth, packet loss rate, bit error rate and transmission delay, and constructs a multidimensional channel reliability feature sequence. A lightweight Transformer prediction model is then built to predict the channel reliability state over a plurality of future time slots: the temporal correlation of the channel is modeled by a time-aware multi-layer perceptron, and the predicted future reliability indexes are obtained through a dimension-separable output structure. The prediction results are normalized and fused to construct a channel reliability index. On this basis, and in combination with the network load state, an adaptive spectrum access mechanism based on reinforcement learning is introduced, and the vehicle terminal makes adaptive spectrum access decisions by dynamically selecting the overlay or the underlay access mode. The invention achieves collaborative optimization of channel reliability prediction and spectrum access strategy in high-mobility Internet of Vehicles environments, improves system throughput and spectrum utilization, and reduces the packet loss rate, thereby improving the reliable transmission performance of Internet of Vehicles communication.

Inventors

MA BIN
DING WENYAN

Assignees

Chongqing University of Posts and Telecommunications (重庆邮电大学)

Dates

Publication Date: 20260505
Application Date: 20260305

Claims (5)

1. A spectrum resource allocation algorithm considering reliable transmission in the cognitive Internet of Vehicles, characterized by comprising the following steps:
101. collecting channel reliability quadruple data of each channel over a plurality of historical time slots, wherein the reliability quadruple comprises bandwidth, packet loss rate, bit error rate and transmission delay;
102. constructing a lightweight Transformer channel reliability prediction model, taking the reliability quadruple of step 101 as the model input, outputting channel reliability prediction results for a plurality of future time slots, normalizing the predicted multidimensional indexes, and obtaining the reliability index of each channel through multi-layer mapping fusion;
103. proposing an adaptive spectrum access strategy based on reinforcement learning according to the current network load state and the reliability index obtained in step 102;
104. according to the strategy of step 103, the vehicle terminal dynamically switches its access strategy according to the network load, selects a channel to access in a hybrid overlay and underlay manner, and updates the Q-value function under a differentiated reward function, so as to adaptively optimize the access strategy.
2. The spectrum resource allocation algorithm considering reliable transmission in the cognitive Internet of Vehicles according to step 101 of claim 1, wherein the channel reliability quadruple is defined as follows:
201. The bandwidth is the difference between the upper and lower frequency limits available to the vehicle terminal in the current time slot, and can be expressed as $B_{i,k}(t) = f^{\max}_{i,k}(t) - f^{\min}_{i,k}(t)$, where $f^{\max}_{i,k}(t)$ and $f^{\min}_{i,k}(t)$ denote the highest and lowest frequencies of channel $k$ accessed by vehicle terminal $i$ at time $t$;
202. The packet loss rate is the complement of the ratio of the number of successfully transmitted data packets to the total number of transmitted data packets, and can be expressed as $P_{i,k}(t) = 1 - N^{\mathrm{succ}}_{i,k}(t) / N^{\mathrm{tot}}_{i,k}(t)$, where $N^{\mathrm{succ}}_{i,k}(t)$ and $N^{\mathrm{tot}}_{i,k}(t)$ denote the number of data packets successfully transmitted and the total number of data packets transmitted by vehicle terminal $i$ on channel $k$ at time $t$;
203. The bit error rate is the ratio of the number of erroneous bits in the successfully received data packets to the total number of bits in the successfully received data packets, and can be expressed as $E_{i,k}(t) = \sum_{n=1}^{M_{i,k}(t)} e_n / \sum_{n=1}^{M_{i,k}(t)} b_n$, where $M_{i,k}(t)$ denotes the total number of data packets that vehicle terminal $i$ has successfully transmitted on channel $k$ and that have been acknowledged by the receiver, $e_n$ denotes the number of erroneous bits in the $n$-th successfully received data packet, and $b_n$ denotes the total number of bits in the $n$-th successfully transmitted data packet;
204. The transmission delay is the arithmetic mean of the end-to-end acknowledgement delays of the successfully transmitted data packets, and can be expressed as $D_{i,k}(t) = \frac{1}{M_{i,k}(t)} \sum_{n=1}^{M_{i,k}(t)} \left( t^{\mathrm{ack}}_{n} - t^{\mathrm{start}}_{n} \right)$, where $M_{i,k}(t)$ denotes the total number of data packets that vehicle terminal $i$ has successfully transmitted on channel $k$ at time $t$ and that have been acknowledged by the receiver, and $t^{\mathrm{start}}_{n}$ and $t^{\mathrm{ack}}_{n}$ denote the start time of the successful transmission of the $n$-th data packet and the arrival time of its acknowledgement packet.
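The four definitions in claim 2 can be illustrated with a minimal Python sketch. The `Packet` record, the function name `reliability_quadruple` and its parameter names are hypothetical and are not part of the claims; the sketch also assumes that at least one packet was sent and at least one was acknowledged.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Packet:
    """One successfully transmitted and acknowledged packet (hypothetical layout)."""
    error_bits: int   # erroneous bits found in the packet
    total_bits: int   # total bits in the packet
    t_start: float    # start time of the successful transmission
    t_ack: float      # arrival time of the acknowledgement


def reliability_quadruple(f_max: float, f_min: float,
                          n_success: int, n_total: int,
                          acked: List[Packet]) -> Tuple[float, float, float, float]:
    """Return (bandwidth, packet loss rate, bit error rate, mean delay) for one slot."""
    if n_total <= 0 or not acked:
        raise ValueError("sketch assumes at least one sent and one acknowledged packet")
    bandwidth = f_max - f_min                       # step 201
    packet_loss_rate = 1.0 - n_success / n_total    # step 202
    bit_error_rate = (sum(p.error_bits for p in acked)
                      / sum(p.total_bits for p in acked))            # step 203
    delay = sum(p.t_ack - p.t_start for p in acked) / len(acked)     # step 204
    return bandwidth, packet_loss_rate, bit_error_rate, delay


acked_packets = [Packet(1, 1000, 0.0, 0.01), Packet(3, 1000, 1.0, 1.03)]
quad = reliability_quadruple(20.0, 10.0, 8, 10, acked_packets)
```

With these illustrative inputs, the quadruple is approximately (10.0, 0.2, 0.002, 0.02).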
3. The spectrum resource allocation algorithm considering reliable transmission in the cognitive Internet of Vehicles according to step 102 of claim 1, wherein the channel reliability prediction model is designed as follows:
301. Taking the quadruple data of claim 2 as input, the channel reliability data of channel $k$ over the past $T$ time slots form a vector that is input to the reliability prediction model, where each element of the vector can be expressed as $x_{i,k}(t) = [B_{i,k}(t), P_{i,k}(t), E_{i,k}(t), D_{i,k}(t)]$;
302. Based on the channel reliability data of step 301, the data of a plurality of channels are assembled into a matrix $X$, which serves as the input matrix of the prediction model. To unify the scales of the different reliability indexes while strengthening the expressive capacity of the model, the historical observation sequences of all channels are first normalized, and the matrix $X$ is then compressed by a linear embedding; all subsequent modules operate on the compressed matrix $Z$;
303. To effectively describe how the channel reliability indexes evolve over time, and to address the high computational complexity encountered in the Internet of Vehicles environment, a time-aware multi-layer perceptron replaces the computationally intensive attention mechanism. The encoder is composed of $L$ layers; each encoder layer consists mainly of a perceptron and a feed-forward network, and operates in two stages: time modeling, followed by nonlinear feature mapping and fusion;
304. The two stages of the encoder in step 303 are as follows. The first stage captures the dependency between different time instants by explicit feature mixing along the time dimension: the input feature matrix is transposed so that the time dimension becomes the direct object of the linear transformation, and two fully connected mappings with a nonlinear activation function are then applied along this dimension to obtain a time-mixed representation. With $Z^{(0)} = Z$, the computation of the $l$-th layer can be expressed as $H^{(l)} = \left( \sigma\left( (Z^{(l-1)})^{\top} W_1 \right) W_2 \right)^{\top}$, where $W_1$ and $W_2$ are trainable time mapping matrices, $\sigma$ denotes the nonlinear activation function ReLU, and $\top$ denotes the transpose operation. The second stage fuses the output of the perceptron with the original input through a residual connection and applies layer normalization to improve numerical stability, giving the intermediate features $\hat{Z}^{(l)}$; the intermediate features are then fed to the feed-forward network for slot-by-slot nonlinear feature mapping, followed again by a residual connection and layer normalization. The update can be expressed as $\hat{Z}^{(l)} = \mathrm{LN}\left( Z^{(l-1)} + H^{(l)} \right)$ and $Z^{(l)} = \mathrm{LN}\left( \hat{Z}^{(l)} + \mathrm{FFN}(\hat{Z}^{(l)}) \right)$, where $\mathrm{FFN}$ denotes the feed-forward network and $\mathrm{LN}$ denotes layer normalization;
305. After the historical channel reliability sequence has been encoded in step 304, a dimension-separable prediction output head is designed to balance prediction accuracy and model complexity. This structure decomposes the multi-slot prediction task into two relatively independent stages, one along the time dimension and one along the channel feature dimension. Along the time dimension, the transformation of the channel state can be formalized as $G = W_{\tau} Z^{(L)}$, where $W_{\tau}$ denotes a trainable time mapping matrix and $G$ denotes the latent feature representation of the $F$ future time slots within the prediction window. In the channel dimension mapping stage, the model further maps the latent feature representation $G$ of the future time slots to specific channel reliability indexes; the mapping to the channel reliability space can be formalized as $\hat{Y} = G W_{c}$, where $W_{c}$ is the channel dimension mapping matrix. This yields the reliability prediction results $\hat{Y}$ of the four channel reliability indexes (bandwidth, packet loss rate, bit error rate and transmission delay) for the $F$ future time slots;
306. The multidimensional reliability prediction results of each channel over the plurality of future time slots are normalized and fused. The core of this step is a two-layer Sigmoid mapping mechanism: the first-layer Sigmoid function maps the prediction results of the future time slots of each channel to a reliability sequence, and the second-layer Sigmoid function performs a weighted fusion of that reliability sequence, in which nearer time slots receive larger weights than farther time slots. Finally, the reliability index sequence of all channels is output: $\rho = [\rho_1, \rho_2, \dots, \rho_K]$.
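The encoder layer, the dimension-separable output head and the two-layer Sigmoid fusion of claim 3 can be sketched in plain Python as follows. This is only an illustrative sketch under stated assumptions: the weights are random rather than trained, the matrix shapes and the hidden widths are chosen for the example, and the fusion score (bandwidth minus the three loss-type indexes) and the geometric slot weights are assumptions, since the claim only requires that nearer slots weigh more. All function names are hypothetical.

```python
import math
import random

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def relu(a):
    return [[max(0.0, v) for v in row] for row in a]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def layer_norm(a, eps=1e-5):
    out = []
    for row in a:
        mean = sum(row) / len(row)
        var = sum((v - mean) ** 2 for v in row) / len(row)
        out.append([(v - mean) / math.sqrt(var + eps) for v in row])
    return out

def rand_matrix(rows, cols, rng):
    # Random stand-in for trained weights (assumption for the sketch).
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def encoder_layer(z, w1, w2, f1, f2):
    """z is T x d. Stage 1 mixes along time, stage 2 applies a per-slot FFN."""
    # Transpose to d x T so the two linear maps act on the time axis.
    h = transpose(matmul(relu(matmul(transpose(z), w1)), w2))
    z_mid = layer_norm(add(z, h))                 # residual + layer norm
    ffn = matmul(relu(matmul(z_mid, f1)), f2)     # slot-by-slot feature mapping
    return layer_norm(add(z_mid, ffn))            # residual + layer norm

def separable_head(z, w_time, w_feat):
    """Map T history slots to F future slots, then d features to 4 indexes."""
    return matmul(matmul(w_time, z), w_feat)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def reliability_index(pred, decay=0.5):
    """pred: F rows of [bandwidth, loss rate, error rate, delay], normalised."""
    # First Sigmoid: per-slot reliability score (assumed sign convention).
    scores = [sigmoid(b - p - e - d) for b, p, e, d in pred]
    # Nearer slots get larger weights (assumed geometric decay).
    weights = [decay ** f for f in range(len(scores))]
    fused = sum(w * s for w, s in zip(weights, scores)) / sum(weights)
    return sigmoid(fused)                         # second Sigmoid

rng = random.Random(0)
T, D, F = 6, 8, 3                                 # history length, embed width, horizon
history = [[rng.random() for _ in range(4)] for _ in range(T)]
z = matmul(history, rand_matrix(4, D, rng))       # linear embedding
for _ in range(2):                                # two encoder layers
    z = encoder_layer(z, rand_matrix(T, 2 * T, rng), rand_matrix(2 * T, T, rng),
                      rand_matrix(D, 2 * D, rng), rand_matrix(2 * D, D, rng))
pred = separable_head(z, rand_matrix(F, T, rng), rand_matrix(D, 4, rng))
rho = reliability_index(pred)
```

Because the two linear maps in the first stage act only on the time axis, the cost of a layer grows linearly in the number of features, which is the motivation the claim gives for replacing attention.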
4. The spectrum resource allocation algorithm considering reliable transmission in the cognitive Internet of Vehicles according to step 103 of claim 1, wherein the network load state detection model is as follows: the system state is divided into a high-load case and a low-load case using an existing division model. The division is determined from $D_{\mathrm{req}}$ and $D_{\max}$, where $D_{\mathrm{req}}$ denotes the total amount of data requested by all vehicles and $D_{\max}$ denotes the maximum amount of data that can be transmitted in the current network scenario: if the requested amount $D_{\mathrm{req}}$ is high relative to $D_{\max}$ according to the division criterion, the network is in the high-load state; otherwise it is in the low-load state. On this basis, the vehicle terminal is assumed to be able to adjust its own transmission power autonomously, which enables a dynamic switching mechanism between the underlay access mode and the overlay access mode.
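A minimal Python sketch of the load classification in claim 4 is given below. The exact division criterion is not stated in this text, so the sketch assumes a simple ratio test of requested data against transmittable data with a hypothetical `threshold` parameter; `classify_load` and `allowed_modes` are illustrative names only.

```python
from enum import Enum


class Load(Enum):
    LOW = "low"
    HIGH = "high"


def classify_load(requested: float, capacity: float, threshold: float = 1.0) -> Load:
    """Classify the network load from total requested data and transmittable data.

    Assumption: the load is high when requested / capacity exceeds `threshold`.
    """
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return Load.HIGH if requested / capacity > threshold else Load.LOW


def allowed_modes(load: Load) -> tuple:
    """Overlay only under low load; overlay and underlay under high load."""
    return ("overlay",) if load is Load.LOW else ("overlay", "underlay")


low = classify_load(requested=40.0, capacity=100.0)
high = classify_load(requested=150.0, capacity=100.0)
```

Keeping the threshold as a parameter lets the same function express either a strict capacity comparison or a more conservative switching point.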
5. The spectrum resource allocation algorithm considering reliable transmission in the cognitive Internet of Vehicles according to step 104 of claim 1, wherein the adaptive access strategy based on reinforcement learning comprises:
501. Constructing the state space: comprehensively considering the target requirements of the vehicle terminal and of the system, the state perceived by vehicle terminal $i$ at time $t$ is $s_i(t) = \{C_i(t), R_i(t), \rho\}$, where $C_i(t)$ and $R_i(t)$ describe the environment-awareness information of vehicle terminal $i$ at time $t$. Specifically, the channel state vector $C_i(t) = [c_{i,1}(t), \dots, c_{i,K}(t)]$ gives the real-time sensing information for channels $1$ to $K$ detected by vehicle terminal $i$ at time $t$: one value of $c_{i,k}(t)$ indicates that vehicle terminal $i$ perceives channel $k$ as occupied by other users at time $t$, and the other value indicates that channel $k$ is not occupied by other users and is idle; $R_i(t) = [R_{i,1}(t), \dots, R_{i,K}(t)]$ gives the throughput that vehicle terminal $i$ can obtain by accessing channels $1$ to $K$ at time $t$; and $\rho$ is the reliability index sequence of all channels finally output by the prediction model;
502. Constructing the action space: $a_i(t) = \{k_i(t), m_i(t), p_i(t)\}$, where the physical meaning of each action component is as follows. The component $k_i(t)$ is the number of the channel that vehicle terminal $i$ selects to access at time $t$, with a dedicated value indicating that the vehicle terminal performs no channel access at that time; $m_i(t)$ is the access mode adopted by the vehicle terminal at time $t$, with one value indicating the overlay access mode and the other indicating the underlay access mode; $p_i(t)$ is the transmission power used by the vehicle terminal when accessing the channel, selected from the transmission power levels available to the vehicle terminal, and when $p_i(t) = 0$ the vehicle terminal does not transmit;
503. Constructing the reward function, which distinguishes low-load and high-load conditions. Under low network load, the system has many idle channels and spectrum resources are relatively abundant; the vehicle terminal adopts the overlay access mode and does not affect the normal communication of the primary user (PU). The reward function in the low-load scenario therefore takes system throughput and channel reliability as its main objectives, guiding the vehicle to preferentially select channels with higher reliability and to improve the overall system throughput through reasonable access decisions. If vehicle terminal $i$ successfully accesses a channel and completes data transmission, it obtains a positive reward; if it does not access a channel, or an access conflict occurs, it obtains no reward or a negative reward. The reward $r_i(t)$ obtained by vehicle terminal $i$ at time $t$ is defined piecewise: when the vehicle terminal does not access any channel, the reward is zero, indicating that the system obtains no effective benefit; when the vehicle's channel access collides with other users, a negative reward $-\lambda$ is given as a penalty, where $\lambda$ is a constant; when the vehicle successfully accesses channel $k$ and completes data transmission, the reward is a function of $\rho_k$ and $R_{i,k}(t)$, where $\rho_k$ is the reliability index of channel $k$, describing its current transmission reliability, and $R_{i,k}(t)$ is the transmission rate obtained by vehicle terminal $i$ accessing channel $k$ in time slot $t$, calculated from Shannon's formula.
Under high network load, the idle spectrum resources available in the system are scarce, and a pure overlay access mode can hardly meet the growing transmission demand of vehicle terminals. The underlay access mode is therefore introduced, allowing a vehicle terminal to reuse a channel for transmission while the primary user occupies it. Such reuse, however, exposes the primary user to interference, and excessive interference can even disrupt the primary user's normal communication. The reward function in the high-load scenario must therefore take protection of the primary user's transmission as the primary constraint, and balance throughput and channel reliability on that basis. To describe the impact of underlay access on the primary user, the primary user protection feasibility indicator for vehicle terminal $i$ accessing channel $k$ in time slot $t$ is defined as $\delta_{i,k}(t) = 1$ if $\mathrm{SIR}^{\mathrm{PU}}_{k}(t) \ge \mathrm{SIR}_{\min}$ and $\delta_{i,k}(t) = 0$ otherwise, where $\mathrm{SIR}^{\mathrm{PU}}_{k}(t)$ denotes the signal-to-interference ratio of the primary user on channel $k$ and $\mathrm{SIR}_{\min}$ is the minimum threshold required for normal communication of the current primary user. When $\delta_{i,k}(t) = 1$, the current access behavior does not disrupt the primary user's normal communication; when $\delta_{i,k}(t) = 0$, the access behavior is considered to violate the primary user protection constraint. On this basis, the reward function of vehicle terminal $i$ in the high-load scenario is defined piecewise: when the vehicle terminal does not access a channel, or an access conflict occurs, the reward is the same as in the overlay access mode; when the vehicle terminal violates the primary user protection constraint during underlay access, a larger negative reward is given; when the vehicle terminal successfully completes data transmission while satisfying the primary user protection constraint, an interference penalty term $\Psi_{i,k}(t)$ is introduced on top of the overlay reward term as a soft-constraint guiding mechanism. This design guides the vehicle terminal to actively reduce its interference to the primary user while preserving its own transmission performance, so as to stabilize the high-load access strategy. The interference penalty function adopts a normalized linear form: $\Psi_{i,k}(t) = \beta \, I_{i,k}(t) / I_{\max}$, where $I_{i,k}(t)$ denotes the interference power that the vehicle terminal causes to the primary user on the channel, $I_{\max}$ is the maximum allowed interference threshold, and $\beta$ is the interference penalty coefficient;
504. Updating the access strategy: in Q-learning-based access decision-making, the vehicle terminal continuously updates its access strategy through ongoing interaction with the environment, so as to achieve adaptive spectrum access under different network load conditions. In each time slot $t$, the vehicle terminal selects an access action $a_i(t)$ based on the currently observed state $s_i(t)$, obtains an immediate reward $r_i(t)$ from the environment according to the outcome of the action, and updates the Q value accordingly. The Q-value update can be expressed as $Q(s_i(t), a_i(t)) \leftarrow Q(s_i(t), a_i(t)) + \alpha \left[ r_i(t) + \gamma \max_{a'} Q(s_i(t+1), a') - Q(s_i(t), a_i(t)) \right]$, where $\alpha$ is the learning rate and $\gamma$ is the discount factor.
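The load-dependent reward and the Q-value update of claim 5 can be sketched in Python as follows. Several choices here are assumptions made only for illustration: the success reward is taken as the product of the channel reliability index and the transmission rate, the interference penalty is subtracted from it, and the constants (`collision_penalty`, `violation_penalty`, `beta`, `alpha`, `gamma`) are arbitrary example values. The function names and the outcome labels are hypothetical.

```python
from collections import defaultdict


def interference_penalty(interference: float, i_max: float, beta: float) -> float:
    """Normalized linear penalty: beta * interference / maximum allowed interference."""
    return beta * interference / i_max


def reward(outcome: str, rho: float = 0.0, rate: float = 0.0, underlay: bool = False,
           pu_sir: float = float("inf"), sir_min: float = 0.0,
           interference: float = 0.0, i_max: float = 1.0, beta: float = 0.5,
           collision_penalty: float = 1.0, violation_penalty: float = 5.0) -> float:
    """Reward for one slot; outcome is 'no_access', 'collision' or 'success'."""
    if outcome == "no_access":
        return 0.0                       # no effective benefit
    if outcome == "collision":
        return -collision_penalty        # same in overlay and underlay modes
    base = rho * rate                    # ASSUMED form of the success reward
    if not underlay:
        return base
    if pu_sir < sir_min:                 # primary user protection violated
        return -violation_penalty
    # Soft constraint: penalize interference caused to the primary user.
    return base - interference_penalty(interference, i_max, beta)


def q_update(q, state, action, r, next_state, actions, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update, applied in place to the table q."""
    best_next = max(q[(next_state, a)] for a in actions)
    q[(state, action)] += alpha * (r + gamma * best_next - q[(state, action)])


actions = [(1, "overlay"), (1, "underlay")]
q_table = defaultdict(float)
r_overlay = reward("success", rho=0.8, rate=10.0)
q_update(q_table, "s0", actions[0], 1.0, "s1", actions)
```

In this sketch, the state and action are plain hashable keys; a fuller implementation would encode the channel state vector, throughput information and reliability indexes into the state.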

Description

Spectrum resource allocation algorithm considering reliable transmission in cognitive Internet of vehicles

Technical Field

The invention belongs to the technical field of cognitive wireless communication and Internet of Vehicles communication, and relates to a dynamic spectrum access and spectrum resource allocation mechanism that combines a lightweight time-series prediction model with reinforcement learning.

Background

In a cognitive Internet of Vehicles (CIoV) environment, the effectiveness of spectrum resource allocation depends not only on the access behavior of vehicle terminals but also on the wireless channel state and its time-varying characteristics. As Internet of Vehicles services demand ever higher reliability and lower latency, vehicle terminals face more severe transmission reliability challenges in high-speed scenarios. On the one hand, the scarcity of spectrum resources intensifies access competition. On the other hand, high mobility causes the channel state information (CSI) to change rapidly, producing a pronounced channel aging problem, so that access and allocation strategies based on instantaneous CSI or instantaneous sensing results can hardly remain effective. Improving spectrum utilization from the user-behavior dimension alone may therefore raise throughput and utilization efficiency, but it often fails to meet the reliable communication requirements of CIoV scenarios. For spectrum access and resource allocation in high-mobility scenarios, most existing research relies on instantaneous CSI or spectrum sensing results to make decisions, but the limited timeliness of CSI makes it difficult to adapt in time to a rapidly changing channel environment.
To alleviate the performance degradation caused by channel aging, researchers have gradually introduced channel prediction, which uses historical observations to estimate future CSI and thereby makes access decisions more forward-looking. Conventional methods typically perform recursive estimation based on a stochastic process or a state-space model. The literature [KIM H, KIM S, LEE H, et al. Massive MIMO channel prediction: Kalman filtering vs. machine learning[J]. IEEE Transactions on Communications, 2020, 69(1): 518-528.] models the temporal variation of the channel as a random process and performs recursive prediction within a state-space framework. This method has low computational complexity and good mathematical interpretability, but in a high-speed mobile environment its prediction performance is limited by the accuracy of the prior assumptions. In recent years, deep learning methods have also been introduced into channel prediction to improve prediction accuracy. Some studies model the CSI sequence with a recurrent neural network, which learns from historical CSI how the sequence evolves dynamically over time, laying a foundation for subsequent deep-learning-based channel prediction research. However, recurrent neural networks face bottlenecks such as vanishing gradients when processing long sequences, and their modeling capability remains limited. The Transformer model has also been introduced into wireless channel prediction because of its parallel modeling and long-sequence learning capabilities. The literature [JIANG H, CUI M, NG D W K, et al. Accurate channel prediction based on transformer: Making mobility negligible[J]. IEEE Journal on Selected Areas in Communications, 2022, 40(9): 2717-2732.] applies the Transformer structure to the channel prediction task, converts the prediction process into a parallel sequence mapping problem, and proposes a prediction framework that jointly optimizes the three otherwise independent modules of channel estimation, channel prediction and precoding, so that the required precoding matrix is predicted directly from historically received pilot signals and the channel processing complexity of the system is reduced to a certain extent. Meanwhile, efficient utilization of spectrum resources remains a core goal of spectrum management in the cognitive Internet of Vehicles. Under the combined effect of high mobility and dynamically changing services, spectrum occupancy fluctuates in real time with the network load, and using the overlay or the underlay access mode alone can hardly balance spectrum utilization efficiency and transmission reliability. The literature [LI R, ZHU P. Spectrum allocation strategies based on QoS in cognitive vehicle networks[J]. IEEE Access, 2020, 8: 99922-99933.] divides different operation scenes according to network load states, and provides different channel allocation schemes under the network scenes with different loads by dividing the network environments into two sc