CN-122028157-A - Transmission power control method, wireless communication system, device, medium, and product

Abstract

The application provides a transmission power control method, a wireless communication system, a device, a medium, and a product. The transmission power control method comprises: obtaining a current state set of the wireless communication system; determining, based on the current state set and a multi-reward function, a transmission power adjustment value that maximizes the multi-reward function, wherein the multi-reward function comprises a channel quality reward, a transmission efficiency reward, and an energy consumption efficiency reward; and determining the transmission power of the wireless communication system based on the transmission power adjustment value. The method constructs a triple state set comprising a channel quality state, a transmission efficiency state, and an energy consumption efficiency state, and uses the multi-reward function to quantitatively calculate the return of a power adjustment action in three dimensions. This achieves multi-objective collaborative optimization of the power control problem of the wireless communication system, so that the transmission power adjustment decision can balance the requirements of channel quality, transmission efficiency, and energy consumption efficiency at the same time.

Inventors

  • JIANG QUNJIE

Assignees

  • 上海星思半导体股份有限公司

Dates

Publication Date
2026-05-12
Application Date
2026-02-11

Claims (12)

  1. A transmission power control method, the method comprising: acquiring a current state set of a wireless communication system, wherein the current state set comprises a channel quality state, a transmission efficiency state, and an energy consumption efficiency state; determining, based on the current state set and a multi-reward function, a transmission power adjustment value that maximizes the value of the multi-reward function, wherein the multi-reward function comprises a channel quality reward, a transmission efficiency reward, and an energy consumption efficiency reward, the channel quality reward is used to calculate the channel quality reward obtainable after a power adjustment action is executed in the current channel quality state, the transmission efficiency reward is used to calculate the transmission efficiency reward obtainable after the power adjustment action is executed in the current transmission efficiency state, and the energy consumption efficiency reward is used to calculate the energy consumption efficiency reward obtainable after the power adjustment action is executed in the current energy consumption efficiency state; and determining the transmission power of the wireless communication system based on the transmission power adjustment value.
  2. The transmission power control method according to claim 1, wherein the determining, based on the current state set and the multi-reward function, a transmission power adjustment value that maximizes the value of the multi-reward function comprises: calculating the channel quality reward value, the transmission efficiency reward value, and the energy consumption efficiency reward value corresponding to each candidate power adjustment action; normalizing the channel quality reward value, the transmission efficiency reward value, and the energy consumption efficiency reward value respectively; performing a weighted summation of the normalized channel quality reward value, transmission efficiency reward value, and energy consumption efficiency reward value based on preset weights to obtain a multi-reward function value; and determining the transmission power adjustment value corresponding to the candidate power adjustment action with the largest multi-reward function value among the plurality of candidate power adjustment actions.
  3. The transmission power control method according to claim 2, wherein the performing a weighted summation of the normalized channel quality reward value, transmission efficiency reward value, and energy consumption efficiency reward value based on preset weights to obtain a multi-reward function value comprises: performing a discount calculation on each of the normalized channel quality reward value, transmission efficiency reward value, and energy consumption efficiency reward value based on a preset discount factor; and performing a weighted summation of the discounted channel quality reward value, transmission efficiency reward value, and energy consumption efficiency reward value based on the preset weights to obtain the multi-reward function value.
  4. The transmission power control method according to any one of claims 1 to 3, wherein the determining, based on the current state set and the multi-reward function, a transmission power adjustment value that maximizes the value of the multi-reward function comprises: inputting the current state set into a power control policy network, wherein the power control policy network is configured to calculate, based on the multi-reward function, a multi-reward function value corresponding to each candidate power adjustment action, and to map the multi-reward function values into an action probability distribution over the candidate power adjustment actions; and selecting, based on the action probability distribution, the transmission power adjustment value corresponding to the candidate power adjustment action with the largest multi-reward function value.
  5. The transmission power control method according to claim 4, wherein the power control policy network comprises a policy network and a value network, the method further comprising: collecting training samples, wherein the training samples comprise state transition experiences, and each state transition experience comprises a current state set, a current transmission power adjustment value, an instant reward calculated by the multi-reward function, and a state set at the next moment; sampling from the training samples, and calculating a first loss of the policy network and a second loss of the value network based on the sampled data, wherein the first loss is used to quantify the degree of deviation between the action probability distribution output by the policy network and an advantage function; updating network parameters of the policy network and the value network based on the first loss and the second loss, respectively; and repeating the sampling step, the loss calculation step, and the network parameter updating step until a preset training termination condition is met, to obtain the trained power control policy network.
  6. The transmission power control method according to claim 5, wherein the method further comprises: deploying the trained power control policy network to a wireless communication system; and, while the wireless communication system is running, acquiring the current state set in real time and inputting the current state set into the power control policy network.
  7. The transmission power control method according to any one of claims 1 to 3, wherein the channel quality reward is calculated by computing the channel quality reward value through a first monotonically increasing function based on a signal-to-noise ratio obtained after execution of the candidate power adjustment action, wherein an argument of the first monotonically increasing function includes the signal-to-noise ratio, and the channel quality reward value increases as the signal-to-noise ratio increases; and/or the transmission efficiency reward is calculated by computing the transmission efficiency reward value through a second monotonically increasing function based on a first ratio of the block error rate obtained after execution of the candidate power adjustment action to a target block error rate, wherein the transmission efficiency reward value increases as the first ratio increases; and/or the energy consumption efficiency reward is calculated by computing the energy consumption efficiency reward value through a third monotonically increasing function based on a second ratio of the energy efficiency obtained after execution of the candidate power adjustment action to a target energy efficiency set by the system, wherein the energy efficiency is the ratio of throughput to transmission power, and the energy consumption efficiency reward value increases as the second ratio increases.
  8. The transmission power control method according to any one of claims 1 to 3, wherein the acquiring a current state set of the wireless communication system comprises: comparing a channel quality indication, an interference signal strength, and a signal-to-noise ratio with corresponding preset thresholds respectively to obtain corresponding comparison results, and determining the channel quality state based on the frequency of occurrence of each comparison result; and/or determining the transmission efficiency state based on a target code rate; and/or determining the energy consumption efficiency state based on a power consumption level.
  9. A wireless communication system, comprising a transmitting end and a receiving end, wherein: the transmitting end is configured to acquire a current state set of the wireless communication system, wherein the current state set comprises a channel quality state, a transmission efficiency state, and an energy consumption efficiency state; to determine, based on the current state set and a multi-reward function, a transmission power adjustment value that maximizes the value of the multi-reward function, wherein the multi-reward function comprises a channel quality reward, a transmission efficiency reward, and an energy consumption efficiency reward, the channel quality reward is used to calculate the channel quality reward obtainable after a power adjustment action is executed in the current channel quality state, the transmission efficiency reward is used to calculate the transmission efficiency reward obtainable after the power adjustment action is executed in the current transmission efficiency state, and the energy consumption efficiency reward is used to calculate the energy consumption efficiency reward obtainable after the power adjustment action is executed in the current energy consumption efficiency state; and to determine the transmission power of the wireless communication system based on the transmission power adjustment value; and the receiving end is configured to receive the wireless signal sent by the transmitting end.
  10. An electronic device, comprising a processor, a memory, and a communication bus, wherein the processor and the memory communicate with each other through the communication bus, the memory stores program instructions executable by the processor, and the processor invokes the program instructions to perform the method of any one of claims 1 to 8.
  11. A computer-readable storage medium storing computer instructions which, when executed by a computer, cause the computer to perform the method of any one of claims 1 to 8.
  12. A computer program product, wherein the computer program product comprises a computer program which, when executed by a processor, implements the method of any one of claims 1 to 8.
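
The selection step recited in claim 2 (normalize each reward, take a weighted sum, pick the best candidate) can be illustrated with a minimal Python sketch. This is not the applicant's implementation. It assumes min-max normalization across the candidates, which the claims do not specify, and the function names, reward names, weights, and toy reward values are all hypothetical.

```python
def min_max_normalize(values):
    """Scale values to [0, 1] across the candidates (assumed normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All candidates are equal in this dimension, so it cannot discriminate.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def select_power_adjustment(candidates, rewards, weights):
    """Return the candidate with the largest weighted multi-reward value.

    candidates: list of power adjustment values (hypothetical units: dB)
    rewards:    dict mapping a reward name to one raw value per candidate
    weights:    dict mapping the same reward names to preset weights
    """
    normalized = {name: min_max_normalize(vals) for name, vals in rewards.items()}
    scores = [
        sum(weights[name] * normalized[name][i] for name in rewards)
        for i in range(len(candidates))
    ]
    best_index = max(range(len(candidates)), key=scores.__getitem__)
    return candidates[best_index], scores


# Toy inputs (illustrative only): three candidate steps and their raw rewards.
candidates = [-1.0, 0.0, 1.0]
rewards = {
    "channel_quality": [0.2, 0.5, 0.9],
    "transmission_efficiency": [0.3, 0.6, 0.7],
    "energy_efficiency": [0.9, 0.6, 0.1],
}
weights = {
    "channel_quality": 0.4,
    "transmission_efficiency": 0.3,
    "energy_efficiency": 0.3,
}
best, scores = select_power_adjustment(candidates, rewards, weights)
print(best, scores)
```

Because each reward is normalized before weighting, the preset weights alone decide how the three objectives trade off. With the toy numbers above, shifting most of the weight to the energy efficiency reward moves the selection from the power increase to the power decrease.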

Description

Transmission power control method, wireless communication system, device, medium, and product

Technical Field

The present application relates to the field of wireless communication technologies, and in particular to a transmission power control method, a wireless communication system, a device, a medium, and a product.

Background

Transmission power control is one of the key techniques of a cellular wireless communication system. By dynamically adjusting the transmission power, a system can compensate for path loss, counter shadow fading and fast fading, and suppress interference inside and outside the system, thereby reducing terminal energy consumption and improving network coverage and system capacity while guaranteeing quality of service. Traditional power control methods fall into two types, open-loop power control and closed-loop power control. In the former, the transmitting end sets its power from its own measurements (such as received signal strength). In the latter, the receiving end sends power control commands to the transmitting end to achieve feedback adjustment. However, with the rapid development of new wireless communication systems such as 5G NR and low-earth-orbit satellite networks, conventional power control techniques face serious challenges.

In multi-objective collaborative optimization scenarios, existing power control methods lack an effective collaborative decision mechanism for multiple reward functions. Reward signals of different optimization objectives are directly mixed and superimposed in the decision process, which smooths out the relative differences among the rewards, so the power adjustment decision cannot accurately reflect those differences. As a result, the transmission power is set unreasonably, and the requirements of 5G and future networks for fine-grained power control are difficult to meet.

Disclosure of Invention

An object of the embodiments of the present application is to provide a transmission power control method, a wireless communication system, a device, a medium, and a product that solve the above problems.

In a first aspect, an embodiment of the present application provides a transmission power control method. The method includes: obtaining a current state set of a wireless communication system, where the current state set includes a channel quality state, a transmission efficiency state, and an energy consumption efficiency state; determining, based on the current state set and a multi-reward function, a transmission power adjustment value that maximizes the value of the multi-reward function, where the multi-reward function includes a channel quality reward, a transmission efficiency reward, and an energy consumption efficiency reward, the channel quality reward is used to calculate the channel quality reward obtainable after a power adjustment action is performed in the current channel quality state, the transmission efficiency reward is used to calculate the transmission efficiency reward obtainable after the power adjustment action is performed in the current transmission efficiency state, and the energy consumption efficiency reward is used to calculate the energy consumption efficiency reward obtainable after the power adjustment action is performed in the current energy consumption efficiency state; and determining the transmission power of the wireless communication system based on the transmission power adjustment value.

In the implementation of this scheme, on the one hand, a triple state set comprising a channel quality state, a transmission efficiency state, and an energy consumption efficiency state is constructed, and a multi-reward function is used to quantitatively calculate the return of a power adjustment action in three dimensions. This achieves multi-objective collaborative optimization of the power control problem of the wireless communication system, so that the transmission power adjustment decision can balance the requirements of channel quality, transmission efficiency, and energy consumption efficiency at the same time. On the other hand, the transmission power adjustment value is determined on the principle of maximizing the multi-reward function, which converts the fixed rules of traditional power control into a state-driven dynamic optimization process, achieves accurate matching between the power adjustment value and the current wireless environment state, and improves the adaptability of the power control strategy to environmental changes.

In one implementation of the first aspect, the determining a transmission power adjustment value that maximizes the multi-reward function based on the current state set and the multi-reward function includes calculating the channel quality re