CN-122018667-A - Chip power management method, system and medium
Abstract
The application provides a chip power management method, system, and medium. The chip performance management method comprises: setting an initial frequency and an initial set voltage of a chip; measuring an equivalent switched capacitance with an equivalent switched capacitance monitor; measuring the chip junction temperature with a temperature sensor; measuring performance indexes of the chip with a performance sensor; and determining, by an MPC controller, a control frequency of the chip by maximizing an objective function. By using model predictive control, the method minimizes power consumption while meeting the chip's performance requirements, and maximizes performance while meeting the power consumption constraint.
Inventors
- HUANG HELONG
- YU PANPAN
Assignees
- 瀚博半导体(上海)股份有限公司
Dates
- Publication Date
- 20260512
- Application Date
- 20260413
Claims (10)
- 1. A method for chip performance management, the method comprising: setting an initial frequency and an initial set voltage of the chip; measuring an equivalent switched capacitance by an equivalent switched capacitance monitor; measuring the chip junction temperature by a temperature sensor; measuring performance indexes of the chip by a performance sensor; determining, by an MPC controller, a control frequency of the chip based on an objective function; and adjusting the operating frequency of the chip according to the determined control frequency; wherein the objective function is: [...], wherein: [...] is the performance reward function; [...] is the power loss function; [...] is the temperature loss function; alpha is the weight of the performance reward function, with 0 < alpha < 1; beta is the weight of the power loss function, with 0 < beta < 1; n denotes the nth chip; N denotes the total number of chips, with N ≥ 1; the calculation formula of [...] is: [...], wherein: [...] is the highest allowable frequency; [...] is the frequency at the current moment k; [...] is the target frequency; [...] is the performance index at the current moment, obtained from a performance monitor; [...] is the performance index predicted based on the target frequency f(k+1); [...] is the performance reward value predicted based on the target frequency f(k+1); the calculation formula of [...] is: [...], wherein: [...] is the highest allowable power consumption; [...] is the total power consumption calculated by the power consumption prediction model based on the target frequency f(k+1); the calculation formula of [...] is: [...], wherein: Tj_max is the highest allowable chip junction temperature; Tj_min is the lowest allowable chip junction temperature; [...] is the chip junction temperature predicted based on the target frequency.
- 2. The method of claim 1, wherein determining the control frequency of the chip by the objective function comprises: constructing a candidate frequency set based on a preset step length [...] for chip frequency adjustment: [...], where M is the maximum number of adjustment steps (M ≥ 1); calculating the objective function value corresponding to each candidate frequency; and selecting the candidate frequency with the largest objective function value as the control frequency.
- 3. The method of claim 1, wherein adjusting the operating frequency of the chip based on the determined control frequency comprises: obtaining a chip set voltage from the control frequency by using a voltage frequency model; and setting the chip according to the chip set voltage.
- 4. The method of claim 1, wherein the process of determining the control frequency of the chip based on the objective function satisfies the following constraints: temperature constraint: T_min < T < T_max, where T_max and T_min are the highest and lowest allowable chip junction temperatures, respectively; voltage constraint: V_min < V_device < V_max, where V_max and V_min are the highest and lowest allowed voltages on the chip MOS transistors, respectively; frequency constraint: F_min < f < F_max, where F_max and F_min are the highest and lowest allowed frequencies, respectively; power consumption constraint: P < P_max, where P_max is the highest allowed power consumption.
- 5. The method of claim 1, wherein the power consumption prediction model is: [...], wherein [...], and wherein: K_tb and K_vb are the temperature and voltage parameters used in calculating the leakage current power consumption; V_ref and T_ref are the reference voltage and reference temperature used when determining K_tb and K_vb during testing; V_device(k+1) is the voltage on the chip MOS transistors at time k+1; Tj(k+1) is the chip junction temperature at time k+1; Cac is the equivalent switched capacitance; and [...] is the leakage current power consumption measured in testing under the V_ref and T_ref conditions.
- 6. The method of claim 5, wherein [...], wherein [...] is the linear coefficient between Cac and the voltage V_device.
- 7. The method of claim 3, wherein the voltage frequency model is: [...], wherein: V_set denotes the chip set voltage; V_device denotes the voltage on the chip MOS transistors; V_droop denotes the voltage drop from V_set to V_device; [...] is a voltage frequency parameter; [...] is a voltage drop model parameter; and [...] is the control frequency of the chip.
- 8. The method according to any one of claims 1 to 7, wherein the chip is a GPU chip.
- 9. A chip performance management system for performing the method according to any one of claims 1 to 8, characterized in that the system comprises: a computing module for executing the operation tasks of the chip; a state awareness module comprising: a performance monitor for measuring the performance index of the chip; an equivalent switched capacitance monitor for measuring the equivalent switched capacitance; and a temperature sensor for measuring the chip junction temperature; an MPC controller for determining a control frequency of the chip based on the objective function; and an execution module configured to set the chip based on the control frequency.
- 10. A computer readable storage medium having stored thereon a computer program, which, when executed by a processor, causes the processor to perform the method of any of claims 1 to 8.
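As an illustration of the candidate search in claim 2 and the constraints in claim 4, the following Python sketch shows one possible reading; it is not the patented implementation. The candidate set form f(k) + m * step for m = -M..M is an assumption, because the source does not reproduce that formula. The names candidate_frequencies, select_control_frequency, objective, and feasible are hypothetical, and the objective is passed in as a callable because its formula is likewise not reproduced here.

```python
from typing import Callable, List, Optional


def candidate_frequencies(f_now: float, step: float, max_steps: int,
                          f_min: float, f_max: float) -> List[float]:
    """Build candidate frequencies around the current frequency.

    Assumed form: f_now + m * step for m in [-max_steps, max_steps].
    Candidates outside the frequency constraint F_min < f < F_max are dropped.
    """
    raw = [f_now + m * step for m in range(-max_steps, max_steps + 1)]
    return [f for f in raw if f_min < f < f_max]


def select_control_frequency(
    candidates: List[float],
    objective: Callable[[float], float],
    feasible: Callable[[float], bool] = lambda f: True,
) -> Optional[float]:
    """Return the feasible candidate with the largest objective value.

    `objective` stands in for the claimed objective function, and `feasible`
    stands in for the temperature, voltage, and power constraints evaluated
    on the predicted state. Returns None when no candidate is feasible.
    """
    best_f: Optional[float] = None
    best_j = float("-inf")
    for f in candidates:
        if not feasible(f):
            continue
        j = objective(f)
        if j > best_j:
            best_f, best_j = f, j
    return best_f


if __name__ == "__main__":
    # Toy numbers in MHz, for illustration only.
    cands = candidate_frequencies(1000.0, 100.0, 3, 500.0, 1500.0)

    def toy_objective(f: float) -> float:
        # Hypothetical objective that peaks at 1300 MHz.
        return -(f - 1300.0) ** 2

    print(select_control_frequency(cands, toy_objective))
    # A hypothetical power cap that rules out anything above 1200 MHz.
    print(select_control_frequency(cands, toy_objective, lambda f: f <= 1200.0))
```

Filtering infeasible candidates before taking the maximum keeps the search exhaustive over a small discrete set, which fits a per-control-period decision; a real controller would derive the feasibility check from its own power and temperature prediction models.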
Description
Chip power management method, system and medium
Technical Field
The invention relates to the technical field of power management, and in particular to a chip power management method, system, and medium.
Background
With the rapid development of fields such as artificial intelligence and high-performance computing, GPU computing performance keeps improving, but power consumption keeps rising with it, bringing problems such as excessive energy consumption, heavy heat dissipation pressure, and reduced reliability. Dynamic power management is a key technology for addressing these problems; its core aim is to minimize power consumption while ensuring stable system operation and meeting the GPU's performance requirements.
Existing GPU power management techniques mainly have the following shortcomings:
Multi-objective conflicts are hard to coordinate. The prior art struggles to balance objectives such as performance maximization, power minimization, and temperature control, so it tends either to waste power through excess performance or to fall short on performance through excessive energy saving.
Target frequency setting lacks a principled basis. Traditional methods set the operating frequency by empirical rules or simple threshold checks and cannot adapt accurately to dynamic changes in application load, so frequency adjustments are neither well justified nor effective.
Regulation is mostly local feedback. The prior art adjusts only according to current system parameters and does not consider future load trends, so it easily falls into a locally optimal solution and cannot achieve global optimization over the full time horizon.
Response speed and stability are hard to balance. When handling sudden load changes, traditional regulation mechanisms either respond too slowly to match load demand in time, or regulate too aggressively, causing severe frequency and voltage fluctuations that harm performance stability.
Model predictive control (MPC) is an advanced control method whose strengths are handling multi-constraint, multi-objective optimization problems and predicting future system states. At present there is no mature MPC-based dynamic chip power management scheme that fully adapts to the hardware characteristics and dynamic load requirements of a GPU.
Disclosure of Invention
In view of the above, the present application provides a chip power management method, system, and medium to solve the above technical problems in the prior art.
According to one aspect of the present application, there is provided a chip performance management method, the method comprising: setting an initial frequency and an initial set voltage of the chip; measuring an equivalent switched capacitance by an equivalent switched capacitance monitor; measuring the chip junction temperature by a temperature sensor; measuring performance indexes of the chip by a performance sensor; determining, by an MPC controller, a control frequency of the chip based on an objective function; and adjusting the operating frequency of the chip according to the determined control frequency.
The objective function is: [...], wherein: [...] is the performance reward function; [...] is the power loss function; [...] is the temperature loss function; alpha is the weight of the performance reward function, with 0 < alpha < 1; beta is the weight of the power loss function, with 0 < beta < 1; n denotes the nth chip; N denotes the total number of chips, with N ≥ 1.
The calculation formula of [...] is: [...], wherein: [...] is the highest allowable frequency; [...] is the frequency at the current moment k; [...] is the target frequency; [...] is the performance index at the current moment, obtained from a performance monitor; [...] is the performance index predicted based on the target frequency f(k+1); [...] is the performance reward value predicted based on the target frequency f(k+1).
The calculation formula of [...] is: [...], wherein: [...] is the highest allowable power consumption; [...] is the total power consumption calculated by the power consumption prediction model based on the target frequency f(k+1).
The calculation formula of [...] is: [...], wherein: Tj_max is the highest allowable chip junction temperature; Tj_min is the lowest allowable chip junction temperature; [...] is the chip junction temperature predicted based on the target frequency.
According to a preferred embodiment of the present application, determining the control frequency of the chip by maximizing the objective function includes: preset step length b