
CN-121996465-A - DRAM state management method, system and terminal equipment


Abstract

The application relates to the technical field of memory management and discloses a DRAM state management method, system and terminal device. The method comprises: continuously collecting physical interface signals connected to a DRAM memory based on a preset time window to obtain timing-related parameters of the DRAM memory at the corresponding moments, and preprocessing them; inputting the preprocessed timing-related parameters into a prediction model to obtain timing-related parameters at a plurality of future moments, and from these determining a predicted safety threshold for each future moment and the minimum timing margin across the future moments; when the minimum timing margin is lower than the predicted safety threshold at the corresponding moment, generating a coded early-warning signal based on the timing-related parameters at the future moments; and matching a target state self-healing strategy in a preset strategy library based on a risk level code, a risk type code and a load state code, and executing that strategy. The method can improve the overall reliability and availability of the system while preserving the safety margin.

Inventors

  • He Yanpai

Assignees

  • 皇虎测试科技(深圳)有限公司

Dates

Publication Date
20260508
Application Date
20260408

Claims (10)

  1. A DRAM state management method, comprising: continuously acquiring physical interface signals connected to a DRAM (dynamic random access memory) based on a preset time window, obtaining timing-related parameters of the DRAM at the corresponding moments, and preprocessing the timing-related parameters; inputting the preprocessed timing-related parameters into a prediction model to obtain timing-related parameters at a plurality of future moments, and from these determining a predicted safety threshold for each future moment and a minimum timing margin across the future moments; when the minimum timing margin is lower than the predicted safety threshold for the corresponding future moment, generating a coded early-warning signal based on the timing-related parameters at the plurality of future moments, wherein the coded early-warning signal comprises a risk level code, a risk type code and a load state code; and matching a target state self-healing strategy in a preset strategy library based on the risk level code, the risk type code and the load state code, and executing the target state self-healing strategy.
  2. The method of claim 1, further comprising: after the strategy is executed, continuing to acquire physical interface signals of the DRAM to obtain updated timing-related parameters, calculating an updated minimum timing margin from the updated timing-related parameters, and feeding the change in timing margin before and after strategy execution back to the prediction model to evaluate the effectiveness of the target state self-healing strategy.
  3. The method of claim 1, wherein the timing-related parameters include a specified timing parameter, an operating frequency and a load parameter of the DRAM at each acquisition time point, and wherein inputting the preprocessed timing-related parameters into the prediction model to obtain timing-related parameters at future moments, and determining the predicted safety threshold for each future moment and the minimum timing margin across the future moments, comprises: arranging the specified timing parameter, the operating frequency and the load parameter of each acquisition time point in order of acquisition time to form a sequence feature, inputting the sequence feature into the prediction model, and outputting the timing-related parameters for each moment in a preset future period; determining a predicted safety threshold for each future moment based on the timing-related parameters for that moment; and determining a predicted timing margin for each future moment based on the timing-related parameters and the predicted safety threshold for that moment, and determining the minimum timing margin among the plurality of predicted timing margins; wherein the prediction model is based on a long short-term memory (LSTM) network.
  4. The method of claim 3, wherein the timing-related parameters further comprise the operating temperature, run time and error count of the DRAM, and wherein determining the predicted safety threshold for each future moment based on the timing-related parameters for that moment comprises: obtaining the difference between the operating temperature corresponding to the future moment and a nominal temperature, and an aging factor calculated from the run time and the error count corresponding to the future moment; and calculating the predicted safety threshold for the future moment from a nominal timing margin, a temperature compensation coefficient, the difference, an aging compensation coefficient and the aging factor.
  5. The method of claim 3, wherein all state self-healing strategies in the preset strategy library are divided into three levels according to hardware adjustment, the three levels being, in increasing order, an optimized scheduling strategy, a parameter fine-tuning strategy and a mark isolation strategy; and wherein matching a target state self-healing strategy in the preset strategy library based on the risk level code, the risk type code and the load state code comprises: if the current risk level corresponding to the risk level code is high, matching the mark isolation strategy in the preset strategy library as the target state self-healing strategy; if the current risk level corresponding to the risk level code is low, further determining whether an aging trend exists, and if not, determining, according to the load rate corresponding to the load state code, whether to match the optimized scheduling strategy as the target state self-healing strategy; and if the current risk level corresponding to the risk level code is medium, further determining whether the operating temperature is greater than a preset temperature threshold, and if so, matching the parameter fine-tuning strategy and/or the optimized scheduling strategy in the preset strategy library as the target state self-healing strategy.
  6. The method of claim 5, further comprising: if the current risk level corresponding to the risk level code is medium and the operating temperature is less than or equal to the preset temperature threshold, determining whether the performance mode of the DRAM is a high-performance mode; and if the DRAM is not in the high-performance mode, matching the parameter fine-tuning strategy and/or the optimized scheduling strategy in the preset strategy library as the target state self-healing strategy.
  7. The method of claim 5 or 6, wherein the optimized scheduling strategy comprises an arbiter inside the memory controller writing an idle-insertion field of a memory controller configuration register and/or preferentially scheduling commands that access different banks; the parameter fine-tuning strategy comprises configuring the electrical parameters of a physical layer interface of a system on chip through a DFI interface; and the mark isolation strategy comprises marking the high-risk access command indicated by the early-warning signal and, if the returned data has an ECC error, taking the high-risk access command out of a command retry queue and resending it, and/or migrating physical memory pages that trigger frequent early warnings to a standby storage area.
  8. The method of claim 1, wherein continuously acquiring physical interface signals connected to the DRAM based on the preset time window, obtaining timing-related parameters of the DRAM at the corresponding moments and preprocessing the timing-related parameters comprises: continuously acquiring physical interface signals connected to the DRAM based on the preset time window, and obtaining the original timing-related parameters of the DRAM at each acquisition time point in the preset time window; for each acquisition time point in the preset time window, determining the previous original timing-related parameter, corresponding to the acquisition time point immediately before it; and determining the timing-related parameter of that acquisition time point based on the previous original timing-related parameter, the original timing-related parameter of that acquisition time point, and a preset smoothing coefficient.
  9. A DRAM state management device, comprising: a data acquisition module configured to continuously acquire physical interface signals connected to a DRAM based on a preset time window, obtain timing-related parameters of the DRAM at the corresponding moments, and preprocess the timing-related parameters; a prediction and early-warning module configured to input the preprocessed timing-related parameters into a prediction model to obtain timing-related parameters at a plurality of future moments, to determine from these a predicted safety threshold for each future moment and a minimum timing margin across the future moments, and, when the minimum timing margin is lower than the predicted safety threshold for the corresponding future moment, to generate a coded early-warning signal based on the timing-related parameters at the plurality of future moments, wherein the coded early-warning signal comprises a risk level code, a risk type code and a load state code; and a strategy execution module configured to match a target state self-healing strategy in a preset strategy library based on the risk level code, the risk type code and the load state code, and to execute the target state self-healing strategy.
  10. A DRAM state management system, comprising: a system on chip that includes a main controller, a memory controller and a command scheduler, the system on chip also being connected to a DRAM through a physical layer interface; wherein the main controller is configured to perform the DRAM state management method of any one of claims 1 to 8, and, when a target state self-healing strategy is executed, the memory controller and/or the command scheduler is triggered to configure electrical parameters of the physical layer interface via a DFI interface.
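
The matching logic of claims 5 and 6 can be read as a small decision tree. The following Python sketch is one possible interpretation, not the applicant's implementation: the names `Risk`, `Strategy`, `Context` and `match_strategy`, the default threshold values, and the direction of the load-rate test are all assumptions, and branches that the claims leave open return an empty list.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Risk(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


class Strategy(Enum):
    OPTIMIZED_SCHEDULING = 1    # lowest of the three levels
    PARAMETER_FINE_TUNING = 2
    MARK_ISOLATION = 3          # highest of the three levels


@dataclass
class Context:
    risk: Risk                  # decoded from the risk level code
    load_rate: float            # decoded from the load state code, 0.0 to 1.0
    aging_trend: bool
    temperature_c: float
    high_performance_mode: bool


def match_strategy(ctx: Context,
                   temp_threshold_c: float = 85.0,   # hypothetical default
                   load_threshold: float = 0.7       # hypothetical default
                   ) -> List[Strategy]:
    """Return the matched strategies; an empty list means 'no match specified'."""
    if ctx.risk is Risk.HIGH:
        return [Strategy.MARK_ISOLATION]

    if ctx.risk is Risk.LOW:
        if ctx.aging_trend:
            return []  # the claims do not say what happens on this branch
        # Assumption: scheduling is optimized only under heavy load.
        if ctx.load_rate >= load_threshold:
            return [Strategy.OPTIMIZED_SCHEDULING]
        return []

    # Medium risk: hot device (claim 5) or non-high-performance mode (claim 6).
    if ctx.temperature_c > temp_threshold_c or not ctx.high_performance_mode:
        # "and/or" in the claims; this sketch simply returns both.
        return [Strategy.PARAMETER_FINE_TUNING, Strategy.OPTIMIZED_SCHEDULING]
    return []  # medium risk, cool, high-performance mode: not specified
```

For example, `match_strategy(Context(Risk.HIGH, 0.2, False, 50.0, True))` returns `[Strategy.MARK_ISOLATION]` regardless of the other fields, which mirrors the unconditional high-risk branch of claim 5.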

Description

DRAM state management method, system and terminal equipment

Technical Field

The present application relates to the field of memory management technologies, and in particular to a DRAM state management method, system and terminal device.

Background

With the rapid development of cloud computing, artificial intelligence and edge computing, data access places ever higher demands on bandwidth, capacity and reliability. DRAM (dynamic random access memory) serves as the system main memory, and its stable operation is the core foundation of the reliability of the whole computing system. Current DRAM technology faces challenges such as markedly increased sensitivity to charge disturbance, sharply compressed fault tolerance for timing parameters, and a dynamically shrinking timing margin actually available to the DRAM. However, conventional methods for guaranteeing DRAM reliability cannot dynamically sense the timing state of the DRAM, predict potential risks, or actively adapt to them. There is therefore an urgent need for a preventive intelligent memory management method that shifts from "passive error correction" to "active error prevention".

Disclosure of Invention

In view of the above, embodiments of the present application provide a DRAM state management method, system and terminal device that can effectively solve the technical problem that conventional methods for guaranteeing DRAM reliability cannot dynamically sense the timing state of the DRAM, predict potential risks, or adaptively adjust.
In a first aspect, an embodiment of the present application provides a DRAM state management method, including: continuously acquiring physical interface signals connected to a DRAM (dynamic random access memory) based on a preset time window, obtaining timing-related parameters of the DRAM at the corresponding moments, and preprocessing the timing-related parameters; inputting the preprocessed timing-related parameters into a prediction model to obtain timing-related parameters at a plurality of future moments, and from these determining a predicted safety threshold for each future moment and a minimum timing margin across the future moments; when the minimum timing margin is lower than the predicted safety threshold for the corresponding future moment, generating a coded early-warning signal based on the timing-related parameters at the plurality of future moments, wherein the coded early-warning signal comprises a risk level code, a risk type code and a load state code; and matching a target state self-healing strategy in a preset strategy library based on the risk level code, the risk type code and the load state code, and executing the target state self-healing strategy.
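
As a rough illustration of the steps in the first aspect, the sketch below smooths raw samples, derives a safety threshold, and raises a coded warning when the smallest predicted margin falls below the threshold for its moment. It is a minimal sketch under stated assumptions: the prediction model is left out (predicted margins are passed in directly), the smoothing weights follow one reading of claim 8, the threshold is assumed to be linear in the inputs listed in claim 4, and all function names, cut-off values and code values are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


def smooth(raw: Sequence[float], alpha: float = 0.3) -> List[float]:
    """Blend each raw sample with the previous raw sample (assumed weighting)."""
    out = list(raw[:1])  # the first point has no predecessor, so keep it as-is
    for prev, cur in zip(raw, raw[1:]):
        out.append(alpha * cur + (1.0 - alpha) * prev)
    return out


def safety_threshold(nominal_margin: float, k_temp: float, delta_temp: float,
                     k_age: float, aging_factor: float) -> float:
    """Assumed linear form: threshold rises with temperature excess and aging."""
    return nominal_margin + k_temp * delta_temp + k_age * aging_factor


@dataclass(frozen=True)
class EarlyWarning:
    risk_level: int   # 0 = low, 1 = medium, 2 = high (hypothetical encoding)
    risk_type: int    # supplied by the caller; the source gives no rule here
    load_state: int   # 0 = light, 1 = moderate, 2 = heavy (hypothetical encoding)
    moment: int       # index of the future moment with the smallest margin


def check(margins: Sequence[float], thresholds: Sequence[float],
          risk_type: int, load_rate: float) -> Optional[EarlyWarning]:
    """Return a coded warning if the minimum margin is below its threshold."""
    i = min(range(len(margins)), key=lambda k: margins[k])
    margin, threshold = margins[i], thresholds[i]
    if margin >= threshold:
        return None
    # Hypothetical cut-offs: the further below the threshold, the higher the level.
    level = 2 if margin < 0.5 * threshold else 1 if margin < 0.8 * threshold else 0
    load = 0 if load_rate < 0.3 else 1 if load_rate < 0.7 else 2
    return EarlyWarning(level, risk_type, load, i)
```

Keeping the predictor outside these functions means any sequence model can supply the `margins` and `thresholds` inputs, and the three integer codes in `EarlyWarning` can then be used as lookup keys into a strategy library.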
In a second aspect, an embodiment of the present application provides a DRAM state management device, including: a data acquisition module configured to continuously acquire physical interface signals connected to a DRAM based on a preset time window, obtain timing-related parameters of the DRAM at the corresponding moments, and preprocess the timing-related parameters; a prediction and early-warning module configured to input the preprocessed timing-related parameters into a prediction model to obtain timing-related parameters at a plurality of future moments, and from these to determine a predicted safety threshold for each future moment and a minimum timing margin across the future moments; the prediction and early-warning module being further configured to generate a coded early-warning signal based on the timing-related parameters at the plurality of future moments when the minimum timing margin is lower than the predicted safety threshold for the corresponding future moment, wherein the coded early-warning signal comprises a risk level code, a risk type code and a load state code; and a strategy execution module configured to match a target state self-healing strategy in a preset strategy library based on the risk level code, the risk type code and the load state code, and to execute the target state self-healing strategy.
In a third aspect, the present application also provides a DRAM state management system, comprising a system on chip that includes a main controller, a memory controller and a command scheduler, wherein the system on chip is also connected to a DRAM through a physical layer interface, and the main controller is configured to perform the following steps: continuously acquiring physical interface signals connected to a DRAM (dynamic random access memory) based on a preset time window, obtaining timing-related parameters of the DRAM at the corresponding moments, and preprocessing the timing-related parameters; inputting the preprocessed timing-related param