CN-121979463-A - Training method for high-bandwidth memory, computer equipment and medium
Abstract
The application relates to the technical field of digital signal processing and provides a training method, computer equipment and a medium for a high-bandwidth memory. The training method includes performing deskew training through cooperation between a high bandwidth memory physical layer and a high bandwidth memory device, using a loopback mode of the high bandwidth memory device. As a result, data alignment inside the high-bandwidth memory physical layer is achieved, the deviation (skew) between signal groups is reduced, the variation range of the deviation between signals is controlled, the design is simplified in terms of hardware implementation and algorithm details, the eye margin is improved, and the performance of the high-bandwidth memory is improved.
Inventors
- ZHANG JUNHUI
Assignees
- 芯耀辉科技股份有限公司
Dates
- Publication Date
- 20260505
- Application Date
- 20260409
Claims (15)
- 1. A training method for a high bandwidth memory, the training method comprising performing deskew training by cooperation between a high bandwidth memory physical layer and a high bandwidth memory device using a loopback mode of the high bandwidth memory device, comprising: gradually increasing the delay of a data strobe signal from zero delay and, after each adjustment of the data strobe signal, performing data comparison based on the respective sampling results of a plurality of signal groups until the respective sampling results of the plurality of signal groups are correct, thereby determining a left boundary and a delay value of the data strobe signal with respect to a signal width, wherein the high bandwidth memory physical layer is configured to receive the plurality of signal groups, each of the plurality of signal groups includes at least one signal, the sum of the data widths of the signals included in each of the plurality of signal groups does not exceed the signal width, and the signals included in each of the plurality of signal groups satisfy an inter-signal deviation constraint; configuring a digital delay chain of the data strobe signal using the delay value of the data strobe signal with respect to the signal width, and then gradually increasing the respective delays of the plurality of signal groups from zero delay through respective, mutually independent digital delay chains of the plurality of signal groups, thereby obtaining respective data comparison results of the plurality of signal groups within a configurable delay range; determining a distance between a left boundary of each of the plurality of signal groups and the left boundary of the data strobe signal with respect to the signal width based on the data comparison result of each of the plurality of signal groups, thereby determining a delay value of each of the plurality of signal groups with respect to the signal width; and configuring the digital delay chains of the plurality of signal groups using the respective delay values of the signal groups with respect to the signal width, to reduce the skew between the signal groups.
- 2. The method of claim 1, wherein the deskew training is used for receive direction deskew training, wherein the data strobe signal is a receive direction data strobe signal, wherein the high bandwidth memory physical layer generates reference data, and wherein the high bandwidth memory physical layer compares the reference data with sampled data returned by the high bandwidth memory device to determine whether the sampling result of each of the plurality of signal groups is correct.
- 3. The method of claim 1, wherein the deskew training is used for transmit direction deskew training, wherein the data strobe signal is a transmit direction data strobe signal, wherein the high bandwidth memory physical layer generates reference data and transmits the reference data to the high bandwidth memory device, wherein the high bandwidth memory device performs data comparison and transmits a data comparison result to the high bandwidth memory physical layer, and wherein the high bandwidth memory physical layer determines whether the sampling result of each of the plurality of signal groups is correct based on the data comparison result.
- 4. The method of claim 1, wherein the deskewing training comprises a receive direction deskewing training and the data strobe signal is a receive direction data strobe signal in the receive direction deskewing training, the deskewing training further comprising a transmit direction deskewing training and the data strobe signal is a transmit direction data strobe signal in the transmit direction deskewing training.
- 5. The method of claim 1, wherein the deskewing training comprises a receive direction deskewing training and a transmit direction deskewing training, the training method comprising sequentially performing initializing the high bandwidth memory physical layer, initializing the high bandwidth memory device, performing command address bus training, performing read gating training, performing the receive direction deskewing training, performing read training, performing the transmit direction deskewing training, and performing write training.
- 6. The method of claim 1, wherein the deskewing training comprises a receive direction deskewing training, the training method comprising sequentially performing initializing the high bandwidth memory physical layer, initializing the high bandwidth memory device, performing command address bus training, performing read gating training, performing the receive direction deskewing training, performing read training, and performing write training.
- 7. The method of claim 1, wherein the deskewing training comprises transmit direction deskewing training, the training method comprising sequentially performing initializing the high bandwidth memory physical layer, initializing the high bandwidth memory device, performing command address bus training, performing read gating training, performing read training, performing the transmit direction deskewing training, and performing write training.
- 8. The method of claim 1, wherein the training method further comprises re-performing the deskewing training when the high bandwidth memory applies a different parallel interface protocol, or when a trace between the high bandwidth memory physical layer and the high bandwidth memory device changes, or when a circuit layout of the high bandwidth memory changes, or when the signal width changes.
- 9. The method of claim 1, wherein a common digital delay chain associated with the signal width is used to control delays of all signals in the plurality of signal groups, and wherein a programmable step size of the respective digital delay chains of the plurality of signal groups is less than a fixed step size of the common digital delay chain.
- 10. The method of claim 1, wherein the deskew training is used to reduce input timing skew and output timing skew between the high bandwidth memory physical layer and the high bandwidth memory device.
- 11. The method of claim 1, further comprising controlling the high bandwidth memory device to exit the loopback mode after obtaining the data comparison results for each of the plurality of signal groups.
- 12. The method of claim 1, wherein an eye margin of the parallel interface of the high-bandwidth memory, associated with the signal width, is determined based on a length of a maximum parallel time period between the plurality of signal groups, and wherein the deskew training is used to increase that eye margin.
- 13. The method of claim 1, wherein the signal width is 32 bits and the signals included in the plurality of signal groups include the data strobe signal, a data signal, and a data bus inversion signal.
- 14. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the method according to any one of claims 1 to 13 when executing the computer program.
- 15. A computer readable storage medium storing computer instructions which, when run on a computer device, cause the computer device to perform the method of any one of claims 1 to 13.
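The deskew steps recited in claim 1 can be illustrated with a small software model. The sketch below is not the applicant's implementation: it assumes each signal group has a fixed valid-data window measured in delay taps, that a loopback comparison passes whenever the strobe falls inside that window, and that a group's delay value is the last passing tap of its sweep. The names `Group`, `sample_ok`, `find_strobe_left_boundary`, `sweep_group`, `deskew` and `skew`, and the 63-tap range, are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Group:
    """Hypothetical model of one signal group (all values in delay taps)."""
    eye_open: int   # tap at which the group's data becomes valid
    eye_close: int  # tap at which it stops being valid (exclusive)
    delay: int = 0  # setting of the group's own digital delay chain


def sample_ok(group: Group, strobe_delay: int) -> bool:
    """Stand-in for the loopback data comparison: pass if the strobe
    samples inside the (delayed) valid window of the group."""
    return (group.eye_open + group.delay
            <= strobe_delay
            < group.eye_close + group.delay)


def find_strobe_left_boundary(groups, max_taps):
    """Step 1: raise the strobe delay from zero until every group passes."""
    for d in range(max_taps + 1):
        if all(sample_ok(g, d) for g in groups):
            return d
    raise RuntimeError("no strobe delay samples every group correctly")


def sweep_group(group, strobe_delay, max_taps):
    """Step 2: sweep one group's delay from zero, recording pass/fail."""
    saved, results = group.delay, []
    for d in range(max_taps + 1):
        group.delay = d
        results.append(sample_ok(group, strobe_delay))
    group.delay = saved
    return results


def distance_to_strobe(results):
    """Step 3 (assumed rule): the last passing tap of the initial passing
    run is the distance between the group's and the strobe's left boundary."""
    dist = 0
    for d, ok in enumerate(results):
        if not ok:
            break
        dist = d
    return dist


def deskew(groups, max_taps=63):
    """Run steps 1 to 4 and return the strobe delay that was configured."""
    strobe = find_strobe_left_boundary(groups, max_taps)
    for g in groups:
        g.delay = distance_to_strobe(sweep_group(g, strobe, max_taps))
    return strobe


def skew(groups):
    """Spread between the left boundaries of the groups, in taps."""
    edges = [g.eye_open + g.delay for g in groups]
    return max(edges) - min(edges)
```

For example, with `groups = [Group(3, 23), Group(7, 27), Group(5, 25)]` the initial skew is 4 taps; `deskew(groups)` settles the strobe at tap 7 and assigns group delays of 4, 0 and 2 taps, after which `skew(groups)` is 0. In real hardware the pass/fail result would come from comparing reference data with looped-back data rather than from a known window.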
Description
Training method for high-bandwidth memory, computer equipment and medium
Technical Field
The present application relates to the field of digital signal processing technologies, and in particular to a training method for a high bandwidth memory, a computer device, and a medium.
Background
High bandwidth memory (HBM) is widely used in applications such as large artificial intelligence models, data centers and high performance servers, and typically supports the double data rate memory physical layer interface (DDR PHY Interface, DFI) protocol. Because of factors such as circuit delay characteristics, path differences, and process, voltage and temperature (PVT) variation, deviation (skew) may exist between the multiple signals received by the parallel interface of the high-bandwidth memory, which may reduce the accuracy of data sampling and limit performance. To this end, the high bandwidth memory provides a high performance parallel interface solution, and training firmware provides a series of training schemes to achieve data alignment and improve the accuracy of data sampling.
Prior-art training schemes for high bandwidth memory are generally directed to detection and adjustment at the parallel interface of the high bandwidth memory, for example by obtaining timing information between signals through pins and building the training scheme on it. But even if data alignment between signals is performed at the parallel interface, after the signals enter the high bandwidth memory physical layer there may still be deviation between the internal signals, due to delay differences in signal transmission and the like between the high bandwidth memory physical layer and the high bandwidth memory device. Because these signals are inside the high bandwidth memory physical layer, it is difficult to accurately acquire timing information between them through probes or external instruments.
Moreover, the number and composition of the signals to be transmitted may vary, which means that the inter-signal skew inside the high bandwidth memory physical layer may also vary. Detection and training may therefore be needed from time to time to reduce the signal skew inside the high bandwidth memory physical layer, which calls for simplicity in both hardware implementation and algorithm detail. Therefore, the application provides a training method, computer equipment and a medium for a high-bandwidth memory, to solve the above technical problems in the prior art.
Disclosure of Invention
In a first aspect, the present application provides a training method for a high bandwidth memory. The training method includes performing deskew training by cooperation between a high bandwidth memory physical layer and a high bandwidth memory device using a loopback mode of the high bandwidth memory device, including: gradually increasing the delay of a data strobe signal from zero delay and, after each adjustment of the data strobe signal, performing data comparison based on the respective sampling results of a plurality of signal groups until the respective sampling results of the plurality of signal groups are correct, thereby determining a left boundary and a delay value of the data strobe signal with respect to a signal width, wherein the high bandwidth memory physical layer is configured to receive the plurality of signal groups, each of the plurality of signal groups includes at least one signal, the sum of the data widths of the signals included in each of the plurality of signal groups does not exceed the signal width, and the signals included in each of the plurality of signal groups satisfy an inter-signal skew constraint; configuring a digital delay chain of the data strobe signal using the delay value of the data strobe signal with respect to the signal width, and then gradually increasing the respective delays of the plurality of signal groups from zero delay through respective, mutually independent digital delay chains of the plurality of signal groups, thereby obtaining respective data comparison results of the plurality of signal groups within a configurable delay range; determining a distance between a left boundary of each of the plurality of signal groups and the left boundary of the data strobe signal with respect to the signal width based on the data comparison result of each of the plurality of signal groups, thereby determining a delay value of each of the plurality of signal groups with respect to the signal width; and configuring the digital delay chain of each of the plurality of signal groups with the delay value of that signal group with respect to the signal width, thereby reducing the skew between the plurality of signal groups. According to the first aspect of the application, data alignment inside the high-bandwidth memory physical layer is achieved, the skew between signal groups is reduced, the variation range of the skew between signals is controlled, the design is simplified in terms of hardware implementation and algorithm details, the eye margin is improved, and