US-20260127070-A1 - Buffer Component for Interleaving Data and Metadata for Error Correction
Abstract
A memory buffer services commands from a host to access data in a memory using parity bits augmented with metadata for improved error correction and detection (EDC). The memory buffer performs EDC-protocol translation so EDC can be optimized for host-side and memory-side correction and detection. The memory buffer also services each host-side memory transaction with two or more memory-side transactions to efficiently read, write, and store metadata for each requested cache-line access.
Inventors
- Thomas Vogelsang
Assignees
- RAMBUS INC.
Dates
- Publication Date
- 20260507
- Application Date
- 20251014
Claims (20)
- 1. A method for managing read and write transactions with first and second ranks of memory devices, the method comprising: receiving a read command, from a host, to access the first rank at a first memory address and, responsive to the read command: issuing a first command to the first rank at the first memory address; receiving first read data from the first memory address; calculating a second memory address of the second rank from the first memory address; issuing a second command to the second rank at the second memory address; receiving second read data from the second memory address, the second read data including a subset of metadata; detecting an error in the first read data; correcting the error in the first read data with the metadata to produce corrected data; and conveying the corrected data to the host.
- 2. The method of claim 1, wherein the second read data includes parity data, the method further comprising detecting an error in the metadata using the parity data.
- 3. The method of claim 1, wherein the first and second commands responsive to a sequence of read commands, including the read command, are interleaved.
- 4. The method of claim 1, wherein the read command follows a first interface standard and the first command follows a second interface standard different from the first interface standard.
- 5. The method of claim 1, further comprising: receiving a write command to store write data at the first memory address and, responsive to the write command: calculating the second memory address of the second rank from the first memory address and second metadata from the write data; issuing a read command to the second memory address to receive third read data; and replacing a subset of the third read data with the second metadata to produce modified data; and writing the modified data back to the second memory address.
- 6. The method of claim 5, further comprising writing, responsive to the write command, the write data to the first memory address.
- 7. The method of claim 1, wherein the first read data includes parity bits, and wherein detecting the error in the first read data uses the parity bits.
- 8. The method of claim 7, wherein the second read data includes second parity bits, and wherein the detecting the error omits the second parity bits.
- 9. The method of claim 7, wherein the second read data includes second parity bits, the method further comprising detecting a second error in the second read data using the second parity bits.
- 10. The method of claim 1, further comprising calculating check bits and adding the check bits to the corrected data before conveying the corrected data to the host.
- 11. A buffer for providing a host with access to a memory, the buffer comprising: a host interface to receive host commands, including a host read command; a first memory interface to issue, responsive to the host read command, a first command to a first rank of the memory at a first memory address, the first memory interface to receive first read data from the first memory address responsive to the first command; a control block to calculate a second memory address as a function of the first memory address; a second memory interface to issue, responsive to the host read command, a second command to a second rank of the memory at the second memory address, the second memory interface to receive second read data from the second memory address responsive to the second command, the second read data including a subset of metadata; and an error-detection-and-correction (EDC) block to correct an error in the first read data using the metadata to produce corrected data; the host interface to transmit the corrected data to the host.
- 12. The buffer of claim 11, wherein the second read data includes parity data and the EDC block detects a second error in the metadata using the parity data.
- 13. The buffer of claim 11, the host interface to receive a write command to store write data at the first memory address, the control block to calculate the second memory address from the first memory address, and the EDC block to calculate second metadata from the write data.
- 14. The buffer of claim 13, the first memory interface to communicate the write data to the first rank and the second metadata to the second memory address.
- 15. The buffer of claim 14, the second memory interface further to read from the second memory address to convey parity bits to the control block, the EDC block to calculate updated parity bits using the second metadata, the second memory interface to second metadata, the first memory interface to write the second metadata and the updated parity bits to the second memory address responsive to the write command.
- 16. The buffer of claim 11, further comprising a second EDC block to add check bits to the corrected data, the host interface to transmit the corrected data with the check bits.
- 17. The buffer of claim 16, wherein the metadata is of metadata bits fewer than the check bits.
- 18. The buffer of claim 11, wherein the corrected data has fewer bits than the sum of the first read data and the metadata.
- 19. The buffer of claim 18, wherein the corrected data has fewer bits than the first read data.
- 20. A module comprising: a first rank of memory devices; a second rank of memory devices; and a memory buffer for managing read and write transactions with the first and second ranks of memory devices, the memory buffer comprising: a host interface to receive host commands, including a host read command; a first memory interface to issue, responsive to the host read command, a first command to a first rank of the memory at a first memory address, the first memory interface to receive first read data from the first memory address responsive to the first command; a control block to calculate a second memory address as a function of the first memory address; a second memory interface to issue, responsive to the host read command, a second command to a second rank of the memory at the second memory address, the second memory interface to receive second read data from the second memory address responsive to the second command, the second read data including a subset of metadata; and an error-detection-and-correction (EDC) block to correct an error in the first read data using the metadata to produce corrected data.
Description
FIELD OF THE INVENTION
The subject matter presented herein relates to error correction for memory systems and modules.
BACKGROUND
Personal computers, workstations, and servers include at least one processor, such as a central processing unit (CPU), and some form of memory system that includes dynamic random-access memory (DRAM). The processor executes instructions and manipulates data stored in the DRAM. DRAM stores binary bits by alternatively charging or discharging capacitors to represent the logical values one and zero. The capacitors are exceedingly small, and their stored charges can be upset by electrical interference or high-energy particles. The resultant changes to the stored instructions and data produce undesirable computational errors.
Some computer systems, such as high-end servers, employ various forms of error detection and correction to manage DRAM errors, or even more permanent memory failures. The general idea is to add storage for extra information that can be used to identify and correct errors. By way of example, conventional servers that support error correction commonly include memory modules that read and write data in 512-bit (512 b) chunks called “cache lines.” Cache lines are spread across four DRAM dies that each communicate 512 b/4=128 b per read or write transaction. Adding a fifth DRAM die allows the memory to communicate an additional 128 b of parity data per transaction, which increases the size of a cache line to 640 b per transaction. The 128 parity bits are calculated for each 512 b write transaction, and the resulting 640 b cache line is stored together at the same memory address. The data and parity data are read back together, and the parity bits are used for error detection and correction (EDC) robust enough to correct for any single DRAM die failure as long as the failing die is known.
Parity data sufficient to correct an error may be insufficient to identify the source of the error.
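The single-die correction described above can be illustrated with a minimal sketch. It assumes the fifth die stores a simple bitwise XOR of the four 128 b data slices; the source does not name the parity code, and the function names `split_cache_line`, `xor_parity`, and `rebuild_die` are hypothetical. Under that assumption, a nonzero XOR across all five dies shows that an error exists, and the bad slice can be rebuilt only when the failing die is already known.

```python
from functools import reduce

DIE_BITS = 128   # bits each die contributes per transaction (from the text)
DATA_DIES = 4    # four data dies carry the 512 b cache line


def split_cache_line(line: int) -> list[int]:
    """Split a 512 b cache line into four 128 b slices, one per data die."""
    mask = (1 << DIE_BITS) - 1
    return [(line >> (DIE_BITS * i)) & mask for i in range(DATA_DIES)]


def xor_parity(slices: list[int]) -> int:
    """Assumed parity code: bitwise XOR of the slices (not given by the source)."""
    return reduce(lambda a, b: a ^ b, slices, 0)


def rebuild_die(stored: list[int], failed: int) -> int:
    """Rebuild the slice of a known-failed die from the four surviving slices."""
    return xor_parity([s for i, s in enumerate(stored) if i != failed])


line = int.from_bytes(bytes(range(64)), "little")  # arbitrary 512 b test pattern
slices = split_cache_line(line)
stored = slices + [xor_parity(slices)]             # five slices, 640 b in total

corrupted = list(stored)
corrupted[2] ^= 0xDEADBEEF                         # die 2 returns bad data

# A nonzero XOR over all five slices flags an error but not its location.
error_detected = xor_parity(corrupted) != 0

# Rebuilding works only for the die that actually failed.
right_guess = rebuild_die(corrupted, failed=2) == stored[2]
wrong_guess = rebuild_die(corrupted, failed=0) == stored[0]
```

In this sketch `error_detected` and `right_guess` are true while `wrong_guess` is false: rebuilding the wrong die folds the corruption into the result, which is why parity that can correct an error may still be unable to locate it.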
A defective resource, such as a bad connection or memory device, can thus go uncorrected or even unnoticed. Additional data, sometimes called “metadata,” can be stored with data and parity bits to identify sources of errors and thus avoid silent data corruption. Unfortunately, this improvement requires additional memory and can diminish memory speed performance.
BRIEF DESCRIPTION OF THE DRAWINGS
The present invention is illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which like reference numerals refer to similar elements and in which:
FIG. 1 includes a simplified block diagram of a memory system 100 in which a memory buffer 105 services commands from a host 110 to access data in a memory 115 using parity bits augmented with metadata for improved error detection and correction (EDC).
FIG. 2 depicts a memory buffer 200 with a host-side DDR5 physical interface (phy) 205 and a pair of LPDDR5 DRAM physical interfaces, phy 210.0 and phy 210.1.
FIG. 3A shows waveform diagrams 300 and 310 respectively illustrating read and write transactions directed by a host and managed by memory buffer 200 of FIG. 2.
FIG. 3B shows waveform diagrams 320 and 330 respectively illustrating interleaved read and write transactions directed by a host and managed by memory buffer 200 of FIG. 2.
FIG. 4 is a flowchart 400 illustrating the roles of a host and memory buffer in managing successive read transactions to a pair of DRAM ranks 0 and 1.
FIG. 5 depicts a memory system 500 in which a host controller 505 has access to a memory module 510 with five DRAM components 515.
DETAILED DESCRIPTION
FIG. 1 includes a simplified block diagram of a memory system 100 in which a memory buffer 105 services commands from a host 110 to access data in a memory 115 using parity bits augmented with metadata for improved error detection and correction (EDC).
The EDC improvements do not require the participation of host 110 and come without significant hardware overhead or reduced speed performance. A diagram 120 shows how data, parity bits, and metadata for enhanced EDC are distributed among sixty-four columns Col[63:0] across five memory dies Die[4:0]. Each of the first sixty columns Col[59:0] includes 512 b of data, 128 b on each of dies Die[3:0], and 128 b of parity data. (The term “data” here refers to the information conveyed from host 110 for storage and the related parity bits that buffer 105 calculates from the host data.) Each of the last four columns Col[63:60] is divided into sixteen 32 b sub-columns (sixty-four in total), sixty of which store metadata (data about data), each for a corresponding one of columns Col[59:0]. The metadata sub-columns are addressed from zero to fifty-nine from left to right, top to bottom, so that the leftmost sub-column of Col[60] is metadata zero (MD0) and corresponds to the data of column Col0, and the rightmost sub-column of metadata in Col[63] is MD59 and corresponds to the data of column Col59. Columns Col[63:60] have four more 32 b sub-columns than are used