US-12626757-B2 - Low-latency multiplexed pipeline memory
Abstract
A memory such as for a high-speed microprocessor cache includes a first bank of bitcells and a second bank of bitcells. The banks are read from in a pipelined fashion with respect to cycles of a system clock signal such that a first read operation to the first bank is initiated with respect to a first cycle of the system clock signal whereas a second read operation to the second bank is initiated with respect to a second cycle of the system clock that is consecutive to the first clock cycle. A multiplexer selects between latched bits from the read operations responsive to cycles of the system clock signal.
Inventors
- Subho Chatterjee
- Xiao Chen
- Arun Ramamurthy
- Chulmin Jung
Assignees
- QUALCOMM INCORPORATED
Dates
- Publication Date: 20260512
- Application Date: 20240501
Claims (14)
- 1. A memory, comprising: a first bank of bitcells; a first latch configured to latch a first bit responsive to a first self-timed read operation to the first bank in response to a system clock signal; a second bank of bitcells; a second latch configured to latch a second bit responsive to a second self-timed read operation to the second bank in response to the system clock signal; a bank multiplexer configured to select between a latched first bit from the first latch and a latched second bit from the second latch responsive to a bank multiplexer select signal; and a bank multiplexer select signal generator configured to generate the bank multiplexer select signal to control the bank multiplexer to alternate between a selection of the latched first bit from the first latch responsive to a first cycle of the system clock signal and a selection of the latched second bit from the second latch responsive to a second cycle of the system clock signal that is consecutive to the first cycle, wherein the bank multiplexer select signal generator comprises a first serial pair of registers clocked by the system clock signal to register a first bank selection signal; and a second serial pair of registers clocked by the system clock signal to register a second bank selection signal.
- 2. The memory of claim 1, wherein the bank multiplexer select signal is a binary signal.
- 3. The memory of claim 1, wherein the first serial pair of registers comprises a first serial pair of master slave flip-flops, and wherein the second serial pair of registers comprises a second serial pair of master slave flip-flops.
- 4. The memory of claim 1, wherein the bank multiplexer select signal generator further comprises: a first logic gate having a first input terminal coupled to an output terminal of a final register in the first serial pair of registers and having a second input terminal coupled through a first inverter to the output terminal of the final register in the first serial pair of registers; a second logic gate having a first input terminal coupled to an output terminal of a final register in the second serial pair of registers and having a second input terminal coupled through a second inverter to the output terminal of the final register in the second serial pair of registers; and a set-reset latch configured to generate the bank multiplexer select signal, the set-reset latch having a set terminal coupled to an output terminal of the first logic gate and having a reset terminal coupled to an output terminal of the second logic gate.
- 5. The memory of claim 4, wherein the first logic gate comprises a first NAND gate and wherein the second logic gate comprises a second NAND gate.
- 6. The memory of claim 1, wherein the bank multiplexer comprises: a first transmission gate coupled to the first latch and a second transmission gate coupled to the second latch.
- 7. The memory of claim 1, wherein the first latch further comprises a first level-shifter and wherein the second latch further comprises a second level-shifter.
- 8. The memory of claim 7, further comprising: a first sense amplifier coupled to the first bank of bitcells and configured to sense a first sense amplifier output signal and a complement first sense amplifier output signal during the first self-timed read operation to the first bank; a first p-type metal-oxide semiconductor (PMOS) transistor having a source coupled to a node for a core power supply voltage; a first n-type metal-oxide semiconductor (NMOS) transistor having a source coupled to ground, a drain coupled to a drain of the first PMOS transistor, and a gate coupled to a node for the first sense amplifier output signal; and a first inverter coupled between a node for the complement first sense amplifier output signal and a gate of the first PMOS transistor.
- 9. The memory of claim 8, further comprising: a second sense amplifier coupled to the second bank of bitcells and configured to sense a second sense amplifier output signal and a complement second sense amplifier output signal during the second self-timed read operation to the second bank; a second PMOS transistor having a source coupled to the node for the core power supply voltage; a second NMOS transistor having a source coupled to ground, a drain coupled to a drain of the second PMOS transistor, and a gate coupled to a node for the second sense amplifier output signal; and a second inverter coupled between a node for the complement second sense amplifier output signal and a gate of the second PMOS transistor.
- 10. The memory of claim 1, wherein the memory comprises a microprocessor cache.
- 11. The memory of claim 1, wherein the memory is included within a cellular telephone.
- 12. A memory, comprising: a first bank of bitcells; a first latch configured to latch a first bit from the first bank of bitcells responsive to a first self-timed read operation to the first bank of bitcells to provide a first latched bit; a second bank of bitcells; a second latch configured to latch a second bit from the second bank of bitcells responsive to a second self-timed read operation to the second bank of bitcells to provide a second latched bit; means for alternatively selecting between the first latched bit and the second latched bit responsive to cycles of a system clock signal such that in a first cycle of the system clock signal the means selects for the first latched bit and in a second cycle of the system clock signal that is consecutive to the first cycle the means selects for the second latched bit; an output register configured to be clocked by the system clock signal to register a data output signal from the means; a first inverter coupled between the first latch and the means; a second inverter coupled between the second latch and the means; and a third inverter coupled to an output terminal of the means.
- 13. The memory of claim 12, wherein a duration of the first self-timed read operation to the first bank of bitcells is less than a period of the system clock signal.
- 14. The memory of claim 12, wherein a duration of the first self-timed read operation to the first bank of bitcells is greater than a period of the system clock signal.
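As a rough illustration of the select signal generator recited in claims 1, 4, and 5, the following Python sketch models it at clock-cycle granularity. It is not taken from the patent. It assumes that each NAND gate together with its inverter behaves as a rising-edge pulse generator, that the set-reset latch is active-low, and that a set selects the first latch; the claims do not state any of these. The names SerialRegisterPair and BankMuxSelectGenerator are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SerialRegisterPair:
    """Two registers in series, both clocked by the system clock."""
    first: int = 0
    final: int = 0

    def clock(self, d: int) -> int:
        # On a clock edge the final register captures the first register's
        # old value while the first register captures the new input.
        self.final, self.first = self.first, d
        return self.final


@dataclass
class BankMuxSelectGenerator:
    """Cycle-level model: 1 selects the first latch, 0 the second (assumed)."""
    first_pair: SerialRegisterPair = field(default_factory=SerialRegisterPair)
    second_pair: SerialRegisterPair = field(default_factory=SerialRegisterPair)
    select_first: int = 0  # state of the set-reset latch

    def clock(self, first_bank_sel: int, second_bank_sel: int) -> int:
        prev_first = self.first_pair.final
        prev_second = self.second_pair.final
        q_first = self.first_pair.clock(first_bank_sel)
        q_second = self.second_pair.clock(second_bank_sel)

        # Assumption: NAND(x, delayed NOT x) pulses low on a rising edge of x.
        set_n = 0 if (q_first and not prev_first) else 1
        reset_n = 0 if (q_second and not prev_second) else 1

        # Assumed active-low set-reset latch; any other input pair holds state.
        if set_n == 0 and reset_n == 1:
            self.select_first = 1
        elif reset_n == 0 and set_n == 1:
            self.select_first = 0
        return self.select_first


# Bank selection signals that alternate on consecutive clock cycles.
gen = BankMuxSelectGenerator()
first_sel = [1, 0, 1, 0]
second_sel = [0, 1, 0, 1]
trace = [gen.clock(a, b) for a, b in zip(first_sel, second_sel)]
print(trace)  # [0, 1, 0, 1]
```

In this model the select signal starts toggling one cycle after the first bank selection signal is asserted, because that signal must pass through both registers of its pair before the edge reaches the latch.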
Description
TECHNICAL FIELD
The present application relates generally to memories and, more specifically, to a low-latency multiplexed pipeline memory.
BACKGROUND
As compared to dynamic random-access memory (DRAM), a static random-access memory (SRAM) is generally faster but more expensive, as the storage of a bit in SRAM requires more transistors than in DRAM. The main memory of a microprocessor is thus typically a DRAM due to its lower cost. But the microprocessor processing speed would be too slow if the microprocessor had to fetch all its data and instructions from the main memory. It is therefore conventional for a microprocessor to include a variety of SRAM caches for the storing of frequently used data and instructions. Each SRAM cache includes numerous transistors that consume power through the conduction of leakage currents even while the SRAM cache is idle. In addition, a latency of the read operations to the SRAM caches can limit the microprocessor operating speed.
SUMMARY
In accordance with an aspect of the disclosure, a memory is provided that includes: a first bank of bitcells; a first latch configured to latch a first bit responsive to a first self-timed read operation to the first bank in response to a system clock signal; a second bank of bitcells; a second latch configured to latch a second bit responsive to a second self-timed read operation to the second bank in response to the system clock signal; and a bank multiplexer configured to select between a latched first bit from the first latch and a latched second bit from the second latch responsive to a bank multiplexer select signal.
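The two-bank arrangement just summarized can be sketched behaviorally in Python. This is a minimal model under stated assumptions, not the disclosed circuit: it assumes each self-timed read completes within one clock period and that the bank multiplexer selects a latch in the cycle after that latch's read was initiated. The class name TwoBankMemory and its clock method are hypothetical.

```python
from typing import List, Optional, Tuple


class TwoBankMemory:
    """Behavioral model of two bitcell banks, one latch per bank, and a mux."""

    def __init__(self, bank0: List[int], bank1: List[int]) -> None:
        self.banks = (bank0, bank1)
        self.latches = [0, 0]                # one output latch per bank
        self.pending: Optional[int] = None   # latch the mux selects next cycle

    def clock(self, request: Optional[Tuple[int, int]]) -> Optional[int]:
        """Advance one system clock cycle.

        request is (bank, row) for a read initiated this cycle, or None.
        Returns the bit selected by the mux this cycle, or None when no
        read was initiated in the previous cycle.
        """
        # Mux: select the latch written by the previous cycle's read.
        data_out = None if self.pending is None else self.latches[self.pending]

        # Assumed timing: the self-timed read latches its bit before the
        # cycle ends, so the bit is available to the mux in the next cycle.
        if request is None:
            self.pending = None
        else:
            bank, row = request
            self.latches[bank] = self.banks[bank][row]
            self.pending = bank
        return data_out


# Reads alternate between the banks on consecutive cycles.
mem = TwoBankMemory(bank0=[1, 0, 1], bank1=[0, 0, 1])
requests = [(0, 0), (1, 2), (0, 1), None]
outputs = [mem.clock(r) for r in requests]
print(outputs)  # [None, 1, 1, 0]
```

Because each bank has its own latch, a read to one bank can be initiated while the multiplexer is still delivering the bit latched from the other bank, which is what lets the model accept one read per clock cycle.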
In accordance with another aspect of the disclosure, a method of reading from a memory is provided that includes: initiating a first read operation to a first bank of bitcells responsive to a first cycle of a system clock signal; latching a first bit from the first read operation in a first latch responsive to a self-timed completion of the first read operation to provide a latched first bit; initiating a second read operation to a second bank of bitcells during a second cycle of the system clock signal that is consecutive to the first cycle; and controlling a bank multiplexer to select for the latched first bit during the second cycle of the system clock signal to form a first data output signal.
Finally, in accordance with yet another aspect of the disclosure, a memory is provided that includes: a first bank of bitcells; a first latch configured to latch a first bit from the first bank of bitcells responsive to a first self-timed read operation to the first bank of bitcells to provide a first latched bit; a second bank of bitcells; a second latch configured to latch a second bit from the second bank of bitcells responsive to a second self-timed read operation to the second bank of bitcells to provide a second latched bit; means for alternatively selecting between the first latched bit and the second latched bit responsive to cycles of a system clock signal such that in a first cycle of the system clock signal the means selects for the first latched bit and in a second cycle of the system clock signal that is consecutive to the first cycle the means selects for the second latched bit; and an output register configured to be clocked by the system clock signal to register a data output signal from the means.
These and other advantageous features may be better appreciated through the following detailed description.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a timing diagram for a read operation to a microprocessor cache having a conventional three-cycle read operation latency.
FIG. 2 is a diagram of a memory such as for a microprocessor cache having only a two-cycle read operation latency in accordance with an aspect of the disclosure.
FIG. 3 is a circuit diagram of a bank multiplexer select signal generator for the memory of FIG. 2 in accordance with an aspect of the disclosure.
FIG. 4 is a circuit diagram of portions of the memory of FIG. 2 in accordance with an aspect of the disclosure.
FIG. 5 is a timing diagram for an implementation of a memory in which a system clock to data output signal timing is limited by a duration of the self-timed read operations to the banks in accordance with an aspect of the disclosure.
FIG. 6 is a timing diagram for an implementation of a memory in which a system clock to data output signal timing is limited by the system clock signal frequency in accordance with an aspect of the disclosure.
FIG. 7 is a flowchart for an example method of reading from a pair of pipelined banks in accordance with an aspect of the disclosure.
FIG. 8 illustrates some example electronic systems including a memory in accordance with an aspect of the disclosure.
Implementations of the present disclosure and their advantages are best understood by referring to the detailed description that follows. It should be appreciated that like reference numerals are used to identify like elements illustrated in one or more of the fi