US-12625822-B2 - Methods and apparatus to facilitate read-modify-write support in a victim cache
Abstract
Methods, apparatus, systems and articles of manufacture are disclosed to facilitate read-modify-write support in a victim cache. An example apparatus includes a first storage coupled to a controller, a second storage coupled to the controller and parallel coupled to the first storage, and a storage queue coupled to the first storage, the second storage, and to the controller, the storage queue to obtain a memory operation from the controller indicating an address and a first set of data, obtain a second set of data associated with the address from at least one of the first storage and the second storage, merge the first set of data and the second set of data to produce a third set of data, and provide the third set of data for writing to at least one of the first storage and the second storage.
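The merge operation summarized above can be sketched as a per-byte select: bytes the store actually drives come from the first set of data, and the remaining bytes are filled from the line read out of cache storage. This is an illustrative model only; the function and constant names below (`rmw_merge`, `LINE_BYTES`, the byte-enable encoding) are assumptions, not taken from the patent.

```c
#include <stdint.h>

/* Width of one storage entry in this sketch (illustrative). */
#define LINE_BYTES 8

/* Read-modify-write merge: combine a partial store (first set of data,
 * qualified by per-byte enables) with the line already held in the main
 * or victim cache storage (second set of data) to produce the full line
 * to write back (third set of data). */
void rmw_merge(const uint8_t *store_data, uint8_t byte_enable,
               const uint8_t *cache_data, uint8_t *merged)
{
    for (int i = 0; i < LINE_BYTES; i++) {
        /* Take the new byte where the store drives it; otherwise keep
         * the byte read from cache storage. */
        merged[i] = ((byte_enable >> i) & 1) ? store_data[i] : cache_data[i];
    }
}
```

In hardware this select would typically be a bank of 2:1 byte multiplexers in the storage queue rather than a loop.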
Inventors
- Naveen Bhoria
- Timothy David Anderson
- Pete Michael Hippleheuser
Assignees
- TEXAS INSTRUMENTS INCORPORATED
Dates
- Publication Date: 2026-05-12
- Application Date: 2024-07-17
Claims (20)
- 1. A device, comprising: a cache controller; a first cache storage coupled to the cache controller; a second cache storage coupled to the cache controller in parallel with the first cache storage; and a storage queue coupled to the second cache storage and the cache controller, wherein the cache controller is configured to cause the storage queue to: receive a first set of data from the first cache storage; receive a second set of data from the second cache storage; merge the first set of data and the second set of data to produce a third set of data; and write the third set of data to the second cache storage.
- 2. The device of claim 1, wherein the storage queue is configured to receive a memory operation that indicates an address and includes the first set of data.
- 3. The device of claim 2, wherein the second set of data is associated with the address.
- 4. The device of claim 1, wherein the storage queue comprises a read-modify-write circuit configured to merge the first set of data and the second set of data to produce the third set of data.
- 5. The device of claim 1, wherein: the storage queue comprises an error correcting code generator circuit configured to generate a set of error correcting code syndrome bits based on the third set of data; and the storage queue is configured to write the set of error correcting code syndrome bits to the second cache storage.
- 6. The device of claim 1, wherein the first cache storage is a level one (L1) main cache storage; and the second cache storage is an L1 victim cache storage.
- 7. The device of claim 6, further comprising: a level two (L2) cache storage; and an L2 interface configured to receive the third set of data for writing to the L2 cache storage.
- 8. A system, comprising: a processing unit configured to generate a memory operation; a cache controller configured to receive the memory operation; a first cache storage coupled to the cache controller; a second cache storage coupled to the cache controller in parallel with the first cache storage; and a storage queue coupled to the second cache storage and the cache controller, wherein the cache controller is configured to cause, based on the memory operation, the storage queue to: receive a first set of data from the first cache storage; receive a second set of data from the second cache storage; merge the first set of data and the second set of data to produce a third set of data; and write the third set of data to the second cache storage.
- 9. The system of claim 8, wherein the memory operation indicates an address and includes the first set of data.
- 10. The system of claim 9, wherein the second set of data is associated with the address.
- 11. The system of claim 8, wherein the storage queue comprises a read-modify-write circuit configured to merge the first set of data and the second set of data to produce the third set of data.
- 12. The system of claim 8, wherein: the storage queue comprises an error correcting code generator circuit configured to generate a set of error correcting code syndrome bits based on the third set of data; and the storage queue is configured to write the set of error correcting code syndrome bits to the second cache storage.
- 13. The system of claim 8, wherein the first cache storage is a level one (L1) main cache storage; and the second cache storage is an L1 victim cache storage.
- 14. The system of claim 13, further comprising: a level two (L2) cache storage; and an L2 interface configured to receive the third set of data for writing to the L2 cache storage.
- 15. A method, comprising: retrieving, by a cache controller, a first set of data from a first cache storage; providing the first set of data to a storage queue coupled to a second cache storage; retrieving a second set of data from the second cache storage; merging, by the storage queue, the first set of data and the second set of data to produce a third set of data; and storing, by the storage queue, the third set of data in the second cache storage.
- 16. The method of claim 15, wherein the storage queue receives a memory operation that indicates an address and includes the first set of data.
- 17. The method of claim 16, wherein the second set of data is associated with the address.
- 18. The method of claim 15, further comprising: generating a set of error correcting code syndrome bits based on the third set of data; and storing the set of error correcting code syndrome bits in the second cache storage.
- 19. The method of claim 15, wherein: the first cache storage is a level one (L1) main cache storage; and the second cache storage is an L1 victim cache storage.
- 20. The method of claim 19, further comprising: providing the third set of data to a level two (L2) cache storage for writing.
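Claims 5, 12, and 18 recite an error correcting code generator circuit that produces syndrome (check) bits from the merged data. A common way to build such a generator is to make each check bit the XOR-reduction of a subset of data bits selected by a fixed mask. The sketch below uses a standard Hamming(12,8) mask assignment purely for illustration; the patent does not specify a particular code, and the names `ecc_masks` and `ecc_syndrome` are assumptions.

```c
#include <stdint.h>

/* Each mask selects the data bits covered by one check bit.
 * These are conventional Hamming(12,8) masks, chosen for illustration. */
static const uint8_t ecc_masks[4] = {
    0x5B, /* check bit 0 covers data bits 0,1,3,4,6 */
    0x6D, /* check bit 1 covers data bits 0,2,3,5,6 */
    0x8E, /* check bit 2 covers data bits 1,2,3,7   */
    0xF0, /* check bit 3 covers data bits 4,5,6,7   */
};

/* Generate the check bits stored alongside one data byte. On readback,
 * regenerating these bits and XOR-ing with the stored copy yields a
 * syndrome that locates a single-bit error. */
uint8_t ecc_syndrome(uint8_t data)
{
    uint8_t syn = 0;
    for (int i = 0; i < 4; i++) {
        uint8_t x = data & ecc_masks[i];
        uint8_t p = 0;
        while (x) {          /* parity (XOR-reduction) of selected bits */
            p ^= 1;
            x &= (uint8_t)(x - 1);
        }
        syn |= (uint8_t)(p << i);
    }
    return syn;
}
```

In the claimed arrangement, these bits would be computed on the third (merged) set of data and written to the victim cache storage together with it.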
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
This application is a continuation of U.S. patent application Ser. No. 18/309,893, filed May 1, 2023, which is a continuation of U.S. patent application Ser. No. 16/882,231, filed May 22, 2020, now U.S. Pat. No. 11,640,357, issued May 2, 2023, which claims the benefit of and priority to U.S. Provisional Patent Application Ser. No. 62/852,494, filed May 24, 2019, titled "Data Caches and Related Systems and Methods," all of which are hereby incorporated herein by reference in their entireties.
FIELD OF THE DISCLOSURE
This disclosure relates generally to computer architecture and, more particularly, to methods and apparatus to facilitate read-modify-write support in a victim cache.
BACKGROUND
Computing systems include one or more processing cores that execute instructions by accessing data stored in memory. However, the amount of time it takes for a processing core to access data from the memory can be significant. Accordingly, most computing systems include a cache, which stores an amount of data from the memory (typically smaller than the total amount of data in the memory) that has a high probability of being accessed by the processing core in the future. When the processing core calls for data, the cache can provide the data faster than the memory can, thereby increasing the speed and efficiency of the computing system.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is an example computing system described in conjunction with examples disclosed herein.
FIG. 2 is a block diagram of an example level one cache of the example computing system of FIG. 1.
FIGS. 3A-3D illustrate an example circuit implementation of the level one cache of the example computing system of FIG. 1.
FIG. 4A is an example circuit implementation of the main cache store queue of FIGS. 3A-3D.
FIG. 4B is an example circuit implementation of example data forwarding logic implemented in connection with the main cache store queue of FIGS. 3A-4A.
FIG. 4C is an additional example circuit implementation of the main cache store queue and/or of a victim cache store queue.
FIG. 4D is an example hardware implementation of a merging circuit of FIG. 4A.
FIG. 5 illustrates an example victim cache-based section of the level one cache of FIGS. 3A-3D.
FIG. 6 illustrates an example first table and an example second table for implementing an example replacement policy component of FIGS. 3A-3D and 5.
FIG. 7A illustrates an example encapsulated data cache system including a victim cache store queue and victim storage.
FIG. 7B illustrates another example encapsulated data cache system including a main cache store queue and main storage.
FIG. 7C illustrates an example victim cache bank structure.
FIG. 7D illustrates an example main cache bank structure.
FIG. 7E illustrates an example unified cache bank structure.
FIG. 8A illustrates a schematic illustration of an example implementation of victim cache tag random access memory.
FIG. 8B illustrates another schematic illustration of the example implementation of the victim cache tag random access memory of FIG. 8A.
FIGS. 9A and 9B (collectively FIG. 9) illustrate example instruction servicing workflows.
FIGS. 10A-1 and 10A-2 (collectively FIG. 10A) illustrate an example data cache system.
FIGS. 10B-1 and 10B-2 (collectively FIG. 10B) illustrate a second example data cache system.
FIG. 11A is an example circuit implementation of the victim cache store queue of FIGS. 3A-3D.
FIGS. 11B-1 and 11B-2 (collectively FIG. 11B) illustrate an example circuit implementation of an example unified cache store queue.
FIGS. 12-33 are flowcharts representative of machine readable instructions which may be executed to implement the level one cache of FIGS. 1-3D.
FIG. 34 is a block diagram of an example processing platform structured to execute the instructions of FIGS. 12-33 to implement the level one cache of FIGS. 1-3D.
DETAILED DESCRIPTION
The figures are not to scale. Instead, the thickness of the layers or regions may be enlarged in the drawings. In general, the same reference numbers will be used throughout the drawing(s) and accompanying written description to refer to the same or like parts. As used in this patent, stating that any part (e.g., a layer, film, area, region, or plate) is in any way on (e.g., positioned on, located on, disposed on, or formed on) another part indicates that the referenced part is either in contact with the other part, or that the referenced part is above the other part with one or more intermediate part(s) located therebetween. Connection references (e.g., attached, coupled, connected, and joined) are to be construed broadly and may include intermediate members between a collection of elements and relative movement between elements unless otherwise indicated. As such, connection references do not necessarily infer that two elements are directly connected and in fixed relation to each other.
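The claims describe the second (victim) cache storage as coupled to the controller in parallel with the first (main) cache storage, so a given address can hit in either storage without a second, serialized lookup. A minimal software model of that parallel probe is sketched below; the structure and field names (`l1_lookup`, `main_store`, `victim_store`, the array sizes) are illustrative assumptions, not taken from the patent.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define N_MAIN   4  /* entries in main cache storage (illustrative)   */
#define N_VICTIM 2  /* entries in victim cache storage (illustrative) */

struct line { bool valid; uint32_t tag; uint8_t data[8]; };

struct l1 {
    struct line main_store[N_MAIN];     /* first cache storage  */
    struct line victim_store[N_VICTIM]; /* second cache storage */
};

/* Probe both storages for the same tag. Hardware would drive both tag
 * comparisons in the same cycle; this model checks them sequentially.
 * Returns the hitting line, or NULL when both storages miss (in which
 * case the access would be serviced from L2). */
struct line *l1_lookup(struct l1 *c, uint32_t tag)
{
    for (int i = 0; i < N_MAIN; i++)
        if (c->main_store[i].valid && c->main_store[i].tag == tag)
            return &c->main_store[i];
    for (int i = 0; i < N_VICTIM; i++)
        if (c->victim_store[i].valid && c->victim_store[i].tag == tag)
            return &c->victim_store[i];
    return NULL;
}
```

A hit in the victim storage supplies the second set of data that the storage queue merges with incoming store data, per the claimed read-modify-write flow.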