US-12619544-B2 - Differential approach for indirect memory prefetcher training
Abstract
Certain aspects of the present disclosure provide techniques and apparatus that may be used to efficiently train address generation components of an indirect memory prefetcher (IMP). In some cases, logical shift operators and comparators may be used to determine an unknown size and base address used to generate a prefetch address.
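The shift-and-compare idea in the abstract can be sketched in software as follows. This is an illustrative model only, not code from the patent; the function name, the `max_n` bound, and the assumption that the element size is a power of two are all assumptions of the sketch.

```python
def predict_size_shift(delta_addr: int, delta_idx: int, max_n: int = 4):
    """Predict an element size (2**n) by logically shifting the index
    delta by different amounts and comparing each shifted version
    against the address delta, as parallel comparators would in hardware.

    delta_addr: difference between two data-array entry addresses
    delta_idx:  difference between the two corresponding indices
    """
    # Generate shifted versions of the index delta (one per candidate n).
    shifted = [delta_idx << n for n in range(max_n + 1)]
    # A match between the address delta and a shifted index delta
    # identifies the shift amount, and hence the size parameter.
    for n, s in enumerate(shifted):
        if s == delta_addr:
            return 1 << n  # predicted size parameter
    return None  # no power-of-two size matched
```

For example, if two accesses differ by 32 bytes in address and 4 in index, the match occurs at a shift of 3, predicting an element size of 8 bytes.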
Inventors
- Damian MAIORANO
- Tanvir MANHOTRA
- Sabine FRANCIS
Assignees
- QUALCOMM INCORPORATED
Dates
- Publication Date: 2026-05-05
- Application Date: 2024-08-13
Claims (20)
- 1 . An apparatus for computing an address for an indirect memory prefetch, comprising: circuitry configured to compute a first parameter that represents a difference between a first address of a first entry of a data array and a second address of a second entry of the data array, wherein the first entry of the data array corresponds to a first index of an array of indices and the second entry of the data array corresponds to a second index of the array of indices; circuitry configured to compute a second parameter that represents a difference between the first index and the second index; circuitry configured to logically shift the second parameter by different values to generate shifted versions of the second parameter; circuitry configured to predict a value of a size parameter used for an indirect memory prefetch, wherein the prediction is based on comparisons of the first parameter with the shifted versions of the second parameter to find a match between the first parameter and a shifted version of the second parameter; circuitry configured to compute a base address for the indirect memory prefetch based on the predicted value of the size parameter, the first parameter, and the second parameter; and circuitry configured to compute a prefetch address for the indirect memory prefetch based on the computed base address and the predicted value of the size parameter.
- 2 . The apparatus of claim 1 , wherein the size parameter corresponds to a size of elements in the data array.
- 3 . The apparatus of claim 1 , wherein each of the different values corresponds to a different value of the size parameter equal to 2 raised to an exponent of an integer, n.
- 4 . The apparatus of claim 3 , wherein the different values include at least one value corresponding to n=0.
- 5 . The apparatus of claim 3 , wherein the different values include at least one value corresponding to a value of the size parameter that is less than 1.
- 6 . The apparatus of claim 3 , wherein the circuitry configured to compute the base address uses a result of the comparisons to control a multiplexor to select a corresponding shifted version of the second parameter for use in computing the base address.
- 7 . The apparatus of claim 1 , further comprising circuitry for performing the indirect memory prefetch based on the computed prefetch address.
- 8 . A method for computing an address for an indirect memory prefetch, comprising: computing a first parameter that represents a difference between a first address of a first entry of a data array and a second address of a second entry of the data array, wherein the first entry of the data array corresponds to a first index of an array of indices and the second entry of the data array corresponds to a second index of the array of indices; computing a second parameter that represents a difference between the first index and the second index; logically shifting the second parameter by different values to generate shifted versions of the second parameter; predicting a value of a size parameter used for an indirect memory prefetch, wherein the prediction is based on comparisons of the first parameter with the shifted versions of the second parameter to find a match between the first parameter and a shifted version of the second parameter; computing a base address for the indirect memory prefetch based on the predicted value of the size parameter, the first parameter, and the second parameter; and computing an address for the indirect memory prefetch based on the computed base address and the predicted value of the size parameter.
- 9 . The method of claim 8 , wherein the size parameter corresponds to a size of elements in the data array.
- 10 . The method of claim 8 , wherein each of the different values corresponds to a different value of the size parameter equal to 2 raised to an exponent of an integer, n.
- 11 . The method of claim 10 , wherein the different values include at least one value corresponding to n=0.
- 12 . The method of claim 10 , wherein the different values include at least one value corresponding to a value of the size parameter that is less than 1.
- 13 . The method of claim 10 , further comprising using a result of the comparisons to control a multiplexor to select a corresponding shifted version of the second parameter for use in computing the base address.
- 14 . The method of claim 8 , further comprising performing the indirect memory prefetch based on the computed address.
- 15 . An apparatus, comprising: means for computing a first parameter that represents a difference between a first address of a first entry of a data array and a second address of a second entry of the data array, wherein the first entry of the data array corresponds to a first index of an array of indices and the second entry of the data array corresponds to a second index of the array of indices; means for computing a second parameter that represents a difference between the first index and the second index; means for logically shifting the second parameter by different values to generate shifted versions of the second parameter; means for predicting a value of a size parameter used for an indirect memory prefetch, wherein the prediction is based on comparisons of the first parameter with the shifted versions of the second parameter to find a match between the first parameter and a shifted version of the second parameter; means for computing a base address for the indirect memory prefetch based on the predicted value of the size parameter, the first parameter, and the second parameter; and means for computing an address for the indirect memory prefetch based on the computed base address and the predicted value of the size parameter.
- 16 . The apparatus of claim 15 , wherein the size parameter corresponds to a size of elements in the data array.
- 17 . The apparatus of claim 15 , wherein each of the different values corresponds to a different value of the size parameter equal to 2 raised to an exponent of an integer, n.
- 18 . The apparatus of claim 17 , wherein the different values include at least one value corresponding to n=0.
- 19 . The apparatus of claim 17 , wherein the different values include at least one value corresponding to a value of the size parameter that is less than 1.
- 20 . The apparatus of claim 17 , further comprising means for using a result of the comparisons to control a multiplexor to select a corresponding shifted version of the second parameter for use in computing the base address.
Description
TECHNICAL FIELD

Certain aspects of the present disclosure generally relate to prefetchers and, more particularly, to techniques for training indirect memory prefetcher (IMP) components.

BACKGROUND

A processing system includes a central processing unit (CPU), cache memory, main memory (e.g., random access memory), and a prefetcher. The prefetcher anticipates data (and/or instructions) the CPU may need from the main memory, fetches the data from the main memory, and loads the data into the cache memory. By fetching the data from the main memory before the data is needed by the CPU, the prefetcher minimizes the amount of time the CPU must wait for data, thereby improving the efficiency of the processing system.

BRIEF SUMMARY

Certain aspects provide a method that may be used to train indirect memory prefetcher (IMP) components. The method generally includes: computing a first parameter that represents a difference between a first address of a first entry of a data array and a second address of a second entry of the data array, wherein the first entry of the data array corresponds to a first index of an array of indices and the second entry of the data array corresponds to a second index of the array of indices; computing a second parameter that represents a difference between the first index and the second index; logically shifting the second parameter by different values to generate shifted versions of the second parameter; predicting a value of a size parameter used for an indirect memory prefetch, wherein the prediction is based on comparisons of the first parameter with the shifted versions of the second parameter to find a match between the first parameter and a shifted version of the second parameter; computing a base address for the indirect memory prefetch based on the predicted value of the size parameter, the first parameter, and the second parameter; and computing an address for the indirect memory prefetch based on the computed base address and the predicted value of the size parameter.

Other aspects provide a processor comprising an IMP configured to perform the aforementioned method, as well as those described herein; and a processor comprising means for performing the aforementioned method, as well as those further described herein. The following description and the related drawings set forth in detail certain illustrative features of one or more aspects.

BRIEF DESCRIPTION OF THE DRAWINGS

The appended figures depict certain features of one or more aspects of the present disclosure and are therefore not to be considered limiting of the scope of this disclosure.

FIG. 1 depicts an example computing environment for prefetching an access pattern associated with an application, according to various aspects of the present disclosure.

FIG. 2 depicts an address generation component of an indirect memory prefetcher (IMP), according to various aspects of the present disclosure.

FIG. 3 depicts an example differential approach to train an IMP memory address generation component, according to various aspects of the present disclosure.

FIG. 4 depicts a method for training an IMP address generation component, according to various aspects of the present disclosure.

FIG. 5 depicts an example processing system configured to perform various aspects of the present disclosure.

To facilitate understanding, identical reference numerals have been used, where possible, to designate identical elements that are common to the drawings. It is contemplated that elements and features of one aspect may be beneficially incorporated in other aspects without further recitation.

DETAILED DESCRIPTION

Aspects of the present disclosure provide apparatuses, methods, processing systems, and computer-readable mediums for training an IMP address generation component.
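As a concrete illustration of the method summarized above, the following sketch recovers the size parameter and base address from two observed accesses and then forms a prefetch address. All names are hypothetical, and the sketch assumes a power-of-two element size within a small candidate range; it is a software model of the differential approach, not the hardware implementation.

```python
def train_and_prefetch(addr1, idx1, addr2, idx2, next_idx, max_n=4):
    """Given two (address, index) observations and a new index,
    predict size and base, then compute the prefetch address."""
    delta_addr = addr1 - addr2  # first parameter: address difference
    delta_idx = idx1 - idx2     # second parameter: index difference
    for n in range(max_n + 1):
        # Compare the address delta with shifted versions of the
        # index delta; a match predicts size = 2**n.
        if (delta_idx << n) == delta_addr:
            size = 1 << n
            base = addr1 - (idx1 << n)        # recovered base address
            return base + next_idx * size     # prefetch address
    return None  # training failed: no candidate size matched
```

For instance, with accesses at address 0x1018 for index 3 and 0x1038 for index 7, the deltas are 0x20 and 4, matching at n=3 (size 8, base 0x1000), so index 10 prefetches 0x1050.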
Memory prefetching generally refers to a mechanism used in computer architectures to improve the efficiency of memory access by speculatively loading data into memory with low access times (e.g., a local cache). Prefetching works by predicting which data (or instructions) will be needed soon and loading that data into a fast-access cache before it is actually requested by a processor (e.g., a central processing unit (CPU)). This helps to reduce the latency associated with fetching data from main memory, which is slower to access than the cache. By preloading data into the cache, prefetching can significantly speed up the execution of programs, especially those with predictable memory access patterns, such as loops or sequential data processing.

There are various types of prefetching techniques, including hardware prefetching and software prefetching. As the name implies, hardware prefetching is implemented in hardware and operates automatically, without requiring any software intervention. Hardware prefetching relies on algorithms that predict future memory accesses based on past patterns. An indirect memory prefetcher (IMP) generall
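The access pattern an IMP targets, per the claims' data array reached through an array of indices, can be illustrated with a toy example. The variable names and values below are purely illustrative; the point is that each data address depends on a value loaded at run time, so a simple stride prefetcher cannot predict it, while an IMP that learns the base address and element size can.

```python
# Indirect (gather) access pattern: data[index[i]].
# The index array is streamed sequentially, but the data array is
# accessed at run-time-dependent offsets.
index = [7, 2, 9, 4]           # array of indices (illustrative values)
data = list(range(100))        # data array (illustrative contents)
total = sum(data[i] for i in index)  # each access is data[index[i]]
```

Knowing the data array's base address and element size, an IMP can compute the address of `data[index[i]]` as soon as `index[i]` is fetched, well before the demand access.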