
EP-4742033-A1 - CACHE MANAGEMENT METHOD, CACHE MANAGEMENT APPARATUS, AND PROCESSOR AND ELECTRONIC APPARATUS


Abstract

Provided in the embodiments of the present disclosure are a cache management method, a cache management apparatus, a processor, and an electronic apparatus. The cache management method comprises: acquiring a cache access preference of a first thread for a first cache of a multi-thread processor, the first thread being among a plurality of threads running in the multi-thread processor; and in response to the cache access preference of the first thread for the first cache meeting a cache bypass condition, stopping a data block accessed by the first thread from being loaded into the first cache. In the method, by acquiring the cache access preferences of different threads for at least one cache and making a thread with a low cache access preference bypass the cache it accesses, the contention and conflict between threads for a cache are resolved, thereby improving the utilization efficiency of the cache and realizing low overhead and low power consumption of hardware storage.

Inventors

  • PAN, Haiyang
  • JIN, Weisong

Assignees

  • Hygon Information Technology Co., Ltd.

Dates

Publication Date
20260513
Application Date
20240603

Claims (20)

  1. A cache management method, comprising: acquiring a cache access preference of a first thread among a plurality of threads running in a multi-threading processor for a first cache of the multi-threading processor; and in response to the cache access preference of the first thread for the first cache satisfying a cache bypass condition, preventing a data block accessed by the first thread from being loaded into the first cache.
  2. The cache management method according to claim 1, wherein the cache bypass condition comprises the cache access preference being lower than cache access preferences of other threads and lower than a preset threshold.
  3. The cache management method according to claim 1 or 2, wherein the acquiring a cache access preference of a first thread among a plurality of threads running in a multi-threading processor for a first cache of the multi-threading processor comprises: computing a data block reuse rate based on a reuse situation of a data block loaded into the first cache by the first thread during a period from entering the first cache to being replaced out of the first cache.
  4. The cache management method according to claim 3, wherein the computing a data block reuse rate based on a reuse situation of a data block loaded into the first cache by the first thread during a period from entering the first cache to being replaced out of the first cache comprises: for a first time period, recording a first value for a number of times the data block loaded into the first cache by the first thread is re-accessed during the period from entering the first cache to being replaced out of the first cache; recording a second value for a number of times the data block loaded into the first cache by the first thread is not re-accessed during the period from entering the first cache to being replaced out of the first cache; and computing the data block reuse rate of the data block loaded into the first cache by the first thread based on the first value and the second value, wherein the data block reuse rate = the first value / (the first value + the second value).
  5. The cache management method according to claim 4, wherein the computing a data block reuse rate based on a reuse situation of a data block loaded into the first cache by the first thread during a period from entering the first cache to being replaced out of the first cache further comprises: providing, for the first cache, a first counter and a second counter for the first thread, wherein the first counter is configured to record the first value and the second counter is configured to record the second value.
  6. The cache management method according to claim 5, wherein the first cache comprises a plurality of cache lines, and each cache line of the plurality of cache lines comprises a data block, a first field, and a second field, the first field is configured to indicate a thread that loads the data block into the first cache, and the second field is configured to indicate whether the data block is re-accessed during the period from entering the first cache to being replaced out of the first cache.
  7. The cache management method according to claim 6, wherein, in a case where one data block of a plurality of data blocks is replaced out of the first cache, in response to a second field corresponding to the data block indicating that the data block is re-accessed during the period from entering the first cache to being replaced out of the first cache, incrementing a first value corresponding to a thread indicated by the first field corresponding to the data block by one; and in response to the second field corresponding to the data block indicating that the data block is not re-accessed during the period from entering the first cache to being replaced out of the first cache, incrementing a second value corresponding to the thread indicated by the first field corresponding to the data block by one.
  8. The cache management method according to claim 1 or 2, wherein the acquiring a cache access preference of a first thread among a plurality of threads running in a multi-threading processor for a first cache of the multi-threading processor comprises: computing, for the first cache, a hit rate of cache access by the first thread.
  9. The cache management method according to claim 8, wherein the computing, for the first cache, a hit rate of cache access by the first thread, comprises: for a first time period, recording a third value for a number of cache hits for the first thread accessing the first cache; recording a fourth value for a number of cache misses for the first thread accessing the first cache; and computing the hit rate of the first thread accessing the first cache based on the third value and the fourth value, wherein the hit rate = the third value / (the third value + the fourth value).
  10. The cache management method according to claim 9, wherein the computing, for the first cache, a hit rate of cache access by the first thread, further comprises: providing, for the first cache, a third counter and a fourth counter for the first thread, wherein the third counter is configured to record the third value and the fourth counter is configured to record the fourth value.
  11. The cache management method according to claim 1 or 2, wherein the acquiring a cache access preference of a first thread among a plurality of threads running in a multi-threading processor for a first cache of the multi-threading processor comprises: computing, for the first cache, a probability that an instruction of cache access by the first thread causes a pipeline stall.
  12. A cache management apparatus, comprising: an acquisition module, configured to acquire a cache access preference of a first thread among a plurality of threads running in a multi-threading processor for a first cache of the multi-threading processor; and a processing module, configured to, in response to the cache access preference of the first thread for the first cache satisfying a cache bypass condition, prevent a data block accessed by the first thread from being loaded into the first cache.
  13. The cache management apparatus according to claim 12, wherein the cache bypass condition comprises the cache access preference being lower than cache access preferences of other threads and lower than a preset threshold.
  14. The cache management apparatus according to claim 12 or 13, wherein the acquisition module comprises: a first computing module, configured to compute a data block reuse rate based on a reuse situation of a data block loaded into the first cache by the first thread during a period from entering the first cache to being replaced out of the first cache.
  15. The cache management apparatus according to any one of claims 12-14, wherein the acquisition module comprises: a second computing module, configured to compute, for the first cache, a hit rate of cache access by the first thread.
  16. A processor, comprising: a pipeline, configured to run a plurality of threads; a first cache; and a cache controller, configured to acquire a cache access preference of a first thread among the plurality of threads running in a multi-threading processor for the first cache of the multi-threading processor, and in response to the cache access preference of the first thread for the first cache satisfying a cache bypass condition, prevent a data block accessed by the first thread from being loaded into the first cache.
  17. The processor according to claim 16, wherein the first cache comprises a plurality of cache lines, and each cache line of the plurality of cache lines comprises a data block, a first field, and a second field, the first field is configured to indicate a thread that loads the data block into the first cache, and the second field is configured to indicate whether the data block is re-accessed during the period from entering the first cache to being replaced out of the first cache.
  18. The processor according to claim 16 or 17, further comprising a counter set, wherein the counter set comprises a first counter and a second counter for the first cache, the first counter is configured to record a first value, the second counter is configured to record a second value, the first value represents a number of times the data block loaded into the first cache by the first thread is re-accessed during the period from entering the first cache to being replaced out of the first cache, and the second value represents a number of times the data block loaded into the first cache by the first thread is not re-accessed during the period from entering the first cache to being replaced out of the first cache.
  19. The processor according to any one of claims 16-18, further comprising multi-level caches, wherein the multi-level caches comprise the first cache.
  20. An electronic apparatus, comprising the processor according to any one of claims 16-19.
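The counting scheme of claims 4-7 and the bypass condition of claim 2 can be sketched in software as follows. This is an illustrative model only, not the claimed hardware: the class and function names (`CacheLine`, `ReuseTracker`, `should_bypass`) are invented for this sketch, and the "first field" / "second field" of claim 6 are modeled as the `thread_id` and `reused` attributes of a cache line.

```python
from dataclasses import dataclass

@dataclass
class CacheLine:
    tag: int
    thread_id: int     # "first field": which thread loaded this block
    reused: bool = False  # "second field": re-accessed since the fill?

class ReuseTracker:
    """Per-thread first/second counters for one cache (claims 4-5)."""
    def __init__(self, num_threads):
        self.reused_counts = [0] * num_threads      # "first value"
        self.not_reused_counts = [0] * num_threads  # "second value"

    def on_hit(self, line):
        # A hit on a resident line marks it as re-accessed.
        line.reused = True

    def on_evict(self, line):
        # On replacement, credit the thread that loaded the line (claim 7).
        if line.reused:
            self.reused_counts[line.thread_id] += 1
        else:
            self.not_reused_counts[line.thread_id] += 1

    def reuse_rate(self, thread_id):
        # reuse rate = first value / (first value + second value)  (claim 4)
        r = self.reused_counts[thread_id]
        n = self.not_reused_counts[thread_id]
        return r / (r + n) if (r + n) else 1.0

def should_bypass(tracker, thread_id, num_threads, threshold):
    """Bypass when the thread's reuse rate is lower than every other
    thread's and below the preset threshold (claim 2)."""
    mine = tracker.reuse_rate(thread_id)
    others = [tracker.reuse_rate(t)
              for t in range(num_threads) if t != thread_id]
    return mine < threshold and all(mine < o for o in others)
```

In a hardware realization these counters would be reset or decayed per first time period; the sketch omits that windowing for brevity.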

Description

The application claims priority of the Chinese Patent Application No. 202410381320.X, filed on March 29, 2024, the disclosure of which is incorporated herein by reference in its entirety as part of the present application.

TECHNICAL FIELD

Embodiments of the present disclosure relate to a cache management method and apparatus, a processor, and an electronic apparatus.

BACKGROUND

Simultaneous multi-threading (SMT) is an important technology for improving the overall performance of a central processing unit (CPU). It exploits mechanisms such as multi-issue and out-of-order execution in high-performance CPU cores to let a single physical CPU core execute instructions from multiple threads in parallel at the same moment. From the perspective of software and the operating system, a single physical CPU core can be regarded as multiple virtual CPU cores. Depending on the maximum number of active threads supported, SMT may be referred to as SMT2 (up to two active threads), SMT4 (up to four active threads), and so on. Compared with single-threading, SMT improves the resource utilization efficiency of a high-performance CPU. When a modern multi-issue high-performance CPU core executes a single thread, its internal execution units and hardware resources are often underutilized most of the time. When a thread stalls for some reason (e.g., an L2 cache miss), the hardware execution units can only idle, which wastes hardware resources and lowers the performance-to-power ratio. In SMT mode, when one thread stalls, other threads can still run, which improves the utilization of hardware resources and thus increases the multi-thread throughput, overall performance, and performance-to-power ratio of the CPU core.
SUMMARY

At least one embodiment of the present disclosure provides a cache management method, which includes: acquiring a cache access preference of a first thread among a plurality of threads running in a multi-threading processor for a first cache of the multi-threading processor; and in response to the cache access preference of the first thread for the first cache satisfying a cache bypass condition, preventing a data block accessed by the first thread from being loaded into the first cache. For example, in the cache management method provided in at least one embodiment of the present disclosure, the cache bypass condition includes the cache access preference being lower than cache access preferences of the other threads and lower than a preset threshold. For example, in the cache management method provided in at least one embodiment of the present disclosure, the acquiring a cache access preference of a first thread among a plurality of threads running in a multi-threading processor for a first cache of the multi-threading processor includes: computing a data block reuse rate based on a reuse situation of a data block loaded into the first cache by the first thread during a period from entering the first cache to being replaced out of the first cache.
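Besides the reuse rate, the disclosure also measures preference as a per-thread hit rate using a third and a fourth counter (claims 8-10). A minimal sketch of that bookkeeping follows; the class name `HitRateMonitor` and its methods are illustrative, not taken from the patent:

```python
class HitRateMonitor:
    """Per-thread hit/miss counters for one cache (claims 9-10)."""
    def __init__(self, num_threads):
        self.hits = [0] * num_threads    # "third value"
        self.misses = [0] * num_threads  # "fourth value"

    def record(self, thread_id, hit):
        # Bump the appropriate counter on every access by this thread.
        if hit:
            self.hits[thread_id] += 1
        else:
            self.misses[thread_id] += 1

    def hit_rate(self, thread_id):
        # hit rate = third value / (third value + fourth value)  (claim 9)
        h, m = self.hits[thread_id], self.misses[thread_id]
        return h / (h + m) if (h + m) else 1.0
```

As with the reuse-rate counters, a hardware version would accumulate over a first time period and then reset; the sketch leaves that windowing out.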
For example, in the cache management method provided in at least one embodiment of the present disclosure, the computing a data block reuse rate based on a reuse situation of a data block loaded into the first cache by the first thread during a period from entering the first cache to being replaced out of the first cache includes: for a first time period, recording a first value for a number of times the data block loaded into the first cache by the first thread is re-accessed during the period from entering the first cache to being replaced out of the first cache; recording a second value for a number of times the data block loaded into the first cache by the first thread is not re-accessed during the period from entering the first cache to being replaced out of the first cache; and computing the data block reuse rate of the data block loaded into the first cache by the first thread based on the first value and the second value, where the data block reuse rate = the first value / (the first value + the second value). For example, in the cache management method provided in at least one embodiment of the present disclosure, the computing a data block reuse rate based on a reuse situation of a data block loaded into the first cache by the first thread during a period from entering the first cache to being replaced out of the first cache further includes: providing, for the first cache, a first counter and a second counter for the first thread, where the first counter is configured to record the first value and the second counter is configured to record the second value. For example, in the cache management method provided in at least one embodiment of the present disclosure, the first cache includes a plurality of cache lines, and each cache line of the plurality of cache lines includes a data block, a first field, and a second field, the first field is configured to indicate a thread that loads the data block into the first