
CN-121979575-A - Instruction prefetching method, instruction prefetching device, electronic equipment and computer readable storage medium


Abstract

An instruction prefetching method, an instruction prefetching device, an electronic device, and a computer readable storage medium. In the instruction prefetching method, in response to receiving an instruction fetch request from any instruction cache, the last level cache determines whether to perform an instruction prefetch operation for the instruction fetch request; if it is determined to perform the instruction prefetch operation, it determines a prefetch stride for the instruction prefetch operation and, based on the prefetch stride, prefetches instructions from a target memory and caches them. The method can improve the hit rate in the last level cache for requests from the instruction caches and reduce the performance loss caused by instruction misses.

Inventors

  • Request for anonymity
  • Request for anonymity

Assignees

  • Shanghai Biren Technology Co., Ltd. (上海壁仞科技股份有限公司)

Dates

Publication Date
2026-05-05
Application Date
2026-03-27

Claims (15)

  1. An instruction prefetching method for a last level cache, the last level cache being configured to process access requests of at least one instruction cache and at least one data cache, the method comprising: determining, in response to receiving an instruction fetch request from any one of the instruction caches, whether to perform an instruction prefetch operation for the instruction fetch request; if it is determined to perform the instruction prefetch operation, determining a prefetch stride for the instruction prefetch operation; and prefetching instructions from a target memory and caching them based on the prefetch stride.
  2. The instruction prefetching method according to claim 1, wherein the instruction fetch request includes a prefetch field, and wherein determining whether to perform an instruction prefetch operation for the instruction fetch request comprises: in response to the prefetch field being a first value, determining to perform the instruction prefetch operation.
  3. The instruction prefetching method according to claim 1, wherein the instruction fetch request includes a stride field, and wherein determining a prefetch stride for the instruction prefetch operation comprises determining the prefetch stride based on the stride field.
  4. The instruction prefetching method according to claim 3, wherein the instruction fetch request includes a target address, the value of the stride field is determined according to the address segment in which the target address is located, and multiple values of the stride field respectively correspond to different address segments.
  5. The instruction prefetching method according to claim 1, wherein determining a prefetch stride for the instruction prefetch operation comprises: determining a memory access distance parameter corresponding to the last level cache, and determining the prefetch stride based on the memory access distance parameter, wherein the memory access distance parameter characterizes the transmission distance from the last level cache to the target memory.
  6. The instruction prefetching method according to claim 3, wherein determining the prefetch stride based on the stride field comprises: determining a first stride corresponding to the stride field based on a correspondence between the stride field and stride values; determining a second stride based on the memory access distance parameter corresponding to the last level cache; and adding the first stride and the second stride to obtain the prefetch stride.
  7. The instruction prefetching method according to any one of claims 1-6, wherein the at least one instruction cache includes a first instruction cache, and wherein prefetching instructions from a target memory and caching them based on the prefetch stride comprises: recording, in the last level cache, the number of cache lines occupied by prefetched instructions corresponding to the first instruction cache; and if the number of occupied cache lines exceeds a number threshold, in response to receiving a new prefetched instruction corresponding to the first instruction cache, replacing an instruction in a cache line occupied by the first instruction cache with the new prefetched instruction.
  8. The instruction prefetching method according to any one of claims 1-6, wherein the instruction fetch request includes a virtual address, and wherein prefetching instructions from a target memory and caching them based on the prefetch stride comprises: determining a physical address corresponding to the virtual address; determining the physical address of an instruction to be prefetched based on the physical address and the prefetch stride; and prefetching the instruction from the target memory and caching it based on the physical address of the instruction to be prefetched.
  9. The instruction prefetching method according to claim 8, wherein prefetching instructions from the target memory and caching them based on the physical address of the instructions to be prefetched comprises: determining whether the physical address of the instruction to be prefetched exceeds the storage range of the last level cache; if the storage range is not exceeded, generating an instruction prefetch request based on the physical address of the instruction to be prefetched and placing the instruction prefetch request into an instruction prefetch queue; arbitrating between the instruction prefetch queue and the access request queue to obtain a request as a pending request; and when the pending request is an instruction prefetch request, prefetching the instruction from the target memory and caching it.
  10. The instruction prefetching method according to any one of claims 1-6, wherein the instruction fetch request includes a prefetch field and the at least one instruction cache includes a second instruction cache, the method further comprising: stopping the instruction prefetch operation for the second instruction cache in response to the prefetch field in an instruction fetch request sent by the second instruction cache being a second value, wherein the prefetch field is set to the second value when execution reaches the tail of the current program segment.
  11. An instruction prefetching apparatus for a last level cache, the last level cache being configured to process access requests of at least one instruction cache and at least one data cache, the apparatus comprising: a first determining unit configured to determine, in response to receiving an instruction fetch request from any one of the instruction caches, whether to perform an instruction prefetch operation for the instruction fetch request; a second determining unit configured to determine a prefetch stride for the instruction prefetch operation if it is determined to perform the instruction prefetch operation; and a prefetch unit configured to prefetch instructions from a target memory and cache them based on the prefetch stride.
  12. The instruction prefetching apparatus according to claim 11, wherein the instruction fetch request includes a prefetch field, and wherein the first determining unit is further configured to determine to perform the instruction prefetch operation in response to the prefetch field being a first value.
  13. The instruction prefetching apparatus according to claim 11, wherein the instruction fetch request includes a stride field, and wherein the second determining unit is further configured to determine the prefetch stride based on the stride field.
  14. An electronic device, comprising: a processor; and a memory storing one or more computer program modules, wherein the one or more computer program modules are configured to be executed by the processor to implement the instruction prefetching method according to any one of claims 1-10.
  15. A computer readable storage medium storing non-transitory computer readable instructions which, when executed by a computer, implement the instruction prefetching method according to any one of claims 1-10.
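The decision and stride logic of claims 1-6 can be sketched as a small software model. This is an illustrative sketch only: the field encodings, the stride table, and the linear conversion from the memory access distance parameter to a second stride are all assumptions, since the claims do not fix concrete values.

```python
# Hypothetical model of the claimed prefetch logic (claims 1-6).
# All concrete values below are illustrative assumptions.

# Assumed correspondence between stride-field values and a first
# stride, in cache lines (claim 6).
STRIDE_TABLE = {0: 1, 1: 2, 2: 4, 3: 8}

def should_prefetch(prefetch_field: int) -> bool:
    """Claim 2: prefetch when the prefetch field holds a first value
    (assumed to be 1); claim 10: a second value (0) stops prefetching
    for that instruction cache."""
    return prefetch_field == 1

def prefetch_stride(stride_field: int, access_distance: int) -> int:
    """Claim 6: a first stride looked up from the stride field, plus a
    second stride derived from the last level cache's memory access
    distance parameter. The linear conversion (one extra line per unit
    of distance) is a guess."""
    first = STRIDE_TABLE[stride_field]
    second = access_distance  # assumed distance-to-stride conversion
    return first + second

def prefetch_addresses(target_addr: int, stride: int, line_size: int = 64):
    """Claim 8: derive the physical addresses of the lines to prefetch,
    starting from the line after the requested one."""
    base = (target_addr // line_size) * line_size
    return [base + i * line_size for i in range(1, stride + 1)]

if should_prefetch(1):
    s = prefetch_stride(stride_field=2, access_distance=3)  # 4 + 3 = 7
    addrs = prefetch_addresses(0x1000, s)
```

In an actual last-level cache this logic would be implemented in hardware; the model only makes the data flow of the claims explicit: prefetch field gates the operation, stride field plus access distance yield the stride, and the stride yields the prefetch addresses.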

Description

Instruction prefetching method, instruction prefetching device, electronic equipment and computer readable storage medium

Technical Field

Embodiments of the present disclosure relate to an instruction prefetching method, an instruction prefetching apparatus, an electronic device, and a computer-readable storage medium.

Background

In modern computer systems, network communication, and data processing, caching is a core optimization technique for reducing memory access latency and improving system throughput. FIG. 1 shows a schematic diagram of a processor cache architecture. As shown in FIG. 1, a GPGPU (General-Purpose computing on Graphics Processing Units) includes multiple compute units, each containing multiple compute cores, an L1 cache (first level cache), and an L2 cache (second level cache); the compute cores of a unit share its L1 and L2 caches, and the compute units share an L3 cache. The L1 cache is the cache closest to the processor core and is the fastest but has the smallest capacity; its instruction cache stores instructions to be executed by the compute core, and its data cache stores data to be processed. The L2 cache is larger but slightly slower than the L1 cache and serves as a supplement to it. The L3 cache has the largest capacity of the three levels but is relatively slow; it sits in a shared area inside the processor chip, and its core function is to coordinate data sharing among the compute units, avoiding data redundancy and wasted memory bandwidth. When a compute core issues a data or instruction request, it searches in the order cache, memory, hard disk: it checks the L1 cache first and returns directly on a hit; on a miss it checks the L2 cache, and so on.
If all caches miss, the data is read from memory and simultaneously written into the caches, ready for the next access. In a CPU (Central Processing Unit), the L1 cache is generally integrated inside each CPU core, with one private copy per core; the L2 cache is also private to each core and is generally located inside or near the core; and the L3 cache is shared by all CPU cores and sits in a shared area inside the CPU chip.

Disclosure of Invention

At least one embodiment of the present disclosure provides an instruction prefetching method for a last level cache, the last level cache being configured to process access requests of at least one instruction cache and at least one data cache, wherein the method includes: determining, in response to receiving an instruction fetch request from any one of the instruction caches, whether to perform an instruction prefetch operation for the instruction fetch request; determining a prefetch stride for the instruction prefetch operation if it is determined to perform the instruction prefetch operation; and prefetching instructions from a target memory and caching them based on the prefetch stride. For example, in an instruction prefetching method provided by at least one example of the above embodiments, the instruction fetch request includes a prefetch field, and determining whether to perform an instruction prefetch operation for the instruction fetch request includes determining to perform the instruction prefetch operation in response to the prefetch field being a first value. For example, in an instruction prefetching method provided by at least one example of the above embodiments, the instruction fetch request includes a stride field, and determining a prefetch stride for the instruction prefetch operation includes determining the prefetch stride based on the stride field.
For example, in an instruction prefetching method provided by at least one example of the above embodiments, the instruction fetch request includes a target address, the value of the stride field is determined according to the address segment in which the target address is located, and multiple values of the stride field respectively correspond to different address segments. For example, in an instruction prefetching method provided by at least one example of the above embodiments, determining a prefetch stride for the instruction prefetch operation includes determining a memory access distance parameter corresponding to the last level cache, and determining the prefetch stride based on the memory access distance parameter, where the memory access distance parameter characterizes the transmission distance from the last level cache to the target memory. For example, in an instruction prefetching method provided by at least one example of the above embodiments, determining the pr
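The lookup order described in the Background (L1, then L2, then L3, then memory, with a fill on the way back) can be sketched as a minimal software model. Modeling each level as a dict is purely illustrative; real caches have fixed capacity, line granularity, and replacement policies that are omitted here.

```python
# Minimal sketch of the hierarchical lookup order from the Background:
# check L1 first, return directly on a hit; on a miss fall through to
# L2 and L3; if all caches miss, read from memory and write the data
# into every cache level, ready for the next access.

def lookup(addr, l1: dict, l2: dict, l3: dict, memory: dict):
    for level in (l1, l2, l3):
        if addr in level:
            return level[addr]      # hit: return directly
    data = memory[addr]             # all caches missed
    for level in (l1, l2, l3):
        level[addr] = data          # fill the caches on the way back
    return data

l1, l2, l3 = {}, {}, {}
memory = {0x40: "insn"}
lookup(0x40, l1, l2, l3, memory)    # miss everywhere, fills all levels
lookup(0x40, l1, l2, l3, memory)    # now hits in L1
```

A second access to the same address hits in L1 without touching memory, which is exactly the latency-hiding effect the instruction prefetching method aims to achieve proactively for instructions that have not yet been requested.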