
CN-120179293-B - Branch predictor, method of operation, processor, electronic device, and storage medium


Abstract

Embodiments of the present disclosure provide a branch predictor for multiple threads, a method of operating it, a processor, an electronic device, and a storage medium. The branch predictor comprises a plurality of tag prediction tables of different levels, and the plurality of threads share the plurality of tag prediction tables. The operation method comprises: setting tag offset values for the plurality of threads; and, in response to an operation of a target thread among the plurality of threads on the plurality of tag prediction tables, determining the storage area corresponding to each tag prediction table and the target thread based on the target tag offset value corresponding to the target thread and the respective prediction table identifiers of the plurality of tag prediction tables. The operation method can relieve capacity competition among different threads for the same branch predictor, reduce interference between threads, and improve branch prediction accuracy.

Inventors

  • LI HAIFENG

Assignees

  • 海光信息技术股份有限公司

Dates

Publication Date
20260512
Application Date
20250313

Claims (15)

  1. A method of operation of a branch predictor for a plurality of threads, wherein the branch predictor includes a plurality of tag prediction tables of different levels, the plurality of threads sharing the plurality of tag prediction tables, the method of operation comprising: setting tag offset values for the plurality of threads; and, in response to an operation of a target thread among the plurality of threads on the plurality of tag prediction tables, determining a storage area corresponding to each tag prediction table and the target thread based on a target tag offset value corresponding to the target thread and a prediction table identifier of each tag prediction table; wherein the branch predictor includes N tag prediction tables and N storage areas, each storage area stores 2^m entries, and N and m are positive integers; wherein the determining, based on the target tag offset value corresponding to the target thread and the respective prediction table identifiers of the plurality of tag prediction tables, of a storage area corresponding to the target thread for each tag prediction table includes: for a first tag prediction table of the N tag prediction tables, determining that the area identifier of the storage area corresponding to the first tag prediction table and the target thread is (Tage_Offset+Table_id) in response to (Tage_Offset+Table_id) being less than or equal to N, or determining that the area identifier of the storage area corresponding to the first tag prediction table and the target thread is (Tage_Offset+Table_id-N) in response to (Tage_Offset+Table_id) being greater than N, wherein Table_id is the prediction table identifier of the first tag prediction table, and Tage_Offset is the target tag offset value.
  2. The method of operation of claim 1, wherein the operation of the target thread on the plurality of tag prediction tables comprises a query operation, and after the determining of the storage area corresponding to the target thread for each tag prediction table, the method of operation further includes: reading the entries stored in the storage area corresponding to each tag prediction table and the target thread to query whether the branch instruction to be predicted of the target thread hits the N tag prediction tables.
  3. The method of operation of claim 2, further comprising: in response to the branch instruction to be predicted of the target thread hitting a first entry in at least one tag prediction table, taking the highest-level tag prediction table among the at least one hit tag prediction table as a target tag prediction table, and reading a prediction result of the first entry in the storage area of the target tag prediction table corresponding to the target thread to obtain a target prediction result.
  4. The method of operation of claim 3, further comprising: in response to the target prediction result being incorrect, updating the first entry in the at least one hit tag prediction table based on the actual result of the branch instruction to be predicted, and requesting allocation of a new entry in a tag prediction table of a higher level than the target tag prediction table to store instruction information corresponding to the branch instruction to be predicted and the actual result.
  5. The method of operation of claim 3, further comprising: in response to the target prediction result being incorrect and the target tag prediction table being the highest-level table among the plurality of tag prediction tables, updating the first entry in the at least one hit tag prediction table based on the actual result of the branch instruction to be predicted.
  6. The method of operation of any of claims 1-5, further comprising: recording demand information of the plurality of threads for each tag prediction table, and adjusting the tag offset values of the plurality of threads based on the demand information.
  7. The method of operation of claim 6, wherein the demand information includes the number of times each tag prediction table is requested to allocate a new entry, and the adjusting of the tag offset values of the plurality of threads based on the demand information comprises: adjusting the tag offset values of the plurality of threads in response to the number of times at least one tag prediction table is requested to allocate a new entry exceeding a preset threshold within a preset time.
  8. The method of operation of any of claims 1-5, further comprising: partitioning the plurality of threads into at least two groups of threads, wherein each group of threads shares a same tag offset value.
  9. A branch predictor for a plurality of threads, comprising: a plurality of tag prediction tables of different levels, wherein the plurality of threads share the plurality of tag prediction tables; a shift register configured to store tag offset values set for the plurality of threads; and a branch prediction control logic module configured to, in response to an operation of a target thread among the plurality of threads on the plurality of tag prediction tables, determine a storage area corresponding to each tag prediction table and the target thread based on a target tag offset value corresponding to the target thread and a prediction table identifier of each of the plurality of tag prediction tables; wherein the branch predictor includes N tag prediction tables and N storage areas, each storage area stores 2^m entries, and N and m are positive integers; wherein the determining, based on the target tag offset value corresponding to the target thread and the respective prediction table identifiers of the plurality of tag prediction tables, of a storage area corresponding to the target thread for each tag prediction table includes: for a first tag prediction table of the N tag prediction tables, determining that the area identifier of the storage area corresponding to the first tag prediction table and the target thread is (Tage_Offset+Table_id) in response to (Tage_Offset+Table_id) being less than or equal to N, or determining that the area identifier of the storage area corresponding to the first tag prediction table and the target thread is (Tage_Offset+Table_id-N) in response to (Tage_Offset+Table_id) being greater than N, wherein Table_id is the prediction table identifier of the first tag prediction table, and Tage_Offset is the target tag offset value.
  10. The branch predictor as recited in claim 9, further comprising: a hardware control logic module configured to record demand information of the plurality of threads for each tag prediction table and to adjust the tag offset values stored in the shift register based on the demand information.
  11. The branch predictor as recited in claim 9, wherein the tag offset value in the shift register is configured to be modified externally.
  12. A processor comprising the branch predictor according to any of claims 9-11, wherein the processor is configured to support a plurality of threads.
  13. An electronic device comprising the processor of claim 12.
  14. An electronic device, comprising: at least one memory unit configured to store computer readable instructions, and at least one processing unit configured to execute the computer readable instructions stored by the memory unit to implement the method of operation according to any one of claims 1-8.
  15. A computer readable storage medium having computer readable instructions stored therein, which, when executed by a processor, implement the method of operation according to any of claims 1-8.
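
As an illustration of the storage-area mapping recited in claims 1 and 9, the following Python sketch computes the area identifier for each tag prediction table of a thread. It is a minimal model under stated assumptions, not the patented hardware: the function names `region_id` and `region_map` are hypothetical, and the ranges assumed here (Table_id from 1 to N, Tage_Offset from 0 to N-1) are chosen so that the result always falls in 1 to N; the claims themselves do not state these ranges.

```python
def region_id(tage_offset: int, table_id: int, n_tables: int) -> int:
    """Return the storage-area identifier for one table of one thread.

    Follows the rule in claim 1: use (Tage_Offset + Table_id) when the sum
    is at most N, otherwise wrap around by subtracting N.
    Assumed ranges (not stated in the claims): 1 <= table_id <= N and
    0 <= tage_offset <= N - 1.
    """
    if not 1 <= table_id <= n_tables:
        raise ValueError("table_id must be in 1..N (assumed range)")
    if not 0 <= tage_offset < n_tables:
        raise ValueError("tage_offset must be in 0..N-1 (assumed range)")
    total = tage_offset + table_id
    return total if total <= n_tables else total - n_tables


def region_map(tage_offset: int, n_tables: int) -> dict[int, int]:
    """Map every table identifier 1..N to its storage area for one thread."""
    return {t: region_id(tage_offset, t, n_tables)
            for t in range(1, n_tables + 1)}


if __name__ == "__main__":
    # Two threads with different offsets sharing N = 4 tables.
    print(region_map(0, 4))
    print(region_map(2, 4))
```

Under these assumptions, with N = 4 a thread with offset 0 maps tables 1 to 4 onto areas 1 to 4, while a thread with offset 2 maps them onto areas 3, 4, 1, 2. Each thread still sees every area exactly once, but a given table level of the two threads lands in different areas.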

Description

Branch predictor, method of operation, processor, electronic device, and storage medium

Technical Field

Embodiments of the present disclosure relate to a branch predictor for multiple threads and a method of operating it, a processor, an electronic device, and a storage medium.

Background

In a high-performance out-of-order execution processor, an accurate branch predictor plays a critical role in maximizing the throughput of the processor. It speculatively makes branch decisions on the program execution path, allowing the processor to begin executing the predicted instruction stream before the branch condition is actually determined. In this way, the branch predictor can effectively avoid pipeline stalls and interruptions of the instruction execution sequence caused by waiting for branch results, thereby significantly improving overall instruction-level parallelism and the utilization of processor resources, and ensuring that the branch predictor remains in a high-efficiency working state. Accurate branch prediction is a key factor in optimizing system performance, especially for code that depends heavily on branch logic.

Disclosure of Invention

At least one embodiment of the present disclosure provides an operation method of a branch predictor for a plurality of threads, where the branch predictor includes a plurality of tag prediction tables of different levels and the plurality of threads share the plurality of tag prediction tables. The operation method includes: setting tag offset values for the plurality of threads; and, in response to an operation of a target thread among the plurality of threads on the plurality of tag prediction tables, determining the storage area of each tag prediction table corresponding to the target thread based on the target tag offset value corresponding to the target thread and the prediction table identifier of each of the plurality of tag prediction tables.
For example, in the operation method provided by at least one embodiment of the present disclosure, the branch predictor includes N tag prediction tables and N storage areas, each storage area stores 2^m entries, and N and m are positive integers. Determining the storage area corresponding to the target thread for each tag prediction table, based on the target tag offset value corresponding to the target thread and the prediction table identifier of each of the tag prediction tables, includes: for a first tag prediction table of the N tag prediction tables, in response to (Tage_Offset+Table_id) being less than or equal to N, determining that the area identifier of the storage area corresponding to the first tag prediction table and the target thread is (Tage_Offset+Table_id); or, in response to (Tage_Offset+Table_id) being greater than N, determining that the area identifier is (Tage_Offset+Table_id-N), where Table_id is the prediction table identifier of the first tag prediction table and Tage_Offset is the target tag offset value.
For example, in the operation method provided by at least one embodiment of the present disclosure, the operation of the target thread on the plurality of tag prediction tables includes a query operation, and after determining the storage area corresponding to the target thread for each tag prediction table, the operation method further includes reading the entries stored in the storage area corresponding to the target thread for each tag prediction table, so as to query whether the branch instruction to be predicted of the target thread hits the N tag prediction tables.
For example, the operation method provided by at least one embodiment of the present disclosure further includes, in response to the branch instruction to be predicted of the target thread hitting a first entry in at least one tag prediction table, taking the highest-level tag prediction table among the at least one hit tag prediction table as the target tag prediction table, and reading the prediction result of the first entry located in the storage area of the target tag prediction table corresponding to the target thread, so as to obtain a target prediction result.
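The query and highest-level selection steps described above can be sketched in Python as follows. This is a hedged illustration only: the `Entry` type, the `predict` function, and the idea that the caller supplies a precomputed (index, tag) pair per table are all assumptions made for the example, since the text here does not specify how indices and tags are derived. The sketch also assumes that a larger Table_id means a higher level, that Table_id runs from 1 to N, and that Tage_Offset runs from 0 to N-1.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    tag: int      # tag stored with the entry (hypothetical layout)
    taken: bool   # stored prediction result


def region_id(tage_offset: int, table_id: int, n_tables: int) -> int:
    """Storage-area rule from the text: wrap (offset + table id) past N."""
    total = tage_offset + table_id
    return total if total <= n_tables else total - n_tables


def predict(areas: dict[int, list[Optional[Entry]]],
            tage_offset: int,
            lookups: dict[int, tuple[int, int]]) -> Optional[tuple[int, bool]]:
    """Query all N tables for one thread and pick the highest-level hit.

    areas:   area id (1..N) -> list of 2**m slots, each an Entry or None.
    lookups: table id (1..N) -> (index, tag), assumed computed by the caller.
    Returns (table id, predicted direction) of the highest-level hit,
    or None when no table hits.
    """
    n_tables = len(areas)
    best = None
    # Tables are visited in ascending order, so the last hit is the highest.
    for table_id in range(1, n_tables + 1):
        index, tag = lookups[table_id]
        entry = areas[region_id(tage_offset, table_id, n_tables)][index]
        if entry is not None and entry.tag == tag:
            best = (table_id, entry.taken)
    return best


if __name__ == "__main__":
    n, m = 3, 2
    areas = {r: [None] * (2 ** m) for r in range(1, n + 1)}
    areas[2][0] = Entry(tag=5, taken=True)
    areas[1][0] = Entry(tag=5, taken=False)
    same_lookup = {t: (0, 5) for t in range(1, n + 1)}
    print(predict(areas, 1, same_lookup))  # thread with offset 1
    print(predict(areas, 0, same_lookup))  # thread with offset 0
```

In this sketch the same physical entries are interpreted as different table levels depending on the thread's offset, which is the property the shared-table scheme relies on. What happens when no table hits is outside this passage, so the function simply returns `None` in that case.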
For example, the operation method provided by at least one embodiment of the present disclosure further includes, in response to the target prediction result being incorrect, updating the first entry in the at least one hit tag prediction table based on the actual result of the branch instruction to be predicted, and requesting allocation of a new entry in a tag prediction table of a higher level than the target tag prediction table to store the instruction information corresponding to the branch instruction to be predicted and the actual result.
For example, the operation method provided by at least one embodiment of the present disclosure further includes, in response to the target prediction result being incorrect and the target tag prediction table being the highest-level table among the plurality of tag prediction tables, updating the first entry in the at least one hit tag prediction table based on the actual result of the branch instruction to be predicted.
For example, the operation method provided by at least one embodiment of the present disclosure further includes recording requirement information of the p