Search

CN-122018982-A - Branch prediction system and method

CN122018982ACN 122018982 ACN122018982 ACN 122018982ACN-122018982-A

Abstract

The invention discloses a branch prediction system and a branch prediction method, comprising a main branch predictor, a decoupling queue and a branch instruction stream processing unit, wherein the main branch predictor comprises a branch jump target buffer area subsystem and a branch jump direction prediction subsystem, the decoupling queue generates an entry comprising a PC list and branch data, the branch instruction stream information processing unit comprises a difficult-to-predict branch instruction buffer area and a branch instruction stream information buffer area, the main branch predictor predicts a branch instruction, the generated prediction result is input into the decoupling queue, and the branch instruction stream processing unit receives branch analysis information, identifies the difficult-to-predict branch, analyzes the instruction stream and feeds back the analysis result to the main branch predictor. The invention solves the problems of insufficient hysteresis and accuracy and high hardware resource expenditure of the branch which is difficult to predict in the speculative execution dependency chain analysis method.

Inventors

  • LI DONGSHENG
  • ZHANG XIRAN
  • WU YE

Assignees

  • 南京英麒智能科技有限公司

Dates

Publication Date
20260512
Application Date
20260120

Claims (10)

  1. 1. The branch prediction system is characterized by comprising a main branch predictor, a decoupling queue and a branch instruction stream processing unit, wherein the main branch predictor comprises a branch jump target buffer subsystem and a branch jump direction prediction subsystem, the decoupling queue generates an entry comprising a PC list and branch data, the branch instruction stream information processing unit comprises a difficult-to-predict branch instruction buffer and a branch instruction stream information buffer, the main branch predictor predicts a branch instruction, a generated prediction result is input into the decoupling queue, and the branch instruction stream processing unit receives branch analysis information, identifies the difficult-to-predict branch and analyzes the instruction stream and feeds back the analysis result to the main branch predictor.
  2. 2. The branch prediction system as claimed in claim 1, wherein the branch jump target buffer subsystem is a two-stage branch jump target buffer cascade in which a first stage branch jump target buffer is accessed in parallel with program counter generation logic in the same pipeline stage and a second stage branch jump target buffer is accessed in a stage next to the pipeline stage in which the first stage branch jump target buffer is located.
  3. 3. The branch prediction system of claim 2, wherein the branch jump target buffer includes N entries, each entry including a valid field, a tag field, instruction block information, a plurality of branch instruction slots, an entry split, LRU, the instruction block information including an instruction number and an instruction block termination manner.
  4. 4. The branch prediction system of claim 2, wherein the branch jump target buffer allocation policy is to index existing entries of the branch block through the PC, to add corresponding branch information if an entry exists and there is a free branch instruction slot, to split an entry if an entry exists but a branch instruction slot is full, to allocate a new entry if no entry, and to cull least recently used entries according to LRU when the branch jump target buffer is full.
  5. 5. The branch prediction system of claim 4, wherein the split entry policy is to add branch information, calculate a split point, the split point being an instruction slot middle position, create two new entries, entry 1 and entry 2, respectively, allocate all branches to the two new entries according to the split point, if the offset of the branch instruction in the instruction block is less than the split point, remain as it is added to entry 1, if the offset of the branch is greater than or equal to the split point, add to entry 2, for the branch instruction allocated to the entry, need to adjust its offset, the new offset = original offset-split point, so that the offset of the branch in entry 2 is kept unchanged from 0, further update the control flow pointer, i.e. mark entry 1 has been split while setting a start position with a target pointing to entry 2, and jump to entry 2 when the end of entry 1 is reached.
  6. 6. The branch prediction system of claim 2, wherein the information of the first level branch jump target buffer is a subset of the information stored by the second level branch jump target buffer.
  7. 7. The branch prediction system of claim 1, wherein when resolving to a branch prediction error, the hard-to-predict branch instruction buffer is indexed using the PC, a counter is incremented if hit, a new entry is allocated if miss, and the hard-to-predict branch is marked after the misprediction count exceeds a threshold.
  8. 8. A method for use in the branch prediction system of claim 1, comprising the steps of: Step S1, branch prediction is carried out through a main branch predictor, wherein the branch prediction comprises a branch jump target buffer subsystem for predicting a branch jump target address, and a branch jump direction prediction subsystem for predicting a branch jump direction; Step S2, decoupling the queue to generate an entry containing a PC list and branch data for use in a subsequent instruction extraction stage; Step S3, when resolving the branch prediction error, the branch instruction stream processing unit uses the PC index to difficultly predict a branch instruction buffer area, if hit, the counter is increased, if miss, a new table entry is allocated, and when the misprediction count exceeds a threshold value, the branch is marked as difficultly predicted; and S4, collecting the submitted/exited instructions by the branch instruction stream information buffer, and carrying out backward instruction stream analysis to provide instruction stream information for the branch predictor.
  9. 9. The branch prediction method according to claim 8, wherein in step S1, the access method of the branch jump target buffer subsystem includes: s11, accessing a first-stage branch jump target buffer area and program counter generation logic in parallel in the same pipeline stage; Step S12, the second-stage branch jump target buffer area is accessed at the next stage of the pipeline stage where the first-stage branch jump target buffer area is positioned; step S13, updating the PC value as the main branch predictor redirects the pipeline.
  10. 10. The branch prediction method according to claim 8, wherein in step S1, the allocation method of the branch jump target buffer includes: Indexing the existing entries of the branch block by the PC, and adding corresponding branch information if the entries exist and there is an idle branch instruction slot; if the entry exists but the branch instruction slot is full, splitting the entry, calculating a splitting point as the middle position of the instruction slot, creating two new entries, and distributing all branches into the two new entries according to the splitting point; if no entry exists, a new entry is allocated, and when the branch jump target buffer is full, the least recently used entry is removed based on the LRU.

Description

Branch prediction system and method Technical Field The present invention relates to the field of branch prediction technologies, and in particular, to a system and a method for branch prediction. Background Branch mispredictions are a major obstacle to performance in current processors, and yet some branches cannot be easily predicted by the high-precision branch predictors of current mainstream (e.g., tag-SC-L and perseptron), commonly referred to as difficult-to-predict branches, including data-dependent branches and branches with complex control flow patterns, which are difficult to identify when the processor is running dynamically. It is common practice today to identify instructions in a dependency chain where it is difficult to predict branches, analyze their dependencies, and generate a stream of speculatively executed instructions. The method has the defects of 1) hysteresis, namely that the requirement of completing a calculation result before a branch is predicted is hardly met completely, 2) accuracy, namely that in a general branch prediction unit design, only aiming at a difficult-to-predict branch with a simple or specific type dependency chain, flexibility and general coverage rate are low, and 3) a branch predictor updating mechanism is complex, a memory unit which is huge gradually occupies a larger chip area, and information storage density and hardware area efficiency are to be improved. Disclosure of Invention The invention aims to provide a branch prediction system and a branch prediction method, which solve the problems of insufficient hysteresis and accuracy and high hardware resource expenditure existing in a speculative execution dependency chain analysis method of a difficult-to-predict branch. The branch prediction system comprises a main branch predictor, a decoupling queue and a branch instruction stream processing unit, wherein the main branch predictor comprises a branch jump target buffer subsystem and a branch jump direction prediction subsystem, the decoupling queue generates an entry containing a PC list and branch data, the branch instruction stream information processing unit comprises a difficult-to-predict branch instruction buffer and a branch instruction stream information buffer, the main branch predictor predicts a branch instruction, the generated prediction result is input into the decoupling queue, and the branch instruction stream processing unit receives branch analysis information, recognizes the difficult-to-predict branch, analyzes the instruction stream and feeds back the analysis result to the main branch predictor. Preferably, the branch jump target buffer subsystem is a cascade of two stages of branch jump target buffers, wherein the first stage branch jump target buffer and the program counter generation logic are accessed in parallel in the same pipeline stage, and the second stage branch jump target buffer is accessed at the next stage of the pipeline stage where the first stage branch jump target buffer is located. Preferably, the branch jump target buffer includes N entries, each entry includes a valid field, a tag field, instruction block information, a plurality of branch instruction slots, an entry splitting, and an LRU, where the instruction block information includes an instruction number and an instruction block termination mode. Preferably, the branch jump target buffer area allocation policy is that existing entries of a branch block are indexed through a PC, corresponding branch information is added if an entry exists and a free branch instruction slot exists, the entry is split if the entry exists but the branch instruction slot is full, a new entry is allocated if no entry exists, and the least recently used entry is removed according to LRU when the branch jump target buffer is full. Preferably, the split entry policy is to add branch information, calculate a split point, the split point is an intermediate position of an instruction slot, create two new entries, namely an entry 1 and an entry 2, allocate all branches to the two new entries according to the split point, if the offset of the branch instruction in an instruction block is smaller than the split point, the branch instruction is added to the entry 1 as it is, if the offset of the branch instruction is greater than or equal to the split point, the branch instruction allocated to the entry 2 needs to be adjusted, the new offset = the original offset-the split point, so that the offset of the branch in the entry 2 starts from 0, the relative position is kept unchanged, and further update the control flow pointer, namely mark the entry 1 as split, set a starting position of the target pointing to the entry 2, and jump to the entry 2 when the end of the entry 1 is reached. Preferably, the information of the first-stage branch jump target buffer is a subset of the information stored in the second-stage branch jump target buffer. Preferably, when resolving a branch prediction erro