CN-121996295-A - RISC-V processor architecture, instruction processing method, chip and device


Abstract

The application relates to the field of processor technology and provides a RISC-V processor architecture, an instruction processing method, a chip and a device. The RISC-V processor architecture comprises a decoding unit and an execution unit that includes a load-store unit. The decoding unit is configured to, in response to an x86 Total Store Order (TSO) compatible memory mode being enabled, decode an input memory access instruction into a micro-operation having a corresponding memory-ordering constraint. The load-store unit is configured to execute the micro-operation, through hardware ordering control logic, in a manner consistent with the x86 TSO model. Embodiments of the application can reduce or avoid the impact on the execution performance of a RISC-V processor when a multithreaded concurrent program written for the x86 architecture runs on that processor.

Inventors

Pang Yachuan

Assignees

成都群芯微电子科技有限公司

Dates

Publication Date: 2026-05-08
Application Date: 2025-12-29

Claims (19)

1. A RISC-V processor architecture, comprising: a decoding unit configured to, in response to an x86 Total Store Order (TSO) compatible memory mode being enabled, decode an input memory access instruction into a micro-operation having a corresponding memory-ordering constraint; and an execution unit comprising a load-store unit, the load-store unit being configured to execute the micro-operation, through hardware ordering control logic, in a manner consistent with the x86 TSO model.
2. The RISC-V processor architecture of claim 1, wherein the memory consistency model corresponding to the x86 TSO compatible memory mode comprises an RCPC model.
3. The RISC-V processor architecture of claim 2, wherein decoding the input memory access instruction into a micro-operation having a corresponding memory-ordering constraint comprises: decoding a Load instruction into a first micro-operation having Load-Acquire RCPC memory-ordering semantics; and/or decoding a Store instruction into a second micro-operation having Store-Release RCPC memory-ordering semantics.
4. The RISC-V processor architecture of claim 3, wherein the load-store unit comprises a dynamic conflict resolution circuit configured to: allow a younger Load instruction to complete address resolution, memory access and/or write-back before an older Load instruction, and record that the younger Load instruction is in a state of early resolution, memory access and/or write-back relative to the older Load instruction; when a write operation from another processor core to an address accessed by a Load instruction in the state of early resolution, memory access and/or write-back is observed, mark the corresponding record as being in a conflict state; and when the older Load instruction completes resolution, trigger a flush of the instruction stream after the older Load instruction if a record of a subsequent Load instruction marked as being in the conflict state exists.
5. The RISC-V processor architecture of claim 3 or 4, wherein the load-store unit comprises a write order control circuit configured to: control Store instructions to initiate write operations from the store buffer to the first-level cache in program order; and allow the next Store instruction to initiate its write operation after the write operation of the previous Store instruction has successfully entered the write arbitration flow of the first-level cache.
6. The RISC-V processor architecture of claim 5, wherein the write order control circuit is further configured to support speculative execution of Store instructions.
7. The RISC-V processor architecture of claim 6, wherein supporting speculative execution of Store instructions comprises: allowing the next Store instruction to initiate its write operation in the same clock cycle in which the write operation of the previous Store instruction enters the write arbitration flow, or in the following clock cycle; and if the write operation of the previous Store instruction fails arbitration or subsequently fails to write to the cache, generating a blocking signal that prevents the write operations of the next Store instruction and of any newer Store instructions after it from being executed, and returning the blocked write operations to a waiting state.
8. The RISC-V processor architecture of claim 6, wherein the speculative execution is implemented based on a multi-level state machine that tracks the stage each Store instruction is in during speculative execution, the stages including at least waiting to initiate, arbitration contention, and cache writing.
9. A chip comprising a RISC-V processor architecture as claimed in any one of claims 1-8.
10. An electronic device comprising a RISC-V processor architecture according to any one of claims 1-8 or a chip according to claim 9.
11. An instruction processing method, comprising: in response to an x86 Total Store Order (TSO) compatible memory mode being enabled, decoding an input memory access instruction into a micro-operation having a corresponding memory-ordering constraint; and executing the micro-operation, through hardware ordering control logic, in a manner consistent with the x86 TSO model.
12. The instruction processing method of claim 11, wherein the memory consistency model corresponding to the x86 TSO compatible memory mode comprises an RCPC model.
13. The instruction processing method of claim 12, wherein decoding the input memory access instruction into a micro-operation having a corresponding memory-ordering constraint comprises: decoding a Load instruction into a first micro-operation having Load-Acquire RCPC memory-ordering semantics; and/or decoding a Store instruction into a second micro-operation having Store-Release RCPC memory-ordering semantics.
14. The instruction processing method of claim 13, wherein executing the micro-operation, through hardware ordering control logic, in a manner consistent with the x86 TSO model comprises: allowing a younger Load instruction to complete address resolution, memory access and/or write-back before an older Load instruction, and recording that the younger Load instruction is in a state of early resolution, memory access and/or write-back relative to the older Load instruction; when a write operation from another processor core to an address accessed by a Load instruction in the state of early resolution, memory access and/or write-back is observed, marking the corresponding record as being in a conflict state; and when the older Load instruction completes resolution, triggering a flush of the instruction stream after the older Load instruction if a record of a subsequent Load instruction marked as being in the conflict state exists.
15. The instruction processing method of claim 13 or 14, wherein executing the micro-operation, through hardware ordering control logic, in a manner consistent with the x86 TSO model further comprises: controlling Store instructions to initiate write operations from the store buffer to the first-level cache in program order; and allowing the next Store instruction to initiate its write operation after the write operation of the previous Store instruction has successfully entered the write arbitration flow of the first-level cache.
16. The instruction processing method of claim 15, wherein executing the micro-operation, through hardware ordering control logic, in a manner consistent with the x86 TSO model further comprises: supporting speculative execution of Store instructions.
17. The instruction processing method of claim 16, wherein supporting speculative execution of Store instructions comprises: allowing the next Store instruction to initiate its write operation in the same clock cycle in which the write operation of the previous Store instruction enters the write arbitration flow, or in the following clock cycle; and if the write operation of the previous Store instruction fails arbitration or subsequently fails to write to the cache, generating a blocking signal that prevents the write operations of the next Store instruction and of any newer Store instructions after it from being executed, and returning the blocked write operations to a waiting state.
18. The instruction processing method of claim 16, wherein the speculative execution is performed based on a multi-level state machine that tracks the stage each Store instruction is in during speculative execution, the stages including at least waiting to initiate, arbitration contention, and cache writing.
19. A computer device comprising a memory, a processor, and a computer program stored on the memory, characterized in that the computer program, when executed by the processor, performs the method according to any one of claims 11-18.
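
The load-ordering behaviour recited in claims 4 and 14 can be illustrated with a small software model. The following is a minimal sketch under stated assumptions, not the claimed circuit: the names `LoadRecord`, `LoadQueue`, `allocate`, `complete` and `snoop_write` are hypothetical, addresses are compared exactly (real hardware would more likely compare at cache-line granularity), and a "flush" is modelled simply as dropping every load younger than the completing one.

```python
from dataclasses import dataclass


@dataclass
class LoadRecord:
    seq: int                # program-order sequence number (hypothetical)
    addr: int               # address accessed by the load
    done: bool = False      # load has completed resolution / access / write-back
    early: bool = False     # completed while an older load was still pending
    conflict: bool = False  # another core wrote the address while `early`


class LoadQueue:
    """Toy model: loads may complete out of order, with snoop-based recovery."""

    def __init__(self):
        self.records = []  # kept in program order
        self._next_seq = 0

    def allocate(self, addr):
        rec = LoadRecord(self._next_seq, addr)
        self._next_seq += 1
        self.records.append(rec)
        return rec.seq

    def snoop_write(self, addr):
        """A write from another core was observed for `addr`."""
        for rec in self.records:
            if rec.early and rec.addr == addr:
                rec.conflict = True

    def complete(self, seq):
        """Complete a load; return the sequence numbers that were flushed."""
        idx = next(i for i, r in enumerate(self.records) if r.seq == seq)
        self.records[idx].done = True
        younger = self.records[idx + 1:]
        flushed = []
        if any(r.conflict for r in younger):
            # A younger load consumed a value that is no longer safe to keep:
            # flush everything after the completing (older) load.
            flushed = [r.seq for r in younger]
            del self.records[idx + 1:]
        # Recompute which finished loads are still ahead of a pending older load.
        pending_older = False
        for rec in self.records:
            if rec.done:
                rec.early = pending_older
            else:
                pending_older = True
        return flushed
```

In this model, a younger load that finishes first is only provisionally complete. It becomes safe once every older load has finished without an intervening remote write to its address; otherwise it is discarded and would be re-executed.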

Description

RISC-V processor architecture, instruction processing method, chip and device

Technical Field

The present application relates to the field of processor technology, and in particular to a RISC-V processor architecture, an instruction processing method, a chip, and a device.

Background

In recent years, the RISC-V (Reduced Instruction Set Computer V, RV) architecture has developed rapidly thanks to its open-source, modular and extensible design, and is regarded as the third mainstream instruction set architecture after the x86 and ARM architectures. A large body of mature application software, especially for servers and high-performance computing, is written for the x86 architecture, so the RISC-V ecosystem needs to be compatible with that software.

However, the memory models of the two architectures differ fundamentally: the x86 architecture uses the strict Total Store Order (TSO) model, whereas the RISC-V architecture defaults to a weak memory ordering (RV Weak Memory Order, RVWMO) that permits more reordering. As a result, a multithreaded concurrent program written for the x86 architecture may produce errors when run directly on a RISC-V processor, because its memory operations can take effect out of order.
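
The difference between the two memory models can be made concrete with the classic message-passing litmus test. The sketch below is a deliberately simplified illustration, not a faithful model of either architecture: the operation encoding, the function names and the `"tso"` / `"weak"` labels are assumptions introduced here, store-buffer forwarding is not modelled, and "weak" simply allows any two operations on different addresses in one thread to swap.

```python
from itertools import permutations

# Hypothetical encoding: ("st", address, value) or ("ld", address, register).
MP_WRITER = [("st", "data", 1), ("st", "flag", 1)]
MP_READER = [("ld", "flag", "r1"), ("ld", "data", "r2")]


def may_reorder(older, younger, model):
    """Return True if `younger` may take effect before `older` (simplified)."""
    if older[1] == younger[1]:
        return False  # same address: keep program order in this sketch
    if model == "weak":
        return True  # weak ordering: any different-address pair may swap
    # "tso": only an older store followed by a younger load may swap.
    return older[0] == "st" and younger[0] == "ld"


def allowed_orders(thread, model):
    """All execution orders of one thread that the model permits."""
    orders = []
    for perm in permutations(range(len(thread))):
        legal = all(
            may_reorder(thread[perm[j]], thread[perm[i]], model)
            for i in range(len(perm))
            for j in range(i + 1, len(perm))
            if perm[i] > perm[j]  # a younger op was placed before an older one
        )
        if legal:
            orders.append([thread[k] for k in perm])
    return orders


def interleavings(a, b):
    """Yield every merge of two sequences that keeps each one's own order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


def outcomes(t0, t1, model, regs=("r1", "r2")):
    """Set of final register values reachable under the given model."""
    seen = set()
    for o0 in allowed_orders(t0, model):
        for o1 in allowed_orders(t1, model):
            for trace in interleavings(o0, o1):
                mem, reg = {}, {}
                for kind, addr, x in trace:
                    if kind == "st":
                        mem[addr] = x
                    else:
                        reg[x] = mem.get(addr, 0)  # memory starts at zero
                seen.add(tuple(reg[r] for r in regs))
    return seen
```

Calling `outcomes(MP_WRITER, MP_READER, "weak")` yields a set that includes `(1, 0)`, meaning the reader saw the flag set but still read stale data. The same call with `"tso"` does not include that outcome, which is the ordering guarantee x86 software tends to rely on.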
To guarantee the correct ordering of memory operations, existing schemes therefore generally have software insert a Fence instruction after every memory access instruction. However, the Fence instruction imposes stricter ordering than the release consistency with processor consistency (Release Consistency with Processor Consistency, RCPC) model requires: its semantics demand that all memory operations before the Fence in the pipeline complete before any operation after the Fence may start. A large number of operations that could execute in parallel are thereby forced to serialize, which can seriously degrade the execution performance of the RISC-V processor.

Disclosure of Invention

An object of the embodiments of the present application is to provide a RISC-V processor architecture, an instruction processing method, a chip and a device, so as to reduce or avoid the impact on the execution performance of a RISC-V processor when a multithreaded concurrent program written for the x86 architecture runs on that processor.

To achieve the above object, in one aspect, an embodiment of the present application provides a RISC-V processor architecture, comprising: a decoding unit configured to, in response to an x86 Total Store Order (TSO) compatible memory mode being enabled, decode an input memory access instruction into a micro-operation having a corresponding memory-ordering constraint; and an execution unit comprising a load-store unit, the load-store unit being configured to execute the micro-operation, through hardware ordering control logic, in a manner consistent with the x86 TSO model.

In the RISC-V processor architecture of the embodiment of the present application, the memory consistency model corresponding to the x86 TSO compatible memory mode comprises an RCPC model.
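
The role of the decoding unit can be sketched in software as a mode-dependent mapping from ordinary loads and stores to micro-operations that carry an ordering attribute. This is a minimal illustration assuming the Load-Acquire / Store-Release RCPC mapping recited in the claims. The `Ordering` enum, the `MicroOp` record and the `decode` function are hypothetical names, and only the base RISC-V integer load and store mnemonics are recognised.

```python
from dataclasses import dataclass
from enum import Enum


class Ordering(Enum):
    NONE = "none"                  # default weak ordering, no extra constraint
    ACQUIRE_RCPC = "acquire_rcpc"  # Load-Acquire with RCPC semantics
    RELEASE_RCPC = "release_rcpc"  # Store-Release with RCPC semantics


@dataclass(frozen=True)
class MicroOp:
    kind: str  # "load", "store" or "other"
    mnemonic: str
    ordering: Ordering


# Base RISC-V integer load and store mnemonics.
LOADS = {"lb", "lh", "lw", "ld", "lbu", "lhu", "lwu"}
STORES = {"sb", "sh", "sw", "sd"}


def decode(mnemonic, tso_mode):
    """Decode one instruction; attach ordering only when TSO mode is enabled."""
    if mnemonic in LOADS:
        ordering = Ordering.ACQUIRE_RCPC if tso_mode else Ordering.NONE
        return MicroOp("load", mnemonic, ordering)
    if mnemonic in STORES:
        ordering = Ordering.RELEASE_RCPC if tso_mode else Ordering.NONE
        return MicroOp("store", mnemonic, ordering)
    # Non-memory instructions are unaffected by the mode in this sketch.
    return MicroOp("other", mnemonic, Ordering.NONE)
```

Because the constraint is attached at decode time, the instruction stream itself needs no inserted Fence instructions. The load-store unit can then enforce only the ordering each micro-operation carries rather than draining the pipeline after every access.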
In the RISC-V processor architecture of the embodiment of the present application, decoding the input memory access instruction into a micro-operation having a corresponding memory-ordering constraint comprises: decoding a Load instruction into a first micro-operation having Load-Acquire RCPC memory-ordering semantics; and/or decoding a Store instruction into a second micro-operation having Store-Release RCPC memory-ordering semantics.

In the RISC-V processor architecture of the embodiment of the present application, the load-store unit comprises a dynamic conflict resolution circuit configured to: allow a younger Load instruction to complete address resolution, memory access and/or write-back before an older Load instruction, and record that the younger Load instruction is in a state of early resolution, memory access and/or write-back relative to the older Load instruction; when a write operation from another processor core to an address accessed by a Load instruction in the state of early resolution, memory access and/or write-back is observed, mark the corresponding record as being in a conflict state; and when the older Load instruction completes resolution, trigger a flush of the instruction stream after the older Load instruction if a record of a subsequent Load instruction marked as being in the conflict state exists.

In the RISC-V processor architecture of the embodiment of the present application, the load-store unit comprises a write order control circuit configured to: control Store instructions to initiate write operations from the store buffer to the first-level cache in program order; and, after the write operation of the previous Store instruction successfull