CN-115688640-B - Coprocessor access interface based on superscalar RISC-V processor pipeline

Abstract

The invention belongs to the technical field of integrated circuit design and relates in particular to a coprocessor memory access interface based on a superscalar RISC-V processor pipeline. The invention comprises modification logic for the original load/store unit pipeline of the open-source Xuantie processor and a coprocessor memory access state machine. The pipeline modification logic implements the handling of coprocessor memory access instructions within the pipeline and comprises load and store pipeline borrowing logic and address conflict resolution logic. The state machine, which comprises a load state machine and a store state machine, manages the issue and response behavior of coprocessor memory access instructions between the coprocessor and the load/store pipelines of the main processor. Through the coprocessor memory access interface, the coprocessor uses the main processor's load/store pipelines to perform normal memory accesses in the system. A coprocessor connected to the main processor accesses the L1 cache in the system through this interface and thereby accesses the shared data area for other computing operations.

Inventors

  • HAN JUN
  • KONG XINJIE
  • WANG KAIXUAN

Assignees

  • Fudan University

Dates

Publication Date
20260505
Application Date
20220924

Claims (3)

  1. A coprocessor memory access interface based on a superscalar RISC-V processor pipeline, characterized in that the interface connects a coprocessor to the superscalar RISC-V processor pipeline and completes the coprocessor's memory access requests by means of the main processor's load/store pipelines. The interface comprises modification logic for the original load/store pipeline of the open-source Xuantie processor and a coprocessor memory access state machine module. The load/store pipeline modification logic implements the handling of coprocessor memory access instructions in the pipeline; its logic modules comprise load pipeline borrowing logic, store pipeline borrowing logic, a load/store borrowing interface, and address conflict resolution logic. The coprocessor memory access state machine module manages the instruction issue behavior and response behavior between coprocessor memory access instructions and the main processor's load/store pipelines; its modules comprise a load state machine and a store state machine, wherein:
     The load pipeline borrowing logic handles the coprocessor's load instructions. Its structure is similar to that of the main processor load pipeline and comprises an AG stage, a DC stage, a DA stage, and a WB stage. After a memory access instruction is issued by the coprocessor, the access request is sent into the main processor load pipeline through the coprocessor memory access state machine, entering through the pipeline's borrowing interface. Address translation is performed in the AG stage; the DCache is accessed in the DC stage and the instruction enters the LoadQueue; the accessed data is aligned in the DA stage; and the data is finally returned to the coprocessor state machine in the WB stage as the response to the access.
     The store pipeline borrowing logic handles the coprocessor's store instructions. Its structure is similar to that of the main processor load pipeline and comprises an AG/EX1 stage, a DC stage, a DA stage, and a WB stage. After a memory access instruction is issued by the coprocessor, the access request is sent into the main processor store pipeline through the coprocessor memory access state machine.
     The load/store borrowing interface connects coprocessor requests between the RF end of the main pipeline and the AG end of the load and store pipelines, and gives coprocessor memory access instructions a higher priority than main processor memory access instructions, so that a coprocessor memory access instruction borrows the main pipeline by blocking the main processor's load/store instructions.
     The address conflict resolution logic resolves conflicts between different instructions that access the same address. After obtaining the ordering of instructions from the index and head pointer information supplied by the coprocessor memory access state machine module, it checks memory consistency in the 16-entry LoadQueue and the 12-entry StoreQueue and performs data forwarding or replays the access.
     The coprocessor memory access state machine module manages instruction states and instruction requests between the coprocessor and the main processor pipeline. It comprises 12 state machine management units corresponding to stores and loads, which manage at most 12 outstanding coprocessor memory access requests, issue instruction requests to the main processor's load and store pipelines through the borrowing ports, and handle access misses through state management. The module also holds index information and head pointer information for the state machines, which indicate the order among different coprocessor memory access instructions to facilitate the handling of memory consistency. The state machine logic is divided into a load state machine and a store state machine.
  2. The coprocessor memory access interface of claim 1, wherein:
     The load state machine handles load requests sent by the coprocessor and has 4 states: ready, play/replay, wait, and wb. ready corresponds to idle; play/replay corresponds to sending an execution or replay request to the load pipeline; wait corresponds to waiting for execution to complete; and wb corresponds to the return of the access data.
     The store state machine handles store requests sent by the coprocessor and has 6 states: ready, play/replay, ex1data1, ex1data2, wait, and wb. ready corresponds to idle; play/replay corresponds to sending an execution or replay request to the store pipeline; ex1data1 and ex1data2 correspond to the data preparation process of the store instruction; wait corresponds to waiting for execution to complete; and wb corresponds to the return of the access data.
  3. The coprocessor memory access interface of claim 2, wherein the workflow of a coprocessor memory access instruction in the system is as follows:
     (1) The coprocessor sends a memory access request through the memory access interface. The request comprises the virtual address, the request data, and the access type. The request information is recorded and managed in the coprocessor memory access state machine; at the same time, the coprocessor memory access state machine module allocates a currently idle state machine and returns the corresponding state machine index to the coprocessor, which stores it.
     (2) Once a request has been placed in a coprocessor memory access state machine, the state of that state machine jumps to play/replay, indicating that the request is asking the main processor load/store pipeline for execution. When the state machine has the highest priority and the main processor can process the coprocessor memory access instruction, the state machine jumps to the next state.
     (3) If the access type is load, the state machine enters the wait state, waiting for the load pipeline to finish execution. The request enters the AG stage of the load pipeline for address translation. If a TLB miss occurs in this stage, the TLB miss status bit is passed to the DC stage; when the TLB miss is detected in the DC stage, the state machine switches to the replay state and re-requests the load pipeline. If no TLB miss occurs, the request enters the LoadQueue for a dependency check. If no violation occurs, the DCache is accessed. On a cache hit, the request proceeds directly to the DA stage for data alignment, and in the WB stage the data is written back as the response to the coprocessor memory access state machine module. On a cache miss, the miss is handled by the RefillBuffer module, and the data finally written back is supplied by the RefillBuffer.
     (4) If the access type is store, the state machine does not enter wait directly but enters the ex1data1 state, because a store involves reading the store data. The request is then in the AG stage, where processing is similar to that of a load request. In the next cycle the request is in the DC stage. TLB miss handling is similar to that of a load, but the store request must also obtain the store data and therefore sends a data request to the state machine module. When the state machine receives this request, its state jumps to ex1data2 and it supplies the data, then jumps to wait until the access finishes. The subsequent DCache access of the store request is similar to that of a load, and the write-back of the data is completed by the DC stage or the RefillBuffer.
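
The state sequences recited in claims 2 and 3 can be illustrated with a small behavioral sketch. This is a software model under stated assumptions, not the claimed hardware: the event names (request, granted, tlb_miss, data_request, data_sent, done, responded), the class name StateMachineModule, the return from wb to ready, the replay of a store from ex1data1 on a TLB miss, and the use of one shared pool of 12 units are all hypothetical choices made for illustration. Only the state names and the count of 12 come from the claims.

```python
from enum import Enum


class State(Enum):
    READY = "ready"
    PLAY = "play/replay"
    EX1DATA1 = "ex1data1"
    EX1DATA2 = "ex1data2"
    WAIT = "wait"
    WB = "wb"


# Load state machine: 4 states. Event names are hypothetical labels.
LOAD_FSM = {
    (State.READY, "request"): State.PLAY,    # request recorded, ask for the pipeline
    (State.PLAY, "granted"): State.WAIT,     # highest priority and pipeline available
    (State.WAIT, "tlb_miss"): State.PLAY,    # TLB miss seen in DC stage: replay
    (State.WAIT, "done"): State.WB,          # data from DCache or RefillBuffer
    (State.WB, "responded"): State.READY,    # assumption: unit is freed after response
}

# Store state machine: 6 states, with two extra data preparation states.
STORE_FSM = {
    (State.READY, "request"): State.PLAY,
    (State.PLAY, "granted"): State.EX1DATA1,           # store data still has to be read
    (State.EX1DATA1, "tlb_miss"): State.PLAY,          # assumption: replay as for loads
    (State.EX1DATA1, "data_request"): State.EX1DATA2,  # pipeline asks for store data
    (State.EX1DATA2, "data_sent"): State.WAIT,
    (State.WAIT, "done"): State.WB,
    (State.WB, "responded"): State.READY,
}


class StateMachineModule:
    """Tracks up to `units` outstanding coprocessor memory access requests."""

    def __init__(self, units=12):
        self.state = [State.READY] * units
        self.fsm = [None] * units

    def allocate(self, kind):
        """Bind an idle unit to a 'load' or 'store' request and return its index.

        Returns None when every unit is busy.
        """
        table = LOAD_FSM if kind == "load" else STORE_FSM
        for index, state in enumerate(self.state):
            if state is State.READY:
                self.fsm[index] = table
                self.state[index] = table[(State.READY, "request")]
                return index
        return None

    def step(self, index, event):
        """Apply one event to the unit at `index` and return its new state."""
        table = self.fsm[index]
        key = (self.state[index], event)
        if table is None or key not in table:
            raise ValueError(f"illegal event {event!r} in state {self.state[index]}")
        self.state[index] = table[key]
        return self.state[index]


if __name__ == "__main__":
    module = StateMachineModule()
    idx = module.allocate("store")
    for event in ("granted", "data_request", "data_sent", "done", "responded"):
        print(event, "->", module.step(idx, event).value)
```

In this sketch the index returned by allocate plays the role of the state machine index that step (1) of claim 3 returns to the coprocessor. The ordering information carried by the index and head pointer, and the priority arbitration against main processor instructions, are deliberately left out.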

Description

Coprocessor access interface based on superscalar RISC-V processor pipeline

Technical Field

The invention belongs to the technical field of integrated circuit design and relates in particular to a coprocessor memory access interface based on a superscalar RISC-V processor pipeline.

Background

General-purpose central processors carry out important unified control tasks in many application scenarios. However, as semiconductor process scaling slows, relying on process improvements alone can no longer meet growing performance demands. At the same time, to preserve broad generality, traditional general-purpose processors have complex architectural logic that leaves limited room for architectural optimization. Domain-specific architectures, which have developed in recent years, are therefore a feasible way to meet the performance requirements of different application fields. However, completely replacing the existing architecture, in which a general-purpose central processor acts as the control core, would incur huge costs in hardware development and production, and equally huge costs in software for instruction set specification and compiler development. It is therefore important to use a domain-specific coprocessor to assist the original CPU with part of the tasks in specific scenarios. Yet because the data bandwidth of a typical coprocessor interface is limited, coprocessor performance is quite limited in scenarios that require large amounts of data.

Disclosure of Invention

To overcome the shortcomings of the prior art, the invention aims to provide a coprocessor memory access interface based on a superscalar RISC-V processor pipeline, which gives the coprocessor an interface for accessing the L1-Cache so as to meet the needs of computing scenarios that require more data.
The coprocessor memory access interface provided by the invention can be connected to a superscalar RISC-V processor pipeline and accesses the L1-Cache through the load/store unit in the pipeline to obtain memory data. The invention comprises modification logic for the existing load/store unit pipeline of the open-source Xuantie processor and a coprocessor memory access state machine module. The load/store unit pipeline modification logic implements the handling of coprocessor memory access instructions in the pipeline and comprises load pipeline borrowing logic, store pipeline borrowing logic, and address conflict resolution logic. The coprocessor memory access state machine module manages the instruction issue behavior and response behavior between coprocessor memory access instructions and the load/store pipelines of the main processor, and comprises a load state machine and a store state machine.

The load/store unit pipeline modification logic provided by the invention is described using the pipeline of an open-source superscalar RISC-V processor, the Xuantie C910, as an example. It comprises load pipeline borrowing logic, store pipeline borrowing logic, and address conflict resolution logic, wherein:

The load pipeline borrowing logic has a main structure similar to that of the main processor load pipeline and comprises an AG (address generation) stage, a DC (DCache) stage, a DA (address alignment) stage, and a WB (write back) stage. After a memory access instruction is issued by the coprocessor, the access request is sent into the main processor load pipeline through the coprocessor memory access state machine. The request enters the pipeline through the pipeline's borrowing interface. Address translation is performed in the AG stage; the DCache is accessed in the DC stage and the instruction enters the LoadQueue; the accessed data is aligned in the DA stage; and the data is finally returned to the coprocessor state machine in the WB stage as the memory access response.

The store pipeline borrowing logic handles the coprocessor's store instructions. Its main structure is similar to that of the main processor load pipeline and comprises an AG/EX1 (address generation and data request) stage, a DC (DCache) stage, a DA (address alignment) stage, and a WB (write back) stage. After a memory access instruction is issued by the coprocessor, the access request is sent to the main processor store pipeline through the coprocessor memory access state machine. The request enters the pipeline through the pipeline's borrowing interface. Once address translation has completed in the AG/EX1 stage and the data is ready, the DCache access is completed and the instruction enters the StoreQueue; the access finishes after data alignment in the DA stage and the data response in the WB stage.

The load store borrowing interface is an interface for completing the function of loading a store pipeline of the main process