
CN-121996579-A - Request processing module, request processing method, electronic device and storage medium


Abstract

The disclosure provides a request processing module, a request processing method, an electronic device and a storage medium, and relates to the field of computer technology. The request processing module comprises a first-stage merging sub-module and a second-stage merging sub-module. The first-stage merging sub-module aggregates a plurality of memory access sub-requests into a plurality of first access requests when it receives memory access sub-requests corresponding to a write-type instruction. The second-stage merging sub-module receives the plurality of first access requests and, in the same clock cycle, updates a plurality of pieces of preset address mapping information according to the target address information carried in the plurality of first access requests, so as to generate a plurality of second access requests. The disclosure enables dynamic merging of write-type memory access requests, thereby meeting the high-throughput requirement of outputting a plurality of merged requests in the same clock cycle.

Inventors

  • Request for anonymity
  • Request for anonymity
  • Request for anonymity

Assignees

  • 摩尔线程智能科技(上海)有限责任公司

Dates

Publication Date
2026-05-08
Application Date
2025-12-30

Claims (17)

  1. A request processing module, comprising: a first-stage merging sub-module configured to aggregate a plurality of memory access sub-requests into a plurality of first access requests when the plurality of memory access sub-requests corresponding to a write-type instruction are received; and a second-stage merging sub-module configured to receive the plurality of first access requests and, in the same clock cycle, update a plurality of pieces of preset address mapping information according to target address information carried in the plurality of first access requests, so as to generate a plurality of second access requests.
  2. The request processing module of claim 1, wherein the second-stage merging sub-module comprises: a plurality of merging buffer units, each configured to store its corresponding address mapping information, wherein the address mapping information comprises memory address information and write operation information corresponding to a preset merging granularity; and a memory address comparison unit configured to match, in the same clock cycle, the target address information carried in each received first access request against the memory address information in each merging buffer unit to obtain a matching result, so that each merging buffer unit updates the address mapping information according to the matching result and generates the second access request.
  3. The request processing module of claim 2, wherein each of the merging buffer units is configured to: in each clock cycle, when it is detected that the target address information carried in each first access request matches the memory address information stored in any merging buffer unit, merge the write operation information carried in each first access request with the write operation information stored in the matched merging buffer unit to obtain aggregation request information; and generate the second access request according to the aggregation request information when a preset trigger condition is met.
  4. The request processing module of claim 3, wherein the preset trigger condition comprises at least one of: the matched merging buffer unit being occupied by a new first access request; and detection of a first access request carrying an instruction end identifier.
  5. The request processing module of claim 2, wherein each of the merging buffer units is further configured to: in each clock cycle, when it is detected that the target address information carried in each first access request does not match the memory address information stored in any merging buffer unit, generate the second access request according to the memory address information and the write operation information currently stored in a target merging buffer unit, wherein the target merging buffer unit is determined from among the merging buffer units that are not performing a request merging operation.
  6. The request processing module of claim 5, wherein each of the merging buffer units is further configured to: write the target address information and the write operation information carried in each first access request into the target merging buffer unit, so as to replace the memory address information and the write operation information currently stored in the target merging buffer unit.
  7. The request processing module of claim 2, wherein each of the merging buffer units is further configured to: in the current clock cycle in which the write-type instruction ends, when it is detected that each valid merging buffer unit performs a request merging operation, generate the second access request corresponding to each valid merging buffer unit according to the address mapping information stored in that valid merging buffer unit.
  8. The request processing module of claim 7, wherein each of the merging buffer units is further configured to: after the second access request is generated, clear the address mapping information stored in each merging buffer unit.
  9. The request processing module of claim 2, wherein the second-stage merging sub-module further comprises a plurality of pipeline buffer units provided in correspondence with the merging buffer units, respectively; each of the pipeline buffer units is configured to: in the current clock cycle in which the write-type instruction ends, when it is detected that any valid merging buffer unit does not perform a request merging operation, receive the target address information and the write operation information carried in a first access request in the current clock cycle, and generate the second access request in the next clock cycle based on the stored target address information and write operation information.
  10. The request processing module of claim 9, wherein the second-stage merging sub-module further comprises: an input arbitration unit configured to control the input channel of the first-stage merging sub-module to be in a receiving state when the pipeline buffer unit is detected to be in a writing process, so that the input channel of the first-stage merging sub-module receives a memory access sub-request corresponding to a new write-type instruction after the pipeline buffer unit finishes writing.
  11. The request processing module of any one of claims 1 to 10, wherein the second-stage merging sub-module further comprises a plurality of multiplexers provided in correspondence with the plurality of merging buffer units and the plurality of pipeline buffer units, respectively; each of the multiplexers is configured to: in the same clock cycle, select the stored memory address information and write operation information from the corresponding merging buffer unit or pipeline buffer unit, generate a second access request, and send the second access request to a downstream module.
  12. The request processing module of claim 1, wherein the first-stage merging sub-module is configured to: receive, in the same clock cycle, Q memory access sub-requests carrying write data; perform address matching and aggregation at a preset merging granularity based on the memory address information carried by each memory access sub-request to obtain N first access requests; and output the N first access requests to the second-stage merging sub-module, wherein N is smaller than Q.
  13. The request processing module of claim 1, further comprising: a first instruction processing module configured to receive, in the same clock cycle, M memory access sub-requests corresponding to a read-type instruction, and to perform single-stage aggregation on the M memory access sub-requests to obtain P target access requests, wherein P is smaller than M.
  14. The request processing module of claim 1, further comprising: a second instruction processing module configured to receive a coherency control instruction and to send a target access request corresponding to the coherency control instruction to a downstream module within one clock cycle.
  15. A request processing method, comprising: receiving a plurality of memory access sub-requests corresponding to a write-type instruction, and aggregating the plurality of memory access sub-requests into a plurality of first access requests; and receiving the plurality of first access requests and, in the same clock cycle, updating a plurality of pieces of preset address mapping information according to target address information carried in the plurality of first access requests, so as to generate a plurality of second access requests.
  16. An electronic device comprising the request processing module of any one of claims 1 to 14.
  17. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processing unit, implements the request processing method of claim 15.
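
The first-stage aggregation recited in claim 12 (Q sub-requests in, N first access requests out, N smaller than Q) can be illustrated with a minimal Python sketch. This is a software model for intuition only, not the claimed hardware. It assumes a 64-byte merging granularity and sub-requests given as (byte address, byte value) pairs; the names `GRANULARITY`, `FirstAccessRequest` and `first_stage_merge` are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass, field

GRANULARITY = 64  # assumed merging granularity in bytes (not specified in the source)


@dataclass
class FirstAccessRequest:
    base: int                                 # aligned address of the merged block
    data: dict = field(default_factory=dict)  # byte offset within block -> byte value


def first_stage_merge(sub_requests, granularity=GRANULARITY):
    """Aggregate (address, byte_value) write sub-requests that fall into the
    same aligned block into a single first access request."""
    merged = {}
    for addr, value in sub_requests:
        base = addr - addr % granularity      # align down to the merging granularity
        req = merged.setdefault(base, FirstAccessRequest(base))
        req.data[addr - base] = value         # a later write to the same byte wins
    return list(merged.values())
```

For example, five sub-requests touching addresses 0, 4, 8, 64 and 68 collapse into two first access requests, one per 64-byte block, which is the N < Q reduction the claim describes.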

Description

Request processing module, request processing method, electronic device and storage medium

Technical Field

The disclosure relates to the field of computer technology, and in particular to a request processing module, a request processing method, an electronic device and a storage medium.

Background

In existing computer architectures, multiple memory access requests issued by a processor or graphics processing unit are typically aggregated to reduce the number of downstream cache or memory accesses. Specifically, memory access requests can be merged according to instruction type, which increases the access granularity and improves data transmission efficiency. Common instruction types include read instructions (load), write instructions (store), atomic operation instructions (atomic), and coherency control instructions (flush/fence); different instruction types differ in interface data format and in how they are transmitted.

Taking a write instruction as an example, it must transmit write data along with the access address, so the number of sub-requests that can be processed in parallel at a given bandwidth is limited. In particular, when many access requests carry write data or the access pattern is dense, the request processing schemes of the related art remain limited in data throughput.

It should be noted that the information disclosed in the above background section is only intended to enhance understanding of the background of the present disclosure, and thus may include information that does not constitute prior art known to those of ordinary skill in the art.
Disclosure of Invention

The disclosure aims to provide a request processing module, a request processing method, an electronic device and a storage medium that enable dynamic merging of write-type memory access requests, thereby meeting the high-throughput requirement of outputting a plurality of merged requests in the same clock cycle.

According to a first aspect of the present disclosure, there is provided a request processing module comprising: a first-stage merging sub-module configured to aggregate a plurality of memory access sub-requests into a plurality of first access requests when the memory access sub-requests corresponding to a write-type instruction are received; and a second-stage merging sub-module configured to receive the plurality of first access requests and, in the same clock cycle, update the preset address mapping information according to the target address information carried in the plurality of first access requests, so as to generate a plurality of second access requests.

In an exemplary embodiment of the present disclosure, the second-stage merging sub-module includes: a plurality of merging buffer units, each configured to store its corresponding address mapping information, wherein the address mapping information comprises memory address information and write operation information corresponding to a preset merging granularity; and a memory address comparison unit configured to match, in the same clock cycle, the target address information carried in each received first access request against the memory address information in each merging buffer unit to obtain a matching result, so that each merging buffer unit updates the address mapping information according to the matching result and generates the second access request.
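
The second-stage behaviour (address comparison against every merging buffer unit, merge on a match, evict and replace on a mismatch, and output at instruction end, as recited in claims 3 to 8) can be sketched in Python as follows. This is a simplified software model under stated assumptions, not the claimed circuit: it assumes the first access requests arriving in one cycle have distinct block addresses and number no more than the buffers, it prefers an empty buffer before evicting the first non-merging one (a policy chosen only for illustration), and it omits the pipeline buffer units, arbitration unit and multiplexers. The class and method names are hypothetical.

```python
class SecondStageMerger:
    """Toy model of a set of merging buffer units with an address comparator."""

    def __init__(self, num_buffers=4):
        # Each slot is None (empty) or [base_address, {offset: byte}].
        self.slots = [None] * num_buffers

    def cycle(self, requests, end_of_instruction=False):
        """Process the first access requests (base, data) of one clock cycle.
        Returns the second access requests emitted in that cycle."""
        assert len(requests) <= len(self.slots)
        emitted, busy, misses = [], set(), []
        # Compare phase: every request is checked against every buffer.
        for base, data in requests:
            hit = next((i for i, s in enumerate(self.slots)
                        if s is not None and s[0] == base), None)
            if hit is not None:
                self.slots[hit][1].update(data)   # merge write information
                busy.add(hit)
            else:
                misses.append((base, data))
        # Miss phase: pick a buffer that is not merging in this cycle.
        for base, data in misses:
            free = [i for i in range(len(self.slots)) if i not in busy]
            target = next((i for i in free if self.slots[i] is None), free[0])
            if self.slots[target] is not None:    # evict old contents downstream
                emitted.append(tuple(self.slots[target]))
            self.slots[target] = [base, dict(data)]
            busy.add(target)
        # Instruction end: output every valid buffer, then clear it.
        if end_of_instruction:
            for i, s in enumerate(self.slots):
                if s is not None:
                    emitted.append(tuple(s))
                    self.slots[i] = None
        return emitted
```

With two buffers, feeding blocks 0 and 64 in one cycle and then blocks 0 and 128 in the next merges the second write to block 0 in place and evicts block 64 to make room for block 128; ending the instruction then outputs the remaining two buffers.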
In an exemplary embodiment of the present disclosure, each of the merging buffer units is configured to: in each clock cycle, when it is detected that the target address information carried in each first access request matches the memory address information stored in any merging buffer unit, merge the write operation information carried in each first access request with the write operation information stored in the matched merging buffer unit to obtain aggregation request information; and generate the second access request according to the aggregation request information when a preset trigger condition is met.

In an exemplary embodiment of the present disclosure, the preset trigger condition includes at least one of: the matched merging buffer unit being occupied by a new first access request; and detection of a first access request carrying an instruction end identifier.

In an exemplary embodiment of the present disclosure, each of the merging buffer units is further configured to: in each clock cycle, when it is detected that the target address information carried in each first access request does not match the memory address information stored in any merging buffer unit, generate the second access request according to the memory addres