CN-122019411-A - Memory access optimization method, equipment, storage medium and product
Abstract
Embodiments of the present application provide a memory access optimization method, a device, a storage medium and a product, relating to the technical field of artificial intelligence chips. A memory access address is segmented based on the hardware hierarchy of the artificial intelligence chip to obtain a plurality of hierarchical address segments. Multi-level hash calculation is performed on the hierarchical address segments to obtain corresponding target hash results, and the target hash results are combined to obtain a target channel index. The resulting target channel indexes are not concentrated on a subset of the memory channels but are spread evenly across different memory channels, which balances the memory access load. Because each memory channel corresponds to an independent memory controller, spreading memory access requests evenly across the memory channels improves memory controller utilization and, in turn, overall memory bandwidth utilization.
Inventors
- Request for anonymity
- Request for anonymity
- Request for anonymity
- Request for anonymity
- Request for anonymity
Assignees
- Shanghai Biren Technology Co., Ltd. (上海壁仞科技股份有限公司)
Dates
- Publication Date
- 2026-05-12
- Application Date
- 2026-01-23
Claims (10)
- 1. A memory access optimization method, the method comprising: receiving a memory access request sent by a computing core, wherein the memory access request carries a memory access address; segmenting the memory access address based on a hardware hierarchy of an artificial intelligence chip to obtain a plurality of hierarchical address segments, wherein the hardware hierarchy comprises a plurality of hardware levels and each hardware level corresponds to one hierarchical address segment; performing hash calculation on each of the plurality of hierarchical address segments to obtain a corresponding target hash result; combining the obtained target hash results to obtain a target channel index; and sending the memory access request to the corresponding memory channel for execution according to the target channel index to obtain an execution result.
- 2. The method of claim 1, wherein the hardware hierarchy of the artificial intelligence chip comprises a plurality of hardware levels, each hardware level corresponding to one hierarchical address segment.
- 3. The method of claim 1, wherein performing hash calculation on each of the plurality of hierarchical address segments to obtain a corresponding target hash result comprises: performing hash calculation on the plurality of hierarchical address segments in order of hardware level from high to low to obtain the corresponding target hash results.
- 4. The method of claim 3, wherein performing hash calculation on the plurality of hierarchical address segments in order of hardware level from high to low to obtain the corresponding target hash results comprises: performing the following hash calculation process on each of the plurality of hierarchical address segments in order of hardware level from high to low: performing hash calculation on a reference address segment carried by the memory access address and the hierarchical address segment to obtain a preliminary hash result; and obtaining the target hash result of the hierarchical address segment based on the preliminary hash result and the already obtained target hash results of the other hierarchical address segments.
- 5. The method of claim 4, wherein obtaining the target hash result of the hierarchical address segment based on the preliminary hash result and the already obtained target hash results of the other hierarchical address segments comprises: performing an exclusive OR operation on the preliminary hash result and the already obtained target hash results of the other hierarchical address segments to obtain the target hash result of the hierarchical address segment.
- 6. The method of claim 1, wherein the memory access address further comprises a reference address segment, and wherein performing hash calculation on each of the plurality of hierarchical address segments to obtain a corresponding target hash result comprises: for each hierarchical address segment, performing hash calculation on the reference address segment and the hierarchical address segment to obtain the corresponding target hash result.
- 7. The method of any one of claims 1 to 6, wherein combining the obtained target hash results to obtain the target channel index comprises: combining the target hash results in order of hardware level from high to low to obtain the target channel index.
- 8. The method of any one of claims 1 to 6, wherein the target channel index comprises a sub-index for each of the plurality of hardware levels, and wherein sending the memory access request to the corresponding memory channel for execution according to the target channel index to obtain an execution result comprises: for each hardware level, determining the matching hardware resource at that hardware level based on the sub-index of the hardware level; determining the memory channel matching the memory access request based on the hardware resources matched at the respective hardware levels; and sending the memory access request to the memory channel for execution to obtain an execution result.
- 9. The method of claim 8, wherein sending the memory access request to the memory channel for execution to obtain an execution result comprises: sending the memory access request to the memory channel for execution via a memory controller to obtain an execution result, wherein the memory channel is located in a video memory, the video memory comprises a plurality of memory channels, and each memory channel corresponds to one memory controller.
- 10. A computer device comprising a memory, an artificial intelligence chip, and a computer program stored in the memory and executable on the artificial intelligence chip, wherein the artificial intelligence chip implements the steps of the method of any one of claims 1 to 9 when executing the computer program.
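The flow recited in claims 1 to 8 can be sketched in software as follows. This is only an illustrative model under stated assumptions, not the claimed hardware implementation: the bit widths (`LINE_OFFSET_BITS`, `LEVEL_BITS`, `REF_BITS`), the address layout, the XOR-fold hash and all function names are hypothetical, since the claims do not fix a particular hash function or hierarchy.

```python
from collections import Counter

# All widths below are illustrative assumptions, not values from the source.
LINE_OFFSET_BITS = 8        # bytes inside one interleaving unit
LEVEL_BITS = (2, 2, 1)      # hardware levels, highest first -> 32 channels
REF_BITS = 16               # reference address segment above the level bits
NUM_CHANNELS = 1 << sum(LEVEL_BITS)


def xor_fold(value: int, width: int) -> int:
    """Fold an integer down to `width` bits by XOR-ing width-sized chunks."""
    mask = (1 << width) - 1
    folded = 0
    while value:
        folded ^= value & mask
        value >>= width
    return folded


def split_address(addr: int):
    """Split an address into (reference segment, level segments, highest first)."""
    bits = addr >> LINE_OFFSET_BITS
    segments = []
    for width in reversed(LEVEL_BITS):      # lowest level occupies the lowest bits
        segments.append(bits & ((1 << width) - 1))
        bits >>= width
    segments.reverse()
    reference = bits & ((1 << REF_BITS) - 1)
    return reference, segments


def channel_index(addr: int):
    """Return (target channel index, per-level sub-indexes, highest first)."""
    reference, segments = split_address(addr)
    targets = []
    shift = 0
    for segment, width in zip(segments, LEVEL_BITS):
        mask = (1 << width) - 1
        # Preliminary hash of the reference segment and this level's segment.
        # Assumption: each level folds a differently shifted view of the reference.
        preliminary = segment ^ xor_fold(reference >> shift, width)
        target = preliminary
        for previous in targets:            # XOR with already obtained results
            target ^= previous & mask
        targets.append(target)
        shift += width
    index = 0
    for target, width in zip(targets, LEVEL_BITS):  # concatenate, highest first
        index = (index << width) | target
    return index, targets


# 64 requests whose stride equals one full sweep over all channels.
stride = NUM_CHANNELS << LINE_OFFSET_BITS
hashed_hits = Counter(channel_index(i * stride)[0] for i in range(64))
print(len(hashed_hits), min(hashed_hits.values()), max(hashed_hits.values()))  # 32 2 2
```

With these assumed widths, the 64 strided requests land on all 32 channels, two per channel. The second return value of `channel_index` holds one sub-index per hardware level, which is the form claim 8 uses to pick the matching hardware resource at each level before the memory channel is determined.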
Description
Memory access optimization method, equipment, storage medium and product
Technical Field
The embodiments of the present application relate to the technical field of artificial intelligence chips, and in particular to a memory access optimization method, a device, a storage medium and a product.
Background
An artificial intelligence chip, such as a general-purpose graphics processing unit (GPGPU), typically has a multi-level cache hierarchy, and the performance of accesses to high bandwidth memory (HBM) is critical to the computational performance of the whole chip. In the related art, the HBM in a GPGPU architecture generally adopts a multi-channel architecture, that is, it comprises a plurality of memory channels, and the computing core (compute core) distributes memory access requests to specific memory channels by direct mapping. However, when the addresses have a fixed stride, direct mapping can map consecutive memory access requests to the same memory channel, so that some memory channels are overloaded while others sit idle. Resource utilization becomes unbalanced, which degrades the overall performance of the chip.
Disclosure of Invention
The embodiments of the present application provide a memory access optimization method, a device and a storage medium that map memory access requests evenly onto a plurality of memory channels to balance the memory access load, thereby improving resource utilization and the overall performance of the chip.
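The fixed-stride problem described in the Background can be illustrated with a minimal sketch. The channel count (32), the 256-byte interleaving granularity and the function name `direct_index` are assumptions chosen for illustration; the source does not specify them.

```python
from collections import Counter

NUM_CHANNELS = 32       # assumed number of memory channels
LINE_BYTES = 256        # assumed interleaving granularity in bytes


def direct_index(addr: int) -> int:
    """Direct mapping: the low-order bits of the line index pick the channel."""
    return (addr // LINE_BYTES) % NUM_CHANNELS


# Sequential accesses rotate through every channel.
sequential_hits = Counter(direct_index(i * LINE_BYTES) for i in range(64))

# A stride of NUM_CHANNELS lines keeps the low-order bits constant,
# so every request is sent to the same channel.
stride = NUM_CHANNELS * LINE_BYTES
strided_hits = Counter(direct_index(i * stride) for i in range(64))

print(len(sequential_hits), len(strided_hits))  # 32 1
```

Under these assumptions the sequential pattern touches all 32 channels, while the strided pattern sends all 64 requests to channel 0 and leaves the other 31 channels idle.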
In one aspect, an embodiment of the present application provides a memory access optimization method, the method comprising: receiving a memory access request sent by a computing core, wherein the memory access request carries a memory access address; segmenting the memory access address based on a hardware hierarchy of an artificial intelligence chip to obtain a plurality of hierarchical address segments, wherein the hardware hierarchy comprises a plurality of hardware levels and each hardware level corresponds to one hierarchical address segment; performing hash calculation on each of the plurality of hierarchical address segments to obtain a corresponding target hash result; combining the obtained target hash results to obtain a target channel index; and sending the memory access request to the corresponding memory channel for execution according to the target channel index to obtain an execution result.
In another aspect, an embodiment of the present application provides a memory access optimization apparatus, comprising: an acquisition module configured to receive a memory access request sent by a computing core, wherein the memory access request carries a memory access address; a segmentation module configured to segment the memory access address based on a hardware hierarchy of an artificial intelligence chip to obtain a plurality of hierarchical address segments, wherein the hardware hierarchy comprises a plurality of hardware levels and each hardware level corresponds to one hierarchical address segment; and a matching module configured to perform hash calculation on each of the plurality of hierarchical address segments to obtain a corresponding target hash result, combine the obtained target hash results to obtain a target channel index, and send the memory access request to the corresponding memory channel for execution according to the target channel index to obtain an execution result.
Optionally, the hardware hierarchy of the artificial intelligence chip comprises a plurality of hardware levels, and each hardware level corresponds to one hierarchical address segment.
Optionally, the matching module is specifically configured to: perform hash calculation on the plurality of hierarchical address segments in order of hardware level from high to low to obtain the corresponding target hash results.
Optionally, the matching module is specifically configured to perform the following hash calculation process on each of the plurality of hierarchical address segments in order of hardware level from high to low: performing hash calculation on a reference address segment carried by the memory access address and the hierarchical address segment to obtain a preliminary hash result; and obtaining the target hash result of the hierarchical address segment based on the preliminary hash result and the already obtained target hash results of the other hierarchical address segments.
Optionally, the matching module is specifically configured to: perform an exclusive OR operation on the preliminary hash result and the already obtained target hash results of the other hierarchical address segments to obtain the target hash resul