CN-116450338-B - Method for acquiring maximum continuous resource block of GPU

Abstract

The invention relates to a method for acquiring the maximum continuous resource block of a GPU (graphics processing unit). The method comprises: step E1, reading the current resource state sequence $S_0 = \{d_1, d_2, \dots, d_N\}$ of a resource to be checked, where $d_n$ is the state identifier of the $n$-th resource block of the resource to be checked, $n$ ranges from 1 to $N$, and $N$ is the total number of resource blocks of the resource to be checked; step E2, dividing $S_0$ equally into $Z$ groups of resource state sequences $\{U_1, U_2, \dots, U_Z\}$, where $U_z = \{d_{N(z-1)/Z+1}, d_{N(z-1)/Z+2}, \dots, d_{Nz/Z}\}$ is the $z$-th group, $z$ ranges from 1 to $Z$, $Z$ is smaller than $N$, and $N$ is divisible by $Z$, and performing a bitwise AND or bitwise OR operation on each $U_z$ to generate the state sequence to be checked $F_0 = \{UA_1, UA_2, \dots, UA_Z\}$, where $UA_z$ is the bitwise AND or bitwise OR result of $U_z$; and step E3, determining the current maximum number of continuous resource blocks of the resource to be checked based on $F_0$. The invention reduces the area and power consumption of the GPU chip and improves the resource utilization and computing performance of the GPU.

Inventors

  • Request for anonymity
  • Request for anonymity
  • Request for anonymity
  • Request for anonymity

Assignees

  • 沐曦集成电路(上海)有限公司

Dates

Publication Date
20260512
Application Date
20220110

Claims (5)

  1. A method for acquiring the maximum continuous resource block of a GPU, characterized in that the method comprises the following steps:
     step E1, reading the current resource state sequence $S_0 = \{d_1, d_2, \dots, d_N\}$ of the resource to be checked, where $d_n$ is the state identifier of the $n$-th resource block of the resource to be checked, $n$ ranges from 1 to $N$, and $N$ is the total number of resource blocks of the resource to be checked;
     step E2, dividing $S_0$ equally into $Z$ groups of resource state sequences $\{U_1, U_2, \dots, U_Z\}$, where $U_z = \{d_{N(z-1)/Z+1}, d_{N(z-1)/Z+2}, \dots, d_{Nz/Z}\}$ is the $z$-th group, $z$ ranges from 1 to $Z$, $Z$ is smaller than $N$, and $N$ is divisible by $Z$, and performing a bitwise AND or bitwise OR operation on each $U_z$ to generate the state sequence to be processed $F_0 = \{UA_1, UA_2, \dots, UA_Z\}$, where $UA_z$ is the bitwise AND or bitwise OR result of $U_z$;
     step E3, determining the current maximum number of continuous resource blocks of the resource to be checked based on $F_0$;
     wherein step E3 comprises:
     step E31, acquiring in parallel the state sequences $F_1, F_2, \dots, F_{Z-1}$, where $F_j$ is the sequence obtained by shifting $F_0$ by $j$ bits in a preset direction and setting the $j$ bits at the tail in the preset direction to the occupied identifier, and $j$ ranges from 0 to $Z-1$;
     step E32, acquiring in parallel the results $FA_j$ of the bitwise AND or bitwise OR operation over $F_0$ through $F_j$;
     step E33, performing a self-OR operation, or a self-AND operation followed by negation, on each $FA_j$ to determine the current maximum number of continuous resource blocks of the resource to be checked;
     wherein step E33 comprises:
     step E331, performing a self-OR operation, or a self-AND operation followed by negation, on each $FA_j$ to obtain $FAR_j$;
     step E332, generating a first sequence to be tested $\{FAR_0, FAR_1, \dots, FAR_{Z-1}\}$ based on all $FAR_j$;
     step E333, determining the current maximum number of continuous resource blocks of the resource to be checked based on $\{FAR_0, FAR_1, \dots, FAR_{Z-1}\}$;
     wherein step E333 comprises:
     step E3331, reading $\{FAR_0, FAR_1, \dots, FAR_{Z-1}\}$ starting from $FAR_{Z-1}$, determining the value of $j$ at which $FAR_j$ first equals 1, and setting $j' = j + 1$;
     step E3332, determining the current maximum number of continuous resource blocks $X$ of the resource to be checked based on $j'$: $X = j' \times (N/Z)$.
  2. The method of claim 1, wherein in step E31, shifting by $j$ bits in the preset direction comprises shifting left by $j$ bits or shifting right by $j$ bits.
  3. The method of claim 1, wherein in step E1, the state identifier comprises an occupied identifier and an unoccupied identifier.
  4. The method of claim 3, wherein the occupied identifier is 0 and the unoccupied identifier is 1, a bitwise AND operation is performed on each $U_z$ in $\{U_1, U_2, \dots, U_Z\}$ in step E2, a bitwise AND operation is performed in step E32, and a self-OR operation is performed in step E33; or the occupied identifier is 1 and the unoccupied identifier is 0, a bitwise OR operation is performed on each $U_z$ in $\{U_1, U_2, \dots, U_Z\}$ in step E2, a bitwise OR operation is performed in step E32, and a self-AND operation followed by negation is performed in step E33.
  5. The method of claim 1, wherein N is an integer multiple of 4 and Z is N/4.
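Steps E1 through E3332 of claim 1 can be sketched in software as follows. This is a minimal, sequential Python illustration under stated assumptions, not the hardware logic of the invention, which the claims describe as operating in parallel. It assumes the first convention of claim 4 (occupied identifier 0, unoccupied identifier 1) and a shift toward lower indices as the preset direction. The function name `max_contiguous_free_blocks`, the list-of-bits representation, and the return value 0 when no FAR_j equals 1 are hypothetical choices made for illustration; the claims do not specify that last case.

```python
def max_contiguous_free_blocks(states, z_groups):
    """Estimate the maximum run of free blocks, following steps E1 to E3332.

    states   : list of 0/1 flags, one per resource block (assumed: 1 = unoccupied).
    z_groups : Z, the number of equal groups; must satisfy Z < N and N % Z == 0.
    """
    n = len(states)
    if not 0 < z_groups < n or n % z_groups != 0:
        raise ValueError("Z must be smaller than N and N must be divisible by Z")
    group_size = n // z_groups

    # E2: AND-reduce each group; UA_z is 1 only if every block in U_z is free.
    f0 = [int(all(states[g * group_size:(g + 1) * group_size]))
          for g in range(z_groups)]

    # E31: F_j is F_0 shifted by j positions; vacated tail bits are set to
    # the occupied identifier (0 under the assumed convention).
    def shifted(j):
        return f0[j:] + [0] * j

    # E32 and E331: FA_j = F_0 & F_1 & ... & F_j, then FAR_j = OR of all bits of FA_j.
    far = []
    fa = [1] * z_groups
    for j in range(z_groups):
        fa = [a & b for a, b in zip(fa, shifted(j))]
        far.append(int(any(fa)))

    # E3331 and E3332: scan from FAR_{Z-1} downward; the first 1 gives j' = j + 1.
    for j in range(z_groups - 1, -1, -1):
        if far[j]:
            return (j + 1) * group_size
    return 0  # assumption: no fully free group means no block is reported


# N = 16, Z = 4: groups 1 and 2 are fully free, group 3 has one occupied block.
example = [1, 1, 1, 1,  1, 1, 1, 1,  1, 0, 1, 1,  1, 1, 1, 1]
print(max_contiguous_free_blocks(example, 4))
```

Because each bit of F_0 stands for a whole group, the value returned by this sketch is always a multiple of N/Z and counts only groups whose blocks are all unoccupied.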

Description

Method for acquiring maximum continuous resource block of GPU

Technical Field

The invention relates to the technical field of computers, and in particular to a method for acquiring the maximum continuous resource block of a graphics processing unit (GPU).

Background

A graphics processing unit (GPU), also known as a display core, vision processor, or display chip, is designed for computationally intensive, highly parallel workloads. While a GPU executes tasks, unbalanced allocation of any one resource wastes GPU resources and thereby reduces both resource utilization and computing performance. During operation, therefore, each GPU resource should be scheduled as evenly as possible so that every resource, and hence the GPU as a whole, stays in a balanced state, which improves resource utilization and computing performance.

In the prior art, however, balanced scheduling of GPU resources remains difficult to achieve, particularly when the GPU executes complex computing tasks. Allocating resources generally takes a long time, and the allocation result cannot guarantee balance, so reliability is poor. How to provide an efficient and reliable technique for balanced GPU resource scheduling, one that reasonably allocates GPU resources to multiple task groups, improves task processing efficiency, and improves GPU resource utilization and computing performance, is therefore a technical problem to be solved.
Disclosure of Invention

The invention aims to provide a method for acquiring the maximum continuous resource block of a GPU that reduces the area and power consumption of the GPU chip and improves the resource utilization and computing performance of the GPU.

According to the invention, a method for acquiring the maximum continuous resource block of a GPU is provided, comprising the following steps:

Step E1, reading the current resource state sequence $S_0 = \{d_1, d_2, \dots, d_N\}$ of the resource to be checked, where $d_n$ is the state identifier of the $n$-th resource block of the resource to be checked, $n$ ranges from 1 to $N$, and $N$ is the total number of resource blocks of the resource to be checked;

Step E2, dividing $S_0$ equally into $Z$ groups of resource state sequences $\{U_1, U_2, \dots, U_Z\}$, where $U_z = \{d_{N(z-1)/Z+1}, d_{N(z-1)/Z+2}, \dots, d_{Nz/Z}\}$ is the $z$-th group, $z$ ranges from 1 to $Z$, $Z$ is smaller than $N$, and $N$ is divisible by $Z$, and performing a bitwise AND or bitwise OR operation on each $U_z$ to generate the state sequence to be processed $F_0 = \{UA_1, UA_2, \dots, UA_Z\}$, where $UA_z$ is the bitwise AND or bitwise OR result of $U_z$;

Step E3, determining the current maximum number of continuous resource blocks of the resource to be checked based on $F_0$.

Compared with the prior art, the invention has clear advantages and beneficial effects. With the above technical scheme, the method for acquiring the maximum continuous resource block of a GPU achieves considerable technical progress and practicality, has broad industrial value, and has at least the following advantage:

Dividing the current resource state sequence of the resource to be checked into $Z$ groups shortens the state sequence from $N$ bits to $Z$ bits, which greatly reduces the subsequent computation and thereby reduces the area and power consumption of the GPU chip.
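The shortening of the state sequence from N bits to Z bits in step E2 can be illustrated with a short Python sketch. It is an illustration only, written under assumptions: the function name `reduce_groups` and the parameter `occupied_flag` are hypothetical, and the two branches follow the two identifier conventions given in claim 4 (AND when the occupied identifier is 0, OR when it is 1).

```python
def reduce_groups(states, z_groups, occupied_flag):
    """Collapse an N-bit state sequence S_0 into a Z-bit sequence F_0 (step E2).

    states        : list of 0/1 state identifiers, one per resource block.
    z_groups      : Z, with Z < N and N divisible by Z.
    occupied_flag : 0 or 1, the value that marks an occupied block (assumed parameter).
    """
    n = len(states)
    if not 0 < z_groups < n or n % z_groups != 0:
        raise ValueError("Z must be smaller than N and N must be divisible by Z")
    size = n // z_groups
    groups = [states[g * size:(g + 1) * size] for g in range(z_groups)]

    if occupied_flag == 0:
        # Unoccupied = 1: AND-reduce, so UA_z = 1 only if the whole group is free.
        return [int(all(group)) for group in groups]
    # Occupied = 1: OR-reduce, so UA_z = 0 only if the whole group is free.
    return [int(any(group)) for group in groups]


# N = 8, Z = 2: the first group is fully free, the second has one occupied block.
print(reduce_groups([1, 1, 1, 1, 1, 0, 1, 1], 2, occupied_flag=0))  # [1, 0]
print(reduce_groups([0, 0, 0, 0, 0, 1, 0, 0], 2, occupied_flag=1))  # [0, 1]
```

In both conventions the later steps operate on Z values instead of N, which is the reduction in computation that the passage above attributes to the grouping.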
The foregoing is only an overview of the technical solution of the present invention. So that the technical means of the invention can be understood more clearly and implemented in accordance with the specification, preferred embodiments are described in detail below with reference to the accompanying drawings.

Drawings

FIG. 1 is a schematic diagram of a prior art multi-tasking channel issuing a task group to a GPU;
FIG. 2 is a flowchart of a GPU resource scheduling method according to the first embodiment;
FIG. 3 is a flowchart of a GPU resource scheduling method according to the second embodiment;
FIG. 4 is a flowchart of a GPU resource scheduling method according to the third embodiment;
FIG. 5 is a flowchart of a method for acquiring a GPU maximum continuous resource block according to the fourth embodiment;
FIG. 6 is a flowchart of a method for acquiring a GPU maximum continuous resource block according to the fifth embodiment;
FIG. 7 is a flowchart of a method for acquiring a GPU maximum continuous resource block according to the sixth embodiment;
FIG. 8 is a flowchart of a method for acquiring a GPU maximum continuous resource block based on time division multiplexing according to the seventh embodiment;
FIG. 9 is a flow