CN-116136894-B - Data processing method based on matrix processor and readable storage medium

Abstract

The application provides a data processing method based on a matrix processor, and a readable storage medium. The method comprises: reading W first elements in a first matrix and sending the W first elements to a computing unit of the matrix processor for computation, wherein W is larger than the width N of the first matrix and smaller than or equal to the number K of computing units of the matrix processor; repeating these steps until the number of remaining elements in the first matrix is smaller than W; and, in response to the number of remaining elements in the first matrix being non-zero, sending the remaining elements to the computing unit for computation. The method improves the utilization of the computing units of the matrix processor, reduces the number of cycles used for computation, and shortens the computation time.

Inventors

WANG XUEDONG
PAN WEIXING

Assignees

北京希姆计算科技有限公司

Dates

Publication Date: 20260505
Application Date: 20211118

Claims (7)

1. A data processing method based on a matrix processor, comprising: reading W first elements in a first matrix to a first register unit of the matrix processor; reading the W first elements from the first register unit and sending them to a calculation unit of the matrix processor, and reading W calculation results from a result register unit of the matrix processor and sending them to the calculation unit, so that the calculation unit performs the calculation and updates the calculation results in the result register unit, wherein the result register unit caches the result of each calculation by the calculation unit, and W is larger than the width N of the first matrix and smaller than or equal to the number K of calculation units of the matrix processor; repeating the above steps until the number of remaining elements in the first matrix is smaller than W; in response to the number of remaining elements in the first matrix being non-zero, sending the remaining elements to the calculation unit for calculation; in response to the number of remaining elements in the first matrix being zero, determining that the result register unit contains u×N elements; setting L equal to the integer part of u/2; if u is even, dividing the elements in the result register unit into two groups in storage order, each group containing L×N elements, sending the two groups to the calculation unit for calculation, and outputting the L×N elements of the calculation result to the result register unit; if u is odd, dividing the elements in the result register unit into three groups in storage order, wherein the first group and the second group each contain L×N elements and the third group contains N elements, sending the first group and the second group to the calculation unit for calculation, and outputting the L×N elements of the calculation result to the result register unit, so that the result register unit holds the L×N elements of the calculation result and the N elements of the third group; and repeating the above steps until u = 1.
2. The method of claim 1, wherein sending the W first elements to the calculation unit of the matrix processor for calculation comprises: in response to W = K being set, clipping the W first elements read from the first matrix in data order to obtain W1 elements, wherein W1 = β×N ≤ W, β > 1, and β is an integer; and sending the W1 elements to the calculation unit of the matrix processor for calculation.
3. The method of claim 2, further comprising, prior to reading the W first elements in the first matrix: determining a read address of the first matrix, wherein the read address is addr + β×N×(C-1), addr is the address of the first element of the first matrix, and C is the number of times the first matrix has been read, counting the current read.
4. A data processing method based on a matrix processor, comprising: reading W first elements in a first matrix, and acquiring W second elements in a second matrix corresponding to the W first elements, wherein W is larger than the width N of the first matrix and smaller than or equal to the number K of computing units of the matrix processor; sending the W first elements and the W second elements to a calculation unit of the matrix processor for calculation; wherein acquiring the W second elements in the second matrix corresponding to the W first elements comprises: in response to W = α×N and the width of the first matrix being equal to the width of the second matrix, reading N second elements in the second matrix to a second register unit of the matrix processor, wherein α > 1 and α is an integer, and copying the N second elements in the second register unit to expand them into W second elements; or, in response to W = α×N and the height of the first matrix being equal to the height of the second matrix, reading α second elements in the second matrix to the second register unit of the matrix processor, and copying each of the α second elements in the second register unit to expand them into W second elements; or, in response to the width and the height of the second matrix both being 1, reading the one second element in the second matrix to the second register unit, and copying that second element in the second register unit to expand it into W second elements; repeating the above steps until the number of remaining elements in the first matrix is smaller than W; and, in response to the number of remaining elements in the first matrix being non-zero, sending the remaining elements to the calculation unit for calculation.
5. The method of claim 4, further comprising, prior to reading the W first elements in the first matrix: receiving an operation instruction, and parsing the received operation instruction to determine the operation type indicated by the operation instruction; and determining the value of W according to the determined operation type, the width of the first matrix, and the number of calculation units.
6. The method of claim 4, further comprising, prior to reading the W first elements in the first matrix: performing register parameter configuration according to a register instruction, wherein the register instruction comprises at least the width, the height, and the row spacing of the first matrix; and, in response to the width and the row spacing of the first matrix being equal, confirming that the source addresses of the first matrix are consecutive.
7. A computer readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the method of any one of claims 1 to 6.
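
The final fold-down of the result register described in claim 1 (split into groups of L×N elements, combine, repeat until u = 1) can be illustrated with a minimal Python sketch. The function name `fold_result_register`, the list-based register model, and the choice of a binary operation `op` (for example addition or maximum) are assumptions made for illustration; the claims do not name the operation performed by the calculation unit.

```python
from typing import Callable, List


def fold_result_register(
    reg: List[float], n: int, op: Callable[[float, float], float]
) -> List[float]:
    """Fold a result register holding u*N elements down to N elements.

    Hypothetical model: `reg` is a flat list in storage order and `op`
    stands in for the element-wise operation of the calculation unit.
    """
    if n <= 0 or not reg or len(reg) % n != 0:
        raise ValueError("register must hold a positive multiple of N elements")
    u = len(reg) // n
    while u > 1:
        half = u // 2                      # L = integer part of u/2
        first = reg[: half * n]            # first group, L*N elements
        second = reg[half * n : 2 * half * n]  # second group, L*N elements
        tail = reg[2 * half * n :]         # empty if u is even, N elements if odd
        combined = [op(a, b) for a, b in zip(first, second)]
        reg = combined + tail              # L*N results, plus the third group if any
        u = len(reg) // n
    return reg


if __name__ == "__main__":
    # u = 3 rows of width N = 2; summing folds them into per-column totals.
    print(fold_result_register([1, 2, 3, 4, 5, 6], 2, lambda a, b: a + b))
```

In this sketch an even u halves on every pass, while an odd u carries its last N elements forward unchanged to the next pass, which matches the two cases in the claim.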

Description

Data processing method based on matrix processor and readable storage medium

Technical Field

The application belongs to the technical field of processors, and particularly relates to a data processing method based on a matrix processor and a readable storage medium.

Background

Matrix processors often have multiple computing units, so using these units efficiently is important. Sometimes, however, the matrix to be computed is narrower than the array of computing units. In the prior art, one row of the matrix is generally read at a time and the data is then computed row by row. When a small matrix is computed, not all computing units can be used, so the number of computation cycles required increases and the utilization of the computing units is low.

Disclosure of Invention

It is an object of the application to provide a data processing method based on a matrix processor, and a readable storage medium. The method can improve the utilization of the computing units of the matrix processor, shorten the computation period, and make full use of the computing units.

According to one aspect of the application, a data processing method based on a matrix processor is provided. The method comprises: reading W first elements in a first matrix and sending the W first elements to a computing unit of the matrix processor for computation, wherein W is larger than the width N of the first matrix and smaller than or equal to the number K of computing units of the matrix processor; repeating these steps until the number of remaining elements in the first matrix is smaller than W; and, in response to the number of remaining elements in the first matrix being non-zero, sending the remaining elements to the computing unit for computation.
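
To make the cycle saving concrete, the following Python sketch compares the row-by-row baseline from the Background with the W-element scheme. It assumes the matrix is stored contiguously in row-major order and that one group of elements is processed per cycle; the function names are hypothetical and the cycle model is a simplification, not the processor's actual timing.

```python
from typing import Iterator, List, Sequence


def cycles_row_by_row(height: int) -> int:
    """Baseline: one row of N elements is fed to the computing units per cycle."""
    return height


def cycles_chunked(height: int, width: int, w: int) -> int:
    """W elements per cycle, plus one extra cycle if a remainder is left over."""
    full, rest = divmod(height * width, w)
    return full + (1 if rest else 0)


def feed_chunks(elements: Sequence[float], w: int) -> Iterator[List[float]]:
    """Yield groups of W elements, then the shorter remainder if it is non-empty."""
    for start in range(0, len(elements), w):
        yield list(elements[start : start + w])


if __name__ == "__main__":
    height, n, k = 10, 3, 8      # 10x3 matrix, 8 computing units (example values)
    w = 6                        # N < W <= K
    print(cycles_row_by_row(height))       # 10
    print(cycles_chunked(height, n, w))    # 5
    print([len(c) for c in feed_chunks(range(height * n), k)])
```

With these example values the baseline keeps 3 of 8 computing units busy for 10 cycles, whereas W = 6 keeps 6 of 8 busy for 5 cycles.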
Optionally, before reading the W first elements in the first matrix, the method further includes: receiving an operation instruction and parsing it to determine the operation type it indicates; and determining the value of W according to the determined operation type, the width of the first matrix, and the number of calculation units. Optionally, the W first elements in the first matrix are read to the first register unit of the matrix processor, and W = K is set. Optionally, sending the W first elements to a calculation unit of the matrix processor for calculation includes: in response to W = K being set, clipping the W first elements read from the first matrix in data order to obtain W1 elements, where W1 = β×N ≤ W, β > 1, and β is an integer; and sending the W1 elements to the calculation unit of the matrix processor for calculation. Optionally, before the W first elements in the first matrix are read, a read address of the first matrix is determined, where the read address is addr + β×N×(C-1), addr is the address of the first element of the first matrix, and C is the number of times the first matrix has been read, counting the current read.
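
The clipping step and the read-address formula can be sketched in Python as follows. This is an illustration under stated assumptions: addresses are counted in elements rather than bytes, memory is modeled as a flat list, and β is taken as the largest integer with β×N ≤ K (the text only requires β > 1 and β×N ≤ W). All function names are hypothetical.

```python
from typing import List, Sequence


def clipped_width(n: int, k: int) -> int:
    """Return W1 = beta*N for the largest integer beta with beta*N <= K."""
    beta = k // n
    if beta <= 1:
        raise ValueError("need K >= 2*N so that beta > 1")
    return beta * n


def read_address(addr: int, n: int, beta: int, c: int) -> int:
    """Start address of the C-th read (C counts from 1), in element units."""
    return addr + beta * n * (c - 1)


def read_and_clip(memory: Sequence[float], addr: int, n: int, k: int, c: int) -> List[float]:
    """Read W = K elements for the C-th access and keep the first W1 of them."""
    w1 = clipped_width(n, k)
    start = read_address(addr, n, w1 // n, c)
    window = memory[start : start + k]   # W = K elements are read
    return list(window[:w1])             # clipped in data order to W1 elements


if __name__ == "__main__":
    memory = list(range(200))
    print(clipped_width(3, 8))                    # 6
    print(read_and_clip(memory, 100, 3, 8, 2))    # [106, 107, 108, 109, 110, 111]
```

Because each read advances by β×N rather than by K, the clipped-off elements are simply re-read at the start of the next access, so every group sent to the calculation unit begins on a row boundary.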
Optionally, reading W first elements in a first matrix and sending them to a calculation unit of the matrix processor for calculation includes: reading the W first elements in the first matrix to a first register unit of the matrix processor; reading the W first elements from the first register unit and sending them to the calculation unit of the matrix processor; and reading W calculation results from a result register unit of the matrix processor and sending them to the calculation unit, so that the calculation unit performs the calculation and updates the calculation results in the result register unit, the result register unit being used to cache the result of each calculation by the calculation unit. Optionally, the method includes: reading the W first elements from the first register unit and sending them to a calculation unit of the matrix processor for calculation; outputting the calculation result to a result register unit of the matrix processor; in response to the number of remaining elements in the first matrix being zero, determining that the result register contains u×N elements; setting L equal to the integer part of u/2; if u is even, dividing the elements in the result register into two groups in storage order, each group including L×N elements, sending the two groups to the calculation unit for calculation, and outputting the L×N elements of the calculation result to the result register unit; if u is odd, dividing the elements in the result register into three groups in storage order, wherein the first group and the second group each include L×N elements and the third group includes N elements, sending the first group and the second group to the calculation unit for calculation, and sending the L×N elements of the calculation result to t