WO-2026092649-A1 - INTEGRATED CIRCUIT FOR DATA PROCESSING, PROCESSOR AND COMPUTING DEVICE

Abstract

The present disclosure relates to an integrated circuit for data processing, a processor and a computing device. The integrated circuit comprises a data storage circuit and a matrix operation circuit. The data storage circuit is configured to store matrix data. The matrix operation circuit comprises: a control logic circuit configured to: on the basis of a specified data format, read a first matrix and a second matrix from the data storage circuit, the specified data format indicating that elements in the first matrix are floating-point data, and elements in the second matrix are integer data; and an operation logic circuit configured to: convert elements in the first matrix and in the second matrix into elements of a target floating-point type on the basis of the data type of the elements indicated by the specified data format and determine, on the basis of the converted elements of the first matrix and the second matrix, a result matrix corresponding to the first matrix and the second matrix.

Inventors

WEN, Si
GAO, KUN
HAN, LIANG
LI, Xiaosen
LIU, YONGGANG

Assignees

Moore Threads Intelligent Technology (Shanghai) Co., Ltd. (摩尔线程智能科技(上海)有限责任公司)

Dates

Publication Date: 2026-05-07
Application Date: 2025-10-31
Priority Date: 2024-10-31

Claims (14)

An integrated circuit for data processing, comprising: a data storage circuit configured to store matrix data; and a matrix operation circuit, comprising: a control logic circuit configured to: read a first matrix and a second matrix from the data storage circuit based on a specified data format, wherein the specified data format indicates that elements in the first matrix are floating-point data and elements in the second matrix are integer data; and an operation logic circuit configured to: convert the elements in the first matrix and the second matrix into elements of a target floating-point type according to the data type of the elements indicated by the specified data format, and determine a result matrix corresponding to the first matrix and the second matrix based on the converted elements of the first matrix and the second matrix.
The integrated circuit according to claim 1, wherein the operation logic circuit comprises: at least one conversion circuit, wherein each conversion circuit is configured to: receive elements of the first matrix or the second matrix and element type data indicating the data type of the received elements; and convert the received elements into elements of the target floating-point type based on the element type data, wherein the element type data is determined based on the specified data format; and an arithmetic circuit, coupled to the conversion circuit and configured to perform at least one of a multiplication operation and an addition operation based on the converted elements of the first matrix and the second matrix to determine the result matrix.
The integrated circuit according to claim 2, wherein each conversion circuit is configured to: in response to the element type data indicating that the received element is integer data, determine the target floating-point element corresponding to the element based on a pre-stored mapping table or by invoking preset conversion logic.
The integrated circuit according to claim 2 or 3, wherein the conversion circuit is configured to: in response to the element type data indicating that the received element is 4-bit integer data, determine the target floating-point element corresponding to the element based on a mapping table from 4-bit integer data to target floating-point data; and in response to the element type data indicating that the received element is 8-bit integer data, determine the sign value, exponent, and mantissa corresponding to the element according to the data structure of 8-bit integer data, and determine the target floating-point element corresponding to the element based on the determined sign value, exponent, and mantissa.
The integrated circuit according to any one of claims 1 to 4, wherein the integer elements in the second matrix are closely packed.
The integrated circuit according to claim 5, wherein the data storage circuit includes a register array, the integer elements of the second matrix are stored in the register array in a closely packed manner according to the matrix structure, and the control logic circuit is configured to read the closely packed integer elements of the second matrix from the register array.
The integrated circuit according to claim 5, wherein the data storage circuit includes a buffer, and the integer elements of the second matrix are stored in the buffer in a closely packed manner in an interleaved pattern, wherein the interleaved pattern matches the data type of the integer elements and the size of the matrix stored in the buffer, and in the interleaved pattern, elements in the same column of the row-major second matrix are stored in different memory banks of the buffer, or elements in the same row of the column-major second matrix are stored in different memory banks of the buffer; and the control logic circuit is configured to read the closely packed integer elements of the second matrix from multiple memory banks of the buffer according to the interleaved pattern.
The integrated circuit according to any one of claims 1 to 7, further comprising: a data loading circuit configured to load an external matrix, including the second matrix, into the data storage circuit according to a loading step size, wherein the loading step size matches the size of the external matrix to be loaded and the data type of the integer elements.
The integrated circuit according to any one of claims 1 to 8, wherein the control logic circuit is configured to: obtain a read step size, wherein the read step size matches the size of the second matrix and the data type of the integer elements; and read the integer elements of the second matrix from the data storage circuit according to the read step size.
The integrated circuit according to any one of claims 1 to 9, wherein the target floating-point type is a non-standard floating-point data type used by the hardware circuit when performing floating-point operations.
The integrated circuit according to any one of claims 2 to 10, wherein the operation logic circuit comprises a plurality of sub-circuits, wherein each sub-circuit corresponds to an element in the result matrix and includes: a conversion module configured to: receive the elements of the first matrix and the second matrix that correspond to the element in the result matrix, and convert the received elements into elements of the target floating-point type based on the data type of the received elements; an arithmetic module, coupled to the conversion module and configured to: perform at least one of a multiplication operation and an addition operation based on the converted elements of the first matrix and the second matrix to obtain the target floating-point element corresponding to the element in the result matrix; and a regularization module, coupled to the arithmetic module and configured to convert the obtained target floating-point element into an element of a desired floating-point type, wherein the desired floating-point type is a standard floating-point data type different from the target floating-point type.
The integrated circuit according to any one of claims 2 to 11, wherein the conversion circuit comprises: a floating-point branch circuit configured to: in response to the element type data indicating that the received element is a floating-point element, determine the sign value, exponent, and mantissa corresponding to the element according to the floating-point data structure indicated by the element type data, and determine the target floating-point element corresponding to the element based on the determined sign value, exponent, and mantissa; and an integer branch circuit configured to: in response to the element type data indicating that the received element is an integer element, determine the target floating-point element corresponding to the element based on a pre-stored mapping table or by invoking preset conversion logic.
A processor comprising an integrated circuit according to any one of claims 1-12.
A computing device comprising the processor according to claim 13.
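The 8-bit integer branch recited in the claims above (deriving a sign value, exponent, and mantissa from the integer and assembling a floating-point element from them) can be illustrated with a minimal software sketch. It rests on assumptions not stated in the source: the integer is a signed two's-complement INT8, and IEEE 754 binary32 stands in for the target floating-point type, which the claims say may be a non-standard hardware format. The function names are hypothetical.

```python
import struct

FRACTION_BITS = 23   # binary32 fraction width (assumed stand-in for the target type)
EXPONENT_BIAS = 127  # binary32 exponent bias


def int8_to_float_fields(x):
    """Split a signed 8-bit integer into (sign, biased exponent, mantissa)."""
    if not -128 <= x <= 127:
        raise ValueError("expected a signed 8-bit value")
    sign = 1 if x < 0 else 0
    magnitude = -x if x < 0 else x            # 0..128
    if magnitude == 0:
        return sign, 0, 0                     # zero encodes as all-zero fields
    exponent = magnitude.bit_length() - 1     # position of the leading 1 bit
    # Drop the implicit leading 1 and left-align the remaining bits.
    mantissa = (magnitude - (1 << exponent)) << (FRACTION_BITS - exponent)
    return sign, exponent + EXPONENT_BIAS, mantissa


def assemble_float(sign, biased_exponent, mantissa):
    """Pack the three fields into a binary32 bit pattern and decode it."""
    bits = (sign << 31) | (biased_exponent << FRACTION_BITS) | mantissa
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def int8_to_float(x):
    """Convert one INT8 element via its sign, exponent, and mantissa."""
    return assemble_float(*int8_to_float_fields(x))
```

Because every INT8 magnitude fits in 8 bits, the conversion is exact in this stand-in format; a narrower hardware format would follow the same field-extraction steps with different field widths.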

Description

Integrated Circuit for Data Processing, Processor and Computing Device

This application claims priority to Chinese Patent Application No. 202411548446.8, filed on October 31, 2024, entitled "Integrated Circuit, Processor and Computing Device for Data Processing", the entire contents of which are incorporated herein by reference.

Technical Field

This disclosure relates to the field of integrated circuit technology, and more specifically to an integrated circuit for data processing, a processor, and a computing device.

Background

Fields such as artificial intelligence (AI) often involve a large number of matrix operations. Taking AI models as an example, training involves many operations among feature matrices, weight matrices, and other matrices. As models grow larger and structurally more complex, the efficiency of matrix operations limits training efficiency on the one hand, and the massive number of weight parameters occupies a large amount of storage space on the other.

Summary of the Invention

This disclosure provides an integrated circuit for data processing, a processor, and a computing device.
According to one aspect of the present disclosure, an integrated circuit for data processing is provided, comprising: a data storage circuit configured to store matrix data; and a matrix operation circuit, comprising: a control logic circuit configured to: read a first matrix and a second matrix from the data storage circuit based on a specified data format, wherein the specified data format indicates that elements in the first matrix are floating-point data and elements in the second matrix are integer data; and an operation logic circuit configured to: convert the elements in the first matrix and the second matrix into elements of a target floating-point type according to the data type of the elements indicated by the specified data format, and determine a result matrix corresponding to the first matrix and the second matrix based on the converted elements of the first matrix and the second matrix.
In some embodiments, the integer elements in the second matrix are closely packed.
In some embodiments, the operation logic circuit includes: at least one conversion circuit, wherein each conversion circuit is configured to: receive elements of the first matrix or the second matrix and element type data indicating the data type of the received elements, and convert the received elements into elements of the target floating-point type based on the element type data, wherein the element type data is determined based on the specified data format; and an arithmetic circuit, coupled to the conversion circuit and configured to: perform at least one of a multiplication operation and an addition operation based on the converted elements of the first matrix and the second matrix to determine the result matrix.
In some embodiments, each conversion circuit is configured to: in response to the element type data indicating that the received element is integer data, determine the target floating-point element corresponding to the element based on a pre-stored mapping table or by invoking preset conversion logic.
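The data path summarized above (read closely packed integer elements, convert them through a mapping table, then multiply and accumulate against the floating-point matrix) can be sketched in software as follows. This is an illustrative model only, not the disclosed circuit. It assumes signed two's-complement 4-bit elements packed two per byte with the low nibble first and stored row-major, and it uses Python's built-in float as a stand-in for the target floating-point type. All names are hypothetical.

```python
# Mapping table: raw 4-bit pattern -> value in the (assumed) target float type.
# Patterns 8..15 are the negative two's-complement values -8..-1.
INT4_TABLE = [float(n - 16 if n >= 8 else n) for n in range(16)]


def unpack_nibbles(packed, count):
    """Return `count` raw 4-bit patterns, two per byte, low nibble first."""
    return [(packed[i // 2] >> (4 * (i % 2))) & 0xF for i in range(count)]


def mixed_matmul(a, b_packed, k, n):
    """Multiply an m x k float matrix by a k x n packed INT4 matrix.

    a        -- list of m rows, each a list of k floats (first matrix)
    b_packed -- bytes holding k*n INT4 elements in row-major order (second matrix)
    """
    nibbles = unpack_nibbles(b_packed, k * n)
    # Conversion step: every integer element goes through the mapping table.
    b = [[INT4_TABLE[nibbles[r * n + c]] for c in range(n)] for r in range(k)]
    result = []
    for row in a:
        if len(row) != k:
            raise ValueError("inner dimensions do not match")
        # Arithmetic step: multiply-accumulate on the converted elements.
        result.append([sum(float(row[r]) * b[r][c] for r in range(k))
                       for c in range(n)])
    return result
```

For example, `mixed_matmul([[1.5, -2.0]], bytes([0xF1, 0x32]), 2, 2)` treats the two bytes as the second matrix [[1, -1], [2, 3]]. Packing two elements per byte halves the storage of the integer matrix relative to one element per byte, which is the storage saving that close packing targets.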
In some embodiments, the conversion circuit is configured to: in response to the element type data indicating that the received element is 4-bit integer data, determine the target floating-point element corresponding to the element based on a mapping table from 4-bit integer data to target floating-point data; and in response to the element type data indicating that the received element is 8-bit integer data, determine the sign value, exponent, and mantissa corresponding to the element according to the data structure of the 8-bit integer data, and determine the target floating-point element corresponding to the element based on the determined sign value, exponent, and mantissa.
In some embodiments, the data storage circuit includes a register array in which the integer elements of the second matrix are stored in a closely packed manner according to the matrix structure, wherein the control logic circuit is configured to read the closely packed integer elements of the second matrix from the register array.
In some embodiments, the data storage circuit includes a buffer in which the integer elements of the second matrix are stored in a closely packed manner in an interleaved pattern. The interleaved pattern matches the data type of the integer elements and the size of the matrix stored in the buffer. In the interleaved pattern, elements in the same column of the row-major second matrix are stored in different memory banks of the buffer, or elements in the same row of the column-major second matrix are stored in different memory banks of the buffer. The control logic circuit is configured to read the closely packed integer elements of the second matrix from multiple memory banks of the buffer according to the interleaved pattern.
In some embodiments, the integrated circuit further includes a data loading circuit configured