EP-4002092-B1 - PROCESSOR FOR FINE-GRAIN SPARSE INTEGER AND FLOATING-POINT OPERATIONS


Inventors

SHAFIEE ARDESTANI, Ali
HASSOUN, JOSEPH H.

Dates

Publication Date: 20260506
Application Date: 20210920

Claims (15)

A method, comprising: forming a first set of products, each product of the first set of products being an integer product of a first activation value and a respective weight of a first plurality of weights; and forming a second set of products, each product of the second set of products being a floating-point product of a second activation value and a respective weight of a second plurality of weights, each of the weights of the first plurality of weights including a least significant sub-word and a most significant sub-word; the most significant sub-word of a first weight of the first plurality of weights being nonzero; the most significant sub-word of a second weight of the first plurality of weights being zero, the least significant sub-word of the second weight of the first plurality of weights being nonzero; the forming of the first set of products comprising: multiplying, in a first multiplier, the first activation value by the least significant sub-word of the first weight to form a first partial product; multiplying, in a second multiplier, the first activation value by the least significant sub-word of the second weight to form the product of the first activation value and the second weight; multiplying, in a third multiplier, the first activation value by the most significant sub-word of the first weight to form a second partial product; and adding the first partial product and the second partial product to form the product of the first activation value and the first weight; the forming of the second set of products comprising forming a first floating point product, the forming of the first floating point product comprising multiplying, in the first multiplier, a first sub-word of a mantissa of the second activation value by a first sub-word of a mantissa of a first weight of the second plurality of weights, to form a third partial product.
The method of claim 1, wherein: the first multiplier is configured to receive a first argument and a second argument, the first argument has a first argument size, the second argument has a second argument size, and the first argument size is greater than the second argument size.
The method of claim 2, wherein: the forming of the first floating point product comprises: receiving, by the first multiplier, a first argument; receiving, by the first multiplier, a second argument; and multiplying the first argument by the second argument; the first argument includes: the first sub-word of the mantissa of the second activation value, and a second sub-word of the mantissa of the second activation value; and the second argument includes the first sub-word of the mantissa of the first weight of the second plurality of weights.
The method of claim 2, wherein: the forming of the first floating point product comprises: receiving, by the first multiplier, a first argument; receiving, by the first multiplier, a second argument; and multiplying the first argument by the second argument; the first argument includes: the first sub-word of the mantissa of the first weight of the second plurality of weights, and a second sub-word of the mantissa of the first weight of the second plurality of weights; and the second argument includes the first sub-word of the mantissa of the second activation value.
The method of claim 2, wherein the second multiplier is configured to receive a first argument having a first size and a second argument having a second size, the first size being greater than the second size.
The method of claim 5, wherein: the forming of the second set of products further comprises forming a second floating point product; the forming of the second floating point product comprises: receiving, by the second multiplier, a first argument; receiving, by the second multiplier, a second argument; and multiplying the first argument received by the second multiplier by the second argument received by the second multiplier; the first argument received by the second multiplier includes: the first sub-word of the mantissa of the first activation value, and a sub-word composed of zeros; and the second argument received by the second multiplier includes: a third sub-word of the mantissa of the first weight of the second plurality of weights.
The method of claim 1, wherein the adding of the first partial product and the second partial product comprises performing an offset addition in a first offset adder.
The method of claim 1, wherein the forming of the second set of products further comprises forming a second floating point product, the forming of the second floating point product comprising multiplying, in the third multiplier, the first sub-word of a mantissa of the second activation value by a second sub-word of a mantissa of the first weight of the second plurality of weights.
The method of claim 8, further comprising adding the first floating point product and the second floating point product.
The method of claim 9, wherein the adding of the first floating point product and the second floating point product comprises performing an offset addition in the first offset adder.
A system, comprising: a processing circuit comprising: a first multiplier, a second multiplier, and a third multiplier, the processing circuit being configured to: form a first set of products, each product of the first set of products being an integer product of a first activation value and a respective weight of a first plurality of weights; and form a second set of products, each product of the second set of products being a floating-point product of a second activation value and a respective weight of a second plurality of weights, each of the weights of the first plurality of weights including a least significant sub-word and a most significant sub-word; the most significant sub-word of a first weight of the first plurality of weights being nonzero; the most significant sub-word of a second weight of the first plurality of weights being zero, the least significant sub-word of the second weight of the first plurality of weights being nonzero; the forming of the first set of products comprising: multiplying, in the first multiplier, the first activation value by the least significant sub-word of the first weight to form a first partial product; multiplying, in the second multiplier, the first activation value by the least significant sub-word of the second weight to form the product of the first activation value and the second weight; multiplying, in the third multiplier, the first activation value by the most significant sub-word of the first weight to form a second partial product; and adding the first partial product and the second partial product to form the product of the first activation value and the first weight; the forming of the second set of products comprising forming a first floating point product, the forming of the first floating point product comprising multiplying, in the first multiplier, a first sub-word of a mantissa of the second activation value by a first sub-word of a mantissa of a first weight of the second plurality of weights, to form a third partial product.
The system of claim 11, wherein: the first multiplier is configured to receive a first argument and a second argument, the first argument has a first argument size, the second argument has a second argument size, and the first argument size is greater than the second argument size.
The system of claim 12, wherein: the forming of the first floating point product comprises: receiving, by the first multiplier, a first argument; receiving, by the first multiplier, a second argument; and multiplying the first argument by the second argument; the first argument includes: the first sub-word of the mantissa of the second activation value, and a second sub-word of the mantissa of the second activation value; and the second argument includes the first sub-word of the mantissa of the first weight of the second plurality of weights.
The system of claim 12, wherein: the forming of the first floating point product comprises: receiving, by the first multiplier, a first argument; receiving, by the first multiplier, a second argument; and multiplying the first argument by the second argument; the first argument includes: the first sub-word of the mantissa of the first weight of the second plurality of weights, and a second sub-word of the mantissa of the first weight of the second plurality of weights; and the second argument includes the first sub-word of the mantissa of the second activation value.
The system of claim 12, wherein the second multiplier is configured to receive a first argument having a first size and a second argument having a second size, the first size being greater than the second size.
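
The integer path recited in claims 1, 7 and 11 can be illustrated with a small software model. This is only a sketch under stated assumptions, not the claimed circuit: the 4-bit sub-word width, the unsigned 8-bit weights, and the names `split_weight`, `multiply`, `offset_add` and `integer_products` are all hypothetical choices made for illustration. It shows a weight with a nonzero most significant sub-word using two multipliers plus an offset addition, while a weight whose most significant sub-word is zero needs only one multiplier.

```python
SUBWORD_BITS = 4  # assumed sub-word width; the claims do not fix a size
MASK = (1 << SUBWORD_BITS) - 1


def split_weight(weight):
    """Split an unsigned weight into (least significant, most significant) sub-words."""
    return weight & MASK, (weight >> SUBWORD_BITS) & MASK


def multiply(a, b):
    """Stand-in for a single hardware multiplier (hypothetical helper)."""
    return a * b


def offset_add(low_partial, high_partial):
    """Add two partial products, with the high one offset by one sub-word."""
    return low_partial + (high_partial << SUBWORD_BITS)


def integer_products(activation, first_weight, second_weight):
    """Model the three multiplications and one offset addition of claim 1."""
    ls1, ms1 = split_weight(first_weight)
    ls2, ms2 = split_weight(second_weight)
    # Preconditions taken from the claim: the first weight is an "outlier"
    # (nonzero most significant sub-word); the second weight is small.
    assert ms1 != 0 and ms2 == 0 and ls2 != 0

    first_partial = multiply(activation, ls1)    # first multiplier
    second_product = multiply(activation, ls2)   # second multiplier
    second_partial = multiply(activation, ms1)   # third multiplier

    first_product = offset_add(first_partial, second_partial)
    return first_product, second_product
```

For example, `integer_products(7, 0xA3, 0x05)` splits `0xA3` into sub-words `0x3` and `0xA`, and the offset addition recombines `7 * 0x3` and `7 * 0xA` into the same value as `7 * 0xA3`.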

Description

FIELD

One or more aspects of embodiments according to the present disclosure relate to processing circuits, and more particularly to a system and method for performing sets of multiplications in a manner that accommodates outlier values, and that is capable of performing both integer and floating-point operations.

BACKGROUND

Processors for neural networks may perform large volumes of multiplication and addition operations, some of which may be a poor use of processing resources because a significant fraction of the numbers being processed may be relatively small, and only a small fraction of outliers may be relatively large. Moreover, some operations in such a system may be integer operations and some may be floating point operations, which, if performed on separate respective sets of dedicated hardware may consume significant amounts of chip area and power.

Thus, there is a need for a system and method for performing sets of multiplications in a manner that accommodates outlier values and that is capable of performing both integer and floating-point operations.

US 2020/349106 A1 discloses a processor including: a first tile, the first tile being configured: to feed a first nibble from a third queue, through a first shuffler, to a first multiplier, and to multiply, in the first multiplier, the first nibble from the third queue by a first nibble of a third weight; to feed a second nibble from the third queue, through the first shuffler, to a second multiplier, and to multiply, in the second multiplier, the second nibble from the third queue by a second nibble of the third weight; to feed a first nibble from a fourth queue, through the first shuffler, to a third multiplier, and to multiply, in the third multiplier, the first nibble from the fourth queue by a first nibble of a fourth weight.
SUMMARY

According to an embodiment of the present invention, there is provided a method, including: forming a first set of products, each product of the first set of products being an integer product of a first activation value and a respective weight of a first plurality of weights; and forming a second set of products, each product of the second set of products being a floating-point product of a second activation value and a respective weight of a second plurality of weights, each of the weights of the first plurality of weights including a least significant sub-word and a most significant sub-word; the most significant sub-word of a first weight of the first plurality of weights being nonzero; the most significant sub-word of a second weight of the first plurality of weights being zero; the forming of the first set of products including: multiplying, in a first multiplier, the first activation value by the least significant sub-word of the first weight to form a first partial product; multiplying, in a second multiplier, the first activation value by the least significant sub-word of the second weight; multiplying, in a third multiplier, the first activation value by the most significant sub-word of the first weight to form a second partial product; and adding the first partial product and the second partial product; the forming of the second set of products including forming a first floating point product, the forming of the first floating point product including multiplying, in the first multiplier, a first sub-word of a mantissa of the second activation value by a first sub-word of a mantissa of a first weight of the second plurality of weights, to form a third partial product.

In some embodiments: the first multiplier is configured to receive a first argument and a second argument, the first argument has a first argument size, the second argument has a second argument size, and the first argument size is greater than the second argument size.
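
The floating-point path reuses the same multipliers on sub-words of the two mantissas. The sketch below is an illustrative model only, and several details are assumptions not stated in the text: a 4-bit sub-word width, three sub-words per mantissa, unsigned mantissas, treating the least significant sub-word as the "first" sub-word, and the hypothetical names `sub_words`, `third_partial_product` and `mantissa_product`. Exponent and sign handling are omitted. It shows that one sub-word by sub-word multiplication yields a partial product, and that shifted partial products sum to the full mantissa product.

```python
SUBWORD_BITS = 4  # assumed sub-word width, not specified in the text
MASK = (1 << SUBWORD_BITS) - 1


def sub_words(mantissa, count):
    """Split an unsigned mantissa into `count` sub-words, least significant first."""
    return [(mantissa >> (SUBWORD_BITS * i)) & MASK for i in range(count)]


def third_partial_product(act_mantissa, weight_mantissa, count=3):
    """One multiplier operation: first sub-word times first sub-word.

    Index 0 (least significant) is assumed to be the "first" sub-word.
    """
    a = sub_words(act_mantissa, count)
    w = sub_words(weight_mantissa, count)
    return a[0] * w[0]


def mantissa_product(act_mantissa, weight_mantissa, count=3):
    """Sum all sub-word partial products, each shifted to its bit position."""
    a = sub_words(act_mantissa, count)
    w = sub_words(weight_mantissa, count)
    total = 0
    for i, a_i in enumerate(a):
        for j, w_j in enumerate(w):
            partial = a_i * w_j  # one sub-word multiplication
            total += partial << (SUBWORD_BITS * (i + j))
    return total
```

With 12-bit mantissas such as `0xABC` and `0x123`, `mantissa_product` reproduces the ordinary integer product of the two mantissas. A multiplier whose first argument is wider than its second, as in the embodiment just described, would cover two of these sub-word products in a single operation.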
In some embodiments: the forming of the first floating point product includes: receiving, by the first multiplier, a first argument; receiving, by the first multiplier, a second argument; and multiplying the first argument by the second argument; the first argument includes: the first sub-word of the mantissa of the second activation value, and a second sub-word of the mantissa of the second activation value; and the second argument includes the first sub-word of the mantissa of the first weight of the second plurality of weights. In some embodiments: the forming of the first floating point product includes: receiving, by the first multiplier, a first argument; receiving, by the first multiplier, a second argument; and multiplying the first argument by the second argument; the first argument includes: the first sub-word of the mantissa of the first weight of the second plurality of weights, and a second sub-word of the mantissa of the first weight of the second plurality of weights; and the second argument includes the first sub-word of the mantissa of the second activation value. In some embodiments, the second multiplier is configured to receive a first argument having a