US-20260126955-A1 - SYSTEMS AND METHODS FOR ARTIFICIAL INTELLIGENCE COMPUTATIONS

Abstract

An apparatus, comprising: a memory storing a first vector and a second vector associated with a layer of an artificial intelligence (AI) model, the first vector comprising a first operand having a first mantissa value, and the second vector comprising a second operand having a second mantissa value; and a processor comprising: an adder circuit wired to receive the first mantissa value and the second mantissa value and generate a first sum based on the first mantissa value and the second mantissa value; and a shifter circuit wired to receive the first sum and shift the first sum by a first number of bits to generate a first shifted value; and wherein the processor is configured to generate an inference of the AI model based on the first shifted value.

Inventors

Chiho Choi
Joon Hee Choi
Sai Prahladh Padmanabhan
Srikanth Malla

Assignees

SAMSUNG ELECTRONICS CO., LTD.

Dates

Publication Date: 2026-05-07
Application Date: 2025-10-15

Claims (20)

1. An apparatus, comprising: a memory storing a first vector and a second vector associated with a layer of an artificial intelligence (AI) model, the first vector comprising a first operand having a first mantissa value, and the second vector comprising a second operand having a second mantissa value; and a processor comprising: an adder circuit wired to receive the first mantissa value and the second mantissa value and generate a first sum based on the first mantissa value and the second mantissa value; and a shifter circuit wired to receive the first sum and shift the first sum by a first number of bits to generate a first shifted value; and wherein the processor is configured to generate an inference of the AI model based on the first shifted value.
2. The apparatus of claim 1, wherein the first number is based on an expected value of one or more mantissa values associated with the layer of the AI model.
3. The apparatus of claim 2, wherein the expected value is based on a statistical distribution of a parameter of the layer of the AI model.
4. The apparatus of claim 1, wherein the first number is based on a first term of a binary decomposition of an expected value of one or more mantissa values of one or more parameters of the AI model.
5. The apparatus of claim 1, wherein the first number is zero.
6. The apparatus of claim 1, wherein the shifter circuit is wired to receive the first sum and shift the first sum by a second number of bits to generate a second shifted value, wherein the processor is configured to generate the inference of the AI model based on the second shifted value.
7. The apparatus of claim 6, wherein the adder circuit is wired to receive the first shifted value and the second shifted value to generate a second sum, wherein the processor is configured to generate the inference of the AI model based on the second sum.
8. The apparatus of claim 6, wherein the second number is based on a second term of a binary decomposition of an expected value of one or more mantissa values of one or more parameters of the AI model.
9. The apparatus of claim 1, wherein the shifter circuit is wired to shift the first sum leftward by the first number of bits.
10. The apparatus of claim 1, wherein the shifter circuit is wired to shift the first sum rightward by the first number of bits.
11. A method, comprising: storing a first vector and a second vector associated with a layer of an artificial intelligence (AI) model in a memory device, the first vector comprising a first operand having a first mantissa value, and the second vector comprising a second operand having a second mantissa value; routing the first mantissa value and the second mantissa value to an adder circuit of a processor; outputting, by the adder circuit, a first sum based on the first mantissa value and the second mantissa value; routing the first sum to a shifter circuit of the processor; shifting, by the shifter circuit, the first sum by a first number of bits to generate a first shifted value; and generating, by the processor, an inference of the AI model based on the first shifted value.
12. The method of claim 11, wherein the first number is based on an expected value of one or more mantissa values associated with the layer of the AI model.
13. The method of claim 12, wherein the expected value is based on a statistical distribution of a parameter of the layer of the AI model.
14. The method of claim 11, wherein the first number is based on a first term of a binary decomposition of an expected value of one or more mantissa values of one or more parameters of the AI model.
15. The method of claim 11, wherein the first number is zero.
16. The method of claim 11, further comprising: routing the first sum to the shifter circuit; and shifting, by the shifter circuit, the first sum by a second number of bits to generate a second shifted value.
17. The method of claim 16, further comprising: routing the first shifted value and the second shifted value to the adder circuit; and outputting, by the adder circuit, a second sum based on the first shifted value and the second shifted value, wherein the inference is generated based on the second sum.
18. The method of claim 16, wherein the second number is based on a second term of a binary decomposition of an expected value of one or more mantissa values of one or more parameters of the AI model.
19. The method of claim 11, wherein the shifter circuit is wired to shift the first sum leftward by the first number of bits.
20. The method of claim 11, wherein the shifter circuit is wired to shift the first sum rightward by the first number of bits.
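
As a rough software illustration of the datapath recited in claims 1, 6, and 7 (and their method counterparts, claims 11, 16, and 17), the following Python sketch models the adder and the shifter on unsigned integer mantissas. The function names, the use of plain Python integers, and the example shift amounts are assumptions made for illustration and are not taken from the application. The claims also do not state how the shifted value is turned into an inference, so that step is left out.

```python
def add_then_shift(m1: int, m2: int, num_bits: int, leftward: bool = False) -> int:
    """Hypothetical model of claim 1: add two mantissas, then shift the sum."""
    first_sum = m1 + m2  # adder circuit: first sum
    # Shifter circuit: a shift of zero bits (claim 5) leaves the sum unchanged.
    return first_sum << num_bits if leftward else first_sum >> num_bits


def add_shift_twice(m1: int, m2: int, first_bits: int, second_bits: int) -> int:
    """Hypothetical model of claims 6 and 7: two shifts of one sum, then a second sum."""
    first_sum = m1 + m2
    first_shifted = first_sum >> first_bits    # first shifted value
    second_shifted = first_sum >> second_bits  # second shifted value
    return first_shifted + second_shifted      # adder circuit: second sum


if __name__ == "__main__":
    # 0b1100 + 0b0100 = 16; 16 >> 1 = 8
    print(add_then_shift(0b1100, 0b0100, 1))      # prints 8
    # (16 >> 1) + (16 >> 3) = 8 + 2 = 10
    print(add_shift_twice(0b1100, 0b0100, 1, 3))  # prints 10
```

In this sketch, right shifts by 1 and 3 bits followed by the second addition scale the first sum by 2^-1 + 2^-3 = 0.625 using only additions and shifts, with no multiplier.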

Description

CROSS-REFERENCE TO RELATED APPLICATION(S)
The present application claims priority to and the benefit of U.S. Provisional Application No. 63/716,601 filed Nov. 5, 2024, entitled “METHOD FOR APPROXIMATING MULTIPLICATIONS USING STATISTICAL DISTRIBUTION OF OPERANDS FOR ARTIFICIAL INTELLIGENCE (AI) MODELS,” the entire content of which is incorporated herein by reference.
FIELD
One or more aspects of embodiments according to the present disclosure relate to artificial intelligence models, and more particularly to computations used in artificial intelligence models.
BACKGROUND
The use of artificial intelligence (AI) has increased dramatically over the last few years. AI has become commonly used in domains such as image classification, speech recognition, media analytics, health care, autonomous machines, smart assistants, etc. Using AI often necessitates the use of large datasets (e.g., from databases, sensors, images, etc.) and the use of advanced algorithms that similarly necessitate high performance computing with teraflops of computational power.
The above information disclosed in this Background section is only for enhancement of understanding of the background of the present disclosure, and therefore, it may contain information that does not form prior art.
SUMMARY
In one or more embodiments, an apparatus includes a memory storing a first vector and a second vector associated with a layer of an artificial intelligence (AI) model, the first vector comprising a first operand having a first mantissa value, and the second vector comprising a second operand having a second mantissa value; and a processor comprising: an adder circuit wired to receive the first mantissa value and the second mantissa value and generate a first sum based on the first mantissa value and the second mantissa value; and a shifter circuit wired to receive the first sum and shift the first sum by a first number of bits to generate a first shifted value; and wherein the processor is configured to generate an inference of the AI model based on the first shifted value.
In some embodiments, the first number is based on an expected value of one or more mantissa values associated with the layer of the AI model.
In some embodiments, the expected value is based on a statistical distribution of a parameter of the layer of the AI model.
In some embodiments, the first number is based on a first term of a binary decomposition of an expected value of one or more mantissa values of one or more parameters of the AI model.
In some embodiments, the first number is zero.
In some embodiments, the shifter circuit is wired to receive the first sum and shift the first sum by a second number of bits to generate a second shifted value, wherein the processor is configured to generate the inference of the AI model based on the second shifted value.
In some embodiments, the adder circuit is wired to receive the first shifted value and the second shifted value to generate a second sum, wherein the processor is configured to generate the inference of the AI model based on the second sum.
In some embodiments, the second number is based on a second term of a binary decomposition of an expected value of one or more mantissa values of one or more parameters of the AI model.
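
Several of the embodiments above tie the shift amounts to the terms of a binary decomposition of an expected mantissa value. One way to picture this, offered as an assumption and not as the application's stated method, is to treat the expected value as a fraction between 0 and 1, write it as a sum of powers of two, and use the exponents as right-shift amounts. In the Python sketch below, the helper names, the use of `math.frexp` to extract a fractional mantissa, the sample mean as the expected value, and the caps on the number of terms are all hypothetical.

```python
import math


def mantissa_fraction(x: float) -> float:
    """Fraction f in [0, 1) such that |x| = (1 + f) * 2**e, for nonzero x."""
    m, _ = math.frexp(abs(x))  # frexp returns m with 0.5 <= m < 1
    return 2.0 * m - 1.0


def expected_mantissa(weights: list[float]) -> float:
    """Assumed estimate: sample mean of the mantissa fractions of nonzero weights."""
    fractions = [mantissa_fraction(w) for w in weights if w != 0.0]
    return sum(fractions) / len(fractions)


def shift_amounts(expected: float, max_terms: int = 2, max_shift: int = 8) -> list[int]:
    """Greedy binary decomposition: exponents k with sum(2**-k) close to `expected`."""
    shifts = []
    remainder = expected
    for k in range(max_shift + 1):
        term = 2.0 ** -k
        if term <= remainder:
            shifts.append(k)  # k-th term becomes a right shift by k bits
            remainder -= term
            if len(shifts) == max_terms:
                break
    return shifts


if __name__ == "__main__":
    layer_weights = [1.5, 1.25, 3.0, 1.75]    # made-up sample parameters
    print(expected_mantissa(layer_weights))   # prints 0.5
    print(shift_amounts(0.625))               # prints [1, 3]
```

Under these assumptions, an expected value of 0.625 decomposes as 2^-1 + 2^-3, giving a first shift of 1 bit and a second shift of 3 bits, while an expected value of exactly 1 would give a single shift of zero bits.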
In some embodiments, the shifter circuit is wired to shift the first sum leftward by the first number of bits.
In some embodiments, the shifter circuit is wired to shift the first sum rightward by the first number of bits.
In one or more embodiments, a method includes storing a first vector and a second vector associated with a layer of an artificial intelligence (AI) model in a memory device, the first vector comprising a first operand having a first mantissa value, and the second vector comprising a second operand having a second mantissa value; routing the first mantissa value and the second mantissa value to an adder circuit of a processor; outputting, by the adder circuit, a first sum based on the first mantissa value and the second mantissa value; routing the first sum to a shifter circuit of the processor; shifting, by the shifter circuit, the first sum by a first number of bits to generate a first shifted value; and generating, by the processor, an inference of the AI model based on the first shifted value.
In some embodiments, the first number is based on an expected value of one or more mantissa values associated with the layer of the AI model.
In some embodiments, the expected value is based on a statistical distribution of a parameter of the layer of the AI model.
In some embodiments, the first number is based on a first term of a binary decomposition of an expected value of one or more mantissa values of one or more parameters of the AI model.
In some embodiments, the first number is zero.
In some embodiments, the method further includes routing the first sum to the shifter circuit; and shifting, by the shifter circuit, the first sum by a second number of bits to generate a second sh