EP-4738201-A1 - SYSTEMS AND METHODS FOR ARTIFICIAL INTELLIGENCE COMPUTATIONS

Abstract

An apparatus, comprising: a memory storing a first vector and a second vector associated with a layer of an artificial intelligence, AI, model, the first vector comprising a first operand having a first mantissa value, and the second vector comprising a second operand having a second mantissa value; and a processor comprising: an adder circuit wired to receive the first mantissa value and the second mantissa value and generate a first sum based on the first mantissa value and the second mantissa value; and a shifter circuit wired to receive the first sum and shift the first sum by a first number of bits to generate a first shifted value; and wherein the processor is configured to generate an inference of the AI model based on the first shifted value.

Inventors

  • CHOI, CHIHO
  • CHOI, JOON HEE
  • PADMANABHAN, Sai Prahladh
  • MALLA, SRIKANTH

Assignees

  • Samsung Electronics Co., Ltd.

Dates

Publication Date
2026-05-06
Application Date
2025-10-29

Claims (15)

  1. An apparatus, comprising: a memory storing a first vector and a second vector associated with a layer of an artificial intelligence, AI, model, the first vector comprising a first operand having a first mantissa value, and the second vector comprising a second operand having a second mantissa value; and a processor comprising: an adder circuit wired to receive the first mantissa value and the second mantissa value and generate a first sum based on the first mantissa value and the second mantissa value; and a shifter circuit wired to receive the first sum and shift the first sum by a first number of bits to generate a first shifted value; and wherein the processor is configured to generate an inference of the AI model based on the first shifted value.
  2. The apparatus of claim 1, wherein the first number is based on an expected value of one or more mantissa values associated with the layer of the AI model.
  3. The apparatus of claim 2, wherein the expected value is based on a statistical distribution of a parameter of the layer of the AI model.
  4. The apparatus of claim 1, wherein the first number is based on a first term of a binary decomposition of an expected value of one or more mantissa values of one or more parameters of the AI model.
  5. The apparatus of any one of claims 1 to 4, wherein the first number is zero.
  6. The apparatus of any one of claims 1 to 5, wherein the shifter circuit is wired to receive the first sum and shift the first sum by a second number of bits to generate a second shifted value, wherein the processor is configured to generate the inference of the AI model based on the second shifted value.
  7. The apparatus of claim 6, wherein the adder circuit is wired to receive the first shifted value and the second shifted value to generate a second sum, wherein the processor is configured to generate the inference of the AI model based on the second sum.
  8. The apparatus of claim 6 or 7, wherein the second number is based on a second term of a binary decomposition of an expected value of one or more mantissa values of one or more parameters of the AI model.
  9. The apparatus of any one of claims 1 to 8, wherein the shifter circuit is wired to shift the first sum leftward by the first number of bits.
  10. The apparatus of any one of claims 1 to 8, wherein the shifter circuit is wired to shift the first sum rightward by the first number of bits.
  11. A method, comprising: storing a first vector and a second vector associated with a layer of an artificial intelligence, AI, model in a memory device, the first vector comprising a first operand having a first mantissa value, and the second vector comprising a second operand having a second mantissa value; routing the first mantissa value and the second mantissa value to an adder circuit of a processor; outputting, by the adder circuit, a first sum based on the first mantissa value and the second mantissa value; routing the first sum to a shifter circuit of the processor; shifting, by the shifter circuit, the first sum by a first number of bits to generate a first shifted value; and generating, by the processor, an inference of the AI model based on the first shifted value.
  12. The method of claim 11, wherein the first number is based on an expected value of one or more mantissa values associated with the layer of the AI model.
  13. The method of claim 12, wherein the expected value is based on a statistical distribution of a parameter of the layer of the AI model.
  14. The method of claim 11, wherein the first number is based on a first term of a binary decomposition of an expected value of one or more mantissa values of one or more parameters of the AI model.
  15. The method of any one of claims 11 to 14, wherein the first number is zero.

Description

FIELD

One or more aspects of embodiments according to the present disclosure relate to artificial intelligence models, and more particularly to computations used in artificial intelligence models.

BACKGROUND

The use of artificial intelligence (AI) has increased dramatically over the last few years. AI has become commonly used in domains such as image classification, speech recognition, media analytics, health care, autonomous machines, smart assistants, etc. Using AI often necessitates the use of large datasets (e.g., from databases, sensors, images, etc.) and advanced algorithms that in turn necessitate high-performance computing with teraflops of computational power. The above information disclosed in this Background section is only for enhancement of understanding of the background of the present disclosure, and therefore, it may contain information that does not form prior art.

SUMMARY

In one or more embodiments, an apparatus includes a memory storing a first vector and a second vector associated with a layer of an artificial intelligence (AI) model, the first vector comprising a first operand having a first mantissa value, and the second vector comprising a second operand having a second mantissa value; and a processor comprising: an adder circuit wired to receive the first mantissa value and the second mantissa value and generate a first sum based on the first mantissa value and the second mantissa value; and a shifter circuit wired to receive the first sum and shift the first sum by a first number of bits to generate a first shifted value; and wherein the processor is configured to generate an inference of the AI model based on the first shifted value. In some embodiments, the first number is based on an expected value of one or more mantissa values associated with the layer of the AI model. In some embodiments, the expected value is based on a statistical distribution of a parameter of the layer of the AI model.
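One illustrative reading of the claimed adder-and-shifter datapath is multiplier-free approximate multiplication: for operands 1.m1 × 2^e1 and 1.m2 × 2^e2, the product mantissa (1+m1)(1+m2) = 1 + m1 + m2 + m1·m2 can be approximated by adding the two mantissa fractions and a precomputed stand-in for the cross term. The sketch below models that reading only; `MANT_BITS`, `approx_mul`, and the correction term `corr` are assumptions for illustration, not the disclosed circuit.

```python
MANT_BITS = 8  # assumed mantissa width for this sketch

def approx_mul(e1, m1, e2, m2, corr):
    """Approximate (1.m1 * 2**e1) * (1.m2 * 2**e2) without a multiplier.

    m1, m2 are MANT_BITS-wide integer mantissa fractions; corr stands in
    for the cross term m1*m2 (e.g., its expected value, precomputed
    offline) -- an assumption about how the "first sum" is formed.
    """
    s = m1 + m2 + corr                   # adder circuit: the "first sum"
    e = e1 + e2
    if s >= (1 << MANT_BITS):            # fraction overflowed past 1.0:
        s = (s - (1 << MANT_BITS)) >> 1  # shifter circuit renormalizes
        e += 1
    return e, s

# 1.5 * 1.5 = 2.25; fraction 0.5 -> 128 at 8 bits, exact cross term 0.25 -> 64
e, m = approx_mul(0, 128, 0, 128, corr=64)
product = (1 + m / 2**MANT_BITS) * 2.0**e  # 2.25 when corr is exact
```

With an inexact `corr` (the expected rather than exact cross term), the result is an approximation whose error depends on how well the layer's mantissa statistics match the calibration.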
In some embodiments, the first number is based on a first term of a binary decomposition of an expected value of one or more mantissa values of one or more parameters of the AI model. In some embodiments, the first number is zero. In some embodiments, the shifter circuit is wired to receive the first sum and shift the first sum by a second number of bits to generate a second shifted value, wherein the processor is configured to generate the inference of the AI model based on the second shifted value. In some embodiments, the adder circuit is wired to receive the first shifted value and the second shifted value to generate a second sum, wherein the processor is configured to generate the inference of the AI model based on the second sum. In some embodiments, the second number is based on a second term of a binary decomposition of an expected value of one or more mantissa values of one or more parameters of the AI model. In some embodiments, the shifter circuit is wired to shift the first sum leftward by the first number of bits. In some embodiments, the shifter circuit is wired to shift the first sum rightward by the first number of bits. In one or more embodiments, a method includes storing a first vector and a second vector associated with a layer of an artificial intelligence (AI) model in a memory device, the first vector comprising a first operand having a first mantissa value, and the second vector comprising a second operand having a second mantissa value; routing the first mantissa value and the second mantissa value to an adder circuit of a processor; outputting, by the adder circuit, a first sum based on the first mantissa value and the second mantissa value; routing the first sum to a shifter circuit of the processor; shifting, by the shifter circuit, the first sum by a first number of bits to generate a first shifted value; and generating, by the processor, an inference of the AI model based on the first shifted value. 
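The two-shift embodiments (shifting the same first sum by a first and a second number of bits and adding the shifted values) admit a hedged interpretation: the shifter-and-adder pair multiplies the first sum by an expected mantissa value whose binary decomposition supplies the shift counts. A minimal sketch under that assumption (the function name and list encoding of the decomposition are illustrative):

```python
def scale_by_expected(first_sum, shift_amounts):
    """Scale first_sum by an expected value approximated as a sum of
    negative powers of two, using only shifts and adds.

    shift_amounts lists the terms of the binary decomposition, e.g.
    E ~= 2**-2 + 2**-3 is written as [2, 3] (an illustrative encoding).
    """
    shifted = [first_sum >> k for k in shift_amounts]  # shifter circuit
    total = 0
    for value in shifted:                              # adder circuit
        total += value
    return total

# 0.375 = 2**-2 + 2**-3, so scaling 160 by 0.375:
scale_by_expected(160, [2, 3])  # (160 >> 2) + (160 >> 3) = 40 + 20 = 60
```

A single-term decomposition reduces to one shift (the "first number"); claim 5's first number of zero corresponds to passing the sum through unshifted.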
In some embodiments, the first number is based on an expected value of one or more mantissa values associated with the layer of the AI model. In some embodiments, the expected value is based on a statistical distribution of a parameter of the layer of the AI model. In some embodiments, the first number is based on a first term of a binary decomposition of an expected value of one or more mantissa values of one or more parameters of the AI model. In some embodiments, the first number is zero. In some embodiments, the method further includes routing the first sum to the shifter circuit; and shifting, by the shifter circuit, the first sum by a second number of bits to generate a second shifted value. In some embodiments, the method further includes routing the first shifted value and the second shifted value to the adder circuit; and outputting, by the adder circuit, a second sum based on the first shifted value and the second shifted value, wherein the inference is generated based on the second sum. In some embodiments, the second number is based on a second term of a binary decomposition of an expected value of one or more mantissa values of one or more parameters of the AI model.
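Where the expected value is derived from a statistical distribution of a layer parameter, one hypothetical offline calibration step would estimate the mean mantissa fraction of the layer's weights and decompose it into powers of two to obtain the shift counts. The sketch below is an assumed procedure, not the disclosed method; all names are illustrative.

```python
import math

def shift_amounts_from_weights(weights, n_terms=2):
    """Hypothetical calibration: estimate the expected mantissa fraction
    of a layer's weights and return the first n_terms of its binary
    decomposition as shift counts."""
    # mantissa fraction f of |w| = 1.f * 2**e, via frexp's [0.5, 1) mantissa
    fracs = [math.frexp(abs(w))[0] * 2 - 1 for w in weights if w != 0]
    expected = sum(fracs) / len(fracs)
    shifts, residue = [], expected
    for _ in range(n_terms):
        if residue <= 0:
            break
        k = -math.floor(math.log2(residue))  # largest 2**-k <= residue
        shifts.append(k)
        residue -= 2.0 ** -k
    return shifts

# weights with mantissa fractions 0.25 and 0.5 -> E = 0.375 -> shifts [2, 3]
shift_amounts_from_weights([1.25, 1.5])
```

Because the shift counts are fixed per layer at calibration time, the inference datapath itself needs only the adder and shifter circuits described in the claims.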