KR-20260065316-A - METHOD AND APPARATUS FOR DATA QUANTIZATION RELATED ARTIFICIAL INTELLIGENCE TRAINING AND INFERENCE IN WIRELESS COMMUNICATION SYSTEM

Abstract

The present disclosure relates to a method and apparatus for artificial intelligence training and inference related to quantization in a wireless communication system. According to one embodiment of the present disclosure, a method, performed by an electronic device in a wireless communication system, for training a quantization-related artificial intelligence model comprises: determining a maximum value and a minimum value of the output value of a first node included in the quantization-related artificial intelligence model; updating the maximum value and the minimum value; calculating a quantization unit based on the updated maximum value and the updated minimum value; and updating quantization noise applied to the artificial intelligence model based on the quantization unit.
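
The four operations named in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the abstract gives no formulas, so the bit width `bits`, the uniform step `(max - min) / (2**bits - 1)`, the hypothetical class name `NodeQuantTracker`, and the choice to model quantization noise as the rounded output minus the original output are all assumptions made for the example.

```python
import numpy as np

class NodeQuantTracker:
    """Hypothetical helper: tracks one node's output range and derives noise."""

    def __init__(self, bits=8):
        self.bits = bits          # assumed bit width B of the quantizer
        self.max_val = -np.inf    # running maximum of the node output
        self.min_val = np.inf     # running minimum of the node output

    def update_range(self, outputs):
        # Keep the largest and smallest output seen over all data so far.
        outputs = np.asarray(outputs, dtype=float)
        self.max_val = max(self.max_val, float(outputs.max()))
        self.min_val = min(self.min_val, float(outputs.min()))

    def quant_unit(self):
        # Assumed uniform step: the tracked range split into 2^B - 1 intervals.
        return (self.max_val - self.min_val) / (2 ** self.bits - 1)

    def quant_noise(self, outputs):
        # Assumed noise model: quantized output minus original output.
        outputs = np.asarray(outputs, dtype=float)
        delta = self.quant_unit()
        if delta == 0.0:
            return np.zeros_like(outputs)
        quantized = np.round(outputs / delta) * delta
        return quantized - outputs
```

In a training loop, `update_range` would be called on each batch of first-node outputs before `quant_noise` is recomputed, so that the noise always reflects the range observed so far.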

Inventors

심세준
서한창
신명철
안광진
김태현
하길식

Assignees

삼성전자주식회사

Dates

Publication Date: 20260508
Application Date: 20241101

Claims (20)

A method for training a quantization-related artificial intelligence model by an electronic device in a wireless communication system, the method comprising: determining a maximum value and a minimum value of an output value of a first node included in the quantization-related artificial intelligence model; updating the maximum value and the minimum value; calculating a quantization unit based on the updated maximum value and the updated minimum value; and updating quantization noise applied to the artificial intelligence model based on the quantization unit.
The method of claim 1, further comprising: performing training on the artificial intelligence model to which the updated quantization noise is applied; determining parameters for the artificial intelligence model; and performing inference using the artificial intelligence model based on the determined parameters.
The method of claim 1, wherein the output value includes an output value of a floating-point model included in the first node.
The method of claim 3, wherein updating the quantization noise comprises: dividing the output value of the floating-point model by the quantization unit; calculating a first value by rounding the divided output value to the nearest of the values corresponding to the quantization unit, or by rounding it down to a value corresponding to the quantization unit; and multiplying the first value by the quantization unit.
The method of claim 1, further comprising calculating an output value of a second node of the artificial intelligence model for an input value to which the updated quantization noise is applied.
The method of claim 1, wherein updating the maximum value and the minimum value comprises updating the maximum value, and wherein updating the maximum value comprises, when a first maximum value of the output value of the first node up to the j-th training data set is greater than a second maximum value of the output value of the first node up to the (j-1)-th training data set, updating the maximum value to the first maximum value.
The method of claim 1, wherein updating the maximum value and the minimum value comprises updating the minimum value, and wherein updating the minimum value comprises, when a first minimum value of the output value of the first node up to the j-th training data set is smaller than a second minimum value of the output value of the first node up to the (j-1)-th training data set, updating the minimum value to the first minimum value.
A computer-readable storage medium storing at least one instruction, wherein the at least one instruction, when executed by at least a part of at least one processor (120) of an electronic device, causes the electronic device to perform at least one operation, the at least one operation comprising: determining a maximum value and a minimum value of an output value of a first node included in a quantization-related artificial intelligence model; updating the maximum value and the minimum value; calculating a quantization unit based on the updated maximum value and the updated minimum value; and updating quantization noise applied to the artificial intelligence model based on the quantization unit.
The storage medium of claim 8, wherein the at least one operation further comprises: performing training on the artificial intelligence model to which the updated quantization noise is applied; determining parameters for the artificial intelligence model; and performing inference using the artificial intelligence model based on the determined parameters.
The storage medium of claim 8, wherein the output value includes an output value of a floating-point model included in the first node.
The storage medium of claim 10, wherein updating the quantization noise comprises: dividing the output value of the floating-point model by the quantization unit; calculating a first value by rounding the divided output value to the nearest of the values corresponding to the quantization unit, or by rounding it down to a value corresponding to the quantization unit; and multiplying the first value by the quantization unit.
The storage medium of claim 8, wherein the at least one operation further comprises calculating an output value of a second node of the artificial intelligence model for an input value to which the updated quantization noise is applied.
The storage medium of claim 8, wherein updating the maximum value and the minimum value comprises updating the maximum value, and wherein updating the maximum value comprises, when a first maximum value of the output value of the first node up to the j-th training data set is greater than a second maximum value of the output value of the first node up to the (j-1)-th training data set, updating the maximum value to the first maximum value.
The storage medium of claim 13, wherein updating the maximum value and the minimum value comprises updating the minimum value, and wherein updating the minimum value comprises, when a first minimum value of the output value of the first node up to the j-th training data set is smaller than a second minimum value of the output value of the first node up to the (j-1)-th training data set, updating the minimum value to the first minimum value.
An electronic device comprising: at least one processor (120); and a memory (130) storing at least one instruction, wherein the at least one instruction, when executed by at least a part of the at least one processor (120), causes the electronic device to perform at least one operation, the at least one operation comprising: determining a maximum value and a minimum value of an output value of a first node included in a quantization-related artificial intelligence model; updating the maximum value and the minimum value; calculating a quantization unit based on the updated maximum value and the updated minimum value; and updating quantization noise applied to the artificial intelligence model based on the quantization unit.
The electronic device of claim 15, wherein the at least one operation further comprises: performing training on the artificial intelligence model to which the updated quantization noise is applied; determining parameters for the artificial intelligence model; and performing inference using the artificial intelligence model based on the determined parameters.
The electronic device of claim 15, wherein the output value includes an output value of a floating-point model included in the first node.
The electronic device of claim 17, wherein updating the quantization noise comprises: dividing the output value of the floating-point model by the quantization unit; calculating a first value by rounding the divided output value to the nearest of the values corresponding to the quantization unit, or by rounding it down to a value corresponding to the quantization unit; and multiplying the first value by the quantization unit.
The electronic device of claim 15, wherein the at least one operation further comprises calculating an output value of a second node of the artificial intelligence model for an input value to which the updated quantization noise is applied.
The electronic device of claim 15, wherein updating the maximum value and the minimum value comprises updating the maximum value, and wherein updating the maximum value comprises, when a first maximum value of the output value of the first node up to the j-th training data set is greater than a second maximum value of the output value of the first node up to the (j-1)-th training data set, updating the maximum value to the first maximum value.
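
The divide, round, and multiply steps recited in the claims, together with the second-node computation on the resulting value, can be illustrated with a short sketch. It is offered only as an aid to reading and does not limit or restate the claims: the function names `apply_quantization` and `second_node`, the `mode` parameter, and the affine-plus-ReLU form of the second node are hypothetical choices that do not appear in the source.

```python
import numpy as np

def apply_quantization(x, unit, mode="nearest"):
    """Divide by the quantization unit, round, then multiply back by the unit."""
    scaled = np.asarray(x, dtype=float) / unit   # divide by the quantization unit
    if mode == "nearest":
        first_value = np.round(scaled)           # round to the nearest level
    elif mode == "floor":
        first_value = np.floor(scaled)           # round down to a level
    else:
        raise ValueError(f"unknown mode: {mode}")
    return first_value * unit                    # multiply by the quantization unit

def second_node(x, weight, bias):
    # Hypothetical second node: affine map followed by ReLU (an assumption).
    return np.maximum(0.0, x @ weight + bias)

# First-node outputs pass through the quantizer before reaching the second node.
first_out = np.array([0.3, 0.7, -0.3])
quantized = apply_quantization(first_out, unit=0.5, mode="nearest")
second_out = second_node(quantized, np.ones((3, 2)), np.zeros(2))
```

Rounding to nearest keeps the error within half a quantization unit, while rounding down keeps it within one unit and always non-positive; which mode suits a given fixed-point target is a design choice the sketch leaves open.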

Description

Method and apparatus for data quantization related artificial intelligence training and inference in wireless communication systems

The present disclosure relates to a method and apparatus for artificial intelligence training and inference related to quantization in a wireless communication system.

With the development of digital technology, electronic devices are provided in various forms, such as smartphones, tablet PCs, and PDAs. As artificial intelligence technology advances, electronic devices can provide various artificial intelligence services by applying that technology. Based on artificial intelligence technology and voice recognition technology, an electronic device can provide artificial intelligence services that process tasks requested by the user, and can offer software configurations (e.g., services, functions, or programs) that provide services specialized for the user (e.g., customized information based on the user's voice commands).

Machine learning is a field of artificial intelligence that develops algorithms and techniques enabling computers to learn. Deep learning refers to a set of machine learning algorithms that attempt a high level of abstraction (extracting only the essential content from a large amount of complex data) through a combination of non-linear transformation techniques.

Meanwhile, as interest in on-device AI products has recently increased, there is demand for advances in the research and design of low-power, low-complexity AI inference engines.

The information described above may be provided as related art for the purpose of aiding understanding of this document. None of the foregoing is asserted to be prior art with respect to this document, nor should it be used to determine prior art.

In relation to the description of the drawings, the same or similar reference numerals may be used for identical or similar components.
FIG. 1 is a block diagram of an electronic device in a network environment according to one embodiment of the present disclosure.
FIG. 2 is a block diagram of an electronic device according to one embodiment of the present disclosure.
FIG. 3 is a diagram illustrating a floating-point artificial intelligence (AI) model in a network node according to one embodiment of the present disclosure.
FIG. 4 is a diagram showing an example of the dynamic range of a signal and a B-bit quantization model according to one embodiment of the present disclosure.
FIG. 5 illustrates an example of an equivalent quantization noise model according to one embodiment of the present disclosure.
FIG. 6 is a flowchart illustrating a quantization-aware training operation according to one embodiment of the present disclosure.
FIG. 7 is a flowchart illustrating an operation of optimizing an artificial intelligence inference model in consideration of quantization noise according to one embodiment of the present disclosure.
FIG. 8 is a flowchart illustrating a quantization-optimal training operation according to one embodiment of the present disclosure.
FIG. 9 is a diagram showing an example of a fixed-point AI model including N nodes according to one embodiment of the present disclosure.
FIG. 10 is a diagram showing an example of a simple equivalent quantization noise AI model according to one embodiment of the present disclosure.
FIG. 11 is a diagram illustrating an example of an adaptive equivalent quantization noise AI model according to one embodiment of the present disclosure.
FIG. 12 is a block diagram of a quantization-optimal AI model according to one embodiment of the present disclosure.
FIG. 13 is a detailed block diagram of the n-th node included in a quantization-optimal AI model according to one embodiment of the present disclosure.
FIG. 14 is a diagram illustrating a method for updating the maximum value (Max) and the minimum value (Min) of the n-th node included in a quantization-optimal AI model according to one embodiment of the present disclosure.
FIG. 15 is a diagram illustrating a method of applying the quantization noise of the n-th node included in a quantization-optimal AI model according to one embodiment of the present disclosure.
FIG. 16 is a flowchart illustrating an operation of training a quantization-related artificial intelligence model by an electronic device according to one embodiment of the present disclosure.

Hereinafter, embodiments of the present disclosure are described in detail with reference to the drawings so that those skilled in the art can easily practice them. However, the present disclosure may be embodied in various different forms and is not limited to the embodiments described herein. In relation to the description of the drawings, the same or similar reference numerals may be used for identical or similar components. Furthermore, in the drawings and related descriptions, descriptions of well-known functions and configurations may be omitted for clarity and brevity.

FIG. 1