CN-114662643-B - Data transmission method of acquirer and acquirer

Abstract

A data transmission method of an acquirer, and the acquirer, are disclosed. The acquirer includes a loader, at least one transmitter, a buffer controller, and a reuse buffer. The data transmission method includes: loading, by the loader, input data of an input feature map according to a loading order based on input data stored in the reuse buffer, a shape of a kernel to be used for a convolution operation, and two-dimensional (2D) zero value information of weights of the kernel; storing, by the buffer controller, the loaded input data in the reuse buffer, to which addresses are cyclically allocated according to the loading order; and selecting, by each of the at least one transmitter, input data corresponding to each output data of the convolution operation among the input data stored in the reuse buffer, based on one-dimensional (1D) zero value information of the weights, and outputting the selected input data.
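
The cyclic address allocation of the reuse buffer can be illustrated with a minimal sketch. It assumes a fixed-size buffer in which each newly loaded value takes the next address and the oldest entry is overwritten once the addresses wrap around. The class name `ReuseBuffer`, its methods, and the use of `(row, col)` coordinates as keys are hypothetical and are not taken from the patent.

```python
class ReuseBuffer:
    """Illustrative fixed-size buffer; addresses are handed out cyclically in load order."""

    def __init__(self, size):
        self.size = size
        self.slots = [None] * size   # each slot holds a (key, value) pair
        self.next_addr = 0           # address the next loaded value will receive
        self.index = {}              # key -> address, used to detect reusable data

    def store(self, key, value):
        """Store a loaded value at the next cyclic address and return that address."""
        addr = self.next_addr
        evicted = self.slots[addr]
        # Overwriting a slot drops the oldest entry from the lookup index.
        if evicted is not None and self.index.get(evicted[0]) == addr:
            del self.index[evicted[0]]
        self.slots[addr] = (key, value)
        self.index[key] = addr
        self.next_addr = (addr + 1) % self.size   # wrap around
        return addr

    def contains(self, key):
        return key in self.index

    def read(self, key):
        return self.slots[self.index[key]][1]


# Usage: six loads into a four-entry buffer wrap around to addresses 0 and 1.
buf = ReuseBuffer(4)
addrs = [buf.store((0, c), c * 10) for c in range(6)]
print(addrs)
```

In this sketch the loader would call `contains` to decide whether a feature value still has to be fetched from memory, and a transmitter would call `read` to pick up values that are already buffered.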

Inventors

  • Pu Xuanxuan
  • Zhang Zhundai
  • JIN YUZHEN
  • Jin Canlao

Assignees

  • Samsung Electronics Co., Ltd. (三星电子株式会社)

Dates

Publication Date
2026-05-08
Application Date
2021-06-01
Priority Date
2020-12-22

Claims (16)

  1. A data transmission method of an acquirer, the acquirer including a loader, at least one transmitter, a buffer controller, and a reuse buffer, the data transmission method comprising: loading, by the loader, input data of an input feature map stored in a memory according to a loading order based on input data stored in the reuse buffer, a shape of a kernel to be used for a convolution operation, and two-dimensional zero value information of weights of the kernel; storing, by the buffer controller, the input data loaded by the loader in the reuse buffer, to which addresses are cyclically allocated according to the loading order; and selecting, by each of the at least one transmitter, input data corresponding to each output data of the convolution operation among the input data stored in the reuse buffer by the buffer controller, based on one-dimensional zero value information of the weights, and outputting the selected input data.
  2. The data transmission method according to claim 1, wherein the kernel has a rectangular shape, and the two-dimensional zero value information includes two-dimensional position information indicating positions of one or more weights, among the weights, each having a zero value.
  3. The data transmission method according to claim 1, wherein the kernel has a shape other than a rectangular shape, and the two-dimensional zero value information includes two-dimensional position information indicating positions of one or more weights that, within the smallest rectangle enclosing the kernel, do not overlap the kernel.
  4. The data transmission method according to claim 1, wherein the kernel has a rectangular shape, and the two-dimensional zero value information includes two-dimensional position information indicating positions of one or more weights, among the weights, deleted by pruning.
  5. The data transmission method according to any one of claims 1 to 4, wherein the loading of the input data includes: selecting positions of weights having a non-zero value among the weights based on the shape of the kernel and the two-dimensional zero value information; selecting, among the input data of the input feature map corresponding to the positions of the weights having a non-zero value, input data that does not overlap with the input data stored in the reuse buffer; and loading the selected input data.
  6. The data transmission method according to any one of claims 1 to 4, wherein the selecting of the input data includes: selecting positions of weights having a non-zero value among the weights based on the one-dimensional zero value information; selecting, among the input data stored in the reuse buffer, input data corresponding to the positions of the weights having a non-zero value; and transmitting the selected input data to an executor.
  7. A non-transitory computer-readable storage medium storing instructions that, when executed by a processor, cause the processor to perform the data transmission method of any one of claims 1 to 6.
  8. An acquirer comprising: a loader; at least one transmitter; a buffer controller; and a reuse buffer, wherein the loader is configured to load input data of an input feature map stored in a memory according to a loading order based on input data stored in the reuse buffer, a shape of a kernel to be used for a convolution operation, and two-dimensional zero value information of weights of the kernel, the buffer controller is configured to store the input data loaded by the loader in the reuse buffer, to which addresses are cyclically allocated according to the loading order, and each of the at least one transmitter is configured to select input data corresponding to each output data of the convolution operation among the input data stored in the reuse buffer by the buffer controller, based on one-dimensional zero value information of the weights, and to output the selected input data.
  9. The acquirer of claim 8, wherein the kernel has a rectangular shape, and the two-dimensional zero value information includes two-dimensional position information indicating positions of one or more weights, among the weights, each having a zero value.
  10. The acquirer of claim 8, wherein the kernel has a shape other than a rectangular shape, and the two-dimensional zero value information includes two-dimensional position information indicating positions of one or more weights that, within the smallest rectangle enclosing the kernel, do not overlap the kernel.
  11. The acquirer of claim 8, wherein the kernel has a rectangular shape, and the two-dimensional zero value information includes two-dimensional position information indicating positions of one or more weights, among the weights, deleted by pruning.
  12. The acquirer of claim 8, wherein the loader is configured to: select positions of weights having a non-zero value among the weights based on the shape of the kernel and the two-dimensional zero value information; select, among the input data of the input feature map corresponding to the positions of the weights having a non-zero value, input data that does not overlap with the input data stored in the reuse buffer; and load the selected input data.
  13. The acquirer of claim 8, wherein the at least one transmitter is configured to: select positions of weights having a non-zero value among the weights based on the one-dimensional zero value information; select, among the input data stored in the reuse buffer, input data corresponding to the positions of the weights having a non-zero value; and transmit the selected input data to an executor.
  14. The acquirer according to any one of claims 8 to 13, further comprising: a memory configured to store the input feature map; and an executor configured to perform a parallel convolution operation on the selected input data output from the at least one transmitter.
  15. An acquirer comprising: one or more processors configured to: load input data of an input feature map stored in a memory by loading feature values of the input feature map corresponding to positions of non-zero values of weights of a kernel to be used for a convolution operation and skipping feature values of the input feature map corresponding to positions of zero values of the weights of the kernel; store the loaded input data in a reuse buffer; and select, based on one-dimensional zero value information of the weights of the kernel, a portion of the input data stored in the reuse buffer to be output.
  16. The acquirer of claim 15, wherein, in a case in which the kernel has a non-rectangular shape, the one or more processors are configured to assign zero values to any weights that lie in the smallest rectangle completely containing the kernel but do not overlap the kernel.
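
The zero value information recited in the claims can be sketched as follows for a non-rectangular kernel. The sketch pads the kernel to its smallest enclosing rectangle, assigns zero to every position the kernel does not cover, and lists the zero positions in 2D and 1D form. The claims do not state how the one-dimensional information is derived, so the row-major flattening used here is an assumption, and all function names are hypothetical.

```python
def bounding_rectangle_weights(kernel):
    """Pad an arbitrarily shaped kernel, given as {(row, col): weight}, to its
    smallest enclosing rectangle; uncovered positions receive a zero weight."""
    rows = [r for r, _ in kernel]
    cols = [c for _, c in kernel]
    r0, c0 = min(rows), min(cols)
    height = max(rows) - r0 + 1
    width = max(cols) - c0 + 1
    return [[kernel.get((r0 + r, c0 + c), 0) for c in range(width)]
            for r in range(height)]


def zero_value_info_2d(weights):
    """2D positions (row, col) of the zero-valued weights."""
    return [(r, c) for r, row in enumerate(weights)
            for c, w in enumerate(row) if w == 0]


def zero_value_info_1d(weights):
    """Assumed 1D form: row-major flat indices of the zero-valued weights."""
    width = len(weights[0])
    return [r * width + c for r, c in zero_value_info_2d(weights)]


# Usage: a cross-shaped kernel inside a 3x3 bounding rectangle.
cross = {(0, 1): 1, (1, 0): 2, (1, 1): 3, (1, 2): 4, (2, 1): 5}
padded = bounding_rectangle_weights(cross)
zeros_2d = zero_value_info_2d(padded)
zeros_1d = zero_value_info_1d(padded)
print(padded, zeros_2d, zeros_1d)
```

The same two helper functions apply unchanged to a rectangular kernel whose zeros come from pruning, since pruned weights are simply zero entries in the weight matrix.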

Description

Data transmission method of acquirer and acquirer

The present application claims the benefit of Korean Patent Application No. 10-2020-0180967, filed with the Korean Intellectual Property Office on December 22, 2020, the entire disclosure of which is incorporated herein by reference for all purposes.

Technical Field

The following description relates to a buffer management apparatus, and more particularly, to an efficient buffer management apparatus for data reuse in a neural accelerator.

Background

Deep learning is a technique for training a neural network that includes a plurality of layers, each including a plurality of neurons, based on a large amount of training data. Improving the inference accuracy of the neural network requires a large amount of training data, which may include image, sound, or text information.

Convolutional neural networks (CNNs) significantly improve the accuracy of image classification and recognition through convolution operations. However, a CNN-based model requires a large number of computations, and the required resources grow as the amount of training data increases.

Various studies are being conducted to accelerate convolution operations, and hardware acceleration does so through improvements to the hardware itself. For example, a neural processing unit (NPU) is a processor optimized for parallel processing of matrix operations such as convolution operations, and it exhibits a higher operation speed than a general-purpose processor.

Disclosure of Invention

This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.

In one general aspect, a data transmission method of an acquirer including a loader, at least one transmitter, a buffer controller, and a reuse buffer includes: loading, by the loader, input data of an input feature map stored in a memory according to a loading order based on input data stored in the reuse buffer, a shape of a kernel to be used for a convolution operation, and two-dimensional (2D) zero value information of weights of the kernel; storing, by the buffer controller, the loaded input data in the reuse buffer, to which addresses are cyclically allocated according to the loading order; and selecting, by each of the at least one transmitter, input data corresponding to each output data of the convolution operation among the input data stored in the reuse buffer, based on one-dimensional (1D) zero value information of the weights, and transmitting the selected input data to an executor.

The kernel may have a rectangular shape. The 2D zero value information may include 2D position information indicating positions of one or more weights, among the weights, each having a zero value.

The kernel may have a shape other than a rectangular shape. The 2D zero value information may include 2D position information indicating positions of one or more weights that, within the smallest rectangle enclosing the kernel, do not overlap the kernel.

The kernel may have a rectangular shape. The 2D zero value information may include 2D position information indicating positions of one or more weights, among the weights, deleted by pruning.

The loading of the input data may include selecting positions of weights having a non-zero value among the weights based on the shape of the kernel and the 2D zero value information, selecting, among the input data of the input feature map corresponding to the positions of the weights having the non-zero value, input data that does not overlap with the input data stored in the reuse buffer, and loading the selected input data.

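The loading step described above, together with the transmitter's selection, can be sketched as follows. The sketch assumes a stride of 1, no padding, and a plain dictionary standing in for the reuse buffer (cyclic addressing is left out for brevity). The function names `positions_to_load` and `load_and_send` are hypothetical.

```python
def positions_to_load(weights, out_row, out_col, buffered):
    """Input coordinates the loader still has to fetch for one output element."""
    needed = []
    for kr, row in enumerate(weights):
        for kc, w in enumerate(row):
            if w == 0:
                continue                      # zero weight: feature value is skipped
            coord = (out_row + kr, out_col + kc)
            if coord not in buffered:         # already buffered data is reused
                needed.append(coord)
    return needed


def load_and_send(feature_map, weights, out_row, out_col, buffered):
    """Load the missing values, then return what a transmitter would send on."""
    for r, c in positions_to_load(weights, out_row, out_col, buffered):
        buffered[(r, c)] = feature_map[r][c]
    # Transmitter: values at non-zero weight positions, in kernel (row-major) order.
    return [buffered[(out_row + kr, out_col + kc)]
            for kr, row in enumerate(weights)
            for kc, w in enumerate(row) if w != 0]


# Usage: a 2x2 kernel with one zero weight sliding over a 3x4 feature map.
feature_map = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
weights = [[1, 1], [0, 1]]
buffered = {}
first_loads = positions_to_load(weights, 0, 0, buffered)
first_sent = load_and_send(feature_map, weights, 0, 0, buffered)
second_loads = positions_to_load(weights, 0, 1, buffered)
second_sent = load_and_send(feature_map, weights, 0, 1, buffered)
print(first_loads, first_sent, second_loads, second_sent)
```

For the second output element only two values are fetched, because the value at `(0, 1)` is already in the buffer and the value under the zero weight is never loaded at all.
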
The selecting of the input data may include selecting positions of weights having a non-zero value among the weights based on the 1D zero value information, selecting, among the input data stored in the reuse buffer, input data corresponding to the positions of the weights having the non-zero value, and transmitting the selected input data to the executor.

A non-transitory computer-readable storage medium may store instructions that, when executed by a processor, cause the processor to perform the data transmission method.

In another general aspect, an apparatus includes a loader configured to load input data of an input feature map stored in a memory according to a loading order based on input data stored in a reuse buffer, a shape of a kernel to be used for a convolution operation, and 2D zero value information of weights of the kernel, a buffer controller configured to store the loaded input data in the reuse buffer, to which addresses are cyclically allocated according to the loading order, and at least one transmitter, each configured to select input data corresponding to each output data of the convolution operation among the input data stored in the