US-12626117-B2 - Memory device for performing convolution operation
Abstract
A memory device performs a convolution operation. The memory device includes first to N-th processing elements (PEs), a first analog-to-digital converter (ADC), a first shift adder, and a first accumulator. The first to N-th PEs, where N is a natural number equal to or greater than 2, are respectively associated with at least one piece of weight data included in a weight feature map and are configured to perform a partial convolution operation with at least one piece of input data included in an input feature map. The first ADC is configured to receive a first partial convolution operation result from the first to N-th PEs. The first shift adder shifts and adds an output of the first ADC. The first accumulator accumulates an output from the first shift adder.
Inventors
- Young Jae JIN
- Ki Young Kim
- Sang Eun JE
Assignees
- SK Hynix Inc.
Dates
- Publication Date: 2026-05-12
- Application Date: 2021-12-30
- Priority Date: 2021-08-03
Claims (8)
- 1 . A memory device for performing a convolution operation, comprising: a plurality of processing elements (PEs) respectively associated with at least one weight data piece included in a weight feature map and configured to perform a partial convolution operation with at least one input data piece included in an input feature map; and a plurality of circuits each receiving a result of the partial convolution operation from each of the plurality of PEs; wherein each of the plurality of circuits comprises: an analog-to-digital converter (ADC) configured to receive the result of the partial convolution operation from synaptic arrays included in the plurality of PEs; a shift adder configured to shift and add an output of the ADC; and an accumulator configured to accumulate an output from the shift adder, and wherein the ADC is configured to receive a sum of output currents that are simultaneously generated from the synaptic arrays of the plurality of PEs.
- 2 . The memory device of claim 1 , wherein the plurality of PEs includes first to N-th PEs, where N is a natural number greater than or equal to 2, and wherein each of the first to N-th PEs comprises first to k-th synaptic arrays, where k is a natural number equal to or greater than 2.
- 3 . The memory device of claim 2 , wherein a first ADC included in a first circuit of the plurality of circuits receives an output of a first synaptic array of each of the first to N-th PEs as a first result of the partial convolution operation.
- 4 . The memory device of claim 3 , wherein the first ADC receives a sum of an output current of the first synaptic array of each of the first to N-th PEs.
- 5 . The memory device of claim 2 , wherein a second circuit of the plurality of circuits comprises: a second ADC configured to receive a second result of the partial convolution operation from the first to N-th PEs; a second shift adder configured to shift an output of the second ADC; and a second accumulator configured to accumulate an output from the second shift adder.
- 6 . The memory device of claim 5 , wherein the second ADC receives an output of a second synaptic array of each of the first to N-th PEs as the second result.
- 7 . The memory device of claim 6 , wherein the second ADC receives a sum of an output current of the second synaptic array of each of the first to N-th PEs.
- 8 . The memory device of claim 2 , wherein each of the first to k-th synaptic arrays includes a plurality of memristors.
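The dataflow recited in the claims can be illustrated with a small behavioral model. The following Python sketch is hypothetical: the array sizes, variable names, and the idealized ADC are assumptions, not taken from the patent, and a single accumulator is used for brevity where the claims give each circuit its own. Each of N PEs holds k memristor synaptic arrays (claim 8), one per weight bit plane; the j-th ADC digitizes the sum of the output currents generated simultaneously by the j-th synaptic array of every PE (claims 1 and 4), and the shift adder weights that digital code by 2**j before accumulation.

```python
import numpy as np

# Hypothetical behavioral sketch of the claimed dataflow; sizes, names,
# and the ideal ADC are illustrative assumptions, not the patent's design.
N_PES, K_ARRAYS, ROWS = 4, 3, 8
rng = np.random.default_rng(0)

# Binary input slices of the input feature map, one per PE
inputs = rng.integers(0, 2, size=(N_PES, ROWS))
# k memristor synaptic arrays per PE, one per weight bit plane (claim 8)
bit_planes = rng.integers(0, 2, size=(N_PES, K_ARRAYS, ROWS))

accumulator = 0
for j in range(K_ARRAYS):
    # The j-th ADC receives the SUM of output currents generated
    # simultaneously by the j-th synaptic array of every PE (claims 1, 4)
    summed_current = sum(int(inputs[p] @ bit_planes[p, j]) for p in range(N_PES))
    adc_out = summed_current          # ideal ADC: current -> digital code
    accumulator += adc_out << j       # shift adder weights bit plane j by 2**j

# Reconstruct full-precision weights from the bit planes to check the result
weights = sum((bit_planes[:, j, :].astype(int) << j) for j in range(K_ARRAYS))
assert accumulator == int((inputs * weights).sum())
```

The final assertion shows why the shared-ADC scheme works: summing currents across PEs before conversion commutes with the digital shift-add, so one ADC per bit plane suffices instead of one per synaptic array.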
Description
CROSS-REFERENCE TO RELATED APPLICATION

The present application claims priority under 35 U.S.C. § 119(a) to Korean patent application number 10-2021-0102202, filed on Aug. 3, 2021, which is incorporated herein by reference in its entirety.

BACKGROUND

Field of Invention

The present disclosure relates to an electronic device, and more particularly, to a memory device for performing a convolution operation.

Description of Related Art

An artificial neural network composed only of fully connected layers is limited to one-dimensional input data. A single color image, however, is three-dimensional data, and a batch of several images is four-dimensional data. To train a fully connected (FC) neural network on image data, the three-dimensional image data must be flattened into one dimension, and spatial information is lost in the flattening process. As a result, the network extracts and learns features inefficiently because of this loss of spatial information, and the accuracy it can reach is limited. A convolutional neural network (CNN) is a model that can learn while preserving the spatial information of an image.

SUMMARY

An embodiment of the present disclosure provides a memory device that performs a convolution operation and is capable of reducing manufacturing cost.

According to an embodiment of the present disclosure, a memory device performs a convolution operation. The memory device includes first to N-th processing elements (PEs), a first analog-to-digital converter (ADC), a first shift adder, and a first accumulator. The first to N-th PEs are respectively associated with at least one weight data piece included in a weight feature map and are configured to perform a partial convolution operation with at least one input data piece included in an input feature map.
The first ADC is configured to receive a first result of the partial convolution operation from the first to N-th PEs. The first shift adder shifts and adds an output of the first ADC. The first accumulator accumulates an output from the first shift adder. Here, N may be a natural number equal to or greater than 2.

In an embodiment of the present disclosure, each of the first to N-th PEs may include first to k-th synaptic arrays. Here, k may be a natural number equal to or greater than 2.

In an embodiment of the present disclosure, the first ADC may receive an output of the first synaptic array of each of the first to N-th PEs as the first result. In an embodiment of the present disclosure, the first ADC may receive a sum of an output current of the first synaptic array of each of the first to N-th PEs.

In an embodiment of the present disclosure, the memory device may further include a second ADC configured to receive a second result of the partial convolution operation from the first to N-th PEs, a second shift adder configured to shift an output of the second ADC, and a second accumulator configured to accumulate an output from the second shift adder. In an embodiment of the present disclosure, the second ADC may receive an output of a second synaptic array of each of the first to N-th PEs as the second result. In an embodiment of the present disclosure, the second ADC may receive a sum of an output current of the second synaptic array of each of the first to N-th PEs.

In an embodiment of the present disclosure, each of the first to k-th synaptic arrays may include a plurality of memristors.

According to another embodiment of the present disclosure, a convolution operational apparatus included in a memristor-based deep learning accelerator includes plural processing elements (PEs) and a digital operating circuit.
The plural processing elements (PEs) are configured to perform the operation of Equation 2 on a partial map of an input feature map with a weight feature map through an analog MAC operation to generate respective currents. The digital operating circuit is configured to convert the currents into respective binary values and perform the operation of Equation 1.

$$PR_{R_iC_j} = \bar{I} \otimes \bar{W} = \begin{bmatrix} I_{11} & \cdots & I_{1M} \\ \vdots & \ddots & \vdots \\ I_{N1} & \cdots & I_{NM} \end{bmatrix} \otimes \begin{bmatrix} W_{11} & \cdots & W_{1M} \\ \vdots & \ddots & \vdots \\ W_{N1} & \cdots & W_{NM} \end{bmatrix} = \sum_{L=1}^{NM} I_L \cdot W_L = \sum_{L=1}^{NM} \left( \sum_{K=0}^{P-1} V_{I_L}^{(2)} \cdot V_{W_{LK}}^{(2)} \cdot 2^K \right) = \sum_{L=1}^{NM} \left( \sum_{K=0}^{P-1} C_{LK} \cdot 2^K \right) \quad \text{[Equation 1]}$$

$$C_{LK} = V_{I_L}^{(2)} \cdot V_{W_{LK}}^{(2)} \quad \text{[Equation 2]}$$

Here, $PR_{R_iC_j}$ is the result of the shift adder operation, $\bar{I}$ is the partial map, $\bar{W}$ is the weight feature map, $\otimes$ is the convolution operator, $I_L$ is an element of the partial map, $W_L$ is an element of the weight feature map and corresponds to one of the PEs, $V_{I_L}^{(2)}$ is the binary value of an element of the partial map, $V_{W_{LK}}^{(2)}$ is the binary value of the K-th bit of an element of the weight feature map, $C_{LK}$ is the current, N is the number of rows in each of the partial map and the weight feature map, M is the number of columns in the partial map or the weight feature map, and P is the number of bits of each element of the partial map and the weight feature map.
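Equations 1 and 2 can be checked numerically. The Python sketch below is illustrative (the function and variable names are assumptions, not from the patent); because $V_{I_L}^{(2)}$ appears unchanged for every bit index K, the input elements are treated as binary. Each weight element is decomposed into its P bits, the per-bit products $C_{LK}$ of Equation 2 are formed, and the shift adder reconstructs $\sum_L I_L \cdot W_L$ by weighting bit plane K with $2^K$ as in Equation 1.

```python
import numpy as np

def partial_conv_shift_add(partial_map, weights, num_bits):
    """Hypothetical model of Equations 1 and 2: bit-serial shift-add MAC."""
    I = partial_map.ravel()              # flattened N x M partial map (binary)
    W = weights.ravel().astype(int)      # flattened N x M weight feature map
    result = 0
    for K in range(num_bits):            # one bit plane per weight bit
        V_W_K = (W >> K) & 1             # V_WLK: K-th bit of each weight element
        C_K = I * V_W_K                  # Equation 2: C_LK = V_IL * V_WLK
        result += int(C_K.sum()) << K    # Equation 1: shift-add with weight 2^K
    return result

I = np.array([[1, 0, 1], [0, 1, 1], [1, 1, 0]])   # 3 x 3 binary partial map
W = np.array([[3, 1, 2], [0, 5, 4], [7, 2, 1]])   # 3 x 3 weights, P = 3 bits
assert partial_conv_shift_add(I, W, num_bits=3) == int((I * W).sum())  # 23
```

The assertion confirms that summing the bit-plane products and shift-adding them recovers the direct convolution sum, which is what lets the analog synaptic arrays work on one weight bit at a time.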