US-12626088-B2 - Neural network apparatus, neural network processor, and method of operating neural network processor
Abstract
A neural network processor and method include a fetch controller configured to receive input feature information, indicating whether each of a plurality of input features of an input feature map includes a non-zero value, and weight information, indicating whether each of a plurality of weights of a weight map includes a non-zero value, and configured to determine input features and weights to be convoluted, from among the plurality of input features and the plurality of weights, based on the input feature information and the weight information. The neural network processor and method also include a data arithmetic circuit configured to convolute the determined weights and input features to generate an output feature map.
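The "input feature information" and "weight information" in the abstract can be read as binary masks over the feature and weight values. Below is a minimal Python sketch of how such indicator vectors might be built; the function name `make_nonzero_vector` and the sample values are illustrative assumptions, not elements of the patent.

```python
import numpy as np

def make_nonzero_vector(values):
    """Return a binary vector with 1 where a value is non-zero and 0 where it is zero."""
    return (np.asarray(values) != 0).astype(np.uint8)

# Example: a flattened 2x2 window of the input feature map and a 2x2 weight map.
input_features = np.array([0.0, 1.5, 0.0, 2.0])
weights = np.array([0.3, 0.0, 0.0, 0.7])

input_feature_vector = make_nonzero_vector(input_features)  # [0, 1, 0, 1]
weight_vector = make_nonzero_vector(weights)                 # [1, 0, 0, 1]
```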
Inventors
- SeHwan Lee
- Dongyoung Kim
- Sungjoo YOO
Assignees
- SAMSUNG ELECTRONICS CO., LTD.
- SEOUL NATIONAL UNIVERSITY R&DB FOUNDATION
Dates
- Publication Date: 2026-05-12
- Application Date: 2018-01-12
- Priority Date: 2017-03-06
Claims (14)
- 1 . A neural network processor, comprising: one or more first processors; an array of second processors configured to be executed in parallel with respect to each other, wherein the one or more first processors are not included in the array of second processors; a memory storing instructions configured to cause the one or more first processors to control operations of the second processors, the control including: dividing the second processors into processor groups, wherein each processor group has at least two of the second processors, and wherein each second processor is in only one processor group; based on a received input feature map, spatially dividing the input feature map into feature map parts; based on received weight maps of a convolutional neural network (CNN), determining non-zero ratios of the respective weight maps, wherein each weight map has a respectively determined non-zero ratio, and wherein each non-zero ratio is a ratio of non-zero weights in the corresponding weight map; sorting the weight maps into an ordering of increasing non-zero ratios thereof; segmenting the ordering of the weight maps to form weight map groups, wherein each weight map group consists of weight maps whose non-zero ratios are higher than those of its preceding weight map group; according to spatial ordering of the feature map parts in the feature map, allocating the feature map parts to the respective processor groups, wherein each second processor in a processor group processes the feature map part allocated to its processor group; allocating the weight maps to the second processors according to the increasing non-zero ratio ordering of the weight maps; and performing, by the second processors, convolution operations between the weight maps allocated thereto and the feature map parts allocated thereto.
- 2 . The neural network processor of claim 1 , wherein the convolution operations are based on an input feature vector and a weight vector, wherein the input feature vector and the weight vector are determined based on a collective consideration of both the input feature map and the weight maps.
- 3 . The neural network processor of claim 2 , wherein the input feature vector indicates zero-valued features of the feature map with zeros and indicates non-zero-valued features of the feature map with ones, and the weight vector indicates zero-valued weights of the weight maps by zeroes and indicates non-zero-valued weights with ones.
- 4 . A method of operating a neural network processor, the neural network processor comprising one or more first processors and an array of second processors that do not include the first processors, the method, performed by the one or more first processors, comprising: dividing the second processors into processor groups, wherein each processor group has at least two of the second processors, and wherein each second processor is in only one processor group; based on a received input feature map, spatially dividing the input feature map into feature map parts, the spatially dividing corresponding to a spatial order of convolution processing of the input feature map; based on received weight maps of a convolutional neural network (CNN), determining non-zero ratios of the respective weight maps, wherein each weight map has a respectively determined non-zero ratio; sorting the weight maps into an ordering of increasing non-zero ratios thereof; segmenting the ordering of the weight maps to form weight map groups, wherein each weight map group consists of weight maps whose non-zero ratios are higher than those of its preceding weight map group; according to the spatial order of convolution processing of the feature map parts in the feature map, allocating the feature map parts to the respective processor groups, wherein each second processor in a processor group convolves the same feature map part that is allocated to its processor group; allocating the weight maps to the second processors according to the increasing non-zero ratio ordering of the weight maps; and performing, by the second processors, convolution operations between the weight maps allocated thereto and the feature map parts allocated thereto.
- 5 . The method of claim 4 , wherein the convolution operations are based on a comparison of input features of the feature map and weights of the weight maps, wherein convolution of zero weights or zero features is prevented, and the input features and the weights spatially correspond with respect to the convolution operations.
- 6 . The method of claim 5 , wherein the convolution operations are controlled by logical ANDing between the input features and the weights.
- 7 . A non-transitory computer-readable storage medium storing instructions that, when executed by a processor, cause the processor to perform the method of claim 4 .
- 8 . A neural network apparatus, comprising: a processor array comprising second processors; a memory configured to store an input feature map and weight maps; and one or more first processors configured, according to instructions executable thereby, to: divide the second processors into processor groups, wherein each processor group has at least two of the second processors, and wherein each second processor is in only one processor group; based on an input feature map, spatially divide the input feature map into feature map parts; based on received weight maps, determine non-zero weight ratios of the respective weight maps and determine an ordering of the weight maps such that their respectively corresponding non-zero weight ratios are in increasing order in the ordering; allocate the weight maps to the second processors based on the ordering of the weight maps; allocate the feature map parts to the processor groups, respectively, based on spatial correspondences thereof with the weight maps; and perform, by each second processor, convolution operations between the corresponding allocated weight maps and feature maps to generate an output feature map.
- 9 . The neural network apparatus of claim 8 , wherein the ordering of the weight maps is segmented to form groups of the weight maps, and the groups of the weight maps are allocated to the processor groups, respectively.
- 10 . The neural network apparatus of claim 8 , wherein the one or more first processors are further configured to: provide input feature information that indicates non-zero input features of the input feature map and weight information that indicates non-zero weights of the weight maps, and wherein the processor array is further configured to convolve the input feature map and the weight maps, as allocated to the second processors, based on the input feature information and the weight information to generate the output feature map.
- 11 . The neural network apparatus of claim 8 , wherein the input feature map is divided into the input feature map parts based on a geometry of the weight maps.
- 12 . The neural network apparatus of claim 9 , wherein the weight maps are divided into the groups of the weight maps based on the number of second processors.
- 13 . The neural network apparatus of claim 9 , wherein the allocating of a weight map group, among the groups of the weight maps, to a processor group comprises: allocating each weight map in the weight map group to a respectively corresponding second processor in the processor group, such that each second processor in the processor group convolves the same feature map part with its corresponding weight map.
- 14 . The neural network apparatus of claim 8 , wherein each second processor within a processor group receives a different weight map in a same group of weight maps, such that each second processor in the processor group convolves a different weight map of the processor group with a same feature map part.
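Claims 1, 4, and 8 recite a scheduling procedure: compute each weight map's non-zero ratio, sort the weight maps by that ratio, segment the sorted ordering into groups, and hand each group to a processor group together with one spatially divided part of the input feature map. The Python sketch below illustrates that bookkeeping under a simplifying assumption of one feature map part per processor group; the names `nonzero_ratio` and `allocate` are invented for illustration and do not appear in the patent.

```python
import numpy as np

def nonzero_ratio(weight_map):
    """Ratio of non-zero weights in a single weight map."""
    w = np.asarray(weight_map)
    return np.count_nonzero(w) / w.size

def allocate(weight_maps, feature_map_parts, processors_per_group):
    """Sort weight maps by increasing non-zero ratio, segment the sorted ordering
    into groups of `processors_per_group` maps, and pair each group with one
    spatially ordered feature map part (one part per processor group)."""
    order = sorted(range(len(weight_maps)), key=lambda i: nonzero_ratio(weight_maps[i]))
    weight_map_groups = [order[i:i + processors_per_group]
                         for i in range(0, len(order), processors_per_group)]
    # Each processor group receives one feature map part and one group of weight maps;
    # every second processor in the group convolves that same part with a different map.
    return [(part, [weight_maps[i] for i in group])
            for part, group in zip(feature_map_parts, weight_map_groups)]
```

One plausible reading of this arrangement is that grouping weight maps of similar sparsity gives the processors that share a feature map part similarly sized workloads once zero-valued multiplications are skipped.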
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims the benefit of Korean Patent Application No. 10-2017-0028545, filed on Mar. 6, 2017, and Korean Patent Application No. 10-2017-0041160, filed on Mar. 30, 2017, in the Korean Intellectual Property Office, the disclosures of which are incorporated herein in their entireties by reference.
BACKGROUND
1. Field
The following description relates to a neural network apparatus, a neural network processor, and a method of operating the neural network processor.
2. Description of Related Art
A neural network refers to a computational architecture that models a biological brain. Recently, with the development of neural network technology, various kinds of electronic systems that analyze input data and extract valid information using a neural network apparatus have been actively studied. A neural network apparatus performs multiple operations to process complex input data. For a neural network apparatus to analyze high-quality input in real time and extract information, an apparatus and a method capable of efficiently processing neural network operations are needed.
SUMMARY
This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.
Provided are a neural network apparatus, a neural network processor, and a method of operating the neural network processor.
In accordance with an embodiment, there may be provided a neural network processor, including: a fetch controller configured to receive input feature information, indicating whether each of a plurality of input features of an input feature map includes a non-zero value, and weight information, indicating whether each of a plurality of weights of a weight map includes a non-zero value, and configured to determine input features and weights to be convoluted, from among the plurality of input features and the plurality of weights, based on the input feature information and the weight information; and a data arithmetic circuit configured to convolute the determined weights and input features to generate an output feature map.
The data arithmetic circuit may be configured to selectively convolute the determined weights and the input features from among the plurality of input features and the plurality of weights.
The fetch controller may be configured to detect the input features and the weights that include non-zero values based on the input feature information and the weight information, and the data arithmetic circuit may be configured to convolute the detected input features and weights.
The input feature information may include an input feature vector in which a zero-valued feature may be denoted by 0 and a non-zero-valued feature may be denoted by 1, and the weight information may include a weight vector in which a zero-valued weight may be denoted by 0 and a non-zero-valued weight may be denoted by 1.
In response to the determined input features being a first input feature and a second input feature and the determined weights being a first weight and a second weight, the data arithmetic circuit may be configured to, in a current cycle, read the first input feature and the first weight from the input feature map and the weight map to perform the convolution, and, in a subsequent cycle, read the second input feature and the second weight from the input feature map and the weight map to perform the convolution.
In accordance with an embodiment, there may be provided a method of operating a neural network processor, the method including: receiving input feature information indicating whether each of a plurality of input features of an input feature map includes a non-zero value and weight information indicating whether each of a plurality of weights of a weight map includes a non-zero value; determining input features and weights to be convoluted from among the plurality of input features and the plurality of weights based on the input feature information and the weight information; and convoluting the determined weights and input features to generate an output feature map.
The method may also include: selectively convoluting the determined weights and the input features from among the plurality of input features and the plurality of weights.
The determining may also include detecting the input features and the weights having non-zero values based on the input feature information and the weight information. The method may also include: performing the convolution on the detected input features and weights.
The input feature information may include an input feature vector in which a zero-valued feature may be denoted by 0 and a non-zero-valued feature may be denoted by 1, and the weight information may include a weight vector in which a zero-valued weight may be denoted by 0 and a non-zero-valued weight may be denoted by 1.
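The Summary above describes a fetch controller that uses the two indicator vectors to decide which feature/weight pairs to fetch, and a data arithmetic circuit that then reads one selected pair per cycle. A software analogue is sketched below, again as an illustration only: a logical AND of the two vectors selects the pairs, and `fetch_and_accumulate` (an invented name) processes one selected pair per loop iteration.

```python
import numpy as np

def fetch_and_accumulate(input_features, weights, input_feature_vector, weight_vector):
    """Accumulate products only at positions where both the feature and the weight
    are non-zero, handling one selected pair per loop iteration ('cycle')."""
    both_nonzero = np.logical_and(input_feature_vector, weight_vector)
    accumulator = 0.0
    for idx in np.flatnonzero(both_nonzero):  # first cycle: first pair, next cycle: second pair, ...
        accumulator += float(input_features[idx]) * float(weights[idx])
    return accumulator

# Usage: with features [0.0, 1.5, 0.0, 2.0] and weights [0.3, 0.0, 0.0, 0.7],
# only index 3 has a non-zero feature AND weight, so a single multiply-accumulate runs.
features = np.array([0.0, 1.5, 0.0, 2.0])
weights = np.array([0.3, 0.0, 0.0, 0.7])
result = fetch_and_accumulate(features, weights, features != 0, weights != 0)  # 2.0 * 0.7 = 1.4
```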