KR-102964032-B1 - ELECTRONIC DEVICE FOR HIGH-SPEED COMPRESSION PROCESSING OF FEATURE MAP OF CNN UTILIZING SYSTEM AND CONTROLLING METHOD THEREOF

Abstract

The objective of the present disclosure is to provide an efficient compression processing structure for feature maps. Specifically, it compresses feature maps efficiently in an embedded system by using information obtained from the characteristics of feature maps during the CNN training process. The control method of the electronic device of the present disclosure includes the steps of: inputting an input image into an artificial intelligence model; obtaining a feature map for the input image from a first layer included in the artificial intelligence model; converting the feature map through a lookup table corresponding to the feature map; and compressing and storing the converted feature map using the compression mode, among a plurality of compression modes, that corresponds to the feature map.
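
The flow summarized in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: it assumes the lookup table is a permutation of the pixel value range (so that it can be inverted), it uses two hypothetical compression modes named "left" and "up" that each predict a pixel from one neighbor, and it keeps the chosen mode in a dictionary that stands in for the header. All function names are invented for the example, and no entropy coding of the residuals is shown.

```python
from typing import Dict, List

Rows = List[List[int]]

def apply_lut(fmap: Rows, lut: List[int]) -> Rows:
    """Map every pixel value through the lookup table."""
    return [[lut[v] for v in row] for row in fmap]

def invert_lut(lut: List[int]) -> List[int]:
    """Build the inverse table (assumes lut is a permutation)."""
    inverse = [0] * len(lut)
    for src, dst in enumerate(lut):
        inverse[dst] = src
    return inverse

def predict(rows: Rows, r: int, c: int, mode: str) -> int:
    """Predict a pixel from one already-known neighbor (0 at the border)."""
    if mode == "left":
        return rows[r][c - 1] if c > 0 else 0
    if mode == "up":
        return rows[r - 1][c] if r > 0 else 0
    raise ValueError(f"unknown mode: {mode}")

def encode(fmap: Rows, lut: List[int], mode: str) -> Dict:
    """Transform through the LUT, then keep only prediction residuals."""
    transformed = apply_lut(fmap, lut)
    residuals = [
        [v - predict(transformed, r, c, mode) for c, v in enumerate(row)]
        for r, row in enumerate(transformed)
    ]
    # The dictionary stands in for "header + payload" (hypothetical layout).
    return {"mode": mode, "residuals": residuals}

def decode(packet: Dict, lut: List[int]) -> Rows:
    """Rebuild the transformed map from residuals, then undo the LUT."""
    mode = packet["mode"]
    restored: Rows = []
    for r, row in enumerate(packet["residuals"]):
        restored.append([])
        for c, d in enumerate(row):
            restored[r].append(d + predict(restored, r, c, mode))
    return apply_lut(restored, invert_lut(lut))

# Hypothetical 3-bit feature map and a permutation LUT.
lut = [3, 2, 5, 1, 6, 0, 7, 4]
fmap = [[0, 7, 0, 7], [1, 0, 7, 0]]
packet = encode(fmap, lut, "left")
assert decode(packet, lut) == fmap  # lossless round trip
```

Because the decoder adds each residual to the same prediction the encoder subtracted, the restored map equals the transformed map exactly, and inverting the lookup table then recovers the original feature map.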

Inventors

  • 조인상
  • 이원재
  • 황찬영

Assignees

  • 삼성전자주식회사

Dates

Publication Date
20260513
Application Date
20190705
Priority Date
20180831

Claims (20)

  1. A method for controlling an electronic device, the method comprising: inputting an input image into an artificial intelligence model; acquiring a feature map for the input image from a first layer included in the artificial intelligence model; identifying a type of the feature map by analyzing histogram information of the feature map; obtaining, from among a plurality of lookup tables, a lookup table corresponding to the identified type; transforming the feature map through the lookup table; and compressing and storing the transformed feature map using a compression mode, among a plurality of compression modes, corresponding to the feature map, wherein the plurality of lookup tables are generated in a process of inputting a plurality of input images into the artificial intelligence model to acquire a learning feature map for each input image and determining a type of the learning feature map by analyzing histogram information of the learning feature map, wherein the plurality of lookup tables are generated to reduce a maximum residual between pixel values of a plurality of pixels included in the feature map, and wherein each of the plurality of lookup tables is generated to correspond to a respective type of learning feature map.
  2. (Deleted)
  3. The method of claim 1, wherein the lookup table is a lookup table for converting a pixel whose pixel value occurs with high frequency, among the plurality of pixels included in the feature map, to a value close to the median of the variable range of pixel values of the plurality of pixels.
  4. The method of claim 1, wherein the compressing and storing comprises: analyzing a learning feature map corresponding to the type of the feature map to identify, among the plurality of compression modes, the compression mode corresponding to the feature map; compressing the transformed feature map according to the identified compression mode; and storing information about the identified compression mode in a header.
  5. The method of claim 4, wherein the compressing comprises: identifying, based on the identified compression mode, a value of at least one pixel among a plurality of pixels adjacent to each of the plurality of pixels included in the transformed feature map; predicting a value of each pixel using the identified value of the at least one pixel; and decreasing the value of each pixel by the predicted value.
  6. The method of claim 1, further comprising: restoring the residual of the compressed feature map through the compression mode; inversely transforming the restored feature map through the lookup table; and inputting the inversely transformed feature map into a second layer included in the artificial intelligence model, wherein the restored feature map is identical to the transformed feature map.
  7. The method of claim 1, further comprising: obtaining a plurality of pixel groups by grouping the plurality of pixels included in the compressed feature map in units of a preset number of pixels; identifying, as a header group, the pixel group among the plurality of pixel groups that has the minimum bit amount after the pixels included in the group are compressed; and determining a number of bits corresponding to the difference in pixel values within the header group and storing the compressed feature map based on the number of bits.
  8. The method of claim 7, wherein the storing comprises storing information about the header group in a header.
  9. An electronic device comprising: a memory; and a processor configured to, when an input image is input into an artificial intelligence model, obtain a feature map for the input image from a first layer included in the artificial intelligence model, identify a type of the feature map by analyzing histogram information of the feature map, obtain, from among a plurality of lookup tables, a lookup table corresponding to the identified type, transform the feature map through the lookup table, compress the transformed feature map using a compression mode, among a plurality of compression modes, corresponding to the feature map, and store the compressed feature map in the memory, wherein the plurality of lookup tables are generated in a process of inputting a plurality of input images into the artificial intelligence model to acquire a learning feature map for each input image and determining a type of the learning feature map by analyzing histogram information of the learning feature map, wherein the plurality of lookup tables are generated to reduce a maximum residual between pixel values of a plurality of pixels included in the feature map, and wherein each of the plurality of lookup tables is generated to correspond to a respective type of learning feature map.
  10. (Deleted)
  11. The electronic device of claim 9, wherein the lookup table is a lookup table for converting a pixel whose pixel value occurs with high frequency, among the plurality of pixels included in the feature map, to a value close to the median of the variable range of pixel values of the plurality of pixels.
  12. The electronic device of claim 9, wherein the processor analyzes a learning feature map corresponding to the type of the feature map to identify, among the plurality of compression modes, the compression mode corresponding to the feature map, compresses the transformed feature map according to the identified compression mode, and stores information about the identified compression mode in a header.
  13. The electronic device of claim 12, wherein the processor identifies, based on the identified compression mode, a value of at least one pixel among a plurality of pixels adjacent to each of the plurality of pixels included in the transformed feature map, predicts a value of each pixel using the identified value of the at least one pixel, and decreases the value of each pixel by the predicted value.
  14. The electronic device of claim 9, wherein the processor restores the residual of the compressed feature map stored in the memory through the compression mode, inversely transforms the restored feature map through the lookup table, and inputs the inversely transformed feature map into a second layer included in the artificial intelligence model, and wherein the restored feature map is identical to the transformed feature map.
  15. The electronic device of claim 9, wherein the processor obtains a plurality of pixel groups by grouping the plurality of pixels included in the compressed feature map in units of a preset number of pixels, identifies, as a header group, the pixel group among the plurality of pixel groups that has the minimum bit amount after the pixels included in the group are compressed, determines a number of bits corresponding to the difference in pixel values within the header group, and stores the compressed feature map in the memory based on the number of bits.
  16. The electronic device of claim 15, wherein the processor stores information about the header group in the memory by including it in a header.
  17. A method for training an artificial intelligence model at a server, the method comprising: inputting a plurality of training images into a target model; acquiring a plurality of feature maps for the plurality of training images; identifying types of the feature maps by analyzing histograms of the plurality of feature maps; obtaining, from among a plurality of lookup tables, a lookup table corresponding to the identified type; determining, among a plurality of compression modes, a compression mode corresponding to the identified type; and transmitting information regarding the lookup table and the compression mode to an external device, wherein the plurality of lookup tables are generated in a process of inputting a plurality of input images into the artificial intelligence model to acquire a learning feature map for each input image and determining a type of the learning feature map by analyzing histogram information of the learning feature map, wherein the plurality of lookup tables are generated to reduce a maximum residual between pixel values of a plurality of pixels included in the feature map, and wherein each of the plurality of lookup tables is generated to correspond to a respective type of learning feature map.
  18. The method of claim 17, wherein the compression mode corresponding to the type of the feature map is the compression mode, among the plurality of compression modes, determined to minimize the bit amount after compression of the feature map.
  19. A server comprising: a communication unit; and a processor configured to, when a plurality of training images are input into a target model, obtain a plurality of feature maps for the plurality of training images, identify types of the feature maps by analyzing histograms of the plurality of feature maps, obtain, from among a plurality of lookup tables, a lookup table corresponding to the identified type, determine, among a plurality of compression modes, a compression mode corresponding to the identified type, and control the communication unit to transmit information regarding the lookup table and the compression mode to an external device, wherein the plurality of lookup tables are generated in a process of inputting a plurality of input images into the artificial intelligence model to acquire a learning feature map for each input image and determining a type of the learning feature map by analyzing histogram information of the learning feature map, wherein the plurality of lookup tables are generated to reduce a maximum residual between pixel values of a plurality of pixels included in the feature map, and wherein each of the plurality of lookup tables is generated to correspond to a respective type of learning feature map.
  20. The server of claim 19, wherein the compression mode corresponding to the type of the feature map is the compression mode, among the plurality of compression modes, determined to minimize the bit amount after compression of the feature map.
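
Two ideas from the claims can be illustrated with a short sketch: a lookup table that moves frequently occurring pixel values toward the middle of the value range (claims 3 and 11), and selection of the compression mode that minimizes the bit amount after compression (claims 18 and 20). The claims do not specify how either is computed, so everything here is an assumption made for illustration: the frequency-ranking construction in `build_lut`, the mode names "left" and "up", and the cost model in `bit_cost` (magnitude bits plus one sign bit per residual) are all hypothetical.

```python
from collections import Counter
from typing import List, Sequence

Rows = List[List[int]]

def build_lut(training_pixels: Sequence[int], levels: int) -> List[int]:
    """Assign the most frequent values to the targets nearest mid-range.

    One possible construction only; the result is a permutation of
    range(levels), so it can be inverted for lossless decoding.
    """
    counts = Counter(training_pixels)
    # Source values, most frequent first (ties broken by value).
    by_frequency = sorted(range(levels), key=lambda v: (-counts[v], v))
    # Target values, closest to the middle of the range first.
    mid = (levels - 1) / 2
    targets = sorted(range(levels), key=lambda v: (abs(v - mid), v))
    lut = [0] * levels
    for src, dst in zip(by_frequency, targets):
        lut[src] = dst
    return lut

def residuals(rows: Rows, mode: str) -> Rows:
    """Subtract a one-neighbor prediction from every pixel."""
    out: Rows = []
    for r, row in enumerate(rows):
        out_row = []
        for c, v in enumerate(row):
            if mode == "left":
                pred = row[c - 1] if c > 0 else 0
            elif mode == "up":
                pred = rows[r - 1][c] if r > 0 else 0
            else:
                raise ValueError(f"unknown mode: {mode}")
            out_row.append(v - pred)
        out.append(out_row)
    return out

def bit_cost(res: Rows) -> int:
    """Hypothetical cost model: magnitude bits plus one sign bit."""
    return sum(abs(d).bit_length() + 1 for row in res for d in row)

def pick_mode(rows: Rows, modes: Sequence[str] = ("left", "up")) -> str:
    """Choose the mode whose residuals cost the fewest bits."""
    return min(modes, key=lambda m: bit_cost(residuals(rows, m)))

# Hypothetical 3-bit training data dominated by the values 0 and 7.
lut = build_lut([0, 0, 0, 7, 7, 1], levels=8)
mode_rows = pick_mode([[3, 3, 3, 3], [4, 4, 4, 4]])
mode_cols = pick_mode([[3, 4], [3, 4], [3, 4]])
```

In this example the two most frequent values, 0 and 7, are mapped to 3 and 4, so the difference between them shrinks from 7 to 1, which is the kind of reduction in maximum residual the independent claims describe. A map that is constant along rows selects the "left" mode, and one that is constant along columns selects "up".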

Description

Electronic device for high-speed compression processing of feature map of CNN utilizing system and control method thereof

The present disclosure relates to an electronic device and a control method for compressing multi-channel feature map images generated during CNN (Convolutional Neural Network)-based media processing. Multi-channel feature map images are generated at the intermediate stages of CNN computation: multiple feature map images are generated at each CNN layer and converted into the final result at the last layer. In CNN-based media processing, storing feature map images in memory or reading them back requires excessive image transmission capacity. The present disclosure describes an efficient compression method and apparatus for feature map images that reduce the image transmission capacity required when feature map images generated during CNN processing are stored in or read from memory.

In fields such as image recognition, when Multi-Layer Perceptrons (MLPs) or multi-layered neural networks are used, all inputs are assigned the same level of importance regardless of their position. Consequently, constructing a fully-connected neural network in this way makes the parameter size enormous. Conventionally, this issue has been addressed by using Convolutional Neural Networks (CNNs).

Meanwhile, to compress the multi-channel images generated in each channel during CNN operation, the storage capacity of feature map images can conventionally be reduced by applying existing JPEG, JPEG2000, PNG, or Lempel-Ziv run-length coding methods to the images of each channel.
To further improve compression performance through prediction between image channels, MPEG-based compression used for video can be applied, or the storage capacity of feature map images can be reduced by applying the 3D SPIHT (Set Partitioning in Hierarchical Trees) method, which extends the wavelet compression of a single image to multi-channel images for multi-spectral compression of satellite images, as shown in Figure 3.

Although applying conventional image compression methods to feature map images can reduce image storage capacity efficiently, these methods are difficult to use effectively because the algorithms were not designed to run in embedded systems. An efficient compression algorithm whose complexity is feasible for implementation in embedded systems is therefore required. Furthermore, existing compression methods were developed to compress general images efficiently and are therefore not optimized for feature map compression.

FIG. 1 is a drawing for explaining a system including an electronic device and a server for using an artificial intelligence model according to one embodiment of the present disclosure.
FIG. 2 is a diagram illustrating a process in which encoding and decoding are performed when an electronic device inputs an input image to a CNN-based artificial intelligence model according to one embodiment of the present disclosure.
FIG. 3 is a simple block diagram for explaining the configuration of an electronic device according to one embodiment of the present disclosure.
FIG. 4 is a detailed block diagram for explaining the configuration of an electronic device according to one embodiment of the present disclosure.
FIG. 5 is a block diagram illustrating the configuration of a server (200) according to one embodiment of the present disclosure.
FIGS. 6 to 8 are drawings for explaining the generation of a lookup table and the determination of a compression mode during the process of training an artificial intelligence model according to one embodiment of the present disclosure.
FIGS. 9 and 10 are drawings for explaining how to convert and compress an input image using a lookup table and a compression mode according to one embodiment of the present disclosure.
FIG. 11 is a sequence diagram illustrating a method for compressing a feature map through a system including a server and an electronic device according to one embodiment of the present disclosure.
FIGS. 12a to 12c are drawings for specifically explaining the process of encoding and decoding a feature map by an electronic device according to one embodiment of the present disclosure.
FIGS. 13a and 13b are drawings illustrating compression ratio results using a specific lookup table and a compression mode on a plurality of images according to one embodiment of the present disclosure.

Hereinafter, various embodiments of this document are described with reference to the accompanying drawings. However, this is not intended to limit the technology described in this document to specific embodiments, and it should be understood to include various modifications, equivalents, and/or alternatives to the embodiments of this document. In the description of the drawings, similar reference numerals may be used for similar components. Additionally, expressions