KR-20260062574-A - Pooling operation apparatus capable of parallel processing of data at low power, pooling operation method, and computer-readable recording medium for performing the same
Abstract
A pooling operation device according to exemplary embodiments of the present invention may be composed of a processor and a memory unit. The processor may include an input buffer for retrieving input data from the memory unit and temporarily storing it, a first operation unit for performing an intermediate operation on the input data, a mid result buffer for temporarily storing the result of the intermediate operation performed by the first operation unit, a second operation unit for performing a final operation on the intermediate operation data retrieved from the mid result buffer, and a data output unit for outputting the final operation data as output data. The intermediate operation performed by the first operation unit may be controlled to perform a pooling operation by applying a stride value smaller than the kernel size to the input data.
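The data path summarized in the abstract can be illustrated with a minimal software sketch. It assumes, following the claims, that the first operation reduces each row line by line and that the second operation accumulates `kernel` consecutive intermediate lines. The names `pool_line` and `pool_two_stage` are hypothetical and do not come from the specification, and no padding is applied.

```python
from typing import Callable, List, Sequence

Reducer = Callable[[Sequence[float]], float]


def pool_line(line: Sequence[float], kernel: int, stride: int,
              reduce: Reducer = max) -> List[float]:
    """First operation (assumed): pool one row with a horizontal window."""
    return [reduce(line[i:i + kernel])
            for i in range(0, len(line) - kernel + 1, stride)]


def pool_two_stage(matrix: Sequence[Sequence[float]], kernel: int, stride: int,
                   reduce: Reducer = max) -> List[List[float]]:
    """Two-stage pooling: row-wise intermediate results, then accumulation."""
    # Stage 1: each input row is read once and reduced line by line.
    # The list plays the role of the mid result buffer.
    mid_buffer = [pool_line(row, kernel, stride, reduce) for row in matrix]

    # Stage 2: accumulate `kernel` consecutive intermediate lines and
    # reduce them column by column to produce one output line.
    output = []
    for top in range(0, len(mid_buffer) - kernel + 1, stride):
        lines = mid_buffer[top:top + kernel]
        output.append([reduce(column) for column in zip(*lines)])
    return output


matrix = [[1, 2, 3, 4],
          [5, 6, 7, 8],
          [9, 10, 11, 12],
          [13, 14, 15, 16]]
print(pool_two_stage(matrix, kernel=3, stride=1))  # [[11, 12], [15, 16]]
```

For max and min pooling the two-stage result equals pooling each kernel-by-kernel window directly. For average pooling it also matches, up to floating-point rounding, because every row window contains the same number of elements.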
Inventors
- 김인원
- 기문철
- 이상원
Assignees
- 수퍼게이트 주식회사
Dates
- Publication Date: 20260507
- Application Date: 20241029
Claims (20)
- A pooling operation device capable of parallel processing of data with low power, the device being composed of a processor and a memory unit, wherein the processor includes: an input buffer for retrieving input data from the memory unit and temporarily storing the input data; a first operation unit for performing an intermediate operation on the input data; an intermediate buffer (mid result buffer) for temporarily storing the result of the intermediate operation performed by the first operation unit; a second operation unit for performing a final operation on intermediate operation data retrieved from the intermediate buffer; and a data output unit for outputting the final operation data as output data, and wherein the intermediate operation performed by the first operation unit is controlled to perform a pooling operation by applying a stride value smaller than the kernel size to the input data.
- The pooling operation device of claim 1, wherein the intermediate operation performed by the first operation unit is max pooling, min pooling, or average pooling.
- The pooling operation device of claim 1, wherein the input data has a two-dimensional matrix structure, and a plurality of the input buffers are provided, corresponding in number to the rows constituting the two-dimensional matrix structure.
- The pooling operation device of claim 3, wherein the input data has a two-dimensional matrix structure with n rows, and the input buffers include first to n-th input buffers.
- The pooling operation device of claim 3, wherein the number of input buffers provided is smaller than the number of rows constituting the two-dimensional matrix structure.
- The pooling operation device of claim 3, wherein the plurality of input buffers are controlled to read the data corresponding to one row of the input data line by line and then provide it to the first operation unit.
- The pooling operation device of claim 6, wherein the first operation unit is controlled to simultaneously process, line by line, the data filtered according to the kernel size.
- The pooling operation device of claim 7, wherein a single first operation unit is provided when the kernel size is less than or equal to a preset value, and the first operation unit is controlled to perform at least partially simultaneous operations on the line-unit data provided from the plurality of input buffers.
- The pooling operation device of claim 7, wherein a plurality of first operation units are provided when the kernel size exceeds a preset value, and the plurality of first operation units are controlled to divide the line-unit data received from the plurality of input buffers among themselves without duplication and to operate on it simultaneously.
- The pooling operation device of claim 7, wherein the intermediate operation data calculated line by line is sequentially stored in the intermediate buffer, and the second operation unit is controlled to accumulate the sequentially stored intermediate operation data and perform a pooling operation.
- The pooling operation device of claim 10, wherein the intermediate operation performed by the second operation unit is max pooling, min pooling, or average pooling.
- The pooling operation device of claim 11, wherein the pooling operation performed by the first operation unit and the pooling operation performed by the second operation unit are the same type of pooling operation.
- The pooling operation device of claim 11, wherein the pooling operation performed by the first operation unit and the pooling operation performed by the second operation unit are different types of pooling operation.
- In Paragraph 10, The above intermediate buffer includes a first and a second intermediate buffer, and The first intermediate buffer is configured to store intermediate result values of a pooling operation performed on data blocks constituting the same row of the input data, and A pooling operation device capable of parallel processing of data with low power, characterized in that the second intermediate buffer is configured to store intermediate result values of a pooling operation performed on data blocks constituting different rows of the input data.
- In Paragraph 14, The intermediate result value stored in the first intermediate buffer is the result value of the first pooling operation performed by the first operation unit, and The intermediate result value stored in the second intermediate buffer is the result value of the second pooling operation performed by the second operation unit, and A pooling operation device capable of parallel processing of data with low power, characterized in that the second pooling operation result value is line-unit data output by accumulating the first pooling operation result values and performing a pooling operation.
- A pooling operation method capable of parallel processing of data with low power, using a pooling operation device composed of a processor and a memory unit, the method comprising: retrieving input data from the memory unit; filtering the input data into line-unit data blocks; performing a first pooling operation on the line-unit data blocks and then temporarily storing intermediate result values on a line-unit basis; performing a second pooling operation by accumulating the line-unit intermediate result values; and outputting the line-unit data produced by the second pooling operation as output data, wherein the first pooling operation is performed by applying a stride value smaller than the kernel size to the input data.
- The pooling operation method of claim 16, wherein the input data has a two-dimensional matrix structure, and the filtering step filters the input data into a plurality of line-unit data blocks corresponding to the number of rows constituting the two-dimensional matrix structure.
- The pooling operation method of claim 16, wherein some of the intermediate result values temporarily stored through the first pooling operation are controlled to be reused in the first pooling operation of another data block filtered from the same row.
- A computer-readable recording medium having a program recorded thereon for executing a pooling operation method capable of parallel processing of data with low power as described in any one of claims 16 to 18.
- A computer program comprising program code for executing a pooling operation method capable of parallel processing of data with low power as described in any one of claims 16 to 19, stored on a computer-readable recording medium.
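The reuse of intermediate results within a row, as recited in claim 18, can be sketched in software as follows. This is only one possible illustration and not necessarily the scheme used in the embodiments: it assumes the kernel size is a multiple of the stride, so that each window is a union of non-overlapping stride-sized blocks whose partial results are computed once and shared by neighbouring windows. The function name `pool_line_with_reuse` is hypothetical.

```python
from typing import Callable, List, Sequence

Reducer = Callable[[Sequence[float]], float]


def pool_line_with_reuse(line: Sequence[float], kernel: int, stride: int,
                         reduce: Reducer = max) -> List[float]:
    """Pool one row while reusing partial results between overlapping windows.

    Assumption for this sketch: kernel is a multiple of stride.
    """
    if kernel % stride != 0:
        raise ValueError("this sketch requires kernel to be a multiple of stride")

    # Partial results over non-overlapping blocks of `stride` elements.
    # Each input element is read exactly once here.
    n_blocks = len(line) // stride
    partial = [reduce(line[b * stride:(b + 1) * stride]) for b in range(n_blocks)]

    # Each window spans `kernel // stride` consecutive blocks, so every
    # partial result is reused by up to that many neighbouring windows.
    span = kernel // stride
    return [reduce(partial[i:i + span]) for i in range(n_blocks - span + 1)]


print(pool_line_with_reuse([3, 1, 4, 1, 5, 9, 2, 6], kernel=4, stride=2))  # [4, 9, 9]
```

With a kernel of 4 and a stride of 2, each two-element partial result feeds two adjacent windows, so the overlapping half of every window is never read again.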
Description
The present invention relates to a pooling operation device capable of parallel processing of data at low power, a pooling operation method, and a computer-readable recording medium for performing the same.
In conventional computer vision and deep learning, pooling operations play a crucial role in image and signal processing. Pooling is generally used to increase processing speed by reducing the input data through operations such as maximum (Max), minimum (Min), and average (Avg), and to remove unnecessary information or noise. However, the way input data must be processed depends on the filter size and stride value, which raises various problems, and various devices and methods have been proposed to address them.
The basic principle of a pooling operation is to read a filter-sized block of data from memory and operate on it. In practice, however, the input data is often not arranged contiguously in memory, so multiple memory read operations are required. This degrades memory access speed and computational efficiency, particularly when the filter size is large or the stride value is small, because the same data must be read multiple times.
Furthermore, storing and processing the data read for the filter operation requires a buffer that temporarily holds a large amount of input data. As the buffer size increases, hardware resources are wasted, and the problem becomes more severe when processing large-scale images or signals. Existing technologies have tried to solve these problems by increasing the buffer size or allocating more memory, but this uses hardware resources inefficiently and can reduce computation speed.
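The cost of re-reading overlapping windows can be estimated with a short calculation. The sketch below assumes pooling without padding and a naive scheme that fetches every kernel-by-kernel window from memory independently; the function names are hypothetical and serve only to illustrate the arithmetic.

```python
def output_size(length: int, kernel: int, stride: int) -> int:
    """Number of window positions along one axis (no padding assumed)."""
    return (length - kernel) // stride + 1


def naive_reads(height: int, width: int, kernel: int, stride: int) -> int:
    """Element reads when every window is fetched from memory independently."""
    windows = output_size(height, kernel, stride) * output_size(width, kernel, stride)
    return windows * kernel * kernel


def single_pass_reads(height: int, width: int) -> int:
    """Element reads when every input element is fetched exactly once."""
    return height * width


# 8x8 input, 3x3 kernel, stride 1: 36 windows of 9 elements each.
print(naive_reads(8, 8, 3, 1), single_pass_reads(8, 8))  # 324 64
```

In this example the naive scheme issues roughly five times as many reads as a single pass over the input, and the gap widens as the kernel grows or the stride shrinks.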
In particular, when the stride value is smaller than the kernel size, the same data could be reused, but existing systems are unable to handle this reuse effectively.
FIGS. 1 and 2 are configuration diagrams for explaining a pooling operation device according to exemplary embodiments of the present invention.
FIGS. 3 to 7 are drawings for explaining a data parallel processing process using a pooling operation device according to exemplary embodiments of the present invention.
FIG. 8 is a flowchart illustrating a pooling operation method according to exemplary embodiments of the present invention.
Specific embodiments of the present invention will be described below. The following detailed description is provided to facilitate a comprehensive understanding of the methods, devices, and/or systems described herein. However, this is merely illustrative and the present invention is not limited thereto.
In describing the embodiments of the present invention, detailed descriptions of known technologies related to the present invention are omitted if it is determined that they may unnecessarily obscure the essence of the present invention. Furthermore, the terms described below are defined in consideration of their functions within the present invention, and they may vary depending on the intentions or practices of the user or operator. Therefore, such definitions should be based on the content throughout this specification. Terms used in the detailed description are intended merely to describe the embodiments of the present invention and should not be limiting in any way. Unless explicitly stated otherwise, expressions in the singular form include the meaning of the plural form.
In this description, expressions such as "include" or "comprise" indicate the presence of certain characteristics, numbers, steps, actions, elements, parts thereof, or combinations thereof, and should not be interpreted to exclude the existence or possibility of one or more other characteristics, numbers, steps, actions, elements, parts thereof, or combinations thereof other than those described. In addition, terms such as first, second, A, B, (a), and (b) may be used when describing the components of the embodiments of the present invention. These terms are used merely to distinguish one component from another, and the essence, order, or sequence of the components is not limited by the terms.
FIGS. 1 and 2 are configuration diagrams for explaining a pooling operation device according to exemplary embodiments of the present invention.
Referring to FIG. 1, a pooling operation device (1) according to exemplary embodiments of the present invention may be composed of a processor (200) and a memory unit (100), and the processor (200) of the pooling operation device may include an input buffer (210) for retrieving input data from the memory unit (100) and temporarily storing the input data, a first operation unit (220) for performin