US-12626499-B2 - Image processing device and image processing method
Abstract
An image processing device includes: a reception circuit receiving image data spread in a height direction and in a horizontal direction; a line memory having a register group for each channel, the register group being capable of holding data spread in the horizontal direction in units of rows along the height direction; a shift data generation circuit generating a plurality of pieces of first intermediate data in which spatial positions, including positions in the height direction, of the image data are shifted by different shift amounts, and storing them in a plurality of register groups respectively corresponding to a plurality of channels of the line memory; a filtering processing circuit extracting, from the plurality of pieces of first intermediate data, data indicating the maximum value among a plurality of pieces of data having the same spatial position and different channels, and generating second intermediate data; and a pooling processing circuit extracting, from the second intermediate data, data indicating the maximum value among the plurality of pieces of data for each predetermined spatial region, and generating output data.
Inventors
- Tomochika Murakami
Assignees
- RENESAS ELECTRONICS CORPORATION
Dates
- Publication Date
- 20260512
- Application Date
- 20231003
- Priority Date
- 20221116
Claims (14)
- 1 . An image processing device comprising: a reception circuit receiving image data spread in a height direction and a horizontal direction; a line memory having a plurality of channels, each channel having a register group configured to hold pixel data spread in the horizontal direction in units of row along the height direction; a shift data generation circuit generating, for an entirety of the image data, a plurality of pieces of first intermediate data in which a spatial position of the image data is shifted by respective shift amounts that are different from one another in one or both of the height direction and the horizontal direction, the shift data generation circuit storing the plurality of pieces of first intermediate data in a plurality of register groups respectively corresponding to the plurality of channels of the line memory; a filtering processing circuit extracting data, which indicates a maximum value among a plurality of pieces of data having a same spatial position and having different channels, from the plurality of pieces of first intermediate data stored in the line memory, the filtering processing circuit generating second intermediate data; and a pooling processing circuit extracting, from the second intermediate data, data indicating the maximum value among the plurality of pieces of data for each predetermined spatial region, and generating output data.
- 2 . The image processing device according to claim 1 , further comprising a convolution operation circuit, and wherein the convolution operation circuit is used for the shift data generation circuit.
- 3 . The image processing device according to claim 2 , wherein the shift data generation circuit uses a plurality of kernels, in which the spatial position of the image data is shifted by the respective shift amounts that are different from one another in one or both of the height direction and the horizontal direction, to generate the plurality of pieces of intermediate data from the plurality of pieces of image data.
- 4 . The image processing device according to claim 3 , wherein the plurality of kernels are stored in a predetermined register in advance.
- 5 . The image processing device according to claim 1 , further comprising a Cross Channel Operation circuit, and wherein the Cross Channel Operation circuit is used for the filtering processing circuit.
- 6 . The image processing device according to claim 1 , when a Max Pooling processing whose kernel size is K and whose stride is S is needed per direction with respect to the image data, wherein each of K and S is a positive integer, and wherein K is greater than S, wherein the shift data generation circuit generates K−S+1 pieces of first intermediate data per direction, and wherein the pooling processing circuit generates the output data by setting, as S, a size of the predetermined spatial region.
- 7 . An image processing device comprising: a line memory having a plurality of channels, each channel having a register group configured to hold data spread in a horizontal direction in units of row along a height direction; a reception circuit receiving a plurality of pieces of first intermediate data in which a spatial position including the height direction of image data is shifted by respective shift amounts that are different from one another in one or both of the height direction and the horizontal direction, the reception circuit storing the plurality of pieces of first intermediate data in a plurality of register groups respectively corresponding to the plurality of channels of the line memory; a filtering processing circuit extracting data, which indicates a maximum value among a plurality of data having a same spatial position and having different channels, from the plurality of pieces of first intermediate data stored in the line memory, the filtering processing circuit generating second intermediate data; and a pooling processing circuit extracting, from the second intermediate data, data indicating the maximum value among the plurality of pieces of data for each predetermined spatial region, and generating output data.
- 8 . The image processing device according to claim 7 , wherein the reception circuit is configured to further receive the image data spread in the height direction and in the horizontal direction, and wherein the image processing device further comprises a shift data generation circuit generating the plurality of pieces of first intermediate data in which a spatial position including the height direction of the image data received by the reception circuit is shifted in the height direction by the respective shift amounts that are different from one another, the shift data generation circuit storing the plurality of pieces of first intermediate data in the plurality of register groups respectively corresponding to the plurality of channels of the line memory.
- 9 . An image processing method by an image processing device including a reception circuit receiving image data spread in a height direction and in a horizontal direction, a line memory having a plurality of channels, each channel having a register group configured to hold data spread in the horizontal direction in units of row along a height direction, a shift data generation circuit, a filtering processing circuit, and a pooling processing circuit, the image processing method comprising: receiving the image data by the reception circuit; generating a plurality of pieces of intermediate data in which a spatial position including the height direction of the image data is shifted by respective shift amounts that are different from one another in one or both of the height direction and the horizontal direction, by the shift data generation circuit; storing the plurality of pieces of first intermediate data in a plurality of register groups mutually corresponding to a plurality of channels of the line memory, respectively; extracting, by the filtering processing circuit, data indicating a maximum value among a plurality of pieces of data, which have a same spatial position and have a different channel, among the plurality of pieces of first intermediate data stored in the plurality of register groups of the line memory, to generate second intermediate data; and extracting, by the pooling processing circuit, data indicating the maximum value among the plurality of pieces of data for each predetermined spatial region, to generate output data.
- 10 . The image processing method according to claim 9 , further comprising generating the plurality of pieces of first intermediate data from the image data by a convolution operation circuit used as the shift data generation circuit.
- 11 . The image processing method according to claim 10 , further comprising generating the plurality of pieces of first intermediate data, by the convolution operation circuit, from the image data by using a plurality of kernels in which the spatial position of the image data is shifted by the respective shift amounts that are different from one another in one or both of the height direction and the horizontal direction.
- 12 . The image processing method according to claim 11 , further comprising generating the plurality of pieces of first intermediate data, by the convolution operation circuit, from the image data by using the plurality of kernels stored in a predetermined register in advance.
- 13 . The image processing method according to claim 9 , further comprising generating the second intermediate data from the plurality of pieces of first intermediate data by a Cross Channel Operation circuit used as the filtering processing circuit.
- 14 . The image processing method according to claim 9 , further comprising, when a Max Pooling processing whose kernel size is K and whose stride is S is required for the image data per direction, generating the K−S+1 pieces of first intermediate data per direction and the output data in which a size of the predetermined spatial region per direction is set as S, wherein each of K and S is a positive integer, and wherein K is greater than S.
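The relationship recited in claims 6 and 14 can be checked numerically. The following is a minimal software sketch, not the claimed circuit: it assumes a square K×K kernel, no padding, and plain Python lists standing in for the line memory channels. The function names `direct_max_pool` and `shifted_max_pool` are hypothetical and do not appear in the patent. The sketch builds K−S+1 shifted copies per direction, takes the maximum across those copies at each spatial position, and then applies non-overlapping pooling with region size S.

```python
def direct_max_pool(img, k, s):
    """Reference overlap max pooling: kernel k, stride s, no padding."""
    h, w = len(img), len(img[0])
    return [[max(img[r + dr][c + dc] for dr in range(k) for dc in range(k))
             for c in range(0, w - k + 1, s)]
            for r in range(0, h - k + 1, s)]


def shifted_max_pool(img, k, s):
    """Same pooling via shift -> cross-channel max -> non-overlapping pooling."""
    h, w = len(img), len(img[0])
    n = k - s + 1                       # shifted copies per direction (K-S+1)
    vh, vw = h - (n - 1), w - (n - 1)   # region where every shifted copy is defined

    # Step 1 (first intermediate data): one "channel" per (dy, dx) shift.
    channels = [[[img[r + dy][c + dx] for c in range(vw)] for r in range(vh)]
                for dy in range(n) for dx in range(n)]

    # Step 2 (second intermediate data): max across channels at each position.
    second = [[max(ch[r][c] for ch in channels) for c in range(vw)]
              for r in range(vh)]

    # Step 3 (output data): non-overlapping pooling, region size s, stride s.
    return [[max(second[r + dr][c + dc] for dr in range(s) for dc in range(s))
             for c in range(0, vw - s + 1, s)]
            for r in range(0, vh - s + 1, s)]


# Arbitrary 7x8 test image (values are illustrative only).
img = [[(r * 7 + c * 13) % 17 for c in range(8)] for r in range(7)]
same_3_2 = direct_max_pool(img, 3, 2) == shifted_max_pool(img, 3, 2)
same_5_2 = direct_max_pool(img, 5, 2) == shifted_max_pool(img, 5, 2)
```

The two functions agree because a window of S consecutive positions, each already holding the maximum over K−S+1 consecutive inputs, covers exactly S+(K−S+1)−1 = K inputs per direction.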
Description
CROSS-REFERENCE TO RELATED APPLICATION

The present application claims priority from Japanese Patent Application No. 2022-183153 filed on Nov. 16, 2022, the content of which is hereby incorporated by reference into this application.

BACKGROUND

The present disclosure relates to an image processing device and an image processing method, and in particular to an image processing device and an image processing method suitable for efficiently performing image processing without increasing a circuit scale.

In recent years, with dramatic improvement in the recognition rate of image recognition processing using a Convolutional Neural Network (CNN), automobile manufacturers around the world are competing to develop Advanced Driver-Assistance Systems (ADAS) and automatic driving technology using the CNN. Under such circumstances, semiconductor manufacturers that supply image recognition processors and the like to the automobile manufacturers are required to further improve the performance of image recognition processing using the CNN.

There is a disclosed technique listed below.

[Patent Document 1] Japanese Unexamined Patent Application Publication No. 2019-207458

For example, Patent Document 1 discloses a technique for speeding up CNN-Intellectual Property (IP).

SUMMARY

Since the CNN-IP disclosed in Patent Document 1 does not support Overlap Pooling processing, the Overlap Pooling processing needs to be assigned to another IP that is different from the CNN-IP and is capable of performing the Overlap Pooling processing (for example, a programmable processor Computer Vision engine (CVe) capable of various processing). The Overlap Pooling processing is max pooling processing in which the kernel size is larger than the stride. Using the Overlap Pooling processing in a neural network is said to be effective in preventing over-learning and enhancing the recognition rate of recognition objects.
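As an illustration of the definition above (kernel size larger than stride), the following minimal one-dimensional sketch compares overlapping and non-overlapping max pooling. The function name `max_pool_1d` and the sample values are assumptions made for illustration and are not taken from the patent.

```python
def max_pool_1d(row, kernel, stride):
    """Max pooling over a 1-D list with no padding."""
    return [max(row[i:i + kernel])
            for i in range(0, len(row) - kernel + 1, stride)]


row = [1, 5, 7, 4, 3, 9, 0]

# Overlap pooling: kernel 3 > stride 2, so adjacent windows share one element.
overlap = max_pool_1d(row, kernel=3, stride=2)   # [7, 7, 9]

# Non-overlapping pooling: kernel 2 == stride 2, windows are disjoint.
plain = max_pool_1d(row, kernel=2, stride=2)     # [5, 7, 9]
```

In the overlapping case the value 7 at index 2 belongs to both the first and the second window, which is why hardware that streams rows needs to retain data across window boundaries.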
The Overlap Pooling processing is also used in well-known neural networks such as ResNet50. However, since programmable processors such as the CVe are general-purpose, their processing performance is lower than that of the CNN-IP, which is specialized for specific processing. In addition, reassigning the processing from the CNN-IP to another IP takes time to implement, since it requires system support including data transfer between the two IPs.

In order to solve the above-mentioned problems, it is strongly desired that the CNN-IP be configured to be able to perform the Overlap Pooling processing. However, doing so requires additionally providing a buffer (register) for the overlap in the CNN-IP, which increases the circuit scale of the CNN-IP.

Other problems and novel features will be apparent from the present specification and accompanying drawings.

According to one embodiment, an image processing device includes: a reception circuit receiving image data spread in a height direction and a horizontal direction; a line memory having a register group for each channel, the register group being capable of holding data spread in the horizontal direction in units of row along the height direction; a shift data generation circuit generating a plurality of pieces of first intermediate data in which a spatial position including the height direction of the image data is shifted by different shift amounts, the shift data generation circuit storing them in a plurality of register groups respectively corresponding to the plurality of channels of the line memory; a filtering processing circuit extracting data, which indicates a maximum value among the plurality of pieces of data having the same spatial position and having different channels, from the plurality of pieces of first intermediate data stored in the line memory, and generating second intermediate data; and a pooling processing circuit extracting, from the second intermediate data, data indicating the maximum value among the plurality of pieces of data for each predetermined spatial region, and generating output data.

According to one embodiment, an image processing device includes: a line memory having a register group for each channel, the register group being capable of holding data spread in a horizontal direction in units of row along a height direction; a reception circuit receiving a plurality of pieces of first intermediate data in which a spatial position including a height direction of image data spread in the height direction and in the horizontal direction is shifted by different shift amounts, the reception circuit storing them in a plurality of register groups respectively corresponding to a plurality of channels of the line memory; a filtering processing circuit extracting data, which indicates a maximum value among a plurality of data having a same spatial position and having different channels, from the p