CN-122023148-A - Frequency domain visual processing method, system, equipment and medium based on RAW image
Abstract
The invention provides a frequency domain visual processing method, system, equipment and medium based on RAW images. The method comprises: acquiring Bayer array RAW image data; performing channel rearrangement processing on the RAW image data according to the pixel arrangement of the Bayer array RAW image data to generate an RGB image; dividing the RGB image into a plurality of non-overlapping RGB image blocks; for each RGB image block, performing frequency domain transformation processing on the RGB image block based on a preset frequency domain coefficient set to obtain compressed RGB domain frequency domain features of the RGB image block, wherein the frequency domain coefficient set indicates the frequency domain coordinates that need to be generated in the frequency domain transformation result, and performing frequency domain color space transformation processing on the compressed RGB domain frequency domain features to generate YCbCr domain frequency domain features; performing channel splicing on the YCbCr domain frequency domain features of each RGB image block to form frequency domain features; and inputting the frequency domain features into a visual task processing model to generate a corresponding visual task result. The invention improves the image processing rate while maintaining task accuracy.
Inventors
- LOU XIN
- LI HAOYAN
- ZHANG XIANGYU
Assignees
- ShanghaiTech University (上海科技大学)
Dates
- Publication Date: 2026-05-12
- Application Date: 2026-02-12
Claims (10)
- 1. A frequency domain visual processing method based on RAW images, the method comprising: acquiring Bayer array RAW image data; performing channel rearrangement processing on the RAW image data according to the pixel arrangement of the Bayer array RAW image data to generate an RGB image; dividing the RGB image into a plurality of non-overlapping RGB image blocks; for each RGB image block: performing frequency domain transformation processing on the RGB image block based on a preset frequency domain coefficient set to obtain compressed RGB domain frequency domain features of the RGB image block, wherein the frequency domain coefficient set indicates the frequency domain coordinates that need to be generated in the frequency domain transformation result; and performing frequency domain color space transformation processing on the compressed RGB domain frequency domain features to generate YCbCr domain frequency domain features; performing channel splicing on the YCbCr domain frequency domain features of each RGB image block to form frequency domain features; and inputting the frequency domain features into a visual task processing model to generate a corresponding visual task result.
- 2. The RAW image-based frequency domain visual processing method according to claim 1, wherein the step of performing channel rearrangement processing on the RAW image data according to the pixel arrangement of the Bayer array RAW image data to generate an RGB image comprises: dividing the RAW image data into a plurality of pixel regions according to the pixel arrangement of the Bayer array RAW image data; in each pixel region, selecting a first color component pixel value, a second color component pixel value and a third color component pixel value as channel pixels, and discarding the remaining pixel values; determining a channel mapping relation between the RAW image and the RGB image according to the spatial positions, within the pixel region, of the pixels selected as channel pixels; and performing channel rearrangement and spatial reconstruction on the pixel values selected in each pixel region according to the channel mapping relation to generate the RGB image.
- 3. The RAW image-based frequency domain visual processing method according to claim 1, wherein the step of performing frequency domain transformation processing on the RGB image block based on a preset frequency domain coefficient set to obtain compressed RGB domain frequency domain features of the RGB image block comprises: determining, according to the mapping relation between each frequency domain coordinate in the frequency domain coefficient set and the spatial pixels, the pixel combination used to generate each frequency domain coordinate; performing a frequency domain transformation operation on each pixel combination to generate the RGB domain frequency domain feature corresponding to each frequency domain coordinate; and arranging the RGB domain frequency domain features in sequence to form the compressed RGB domain frequency domain features of the RGB image block.
- 4. The RAW image-based frequency domain visual processing method according to claim 3, wherein the step of performing a frequency domain transformation operation on each pixel combination to generate the RGB domain frequency domain feature corresponding to each frequency domain coordinate comprises: performing a weighted summation over the pixels in the pixel combination according to the mapping relation between each frequency domain coordinate in the preset frequency domain coefficient set and the spatial pixels, to obtain a corresponding intermediate frequency domain result; and normalizing the intermediate frequency domain result according to a preset scaling factor to obtain the final RGB domain frequency domain feature.
- 5. The RAW image-based frequency domain visual processing method according to claim 1, wherein the step of performing frequency domain color space transformation processing on the compressed RGB domain frequency domain features to generate YCbCr domain frequency domain features comprises: extracting R domain frequency domain features, G domain frequency domain features and B domain frequency domain features from the compressed RGB domain frequency domain features; determining, based on a preset linear mapping relation between the RGB color space and the YCbCr color space, transformation coefficients between the RGB components and the YCbCr components at each frequency domain coordinate in the frequency domain coefficient set; performing, based on the transformation coefficients, a linear combination operation on the corresponding R domain, G domain and B domain frequency domain features to generate the corresponding luminance component Y frequency domain features and chrominance component Cb and Cr frequency domain features; and arranging the Y frequency domain features, the Cb frequency domain features and the Cr frequency domain features in sequence to generate the YCbCr domain frequency domain features.
- 6. The RAW image-based frequency domain visual processing method according to claim 5, wherein the step of arranging the Y frequency domain features, the Cb frequency domain features and the Cr frequency domain features in sequence to generate the YCbCr domain frequency domain features comprises: performing offset compensation processing on the Cb frequency domain features and the Cr frequency domain features respectively; normalizing the Y frequency domain features and the compensated Cb and Cr frequency domain features respectively; and arranging the normalized Y frequency domain features, Cb frequency domain features and Cr frequency domain features in sequence to generate the YCbCr domain frequency domain features.
- 7. The RAW image-based frequency domain visual processing method according to claim 1, wherein the step of performing channel splicing on the YCbCr domain frequency domain features of each RGB image block to form the frequency domain features comprises: selecting and arranging, according to a preset channel allocation rule, the frequency domain features corresponding to the different color components in the YCbCr domain frequency domain features of each RGB image block to form the frequency domain features.
- 8. A RAW image-based frequency domain visual processing system, the system comprising: an image acquisition module for acquiring Bayer array RAW image data; a conversion module for performing channel rearrangement processing on the RAW image data according to the pixel arrangement of the Bayer array RAW image data to generate an RGB image; a segmentation module for dividing the RGB image into a plurality of non-overlapping RGB image blocks; a frequency domain generation module for, for each RGB image block: performing frequency domain transformation processing on the RGB image block based on a preset frequency domain coefficient set to obtain compressed RGB domain frequency domain features of the RGB image block, wherein the frequency domain coefficient set indicates the frequency domain coordinates that need to be generated in the frequency domain transformation result; and performing frequency domain color space transformation processing on the compressed RGB domain frequency domain features to generate YCbCr domain frequency domain features; a frequency domain splicing module for performing channel splicing on the YCbCr domain frequency domain features of each RGB image block to form frequency domain features; and a visual processing module for inputting the frequency domain features into a visual task processing model to generate a corresponding visual task result.
- 9. An electronic device, the electronic device comprising: one or more processors; storage means for storing one or more programs which, when executed by the one or more processors, cause the electronic device to implement the RAW image-based frequency domain vision processing method of any one of claims 1 to 7.
- 10. A computer-readable storage medium, having stored thereon a computer program which, when executed by a processor of a computer, causes the computer to perform the RAW image-based frequency domain visual processing method according to any one of claims 1 to 7.
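The weighted summation and scaling of claims 3 and 4, and the frequency domain color conversion with offset compensation of claims 5 and 6, can be illustrated with a minimal sketch. Everything specific in it is an assumption, since the claims do not name the transform, the block size, the coefficient set or the color matrix: the sketch uses an 8×8 orthonormal 2D DCT-II, a hypothetical coefficient set `COEFFS`, and the full-range BT.601 RGB-to-YCbCr matrix with a chroma offset of 128. Because both the transform and the color mapping are linear, the same 3×3 matrix applies at every frequency domain coordinate, and a constant pixel-domain offset only changes the DC coefficient.

```python
import numpy as np

N = 8                                        # block size (assumption)
COEFFS = [(0, 0), (0, 1), (1, 0), (1, 1)]    # hypothetical preset coefficient set

# Full-range BT.601 RGB -> YCbCr matrix (assumption; the claims only
# require a preset linear mapping between the two color spaces).
M = np.array([[ 0.299,     0.587,     0.114   ],
              [-0.168736, -0.331264,  0.5     ],
              [ 0.5,      -0.418688, -0.081312]])

def basis(u, v, n=N):
    """Pixel weights for DCT-II coordinate (u, v), scaling factor included."""
    k = np.arange(n)
    cu = np.cos((2 * k + 1) * u * np.pi / (2 * n))
    cv = np.cos((2 * k + 1) * v * np.pi / (2 * n))
    su = np.sqrt((1 if u == 0 else 2) / n)
    sv = np.sqrt((1 if v == 0 else 2) / n)
    return su * sv * np.outer(cu, cv)

def selective_dct(block):
    """(n, n, 3) pixel block -> (K, 3): only the coordinates in COEFFS."""
    # Each coefficient is a weighted sum over the block's pixels.
    return np.stack([
        np.tensordot(basis(u, v), block, axes=([0, 1], [0, 1]))
        for u, v in COEFFS
    ])

def rgb_freq_to_ycbcr_freq(f_rgb, offset=128.0):
    """Apply the color matrix to coefficients instead of pixels."""
    f = f_rgb @ M.T                          # linear combination of R, G, B
    for i, (u, v) in enumerate(COEFFS):
        if (u, v) == (0, 0):
            # A constant offset c over an n x n block adds c * n to the
            # orthonormal DC coefficient and nothing to the others.
            f[i, 1:] += offset * N
    return f
```

Under these assumptions, converting the coefficients gives the same result as converting the pixels first and transforming afterwards, which is what allows the color conversion to run on the compressed features only.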
Description
Frequency domain visual processing method, system, equipment and medium based on RAW image
Technical Field
The present invention relates to the field of image processing technologies, and in particular to a frequency domain visual processing method, system, equipment and medium based on RAW images.
Background
With the rapid development of applications such as intelligent perception, autonomous driving, unmanned systems and edge computing, vision systems face increasingly demanding requirements on processing performance and energy efficiency in tasks such as target detection and target recognition. On the one hand, image resolution in practical application scenarios keeps increasing and the data volume grows rapidly. On the other hand, embedded devices, mobile terminals and vehicle-mounted systems impose strict constraints on power consumption, area and real-time performance, so traditional solutions based on general-purpose processors or GPUs struggle to meet performance and energy efficiency requirements at the same time.
Existing mainstream vision systems generally adopt the following processing flow: the image sensor outputs an original (RAW) image, a complete image signal processing (ISP) pipeline generates an RGB image from it, and the RGB image is input into a back-end deep neural network for a target detection or recognition task. The ISP pipeline typically includes modules such as demosaicing, denoising, white balancing, color correction, gamma correction and compression encoding, whose design goal is primarily to serve human visual perception rather than the machine vision task itself. Because there is no unified standard across ISP modules, different manufacturers differ considerably in algorithm selection and processing order, which leads to unstable image distributions and increases system complexity, power consumption and latency.
Disclosure of Invention
The invention provides a frequency domain visual processing method, system, equipment and medium based on RAW images, which address the technical problems that existing visual processing approaches are insufficiently coordinated with hardware implementation and that low-power, energy-efficient visual processing is difficult to achieve.
The invention provides a frequency domain visual processing method based on RAW images, which comprises: acquiring Bayer array RAW image data; performing channel rearrangement processing on the RAW image data according to the pixel arrangement of the Bayer array RAW image data to generate an RGB image; dividing the RGB image into a plurality of non-overlapping RGB image blocks; for each RGB image block, performing frequency domain transformation processing on the RGB image block based on a preset frequency domain coefficient set to obtain compressed RGB domain frequency domain features of the RGB image block, wherein the frequency domain coefficient set indicates the frequency domain coordinates that need to be generated in the frequency domain transformation result, and performing frequency domain color space transformation processing on the compressed RGB domain frequency domain features to generate YCbCr domain frequency domain features; performing channel splicing on the YCbCr domain frequency domain features of each RGB image block to form frequency domain features; and inputting the frequency domain features into a visual task processing model to generate a corresponding visual task result.
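The data flow of the method summarized above can be sketched end to end as follows. This is an illustrative sketch under stated assumptions rather than the disclosed implementation: it assumes an RGGB Bayer layout in which each 2×2 region yields one RGB pixel and the second green sample is discarded, 8×8 blocks, an orthonormal DCT-II as the frequency domain transformation, a hypothetical coefficient set `COEFFS`, the full-range BT.601 color matrix without offset compensation or normalization, and a channel order of all Y coefficients followed by Cb and then Cr. The function names are likewise hypothetical.

```python
import numpy as np

BLOCK = 8                                # block size (assumption)
COEFFS = [(0, 0), (0, 1), (1, 0)]        # hypothetical preset coefficient set
M = np.array([[ 0.299,     0.587,     0.114   ],   # BT.601 full range
              [-0.168736, -0.331264,  0.5     ],   # (assumption)
              [ 0.5,      -0.418688, -0.081312]])

def bayer_to_rgb(raw):
    """Channel rearrangement for an assumed RGGB layout, no interpolation."""
    h, w = raw.shape
    raw = raw[:h // 2 * 2, :w // 2 * 2]   # crop to even dimensions
    r = raw[0::2, 0::2]                   # top-left sample of each region
    g = raw[0::2, 1::2]                   # top-right green; the other is dropped
    b = raw[1::2, 1::2]                   # bottom-right sample
    return np.stack([r, g, b], axis=-1).astype(np.float64)

def split_blocks(rgb, n=BLOCK):
    """(H, W, 3) -> (H/n, W/n, n, n, 3) non-overlapping blocks."""
    h, w, c = rgb.shape
    rgb = rgb[:h - h % n, :w - w % n]     # drop incomplete border blocks
    hb, wb = rgb.shape[0] // n, rgb.shape[1] // n
    return rgb.reshape(hb, n, wb, n, c).transpose(0, 2, 1, 3, 4)

def dct_weights(coeffs, n=BLOCK):
    """(K, n, n) pixel weights, one plane per requested coordinate."""
    k = np.arange(n)
    planes = []
    for u, v in coeffs:
        cu = np.sqrt((1 if u == 0 else 2) / n) * np.cos((2 * k + 1) * u * np.pi / (2 * n))
        cv = np.sqrt((1 if v == 0 else 2) / n) * np.cos((2 * k + 1) * v * np.pi / (2 * n))
        planes.append(np.outer(cu, cv))
    return np.stack(planes)

def frequency_features(raw):
    blocks = split_blocks(bayer_to_rgb(raw))               # (hb, wb, n, n, 3)
    weights = dct_weights(COEFFS)                          # (K, n, n)
    # Only the K selected coefficients are computed for each block.
    f_rgb = np.einsum('kij,hwijc->hwkc', weights, blocks)  # (hb, wb, K, 3)
    f_ycc = f_rgb @ M.T                                    # color transform on coefficients
    hb, wb, k, _ = f_ycc.shape
    # Channel splicing: Y coefficients first, then Cb, then Cr.
    return f_ycc.transpose(0, 1, 3, 2).reshape(hb, wb, 3 * k)
```

With these choices, a 32×48 RAW frame becomes a 16×24 RGB image, then a 2×3 grid of blocks, and finally a 2×3×9 feature tensor that a visual task processing model could take as input.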
In an embodiment of the invention, the step of performing channel rearrangement processing on the RAW image data according to the pixel arrangement of the Bayer array RAW image data to generate an RGB image comprises: dividing the RAW image data into a plurality of pixel regions according to the pixel arrangement of the Bayer array RAW image data; in each pixel region, selecting a first color component pixel value, a second color component pixel value and a third color component pixel value as channel pixels, and discarding the remaining pixel values; determining a channel mapping relation between the RAW image and the RGB image according to the spatial positions, within the pixel region, of the pixels selected as channel pixels; and performing channel rearrangement and spatial reconstruction on the pixel values selected in each pixel region according to the channel mapping relation to generate the RGB image.
In an embodiment of the invention, the step of performing frequency domain transformation processing on the RGB image block based on a preset frequency domain coefficient set to obtain compressed RGB domain frequency domain features of the RGB image block comprises: determining, according to the mapping relation between each frequency domain coordinate in the frequency domain coefficient set and the spatial pixels, the pixel combination used to generate each frequency domain coordinate; performing frequency doma