KR-20260066709-A - Efficient demosaicing on neural processing units

Abstract

Techniques and systems for image demosaicing are provided. For example, a process comprises: obtaining a plurality of color channels for image data by performing a depth-wise convolution operation on the image data using a depth-wise convolution filter having predetermined parameter values; obtaining a plurality of processed color channels by performing a convolution operation on the plurality of color channels; arranging the plurality of processed color channels into a demosaiced image; and outputting the demosaiced image.
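The tensor shapes implied by the process can be traced as a short worked example. This is a sketch under stated assumptions, not a statement of the claimed design: the raw input is taken to be a single-channel H x W Bayer mosaic, the depth-wise filter is taken to be 2x2 with a stride of 2 (as in the dependent claims), the output is taken to have three color channels, and the arranging step is taken to be a depth-to-space operation with upscaling factor r (a symbol introduced here for illustration).

```latex
\[
\underbrace{H \times W \times 1}_{\text{raw mosaic}}
\;\xrightarrow{\ \text{depth-wise},\ 2\times 2,\ \text{stride } 2\ }\;
\frac{H}{2} \times \frac{W}{2} \times 4
\;\xrightarrow{\ \text{convolution}\ }\;
\frac{H}{2} \times \frac{W}{2} \times 3r^{2}
\;\xrightarrow{\ \text{arrange}\ }\;
\frac{rH}{2} \times \frac{rW}{2} \times 3
\]
```

With r = 2 the last stage returns to the sensor resolution, H x W x 3, and the convolution would need 3 * 2 * 2 = 12 output channels.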

Inventors

파텔 슈밤 디팍
부드와니 파완 아수다람
폰나토타 라크쉬미 칸타 레디

Assignees

Qualcomm Incorporated

Dates

Publication Date: 20260512
Application Date: 20240826
Priority Date: 20230907

Claims (20)

An apparatus for demosaicing one or more images, comprising: at least one memory configured to store image data; and at least one processor coupled to the at least one memory, the at least one processor being configured to: perform a depth-wise convolution operation on the image data using a depth-wise convolution filter having predetermined parameter values to obtain a plurality of color channels for the image data; perform a convolution operation on the plurality of color channels to obtain a plurality of processed color channels; arrange the plurality of processed color channels into a demosaiced image; and output the demosaiced image.
The apparatus of claim 1, wherein the image data is raw image data.
The apparatus of claim 2, wherein the raw image data includes a Bayer pattern.
The apparatus of claim 1, wherein the at least one processor is configured to perform the depth-wise convolution operation using at least one depth-wise convolution filter comprising predetermined parameter values.
The apparatus of claim 4, wherein the predetermined parameter values of the at least one depth-wise convolution filter are predetermined based on a color filter of an image sensor used to capture the image data.
An apparatus for demosaicing one or more images, wherein the color filter is based on a Bayer pattern and the at least one depth-wise convolution filter comprises a kernel size of 2x2 and a stride of 2.
The apparatus of claim 4, wherein the predetermined parameter values of the at least one depth-wise convolution filter are determined to extract pixel color values from the image data.
The apparatus of claim 7, wherein the predetermined parameter values of the at least one depth-wise convolution filter are set to 0 or 1.
The apparatus of claim 1, wherein the at least one processor is configured to perform the convolution operation using a plurality of convolution filters, and parameters of the plurality of convolution filters are tuned based on a training process.
The apparatus of claim 9, wherein the number of the plurality of convolution filters is determined based at least on an upscaling factor.
The apparatus of claim 1, wherein the at least one processor is configured to arrange the plurality of processed color channels into RGB channels based on a depth-to-space operation.
An apparatus for demosaicing one or more images, wherein, to perform the depth-to-space operation, the at least one processor is configured to spatially arrange data blocks of the plurality of processed color channels to generate the demosaiced image.
The apparatus of claim 1, wherein the at least one processor is configured to upsample the plurality of processed color channels based on a transposed convolution operation.
The apparatus of claim 1, wherein the at least one processor is configured to perform the depth-wise convolution operation using a machine learning model configured to perform image demosaicing.
The apparatus of claim 1, wherein the at least one processor is further configured to perform the depth-wise convolution operation and the convolution operation in multiply-and-accumulate hardware units.
The apparatus of claim 15, wherein the at least one processor comprises a neural processing unit, and the neural processing unit comprises the multiply-and-accumulate hardware units.
The apparatus of claim 16, wherein the neural processing unit comprises an application-specific integrated circuit (ASIC), and the ASIC incorporates a machine learning model configured to perform the depth-wise convolution operation and arrange the plurality of processed color channels into the demosaiced image.
The apparatus of claim 15, wherein the at least one processor comprises a graphics processing unit (GPU) or a central processing unit (CPU) including the multiply-and-accumulate hardware units, and a machine learning model configured to perform the depth-wise convolution operation, perform the convolution operation, and arrange the plurality of processed color channels into the demosaiced image is executed on the GPU or CPU.
The apparatus of claim 1, wherein the apparatus comprises one or more image sensors.
The apparatus of claim 19, wherein the at least one processor is integrated within the one or more image sensors.
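The claims describe depth-wise filters with a 2x2 kernel, a stride of 2, and parameter values fixed at 0 or 1 so that they extract pixel color values. A minimal sketch of that idea follows, written with plain Python lists. It assumes an RGGB Bayer layout and one kernel per color site applied to a single-channel raw image; neither detail is stated in the claims, and the names `KERNELS`, `conv2x2_stride2`, and `extract_bayer_planes` are hypothetical.

```python
# Assumption: RGGB layout. The claims only say "a Bayer pattern".
# Each fixed 2x2 kernel holds a single 1 and three 0s, so the
# multiply-accumulate simply selects one pixel from every 2x2 tile.
KERNELS = {
    "R":  [[1, 0], [0, 0]],
    "Gr": [[0, 1], [0, 0]],
    "Gb": [[0, 0], [1, 0]],
    "B":  [[0, 0], [0, 1]],
}


def conv2x2_stride2(image, kernel):
    """Apply one 2x2 kernel with stride 2 and no padding."""
    rows, cols = len(image), len(image[0])
    out = []
    for y in range(0, rows - 1, 2):
        out_row = []
        for x in range(0, cols - 1, 2):
            acc = 0
            for ky in range(2):
                for kx in range(2):
                    acc += kernel[ky][kx] * image[y + ky][x + kx]
            out_row.append(acc)
        out.append(out_row)
    return out


def extract_bayer_planes(image):
    """Return four half-resolution color planes of a Bayer mosaic."""
    return {name: conv2x2_stride2(image, k) for name, k in KERNELS.items()}


# Toy 4x4 mosaic: tens digit marks the color site, units digit the tile.
mosaic = [
    [10, 20, 11, 21],
    [30, 40, 31, 41],
    [12, 22, 13, 23],
    [32, 42, 33, 43],
]
planes = extract_bayer_planes(mosaic)
```

For the toy mosaic, `planes["R"]` is `[[10, 11], [12, 13]]` and `planes["B"]` is `[[40, 41], [42, 43]]`. Because the kernels contain only 0 and 1, no training is needed for this stage, and the operation still maps onto the same multiply-and-accumulate path as an ordinary convolution.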

Description

Efficient demosaicing on neural processing units
This application relates to image processing. For example, aspects of this application relate to systems and techniques for efficient demosaicing on processing units (e.g., neural processing units (NPUs)).
Many devices and systems enable a scene to be captured by generating images (or frames) and/or video data (including multiple frames). For example, a camera or a device including one or more cameras can capture a sequence of frames of a scene (e.g., a video of the scene) based on light entering the camera. To enhance the quality of the captured frames, the camera may include lenses that focus the incoming light. The sequence of frames captured by the camera may be output for processing and/or consumption by other devices, among other uses.
Cameras may include one or more processors, such as image signal processors (ISPs), capable of processing one or more image frames captured by an image sensor. For example, a raw image frame captured by an image sensor can be processed by an ISP to generate a final image. Cameras may be configured with various image capture and image processing settings to alter the appearance of the image. Some camera settings, such as ISO, exposure time (also referred to as exposure duration), aperture size, f/stop, shutter speed, focus, and gain, are determined and applied before or during image capture. Image processing tasks performed before or during image capture are increasingly being carried out using artificial intelligence (AI)/machine learning (ML) models.
Systems and techniques for image processing are described herein. The following presents a simplified summary relating to one or more embodiments disclosed herein.
Accordingly, the following summary should not be construed as a comprehensive overview relating to all contemplated embodiments, nor should it be construed as identifying key or critical elements relating to all contemplated embodiments, or as delineating the scope associated with any particular embodiment. Its sole purpose is to present certain concepts relating to one or more embodiments of the mechanisms disclosed herein in a simplified form, preceding the detailed description provided below.
Systems, devices, methods, and computer-readable media for image processing are disclosed. In one illustrative example, an apparatus for image demosaicing is provided. The apparatus includes at least one memory configured to store image data and at least one processor coupled to the at least one memory. The at least one processor is configured to: perform a depth-wise convolution operation on the image data using a depth-wise convolution filter having predetermined parameter values to obtain a plurality of color channels for the image data; perform a convolution operation on the plurality of color channels to obtain a plurality of processed color channels; arrange the plurality of processed color channels into a demosaiced image; and output the demosaiced image.
As another example, a method for image demosaicing is provided. The method comprises: performing a depth-wise convolution operation on image data using a depth-wise convolution filter having predetermined parameter values to obtain a plurality of color channels for the image data; performing a convolution operation on the plurality of color channels to obtain a plurality of processed color channels; arranging the plurality of processed color channels into a demosaiced image; and outputting the demosaiced image.
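The method steps above can be sketched end to end in a few lines of plain Python. This is an illustration under several assumptions, none of which are stated in the text: an RGGB Bayer layout, a 1x1 kernel for the convolution step, an upscaling factor of 2, a particular channel ordering for the arranging step, and hand-picked weights standing in for the trained parameters. All function and constant names are hypothetical.

```python
R_UP = 2          # assumed upscaling factor (half-resolution planes to full size)
OUT_COLORS = 3    # R, G, B

# Hand-picked stand-in for trained weights (assumption, not from the source):
# every output sub-pixel copies R, averages the two greens, and copies B.
RGB_FROM_PLANES = [
    [1.0, 0.0, 0.0, 0.0],   # R <- R
    [0.0, 0.5, 0.5, 0.0],   # G <- mean(Gr, Gb)
    [0.0, 0.0, 0.0, 1.0],   # B <- B
]
# One 1x1 filter per (sub-pixel, color) pair: 3 * 2 * 2 = 12 filters.
WEIGHTS = RGB_FROM_PLANES * (R_UP * R_UP)


def split_planes(mosaic):
    """Step 1: per-tile values [R, Gr, Gb, B] at half resolution (assumed RGGB).

    An indexing shortcut that gives the same result as fixed 0/1 kernels
    of size 2x2 applied with a stride of 2.
    """
    h, w = len(mosaic) // 2, len(mosaic[0]) // 2
    return [[[mosaic[2 * y][2 * x], mosaic[2 * y][2 * x + 1],
              mosaic[2 * y + 1][2 * x], mosaic[2 * y + 1][2 * x + 1]]
             for x in range(w)] for y in range(h)]


def pointwise_conv(planes, weights):
    """Step 2: 1x1 convolution mixing 4 input channels into len(weights) channels."""
    return [[[sum(wk * v for wk, v in zip(filt, px)) for filt in weights]
             for px in row] for row in planes]


def depth_to_space(features, r, colors):
    """Step 3: channel (dy*r + dx)*colors + c goes to pixel (y*r + dy, x*r + dx)."""
    h, w = len(features), len(features[0])
    image = [[None] * (w * r) for _ in range(h * r)]
    for y in range(h):
        for x in range(w):
            for dy in range(r):
                for dx in range(r):
                    base = (dy * r + dx) * colors
                    image[y * r + dy][x * r + dx] = features[y][x][base:base + colors]
    return image


def demosaic(mosaic):
    """Chain the three steps and return an H x W image of [R, G, B] triples."""
    features = pointwise_conv(split_planes(mosaic), WEIGHTS)
    return depth_to_space(features, R_UP, OUT_COLORS)
```

Calling `demosaic([[10, 20], [30, 40]])` returns a 2x2 image in which every pixel is `[10.0, 25.0, 40.0]`. With trained weights in place of `RGB_FROM_PLANES`, each of the four sub-pixel positions could use a different mix of the input planes, which is where the interpolation quality would come from.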
In another example, a non-transitory computer-readable medium is provided that has instructions stored thereon which, when executed by at least one processor, cause the at least one processor to: perform a depth-wise convolution operation on image data using a depth-wise convolution filter having predetermined parameter values to obtain a plurality of color channels for the image data; perform a convolution operation on the plurality of color channels to obtain a plurality of processed color channels; arrange the plurality of processed color channels into a demosaiced image; and output the demosaiced image.
As another example, an apparatus for image demosaicing is provided. The apparatus includes: means for obtaining a plurality of color channels for image data by performing a depth-wise convolution operation on the image data using a depth-wise convolution filter having predetermined parameter values; means for obtaining a plurality of processed color channels by performing a convolution operation on the plurality of color channels; means for arranging the plurality of processed color channels into a demosaiced image; and means for outputting the demosaiced image.
In some embodiments, one or more of the devices described herein may include or be part of an extended reality device (e.g., a virtual reality (VR) device, an augmented reality (AR) device, or a mixed rea