US-12621402-B2 - Method and device for performing color twist for images
Abstract
A processing device and method for executing a color twist operation are provided. The processing device comprises memory and a processor configured to convert values of pixels of a frame from a first color domain to a hue, saturation and value (HSV) color domain, adjust hue values and saturation values of the pixels, store the adjusted hue and saturation values in a portion of the memory local to the processor and convert the frame from the HSV color domain to the first color domain using the adjusted hue and saturation values stored in local memory. The adjusted hue and saturation values are generated from pre-adjusted values, which are generated from masked vector values.
Inventors
- Rajy Meeyakhan Rawther
Assignees
- ADVANCED MICRO DEVICES, INC.
Dates
- Publication Date
- 20260505
- Application Date
- 20220930
Claims (20)
- 1. A processing device for executing a color twist operation comprising: memory; and a processor configured to: convert values of pixels of a frame from a first color domain to a hue, saturation and value (HSV) color domain; apply the color twist by rotating a hue component by a twist angle Δ and adjusting a saturation component according to a factor β for the pixels in the HSV color domain; store the adjusted hue and saturation values in a portion of the memory local to the processor; and convert the frame from the HSV color domain to the first color domain using the adjusted hue and saturation values stored in the portion of the memory.
- 2. The processing device of claim 1, wherein the portion of the memory local to the processor is one of register files, local data store (LDS) memory and local cache memory.
- 3. The processing device of claim 1, wherein the first color domain is one of a red, green, blue (RGB) color domain and a Y component, U component, V component (YUV) color domain; and the processor is configured to convert the frame from the HSV color domain to one of the RGB color domain and the YUV color domain using the adjusted hue and saturation values stored in local memory.
- 4. The processing device of claim 1, further comprising a display device, wherein the processor is further configured to store pixel values of an output frame and provide the pixel values of the output frame to the display device for display.
- 5. A method of executing a color twist operation comprising: converting values of pixels of a frame from a first color domain to a hue, saturation and value (HSV) color domain; applying the color twist by rotating a hue component by a twist angle Δ and adjusting a saturation component according to a factor β for the pixels in the HSV color domain; storing the adjusted hue and saturation values in local memory; and converting the frame from the HSV color domain to the first color domain using the adjusted hue and saturation values stored in the local memory.
- 6. The method of claim 5, wherein the local memory is one of register files, local data store (LDS) memory and local cache memory.
- 7. The method of claim 5, wherein the first color domain is one of a red, green, blue (RGB) color domain and a Y component, U component, V component (YUV) color domain; and the method further comprises converting the frame from the HSV color domain to one of the RGB color domain and the YUV color domain using the adjusted hue and saturation values stored in local memory.
- 8. The method of claim 5, further comprising converting the frame from the HSV color domain to the first color domain by, for each of a plurality of red, green and blue (RGB) color vector values: generating masked vector values from maximum vector values and corresponding RGB color vector values; determining masked vector values equal to zero to be invalid output values; generating hue and saturation vector values from the masked vector values and the corresponding RGB color vector values; and generating, as pre-adjusted hue and saturation vector values, the hue and saturation vector values generated from the masked vector values and the corresponding RGB color vector values.
- 9. The method of claim 5, further comprising converting the frame from the HSV color domain to the first color domain without loading the adjusted hue and saturation values from non-local memory.
- 10. The method of claim 5, further comprising adjusting brightness values and contrast values in the first color domain after converting from the HSV domain to the first color domain.
- 11. The method of claim 5, further comprising: storing pre-adjusted pixel values in local memory resulting from converting from the first color domain to the HSV color domain; and adjusting hue values and saturation values of the pixels based on the pre-adjusted pixel values.
- 12. The method of claim 5, further comprising storing pixel values of an output frame and providing the pixel values of the output frame for display.
- 13. A processing device for executing a color twist operation on a plurality of frames comprising: memory; and a processor configured to, for each of the plurality of frames: convert values of pixels from a first color domain to a hue, saturation and value (HSV) color domain; apply the color twist by rotating a hue component by a twist angle Δ and adjusting a saturation component according to a factor β for the pixels in the HSV color domain; store the adjusted hue and saturation values in a portion of the memory local to the processor; and convert the frame from the HSV color domain to the first color domain using the adjusted hue and saturation values stored in the portion of the memory.
- 14. The processing device of claim 13, wherein the plurality of frames is a batch of N number of frames, with each frame having a number of pixels in height (H) and a number of pixels in width (W); and the frames are processed in one of NHWC format and NCHW format, where C is a number of channels.
- 15. The processing device of claim 14, wherein when the N number of frames to be processed is less than a number of parallel processors available to process the N number of frames, the processing device is configured to dynamically divide the frames by one of the height H, the width W and the number of channels C such that each divided portion of the frame is processed in parallel.
- 16. The processing device of claim 13, wherein the portion of the memory is one of register files, local data store (LDS) memory and local cache memory.
- 17. The processing device of claim 13, wherein the processor is configured to convert the frame from the first color domain to the HSV color domain by, for each of a plurality of red, green and blue (RGB) color vector values: generating masked vector values from maximum vector values and corresponding RGB color vector values; determining masked vector values equal to zero to be invalid output values; generating hue and saturation vector values from the masked vector values and the corresponding RGB color vector values; and generating, as pre-adjusted hue and saturation vector values, the hue and saturation vector values generated from the masked vector values and the corresponding RGB color vector values.
- 18. The processing device of claim 1, wherein the rotating and the adjusting are performed in a branch-free single-pass vector operation that loads vectors of HSV triplets and writes adjusted HSV values back.
- 19. The processing device of claim 18, wherein the branch-free single-pass vector operation reads and writes HSV triplets in coalesced bursts.
- 20. The processing device of claim 1, wherein Δ and β are provided as parameters that vary per frame, per region, or per pixel.
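As a rough illustration of the mask-based conversion recited in claims 8 and 17, the following Python sketch derives hue and saturation for one RGB triplet using 0/1 masks in place of if/else chains. It is a sketch under stated assumptions, not the claimed procedure: the function name rgb_to_hs_masked, the tie-breaking order (R, then G, then B), the [0, 1] value range and the guards against division by zero are choices made here for illustration, and scalar Python stands in for the vector instructions a processor would use.

```python
import colorsys


def rgb_to_hs_masked(r, g, b):
    """Return (hue, saturation), both in [0, 1], for RGB inputs in [0, 1].

    Hypothetical helper: selection is done with 0/1 masks and arithmetic
    rather than if/else, loosely following the masked-vector idea.
    """
    v = max(r, g, b)              # maximum channel value (the HSV "value")
    c = v - min(r, g, b)          # chroma: spread between max and min

    # 0/1 masks marking which channel holds the maximum.
    # Assumption: ties are resolved in the order R, then G, then B.
    m_r = int(v == r)
    m_g = int(v == g) * (1 - m_r)
    m_b = 1 - m_r - m_g

    # Gray pixels have zero chroma; substitute 1 so the divisions are defined.
    has_chroma = int(c != 0)
    c_safe = c + (1 - has_chroma)

    # One hue candidate per channel; a zero mask discards the candidate.
    h_r = ((g - b) / c_safe) % 6.0
    h_g = (b - r) / c_safe + 2.0
    h_b = (r - g) / c_safe + 4.0
    hue = has_chroma * (m_r * h_r + m_g * h_g + m_b * h_b) / 6.0

    # Saturation is chroma over value, forced to 0 for black pixels.
    has_value = int(v != 0)
    sat = has_value * c / (v + (1 - has_value))
    return hue, sat


if __name__ == "__main__":
    # Compare against the standard-library conversion for one sample pixel.
    print(rgb_to_hs_masked(0.2, 0.4, 0.6))
    print(colorsys.rgb_to_hsv(0.2, 0.4, 0.6)[:2])
```

Each mask multiplies its hue candidate, so exactly one candidate survives per pixel. On vector hardware the same structure would typically map to compare and select instructions applied to many pixels at once.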
Description
BACKGROUND
Machine learning (e.g., deep learning) is widely used in a variety of technologies (e.g., image classification) to make predictions or decisions to perform a particular task (e.g., whether an image includes a certain object). A convolutional neural network (CNN) is a class of deep learning algorithms widely used in machine learning applications. These networks typically include multiple layers. At each layer, a set of filters is applied to the output of the previous layer, and the outputs of each layer are known as activations or feature maps. The first and last layers in a network are known as the input and output layers, respectively, and the layers in between the first and last layers are typically known as hidden layers.
Machine learning models in supervised learning are trained in order to make predictions or decisions to perform a particular task (e.g., whether an image includes a certain object). During training, a model is exposed to different data. At each layer, the model transforms the data and receives feedback regarding the accuracy of its operations. During an inference stage, the trained model is used to infer or predict outputs on testing samples (e.g., input tensors).
BRIEF DESCRIPTION OF THE DRAWINGS
A more detailed understanding can be had from the following description, given by way of example in conjunction with the accompanying drawings wherein:
FIG. 1 is a block diagram of an example device in which one or more features of the disclosure can be implemented;
FIG. 2 is a block diagram of the device of FIG. 1, illustrating additional detail;
FIG. 3 is a flow diagram illustrating an example method of executing a color twist operation according to features of the present disclosure;
FIG. 4 illustrates example dimensions of a plurality of images to which a color twist operation is performed according to features of the present disclosure; and
FIG. 5 is a diagram illustrating an example of using masks to compute hue values for conversion of a frame from an RGB domain to an HSV domain according to features of the present disclosure.
DETAILED DESCRIPTION
Color twist (also known as color jitter) is a technique which transforms the hue, saturation and brightness of an image. Color twist adjusts the hue, saturation, brightness, and contrast values of an image according to factors selected by a user. After color twist is performed, the hue, saturation and brightness values (e.g., pixel values) of the image are different (e.g., in a different color domain) from the values before color twist is performed.
Color twist is often used to facilitate machine learning tasks, such as image classification, object detection and prediction. However, because the color twist transform results in the image having different color values (i.e., hue, saturation and brightness values), an object in an image can be incorrectly detected (e.g., detecting a cat as a dog) due to the changes in the color values of the object. To prevent incorrect detection due to the changes in color values, during training the color twist is typically performed multiple times using different hue, saturation and brightness values such that a machine (e.g., processor) can correctly identify and detect an object for different hue, saturation and brightness values.
An input image to be processed is typically in a red, green, blue (RGB) color domain or a YUV domain, which includes a luminance component Y and two chrominance components, being a U component (blue projection) and V component (red projection). However, the hue, saturation and brightness values of the image cannot be changed in the RGB or YUV color domains. Accordingly, conventional color twist techniques convert the image from the RGB or YUV domain to the HSV domain to change the hue, saturation and brightness values and then convert the image back to the RGB domain for further processing.
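The RGB to HSV to RGB round trip described above can be sketched per pixel with Python's standard colorsys module. This is a minimal sketch under assumptions, not the disclosed implementation: the names color_twist_pixel and color_twist_frame are hypothetical, delta_deg and beta borrow the Δ and β symbols from the claims, the hue shift is taken to be in degrees, the saturation factor is taken to be multiplicative with clamping to [0, 1], and RGB values are assumed to lie in [0, 1].

```python
import colorsys


def color_twist_pixel(r, g, b, delta_deg, beta):
    """Twist one RGB pixel (channels in [0, 1]) and return the new RGB tuple."""
    # Convert to HSV; colorsys returns hue, saturation and value in [0, 1].
    h, s, v = colorsys.rgb_to_hsv(r, g, b)

    # Rotate the hue by delta_deg degrees, wrapping around the color wheel.
    h = (h + delta_deg / 360.0) % 1.0

    # Scale the saturation by beta and clamp to the valid range (assumption).
    s = min(max(s * beta, 0.0), 1.0)

    # Convert back using the adjusted values, which never leave local variables.
    return colorsys.hsv_to_rgb(h, s, v)


def color_twist_frame(pixels, delta_deg, beta):
    """Apply the same twist to every (r, g, b) tuple in an iterable of pixels."""
    return [color_twist_pixel(r, g, b, delta_deg, beta) for (r, g, b) in pixels]


if __name__ == "__main__":
    frame = [(1.0, 0.0, 0.0), (0.2, 0.4, 0.6)]
    # A 120 degree rotation moves pure red toward pure green.
    print(color_twist_frame(frame, delta_deg=120.0, beta=1.0))
```

Calling color_twist_frame with delta_deg=0.0 and beta=1.0 leaves the pixels unchanged up to floating-point rounding, which is a convenient sanity check for any variant of this sketch.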
Specifically, conventional color twist techniques execute a first read of the image to convert the image from the RGB or YUV domain to the HSV domain, load the HSV image, execute a second read in the HSV domain to change the hue, saturation and brightness values, store (e.g., in main memory or another portion of non-local memory) the output image with the changed values, load the output image (e.g., load the values of the image to registers from main memory), convert the image back to the RGB domain, and then execute additional reads of the image and additional load and store operations for further processing (e.g., change the brightness and contrast values of the image). However, performing multiple reads of the image, including multiple memory loads and stores, is time consuming and expensive (e.g., increased power consumption). Conventional color twist techniques also include executing many branch instructions for various conditions (e.g., “if else” conditions and “switch case” conditions in a program). Execution of these branch instructions typically requires jumping between different po