CN-121998838-A - Generation method, processing method, device, apparatus, storage medium, and program product
Abstract
The invention relates to a generation method, a processing method, a device, an apparatus, a storage medium, and a program product. The generation method comprises: performing feature extraction on the image content of a first image to obtain a first feature map; performing feature extraction on the style content of a second image to obtain a second feature map; performing alignment processing on the content features in the first feature map and the style features in the second feature map, and obtaining a first predicted image based on the aligned content features and style features; inputting the first image and the second image into a first model to obtain a second predicted image, wherein the first model is used for performing image color mapping; and updating the model parameters of the first model based on difference information between the first predicted image and the second predicted image until a target model is obtained. Data set acquisition is thereby made simple, the difficulty of developing the target model is reduced, memory occupation is lowered, and the service performance of the electronic device is improved.
Inventors
- CAO JUN
- MA LIANG
Assignees
- Beijing Xiaomi Mobile Software Co., Ltd. (北京小米移动软件有限公司)
Dates
- Publication Date
- 2026-05-08
- Application Date
- 2024-11-08
Claims (13)
- 1. A method of generating an image model, the method comprising: performing feature extraction on image content of a first image to obtain a first feature map, and performing feature extraction on style content of a second image to obtain a second feature map, wherein the first image and the second image are acquired by the same electronic device under different acquisition conditions or by different electronic devices respectively, and the image styles of the first image and the second image are different; performing alignment processing on the content features in the first feature map and the style features in the second feature map, and obtaining a first predicted image based on the content features and the style features after the alignment processing, wherein the first predicted image matches the image style of the second image; inputting the first image and the second image into a first model to obtain a second predicted image, wherein the first model is used for performing image color mapping; and updating the model parameters of the first model based on difference information between the first predicted image and the second predicted image until a target model is obtained.
- 2. The method of claim 1, wherein the aligning the content features in the first feature map and the style features in the second feature map comprises: establishing an association relation between the first feature map and the second feature map in a channel dimension to obtain a third feature map; obtaining an offset feature map based on the offset between the style feature and the content feature corresponding to each pixel in the third feature map, wherein the spatial resolution of the offset feature map is the same as that of the third feature map; and adjusting the style features in the second feature map to align with the content features in the first feature map based on the offset feature map.
- 3. The method of claim 2, wherein the adjusting the style features in the second feature map to align with the content features in the first feature map based on the offset feature map comprises: determining target offset positions of the style features in the second feature map based on the initial pixels corresponding to the style features in the second feature map and the offset feature map; performing interpolation processing on each target offset position to obtain a corresponding target pixel of each target offset position in the second feature map; and respectively moving each style feature from its corresponding initial pixel to its corresponding target pixel.
- 4. The method according to claim 1, wherein the obtaining a first predicted image based on the aligned content features and style features comprises: obtaining a first target feature map and a second target feature map respectively based on the aligned content features and style features; inputting the mean and variance corresponding to each channel in the first target feature map and the mean and variance corresponding to each channel in the second target feature map into a second model to obtain a fourth feature map output by the second model, wherein the second model is used for performing image style migration; and decoding the fourth feature map to obtain the first predicted image.
- 5. The method of claim 1, wherein said inputting the first image and the second image into the first model to obtain a second predicted image comprises: performing encoding processing and aggregation processing on each style feature in the second feature map to construct a style vector; and inputting the first image and the style vector into the first model to obtain the second predicted image.
- 6. The method according to any one of claims 1 to 5, wherein updating model parameters of the first model based on difference information between the first predicted image and the second predicted image until a target model is obtained comprises: weighting the difference information based on a preset weight coefficient to obtain a target loss value; and updating the model parameters of the first model based on the target loss value to obtain the target model.
- 7. The method according to any one of claims 1 to 5, wherein the aligning the content features in the first feature map and the style features in the second feature map includes: sequentially inputting the first feature map and the second feature map into at least one feature extraction layer, and outputting a third target feature map corresponding to the first feature map and a fourth target feature map corresponding to the second feature map, wherein the feature extraction granularity corresponding to different feature extraction layers is different, and the feature extraction layers are linearly connected; and performing alignment processing on the content features in the third target feature map and the style features in the fourth target feature map.
- 8. An image processing method, the method comprising: Inputting an image to be processed and a style vector into a target model to obtain a target image output by the target model; the target model is generated by the generating method according to any one of claims 1-7, and the image style of the target image is the same as the image style indicated by the style vector.
- 9. An apparatus for generating an image model, the apparatus comprising: an extraction module configured to perform feature extraction on image content of a first image to obtain a first feature map and to perform feature extraction on style content of a second image to obtain a second feature map, wherein the first image and the second image are acquired by the same electronic device under different acquisition conditions or by different electronic devices respectively, and the image styles of the first image and the second image are different; a first prediction module configured to perform alignment processing on the content features in the first feature map and the style features in the second feature map, and to obtain a first predicted image based on the aligned content features and style features, wherein the first predicted image matches the image style of the second image; a second prediction module configured to input the first image and the second image into a first model to obtain a second predicted image, wherein the first model is used for performing image color mapping; and an updating module configured to update the model parameters of the first model based on difference information between the first predicted image and the second predicted image until a target model is obtained.
- 10. An image processing apparatus, characterized in that the apparatus comprises: the generating apparatus of claim 9; and an output module configured to input an image to be processed and a style vector into a target model to obtain a target image output by the target model; wherein the target model is generated by the generating apparatus, and the image style of the target image is the same as the image style indicated by the style vector.
- 11. An electronic device, comprising: a processor; and a memory for storing a computer program or instructions; wherein the processor executes the computer program or instructions to implement the steps of the generating method of any one of claims 1 to 7 or the image processing method of claim 8.
- 12. A non-transitory computer readable storage medium storing a computer program or instructions, characterized in that the computer program or instructions in the storage medium, when executed by a processor, implement the steps of the generating method of any one of claims 1 to 7 or the image processing method of claim 8.
- 13. A computer program product comprising a computer program or instructions which, when executed by a processor, implement the steps of the generating method of any one of claims 1 to 7 or the image processing method of claim 8.
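The per-channel mean and variance statistics of claim 4 are characteristic of adaptive-instance-normalization-style transfer. As an illustration only, not the patent's actual second model, here is a minimal NumPy sketch of re-normalizing content features so their channel statistics match a style's (all names are hypothetical):

```python
import numpy as np

def channel_stats(feat):
    """Per-channel mean and variance of a (C, H, W) feature map."""
    return feat.mean(axis=(1, 2)), feat.var(axis=(1, 2))

def adain(content, style, eps=1e-5):
    """Re-normalize content features to match the style's per-channel statistics."""
    c_mean, c_var = channel_stats(content)
    s_mean, s_var = channel_stats(style)
    # broadcast (C,) statistics over the spatial dimensions
    normalized = (content - c_mean[:, None, None]) / np.sqrt(c_var[:, None, None] + eps)
    return normalized * np.sqrt(s_var[:, None, None] + eps) + s_mean[:, None, None]
```

After this operation the output carries the content's spatial structure but the style's color/intensity statistics, which is the role the fourth feature map plays before decoding.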
Description
Generation method, processing method, device, apparatus, storage medium, and program product
Technical Field
The present disclosure relates to the field of image processing, and in particular to a generation method, a processing method, an apparatus, a device, a storage medium, and a program product.
Background
A camera imaging device is a complex system, ranging from optics and analog-to-digital conversion in hardware to software algorithms for noise, color, exposure, focus, and so on, involving multiple image processing algorithms. Color is an important aspect of photo presentation and has a large impact on the look and feel perceived by the user. In the imaging pipeline of a camera, color processing may comprise three steps: first, accurately recovering white in the raw domain through an automatic white balance (Auto White Balance, AWB) algorithm to maintain the color constancy of the sensor; then converting the color space through a color correction matrix (Color Correction Matrix, CCM); and finally completing the mapping of each color through a three-dimensional lookup table (3D Lookup Table, 3D-LUT) to tune the color style that users recognize and like. 3D-LUTs are one of the key components of image enhancement, and current image signal processing (ISP) schemes generally support 3D-LUTs as part of the camera rendering pipeline. Cameras typically provide a variety of photo style options, each style typically obtained through a 3D-LUT that is specifically tuned by an application engineer. For example, either the Qualcomm platform or the MTK platform may provide a 3D-LUT algorithm to the user. Although current platforms apply 3D-LUT algorithms very quickly, their memory efficiency is not high. For example, when two styles, a "classical mode" and a "vivid mode", are provided in an electronic device, two different 3D-LUTs need to be stored on the platform, which increases the memory footprint.
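For readers unfamiliar with 3D-LUT color mapping: each RGB value indexes into an N×N×N table of output colors. A minimal NumPy sketch using nearest-node lookup (production LUTs typically interpolate trilinearly between nodes; the function name is illustrative):

```python
import numpy as np

def apply_3d_lut(image, lut):
    """Map RGB values in [0, 1] through an (N, N, N, 3) lookup table.

    Nearest-node indexing is used here for brevity; real ISP pipelines
    interpolate between the surrounding eight LUT nodes.
    """
    n = lut.shape[0]
    idx = np.clip(np.rint(image * (n - 1)).astype(int), 0, n - 1)
    return lut[idx[..., 0], idx[..., 1], idx[..., 2]]
```

A typical 33-node LUT stores 33³ × 3 ≈ 108k values per style (roughly 0.4 MB in float32), so each additional photo style multiplies this footprint — which is the memory concern the disclosure addresses.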
In the related art, a neural network model replaces the 3D-LUT, which can reduce memory occupation, but acquiring training data for the neural network model is difficult, which hinders development and use.
Disclosure of Invention
In order to overcome the problems in the related art, the present disclosure provides a generation method, a processing method, an apparatus, a device, a storage medium, and a program product, so that data collection is simple, the difficulty of developing a target model is reduced, memory occupation is lowered, and the service performance of an electronic device is improved. According to a first aspect of an embodiment of the present disclosure, there is provided a method for generating an image model, including: performing feature extraction on image content of a first image to obtain a first feature map, and performing feature extraction on style content of a second image to obtain a second feature map, wherein the first image and the second image are acquired by the same electronic device under different acquisition conditions or by different electronic devices respectively, and the image styles of the first image and the second image are different; performing alignment processing on the content features in the first feature map and the style features in the second feature map, and obtaining a first predicted image based on the content features and the style features after the alignment processing, wherein the first predicted image matches the image style of the second image; inputting the first image and the second image into a first model to obtain a second predicted image, wherein the first model is used for performing image color mapping; and updating the model parameters of the first model based on difference information between the first predicted image and the second predicted image until a target model is obtained.
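The first-aspect training loop can be pictured as a form of distillation: the style-transfer branch produces the first predicted image (treated here as a fixed target), and the first model's parameters are updated to reproduce it from the first image. A toy NumPy sketch where a per-channel affine color map stands in for the first model — the synthetic target and all names are illustrative assumptions, not the patent's architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

first_image = rng.random((3, 8, 8))      # content image (C, H, W)
gain = np.ones(3)                        # "first model" parameters: per-channel
bias = np.zeros(3)                       # gain and bias of a toy color map
teacher = 0.8 * first_image + 0.1        # stands in for the first predicted image

lr = 0.5
for _ in range(200):
    # second predicted image from the current first-model parameters
    pred = gain[:, None, None] * first_image + bias[:, None, None]
    diff = pred - teacher                # "difference information"
    # gradient of the mean-squared loss with respect to gain and bias
    gain -= lr * 2.0 * (diff * first_image).mean(axis=(1, 2))
    bias -= lr * 2.0 * diff.mean(axis=(1, 2))
```

After fitting, gain ≈ 0.8 and bias ≈ 0.1: the first model has absorbed the color mapping exhibited by the teacher output, which is the sense in which the target model replaces a stored LUT.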
In some embodiments, the aligning the content features in the first feature map and the style features in the second feature map includes: establishing an association relation between the first feature map and the second feature map in a channel dimension to obtain a third feature map; obtaining an offset feature map based on the offset between the style feature and the content feature corresponding to each pixel in the third feature map, wherein the spatial resolution of the offset feature map is the same as that of the third feature map; and adjusting the style features in the second feature map to align with the content features in the first feature map based on the offset feature map. In some embodiments, the adjusting the style features in the second feature map to align with the content features in the first feature map based on the offset feature map includes: determining target offset positions of the style features in the second feature map based on the initial pixels corresponding to the style features in the second feature map and the offset feature map; performing interpolation processing on each target offset position to obtain a corresponding target pixel of each target offset position in the second feature map; and respectively moving each style feature from its corresponding initial pixel to its corresponding target pixel.
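The offset-and-interpolate alignment in these embodiments can be sketched as a bilinear warp in its common gather formulation: each pixel samples the style feature map at its initial position plus the predicted offset. A minimal NumPy illustration (the network that predicts the offset feature map is omitted; offsets are taken as given, and the function name is hypothetical):

```python
import numpy as np

def warp_with_offsets(style_feat, offsets):
    """Sample a (C, H, W) style feature map at each pixel's target offset
    position, using bilinear interpolation between the four nearest pixels.

    offsets has shape (2, H, W): per-pixel (dy, dx) displacements.
    """
    c, h, w = style_feat.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    # target offset position = initial pixel + predicted offset
    ty = np.clip(ys + offsets[0], 0, h - 1)
    tx = np.clip(xs + offsets[1], 0, w - 1)
    y0, x0 = np.floor(ty).astype(int), np.floor(tx).astype(int)
    y1, x1 = np.minimum(y0 + 1, h - 1), np.minimum(x0 + 1, w - 1)
    wy, wx = ty - y0, tx - x0
    # bilinear blend of the four surrounding style features
    return (style_feat[:, y0, x0] * (1 - wy) * (1 - wx)
            + style_feat[:, y1, x0] * wy * (1 - wx)
            + style_feat[:, y0, x1] * (1 - wy) * wx
            + style_feat[:, y1, x1] * wy * wx)
```

With zero offsets this reduces to the identity, and fractional offsets blend neighboring style features, which is the interpolation step the embodiment describes.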