CN-116523733-B - Image cross-domain migration method, computer device, readable storage medium, and program product


Abstract

The application relates to an image cross-domain migration method, a computer device, a readable storage medium and a program product. The method is implemented in an image cross-domain migration network model, whose training process comprises: obtaining a source domain image and a target domain image; extracting a first content feature and a first style vector of the source domain image, and a second content feature and a second style vector of the target domain image; combining the second content feature with the first style vector to obtain a first style migration image conforming to the style of the source domain image; combining the first content feature with the second style vector to obtain a second style migration image conforming to the style of the target domain image; extracting the content feature of the second style migration image and combining it with the first style vector to obtain a first source domain reconstructed image; extracting the style vector of the first style migration image and combining it with the first content feature to obtain a second source domain reconstructed image; and, when a training expectation is met, completing training of the image cross-domain migration network model and outputting the first style migration image and/or the second style migration image. With a single training run, the application realizes both the conversion from the source domain to the target domain and the conversion from the target domain to the source domain.
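
As a non-limiting illustration, the data flow of the training process summarized above may be sketched as follows. The sketch is written in Python with NumPy, and every concrete choice in it is an assumption made for illustration rather than part of the application: the function names `encode_content`, `encode_style`, `decode` and `training_step` are hypothetical, the encoders and decoder are fixed toy functions based on per-channel statistics instead of learned networks, and the L1 form of the loss is only one possible way to express the loss constraint condition.

```python
import numpy as np

EPS = 1e-8  # numerical guard against division by zero (illustrative value)

def encode_content(img):
    """Toy content encoder (assumption): remove per-channel mean and std."""
    mean = img.mean(axis=(1, 2), keepdims=True)
    std = img.std(axis=(1, 2), keepdims=True)
    return (img - mean) / (std + EPS)

def encode_style(img):
    """Toy style encoder (assumption): per-channel mean and std as one vector."""
    return np.concatenate([img.mean(axis=(1, 2)), img.std(axis=(1, 2))])

def decode(content, style):
    """Toy decoder (assumption): re-apply the style statistics to the content."""
    channels = content.shape[0]
    mean, std = style[:channels], style[channels:]
    return content * std[:, None, None] + mean[:, None, None]

def training_step(source, target):
    # First/second content features and style vectors.
    c1, s1 = encode_content(source), encode_style(source)
    c2, s2 = encode_content(target), encode_style(target)
    # First style migration image: target content, source style.
    mig1 = decode(c2, s1)
    # Second style migration image: source content, target style.
    mig2 = decode(c1, s2)
    # First source domain reconstruction: content of mig2 + first style vector.
    rec1 = decode(encode_content(mig2), s1)
    # Second source domain reconstruction: first content + style of mig1.
    rec2 = decode(c1, encode_style(mig1))
    # Assumed L1 loss tying both reconstructions to the source image.
    loss = np.abs(source - rec1).mean() + np.abs(source - rec2).mean()
    return mig1, mig2, loss

rng = np.random.default_rng(0)
source = rng.random((3, 8, 8))              # (channels, height, width)
target = rng.random((3, 8, 8)) * 0.5 + 2.0  # different per-channel statistics
mig1, mig2, loss = training_step(source, target)
print(mig1.shape, mig2.shape, float(loss))
```

With these toy functions both reconstructions recover the source image almost exactly, so the loss is close to zero. In the network model of the application the encoders and the decoder are learned, and the loss constraint is what drives them toward this behaviour.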

Inventors

ZHAO LEI
ZHANG QUANWEI
LIN HUAIZHONG
XING WEI
SUN JIAJIE
CHEN JIAFU
JI BAIYAN
CHU TIANYI
CHEN HAIBO
ZHANG ZHANJIE
YIN HAOLIN
LAN ZEHUA

Assignees

Zhejiang University (浙江大学)

Dates

Publication Date: 20260505
Application Date: 20230113

Claims (10)

1. An image cross-domain migration method, characterized in that the training process of an image cross-domain migration network model comprises the following steps: obtaining a source domain image and a target domain image, extracting a first content feature and a first style vector of the source domain image, and extracting a second content feature and a second style vector of the target domain image; combining the second content feature and the first style vector to obtain a first style migration image conforming to the style of the source domain image; combining the first content feature and the second style vector to obtain a second style migration image conforming to the style of the target domain image; extracting a content feature of the second style migration image, and combining the content feature with the first style vector to obtain a first source domain reconstructed image; extracting a style vector of the first style migration image, and combining the style vector with the first content feature to obtain a second source domain reconstructed image; completing training of the image cross-domain migration network model when a training expectation is met, wherein the training expectation comprises that the source domain image, the first source domain reconstructed image and the second source domain reconstructed image satisfy a loss constraint condition; and outputting the first style migration image and/or the second style migration image by using the trained image cross-domain migration network model.
2. The image cross-domain migration method of claim 1, wherein the training process of the image cross-domain migration network model further comprises: extracting a content feature of the first style migration image and combining the content feature with the second style vector to obtain a first target domain reconstructed image; and extracting a style vector of the second style migration image and combining the style vector with the second content feature to obtain a second target domain reconstructed image; wherein the training expectation further comprises that the target domain image, the first target domain reconstructed image and the second target domain reconstructed image satisfy a loss constraint condition.
3. The image cross-domain migration method of claim 2, wherein the loss constraint imposed by the loss constraint condition is a cross-loop consistency loss.
4. The image cross-domain migration method of claim 1, wherein the image cross-domain migration network model comprises a content encoder, and wherein the image cross-domain migration method further comprises extracting corresponding content features by using the content encoder, wherein the content encoder comprises a plurality of convolution layers, and the output of each convolution layer is normalized using parameter-free instance normalization and activated using a ReLU function.
5. The image cross-domain migration method of claim 1, further comprising extracting corresponding content features with a content encoder, the content encoder comprising a pre-processing layer, a downsampling layer, and a residual layer, the downsampling layer comprising a compression extraction module, the compression extraction module comprising: a spatial compression-channel extraction module, configured to perform global average pooling on the input features and obtain a weight vector through a nonlinear function, the weight vector representing the scaling factor of each channel of the input features under global observation; and a channel compression-spatial extraction module, configured to convolve the input features and obtain a scaling factor through a Sigmoid activation function, thereby obtaining a spatial attention map; wherein the output feature of the compression extraction module is the larger of the values output by the spatial compression-channel extraction module and the channel compression-spatial extraction module.
6. The image cross-domain migration method of claim 5, wherein the image cross-domain migration network model comprises a style encoder, and wherein the image cross-domain migration method further comprises extracting corresponding style vectors with the style encoder, wherein the style encoder comprises a pre-processing layer, a downsampling layer, a pooling layer and a convolution layer in this order.
7. The image cross-domain migration method of claim 5, wherein the image cross-domain migration network model comprises a decoder, the image cross-domain migration method further comprising combining content features and style vectors with the decoder and obtaining corresponding images through a merged connection; the decoder comprises, in this order, a residual layer, an upsampling layer and a convolution layer, wherein the upsampling layer comprises a transposed convolution module, a layer normalization module and a compression extraction module.
8. A computer device comprising a memory, a processor and a computer program stored on the memory, characterized in that the processor executes the computer program to implement the steps of the image cross-domain migration method of any one of claims 1 to 7.
9. A computer readable storage medium having stored thereon a computer program, characterized in that the computer program when executed by a processor implements the steps of the image cross-domain migration method according to any one of claims 1 to 7.
10. A computer program product comprising computer instructions which, when executed by a processor, implement the steps of the image cross-domain migration method of any one of claims 1 to 7.
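
By way of illustration only, the compression extraction module recited in claim 5 may be sketched as follows in Python with NumPy. The sketch rests on several assumptions that the claim does not fix: the "nonlinear function" of the spatial compression-channel extraction branch is taken to be a two-layer bottleneck with ReLU followed by Sigmoid, the convolution of the channel compression-spatial extraction branch is taken to be a 1x1 convolution producing a single channel, and the names `channel_gate`, `spatial_gate`, `compression_extraction`, `w1`, `w2` and `w_conv` are hypothetical. The weights are random here, whereas in the network model they would be learned.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_gate(x, w1, w2):
    """Spatial compression, channel extraction: one scaling factor per channel."""
    squeezed = x.mean(axis=(1, 2))            # global average pooling -> (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)   # assumed ReLU bottleneck
    scale = sigmoid(w2 @ hidden)              # (C,), each value in (0, 1)
    return x * scale[:, None, None]

def spatial_gate(x, w_conv):
    """Channel compression, spatial extraction: one scaling factor per pixel."""
    # Assumed 1x1 convolution: weighted sum over channels -> (H, W).
    attention = sigmoid(np.tensordot(w_conv, x, axes=([0], [0])))
    return x * attention[None, :, :]

def compression_extraction(x, w1, w2, w_conv):
    """Output the element-wise larger value of the two branches."""
    return np.maximum(channel_gate(x, w1, w2), spatial_gate(x, w_conv))

rng = np.random.default_rng(0)
features = rng.standard_normal((8, 4, 4))   # (channels, height, width)
w1 = rng.standard_normal((2, 8))            # bottleneck: 8 -> 2 channels
w2 = rng.standard_normal((8, 2))            # expansion: 2 -> 8 channels
w_conv = rng.standard_normal(8)             # 1x1 convolution kernel
out = compression_extraction(features, w1, w2, w_conv)
print(out.shape)
```

Both branches only rescale the input, so the output has the same shape as the input features, and the element-wise maximum keeps, at each position, whichever branch attenuates the feature less.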

Description

Image cross-domain migration method, computer device, readable storage medium, and program product

Technical Field

The present application relates to the field of computer vision and deep learning, and more particularly, to an image cross-domain migration method, a computer device, a readable storage medium, and a program product.

Background

Image cross-domain migration mainly performs style migration on images between two image domains, so that images can be edited and generated across different image domains. There are two main classes of methods: algorithms for non-deformation content migration, and diversified image cross-domain conversion algorithms for deformation tasks. Unlike a non-deformation task, a deformation task involves a certain degree of geometric change before and after conversion (for example, when a cat face image is converted into a dog face image, the converted image should keep the pose and position of the cat face in the input image while taking on the texture style of a dog face). The constraint function for content is harder to define in a deformation task than in a non-deformation task, mainly because unsupervised algorithms are implemented on unpaired datasets; that is, for a given source domain image there is no target domain image with the same pose and position.
CycleGAN proposes the assumption that converting an input image from the source domain to the target domain and then from the target domain back to the source domain should yield an output consistent with the input image. CycleGAN uses this assumption to construct a cycle consistency loss, which is widely applied in image cross-domain conversion algorithms for deformation tasks, such as AGGAN and U-GAT-IT. However, such algorithms can generate only a single conversion result for an input source domain image, which makes it difficult to meet application requirements. To obtain diversified generation results, DRIT and MUNIT use the idea of decoupling to learn the content and the style of images separately, and obtain diversified conversion results by combining different images in the target domain with the input source domain image during conversion. However, the content and style of an image cannot be precisely defined, which makes decoupling difficult, and the existing algorithms still have two kinds of problems: 1) the converted image differs substantially in content (mainly in pose) from the corresponding source domain image; and 2) the texture similarity between the converted image and the sample image is low. In addition, these algorithms leave artifacts of varying severity in the generated images, further reducing image quality. DRIT and MUNIT both recombine the content and style of the generated images, but the generated images are of lower quality than real images, so the content and style obtained from them are correspondingly distorted.

Disclosure of Invention

Based on the foregoing, it is necessary to provide an image cross-domain migration method that solves the above technical problems.
The image cross-domain migration method of the application performs style migration on images between two image domains and is implemented in an image cross-domain migration network model, and the training process of the image cross-domain migration network model comprises the following steps: obtaining a source domain image and a target domain image, extracting a first content feature and a first style vector of the source domain image, and extracting a second content feature and a second style vector of the target domain image; combining the second content feature and the first style vector to obtain a first style migration image conforming to the style of the source domain image; combining the first content feature and the second style vector to obtain a second style migration image conforming to the style of the target domain image; extracting a content feature of the second style migration image, and combining the content feature with the first style vector to obtain a first source domain reconstructed image; extracting a style vector of the first style migration image, and combining the style vector with the first content feature to obtain a second source domain reconstructed image; completing training of the image cross-domain migration network model when a training expectation is met, wherein the training expectation comprises that the source domain image, the first source domain reconstructed image and the second source domain reconstructed image satisfy a loss constraint condition; and outputting the first style migration image and/or the second style migration image by using the trained image cross-domain migration network m