
US-12620071-B2 - Method for creating black-and-white photo that imitates expert retouching styles

US 12620071 B2

Abstract

The present invention relates to a method for creating a neural network model which can imitate the retouching styles of a plurality of experts, and for creating an aesthetically pleasing black-and-white photo by inputting a color photo into the neural network model.

Inventors

  • Hae Gon Jeon
  • Seung Hyun SHIN
  • Ji Su SHIN
  • Ji Hwan BAE
  • Inwook SHIM

Assignees

  • GIST(Gwangju Institute of Science and Technology)
  • INHA UNIVERSITY RESEARCH AND BUSINESS FOUNDATION

Dates

Publication Date
May 5, 2026
Application Date
Nov. 14, 2023
Priority Date
Mar. 8, 2023

Claims (12)

  1. A method for creating a black-and-white photo, the method comprising: creating a first embedding vector according to a style and a shooting object by inputting a plurality of first training black-and-white photos into a first neural network, and extracting a plurality of proxy vectors representing each cluster formed by the first embedding vector to train the first neural network; creating a second embedding vector by converting a training color photo into a second training black-and-white photo and inputting the second training black-and-white photo into the first neural network; identifying a training proxy vector corresponding to a random style and the shooting object for the training color photo among the plurality of proxy vectors; converting a distribution of the second embedding vector into a distribution of the training proxy vector by inputting the training color photo, the training proxy vector, and the second embedding vector into a second neural network, and creating a decolored image by applying a pixel-wise weight corresponding to the converted distribution to the training color photo to train the second neural network; and converting a target color photo into a black-and-white photo corresponding to a target style by using the first and second neural networks for which training is completed.
  2. The method of claim 1, wherein the first neural network includes a style classification neural network classifying the style of the first training black-and-white photo, an object classification neural network classifying the shooting object in the first training black-and-white photo, and a multi-layer perceptron (MLP) creating the first embedding vector by combining outputs of the style classification neural network and the object classification neural network.
  3. The method of claim 1, wherein the training of the first neural network includes training the first neural network so that the first embedding vector is positioned to be close for the same style and to be far for a different shooting object in an embedding space.
  4. The method of claim 2, wherein the creating of the second embedding vector includes creating the second training black-and-white photo by converting a gradation value of the training color photo into a grayscale, inputting the second training black-and-white photo into each of the style classification neural network and the object classification neural network, and determining an output of the multi-layer perceptron as the second embedding vector.
  5. The method of claim 1, wherein the identifying of the training proxy vector includes identifying any one training proxy vector corresponding to the random style arbitrarily set for the training color photo and the shooting object in the training color photo among the plurality of proxy vectors.
  6. The method of claim 5, wherein the shooting object in the training color photo is identified by inputting the second training black-and-white photo into an object classification neural network in the first neural network.
  7. The method of claim 1, wherein the second neural network includes an encoder extracting a feature from the training color photo and converting the distribution of the second embedding vector into the distribution of the training proxy vector, and a decoder outputting the pixel-wise weight from the feature extracted by the encoder and the converted distribution of the second embedding vector.
  8. The method of claim 7, wherein the second neural network further includes a fully connected layer (FCL) that extracts feature maps from the second embedding vector and the training proxy vector, respectively, and the encoder converts a distribution of the feature map for the second embedding vector into a distribution of the feature map for the training proxy vector.
  9. The method of claim 8, wherein the encoder converts the distribution of the feature map of the second embedding vector into the distribution of the feature map of the training proxy vector according to [Equation 1] below: f′ = σ_t((f − μ_s)/σ_s) + μ_t [Equation 1] (f represents the feature map for the second embedding vector, f′ represents the converted feature map, μ_s and σ_s represent a mean and a standard deviation of the feature map for the second embedding vector, respectively, and μ_t and σ_t represent a mean and a standard deviation of the feature map for the training proxy vector, respectively).
  10. The method of claim 1, wherein the pixel-wise weight is a bilateral grid type weight for a pixel-wise gradation value of the training color photo.
  11. The method of claim 1, wherein the training of the second neural network includes training the second neural network so that a difference between the decolored image, created by multiplying a pixel-wise gradation value of the training color photo by the pixel-wise weight, and a ground truth (GT) black-and-white photo into which the training color photo is converted according to the random style becomes minimal.
  12. The method of claim 1, wherein the converting into the black-and-white photo includes converting the target color photo into a grayscale image, inputting the grayscale image into the first neural network, inputting a proxy vector corresponding to the target style into the second neural network, and converting the target color photo into the black-and-white photo.
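The distribution conversion recited in claims 7 through 9 is the standard adaptive instance normalization (AdaIN) statistic-matching step. The following is a minimal NumPy sketch, not the patented implementation: the function names, the global per-map statistics, and the feature-map shapes are illustrative assumptions, while in the claims these statistics are computed over feature maps produced by the FCL and matched inside the encoder.

```python
import numpy as np

def adain(f_src: np.ndarray, f_proxy: np.ndarray, eps: float = 1e-5) -> np.ndarray:
    """Match the statistics of a source feature map (for the second embedding
    vector) to those of a target feature map (for the training proxy vector),
    per [Equation 1]: f' = sigma_t * (f - mu_s) / sigma_s + mu_t."""
    mu_s, sigma_s = f_src.mean(), f_src.std() + eps  # source statistics
    mu_t, sigma_t = f_proxy.mean(), f_proxy.std()    # target (proxy) statistics
    return sigma_t * (f_src - mu_s) / sigma_s + mu_t

# After conversion, the source map carries the proxy map's mean and spread.
rng = np.random.default_rng(0)
f = adain(rng.normal(2.0, 3.0, (64, 64)), rng.normal(-1.0, 0.5, (64, 64)))
```

Because the operation only shifts and rescales, it changes the distribution of the feature map without altering its spatial content, which is what lets a single encoder serve every target style.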

Description

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims the priority of Korean Patent Application No. 10-2023-0030667 filed on Mar. 8, 2023, in the Korean Intellectual Property Office, the disclosure of which is incorporated herein by reference.

BACKGROUND OF THE INVENTION

Field of the Invention

The present invention relates to a method for creating a neural network model which can imitate the retouching styles of a plurality of experts, and for creating an aesthetically pleasing black-and-white photo through the created neural network model.

Description of the Related Art

Black-and-white photos have a higher dynamic range and clarity than color photos, providing richer texture and contrast. Through this, black-and-white photos convey a unique aesthetic and emotion not found in color photos. That aesthetic can be achieved only through high-end cameras built for black-and-white shooting or the careful corrections of experts, not merely through an additional function of smartphones or DSLR cameras. Various neural networks for converting photos to black and white have been proposed in the computer vision field, but they all focus on minimizing the loss of texture when converting RGB gradation values to grayscale; adding aesthetic elements to the creation of black-and-white photos has not been studied.

SUMMARY OF THE INVENTION

The present invention has been made in an effort to imitate the retouching styles of experts through a neural network model that converts a color photo into a black-and-white photo. The objects of the present invention are not limited to the above-mentioned objects, and other objects and advantages of the present invention that are not mentioned can be understood from the following description, and will be more clearly understood through embodiments of the present invention.
Further, it will be readily appreciated that the objects and advantages of the present invention can be realized by the means and combinations shown in the claims. In order to achieve the object, an exemplary embodiment of the present invention provides a method for creating a black-and-white photo, which includes: creating a first embedding vector according to a style and a shooting object by inputting a plurality of first training black-and-white photos into a first neural network, and extracting a plurality of proxy vectors representing each cluster formed by the first embedding vector to train the first neural network; creating a second embedding vector by converting a training color photo into a second training black-and-white photo and inputting the second training black-and-white photo into the first neural network; identifying a training proxy vector corresponding to a random style and the shooting object for the training color photo among the plurality of proxy vectors; converting a distribution of the second embedding vector into a distribution of the training proxy vector by inputting the training color photo, the training proxy vector, and the second embedding vector into a second neural network, and creating a decolored image by applying a pixel-wise weight corresponding to the converted distribution to the training color photo to train the second neural network; and converting a target color photo into a black-and-white photo corresponding to a target style by using the first and second neural networks for which training is completed.
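The decoloring and training-loss steps summarized above can be sketched as follows. This is a hedged illustration rather than the patented implementation: the BT.601 luma coefficients are one common grayscale conversion (the patent does not fix the formula), the function names and array shapes are assumptions, and in the actual method the pixel-wise weights come from the second neural network's decoder as a bilateral grid, whereas here they are simply an input argument.

```python
import numpy as np

def to_grayscale(rgb: np.ndarray) -> np.ndarray:
    """Convert RGB gradation values (H, W, 3) in [0, 1] to a grayscale image,
    using ITU-R BT.601 luma coefficients as an illustrative choice."""
    return rgb @ np.array([0.299, 0.587, 0.114])

def decolorize(rgb: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Create a decolored image by multiplying the pixel-wise gradation value
    by a pixel-wise weight (H, W), as in claim 11, clipped back to [0, 1]."""
    return np.clip(to_grayscale(rgb) * weights, 0.0, 1.0)

def training_loss(decolored: np.ndarray, gt_bw: np.ndarray) -> float:
    """Difference to be minimized between the decolored image and the
    ground-truth black-and-white photo retouched in the chosen style
    (mean squared error, an assumed choice of distance)."""
    return float(np.mean((decolored - gt_bw) ** 2))
```

With unit weights the decolored image reduces to the plain grayscale conversion; the second neural network is trained so that its predicted weights move that grayscale image toward the expert-retouched ground truth.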
In an exemplary embodiment, the first neural network includes a style classification neural network classifying the style of the first training black-and-white photo, an object classification neural network classifying the shooting object in the first training black-and-white photo, and a multi-layer perceptron (MLP) creating the first embedding vector by combining outputs of the style classification neural network and the object classification neural network. In an exemplary embodiment, the training of the first neural network includes training the first neural network so that the first embedding vector v is positioned to be close for the same style and to be far for a different shooting object in an embedding space. In an exemplary embodiment, the creating of the second embedding vector includes creating the second training black-and-white photo by converting a gradation value of the training color photo into a grayscale, and inputting the second training black-and-white photo into each of the style classification neural network and the object classification neural network, and determining an output of the multi-layer perceptron as the second embedding vector. In an exemplary embodiment, the identifying of the training proxy vector includes identifying any one training proxy vector corresponding to a random style arbitrarily set for the training color photo and a shooting object in the training color photo among the plurality of proxy vectors. In an exemplary embodiment, the shooting object in the training color photo is identified by inputting the se