US-12626370-B2 - Machine learning method distinguishing foreground and background of image
Abstract
A method for training a machine learning model is described. Foregrounds and backgrounds of a first image are distinguished to generate a first mask image. The first image is cropped to generate second and third images. The first mask image is cropped to generate second and third mask images. Positions of the second and the third mask image correspond to positions of the second and the third image, respectively. First and second feature vector groups of the second image and the third image are generated by a model. A first matrix is generated according to the first and second feature vector groups. A second matrix is generated according to the second and third mask images. A function is generated according to the first and second matrices. The model is adjusted according to the function.
Inventors
- Shen-Hsuan LIU
- Van Nhiem TRAN
- Kai-Lin Yang
- Chi-En Huang
- Muhammad Saqlain Aslam
- Yung-Hui Li
Assignees
- HON HAI PRECISION INDUSTRY CO., LTD.
- FOXCONN TECHNOLOGY GROUP CO., LTD.
Dates
- Publication Date
- 2026-05-12
- Application Date
- 2023-09-21
Claims (20)
- 1 . A machine learning method comprising: distinguishing foregrounds of a first image and backgrounds of the first image to generate a first mask image; cropping the first image to generate a second image and a third image; cropping the first mask image to generate a second mask image and a third mask image, wherein a position of the second mask image and a position of the third mask image correspond to a position of the second image and a position of the third image, respectively; generating a first feature vector group of the second image and a second feature vector group of the third image by a model; generating a first matrix according to the first feature vector group and the second feature vector group; generating a second matrix according to the second mask image and the third mask image; generating a function according to the first matrix and the second matrix; and adjusting the model according to the function.
- 2 . The machine learning method of claim 1 , further comprising: cropping each of the first image and the first mask image, to generate a fourth image and the second mask image, wherein the position of the second mask image is same as a position of the fourth image; cropping each of the first image and the first mask image, to generate a fifth image and the third mask image, wherein the position of the third mask image is same as a position of the fifth image; processing the fourth image to generate the second image; and processing the fifth image to generate the third image.
- 3 . The machine learning method of claim 1 , further comprising: cropping each of the first image and the first mask image, to generate a fourth image and a fourth mask image, wherein a position of the fourth mask image is same as a position of the fourth image; generating a foreground rate of the fourth image according to the fourth mask image; when the foreground rate is larger than or equal to a preset foreground rate and an image size of the fourth image is larger than or equal to a preset size, selecting the fourth image and the fourth mask image as the second image and the second mask image; and when the foreground rate is smaller than the preset foreground rate or the image size of the fourth image is smaller than the preset size, cropping each of the first image and the first mask image again.
- 4 . The machine learning method of claim 1 , wherein generating the second matrix comprises: in response to each of foregrounds of the second mask image and foregrounds of the third mask image, generating a first portion of the second matrix; and in response to at least one of backgrounds of the second mask image and backgrounds of the third mask image, generating a second portion of the second matrix, wherein each pixel in the first portion has a first logic value, and each pixel in the second portion has a second logic value different from the first logic value.
- 5 . The machine learning method of claim 4 , further comprising: extracting features of the second image, to generate a first feature map group of the second image, each feature map in the first feature map group having a size; extracting features of the third image, to generate a second feature map group of the third image, each feature map in the second feature map group having the size; resizing the second mask image, to generate a fourth mask image having the size; resizing the third mask image, to generate a fifth mask image having the size; and calculating the fourth mask image and the fifth mask image to generate the second matrix.
- 6 . The machine learning method of claim 4 , wherein generating the function comprises: according to first positions of the first portion, selecting corresponding first similarity values of the first matrix as a third portion, wherein when the first similarity values of the third portion are increased, the function is decreased.
- 7 . The machine learning method of claim 6 , wherein generating the function further comprises: according to second positions of the second portion, selecting corresponding second similarity values of the first matrix as a fourth portion, wherein when the second similarity values of the fourth portion are increased, the function is increased.
- 8 . A machine learning method comprising: distinguishing foregrounds of a first image and backgrounds of the first image to generate a first mask image; cropping each of the first image and the first mask image, to generate a second image and a second mask image, wherein a position of the second mask image is same as a position of the second image; generating a foreground rate of the second image according to the second mask image, the foreground rate being a foreground area divided by an image size of the second image; and when the foreground rate is larger than or equal to a preset foreground rate and the image size of the second image is larger than or equal to a preset size, generating a function at least according to the second mask image and the second image, to train a model.
- 9 . The machine learning method of claim 8 , further comprising: cropping each of the first image and the first mask image, to generate a third image and a third mask image, wherein a position of the third mask image is same as a position of the third image; generating a first feature map group of the second image and a second feature map group of the third image by the model; generating a first feature vector group of the first feature map group and a second feature vector group of the second feature map group by the model; generating a first matrix according to the first feature vector group and the second feature vector group; generating a second matrix according to the second mask image and the third mask image; and generating the function according to similarity values of the first matrix and positions of a first portion of the second matrix, wherein the first portion corresponds to each of foregrounds of the second mask image and foregrounds of the third mask image.
- 10 . The machine learning method of claim 9 , further comprising: generating the function according to the similarity values of the first matrix and positions of a second portion of the second matrix, wherein the second portion corresponds to at least one of backgrounds of the second mask image and backgrounds of the third mask image, wherein when ones of the similarity values corresponding to the first portion are increased, the function is decreased, and when ones of the similarity values corresponding to the second portion are increased, the function is increased.
- 11 . The machine learning method of claim 9 , further comprising: stacking the first image and the first mask image to generate a first image group; performing first geometry augment operations to the first image group to generate a second image group; performing first augment operations to an image of the second image group to generate the second image; and outputting another image of the second image group as the second mask image, wherein the first augment operations are not geometry augment operations.
- 12 . The machine learning method of claim 11 , further comprising: performing second geometry augment operations to the first image group to generate a third image group; performing second augment operations to an image of the third image group to generate the third image; and outputting another image of the third image group as the third mask image, wherein the second geometry augment operations are different from the first geometry augment operations.
- 13 . The machine learning method of claim 9 , wherein generating the second matrix comprises: resizing the second mask image, to generate a fourth mask image; and resizing the third mask image, to generate a fifth mask image, wherein a size of the fourth mask image is same as a size of the fifth mask image.
- 14 . The machine learning method of claim 13 , wherein generating the second matrix further comprises: flattening the fourth mask image to generate a first vector; flattening the fifth mask image to generate a second vector; and performing an outer product to the first vector and the second vector to generate the second matrix.
- 15 . The machine learning method of claim 14 , further comprising: selecting first similarity values in the first matrix according to the second matrix; and calculating the function at least with the first similarity values.
- 16 . The machine learning method of claim 15 , further comprising: generating a third matrix opposite to the second matrix; selecting second similarity values in the first matrix according to the third matrix; and calculating the function with the first similarity values and the second similarity values.
- 17 . A machine learning method comprising: generating a first mask image including a first portion and a second portion; determining a logic value of a pixel of a second mask image according to a ratio of the first portion in a corresponding region of the first mask image; generating a third mask image including a third portion and a fourth portion; determining a logic value of a pixel of a fourth mask image according to a ratio of the third portion in a corresponding region of the third mask image; generating a function according to the second mask image and the fourth mask image; and training a model according to the function, wherein each of the first portion and the third portion corresponds to foregrounds of a first image, and each of the second portion and the fourth portion corresponds to backgrounds of the first image.
- 18 . The machine learning method of claim 17 , further comprising: cropping the first image to generate a second image and a third image; and processing the second image and the third image by the model to generate a first matrix, wherein generating the function comprises generating the function according to the first matrix, and the second image and the third image correspond to the first mask image and the third mask image, respectively.
- 19 . The machine learning method of claim 18 , wherein a position and a size of the second image is same as a position and a size of the first mask image, and a position and a size of the third image is same as a position and a size of the third mask image.
- 20 . The machine learning method of claim 18 , wherein generating the function further comprises: performing calculations to logic values of the second mask image and logic values of the fourth mask image, to generate a second matrix; generating a third matrix according to positions of the second matrix having a first logic value and similarity values of the first matrix; and generating a fourth matrix according to positions of the second matrix having a second logic value and the similarity values of the first matrix, wherein the first logic value is different from the second logic value, when similarity values of the third matrix are increased, the function is increased, and when similarity values of the fourth matrix are increased, the function is decreased.
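The matrix construction recited in claims 13-16 — flattening the two resized mask images, taking an outer product to form the second matrix, and using an opposite matrix to pick out background-related similarity values — can be sketched in a few lines of numpy. The function and variable names below are illustrative, not drawn from the patent.

```python
import numpy as np

def mask_similarity_selection(sim, mask_a, mask_b):
    """Sketch of claims 13-16: build the second matrix from two binary mask
    images and use it to select similarity values from the first matrix.

    sim:    (N, N) similarity matrix between the two feature vector groups
    mask_a: binary mask resized to the feature-map grid (N pixels total)
    mask_b: binary mask resized to the feature-map grid (N pixels total)
    """
    v1 = mask_a.reshape(-1).astype(np.float32)   # flatten the fourth mask image
    v2 = mask_b.reshape(-1).astype(np.float32)   # flatten the fifth mask image
    second = np.outer(v1, v2)                    # outer product -> second matrix
    third = 1.0 - second                         # matrix opposite to the second
    fg_sims = sim[second == 1]                   # foreground-foreground pairs
    bg_sims = sim[third == 1]                    # pairs involving a background
    return second, fg_sims, bg_sims
```

The second matrix is 1 only where both flattened mask positions are foreground, so indexing the similarity matrix with it separates the foreground-pair similarities (claim 15) from the rest (claim 16).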
Description
CROSS-REFERENCE TO RELATED APPLICATION

This application claims priority to U.S. Provisional Application Ser. No. 63/376,443, filed Sep. 21, 2022, which is herein incorporated by reference in its entirety.

BACKGROUND

Technical Field

The present disclosure relates to a machine learning technique. More particularly, the present disclosure relates to a machine learning method.

Description of Related Art

When a model is trained, images belonging to the same category are inputted into the model to generate a loss function. Since the images belong to the same category, adjusting parameters of the model to decrease the loss function can improve the results of the model's downstream tasks, such as classification results. However, the approaches described above may not separate the foregrounds and backgrounds of an image properly, such that the training result is poor. Thus, techniques for overcoming the problems described above are an important issue in the field.

SUMMARY

The present disclosure provides a machine learning method. The machine learning method includes: distinguishing foregrounds of a first image and backgrounds of the first image to generate a first mask image; cropping the first image to generate a second image and a third image; cropping the first mask image to generate a second mask image and a third mask image, wherein a position of the second mask image and a position of the third mask image correspond to a position of the second image and a position of the third image, respectively; generating a first feature vector group of the second image and a second feature vector group of the third image by a model; generating a first matrix according to the first feature vector group and the second feature vector group; generating a second matrix according to the second mask image and the third mask image; generating a function according to the first matrix and the second matrix; and adjusting the model according to the function.
The present disclosure provides a machine learning method. The machine learning method includes: distinguishing foregrounds of a first image and backgrounds of the first image to generate a first mask image; cropping each of the first image and the first mask image, to generate a second image and a second mask image, wherein a position of the second mask image is same as a position of the second image; generating a foreground rate of the second image according to the second mask image, the foreground rate being a foreground area divided by an image size of the second image; and when the foreground rate is larger than or equal to a preset foreground rate and the image size of the second image is larger than or equal to a preset size, generating a function at least according to the second mask image and the second image, to train a model.

The present disclosure provides a machine learning method. The machine learning method includes: generating a first mask image including a first portion and a second portion; determining a logic value of a pixel of a second mask image according to a ratio of the first portion in a corresponding region of the first mask image; generating a third mask image including a third portion and a fourth portion; determining a logic value of a pixel of a fourth mask image according to a ratio of the third portion in a corresponding region of the third mask image; generating a function according to the second mask image and the fourth mask image; and training a model according to the function. Each of the first portion and the third portion corresponds to foregrounds of a first image, and each of the second portion and the fourth portion corresponds to backgrounds of the first image.

It is to be understood that both the foregoing general description and the following detailed description are examples, and are intended to provide further explanation of the disclosure as claimed.
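The two remaining methods summarized above lend themselves to short sketches: (a) the foreground-rate acceptance test for a crop, and (b) determining a downsampled mask pixel's logic value from the foreground ratio of the corresponding region. The numpy code below is a minimal illustration; the thresholds `min_rate`, `min_size`, and `fg_ratio` are placeholders, not values from the patent.

```python
import numpy as np

def foreground_rate(crop_mask):
    """Foreground rate of a crop: foreground area divided by image size."""
    return float(crop_mask.sum()) / crop_mask.size

def accept_crop(crop_mask, min_rate=0.3, min_size=32):
    """Keep a crop only if it shows enough foreground and is large enough;
    otherwise the caller crops the first image (and its mask) again."""
    h, w = crop_mask.shape
    return foreground_rate(crop_mask) >= min_rate and min(h, w) >= min_size

def downsample_mask(mask, out_h, out_w, fg_ratio=0.5):
    """Set each output pixel's logic value from the ratio of foreground
    in the corresponding region of the input binary mask."""
    h, w = mask.shape
    bh, bw = h // out_h, w // out_w        # assumes even divisibility
    blocks = mask[:out_h * bh, :out_w * bw].reshape(out_h, bh, out_w, bw)
    ratios = blocks.mean(axis=(1, 3))      # foreground ratio per region
    return (ratios >= fg_ratio).astype(np.uint8)
```

Downsampling the mask this way lets it match the spatial size of the model's feature maps (as in claims 5 and 13) while each output pixel still reflects whether its source region is predominantly foreground.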
BRIEF DESCRIPTION OF THE DRAWINGS

Aspects of the present disclosure are best understood from the following detailed description when read with the accompanying figures. It is noted that, in accordance with the standard practice in the industry, various features are not drawn to scale. In fact, the dimensions of the various features may be arbitrarily increased or reduced for clarity of discussion.

FIG. 1 is a flowchart diagram of a machine learning method illustrated according to some embodiments of the present disclosure.

FIG. 2 is a flowchart diagram of further details of the machine learning method shown in FIG. 1, illustrated according to some embodiments of the present disclosure.

FIG. 3 is a flowchart diagram of further details of an operation shown in FIG. 1, illustrated according to some embodiments of the present disclosure.

FIG. 4 is a flowchart diagram of a method illustrated according to some embodiments of the present disclosure.

FIG. 5 is a flowchart diagram of further details of an operation shown in FIG. 1, illustrated according to some embodiments of the present disclosure.