CN-115457260-B - Model optimization method, device, electronic equipment and storage medium
Abstract
The invention provides a model optimization method, a device, electronic equipment and a storage medium, and relates to the technical field of artificial intelligence. The method comprises: constructing, based on at least one image in a first data set, a first target image corresponding to each image, wherein the first target image has more channels than the image; inputting the first target image into an interactive segmentation model to obtain first feature data and a first segmentation result output by the interactive segmentation model, wherein the interactive segmentation model is obtained by training on a second target image corresponding to a sample image in a second data set; determining pseudo tag data corresponding to the first target image based on the first feature data and the first segmentation result; and optimizing the interactive segmentation model based on the pseudo tag data and the true label data. The method optimizes the interactive segmentation model and improves the accuracy with which the model segments object instances in an image.
Inventors
- ZHANG ZHAOXIANG
- Gan Ruitong
- FAN JUNSONG
- WANG YUXI
Assignees
- 中国科学院香港创新研究院人工智能与机器人创新中心有限公司
Dates
- Publication Date
- 20260508
- Application Date
- 20220801
Claims (7)
- 1. A model optimization method, comprising:
  constructing, based on at least one image in a first data set, a first target image corresponding to each image, wherein the first target image has more channels than the image;
  inputting the first target image into an interactive segmentation model to obtain first feature data and a first segmentation result output by the interactive segmentation model, wherein the interactive segmentation model is obtained by training on a second target image corresponding to a sample image in a second data set, the interactive segmentation model is used for segmenting an object instance in the second target image, and the second target image in the second data set comprises true label data;
  determining pseudo tag data corresponding to the first target image based on the first feature data and the first segmentation result; and
  optimizing the interactive segmentation model based on the pseudo tag data and the true label data;
  wherein the determining, based on the first feature data and the first segmentation result, of pseudo tag data corresponding to the first target image comprises:
  determining at least one prototype feature of the object instance based on the first feature data and the first segmentation result; and
  determining the pseudo tag data corresponding to the first target image based on each prototype feature;
  wherein the at least one prototype feature comprises a global prototype feature, a central prototype feature and an edge prototype feature, and the determining of at least one prototype feature of the object instance based on the first feature data and the first segmentation result comprises:
  determining, based on the first feature data and the first segmentation result, second feature data in the first feature data corresponding to pixels at foreground point positions in the first segmentation result;
  summing the second feature data and averaging the obtained sum to obtain the global prototype feature of the object instance;
  randomly selecting x first foreground point positions from a plurality of foreground point positions corresponding to the first segmentation result, wherein x is a positive integer;
  determining, based on the x first foreground point positions, a second foreground point position corresponding to the first target image, and the first feature data, third feature data corresponding to the pixels at the first foreground point positions and at the second foreground point position respectively;
  summing the plurality of third feature data and averaging the obtained sum to obtain the central prototype feature of the object instance;
  eroding the first segmentation result to obtain an edge position corresponding to the first segmentation result; and
  summing fourth feature data corresponding to the pixels at the edge position and averaging the obtained sum to obtain the edge prototype feature of the object instance;
  wherein the determining, based on each prototype feature, of the pseudo tag data corresponding to the first target image comprises:
  calculating, based on the global prototype feature, the central prototype feature and the edge prototype feature respectively, a squared Euclidean distance corresponding to the feature vector of each pixel;
  determining at least one confidence matrix based on each squared Euclidean distance, wherein the confidence matrix comprises at least one of a global confidence matrix, a central confidence matrix and an edge confidence matrix;
  performing weighted summation on the central confidence matrix and the edge confidence matrix to obtain a weighted-sum confidence matrix;
  determining, based on the global confidence matrix and the weighted-sum confidence matrix, a first confidence value corresponding to the global confidence matrix and a second confidence value corresponding to the weighted-sum confidence matrix;
  determining, based on the first confidence value and the second confidence value, a second segmentation result corresponding to the first confidence value and a third segmentation result corresponding to the second confidence value; and
  determining the pseudo tag data corresponding to the first target image based on the first segmentation result, the second segmentation result and the third segmentation result;
  and wherein the optimizing of the interactive segmentation model based on the pseudo tag data and the true label data comprises:
  sampling a third target image in the first data set based on the pseudo tag data;
  determining a fourth target image in the second data set based on the third target image and the true label data; and
  optimizing the interactive segmentation model based on the third target image and the fourth target image.
- 2. The model optimization method according to claim 1, wherein the determining of pseudo tag data corresponding to the first target image based on the first segmentation result, the second segmentation result and the third segmentation result comprises: judging, for each pixel position, whether the sum of the values at that position in the first segmentation result, the second segmentation result and the third segmentation result is greater than a preset threshold; in the case that the sum is greater than the preset threshold, the pixel is a foreground pixel; and in the case that the sum is not greater than the preset threshold, the pixel is a background pixel.
- 3. The model optimization method according to claim 1, wherein the optimizing of the interactive segmentation model based on the third target image and the fourth target image comprises: optimizing the interactive segmentation model based on the third target image and the fourth target image by adopting formula (1) and formula (2); (1) (2) wherein the quantities appearing in formula (1) and formula (2) are: the feature alignment loss function; the global prototype feature corresponding to the third target image; the global prototype feature corresponding to the fourth target image; the global confidence matrix corresponding to the fourth target image; the feature data corresponding to the i-th pixel in the fourth target image; the maximum square error loss function; H, the length of the fourth target image; W, the width of the fourth target image; c, the category corresponding to the pseudo tag; C, the total number of categories; and the probability that the i-th pixel in the fourth target image belongs to category c.
- 4. A model optimization apparatus, characterized by comprising a construction module, a first data acquisition module and a second data acquisition module, wherein:
  the construction module is configured to construct, based on at least one image in a first data set, a first target image corresponding to each image;
  the interactive segmentation model is obtained by training on a second target image corresponding to a sample image in a second data set and is used for segmenting an object instance in the target image, wherein the second target image in the second data set comprises true label data;
  a determining module is configured to determine pseudo tag data corresponding to the first target image based on the first feature data and the first segmentation result; and
  an optimization module is configured to optimize the interactive segmentation model based on the pseudo tag data and the true label data;
  the determining module being specifically configured to perform:
  determining, based on the first feature data and the first segmentation result, second feature data in the first feature data corresponding to pixels at foreground point positions in the first segmentation result;
  summing the second feature data and averaging the obtained sum to obtain a global prototype feature of the object instance;
  randomly selecting x first foreground point positions from a plurality of foreground point positions corresponding to the first segmentation result, wherein x is a positive integer;
  determining, based on the x first foreground point positions, a second foreground point position corresponding to the first target image, and the first feature data, third feature data corresponding to the pixels at the first foreground point positions and at the second foreground point position respectively;
  summing the plurality of third feature data and averaging the obtained sum to obtain a central prototype feature of the object instance;
  eroding the first segmentation result to obtain an edge position corresponding to the first segmentation result;
  summing fourth feature data corresponding to the pixels at the edge position and averaging the obtained sum to obtain an edge prototype feature of the object instance;
  calculating, based on the global prototype feature, the central prototype feature and the edge prototype feature respectively, a squared Euclidean distance corresponding to the feature vector of each pixel;
  determining at least one confidence matrix based on each squared Euclidean distance, wherein the confidence matrix comprises at least one of a global confidence matrix, a central confidence matrix and an edge confidence matrix;
  performing weighted summation on the central confidence matrix and the edge confidence matrix to obtain a weighted-sum confidence matrix;
  determining, based on the global confidence matrix and the weighted-sum confidence matrix, a first confidence value corresponding to the global confidence matrix and a second confidence value corresponding to the weighted-sum confidence matrix;
  determining, based on the first confidence value and the second confidence value, a second segmentation result corresponding to the first confidence value and a third segmentation result corresponding to the second confidence value; and
  determining the pseudo tag data corresponding to the first target image based on the first segmentation result, the second segmentation result and the third segmentation result;
  the optimization module being specifically configured to perform:
  sampling a third target image in the first data set based on the pseudo tag data;
  determining a fourth target image in the second data set based on the third target image and the true label data; and
  optimizing the interactive segmentation model based on the third target image and the fourth target image.
- 5. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the model optimization method according to any one of claims 1 to 3 when executing the program.
- 6. A non-transitory computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the model optimization method according to any one of claims 1 to 3.
- 7. A computer program product comprising a computer program, characterized in that the computer program, when executed by a processor, implements a model optimization method according to any one of claims 1 to 3.
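The pseudo tag generation recited in claims 1 and 2 can be pictured with a short NumPy sketch. This is a minimal illustration under stated assumptions, not the patented implementation. The function names, the exp(-d) mapping from squared Euclidean distance to confidence, the use of each confidence map's mean as its confidence value, the equal central/edge weights, the vote threshold of 1, and the reading of the "second foreground point position" as the user's click are all hypothetical choices that the claims do not specify.

```python
import numpy as np

def erode(mask):
    """4-neighbour binary erosion; pixels outside the image count as background."""
    p = np.pad(mask, 1, constant_values=False)
    return (p[1:-1, 1:-1] & p[:-2, 1:-1] & p[2:, 1:-1]
            & p[1:-1, :-2] & p[1:-1, 2:])

def prototypes(feats, mask, click, x=4, rng=None):
    """feats: (H, W, D) features; mask: (H, W) bool first segmentation result;
    click: (row, col), assumed here to be the second foreground point position.
    The mask is assumed to be non-empty."""
    rng = np.random.default_rng(0) if rng is None else rng
    fg = np.argwhere(mask)                          # all foreground positions
    global_proto = feats[mask].mean(axis=0)         # mean over every foreground pixel
    picks = fg[rng.choice(len(fg), size=min(x, len(fg)), replace=False)]
    pts = np.vstack([picks, np.asarray(click)[None, :]])
    central_proto = feats[pts[:, 0], pts[:, 1]].mean(axis=0)
    edge = mask & ~erode(mask)                      # foreground pixels removed by erosion
    edge_proto = feats[edge].mean(axis=0)
    return global_proto, central_proto, edge_proto

def confidence(feats, proto):
    """Assumption: confidence = exp(-squared Euclidean distance to the prototype)."""
    d2 = ((feats - proto) ** 2).sum(axis=-1)
    return np.exp(-d2)

def pseudo_label(feats, mask, click, w_central=0.5, vote_threshold=1):
    g, c, e = prototypes(feats, mask, click)
    conf_global = confidence(feats, g)
    conf_weighted = (w_central * confidence(feats, c)
                     + (1.0 - w_central) * confidence(feats, e))
    # Assumption: each confidence value is the mean of its confidence matrix.
    seg2 = conf_global > conf_global.mean()
    seg3 = conf_weighted > conf_weighted.mean()
    votes = mask.astype(int) + seg2.astype(int) + seg3.astype(int)
    return votes > vote_threshold                   # foreground where the sum exceeds the threshold

# Toy data: a 4x4 foreground square whose features differ from the background.
mask = np.zeros((8, 8), dtype=bool)
mask[2:6, 2:6] = True
feats = np.zeros((8, 8, 2))
feats[..., 1] = 1.0
feats[mask] = np.array([1.0, 0.0])
label = pseudo_label(feats, mask, click=(3, 3))
```

With a vote threshold of 1, a pixel is kept as foreground when at least two of the three segmentation results mark it as foreground.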
Description
Model optimization method, device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of artificial intelligence technologies, and in particular to a model optimization method, apparatus, electronic device, and storage medium.
Background
Image segmentation is an important, classical computer vision task with wide application in intelligent driving, video analysis, remote sensing monitoring and other fields. In interactive segmentation, the user selects specific object instances of interest in an image, and the corresponding objects are segmented out. How to choose the interaction strategy efficiently, and how the network model learns the content carried by the interaction information, are the key points of research.
In the related art, conventional interactive segmentation often depends on large, finely labeled data sets to train the network model, and the network model must be retrained for data sets that exhibit domain differences. This makes it difficult to apply an interactive segmentation model to real-world open environments and results in low image segmentation accuracy.
Disclosure of Invention
The invention provides a model optimization method, a model optimization device, electronic equipment and a storage medium, which address the low accuracy of image segmentation in the prior art by continuously optimizing an interactive segmentation model according to pseudo tag data, thereby improving the accuracy with which the interactive segmentation model segments object instances in an image.
The invention provides a model optimization method, which comprises the following steps:
constructing, based on at least one image in a first data set, a first target image corresponding to each image, wherein the first target image has more channels than the image;
inputting the first target image into an interactive segmentation model to obtain first feature data and a first segmentation result output by the interactive segmentation model, wherein the interactive segmentation model is obtained by training on a second target image corresponding to a sample image in a second data set, the interactive segmentation model is used for segmenting an object instance in the second target image, and the second target image in the second data set comprises true label data;
determining pseudo tag data corresponding to the first target image based on the first feature data and the first segmentation result; and
optimizing the interactive segmentation model based on the pseudo tag data and the true label data.
According to the model optimization method provided by the invention, the determining of the pseudo tag data corresponding to the first target image based on the first feature data and the first segmentation result comprises the following steps:
determining at least one prototype feature of the object instance based on the first feature data and the first segmentation result; and
determining the pseudo tag data corresponding to the first target image based on each prototype feature.
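The first step of the method, building a target image with more channels than the source image, can be pictured as follows. The text above only states that the channel count increases; this sketch assumes, as is common in interactive segmentation, that the extra channels encode the user's positive and negative clicks as binary disk maps. The function names, the disk radius and the two-channel layout are hypothetical.

```python
import numpy as np

def click_map(shape, clicks, radius=5):
    """Binary map with a filled disk of the given radius around each (row, col) click."""
    h, w = shape
    rows, cols = np.mgrid[0:h, 0:w]
    out = np.zeros((h, w), dtype=np.float32)
    for r, c in clicks:
        out[(rows - r) ** 2 + (cols - c) ** 2 <= radius ** 2] = 1.0
    return out

def build_target_image(image, pos_clicks, neg_clicks=(), radius=5):
    """Append positive and negative click maps to an (H, W, 3) image, giving (H, W, 5).
    Assumption: the additional channels carry the interaction information."""
    h, w = image.shape[:2]
    pos = click_map((h, w), pos_clicks, radius)
    neg = click_map((h, w), neg_clicks, radius)
    return np.concatenate(
        [image.astype(np.float32), pos[..., None], neg[..., None]], axis=-1
    )

# A blank RGB image with a single positive click at its center.
image = np.zeros((32, 32, 3), dtype=np.uint8)
target = build_target_image(image, pos_clicks=[(16, 16)])
```

The resulting array has five channels instead of three, which matches the requirement that the first target image has more channels than the image it is built from.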
According to the model optimization method provided by the invention, the at least one prototype feature comprises a global prototype feature, a central prototype feature and an edge prototype feature; the determining of at least one prototype feature of the object instance based on the first feature data and the first segmentation result comprises:
determining, based on the first feature data and the first segmentation result, second feature data in the first feature data corresponding to pixels at foreground point positions in the first segmentation result;
summing the second feature data and averaging the obtained sum to obtain the global prototype feature of the object instance;
randomly selecting x first foreground point positions from a plurality of foreground point positions corresponding to the first segmentation result, wherein x is a positive integer;
determining, based on the x first foreground point positions, a second foreground point position corresponding to the first target image, and the first feature data, third feature data corresponding to the pixels at the first foreground point positions and at the second foreground point position respectively;
summing the plurality of third feature data and averaging the obtained sum to obtain the central prototype feature of the object instance;
eroding the first segmentation result to obtain an edge position corresponding to the first segmentation result; and
summing fourth feature data corresponding to the pixels at the edge position and averaging the obtained sum to obtain the edge prototype feature of the object instance.
According to the model optimization method provided by the invention, the determining of the pseudo tag data corresponding to