EP-4742182-A1 - DATA PROCESSING METHOD AND RELATED APPARATUS, AND DEVICE AND STORAGE MEDIUM
Abstract
Disclosed in the present application are a data processing method and related apparatus, and a device and a storage medium. The method comprises: sending, to a server, K images captured in the current site environment, the server acquiring K first prediction results by means of an image recognition model; constructing a fine-tuning training set according to the K images and the K first prediction results; acquiring, by means of a model to be trained, a second prediction result of each image in the fine-tuning training set; updating model parameters of the model to be trained according to the second prediction result of each image and the first prediction result of the image in the fine-tuning training set, so as to obtain a local recognition model and model tuning parameters; and, if a model fine-tuning condition is met, sending the model tuning parameters to the server, the server updating model parameters of the image recognition model according to a set of model tuning parameters. Thus, the present application enables an image recognition model to be adapted to various site environments, improves the precision of model recognition, saves processing resources of the server, and improves the efficiency of model learning.
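The terminal-side loop summarized in the abstract — fine-tune a local model against the server's first prediction results, then report only the parameter adjustment — can be sketched as follows. This is purely an illustrative simplification, not the claimed embodiment: the linear model, toy feature vectors, squared-error loss, and learning rate are all assumptions introduced for demonstration, and the server's prediction here is a stand-in for the image recognition model's output.

```python
# Illustrative sketch of the terminal-side fine-tuning described in the
# abstract. Model form, features, loss, and learning rate are assumptions.

def server_predict(weights, features):
    """Stand-in for the server's image recognition model: returns a
    first prediction result (here, a single category score)."""
    return sum(w * x for w, x in zip(weights, features))

def fine_tune(local_weights, training_set, lr=0.1, epochs=50):
    """Update the to-be-trained model on K groups of fine-tuning data,
    each pairing an image's features with the server's first prediction
    result. Returns the local recognition model's weights and the model
    adjustment parameters (weight deltas) to send back to the server."""
    weights = list(local_weights)
    for _ in range(epochs):
        for features, first_prediction in training_set:
            # Second prediction result from the to-be-trained model.
            second_prediction = sum(w * x for w, x in zip(weights, features))
            error = second_prediction - first_prediction
            for i, x in enumerate(features):  # gradient descent step
                weights[i] -= lr * error * x
    adjustment = [w - w0 for w, w0 in zip(weights, local_weights)]
    return weights, adjustment

# K reference images, reduced to toy feature vectors for the sketch.
server_weights = [0.5, -0.2]
images = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fine_tuning_set = [(x, server_predict(server_weights, x)) for x in images]

local_weights = [0.0, 0.0]
tuned, adjustment = fine_tune(local_weights, fine_tuning_set)
```

Only `adjustment` leaves the terminal, which is the point of the scheme: the server never receives the raw on-site images after the initial exchange, yet can fold the deltas into the shared model.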
Inventors
- ZHANG, ZHENHONG
Assignees
- Tencent Technology (Shenzhen) Company Limited
Dates
- Publication Date
- 20260513
- Application Date
- 20240603
Claims (20)
- A data processing method, performed by a terminal, comprising: obtaining K reference images through photographing in an environment where the terminal is currently located, K being an integer greater than or equal to 1; transmitting the K reference images to a server, and receiving, from the server, K first prediction results generated based on the K reference images by using an image recognition model; constructing a fine-tuning training set according to the K reference images and the K first prediction results, the fine-tuning training set comprising K groups of fine-tuning training data, and each group of fine-tuning training data comprising one of the K reference images and the one of the K first prediction results corresponding to the reference image; fine-tuning a to-be-trained model on the terminal by using the fine-tuning training set through: obtaining, for a reference image comprised in each group of fine-tuning training data in the fine-tuning training set and by using the to-be-trained model, a second prediction result corresponding to the reference image, and updating a model parameter of the to-be-trained model according to the second prediction result corresponding to each reference image and the first prediction result of the reference image in the fine-tuning training set, to obtain a fine-tuned recognition model and a model adjustment parameter corresponding to the fine-tuned recognition model; and transmitting the model adjustment parameter to the server in response to the fine-tuned recognition model satisfying a model fine-tuning condition, to enable the server to update a model parameter of the image recognition model according to a model adjustment parameter set, the model adjustment parameter set comprising the model adjustment parameter.
- The method according to claim 1, further comprising: transmitting a model training request to the server; receiving an initial training set transmitted from the server, wherein the initial training set is determined by the server according to the model training request, and comprises M groups of initial training data, and each group of initial training data comprises a training image and an annotation result of the training image; obtaining M initial prediction results based on M training images comprised in the initial training set and by using an initial recognition model, wherein each initial prediction result comprises a predicted category and a category score of a training image; and updating a model parameter of the initial recognition model according to the M initial prediction results and M annotation results comprised in the initial training set, to obtain the to-be-trained model.
- The method according to claim 1, before the obtaining K reference images in an environment where the terminal is currently located, further comprising: obtaining environment information of the environment, wherein the environment information comprises at least one of light intensity and background noise; if the environment information comprises the light intensity and the light intensity does not fall within a light intensity range, adjusting a first application parameter of an image capturing apparatus for performing the photographing in response to a first adjustment operation being performed on the image capturing apparatus, wherein the first application parameter comprises at least one of a shutter speed, a light sensitivity parameter, and an exposure compensation parameter; and if the environment information comprises the background noise and the background noise is greater than or equal to a background noise threshold, adjusting a second application parameter of the image capturing apparatus in response to a second adjustment operation being performed on the image capturing apparatus, wherein the second application parameter comprises at least one of an acutance parameter, the light sensitivity parameter, and a denoising parameter.
- The method according to claim 1, after the updating a model parameter of the to-be-trained model according to the second prediction result corresponding to each reference image and the first prediction result of the reference image in the fine-tuning training set, to obtain a fine-tuned recognition model and a model adjustment parameter corresponding to the fine-tuned recognition model, further comprising: obtaining recognition accuracy of the fine-tuned recognition model for N new images, wherein N is an integer greater than or equal to 1, and the N new images are obtained by photographing through the image capturing apparatus; and determining that the fine-tuned recognition model satisfies the model fine-tuning condition if the recognition accuracy is greater than or equal to an accuracy threshold; or transmitting the model adjustment parameter to T reference terminals if the recognition accuracy is greater than or equal to the accuracy threshold, so that each of the T reference terminals updates, according to the model adjustment parameter, a model parameter of a corresponding to-be-trained model on the reference terminal, to obtain T updated recognition models, wherein the T reference terminals are associated with the terminal, and T is an integer greater than or equal to 1; obtaining a voting score of each of the T reference terminals, wherein the voting score of a reference terminal is determined according to a prediction result of an updated recognition model on the reference terminal and a prediction result of the image recognition model for a same image; determining a comprehensive recognition score according to the voting score of each reference terminal; and determining, if the comprehensive recognition score is greater than or equal to a recognition score threshold, that the fine-tuned recognition model satisfies the model fine-tuning condition.
- The method according to claim 4, wherein the obtaining recognition accuracy of the fine-tuned recognition model for N new images comprises: transmitting the N new images obtained by photographing through the image capturing apparatus to the server; receiving N third prediction results transmitted from the server, wherein the N third prediction results are generated based on the N new images by using the image recognition model; obtaining N fourth prediction results based on the N new images by using the fine-tuned recognition model; and performing verification on the N fourth prediction results according to the N third prediction results, to obtain the recognition accuracy for the N new images.
- The method according to claim 4, wherein the determining a comprehensive recognition score according to the voting score of each reference terminal comprises: summing up voting scores of the T reference terminals, to obtain a total voting score; and obtaining the comprehensive recognition score according to a ratio of the total voting score to the value T.
- The method according to claim 4, wherein the determining a comprehensive recognition score according to the voting score of each reference terminal comprises: obtaining a weight parameter set of each of the T reference terminals, wherein the weight parameter set comprises at least one of a device weight, an environment weight, and a preference weight; weighting, for each of the T reference terminals, the voting score of the reference terminal by using the weight parameter set of the reference terminal, to obtain a weighted voting score of the reference terminal; and determining the comprehensive recognition score according to the weighted voting score of each of the T reference terminals.
- The method according to claim 4, before the transmitting the model adjustment parameter to T reference terminals, further comprising: determining, if the terminal and at least one to-be-determined terminal are located in the same region, that the at least one to-be-determined terminal is associated with the terminal, and determining each of the at least one to-be-determined terminal as one of the T reference terminals; or determining, if the same binding object is bound to the terminal and at least one to-be-determined terminal, that the at least one to-be-determined terminal is associated with the terminal, and determining each of the at least one to-be-determined terminal as one of the T reference terminals; or determining, if the terminal and at least one to-be-determined terminal are connected to the same access point, that the at least one to-be-determined terminal is associated with the terminal, and determining each of the at least one to-be-determined terminal as one of the T reference terminals.
- The method according to claim 4, after the determining that the fine-tuned recognition model satisfies the model fine-tuning condition, further comprising: obtaining a to-be-tested image; obtaining a fifth prediction result based on the to-be-tested image by using the fine-tuned recognition model; obtaining T sixth prediction results from the T reference terminals, wherein each sixth prediction result is obtained by a reference terminal based on the to-be-tested image by using an updated recognition model on the reference terminal; and performing, if determining that the fine-tuned recognition model has been in a model stable state according to the fifth prediction result and the T sixth prediction results, a preset service by using the fine-tuned recognition model.
- The method according to claim 1, further comprising: transmitting a model fine-tuning request to a reference terminal of T reference terminals if the fine-tuned recognition model does not satisfy the model fine-tuning condition, wherein the T reference terminals are associated with the terminal, and T is an integer greater than or equal to 1; receiving a model adjustment parameter transmitted from the reference terminal, wherein the transmitted model adjustment parameter is for an updated recognition model obtained by the reference terminal through updating a model parameter of a to-be-trained model on the reference terminal according to the model fine-tuning request; and updating the model parameter of the to-be-trained model of the terminal by using the model adjustment parameter transmitted from the reference terminal.
- The method according to any one of claims 1 to 10, after the updating a model parameter of the to-be-trained model according to the second prediction result corresponding to each reference image and the first prediction result of the reference image in the fine-tuning training set, to obtain a fine-tuned recognition model and a model adjustment parameter corresponding to the fine-tuned recognition model, further comprising: obtaining, in a case that the fine-tuned recognition model satisfies the model fine-tuning condition, a to-be-recognized image by photographing through the image capturing apparatus; obtaining a seventh prediction result based on the to-be-recognized image by using the fine-tuned recognition model, wherein the seventh prediction result comprises a predicted category and a category score; and determining, if the category score is greater than or equal to a category score threshold, that the to-be-recognized image belongs to the predicted category.
- The method according to claim 11, after the obtaining a seventh prediction result based on the to-be-recognized image by using the fine-tuned recognition model, further comprising: transmitting the to-be-recognized image to the server if the category score is less than the category score threshold; and receiving an image recognition result transmitted from the server, wherein the image recognition result is generated by the server based on the to-be-recognized image by using the image recognition model.
- A data processing method, performed by a server, comprising: receiving K reference images transmitted by a terminal, the K images being obtained through photographing by the terminal in an environment where the terminal is currently located, and K being an integer greater than or equal to 1; obtaining K first prediction results based on the K reference images by using an image recognition model; transmitting the K first prediction results to the terminal, to enable the terminal to construct a fine-tuning training set according to the K reference images and the K first prediction results, fine-tune a to-be-trained model on the terminal by using the fine-tuning training set through: obtaining, for a reference image comprised in each group of fine-tuning training data in the fine-tuning training set and by using the to-be-trained model, a second prediction result corresponding to the reference image, and updating a model parameter of the to-be-trained model according to the second prediction result corresponding to each reference image and the first prediction result of the reference image in the fine-tuning training set, to obtain a fine-tuned recognition model and a model adjustment parameter corresponding to the fine-tuned recognition model, each group of fine-tuning training data comprising one of the K reference images and the one of the K first prediction results corresponding to the reference image; receiving, if the fine-tuned recognition model satisfies a model fine-tuning condition, the model adjustment parameter from the terminal; and updating a model parameter of the image recognition model when obtaining a model adjustment parameter set, the model adjustment parameter set comprising the model adjustment parameter.
- The method according to claim 13, wherein the model adjustment parameter set comprises at least one model adjustment parameter from at least one associated terminal, and the updating a model parameter of the image recognition model when obtaining a model adjustment parameter set comprises: receiving the at least one model adjustment parameter comprised in the model adjustment parameter set from the at least one associated terminal; performing weighting processing on the at least one model adjustment parameter according to a comprehensive recognition score corresponding to each associated terminal, to obtain a weighted model adjustment parameter set; and updating the model parameter of the image recognition model by using the weighted model adjustment parameter set.
- The method according to claim 13, wherein the updating a model parameter of the image recognition model when obtaining a model adjustment parameter set comprises: updating the model parameter of the image recognition model when obtaining a model parameter set, wherein the model parameter set comprises the model adjustment parameter, and the model adjustment parameter is a model parameter; or updating the model parameter of the image recognition model when obtaining a gradient set, wherein the gradient set comprises the model adjustment parameter, and the model adjustment parameter is a gradient; or updating the model parameter of the image recognition model when obtaining an optimization algorithm parameter set, wherein the optimization algorithm parameter set comprises the model adjustment parameter, and the model adjustment parameter is an optimization algorithm parameter.
- The method according to any one of claims 13 to 15, after the updating a model parameter of the image recognition model when obtaining a model adjustment parameter set, further comprising: transmitting the model adjustment parameter of the image recognition model to at least one associated terminal, to enable each associated terminal to update a model parameter of a local recognition model on the associated terminal by using the model adjustment parameter of the image recognition model.
- A data processing apparatus, deployed on a terminal, comprising: a photographing module, configured to obtain K reference images through photographing in an environment where the terminal is currently located, K being an integer greater than or equal to 1; a transmission module, configured to transmit the K reference images to a server, and receive, from the server, K first prediction results generated based on the K reference images by using an image recognition model; an obtaining module, configured to construct a fine-tuning training set according to the K reference images and the K first prediction results, the fine-tuning training set comprising K groups of fine-tuning training data, and each group of fine-tuning training data comprising one of the K reference images and the one of the K first prediction results corresponding to the reference image; and an updating module, configured to fine-tune a to-be-trained model on the terminal by using the fine-tuning training set through: obtaining, for a reference image comprised in each group of fine-tuning training data in the fine-tuning training set and by using the to-be-trained model, a second prediction result corresponding to the reference image, and updating a model parameter of the to-be-trained model according to the second prediction result corresponding to each reference image and the first prediction result of the reference image in the fine-tuning training set, to obtain a fine-tuned recognition model and a model adjustment parameter corresponding to the fine-tuned recognition model; the transmission module being further configured to transmit the model adjustment parameter to the server in response to the fine-tuned recognition model satisfying a model fine-tuning condition, to enable the server to update a model parameter of the image recognition model according to a model adjustment parameter set, the model adjustment parameter set comprising the model adjustment parameter.
- A data processing apparatus, deployed on a server, comprising: a receiving module, configured to receive K reference images transmitted by a terminal, the K images being obtained through photographing by the terminal in an environment where the terminal is currently located, and K being an integer greater than or equal to 1; an obtaining module, configured to obtain K first prediction results based on the K reference images by using an image recognition model; a transmission module, configured to transmit the K first prediction results to the terminal, to enable the terminal to construct a fine-tuning training set according to the K reference images and the K first prediction results, fine-tune a to-be-trained model on the terminal by using the fine-tuning training set through: obtaining, for a reference image comprised in each group of fine-tuning training data in the fine-tuning training set and by using the to-be-trained model, a second prediction result corresponding to the reference image, and updating a model parameter of the to-be-trained model according to the second prediction result corresponding to each reference image and the first prediction result of the reference image in the fine-tuning training set, to obtain a fine-tuned recognition model and a model adjustment parameter corresponding to the fine-tuned recognition model, each group of fine-tuning training data comprising one of the K reference images and the one of the K first prediction results corresponding to the reference image; the receiving module being further configured to receive, if the fine-tuned recognition model satisfies a model fine-tuning condition, the model adjustment parameter from the terminal; and an updating module, configured to update a model parameter of the image recognition model when obtaining a model adjustment parameter set, the model adjustment parameter set comprising the model adjustment parameter.
- A computer device, comprising a memory and a processor, the memory having a computer program stored therein, the processor, when executing the computer program, implementing the operations of the method according to any one of claims 1 to 12 or the operations of the method according to any one of claims 13 to 16.
- A computer-readable storage medium, having a computer program stored therein, the computer program, when executed by a processor, implementing the operations of the method according to any one of claims 1 to 12 or the operations of the method according to any one of claims 13 to 16.
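The server-side update recited in the claims above (claims 13 to 16, with the comprehensive recognition score of claims 6 and 14) can be sketched as follows. This is an illustrative simplification only, not a prescribed embodiment: the weighted-averaging scheme, the parameter shapes, and the sample voting scores are assumptions made for demonstration.

```python
# Illustrative sketch of the server-side aggregation: model adjustment
# parameters from associated terminals are weighted by each terminal's
# comprehensive recognition score and folded into the image recognition
# model. Scheme and shapes are assumptions, not the claimed embodiment.

def comprehensive_score(voting_scores):
    """Claim 6: the ratio of the total voting score of the T reference
    terminals to the value T."""
    return sum(voting_scores) / len(voting_scores)

def update_global_model(global_params, adjustments, scores):
    """Claim 14 (sketch): weight each terminal's model adjustment
    parameter by its comprehensive recognition score, then apply the
    weighted mean of the adjustments to the image recognition model."""
    total = sum(scores)
    new_params = list(global_params)
    for adjustment, score in zip(adjustments, scores):
        for i, delta in enumerate(adjustment):
            new_params[i] += (score / total) * delta
    return new_params

global_params = [0.5, -0.2]                 # image recognition model
adjustments = [[0.1, 0.0], [0.3, -0.1]]     # from two associated terminals
scores = [comprehensive_score([1, 1, 0]),   # terminal 1: 2 of 3 votes
          comprehensive_score([1, 1, 1])]   # terminal 2: unanimous
updated = update_global_model(global_params, adjustments, scores)
```

Claim 15 notes that the adjustment parameters may equally be raw model parameters, gradients, or optimizer-state parameters; the same weighted aggregation applies in each case, only the quantity being averaged changes.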
Description
RELATED APPLICATION
This application claims priority to Chinese Patent Application No. 202310894889.1, filed with the China National Intellectual Property Administration on July 20, 2023 and entitled "DATA PROCESSING METHOD AND RELATED APPARATUS, AND DEVICE AND STORAGE MEDIUM", which is incorporated herein by reference in its entirety.
FIELD OF THE TECHNOLOGY
This application relates to the field of artificial intelligence technologies, and in particular, to a data processing method, a related apparatus, a device, and a storage medium.
BACKGROUND OF THE DISCLOSURE
In recent years, artificial intelligence (AI) technologies have developed rapidly and are widely applied in the image recognition field. AI can recognize biometric objects (for example, a human face, an iris, or a palmprint), items, text, and the like in an image by using complex algorithms and models, thereby implementing intelligent image processing and analysis. Image capturing in different environments is usually susceptible to complex environmental factors; for example, light intensity and background noise differ across environments. These environmental factors may affect the accuracy of image recognition. Therefore, in the related technology, a large quantity of images may be captured in different environments for model training, to enhance model recognition capability. However, in the related technology, on one hand, because the images used for model training can hardly cover all environments, the sample types that models can learn are limited, resulting in a poor model learning effect. On the other hand, training models on a large quantity of images consumes both considerable computing power and considerable time. No effective solution to the foregoing problem has been provided yet.
SUMMARY
Embodiments of this application provide a data processing method, a related apparatus, a device, and a storage medium, so that an image recognition model is applicable to various specific on-site environments to improve model recognition precision, and processing resources of a server are saved and model learning efficiency is improved. In view of this, according to one aspect of this application, a data processing method is provided, performed by an on-site terminal, including: photographing K images in a current on-site environment by using an image capturing apparatus, K being an integer greater than or equal to 1; transmitting the K images to a server, so that the server obtains K first prediction results based on the K images by using an image recognition model; constructing a fine-tuning training set according to the K images and the K first prediction results transmitted by the server, the fine-tuning training set including K groups of fine-tuning training data, and each group of fine-tuning training data including an image and a first prediction result of the image; fine-tuning a to-be-trained model on the on-site terminal by using the fine-tuning training set, and in a process of fine-tuning the to-be-trained model on the on-site terminal, obtaining, based on an image included in each group of fine-tuning training data in the fine-tuning training set and by using the to-be-trained model, a second prediction result corresponding to each image, and updating a model parameter of the to-be-trained model according to the second prediction result corresponding to each image and the first prediction result of the image in the fine-tuning training set, to obtain a local recognition model and a model adjustment parameter corresponding to the local recognition model; and sending the model adjustment parameter to the server if the local recognition model satisfies a model fine-tuning condition, so that the server updates a model parameter of an image recognition
model according to a model adjustment parameter set from at least one terminal, the model adjustment parameter set including the model adjustment parameter. According to another aspect of this application, a data processing method is provided, performed by a server, including: receiving K images transmitted by an on-site terminal, the K images being photographed by the on-site terminal in a current on-site environment by using an image capturing apparatus, and K being an integer greater than or equal to 1; obtaining K first prediction results based on the K images by using an image recognition model; transmitting the K first prediction results to the on-site terminal, so that the on-site terminal constructs a fine-tuning training set according to the K images and the K first prediction results, fine-tunes a to-be-trained model on the on-site terminal by using the fine-tuning training set, and in a process of fine-tuning the to-be-trained model on the on-site terminal, obtains, based on an image included in each group of fine-tuning training data in the fine-tuning training set and by using the to-be-trained model, a second prediction result corresponding to each image, and upd