EP-4738208-A1 - COMMUNICATION METHOD, APPARATUS AND SYSTEM


Abstract

This application provides a communication method, apparatus, and system. The method provides a solution that can effectively improve accuracy of a model obtained through federated learning. Specifically, the method includes: before training a first model from a first network element by using a local dataset, a second network element may perform data processing on the local dataset based on an indication of the first network element; and then train the first model by using a local dataset obtained after the data processing, to obtain a second model, and send the second model to the first network element. The data processing is used to enable the local dataset to meet a data reliability requirement. In this way, before model training is performed, data processing is first performed on the local dataset used for performing model training. This can effectively improve reliability of the local dataset, and can effectively improve accuracy of the model obtained through federated learning.

Inventors

LIN, HSIAO-YING
LEI, ZHONGDING
SHI, JIE

Assignees

Huawei Technologies Co., Ltd.

Dates

Publication Date: 20260506
Application Date: 20240711

Claims (20)

A communication method, comprising: sending, by a first network element, a first model and first indication information to a second network element, wherein the first indication information indicates the second network element to perform data processing on a first dataset, the first dataset is a dataset obtained by the second network element, and the first dataset is used by the second network element to train the first model; and receiving, by the first network element, a second model sent by the second network element, wherein the second model is a model obtained after the second network element trains the first model by using a second dataset, and the second dataset is a dataset obtained after the second network element performs data processing on the first dataset.
The method according to claim 1, wherein the method further comprises: sending, by the first network element, second indication information to the second network element, wherein the second indication information indicates an algorithm, and the algorithm is used by the second network element to perform data processing on the first dataset.
The method according to claim 1 or 2, wherein the method further comprises: sending, by the first network element, a data amount threshold to the second network element, wherein the data amount threshold indicates the second network element to train the first model when a data amount of the second dataset is greater than or equal to the data amount threshold.
The method according to any one of claims 1 to 3, wherein the method further comprises: receiving, by the first network element, first feedback information sent by the second network element, wherein the first feedback information indicates that the second network element has performed data processing on the first dataset.
The method according to any one of claims 1 to 4, wherein the method further comprises: receiving, by the first network element, a data amount of the first dataset and the data amount of the second dataset that are sent by the second network element.
The method according to any one of claims 2 to 5, wherein the method further comprises: receiving, by the first network element, second feedback information sent by the second network element, wherein the second feedback information indicates that the second network element has performed data processing on the first dataset according to the algorithm.
The method according to any one of claims 1 to 6, wherein the method further comprises: sending, by the first network element, request information to the second network element, wherein the request information is used to request the second network element to train the first model, and the request information comprises the first model and the first indication information.
A communication method, comprising: receiving, by a second network element, a first model and first indication information that are sent by a first network element, wherein the first indication information indicates the second network element to perform data processing on a first dataset, the first dataset is a dataset obtained by the second network element, and the first dataset is used by the second network element to train the first model; performing, by the second network element, data processing on the first dataset based on the first indication information, to obtain a second dataset; training, by the second network element, the first model based on the second dataset, to obtain a second model; and sending, by the second network element, the second model to the first network element.
The method according to claim 8, wherein the method further comprises: receiving, by the second network element, second indication information sent by the first network element, wherein the second indication information indicates an algorithm, and the algorithm is used by the second network element to perform data processing on the first dataset.
The method according to claim 8 or 9, wherein the method further comprises: receiving, by the second network element, a data amount threshold sent by the first network element, wherein the data amount threshold indicates the second network element to train the first model when a data amount of the second dataset is greater than or equal to the data amount threshold.
The method according to any one of claims 8 to 10, wherein the method further comprises: sending, by the second network element, first feedback information to the first network element, wherein the first feedback information indicates that the second network element has performed data processing on the first dataset.
The method according to any one of claims 8 to 11, wherein the method further comprises: sending, by the second network element, a data amount of the first dataset and the data amount of the second dataset to the first network element.
The method according to any one of claims 9 to 12, wherein the method further comprises: sending, by the second network element, second feedback information to the first network element, wherein the second feedback information indicates that the second network element has performed data processing on the first dataset according to the algorithm.
The method according to any one of claims 8 to 13, wherein the method further comprises: receiving, by the second network element, request information sent by the first network element, wherein the request information is used to request the second network element to train the first model, and the request information comprises the first model and the first indication information.
A communication apparatus, comprising a processor, wherein the processor is configured to execute a computer program or instructions, or is enabled through a logic circuit, to cause the communication apparatus to perform the method according to any one of claims 1 to 7, or perform the method according to any one of claims 8 to 14.
The communication apparatus according to claim 15, wherein the communication apparatus further comprises a memory, and the memory is configured to store the computer program or the instructions.
The communication apparatus according to claim 15 or 16, wherein the communication apparatus further comprises a communication interface, and the communication interface is configured to input a signal and/or output a signal.
A communication apparatus, comprising a logic circuit and an input/output interface, wherein the input/output interface is configured to input a signal and/or output a signal; and the logic circuit is configured to perform the method according to any one of claims 1 to 7; or the logic circuit is configured to perform the method according to any one of claims 8 to 14.
A computer-readable storage medium, wherein the computer-readable storage medium stores a computer program or instructions; and when the computer program or the instructions are executed on a computer, the method according to any one of claims 1 to 7 is performed, or the method according to any one of claims 8 to 14 is performed.
A computer program product, comprising instructions, wherein when the instructions are run on a computer, the method according to any one of claims 1 to 7 is performed, or the method according to any one of claims 8 to 14 is performed.

Description

This application claims priority to Chinese Patent Application No. 202310883716.X, filed with the China National Intellectual Property Administration on July 18, 2023 and entitled "COMMUNICATION METHOD, APPARATUS, AND SYSTEM", which is incorporated herein by reference in its entirety.

TECHNICAL FIELD

This application relates to the field of communication technologies, and more specifically, to a communication method, apparatus, and system.

BACKGROUND

Federated learning (FL) is an emerging basic artificial intelligence technology. It is designed to implement highly efficient machine learning between a plurality of participants or a plurality of computing nodes on the premise of ensuring information security during big data exchange, protecting terminal data and personal data privacy, and ensuring validity. Specifically, a server-side network element sends, to a client-side network element, a model that needs to be trained, and the client-side network element trains the model based on a local training dataset collected by the client-side network element, to obtain a model parameter of a trained model. The client-side network element sends the model parameter to the server-side network element, and the server-side network element performs aggregation processing on the model parameter, to complete model training on the model.

However, the accuracy of a model obtained through training based on the foregoing federated learning may be questionable, and consequently the model cannot be applied well in some scenarios. Therefore, how to effectively improve accuracy of a model obtained through federated learning is an urgent technical problem that needs to be resolved.

SUMMARY

This application provides a communication method, apparatus, and system, so that accuracy of a model obtained through federated learning can be effectively improved.

According to a first aspect, a communication method is provided.
The method includes: A first network element sends a first model and first indication information to a second network element, where the first indication information indicates the second network element to perform data processing on a first dataset, and the first dataset is used by the second network element to train the first model; and the first network element receives a second model sent by the second network element, where the second model is obtained after the second network element trains the first model by using a second dataset, and the second dataset is a dataset obtained after the second network element performs data processing on the first dataset.

The first dataset is a dataset that is obtained by the second network element and that is used to train the first model from the first network element. The second network element may obtain or collect the first dataset from another network element.

Specifically, that a first network element sends a first model to a second network element may be: The first network element sends a model architecture of the first model and an initial parameter in the model architecture to the second network element. That the first network element receives a second model may be: The first network element receives a model architecture of the second model and a parameter in the model architecture that are sent by the second network element. The model architecture of the first model may be the same as or different from the model architecture of the second model. This is not limited herein.

It should be noted that the foregoing data processing is used to enable the first dataset to meet a data reliability requirement.
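For illustration only, the exchange described in the first aspect can be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the names `Model`, `TrainingRequest`, `process_dataset`, `train`, and `second_network_element` are hypothetical, the "model" is a single-weight linear model y = weight * x, and the data processing is assumed to consist only of dropping incomplete and duplicate records. The application itself does not limit the type of data processing or the model.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# A record is (x, y); None marks a missing field (assumed representation).
Record = Tuple[Optional[float], Optional[float]]


@dataclass
class Model:
    architecture: str  # hypothetical label standing in for the model architecture
    weight: float      # the single parameter of y = weight * x


@dataclass
class TrainingRequest:
    model: Model        # the "first model" (architecture plus initial parameter)
    process_data: bool  # stands in for the "first indication information"


def process_dataset(first_dataset: List[Record]) -> List[Tuple[float, float]]:
    """Assumed data processing: drop incomplete records, then duplicates."""
    seen = set()
    second_dataset = []
    for x, y in first_dataset:
        if x is None or y is None:
            continue  # incomplete record
        if (x, y) in seen:
            continue  # duplicate record
        seen.add((x, y))
        second_dataset.append((x, y))
    return second_dataset


def train(model: Model, dataset: List[Tuple[float, float]],
          lr: float = 0.05, epochs: int = 200) -> Model:
    """Plain gradient descent on mean squared error; returns a new model."""
    w = model.weight
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in dataset) / len(dataset)
        w -= lr * grad
    return Model(model.architecture, w)


def second_network_element(request: TrainingRequest,
                           first_dataset: List[Record]) -> Model:
    """Process the local dataset as indicated, train, and return the second model."""
    if not request.process_data:
        # This sketch only covers the case in which processing is indicated.
        raise ValueError("sketch assumes data processing is indicated")
    second_dataset = process_dataset(first_dataset)
    return train(request.model, second_dataset)


if __name__ == "__main__":
    # First network element: send the first model and the indication.
    request = TrainingRequest(Model("linear", 0.0), process_data=True)
    # Second network element: local dataset with a duplicate and two incomplete records.
    local = [(1.0, 2.0), (2.0, 4.0), (2.0, 4.0), (3.0, None), (None, 1.0), (3.0, 6.0)]
    second_model = second_network_element(request, local)
    print(second_model)
```

In this sketch the second model keeps the architecture of the first model; the text above also allows the two architectures to differ.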
The foregoing data processing may be data cleaning processing; and may be used to delete data that has a security risk from the first dataset, or may be used to remove invalid data, incomplete data, duplicate data, or the like from the first dataset, or may be used to supplement missing data or the like in the first dataset. In this way, reliability of the dataset used for model training can be ensured. In addition, a type or specific content of the foregoing data processing is not specifically limited in this embodiment of this application.

The "data reliability requirement" may be that the first dataset includes less duplicate data, or the first dataset includes less incomplete data, or the first dataset includes less dangerous data or the like.

In addition, the first indication information may also indicate the second network element to train the first model by using the first dataset obtained after the data processing. In this way, the second network element can perform data processing on the first dataset based on the first indication information, to obtain the first dataset obtained after the data processing, namely, the second dataset.

Specifically, the foregoing data processing is performed on the first dataset, so that reliability of a dataset used to train a machine learning model can be effectively ensured, the machine learning model is train