CN-116542319-B - Adaptive federated learning method and system based on digital twin in an edge computing environment
Abstract
The invention belongs to the technical field of edge computing and federated learning, and particularly discloses a digital-twin-based adaptive federated learning method and system for an edge computing environment. The method comprises: obtaining the state information of each device at the current moment and the model parameters currently produced by the device's local training; aggregating the model parameters uploaded by the subset of industrial Internet of Things devices selected at the previous moment to obtain the global model parameters; using a trained deep reinforcement learning agent model to optimize and configure each device for the next moment according to the current state information of all devices; and, according to the configuration result of the bandwidth ratio h_n, selecting the devices that will participate in global model parameter aggregation at the next moment. The method addresses the Non-IID data and resource allocation problems of federated learning under digital twins, supports real-time online optimization of the system, and improves the robustness of the system under unfavorable channel conditions.
Inventors
- Guo Songtao
- Qiao Dewen
- Liu Guiyan
- Jiao Xianlong
- Chen Chao
- Liu Kai
Assignees
- Chongqing University (重庆大学)
Dates
- Publication Date: 20260508
- Application Date: 20230425
Claims (6)
- 1. An adaptive federated learning method based on digital twins in an edge computing environment, comprising: acquiring, from each industrial Internet of Things (IIoT) device or its digital twin, the state information of the device at the current moment and the model parameters currently obtained by the device's local training, wherein the state information includes the device's actual CPU frequency γ_n, transmission power p_n, and transmission rate v_n between the device and the base station; aggregating the model parameters uploaded by the subset of IIoT devices selected at the previous moment to obtain the global model parameters x_t; optimizing and configuring, by a trained deep reinforcement learning agent model and according to the current state information of all devices, the CPU frequency γ_n, transmission power p_n, and bandwidth ratio h_n of each device at the next moment, wherein the digital twin of each device is used to carry out online training of the deep reinforcement learning agent model, and the agent model is constructed and trained with the deep deterministic policy gradient (DDPG) method; the optimization objective of the deep reinforcement learning agent model is to minimize, under the corresponding constraints, both the global loss function of model training and the total resource consumption; the objective function P1 of the deep reinforcement learning agent model and its constraints are expressed as: P1: min F(x) = Σ_{n=1}^{N} (|D_n|/|D|) F_n(x_n), s.t. Σ_t C_z(t) ≤ R_z, z = 1, …, Z; wherein F_n(x_n) = (1/|D_n|) Σ_{ξ_i ∈ D_n} f(x_n; ξ_i), N is the number of IIoT devices, |D_n| is the size of D_n, D_n is the local data set of the n-th IIoT device, f(x_n; ξ_i) represents the loss function of sample ξ_i on the n-th IIoT device, x_n represents the local model parameters on the n-th IIoT device, and R_z is a given resource budget; Z = 2 represents that the given resources over the t periods are the total energy consumption and the total time consumption, i.e., both time and energy consumption are considered, as follows: wherein, for IIoT device n, c_n represents the number of CPU cycles required to process one data sample, which is determined by the CPU performance of device n; the computation energy consumption of device n in one iteration is E_n^cmp = κ c_n |D_n| γ_n², wherein κ is the effective energy consumption coefficient of device n's CPU; the computation time of device n is T_n^cmp = c_n |D_n| / γ_n; the transmission energy consumption of IoT device n in each global aggregation is E_n^com = p_n T_n^com, and the uplink transmission time is T_n^com = σ / v_n, wherein σ denotes the size of the uploaded model parameters; at the same time, according to the bandwidth ratio h_n, the devices for global model parameter aggregation at the next moment are selected, the selection condition being that, in the configuration result at the current moment, the bandwidth ratio allocated to the device satisfies h_n ≥ h_min, wherein h_min represents the minimum allocated-bandwidth threshold; the above process is repeated, performing resource allocation and model parameter aggregation multiple times, until the condition for ending federated learning is reached.
- 2. The method of claim 1, wherein the deep reinforcement learning agent model comprises an actor network μ(s|θ^μ) and a critic network Q(s, a|θ^Q), whose parameters are θ^μ and θ^Q, respectively.
- 3. The method of claim 2, wherein, in the training of the deep reinforcement learning agent model, the reward function used for actor network parameter updates combines the model accuracy and the total resource consumption ratio, wherein ς is a constant, acc represents the accuracy of the model, and φ represents the total resource consumption ratio.
- 4. The method of claim 2, wherein the parameters of the critic network are updated using a gradient descent method.
- 5. The method of claim 4, wherein the minimization problem of the gradient function for updating the critic network parameters is expressed as: L(θ^Q) = (1/S) Σ_{j=1}^{S} (y_j − Q(s_j, a_j | θ^Q))²; wherein S represents the number of random samples in the deep deterministic policy gradient method, B is the sampled set of samples, (s_j, a_j, r_j) represent the state, action, and reward of the actor network in the j-th random sample of the set B, y_j = r_j + γ_d Q′(s_{j+1}, μ′(s_{j+1}|θ^{μ′}) | θ^{Q′}) is the target value, and γ_d represents the reward discount factor.
- 6. A digital-twin-based adaptive federated learning system applied in an edge computing environment, comprising industrial Internet of Things devices and a server, characterized in that the digital twin of each industrial Internet of Things device and a trained deep reinforcement learning agent model are deployed in the server; the system performs adaptive federated learning between the industrial Internet of Things devices and the server according to the method of any one of claims 1-5.
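The per-device energy and time accounting used in claim 1 can be sketched as follows. This is a minimal illustration assuming the standard cost model E^cmp = κ·c_n·|D_n|·γ_n², T^cmp = c_n·|D_n|/γ_n, T^com = σ/v_n, E^com = p_n·T^com; the function name and the default values of κ and the model size σ are illustrative choices, not from the patent.

```python
# Sketch of the per-round resource model sketched in claim 1.
# Assumed (standard) forms, not the patent's exact implementation:
#   E_cmp = kappa * c_n * |D_n| * gamma_n^2    (computation energy)
#   T_cmp = c_n * |D_n| / gamma_n              (computation time)
#   T_com = model_size / v_n                   (uplink transmission time)
#   E_com = p_n * T_com                        (transmission energy)
# Default kappa and model_size are illustrative placeholders.

def round_consumption(gamma_n, p_n, v_n, c_n, d_n,
                      kappa=1e-28, model_size=1e6):
    """Energy (J) and time (s) one IIoT device spends in one global round.

    gamma_n: CPU frequency (cycles/s); p_n: transmit power (W);
    v_n: uplink rate (bit/s); c_n: CPU cycles per sample; d_n: |D_n|.
    """
    e_cmp = kappa * c_n * d_n * gamma_n ** 2  # computation energy
    t_cmp = c_n * d_n / gamma_n               # computation time
    t_com = model_size / v_n                  # uplink transmission time
    e_com = p_n * t_com                       # transmission energy
    return e_cmp + e_com, t_cmp + t_com
```

Under this model, raising the CPU frequency γ_n trades energy for time: computation energy grows quadratically while computation time shrinks, which is exactly the trade-off the agent's configuration must balance.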
Description
Self-adaptive federated learning method and system based on digital twin in edge computing environment

Technical Field

The invention belongs to the technical field of edge computing and federated learning, and particularly discloses a digital-twin-based adaptive federated learning method and system for an edge computing environment.

Background

Since conventional cloud computing cannot meet the strict delay requirements of the industrial Internet of Things (IIoT), edge computing (EC) is a promising technology: conventional cloud services can be extended to an edge network closer to the terminal devices, making them suitable for network services with low delay requirements. Meanwhile, in machine learning (ML) based IIoT, the implementation of edge intelligence services relies on real-time state processing and monitoring of large-scale devices. However, because communication delay in IIoT is random and the number of operating devices grows sharply, it is difficult for the edge server to perform online optimization by parsing the running environment, such as the channel state information, of the Internet of Things devices. Digital twin (DT) is an emerging technology that can provide a bridge between real-time physical states and virtual space for IIoT. In general, since the server has sufficient resources, the continuously changing digital objects of the industrial Internet of Things devices can be maintained in real time. Specifically, a digital object is created in virtual space through software definition and sensor awareness; it is a timely digital representation of the state, characteristics, and evolution of a physical entity. Because DT has good state-sensing and real-time analysis capabilities, control decision efficiency is greatly improved. Meanwhile, DT is a data-driven method, relying on massive data analysis across distributed Internet of Things devices.
However, for reasons of business competition and privacy protection, manufacturers are reluctant to exchange the private data of their respective Internet of Things devices in a virtual space. Thus, the presence of data "islands" poses a challenge to building digital objects of a physical system with DT. Federated learning (FL), as a new ML technique, achieves a new application paradigm of "data available but invisible" and "data not moving but model moving" by exchanging model parameters without uploading the data to a central server. In the DT Internet of Things system, federated learning makes flexible decisions according to the variable state information of the industrial Internet of Things system to construct an intelligent model. Thus, introducing FL into DT-based IIoT can not only improve control efficiency, but also improve manufacturers' willingness to participate in global model training, resulting in more accurate digital objects in the DT. However, in an EC environment, non-independent and identically distributed (Non-IID) data across devices and limited edge resources make it very difficult to maintain virtual objects in digital space through FL-integrated DT technology. Meanwhile, in the edge environment, DTs can interact to form a DT edge network (DTEN); the DTEN and the IIoT devices operate in real time, and their feedback information is consistent. Thus, dynamic optimization of physical entities may be achieved by optimizing the DTs in the DTEN.

Disclosure of Invention

In order to solve the above technical problems, the invention provides an adaptive federated learning method based on digital twins in an edge computing environment.
The method comprises the following steps: acquiring, from each industrial Internet of Things device or its digital twin, the state information of the device at the current moment and the model parameters currently obtained by the device's local training, wherein the state information comprises the device's actual CPU frequency γ_n, the transmission power p_n, and the transmission rate v_n between the device and the base station; aggregating the model parameters uploaded by the subset of industrial Internet of Things devices selected at the previous moment to obtain the global model parameters x_t; optimizing and configuring, by a trained deep reinforcement learning (DRL) agent model and according to the current state information of all devices, the CPU frequency γ_n, the transmission power p_n, and the bandwidth ratio h_n of each device at the next moment; the optimization objective of the deep reinforcement learning agent model is to minimize, under the corresponding constraints, both the global loss function of model training and the total resource consumption; meanwhile, according to the configuration result of the bandwidth ratio h_n, the devices for global model parameter aggregation at the next moment are selected, the selection condition being that the bandwidth ratio
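The round structure described above — bandwidth-threshold device selection followed by aggregation of the selected devices' local parameters — can be sketched as below. This is a hypothetical illustration, not the patent's implementation: it assumes data-size-weighted averaging (FedAvg-style) for the aggregation step, and all function names are our own.

```python
# Illustrative sketch of one adaptive federated-learning round:
# the agent's configured bandwidth ratios h_n select which devices
# aggregate, and their local models are combined by |D_n|-weighted
# averaging (an assumed, FedAvg-style aggregation rule).

def select_devices(h, h_min):
    """Indices of devices whose allocated bandwidth ratio meets the threshold."""
    return [n for n, h_n in enumerate(h) if h_n >= h_min]

def fedavg(local_params, data_sizes, chosen):
    """Data-size-weighted average of the chosen devices' model parameters.

    local_params: list of parameter vectors (one per device);
    data_sizes: |D_n| for each device; chosen: selected device indices.
    """
    total = sum(data_sizes[n] for n in chosen)
    dim = len(local_params[0])
    return [sum(data_sizes[n] * local_params[n][i] for n in chosen) / total
            for i in range(dim)]
```

For example, with bandwidth ratios [0.4, 0.05, 0.3, 0.25] and a threshold h_min = 0.1, device 1 is excluded from aggregation and the global model x_t is computed from the remaining devices' parameters only.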