CN-122021973-A - Method for U-shaped segmentation collaborative training architecture of input and output ends
Abstract
A method for a U-shaped segmentation collaborative training architecture of the input and output ends belongs to the field of network models and comprises the following steps: 1) off-line preprocessing before model deployment, realizing integration and dynamic segmentation under ONNX, to generate a sub-model library for dynamic scheduling; 2) multidimensional real-time acquisition and monitoring of the end device's performance data; 3) submitting data at the end side (training samples or inference inputs), starting the model execution engine, and triggering a decision callback when the execution flow reaches a preset potential segmentation point; 4) sub-model loading, trusted authentication, and collaborative computation, in which the system enters a safe and efficient collaborative computation stage according to the dynamic decision result while a full-link, multi-layer privacy protection system is enabled; and 5) establishing a closed-loop feedback mechanism to continuously optimize the accuracy of the dynamic segmentation decision. The invention achieves an optimal balance among privacy, efficiency, and model utility.
Inventors
- ZHANG YAN
- CHEN SIYUAN
- LU CHENGGANG
Assignees
- Zhejiang University of Technology (浙江工业大学)
Dates
- Publication Date: 2026-05-12
- Application Date: 2026-04-14
Claims (9)
- 1. A method for a U-shaped segmentation collaborative training architecture of the input and output ends, characterized by comprising the following steps: Step 1, model preprocessing and pre-deployment preparation: to realize integration and dynamic segmentation under ONNX, off-line preprocessing is performed before model deployment to generate a sub-model library for dynamic scheduling. Step 2, multidimensional real-time acquisition and monitoring of end-device performance data: a performance monitoring agent acquires multiple hardware resource state data on the end device in real time at a preset frequency. Step 3, dynamic segmentation decision based on device type and real-time load: a user submits data at the end side (training samples or inference inputs), the model execution engine is started, and a decision callback is triggered when the execution flow reaches a preset potential segmentation point. Step 4, sub-model loading, trusted authentication, and collaborative computation: according to the dynamic decision result, the system enters a safe and efficient collaborative computation stage, and a full-link, multi-layer privacy protection system is simultaneously enabled to ensure the safety of the whole process of data, computation, and training. Step 5, closed-loop feedback, strategy optimization, and system self-adaptation: a closed-loop feedback mechanism is established to continuously optimize the accuracy of the dynamic segmentation decision.
- 2. The method of the U-shaped segmentation collaborative training architecture of the input and output ends according to claim 1, wherein step 1 is completed in a development environment or on a cloud server without occupying end-side resources, and comprises: Step 1.1, model analysis and marking: the original model to be deployed is loaded via the ONNX Python API, its computation graph is traversed, and a custom marking node is inserted at each predefined potential segmentation point; the node is a special operator without actual computation and is used only to mark a graph segmentation boundary. Step 1.2, graph segmentation and sub-model generation: based on the marking nodes, the original model is cut into two independent sub-models using the ONNX serialization and deserialization functions, where the end-side sub-model comprises all operators from the input layer to the current marking node, and the cloud-side sub-model comprises all operators from the current marking node to the output layer. Step 1.3, creating a metadata mapping table: a segmentation-point-ID-to-sub-model mapping table is created that maps each segmentation point ID to the corresponding end-side and cloud-side sub-model files, and the table is pre-deployed to the end-side dynamic scheduler so that it can rapidly load the corresponding sub-models according to the decision result.
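The marking, splitting, and mapping-table steps of claim 2 can be sketched as follows. This is a minimal, library-free illustration that models the computation graph as an ordered operator list; all operator names are hypothetical, and a real implementation would instead traverse an ONNX `GraphProto` via the ONNX Python API.

```python
# Minimal sketch of Steps 1.2-1.3: cut an operator sequence at marker
# nodes and build the segmentation-point-ID -> sub-model mapping table.
# The graph is modeled as an ordered operator list; a real deployment
# would operate on an onnx.GraphProto. All names are illustrative.

MARKER = "SPLIT_MARK"  # custom no-op marking operator from Step 1.1

def split_at_markers(ops):
    """Return {split_point_id: (end_side_ops, cloud_side_ops)}."""
    table = {}
    split_id = 0
    for i, op in enumerate(ops):
        if op == MARKER:
            # End-side sub-model: input layer up to this marker.
            end_side = [o for o in ops[:i] if o != MARKER]
            # Cloud-side sub-model: this marker up to the output layer.
            cloud_side = [o for o in ops[i + 1:] if o != MARKER]
            table[split_id] = (end_side, cloud_side)
            split_id += 1
    return table

ops = ["Conv1", "Relu1", MARKER, "Conv2", "Relu2", MARKER, "FC", "Softmax"]
submodel_library = split_at_markers(ops)
```

The end-side dynamic scheduler would then resolve a decision result such as split point 0 to `submodel_library[0]` and load the two sub-model files it names.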
- 3. The method according to claim 1 or 2, wherein in step 2 the hardware resource state data comprises computing resources, storage resources, power consumption and thermal state, and network state: the computing resources include CPU core utilization, GPU utilization, and AI accelerator occupancy; the storage resources include memory occupancy, available memory, and virtual memory usage; the power consumption and thermal state include the device's current remaining battery, instantaneous power consumption, battery temperature, and CPU/GPU core temperature; the network state includes the currently available network bandwidth, round-trip delay, and connection stability; and the raw data forms the input vector of the performance evaluation.
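The packing of raw readings into the evaluation input vector named in claim 3 can be sketched as below. The field names, normalization caps, and field order are illustrative assumptions; on a real device the readings would come from OS APIs (e.g. `psutil` or sysfs) sampled at the monitoring agent's preset frequency.

```python
# Sketch of Step 2: pack raw hardware readings into the fixed-order,
# normalized input vector fed to the performance evaluator.
# Field names and normalization caps are illustrative assumptions.

FIELDS = [
    ("cpu_util", 100.0),    # CPU core utilization, %
    ("gpu_util", 100.0),    # GPU utilization, %
    ("mem_used", 100.0),    # memory occupancy, %
    ("battery", 100.0),     # remaining battery, %
    ("cpu_temp", 105.0),    # CPU temperature, deg C (assumed max)
    ("bandwidth", 1000.0),  # available bandwidth, Mbit/s (assumed cap)
    ("rtt_ms", 500.0),      # round-trip delay, ms (assumed cap)
]

def to_input_vector(sample: dict) -> list:
    """Clamp each reading to [0, 1] in a fixed field order."""
    return [min(max(sample.get(name, 0.0) / scale, 0.0), 1.0)
            for name, scale in FIELDS]

sample = {"cpu_util": 55.0, "gpu_util": 20.0, "mem_used": 70.0,
          "battery": 80.0, "cpu_temp": 52.5, "bandwidth": 100.0,
          "rtt_ms": 40.0}
vec = to_input_vector(sample)
```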
- 4. The method of the input/output-end U-shaped segmentation collaborative training architecture according to claim 1 or 2, wherein the process of step 3 is as follows: Step 3.1, real-time evaluation: the dynamic segmentation decision device receives the real-time performance data acquired in step 2, computes according to a preset evaluation model (a basic weighting model, a priority-aware model, or an intelligent prediction model), and outputs a quantized composite load score S. Step 3.2, strategy mapping: the decision device queries a predefined two-dimensional segmentation strategy mapping table by combining the pre-registered device type identifier D_type and the computed composite load score S. Step 3.3, instruction generation: the mapping table outputs a model segmentation point L_split; the instruction defines where in the task the end-side sub-model is to be cut off and where the cloud-side sub-model is to start, dynamically forming the three-section U-shaped structure of the collaborative computation.
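Steps 3.1 through 3.3 can be sketched as a basic weighting model plus a two-dimensional table lookup. The weights, load bands, and table entries below are illustrative assumptions, not values from the patent; the claim's priority-aware and intelligent prediction models would replace `load_score`.

```python
# Sketch of Steps 3.1-3.3: a basic weighted evaluation model producing
# the composite load score S, and a (D_type, load band) -> L_split
# lookup. Weights, bands, and table entries are illustrative.

WEIGHTS = {"cpu": 0.4, "mem": 0.3, "net": 0.3}  # assumed basic weighting

def load_score(cpu: float, mem: float, net_congestion: float) -> float:
    """Composite load score S in [0, 1]; higher means more loaded."""
    return (WEIGHTS["cpu"] * cpu + WEIGHTS["mem"] * mem
            + WEIGHTS["net"] * net_congestion)

# Two-dimensional segmentation strategy mapping table.
# A larger L_split keeps more layers on the end side.
STRATEGY = {
    ("phone", "low"): 8, ("phone", "mid"): 4, ("phone", "high"): 2,
    ("sensor", "low"): 3, ("sensor", "mid"): 2, ("sensor", "high"): 1,
}

def decide_split(d_type: str, s: float) -> int:
    """Map (D_type, S) to a segmentation point L_split."""
    band = "low" if s < 0.3 else ("mid" if s < 0.7 else "high")
    return STRATEGY[(d_type, band)]
```

A lightly loaded phone would thus keep more layers locally than a heavily loaded sensor, which offloads almost everything past the input layer.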
- 5. The method of the input/output-end U-shaped segmentation collaborative training architecture according to claim 1 or 2, wherein the process of step 4 is as follows: Step 4.1, sub-model loading: the end-side dynamic scheduler rapidly pulls the corresponding end-side and cloud-side sub-models from a local cache or a server according to the segmentation point instruction and completes loading and initialization. Step 4.2, secure channel establishment and two-way verification: a secure communication link is established between the end side and the cloud service or a Trusted Execution Environment (TEE) instance; for each key data transmission, the sender computes a hash value of the data, and the receiver recomputes the hash and compares it with the received value to verify integrity. Step 4.3, full-link multi-layer privacy protection. Step 4.4, forward computation. Step 4.5, back propagation.
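The hash comparison of Step 4.2 can be sketched as below; channel setup and TEE attestation are out of scope here, and SHA-256 is an assumed choice of digest (the claim does not name one).

```python
# Sketch of Step 4.2's integrity check: the sender computes a digest of
# the key data and transmits (data, digest); the receiver recomputes the
# digest and compares before use. SHA-256 is an assumed digest choice.
import hashlib
import hmac

def digest(payload: bytes) -> bytes:
    return hashlib.sha256(payload).digest()

def verify(payload: bytes, received_digest: bytes) -> bool:
    # Constant-time comparison to avoid timing side channels.
    return hmac.compare_digest(digest(payload), received_digest)

msg = b"intermediate activation batch"
tag = digest(msg)
```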
- 6. The method of the input/output-end U-shaped segmentation collaborative training architecture according to claim 5, characterized in that the process of step 4.3 is as follows: a. The first layer, data stealth and obfuscation: the end side homomorphically encrypts the intermediate activation value or adds differential privacy noise to prevent the original information from being revealed. b. The second layer, computation integrity verification: a remote attestation mechanism based on the trusted execution environment ensures that the computation executed by the cloud is not tampered with. c. The third layer, privacy reinforcement during training: gradients are compressed, sparsified, or encrypted during back propagation to prevent gradient leakage. d. The fourth layer, data integrity and freshness verification: replay attacks and man-in-the-middle attacks are prevented through a hash chain and a timestamp mechanism.
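The fourth protection layer (hash chain plus timestamp) can be sketched as below. The freshness window, field layout, and chaining rule are illustrative assumptions; the other three layers are sketched alongside the steps that use them.

```python
# Sketch of claim 6d: a hash chain plus timestamp check that rejects
# replayed or stale messages. Each message's hash covers the previous
# hash, so reordering or replaying breaks the chain. The 30-second
# freshness window and field layout are illustrative assumptions.
import hashlib
import time

MAX_AGE_S = 30.0  # assumed freshness window

def chain_next(prev_hash: bytes, payload: bytes, ts: float) -> bytes:
    return hashlib.sha256(prev_hash + payload + str(ts).encode()).digest()

def accept(prev_hash, payload, ts, claimed_hash, now=None) -> bool:
    now = time.time() if now is None else now
    fresh = 0.0 <= now - ts <= MAX_AGE_S          # reject stale timestamps
    valid = claimed_hash == chain_next(prev_hash, payload, ts)
    return fresh and valid
```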
- 7. The method of the input/output-end U-shaped segmentation collaborative training architecture according to claim 5, characterized in that the process of step 4.4 is as follows: a. End-side computation: the loaded end-side sub-model computes on the input data and generates an intermediate activation value. b. Privacy processing: the intermediate activation value is preprocessed according to the security policy, where the preprocessing comprises (1) adding differential privacy noise, i.e., injecting calibrated Gaussian or Laplacian noise to provide a provable privacy budget, or (2) homomorphic encryption, i.e., encrypting with the cloud's public key to generate ciphertext. c. Cloud computation: the processed data is sent to the cloud, where the loaded cloud-side sub-model computes on the received data and generates the final output or the intermediate result required for subsequent gradient computation. d. Result return: the cloud returns the computation result to the end side, which decrypts it or uses it directly.
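The Gaussian variant of the privacy processing in claim 7b can be sketched as below. The sigma calibration uses the standard analytic form of the Gaussian mechanism, sigma = sensitivity * sqrt(2 ln(1.25/delta)) / epsilon; the clipping norm and the (epsilon, delta) values are illustrative assumptions, and the homomorphic encryption branch is not shown.

```python
# Sketch of claim 7b(1): clip the intermediate activation to bound its
# L2 sensitivity, then add Gaussian noise calibrated to (epsilon, delta).
# Clip norm and budget values are illustrative assumptions.
import math
import random

def gaussian_sigma(sensitivity: float, epsilon: float, delta: float) -> float:
    """Analytic Gaussian-mechanism noise scale."""
    return sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon

def privatize(activation, clip_norm=1.0, epsilon=2.0, delta=1e-5, rng=None):
    rng = rng or random.Random(0)
    # Clip the activation vector so its L2 norm is at most clip_norm...
    norm = math.sqrt(sum(a * a for a in activation)) or 1.0
    scale = min(1.0, clip_norm / norm)
    clipped = [a * scale for a in activation]
    # ...then add noise calibrated to the privacy budget.
    sigma = gaussian_sigma(clip_norm, epsilon, delta)
    return [a + rng.gauss(0.0, sigma) for a in clipped]

noisy = privatize([3.0, 4.0])  # norm 5 -> clipped to norm 1, then noised
```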
- 8. The method of the input/output-end U-shaped segmentation collaborative training architecture according to claim 5, characterized in that the process of step 4.5 is as follows: a. End-side loss computation and gradient initiation: the label data always remains on the end side; the end side computes the loss and starts back propagation up to the segmentation point. b. Cloud gradient computation and privacy reinforcement: the cloud computes the gradient of the cloud-side part and applies privacy reinforcement techniques such as gradient clipping, noise addition, or Top-k sparsification. c. Secure gradient aggregation and return: within the trusted execution environment (TEE), gradients from multiple devices are securely aggregated, and the reinforced gradients are securely returned to the end side. d. Parameter update: the end side integrates the local gradient and the cloud-returned gradient, updates the end-side model parameters, and sends the cloud-side model's update information to the cloud.
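The clipping and Top-k sparsification named in claim 8b can be sketched as below; the thresholds are illustrative, and the noise addition and TEE aggregation of claims 8b-8c are not shown.

```python
# Sketch of claim 8b: L2-clip the gradient, then keep only the k
# largest-magnitude entries (Top-k sparsification). max_norm and k
# are illustrative choices.
import math

def clip_gradient(grad, max_norm=1.0):
    """Scale grad so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm <= max_norm or norm == 0.0:
        return list(grad)
    return [g * (max_norm / norm) for g in grad]

def top_k_sparsify(grad, k):
    """Zero all but the k largest-magnitude gradient entries."""
    keep = set(sorted(range(len(grad)),
                      key=lambda i: abs(grad[i]), reverse=True)[:k])
    return [g if i in keep else 0.0 for i, g in enumerate(grad)]

g = clip_gradient([3.0, -4.0, 0.5, 0.1], max_norm=1.0)
sparse = top_k_sparsify(g, k=2)
```

Sparsification also serves the communication-efficiency goal discussed in the description, since only the surviving entries (and their indices) need to cross the network.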
- 9. The method of the input/output-end U-shaped segmentation collaborative training architecture according to claim 1 or 2, wherein the process of step 5 is as follows: Step 5.1, after the collaborative task is completed, the system collects key performance indicators, including the time consumed by end-side local computation, data encryption/decryption, network transmission, and cloud computation, the total task completion time, and the change in device energy consumption. Step 5.2, feedback analysis: the dynamic segmentation decision device receives the feedback data and compares it against the load state predicted at decision time. Step 5.3, strategy tuning: using the analysis result, the entries of the segmentation strategy mapping table are adjusted automatically or manually, or an administrator optimizes the parameters of the performance evaluation model, so that the system better adapts to device performance changes and workload patterns during long-term operation.
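One possible shape of the closed loop in claim 9 is sketched below: compare the load predicted at decision time against the measured outcome and nudge the corresponding strategy table entry. The one-step adjustment rule and the error tolerance are illustrative assumptions, not part of the claim.

```python
# Sketch of Steps 5.1-5.3: compare predicted vs. measured load and
# shift one (D_type, band) -> L_split strategy table entry.
# The adjustment rule and tolerance are illustrative assumptions.

TOLERANCE = 0.15  # assumed acceptable prediction error

def tune_entry(table, key, predicted_load, measured_load, max_split):
    """Adjust one strategy entry by one step based on feedback."""
    err = measured_load - predicted_load
    if err > TOLERANCE:
        # Device was busier than predicted: offload more to the cloud.
        table[key] = max(1, table[key] - 1)
    elif err < -TOLERANCE:
        # Device was idler than predicted: keep more layers locally.
        table[key] = min(max_split, table[key] + 1)
    return table[key]

strategy = {("phone", "mid"): 4}
tune_entry(strategy, ("phone", "mid"), 0.5, 0.8, max_split=8)
```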
Description
Method for U-shaped segmentation collaborative training architecture of input and output ends Technical Field The invention belongs to the field of network models and relates to a method for an input/output-end U-shaped segmentation collaborative training architecture. Background With the deep penetration of artificial intelligence technology into the Internet of Things, the mobile Internet, the industrial Internet, and related fields, deep learning models have become the core engine driving industrial intelligence. However, the exponential growth of model scale (e.g., large language models, multimodal models) places unprecedented demands on computing, storage, and communication resources. In this context, the traditional centralized cloud computing mode faces serious challenges in data privacy and compliance: sensitive information such as medical images, financial transactions, and personal voice and behavior data is uploaded directly to the cloud, creating huge privacy leakage risks and potentially violating increasingly strict data protection regulations. If, instead, the model is deployed entirely on the terminal, the computing, storage, and power resources of the end device (such as a mobile phone, a sensor, or an unmanned aerial vehicle) are too limited to independently bear the complete training task of a large-scale deep learning model. To address these challenges, a variety of distributed machine learning paradigms have been proposed in academia and industry, but they all suffer from inherent drawbacks and cannot achieve an ideal balance among privacy, efficiency, computational cost, and model performance. Classical solutions, such as centralized cloud computing, upload all raw data to a cloud data center for unified model training.
As mentioned above, this approach presents serious privacy exposure and security risks. In federated learning, the end device trains a model locally on its own data and uploads only model updates (such as gradients) to the cloud for aggregation into a global model. Its defects are threefold: a heavy computing and storage burden, since the terminal must be able to run the complete model, which demands substantial resources and is unsuitable for complex models or resource-constrained terminals; a communication efficiency problem, since the volume of model parameter updates, and hence the communication cost, remains large; and an information privacy problem, since the shared gradient information can still leak the original data through inversion attacks or membership inference attacks. In traditional split learning, a deep learning model is cut at a certain segmentation point: the terminal is responsible for computing the first several layers and sends the intermediate activation value to the cloud, which finishes computing the remaining layers, including the final loss calculation. This method faces output and label exposure problems: because the loss calculation is completed in the cloud, the model output and the label data for supervised learning usually need to be uploaded to the cloud, constituting a new privacy leakage point, and the label data itself is often highly sensitive information. The method is in practice an incomplete end-side closed loop: it cannot realize the privacy protection standard of keeping both input and output on the end side, and cannot meet scenarios requiring confidentiality of the output result.
U-shaped split learning is an evolution of traditional split learning; its core idea is to keep the input layer and the output layer of the model on the end device and place only the middle layers of the model in the cloud for computation. This architecture theoretically makes a complete privacy closed loop possible, but existing research and practice is still at an early stage and faces a series of engineering and security challenges. First, there is a communication efficiency bottleneck: the intermediate activation values and segmentation-point gradients transmitted bidirectionally between the terminal and the cloud still have huge data dimensions, and they become the main bottleneck restricting training speed and increasing communication cost; efficient compression, quantization, and error-compensation algorithms are needed to greatly reduce the communication load while ensuring model convergence. Second, privacy protection needs reinforcement: although the raw data and labels do not leave the terminal, the intermediate activation value can still imply the original input information, with the risk of being recovered by a r