CN-122019103-A - Digital drainage twin large model end cloud collaborative deployment method
Abstract
The invention discloses an end-cloud collaborative deployment method for a large digital drainage twin model, characterized by comprising the steps of: obtaining an original model and preprocessing it; performing lightweight processing on the preprocessed original model; deploying the model through a layered deployment architecture; scheduling computing power according to task; and deploying an end-cloud collaboration mechanism and testing. Through lightweight model preprocessing, layered deployment architecture construction, heterogeneous computing power scheduling, and end-cloud collaboration mechanism design, efficient end-cloud collaborative operation of the large drainage twin model under the HarmonyOS 6 (HongMeng 6) system is achieved.
Inventors
- WANG HAO
- ZHANG DEQUAN
Assignees
- 上海中井汉鼎数字技术有限公司
Dates
- Publication Date: 2026-05-12
- Application Date: 2026-02-09
Claims (10)
- 1. An end-cloud collaborative deployment method for a large digital drainage twin model, characterized by comprising the following steps: acquiring an original model and preprocessing it; performing lightweight processing on the preprocessed original model to obtain a lightweight model; deploying the lightweight model through a layered deployment architecture; scheduling computing power according to task; and deploying an end-cloud collaboration mechanism and testing.
- 2. The method of claim 1, wherein obtaining and preprocessing the original model comprises: acquiring an AI twin management-and-control algorithm model and a pipe-network water-level blockage monitoring algorithm model; and analyzing the model structure and parameter distribution and performing pre-training verification with a standard test data set.
- 3. The method of claim 1, wherein the lightweight processing based on the preprocessed original model comprises: counting the weight value range of each layer of the original model, calculating a quantization scaling factor and zero point, converting the floating-point weights into integer weights, and eliminating redundant parameters whose absolute values are smaller than a dynamic threshold; performing knowledge distillation training on the quantized original model; calculating the contribution of each operator, pruning redundant operators whose contribution is below 0.05, and performing 30 rounds of fine-tuning training on the pruned model to obtain a lightweight model; and converting the lightweight model into a preset format (illustrative sketches of the quantization, distillation, and pruning steps follow the claims).
- 4. The method according to claim 3, characterized in that the quantization scaling factor is calculated by the following formula: $S = \frac{\max_w - \min_w}{2^b - 1}$; where $b$ is the quantization bit width, $\max_w$ is the weight maximum, and $\min_w$ is the weight minimum; the factor is used to map the floating-point weights onto the integer interval.
- 5. The method according to claim 3, characterized in that the quantization zero point is calculated by the following formula: $Z = \mathrm{round}\!\left(-\frac{\min_w}{S}\right)$; where $\mathrm{round}(\cdot)$ is a rounding function and $Z$ is the zero-point offset of the quantized integers, used to reduce quantization mapping error.
- 6. The method according to claim 3, wherein the floating-point weights are converted into integer weights by the following formula: $w_q = \mathrm{round}\!\left(\frac{w}{S}\right) + Z$; where $w$ is the original 32-bit floating-point weight and $w_q$ is the quantized $b$-bit integer weight, realizing weight data compression.
- 7. The method according to claim 3, wherein knowledge distillation is performed by the following formula, with the original model as the teacher model and the quantized model as the student model: $L = \alpha\,\mathrm{KL}(P_T, P_S) + \beta\,\mathrm{CE}(y, P_S)$; where $\alpha = 0.7$, $\beta = 0.3$, $\mathrm{KL}(P_T, P_S)$ is the KL divergence between the probability distributions output by the teacher and student models, and $\mathrm{CE}(y, P_S)$ is the cross-entropy loss between the student model's output and the true labels, used to preserve model precision after distillation.
- 8. The method according to claim 3, wherein the operator contribution is calculated by the following formula: $C_i = \frac{\partial \mathrm{Acc}}{\partial O_i} \times \frac{\mathrm{Comp}_{O_i}}{\mathrm{Comp}_{\mathrm{total}}}$; where $\partial \mathrm{Acc}/\partial O_i$ is the partial derivative of model accuracy with respect to the $i$-th operator, $\mathrm{Comp}_{O_i}$ is the computation cost of the $i$-th operator, and $\mathrm{Comp}_{\mathrm{total}}$ is the total computation cost of the model; the contribution is used to screen out redundant operators.
- 9. The method of claim 1, wherein deploying the model through the layered deployment architecture comprises: flashing the lightweight model onto the embedded device, configuring a data acquisition interface and a local inference engine, and setting an alarm trigger threshold; deploying a mixed-precision model on the edge computing nodes, configuring district-area data fusion rules and twin simulation parameters, and building a local cache database; and deploying the full-precision large model on the cloud server cluster, building a model training platform and a global policy decision module, and configuring an end-cloud data synchronization interface and model update scheduling rules (see the configuration sketch after the claims).
- 10. The method according to claim 1, wherein, in scheduling computing power according to task, the task allocation rules include: matrix operation tasks are allocated to the NPU and the NPU hardware acceleration interface is invoked; control logic tasks are allocated to the CPU; and attention-mechanism tasks are allocated to the GPU (see the dispatch sketch after the claims).
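A minimal sketch of the asymmetric quantization described in claims 4 to 6, assuming per-tensor quantization to unsigned b-bit integers; the NumPy helper names are illustrative, not from the patent:

```python
import numpy as np

def quantize_weights(w: np.ndarray, bits: int = 8):
    """Claims 4-6: compute scale S and zero point Z, then w_q = round(w/S) + Z."""
    max_w, min_w = float(w.max()), float(w.min())
    # Claim 4: the scaling factor maps the float range onto 2^b - 1 integer steps.
    S = (max_w - min_w) / (2 ** bits - 1)
    # Claim 5: the zero point offsets the grid so min_w maps to integer 0.
    Z = int(round(-min_w / S))
    # Claim 6: convert 32-bit float weights to b-bit integer weights.
    w_q = np.clip(np.round(w / S) + Z, 0, 2 ** bits - 1).astype(np.int32)
    return w_q, S, Z

def dequantize(w_q: np.ndarray, S: float, Z: int) -> np.ndarray:
    """Inverse mapping used at inference time: w ~ S * (w_q - Z)."""
    return S * (w_q.astype(np.float32) - Z)

# Example: quantize one layer's weights to 8-bit integers.
w = np.random.randn(256, 256).astype(np.float32)
w_q, S, Z = quantize_weights(w)
print("max reconstruction error:", np.abs(dequantize(w_q, S, Z) - w).max())
```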
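A sketch of the distillation loss of claim 7, written with standard PyTorch primitives (`F.kl_div` takes the student's log-probabilities); no distillation temperature is applied, since the claim does not mention one:

```python
import torch
import torch.nn.functional as F

def distillation_loss(teacher_logits, student_logits, labels,
                      alpha: float = 0.7, beta: float = 0.3):
    """Claim 7: L = alpha * KL(P_T, P_S) + beta * CE(y, P_S)."""
    # KL divergence between the teacher's and student's output distributions.
    kl = F.kl_div(F.log_softmax(student_logits, dim=-1),
                  F.softmax(teacher_logits, dim=-1),
                  reduction="batchmean")
    # Cross-entropy of the student's output against the true labels.
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kl + beta * ce

# Example: the quantized model (student) is trained to mimic the
# original model (teacher) while still fitting the ground truth.
teacher_logits = torch.randn(32, 10)
student_logits = torch.randn(32, 10, requires_grad=True)
labels = torch.randint(0, 10, (32,))
loss = distillation_loss(teacher_logits, student_logits, labels)
loss.backward()
```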
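A sketch of the operator screening of claim 8. How the accuracy sensitivities $\partial \mathrm{Acc}/\partial O_i$ are obtained (e.g. by finite differences on a validation set) is an assumption, as is the exact way the gradient term and the compute share combine:

```python
def operator_contributions(acc_grads, comps):
    """Claim 8 (as reconstructed): C_i = (dAcc/dO_i) * (Comp_i / Comp_total).
    acc_grads: per-operator accuracy sensitivities, assumed precomputed;
    comps: per-operator compute cost, e.g. FLOPs."""
    total = sum(comps)
    return [g * c / total for g, c in zip(acc_grads, comps)]

def prune_operators(operators, contributions, threshold: float = 0.05):
    """Claim 3: drop operators whose contribution is below 0.05;
    the pruned model is then fine-tuned for 30 rounds."""
    return [op for op, c in zip(operators, contributions) if c >= threshold]
```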
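One possible configuration view of the terminal-edge-cloud tiers of claim 9; every file name, component name, and threshold value here is hypothetical:

```python
# Hypothetical layout of the three deployment tiers described in claim 9.
DEPLOYMENT_TIERS = {
    "terminal": {                            # embedded device
        "model": "drainage_twin_int8.bin",   # flashed lightweight model
        "components": ["data_acquisition_interface", "local_inference_engine"],
        "alarm_trigger_threshold": 0.8,      # placeholder value
    },
    "edge": {                                # edge computing node
        "model": "drainage_twin_fp16.bin",   # mixed-precision model
        "components": ["district_data_fusion_rules", "twin_simulation_params",
                       "local_cache_db"],
    },
    "cloud": {                               # server cluster
        "model": "drainage_twin_fp32",       # full-precision large model
        "components": ["model_training_platform", "global_policy_decision",
                       "cloud_sync_interface", "model_update_scheduler"],
    },
}
```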
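A sketch of the task-to-accelerator rules of claim 10; the task-type labels and the CPU fallback for unlisted types are assumptions:

```python
from enum import Enum

class Accelerator(Enum):
    CPU = "cpu"
    GPU = "gpu"
    NPU = "npu"

# Claim 10: matrix ops -> NPU (via its hardware acceleration interface),
# control logic -> CPU, attention-mechanism tasks -> GPU.
TASK_RULES = {
    "matrix_op": Accelerator.NPU,
    "control_logic": Accelerator.CPU,
    "attention": Accelerator.GPU,
}

def dispatch(task_type: str) -> Accelerator:
    """Route a task to its accelerator; unknown types fall back to the CPU."""
    return TASK_RULES.get(task_type, Accelerator.CPU)

assert dispatch("matrix_op") is Accelerator.NPU
assert dispatch("attention") is Accelerator.GPU
```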
Description
Digital drainage twin large model end-cloud collaborative deployment method

Technical Field

The invention relates to the technical fields of digital twinning, artificial intelligence model deployment, and end-cloud collaboration, and in particular to an end-cloud collaborative deployment method for a large digital drainage twin model.

Background

In urban drainage twin systems, a cloud-centralized deployment architecture is currently the mainstream implementation in the industry, but it has three core technical defects in practice. First, the system's dependence on the cloud is extremely high: when network transmission is interrupted, edge-side devices lose autonomous decision-making capability, waterlogging early-warning response delay exceeds 500 ms, and warning timeliness and emergency response efficiency are seriously impaired. Second, AI large models for drainage twin scenarios usually have a trillion-parameter scale; deployed directly on an edge terminal, a single device draws more than 20 W and model inference delay exceeds 100 ms, which cannot be adapted to low-power, low-latency scenarios such as underground terminals and gate stations. Third, the computing power scheduling of the existing architecture is relatively simplistic: no targeted allocation or optimization is performed for the heterogeneous CPU, GPU, and NPU computing resources on the edge side, so overall model operating efficiency is low and the hardware's computing potential is hard to exploit.

Although related schemes exist in the prior art, such as drainage unit twin systems and methods based on AI and 3D technology and drainage unit twin management systems, they generally lack a lightweight deployment design adapted to edge-side devices and are therefore difficult to apply effectively on edge terminals. In addition, although the HarmonyOS 6 (HongMeng 6) system provides end-cloud collaboration capabilities, no dedicated layered deployment standard or adaptation scheme for large drainage twin models has yet been formed, so the trade-off among model precision, terminal computing power, and operating power consumption cannot be resolved effectively.

In summary, existing urban drainage twin systems still have many technical defects to be solved in deployment architecture, model adaptation, computing power scheduling, and standardization, which restrict their large-scale application and efficiency gains in edge-side scenarios; a technical scheme that combines autonomous decision-making capability, lightweight deployment, and optimal heterogeneous computing power scheduling is needed.
Disclosure of Invention

In view of the defects of the prior art, the invention provides an end-cloud collaborative deployment method for a large digital drainage twin model, which aims to: construct a terminal-edge-cloud three-layer deployment architecture so that edge nodes can reason autonomously when the network is down, reducing the system's dependence on the cloud; compress the trillion-parameter large model to a scale loadable on edge terminals through lightweight model quantization, achieving end-side inference delay ≤ 50 ms and single-terminal power consumption ≤ 5 W while keeping precision loss ≤ 3%; and optimize the heterogeneous computing power scheduling strategy so that the CPU, GPU, and NPU work cooperatively and model operating efficiency improves.

The invention provides an end-cloud collaborative deployment method for a large digital drainage twin model, comprising the following steps: acquiring an original model and preprocessing it; performing lightweight processing on the preprocessed original model to obtain a lightweight model; deploying the lightweight model through a layered deployment architecture; scheduling computing power according to task; and deploying an end-cloud collaboration mechanism and testing.

In an embodiment of the present invention, obtaining and preprocessing the original model includes: acquiring an AI twin management-and-control algorithm model and a pipe-network water-level blockage monitoring algorithm model; and analyzing the model structure and parameter distribution and performing pre-training verification with a standard test data set.

In an embodiment of the present invention, performing the lightweight processing based on the preprocessed original model includes: counting the weight value range of each layer