CN-122019104-A - Digital drainage twin large model end-cloud collaborative deployment method
Abstract
The invention discloses an end-cloud collaborative deployment method for a digital drainage twin large model, characterized by comprising the steps of: obtaining an original model and applying lightweight processing to obtain a lightweight model; deploying the lightweight model through a three-layer architecture; performing computing power scheduling according to tasks; carrying out end-cloud collaborative testing; and training and updating the model based on data to complete model iteration. The method is suitable for the computing power scheduling and model operation of urban flood control and drainage digital twin management systems and intelligent terminals.
Inventors
- WANG HAO
- ZHANG DEQUAN
Assignees
- 上海中井汉鼎数字技术有限公司
Dates
- Publication Date
- 2026-05-12
- Application Date
- 2026-02-09
Claims (9)
- 1. An end-cloud collaborative deployment method for a digital drainage twin large model, characterized by comprising the following steps: acquiring an original model and performing lightweight processing to obtain a lightweight model; deploying the lightweight model through a three-layer architecture; performing computing power scheduling according to tasks; performing end-cloud collaborative testing; and training and updating the model based on data to complete model iteration.
- 2. The method of claim 1, wherein acquiring the original model and performing lightweight processing to obtain the lightweight model comprises: obtaining an original model; calculating the contribution degree of each operator, pruning redundant operators whose contribution falls below a threshold, and fine-tuning the pruned model; performing knowledge distillation training on the pruned model; configuring mixed quantization parameters and quantizing to obtain a quantized model (the quantization step is sketched after this list); and converting the format of the quantized model to output an OM/HIM-format lightweight model.
- 3. The method of claim 2, wherein the mixed quantization parameters comprise INT4 weight quantization and INT8 activation quantization.
- 4. The method according to claim 1, wherein acquiring the original model and performing lightweight processing to obtain the lightweight model must satisfy a precision loss constraint: ΔAcc ≤ 3%, where ΔAcc is the accuracy loss of the lightweight model relative to the original model.
- 5. The method of claim 1, wherein deploying the lightweight model through the three-layer architecture comprises: deploying the HIM model at the edge terminal layer; deploying the OM model at the edge computing layer; and deploying the full-precision original model at the cloud service layer (see the deployment sketch after this list).
- 6. The method of claim 1, wherein performing computing power scheduling according to the task comprises mapping matrix operations, control logic, and attention mechanisms onto NPU, CPU, and GPU computing units, respectively, according to task type (illustrated in the dispatch sketch after this list).
- 7. The method according to claim 1, wherein the computing power scheduling efficiency in performing computing power scheduling according to the task is given by: [formula not legible in the source]; where Pi is the power consumption of the class-i computing unit and Ti is the running time of the class-i computing unit (an assumed form is sketched after this list).
- 8. The method of claim 1, wherein performing the end-cloud collaborative test comprises: starting the system under a normal network to verify the data synchronization and policy issuing functions; and simulating a network-break scenario to verify the autonomous reasoning capability of the edge nodes (a toy test harness follows this list).
- 9. The method of claim 1, wherein performing the end-cloud collaborative test must satisfy an end-side push delay constraint: T_push ≤ 50 ms.
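The mixed-precision scheme of claims 2-3 can be pictured with a minimal NumPy sketch: symmetric uniform quantize-then-dequantize with a per-tensor scale, INT4 for weights and INT8 for activations. The calibration method, quantization toolchain, and OM/HIM export step are not specified in the source, so the function names and the error proxy here are illustrative only.

```python
import numpy as np

def fake_quantize(x: np.ndarray, num_bits: int) -> np.ndarray:
    """Symmetric uniform quantize-then-dequantize with a per-tensor scale."""
    qmax = 2 ** (num_bits - 1) - 1                      # 7 for INT4, 127 for INT8
    scale = max(float(np.max(np.abs(x))) / qmax, 1e-12)  # avoid divide-by-zero
    q = np.clip(np.round(x / scale), -qmax, qmax)
    return q * scale

rng = np.random.default_rng(0)
weights = rng.standard_normal((64, 64)).astype(np.float32)
activations = rng.standard_normal((8, 64)).astype(np.float32)

w_q = fake_quantize(weights, num_bits=4)      # claim 3: INT4 weight quantization
a_q = fake_quantize(activations, num_bits=8)  # claim 3: INT8 activation quantization

# Relative quantization error, one crude proxy for the <= 3% accuracy-loss budget.
for name, ref, quant in [("INT4 weights", weights, w_q),
                         ("INT8 activations", activations, a_q)]:
    err = np.linalg.norm(ref - quant) / np.linalg.norm(ref)
    print(f"{name}: relative error {err:.4f}")
```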
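Claims 5-6 amount to two small mappings: one model artifact per layer of the terminal-edge-cloud architecture, and a task-type routing table onto heterogeneous computing units. A sketch of both follows; the file names and task labels are assumptions, since the source names only the formats (HIM/OM/full-precision) and the NPU/CPU/GPU split.

```python
# Claim 5: one model artifact per layer (file names are illustrative).
DEPLOYMENT_PLAN = {
    "edge_terminal_layer": "drainage_twin.him",       # HIM lightweight model
    "edge_computing_layer": "drainage_twin.om",       # OM model
    "cloud_service_layer": "drainage_twin_full.ckpt", # full-precision original
}

# Claim 6: map task types onto heterogeneous computing units.
TASK_TO_UNIT = {
    "matrix_ops": "NPU",
    "control_logic": "CPU",
    "attention": "GPU",
}

def dispatch(task_type: str) -> str:
    """Return the computing unit a task of the given type is scheduled onto."""
    unit = TASK_TO_UNIT.get(task_type)
    if unit is None:
        raise ValueError(f"no scheduling rule for task type {task_type!r}")
    return unit

if __name__ == "__main__":
    for t in ("matrix_ops", "control_logic", "attention"):
        print(f"{t:14s} -> {dispatch(t)}")
```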
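Claim 7's efficiency formula is not legible in the extracted text; the only definitions that survive are Pi (power consumption of computing unit i) and Ti (its running time), which together give the energy term Σ Pi·Ti. The sketch below computes that term and, as an assumption, reads efficiency as useful work per joule; the per-unit figures are hypothetical.

```python
def total_energy_joules(power_w, time_s):
    """Energy consumed across computing units: sum_i P_i * T_i."""
    return sum(p * t for p, t in zip(power_w, time_s))

# Hypothetical per-unit figures (not from the patent): NPU, CPU, GPU.
P = [2.0, 1.0, 3.5]        # P_i: power draw of unit i, watts
T = [0.010, 0.030, 0.015]  # T_i: busy time of unit i, seconds

E = total_energy_joules(P, T)

# Assumed efficiency reading (the claim's own formula is not reproduced
# in the source): inferences completed per joule of heterogeneous compute.
inferences_completed = 1
eta = inferences_completed / E
print(f"energy = {E * 1e3:.2f} mJ, assumed efficiency = {eta:.1f} inferences/J")
```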
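Claims 8-9 describe a two-phase acceptance test: verify data synchronization and policy issuing under a normal network, then simulate a network break and confirm the edge node decides autonomously within the 50 ms push-delay budget. A toy harness for the network-break phase, with the EdgeNode class and its decision rule invented purely for illustration:

```python
import time

PUSH_DELAY_LIMIT_S = 0.050  # claim 9 / disclosure: end-side push delay <= 50 ms

class EdgeNode:
    """Toy edge node: consults the cloud when reachable, reasons locally otherwise."""
    def __init__(self, cloud_online: bool):
        self.cloud_online = cloud_online

    def infer(self, sensor_reading: float) -> str:
        source = "cloud" if self.cloud_online else "edge-local"
        # Placeholder decision rule standing in for the deployed model.
        level = "ALERT" if sensor_reading > 0.8 else "normal"
        return f"{level} (decided at {source})"

def test_network_break():
    """Claim 8, phase 2: break the network and check autonomous reasoning."""
    node = EdgeNode(cloud_online=False)
    t0 = time.perf_counter()
    result = node.infer(0.93)
    delay = time.perf_counter() - t0
    assert "edge-local" in result, "edge node failed to decide autonomously"
    assert delay <= PUSH_DELAY_LIMIT_S, f"push delay {delay * 1e3:.1f} ms exceeds 50 ms"
    print(result, f"in {delay * 1e3:.3f} ms")

test_network_break()
```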
Description
Digital drainage twin large model end-cloud collaborative deployment method

Technical Field

The invention relates to the technical field of digital twins and artificial intelligence model lightweighting, in particular to an end-cloud collaborative deployment method for a digital drainage twin large model.

Background

In urban drainage twin systems, a cloud-centralized deployment architecture is currently the mainstream implementation in the industry, but it suffers from three core technical defects in practice. First, the system depends heavily on the cloud: when network transmission is interrupted, edge-side equipment loses autonomous decision-making capability, so waterlogging early-warning response delay exceeds 500 ms, seriously impairing warning timeliness and emergency response efficiency. Second, an AI large model for a drainage twin scenario typically has a trillion-parameter scale; if deployed directly on an edge terminal, single-device running power consumption exceeds 20 W and model inference delay exceeds 100 ms, which cannot serve low-power, low-latency application scenarios such as underground terminals and gate stations. Third, the computing power scheduling of the existing architecture is relatively simplistic: no targeted allocation or optimization is performed for the heterogeneous CPU, GPU, and NPU resources on the edge side, so overall model running efficiency is low and the hardware's computing potential is hard to exploit fully.

In the prior art, although related AI terminal schemes exist, such as a drainage unit twin system and method based on AI and 3D technology and a drainage unit twin management system, these schemes generally lack a lightweight deployment design adapted to edge-side equipment and are therefore difficult to apply effectively on edge terminals. In addition, although the HarmonyOS (Hongmeng) 6 system offers end-cloud collaboration, no dedicated layered deployment standard or adaptation scheme for a drainage twin large model has yet been formed, so the balance among model precision, terminal computing power, and running power consumption cannot be resolved effectively.

In summary, existing urban drainage twin systems still have many technical defects in deployment architecture, model adaptation, computing power scheduling, and standardization, which restrict their large-scale application and efficiency improvement in edge-side scenarios; a technical scheme is needed that combines autonomous decision-making capability, lightweight deployment, and optimal scheduling of heterogeneous computing power.
Disclosure of Invention

Aiming at the defects of the prior art, the invention provides an end-cloud collaborative deployment method for a digital drainage twin large model, with the following goals: constructing a terminal-edge-cloud three-layer deployment architecture so that edge nodes can reason autonomously in a network-break state, reducing the system's dependence on the cloud; compressing the trillion-parameter large model to a scale loadable on edge terminals through model lightweighting and quantization, achieving an end-side push processing delay of no more than 50 ms and a single-terminal power consumption of no more than 5 W on the premise that the precision loss does not exceed 3%; and optimizing the heterogeneous computing power scheduling strategy so that the CPU, GPU, and NPU work cooperatively, improving model running efficiency.

The invention provides an end-cloud collaborative deployment method for a digital drainage twin large model, comprising the following steps: acquiring an original model and performing lightweight processing to obtain a lightweight model; deploying the lightweight model through a three-layer architecture; performing computing power scheduling according to tasks; performing end-cloud collaborative testing; and training and updating the model based on data to complete model iteration.

In an embodiment of the present invention, acquiring the original model and performing lightweight processing to obtain the lightweight model includes: obtaining an original model; calculating the contribution degree of each operator, pruning redundant operators whose contribution falls below a threshold, and fine-tuning the pruned model (the pruning step and the acceptance thresholds are sketched below); performing knowledge distillation training on the pruned model; and configuring mixed quantization parameters and quantizing to obtain a quantized model
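The operator-pruning step in the embodiment above scores each operator, cuts those below a threshold, and then fine-tunes. The patent does not define the contribution metric, so the sketch below uses an L1-magnitude proxy as one common stand-in; the operator names and toy model are illustrative.

```python
import numpy as np

def operator_contributions(weights: dict) -> dict:
    """Score each operator by an L1-magnitude proxy for its contribution.
    (The source does not define the metric; this is an assumed stand-in.)"""
    return {name: float(np.abs(w).sum()) for name, w in weights.items()}

def prune(weights: dict, threshold: float):
    """Drop operators whose contribution falls below the threshold."""
    scores = operator_contributions(weights)
    kept = {n: w for n, w in weights.items() if scores[n] >= threshold}
    dropped = sorted(set(weights) - set(kept))
    return kept, dropped

rng = np.random.default_rng(0)
model = {f"op{i}": rng.standard_normal((16, 16)) * s
         for i, s in enumerate([1.0, 0.01, 0.5])}

kept, dropped = prune(model, threshold=10.0)
print("pruned operators:", dropped)
# Fine-tuning and knowledge distillation would follow on the kept subgraph.
```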
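The three acceptance thresholds stated in the Disclosure (precision loss ≤ 3%, end-side push delay ≤ 50 ms, single-terminal power ≤ 5 W) reduce to a simple gate over measured deployment metrics. A minimal sketch, with the measured figures invented for the example:

```python
from dataclasses import dataclass

@dataclass
class DeploymentMetrics:
    accuracy_loss_pct: float  # precision loss vs. the original model, percent
    push_delay_ms: float      # end-side push processing delay, milliseconds
    terminal_power_w: float   # single-terminal running power, watts

# Acceptance thresholds stated in the Disclosure.
LIMITS = DeploymentMetrics(accuracy_loss_pct=3.0,
                           push_delay_ms=50.0,
                           terminal_power_w=5.0)

def acceptable(m: DeploymentMetrics) -> bool:
    """True if a lightweight deployment meets all three stated constraints."""
    return (m.accuracy_loss_pct <= LIMITS.accuracy_loss_pct
            and m.push_delay_ms <= LIMITS.push_delay_ms
            and m.terminal_power_w <= LIMITS.terminal_power_w)

# Example figures are illustrative, not measurements from the patent.
measured = DeploymentMetrics(accuracy_loss_pct=2.1,
                             push_delay_ms=38.0,
                             terminal_power_w=4.2)
print("deployment accepted:", acceptable(measured))
```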