CN-122018916-A - Model compiling method and device

CN122018916A

Abstract

The present disclosure provides a model compilation method and apparatus. The method comprises: determining a hardware feature vector of target hardware and analysis data of a model to be run on the target hardware; generating a first optimization strategy and a second optimization strategy, based on the hardware feature vector and the analysis data, using a decision rule and a machine learning model, respectively; fusing the first and second optimization strategies to obtain a target optimization strategy; and, when a confidence evaluation result of the target optimization strategy meets the requirement, applying the target optimization strategy to a compiler so that the compiler compiles the model according to it.

Inventors

  • WANG Kun
  • ZHANG Tao
  • LI Peiwen
  • GUO Yangbo

Assignees

  • 重庆长安汽车股份有限公司 (Chongqing Changan Automobile Co., Ltd.)

Dates

Publication Date
2026-05-12
Application Date
2026-01-22

Claims (10)

  1. A model compilation method, the method comprising: determining a hardware feature vector of target hardware and analysis data of a model to be run on the target hardware; generating, based on the hardware feature vector and the analysis data, a first optimization strategy and a second optimization strategy using a decision rule and a machine learning model, respectively; fusing the first optimization strategy and the second optimization strategy to obtain a target optimization strategy; and, when a confidence evaluation result of the target optimization strategy meets a requirement, applying the target optimization strategy to a compiler so that the compiler compiles the model to be run according to the target optimization strategy.
  2. The method according to claim 1, wherein applying the target optimization strategy to a compiler so that the compiler compiles the model to be run according to the target optimization strategy comprises: adding, according to the target optimization strategy, a corresponding optimization module to the compiler, so that the model to be run is compiled through that optimization module, the optimization module comprising at least one of a memory optimization module, a layout transformation module, a hardware-aware operator fusion module, a hardware-aware loop optimization module, and an instruction set optimization module.
  3. The method according to claim 2, wherein compiling the model to be run through the optimization module corresponding to the target optimization strategy comprises: compiling the model to be run by invoking the optimization module corresponding to the target optimization strategy in at least one of a Relay graph optimization stage, a tensor expression optimization stage, and an auto-tuning stage of compilation.
  4. The method according to any one of claims 1-3, further comprising: acquiring runtime data of the compiled model to be run, wherein the runtime data comprises the execution time of the compiled model on the target hardware; and adjusting optimization parameters in the target optimization strategy using the runtime data until the runtime data meets a requirement.
  5. The method according to claim 4, further comprising: updating the adjusted target optimization strategy into an optimization strategy library, wherein the optimization strategy library comprises the adjusted target optimization strategy corresponding to running the model to be run on the target hardware; and iteratively training the machine learning model using the optimization strategy library as prior data.
  6. The method of claim 1, wherein determining analysis data of the model to be run on the target hardware comprises: analyzing the computation graph of the model to be run to obtain the analysis data, wherein the analysis data comprises operator type features, data dependency relations within the model, and target nodes, the target nodes being nodes whose computation amount exceeds a preset threshold.
  7. The method of any one of claims 1-6, wherein determining the hardware feature vector of the target hardware comprises: testing the target hardware to obtain test data, the test data comprising at least one of computing capability test data, memory system test data, parallel capability test data, cache hierarchy test data, and instruction set test data; and quantizing and normalizing the at least one kind of test data to obtain the hardware feature vector of the target hardware, and storing the hardware feature vector in a hardware feature database.
  8. A model compilation apparatus, comprising a determining unit, a generating unit, a fusion unit, and an application unit; the determining unit is configured to determine a hardware feature vector of target hardware and analysis data of a model to be run on the target hardware; the generating unit is configured to generate a first optimization strategy and a second optimization strategy using a decision rule and a machine learning model, respectively, based on the hardware feature vector and the analysis data; the fusion unit is configured to fuse the first optimization strategy and the second optimization strategy to obtain a target optimization strategy; and the application unit is configured to apply the target optimization strategy to a compiler when a confidence evaluation result of the target optimization strategy meets a requirement, so that the compiler compiles the model to be run according to the target optimization strategy.
  9. An electronic device comprising a processor and a memory storing instructions executable by the processor, the instructions, when executed by the processor, implementing the method of any one of claims 1 to 7.
  10. A computer-readable storage medium on which a computer program is stored, the computer program, when executed by a processor, implementing the method of any one of claims 1 to 7.
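The hardware feature vector construction described in claim 7 (quantizing each benchmark result and normalizing it into a fixed-order vector) can be sketched as follows. The metric names, calibration ranges, and sample numbers are illustrative assumptions, not values from the disclosure.

```python
# Sketch of claim 7: raw hardware test results are clamped to assumed
# calibration ranges and min-max normalized into a fixed-order vector.
# All field names and numbers below are hypothetical.

RAW_TESTS = {
    "compute_gflops": 512.0,    # computing-capability test
    "mem_bandwidth_gbs": 25.6,  # memory-system test
    "parallel_threads": 8.0,    # parallel-capability test
    "l2_cache_kb": 1024.0,      # cache-hierarchy test
    "simd_width_bits": 128.0,   # instruction-set test
}

RANGES = {  # assumed (min, max) calibration range per metric
    "compute_gflops": (0.0, 2048.0),
    "mem_bandwidth_gbs": (0.0, 102.4),
    "parallel_threads": (1.0, 64.0),
    "l2_cache_kb": (0.0, 8192.0),
    "simd_width_bits": (0.0, 512.0),
}

def hardware_feature_vector(tests, ranges):
    """Min-max normalize each test result into [0, 1], in sorted key order."""
    vec = []
    for key in sorted(ranges):
        lo, hi = ranges[key]
        val = min(max(tests[key], lo), hi)  # clamp before normalizing
        vec.append(round((val - lo) / (hi - lo), 4))
    return vec

print(hardware_feature_vector(RAW_TESTS, RANGES))
# keys in order: compute_gflops, l2_cache_kb, mem_bandwidth_gbs,
#                parallel_threads, simd_width_bits
```

The fixed key order matters: downstream consumers (the decision rule and the machine learning model of claim 1) must see features in a stable layout, which is also what makes the vector storable in the hardware feature database of claim 7.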

Description

Model compiling method and device

Technical Field

The present application relates to the field of computer software technology, and in particular to a method and apparatus for model compilation.

Background

With the development of deep learning, efficiently deploying models on diverse hardware has become a key problem. Model compilation is the necessary link between a model and the hardware, and its optimization quality directly affects the model's runtime performance on the target hardware. In the related art, some compilation frameworks support deploying models from different deep learning frameworks to a variety of hardware platforms and provide an auto-tuning mechanism. For example, the AutoTVM module in the TVM framework searches for an optimal configuration by exploring a search space, but it relies on predefined search spaces, its tuning process is time-consuming, and it is difficult to adjust dynamically to different hardware characteristics. In addition, the compilation process analyzes the hardware and the model insufficiently, so the optimization strategy lacks pertinence and compilation performance suffers. A new approach to these problems is therefore needed.

Disclosure of Invention

A first object of the present disclosure is to provide a model compilation method that addresses the lack of targeted optimization strategies, caused by insufficient analysis of the hardware and the model during compilation, in related schemes; a second object is to provide a model compilation apparatus; a third object is to provide an electronic device; a fourth object is to provide a computer-readable storage medium; and a fifth object is to provide a computer program product.
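The model-side analysis referred to above (and detailed in claim 6: operator type features, data dependency relations, and target nodes whose computation exceeds a preset threshold) can be sketched over a toy computation graph. The graph, FLOP counts, and threshold below are illustrative assumptions.

```python
# Sketch of claim 6's computation-graph analysis: walk a toy graph and
# collect operator-type counts, data dependencies, and "target nodes"
# whose FLOP count exceeds a preset threshold. All numbers are hypothetical.

GRAPH = [  # (node_name, op_type, inputs, flops)
    ("conv1", "conv2d", ["input"], 2_000_000),
    ("relu1", "relu",   ["conv1"], 10_000),
    ("conv2", "conv2d", ["relu1"], 4_000_000),
    ("fc1",   "dense",  ["conv2"], 800_000),
]

def analyze(graph, flop_threshold=1_000_000):
    """Return the three kinds of analysis data named in claim 6."""
    op_types, deps, targets = {}, {}, []
    for name, op, inputs, flops in graph:
        op_types[op] = op_types.get(op, 0) + 1   # operator type features
        deps[name] = inputs                       # data dependency relations
        if flops > flop_threshold:                # high-compute target nodes
            targets.append(name)
    return {"op_types": op_types, "deps": deps, "targets": targets}

result = analyze(GRAPH)
print(result["targets"])  # the nodes worth focused optimization effort
```

The target-node list is what lets the strategy generators concentrate on the few operators that dominate runtime instead of treating every node uniformly.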
To achieve the above purposes, the technical scheme adopted in the present disclosure is as follows. The method comprises: determining a hardware feature vector of target hardware and analysis data of a model to be run on the target hardware; generating a first optimization strategy and a second optimization strategy, using a decision rule and a machine learning model respectively, based on the hardware feature vector and the analysis data; fusing the first and second optimization strategies into a target optimization strategy; and applying the target optimization strategy to a compiler when its confidence evaluation result meets the requirement, so that the compiler compiles the model accordingly. By this technical means, two sets of optimization strategies, one from a decision rule and one from a machine learning model, are generated from the hardware feature vector and the model's analysis data and fused into a target optimization strategy. The hardware characteristics and the structure of the model can thus be considered more comprehensively, improving the adaptability and accuracy of the optimization strategy. Moreover, the target optimization strategy is applied only when the confidence evaluation passes, which prevents inefficient or unstable strategies from being adopted and improves compilation efficiency and model performance.
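The fuse-then-gate flow described above can be sketched as follows. The disclosure does not specify a fusion rule or a threshold, so the per-option "keep the higher-confidence candidate" rule, the option names, the confidence values, and the 0.7 threshold are all illustrative assumptions.

```python
# Sketch of the core flow: fuse a rule-based strategy with an ML-predicted
# strategy, then apply the fused strategy only if its overall confidence
# clears a threshold. Each strategy maps option -> (choice, confidence).

def fuse_strategies(rule_strategy, ml_strategy):
    """Per optimization option, keep the candidate with higher confidence."""
    fused = {}
    for opt in set(rule_strategy) | set(ml_strategy):
        a = rule_strategy.get(opt, (None, 0.0))
        b = ml_strategy.get(opt, (None, 0.0))
        fused[opt] = a if a[1] >= b[1] else b
    return fused

def apply_if_confident(fused, compiler_config, threshold=0.7):
    """Gate on mean confidence, as in claim 1's confidence evaluation."""
    confs = [c for _, c in fused.values()]
    if sum(confs) / len(confs) >= threshold:
        compiler_config.update({opt: choice for opt, (choice, _) in fused.items()})
        return True
    return False  # low confidence: do not adopt the strategy

rule_strategy = {"operator_fusion": ("aggressive", 0.9),
                 "loop_tiling": ((32, 32), 0.6)}
ml_strategy = {"loop_tiling": ((64, 16), 0.8),
               "memory_layout": ("NHWC", 0.75)}

config = {}
fused = fuse_strategies(rule_strategy, ml_strategy)
applied = apply_if_confident(fused, config)
```

Here the ML model overrides the rule's tiling choice because its confidence is higher, while the rule's fusion choice survives; the gate then accepts the fused strategy since the mean confidence (about 0.82) exceeds 0.7.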
Further, applying the target optimization strategy to the compiler so that the compiler compiles the model to be run according to it comprises: adding, according to the target optimization strategy, a corresponding optimization module to the compiler so that the model is compiled through that module, the optimization module comprising at least one of a memory optimization module, a layout transformation module, a hardware-aware operator fusion module, a hardware-aware loop optimization module, and an instruction set optimization module. Further, compiling the model through the optimization module corresponding to the target optimization strategy comprises invoking that module in at least one of the Relay graph optimization stage, the tensor expression optimization stage, and the auto-tuning stage of compilation. The method further comprises: obtaining runtime data of the compiled model, the runtime data including the execution time of the compiled model on the target hardware, and adjusting the optimization parameters in the target optimization strategy using the runtime data until the runtime data meets the requirements.
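The runtime feedback loop just described (measure execution time, adjust a tunable parameter, stop once the requirement is met) can be sketched as follows. The `measure` cost model, the candidate tile sizes, and the 10.5 ms target are hypothetical stand-ins for real hardware measurements, which the disclosure obtains by actually running the compiled model on the target hardware.

```python
# Sketch of the feedback loop of claim 4: try candidate values of one
# optimization parameter, measure execution time, and keep the first value
# that meets the time requirement (else the best one observed).

def measure(tile_size):
    """Hypothetical cost model standing in for a real on-device run;
    it pretends 64 is the optimal tile size."""
    return 10.0 + abs(tile_size - 64) * 0.1  # milliseconds

def tune_tile_size(candidates, target_ms):
    """Adjust the parameter using runtime data until the requirement is met."""
    history = []
    for tile in candidates:
        t = measure(tile)
        history.append((tile, t))
        if t <= target_ms:
            return tile, history  # requirement met: adopt this value
    best = min(history, key=lambda x: x[1])  # fall back to best seen
    return best[0], history

best_tile, history = tune_tile_size([8, 16, 32, 64, 128], target_ms=10.5)
```

Per claim 5, the tuned strategy would then be written back to the optimization strategy library and used as prior data to iteratively retrain the machine learning model, closing the loop.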