CN-122019518-A - Model performance prediction method and device, electronic equipment and medium

CN 122019518 A

Abstract

Embodiments of the invention relate to the technical fields of artificial intelligence and high-performance computing, and provide a model performance prediction method, apparatus, electronic device, and medium. The method comprises: obtaining user-supplied prediction application parameters and constraint targets for a model under test; determining a target inference mode based on the prediction application parameters; and predicting the performance of the model under test based on the prediction application parameters, the constraint targets, and the target inference mode to obtain a performance prediction result, wherein the performance prediction result comprises a predicted performance metric, a hardware selection recommendation, or a model structure recommendation. The method thereby achieves accurate prediction of model performance and helps users make optimal resource-scheduling and model-selection decisions for a specific task.

Inventors

  • ZHAO ZHIHENG
  • ZHANG XIYA

Assignees

  • 杭州涿溪脑与智能研究所 (Hangzhou Zhuoxi Institute of Brain and Intelligence)

Dates

Publication Date
2026-05-12
Application Date
2025-12-29

Claims (10)

  1. A model performance prediction method, comprising: obtaining user-supplied prediction application parameters and constraint targets for a model under test; determining a target inference mode based on the prediction application parameters; and predicting the performance of the model under test based on the prediction application parameters, the constraint targets, and the target inference mode to obtain a performance prediction result, wherein the performance prediction result comprises a predicted performance metric, a hardware selection recommendation, or a model structure recommendation.
  2. The method of claim 1, wherein the target inference modes comprise a forward performance-aware mode, a hardware selection recommendation mode, and a model structure recommendation mode, and wherein determining the target inference mode based on the prediction application parameters comprises: when the prediction application parameters comprise dataset features, model structure features, and hardware type features, determining that the target inference mode is the forward performance-aware mode; when the prediction application parameters comprise dataset features, model structure features, and an expected time, determining that the target inference mode is the hardware selection recommendation mode; and when the prediction application parameters comprise dataset features, hardware type features, and an expected time, determining that the target inference mode is the model structure recommendation mode.
  3. The method of claim 1, wherein before obtaining the user-supplied prediction application parameters and constraint targets for the model under test, the method comprises: acquiring raw performance data of sample deep learning models; and cleaning the raw performance data and constructing a performance knowledge base from the cleaned performance data, wherein the performance knowledge base comprises datasets, model structures, hardware parameters, and performance metrics.
  4. The method of any one of claims 1-3, wherein predicting the performance of the model under test based on the prediction application parameters, the constraint targets, and the target inference mode to obtain a performance prediction result comprises: if the target inference mode is determined to be the forward performance-aware mode, generating first triplet information from the dataset features, model structure features, and hardware type features in the prediction application parameters; retrieving from the performance knowledge base a target performance record that exactly or isomorphically matches the first triplet information; and if such a target performance record is retrieved, directly returning the corresponding performance metric.
  5. The method of any one of claims 1-3, wherein predicting the performance of the model under test based on the prediction application parameters, the constraint targets, and the target inference mode to obtain a performance prediction result comprises: if the target inference mode is determined to be the hardware selection recommendation mode, traversing a hardware candidate set and, for the hardware information of each candidate in the set, generating second triplet information by combining the dataset features and the model structure features; computing a performance metric for each candidate based on the constraint targets and that candidate's second triplet information; and selecting from the hardware candidate set, based on each candidate's performance metric, the hardware configuration that satisfies the constraint targets at the lowest cost or highest utilization.
  6. The method of any one of claims 1-3, wherein predicting the performance of the model under test based on the prediction application parameters, the constraint targets, and the target inference mode to obtain a performance prediction result comprises: if the target inference mode is determined to be the model structure recommendation mode, extracting from the performance knowledge base a first candidate set of model structures for similar tasks; computing a performance metric for each model in the first candidate set based on the dataset features and hardware type features in the prediction application parameters; screening out a second candidate set of model structures that satisfy the constraint targets based on each model's performance metric and the constraint targets; querying the average precision metric recorded in the performance knowledge base for each target model structure in the second candidate set; and selecting, based on the average precision metrics, the target model structure that meets the time requirement on the designated hardware with the best precision.
  7. The method of claim 1, further comprising: predicting the performance of the model under test through a layered hybrid architecture based on the prediction application parameters, the constraint targets, and the target inference mode to obtain the performance prediction result, wherein the layered hybrid architecture comprises a first table-lookup matching layer, a second linear deduction layer, and a third machine-learning fitting layer.
  8. A model performance prediction apparatus, comprising: an acquisition module configured to obtain user-supplied prediction application parameters and constraint targets for a model under test; a determination module configured to determine a target inference mode based on the prediction application parameters; and a prediction module configured to predict the performance of the model under test based on the prediction application parameters, the constraint targets, and the target inference mode to obtain a performance prediction result, wherein the performance prediction result comprises a predicted performance metric, a hardware selection recommendation, or a model structure recommendation.
  9. An electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the model performance prediction method of any one of claims 1 to 7.
  10. A non-transitory computer-readable storage medium storing a computer program which, when executed by a processor, implements the model performance prediction method of any one of claims 1 to 7.
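The parameter-driven mode selection of claim 2 can be sketched as a simple dispatch on which inputs the user supplies. This is an illustrative reading of the claim, not the patent's implementation; the function and parameter names are hypothetical:

```python
def select_inference_mode(params: set) -> str:
    """Pick a target inference mode from the supplied prediction
    application parameters, following the rule in claim 2.
    `params` is the set of parameter kinds the user provided."""
    if {"dataset", "model_structure", "hardware_type"} <= params:
        return "forward_performance_aware"
    if {"dataset", "model_structure", "expected_time"} <= params:
        return "hardware_selection_recommendation"
    if {"dataset", "hardware_type", "expected_time"} <= params:
        return "model_structure_recommendation"
    raise ValueError("parameters do not match any inference mode")
```

Each mode is identified by which three of the four parameter kinds (dataset features, model structure features, hardware type features, expected time) are present.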
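The traversal-and-screening loop of claim 5 can be sketched as follows. All names are hypothetical, and the per-device metric predictor is passed in as a callable, since the patent leaves the prediction mechanism to the other claims:

```python
def recommend_hardware(candidates, predict_metric, meets_constraint, cost):
    """Claim 5 sketch: traverse the hardware candidate set, predict a
    performance metric for each device, keep those satisfying the
    constraint target, then return the lowest-cost configuration."""
    feasible = []
    for hw in candidates:
        metric = predict_metric(hw)  # e.g. predicted runtime on this device
        if meets_constraint(metric):
            feasible.append((cost(hw), hw))
    if not feasible:
        return None  # no candidate satisfies the constraint target
    return min(feasible)[0:2][1]  # cheapest feasible configuration
```

For example, with predicted runtimes {T4: 30 s, V100: 12 s, A100: 6 s}, a 15 s constraint, and relative costs {T4: 1, V100: 3, A100: 8}, the V100 is recommended: it is the cheapest device meeting the constraint. The "highest utilization" variant in the claim would simply swap the ranking key.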
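The layered hybrid architecture of claim 7 (table lookup, then linear deduction, then machine-learning fitting) can be sketched as a three-stage fallback. The linear-deduction rule shown (scaling a known record by a FLOPs ratio) is an assumed example of such a layer, not stated in the patent:

```python
def predict_latency(key, knowledge_base, ml_model, flops=None, ref=None):
    """Claim 7 sketch of the layered hybrid architecture (all names
    hypothetical). Falls through three layers in order."""
    # Layer 1: table-lookup matching against the performance knowledge base
    if key in knowledge_base:
        return knowledge_base[key]
    # Layer 2: linear deduction, e.g. scale a known reference record
    # (latency, flops) by this workload's FLOPs ratio
    if ref is not None and flops is not None:
        ref_latency, ref_flops = ref
        return ref_latency * flops / ref_flops
    # Layer 3: machine-learning fitting layer as the general fallback
    return ml_model(key)
```

The cascade mirrors the claim's ordering: exact knowledge-base hits are cheapest and most reliable, linear scaling covers near-misses, and a fitted model handles everything else.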

Description

Model performance prediction method and device, electronic equipment and medium

Technical Field

The invention relates to the technical fields of artificial intelligence and high-performance computing, and in particular to a model performance prediction method and apparatus, an electronic device, and a medium.

Background

With the rapid development of deep learning, the complexity of model structures and the size of datasets are growing exponentially, and demand for hardware compute (GPUs, TPUs, etc.) keeps rising. In practical AI engineering, users typically face the following challenges:

  • Time and cost estimation is difficult: the time, GPU memory, and compute resources that training or deployment will consume are hard to estimate accurately in advance. Users often rely on human experience or trial runs of a few epochs, which is error-prone and inefficient.
  • Hardware selection is difficult: faced with different types of compute cards (e.g., NVIDIA V100, A100, T4), users cannot tell which hardware offers the best cost-performance or can meet a specific latency requirement. This often leads to wasted resources or missed performance targets, and makes optimal hardware purchasing decisions hard to reach.
  • Model selection is blind: under hardware or time constraints (e.g., edge-device deployment), it is difficult to quickly determine which model structure can finish computation within a specified time while reaching the target accuracy. Extensive up-front testing typically costs significant hardware and labor.

Current performance evaluation relies mainly on simple benchmarking (Benchmark) or general theoretical compute formulas (e.g., FLOPs divided by hardware peak performance).
This approach is reasonably accurate on some mainstream model structures (e.g., ResNet, MobileNet) and mainstream datasets (e.g., ImageNet, COCO). However, it does not capture, summarize, or generalize the complex factors of dataset characteristics (e.g., dataset size, image resolution), operator differences between model architectures (the same operator behaves differently under different architectures), and bandwidth bottlenecks on real hardware. The prior art lacks a unified performance-awareness and recommendation mechanism that jointly considers the coupling of dataset, model, and hardware, and cannot support flexible multidimensional constraint-based recommendation.

Disclosure of Invention

The invention provides a model performance prediction method, apparatus, electronic device, and medium to remedy the prior art's lack of a unified performance-awareness and recommendation mechanism that jointly considers the coupling of dataset, model, and hardware and supports flexible multidimensional constraint-based recommendation, thereby achieving accurate prediction of model performance and helping users perform optimal resource scheduling and model selection for specific tasks.

The invention provides a model performance prediction method comprising: obtaining user-supplied prediction application parameters and constraint targets for a model under test; determining a target inference mode based on the prediction application parameters; and predicting the performance of the model under test based on the prediction application parameters, the constraint targets, and the target inference mode to obtain a performance prediction result, wherein the performance prediction result comprises a predicted performance metric, a hardware selection recommendation, or a model structure recommendation.
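The naive theoretical estimate discussed in the Background (runtime as FLOPs divided by hardware peak performance) can be written out explicitly. The numbers in the usage comment are illustrative only:

```python
def naive_runtime_estimate(model_flops, peak_flops_per_s, utilization=1.0):
    """Background-art estimate: runtime = FLOPs / effective throughput.
    It ignores dataset traits, operator mix, and bandwidth bottlenecks,
    which is exactly the limitation this invention targets."""
    return model_flops / (peak_flops_per_s * utilization)

# e.g. a ~4 GFLOP forward pass on hardware with a 100 TFLOP/s peak
t = naive_runtime_estimate(4e9, 100e12)
```

Because the real achievable utilization varies widely with operator mix and memory bandwidth, the single `utilization` scalar is a crude correction at best, which motivates the knowledge-base-driven approach described above.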
In one possible embodiment, the target inference modes comprise a forward performance-aware mode, a hardware selection recommendation mode, and a model structure recommendation mode: when the prediction application parameters comprise dataset features, model structure features, and hardware type features, the target inference mode is determined to be the forward performance-aware mode; when the prediction application parameters comprise dataset features, model structure features, and an expected time, the hardware selection recommendation mode; and when the prediction application parameters comprise dataset features, hardware type features, and an expected time, the model structure recommendation mode.

In one possible embodiment, the method further comprises: acquiring raw performance data of sample deep learning models; and cleaning the raw performance data, and constructing a performance knowledge