US-20260127492-A1 - DATA PROCESSING SYSTEM AND MODEL OPTIMIZATION METHOD
Abstract
A data processing system, which performs model optimization for a first model executed on a platform, comprises a first processing unit and a second processing unit. The first processing unit is configured to capture a set of statistical data of the first model on the platform, and to generate trace data based on the statistical data, wherein the trace data indicates a plurality of performance metrics of the first model. The second processing unit is configured to execute a second model that analyzes the performance metrics indicated by the trace data to generate advice data for the first model. The advice data comprises a suggestion for optimizing the first model and/or a bottleneck identification indicating a performance bottleneck of the first model.
Inventors
- Ping-Hsun Hsieh
- Wei-Liang Kuo
- Ming-Yu Hung
Assignees
- MEDIATEK INC.
Dates
- Publication Date: 2026-05-07
- Application Date: 2025-07-16
Claims (20)
- 1. A data processing system, for performing a model optimization for a first model which is executed on a platform, the data processing system comprising: a first processing unit, configured to capture a set of statistical data of the first model, and to generate trace data based on the statistical data, wherein the trace data indicates a plurality of performance metrics of the first model; and a second processing unit, configured to execute a second model to analyze the performance metrics indicated by the trace data to generate advice data for the first model, wherein the advice data comprises a suggestion for optimizing the first model and/or a bottleneck identification for indicating a performance bottleneck of the first model.
- 2. The data processing system of claim 1, wherein the second model is a large language model (LLM) and is different from the first model.
- 3. The data processing system of claim 1, wherein the performance metrics comprise an execution time for each layer or operation of the first model, a hardware resource usage associated with the hardware resources of the platform which are utilized by the first model, power consumption and temperature monitoring for energy efficiency issues, a memory access pattern, and data transfer statistics.
- 4. The data processing system of claim 1, wherein the first processing unit is further configured to convert the trace data into visual data, which is a visualization graph of the trace data.
- 5. The data processing system of claim 4, further comprising: a user interface, configured to display the visual data.
- 6. The data processing system of claim 5, wherein the second processing unit is further configured to mark a plurality of contents of the advice data in the visual data, and the user interface is further configured to display the marked contents.
- 7. The data processing system of claim 1, wherein the second processing unit comprises: a training module, configured to retrieve historical trace data and to provide the historical trace data as a first portion of training data, wherein the training data is used to train the second model in a training phase.
- 8. The data processing system of claim 7, wherein the second model is executed by the second processing unit in an execution phase subsequent to the training phase to analyze the performance metrics of the first model.
- 9. The data processing system of claim 7, wherein the second processing unit further comprises: a database, for storing key information indicating a relationship between the historical trace data and the performance metrics of the first model.
- 10. The data processing system of claim 9, wherein the training module is further configured to generate a prompt based on the key information and to provide the prompt as a second portion of the training data.
- 11. A model optimization method for a first model which is executed on a platform, comprising: capturing a set of statistical data of the first model on the platform; generating trace data based on the statistical data, wherein the trace data indicates a plurality of performance metrics of the first model; and executing a second model to analyze the performance metrics indicated by the trace data to generate advice data for the first model, wherein the advice data comprises a suggestion for optimizing the first model and/or a bottleneck identification for indicating a performance bottleneck of the first model.
- 12. The model optimization method of claim 11, wherein the second model is a large language model (LLM) and is different from the first model.
- 13. The model optimization method of claim 11, wherein the performance metrics comprise an execution time for each layer or operation of the first model, a hardware resource usage associated with the hardware resources of the platform which are utilized by the first model, power consumption and temperature monitoring for energy efficiency issues, a memory access pattern, and data transfer statistics.
- 14. The model optimization method of claim 11, wherein after the step of generating trace data based on the statistical data, the method further comprises: converting the trace data into visual data, which is a visualization graph of the trace data.
- 15. The model optimization method of claim 14, wherein after the step of converting the trace data into the visual data, the method further comprises: displaying the visual data through a user interface.
- 16. The model optimization method of claim 15, wherein a plurality of contents of the advice data are marked in the visual data, and the marked contents are displayed by the user interface.
- 17. The model optimization method of claim 11, wherein before the step of executing a second model to analyze the performance metrics indicated by the trace data, the method further comprises: retrieving historical trace data; providing the historical trace data as a first portion of training data; and training the second model with the training data in a training phase.
- 18. The model optimization method of claim 17, wherein in the step of executing a second model to analyze the performance metrics indicated by the trace data, the second model is executed in an execution phase subsequent to the training phase.
- 19. The model optimization method of claim 17, wherein before the step of training the second model in a training phase, the method further comprises: storing key information indicating a relationship between the historical trace data and the performance metrics of the first model.
- 20. The model optimization method of claim 19, further comprising: generating a prompt based on the key information; and providing the prompt as a second portion of the training data.
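The pipeline recited in the claims above — capture statistical data, derive trace data carrying performance metrics, have a second model analyze it, and emit advice data — can be sketched as follows. All names (`TraceData`, `AdviceData`, `analyze`) and the 10 ms threshold are illustrative assumptions, and a simple threshold rule stands in for the second model, which the claims describe as an LLM.

```python
from dataclasses import dataclass, field

@dataclass
class TraceData:
    """Performance metrics of the first (profiled) model -- cf. claim 3."""
    layer_times_ms: dict   # execution time per layer or operation
    hw_usage: dict         # hardware resource usage on the platform
    power_w: float         # power consumption, for energy-efficiency checks
    mem_stats: dict        # memory access pattern / data transfer statistics

@dataclass
class AdviceData:
    """Output of the second model: suggestions and bottleneck identification."""
    suggestions: list = field(default_factory=list)
    bottlenecks: list = field(default_factory=list)

def analyze(trace: TraceData, slow_ms: float = 10.0) -> AdviceData:
    """Stand-in for the second model: flag layers slower than a threshold."""
    advice = AdviceData()
    for layer, t in trace.layer_times_ms.items():
        if t > slow_ms:
            advice.bottlenecks.append(layer)
            advice.suggestions.append(f"consider fusing or quantizing {layer}")
    return advice

trace = TraceData({"conv1": 4.2, "conv2": 18.7}, {"npu_util": 0.9}, 1.3, {})
advice = analyze(trace)
print(advice.bottlenecks)  # -> ['conv2']
```

In the claimed system, the `analyze` step would be performed by an LLM rather than a fixed rule; the sketch only illustrates the data flow between the two processing units.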
Description
This application claims the benefit of U.S. provisional application Ser. No. 63/715,673, filed Nov. 4, 2024, the disclosure of which is incorporated by reference herein in its entirety.

TECHNICAL FIELD

The disclosure relates to a model optimization mechanism, and particularly to a data processing system and a model optimization method for a target model executed on a platform.

BACKGROUND

To evaluate the performance of a target model, a toolset named a "profiling system" is often utilized. The profiling system may perform "performance profiling" for the target model: it collects trace data of a computational model according to statistical data gathered while the computational model is executed on a hardware platform, and the trace data indicates various performance metrics. After the profiling system collects the trace data, researchers must manually analyze the trace data to identify bottlenecks and inefficiencies of the target model, and then provide suggestions for optimizing it. This process incurs a significant time cost. Furthermore, the bottlenecks of the target model cannot be precisely identified through manual effort alone.

In view of the above issues, it is desirable to have an improved model optimization mechanism that can automatically and precisely analyze trace data of the computational model in order to identify the bottlenecks of the target model.

SUMMARY

According to one embodiment of the present disclosure, a data processing system is provided. The data processing system performs a model optimization for a first model which is executed on a platform, and comprises a first processing unit and a second processing unit.
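The performance profiling described above — collecting statistical data while the model executes on the platform — can be illustrated with a minimal wall-clock profiler. `Profiler` and `record` are hypothetical names for this sketch, not components of the disclosed system.

```python
import time
from contextlib import contextmanager

class Profiler:
    """Minimal sketch: record per-layer wall time while a model executes."""
    def __init__(self):
        self.stats = {}  # statistical data: layer name -> elapsed milliseconds

    @contextmanager
    def record(self, layer_name):
        start = time.perf_counter()
        try:
            yield
        finally:
            self.stats[layer_name] = (time.perf_counter() - start) * 1000.0

prof = Profiler()
with prof.record("conv1"):
    sum(i * i for i in range(10_000))  # stands in for executing one layer
```

The resulting `stats` dictionary plays the role of the statistical data from which the first processing unit would generate trace data.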
The first processing unit is configured to capture a set of statistical data of the first model on the platform, and to generate trace data based on the statistical data, wherein the trace data indicates a plurality of performance metrics of the first model. The second processing unit is configured to execute a second model to analyze the performance metrics indicated by the trace data and to generate advice data for the first model. The advice data comprises a suggestion for optimizing the first model and/or a bottleneck identification indicating a performance bottleneck of the first model.

According to another embodiment of the present disclosure, a model optimization method is provided. The model optimization method is for a first model which is executed on a platform, and comprises the following steps. A set of statistical data of the first model on the platform is captured. Trace data is generated based on the statistical data, wherein the trace data indicates a plurality of performance metrics of the first model. A second model is executed to analyze the performance metrics indicated by the trace data and to generate advice data for the first model. The advice data comprises a suggestion for optimizing the first model and/or a bottleneck identification indicating a performance bottleneck of the first model.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of a data processing system according to an embodiment of the present disclosure.
FIG. 2 is a schematic diagram of the visual data.
FIG. 3 is a block diagram of the first processing unit.
FIG. 4 is a block diagram of the second processing unit.
FIG. 5 is a flow diagram of a model optimization method according to an embodiment of the present disclosure.
FIG. 6 is a flow diagram of a model optimization method according to still another embodiment of the present disclosure.
FIG. 7 is a flow diagram of a model optimization method according to yet another embodiment of the present disclosure.

In the following detailed description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the disclosed embodiments. It will be apparent, however, that one or more embodiments may be practiced without these specific details. In other instances, well-known structures and devices are shown schematically in order to simplify the drawings.

DETAILED DESCRIPTION

Referring to FIG. 1, which is a block diagram of a data processing system 1000 according to an embodiment of the present disclosure, the data processing system 1000 is used to perform a model optimization for a first model m1. The first model m1 is referred to as a "target model", and may be any type of computational model, e.g., a convolutional neural network (CNN) model. The first model m1 is deployed and executed on a platform 2000, which is a hardware device. For example, the platform 2000 may be a portable or fixed hardware device, e.g., a smartphone, a wearable device, a tablet computer, a laptop computer, or a desktop computer. The platform 2000 has hardware resources, e.g., computing cores, memory devices, and communication bandwidth.
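Claims 10 and 20 describe generating a prompt from key information and providing it as part of the training data for the second model. A rough sketch of turning per-layer metrics into such a prompt follows; the function name, wording, and sort order are assumptions for illustration only.

```python
def build_prompt(layer_times_ms: dict) -> str:
    """Format per-layer execution times as a prompt for the advisor model."""
    lines = ["Analyze these per-layer execution times (ms) and identify bottlenecks:"]
    # List the slowest layers first so the bottleneck candidates lead the prompt.
    for layer, t in sorted(layer_times_ms.items(), key=lambda kv: -kv[1]):
        lines.append(f"- {layer}: {t:.1f}")
    return "\n".join(lines)

prompt = build_prompt({"conv1": 4.2, "fc": 1.1, "conv2": 18.7})
print(prompt)
```

In the disclosed system such a prompt would be derived from the key information stored in the database, pairing historical trace data with the performance metrics of the first model.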