CN-119166483-B - Program performance analysis method, device, medium and product

Abstract

The invention relates to the field of computer technology and discloses a program performance analysis method, device, medium and product. The method comprises: starting a user program, creating a sub-thread by means of the user program, and taking an idle target structure out of a structure array; collecting asynchronous task performance data and synchronous task performance data within a set code interval, and updating them into the target structure; using the sub-thread to detect the updated target structure data and to calculate both the average utilization of the board acceleration module over the period between the start and the end of an asynchronous operation and the elapsed time from the start time to the end time of a synchronous operation; and obtaining, from the data calculated by the sub-thread, the average utilization of the board acceleration module and the average transmission bandwidth over the whole execution of the user program. The method thus fits the actual execution scenario, so the collected statistics are practical and flexible, reflect performance during execution more comprehensively, and enable accurate analysis of program performance.

Inventors

  • LIU HUI

Assignees

  • 浪潮电子信息产业股份有限公司 (Inspur Electronic Information Industry Co., Ltd.)

Dates

Publication Date
20260505
Application Date
20240930

Claims (11)

  1. A program performance analysis method, the method comprising:
     starting a user program, creating a sub-thread by means of the user program, and taking an idle target structure out of a structure array;
     collecting asynchronous task performance data and synchronous task performance data within a set code interval, and updating the asynchronous task performance data and the synchronous task performance data into the target structure;
     wherein the asynchronous task performance data comprise an asynchronous execution time, and, in collecting the asynchronous execution time, an asynchronous operation statistics interface is constructed using a structure comprising a first constructor and a first destructor, events are created and recorded in the first constructor and the first destructor respectively by means of the asynchronous operation statistics interface, and the asynchronous execution time is calculated through an interface provided by a software development kit;
     wherein the synchronous task performance data comprise a synchronous execution time, and, in collecting the synchronous execution time, a synchronous operation statistics interface is constructed using a structure comprising a second constructor and a second destructor, the synchronous operation statistics interface is used to record the start time of the measured content, the second destructor is used to record the end time of the measured content on leaving the scope of the measured content, and the synchronous execution time is calculated from the start time and the end time of the measured content;
     detecting the updated target structure data by means of the sub-thread, and calculating the average utilization of the board acceleration module over the period between the start and the end of an asynchronous operation, and the elapsed time from the start time to the end time of a synchronous operation;
     multiplying the operation time of each task operation saved by the sub-thread by the corresponding calculated average utilization, and dividing the result by the total operation time, to obtain the average utilization of the board acceleration module over the whole execution of the user program;
     obtaining the total amount of data transferred by the data transmission operation tasks from the amount of data transferred by each data transmission operation; and
     dividing the total amount of data transferred by the data transmission operation tasks by the elapsed time calculated by the sub-thread to obtain the average transmission bandwidth over the whole execution of the user program.
  2. The program performance analysis method according to claim 1, further comprising: recording, by means of the sub-thread, the idle time between two adjacent task operations, and, based on the recorded idle times, collecting a set number of idle time entries as potential optimization intervals, wherein each idle time entry comprises the idle duration, the idle start time, the operation name and the operation serial number.
  3. The program performance analysis method according to claim 1, further comprising, while updating the asynchronous task performance data and the synchronous task performance data into the target structure: when the user program writes update data, the write operation wakes the sub-thread from its blocked waiting state; and when the user program writes no update data, the sub-thread enters the blocked waiting state.
  4. The program performance analysis method according to claim 1, wherein detecting the updated target structure data by means of the sub-thread comprises: if the sub-thread detects that the start-timing operation of an asynchronous operation has been updated, querying the completion flag of the asynchronous operation's end event in real time, and calculating the average utilization of the board acceleration module over the period between the start and the end of the asynchronous operation.
  5. The method of claim 4, wherein, if the sub-thread detects that the start-timing operation of an asynchronous operation has been updated, querying the completion flag of the asynchronous operation's end event in real time and calculating the average utilization of the board acceleration module over the period between the start and the end of the asynchronous operation comprises: if the sub-thread detects that the user program has written an asynchronous operation event and finds that the event has started executing in the asynchronous task sequence, the sub-thread periodically detects, in real time, and records the utilization of the board acceleration module in use, until the user program updates the end event data and the query shows that the asynchronous operation's end event has completed; and calculating, from the recorded utilization values, the average utilization of the board acceleration module over the period between the start and the end of the asynchronous operation.
  6. The program performance analysis method according to claim 5, wherein detecting the updated target structure data by means of the sub-thread further comprises: if the sub-thread detects that the end-timing data of a synchronous operation has been updated, calculating the elapsed time from the start time to the end time of the synchronous operation.
  7. The program performance analysis method according to claim 1, wherein, when the user program runs as a plurality of processes, each process uses one accelerator card; each process creates a corresponding sub-thread, and each sub-thread stores the corresponding accelerator card device number so as to distinguish the data of different boards.
  8. The program performance analysis method according to claim 1, further comprising: creating a shared memory; and correspondingly, after calculating the average utilization of the board acceleration module over the period between the start and the end of the asynchronous operation and the elapsed time from the start time to the end time of the synchronous operation, the method further comprises: storing the calculated average utilization and elapsed time, the asynchronous task performance data and the synchronous task performance data in the shared memory.
  9. A program performance analysis apparatus, the apparatus comprising: a memory for storing a computer program; and a processor for implementing the steps of the program performance analysis method according to any one of claims 1 to 8 when executing the computer program.
  10. A non-volatile storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the program performance analysis method according to any one of claims 1 to 8.
  11. A computer program product comprising computer programs/instructions which, when executed by a processor, implement the steps of the program performance analysis method according to any one of claims 1 to 8.
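
As an illustration of the synchronous operation statistics interface recited in claim 1, the following is a minimal C++ sketch of a structure whose constructor records the start time and whose destructor records the end time on leaving the scope. The names `SyncRecord` and `ScopedSyncTimer`, and the `std::vector` used as a sink, are hypothetical stand-ins for the target structure, whose layout the claims do not specify; `std::chrono::steady_clock` is an assumed time source. The asynchronous interface would instead create and record SDK events in its constructor and destructor; that variant is not shown because the claims do not name the SDK.

```cpp
#include <cassert>
#include <chrono>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record for one synchronous operation (not the patent's layout).
struct SyncRecord {
    std::string name;
    std::chrono::steady_clock::time_point start;
    std::chrono::steady_clock::time_point end;

    // Elapsed time between start and end, in milliseconds.
    double elapsed_ms() const {
        return std::chrono::duration<double, std::milli>(end - start).count();
    }
};

// Scoped timer: the constructor records the start time, the destructor
// records the end time and appends the finished record to the sink.
class ScopedSyncTimer {
public:
    ScopedSyncTimer(std::string name, std::vector<SyncRecord>* sink)
        : sink_(sink) {
        assert(sink_ != nullptr);
        record_.name = std::move(name);
        record_.start = std::chrono::steady_clock::now();
    }

    ~ScopedSyncTimer() {
        record_.end = std::chrono::steady_clock::now();
        sink_->push_back(std::move(record_));
    }

    // One timer measures exactly one scope, so copying is disabled.
    ScopedSyncTimer(const ScopedSyncTimer&) = delete;
    ScopedSyncTimer& operator=(const ScopedSyncTimer&) = delete;

private:
    std::vector<SyncRecord>* sink_;  // assumed to outlive the timer
    SyncRecord record_;
};
```

Under these assumptions, declaring `ScopedSyncTimer timer("copy_to_board", &records);` at the top of a block is enough to time that block: the record appears in `records` when control leaves the scope, including on early return.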

Description

Program performance analysis method, device, medium and product

Technical Field

The present invention relates to the field of computer technology, and in particular to a program performance analysis method, device, medium and product.

Background

In related technical solutions, program performance analysis generally relies on a performance analysis tool provided by the board manufacturer. Such a tool can count, over the whole run of the user program, the number of calls, the execution time and other information for interfaces such as the board runtime and the operator library, and the results are generally displayed in execution-time order by a visualization tool. The runtime and operator information counted by the tool lies within the board's software development kit (SDK); that is, the tool can only count calls to interfaces in the SDK provided for the board. Besides the board manufacturer's tools, some framework software provides its own performance analysis tools, to which specific performance statistics code has been added. Their usage scenarios are limited: they measure the performance of interfaces within the framework, and their results include only information such as interface call counts and logic processing time, so they cannot collect data that reflects board performance.

Disclosure of Invention

The invention aims to provide a program performance analysis method, device, medium and product that fit the actual execution scenario, so that the collected statistics are practical and flexible, reflect performance during execution more comprehensively, and enable accurate analysis of program performance.
To solve the above technical problem, the present invention provides a program performance analysis method, the method comprising: starting a user program, creating a sub-thread by means of the user program, and taking an idle target structure out of a structure array; collecting asynchronous task performance data and synchronous task performance data within a set code interval, and updating the asynchronous task performance data and the synchronous task performance data into the target structure; detecting the updated target structure data by means of the sub-thread, and calculating the average utilization of the board acceleration module over the period between the start and the end of an asynchronous operation, and the elapsed time from the start time to the end time of a synchronous operation; and, from the average utilization and the elapsed time calculated by the sub-thread, combined with the asynchronous task performance data and the synchronous task performance data, obtaining the average utilization of the board acceleration module and the average transmission bandwidth over the whole execution of the user program.

In one aspect, the program performance analysis method provided by the present invention further comprises: recording, by means of the sub-thread, the idle time between two adjacent task operations, and, based on the recorded idle times, collecting a set number of idle time entries as potential optimization intervals, wherein each idle time entry comprises the idle duration, the idle start time, the operation name and the operation serial number.
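
The aggregation steps described above can be sketched in C++ as follows. This is an illustrative reading, not the patented implementation: it assumes that the overall utilization is a time-weighted average (the per-operation products of duration and utilization are summed before dividing by the total operation time), that the bandwidth denominator is the sum of the elapsed times of the transfer operations, and that the "set number" of idle entries means the longest gaps. All type, field and function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical per-operation samples saved by the sub-thread.
struct AsyncSample {
    double duration_ms;      // time between start and end of one operation
    double avg_utilization;  // average utilization in that window, 0.0 to 1.0
};

struct TransferSample {
    double bytes;       // amount of data moved by one transfer operation
    double elapsed_ms;  // elapsed time of that synchronous transfer
};

struct IdleGap {
    double duration_ms;    // idle duration between two adjacent operations
    double start_ms;       // idle start time
    std::string op_name;   // operation name
    std::size_t op_index;  // operation serial number
};

// Time-weighted average utilization: sum(t_i * u_i) / sum(t_i).
inline double overall_utilization(const std::vector<AsyncSample>& samples) {
    double weighted = 0.0;
    double total_ms = 0.0;
    for (const AsyncSample& s : samples) {
        assert(s.duration_ms >= 0.0);
        weighted += s.duration_ms * s.avg_utilization;
        total_ms += s.duration_ms;
    }
    return total_ms > 0.0 ? weighted / total_ms : 0.0;
}

// Average bandwidth in bytes per second: total bytes / total elapsed time.
inline double average_bandwidth(const std::vector<TransferSample>& samples) {
    double total_bytes = 0.0;
    double total_ms = 0.0;
    for (const TransferSample& s : samples) {
        total_bytes += s.bytes;
        total_ms += s.elapsed_ms;
    }
    return total_ms > 0.0 ? total_bytes * 1000.0 / total_ms : 0.0;
}

// Returns the n longest idle gaps, longest first, as optimization candidates.
inline std::vector<IdleGap> longest_idle_gaps(std::vector<IdleGap> gaps,
                                              std::size_t n) {
    n = std::min(n, gaps.size());
    std::partial_sort(gaps.begin(),
                      gaps.begin() + static_cast<std::ptrdiff_t>(n),
                      gaps.end(),
                      [](const IdleGap& a, const IdleGap& b) {
                          return a.duration_ms > b.duration_ms;
                      });
    gaps.resize(n);
    return gaps;
}
```

For example, two operations lasting 100 ms at 0.5 utilization and 300 ms at 0.75 utilization give (100 × 0.5 + 300 × 0.75) / 400 = 0.6875 under this reading, so the longer operation dominates the overall figure.
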
In another aspect, in the program performance analysis method provided by the invention, the asynchronous task performance data comprise an asynchronous execution time, and collecting the asynchronous execution time comprises: constructing an asynchronous operation statistics interface using a structure comprising a first constructor and a first destructor; and creating and recording events in the first constructor and the first destructor respectively by means of the asynchronous operation statistics interface, so that the asynchronous execution time is calculated through an interface provided by a software development kit.

In another aspect, in the program performance analysis method provided by the invention, the synchronous task performance data comprise a synchronous execution time, and collecting the synchronous execution time comprises: constructing a synchronous operation statistics interface using a structure comprising a second constructor and a second destructor; by means of the synchronous operation statistics interface, calling the second constructor to record the start time of the measured content, and calling the second destructor to record the end time of the measured content on leaving the scope of the measured content; and calculating the synchronous execution time from the start time and the end time of the measured content.

In another aspect, in the program performance analysis method provided by the present