CN-119621511-B - Performance profiling method, system and storage medium for distributed service

Abstract

The application relates to a performance profiling method, system, and storage medium for a distributed service. The distributed service comprises a plurality of working nodes, and each working node is deployed with a local profiler. The performance profiling method comprises: receiving the stack trace data collected by each local profiler for its working node; aggregating the stack trace data to obtain aggregated stack data; generating a target flame graph from the aggregated stack data; and generating a target performance profiling result for the distributed service based on the target flame graph. The application addresses the problem that performance profiling is difficult to adapt to large-scale distributed services.

Inventors

  • YANG DINGYU
  • ZHANG DONGXIANG
  • XIE ZHONGLE
  • CHEN KE
  • SHOU LIDAN
  • LI HUAN

Assignees

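  • Hangzhou High-Tech Zone (Binjiang) Institute of Blockchain and Data Security (杭州高新区(滨江)区块链与数据安全研究院)
  • Zhejiang University (浙江大学)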

Dates

Publication Date
20260512
Application Date
20241204

Claims (9)

  1. A performance profiling method for a distributed service, characterized in that the method is applied to a controller, wherein the distributed service comprises a plurality of working nodes, each working node is deployed with a local profiler, and the local profiler is connected to the controller, the method comprising: receiving stack trace data collected by each local profiler for its working node, and aggregating the stack trace data to obtain aggregated stack data; generating a target flame graph from the aggregated stack data, and generating a target performance profiling result for the distributed service based on the target flame graph; the method further comprising: acquiring first stack trace data of the local profiler at a first sampling frequency, generating a first target flame graph from the first stack trace data, and generating a first performance profiling result based on the first target flame graph, wherein the first sampling frequency is higher than a preset sampling frequency; acquiring second stack trace data of the local profiler at a second sampling frequency, generating a second target flame graph from the second stack trace data, and generating a second performance profiling result based on the second target flame graph; performing a similarity calculation based on the first performance profiling result and the second performance profiling result to obtain a similarity result; and, when the similarity result is detected to be smaller than a preset similarity threshold, taking the first performance profiling result as the target performance profiling result.
  2. The performance profiling method of claim 1, wherein the receiving stack trace data collected by each local profiler for its working node comprises: receiving an initial flame graph sent by each local profiler, wherein the initial flame graph is generated by the local profiler from the stack trace data; and parsing the initial flame graph to obtain the stack trace data.
  3. The performance profiling method of claim 2, wherein the local profiler comprises an agent and an acquisition kernel, and the initial flame graph is generated by the acquisition kernel collecting the stack trace data and the agent processing the stack trace data.
  4. The performance profiling method of claim 3, wherein there are a plurality of acquisition kernels, and the receiving the initial flame graph sent by each local profiler comprises: detecting operation information of the distributed service, and determining a corresponding target kernel from among the acquisition kernels based on the operation information, wherein the agent processes the stack trace data collected by the target kernel and generates the initial flame graph.
  5. A performance profiling method for a distributed service, characterized in that the method is applied to a local profiler, wherein the distributed service comprises a plurality of working nodes and each working node is deployed with the local profiler, the method comprising: collecting stack trace data for the working node; sending the stack trace data to a controller, wherein the controller aggregates the stack trace data to obtain aggregated stack data, generates a target flame graph from the aggregated stack data, and generates a target performance profiling result for the distributed service based on the target flame graph; wherein the controller acquires first stack trace data of the local profiler at a first sampling frequency, generates a first target flame graph from the first stack trace data, and generates a first performance profiling result based on the first target flame graph, the first sampling frequency being higher than a preset sampling frequency; the controller acquires second stack trace data of the local profiler at a second sampling frequency, generates a second target flame graph from the second stack trace data, and generates a second performance profiling result based on the second target flame graph; the controller performs a similarity calculation based on the first performance profiling result and the second performance profiling result to obtain a similarity result; and, when the similarity result is detected to be smaller than a preset similarity threshold, the first performance profiling result is taken as the target performance profiling result.
  6. The performance profiling method of claim 5, wherein the collecting stack trace data for the working node comprises: performing stack trace sampling on the distributed service using a plurality of threads, and pruning the threads based on the number of thread samples to obtain target threads among the threads; and acquiring the stack trace data sampled from the target threads.
  7. A performance profiling system for a distributed service, wherein the distributed service comprises a plurality of working nodes, the performance profiling system comprising: a local profiler deployed on each corresponding working node and configured to collect stack trace data on the working node; and a controller configured to receive the stack trace data sent by each local profiler and to aggregate the stack trace data to obtain aggregated stack data; the controller is further configured to generate a target flame graph from the aggregated stack data, and to generate a target performance profiling result for the distributed service based on the target flame graph; the controller is further configured to acquire first stack trace data of the local profiler at a first sampling frequency, generate a first target flame graph from the first stack trace data, and generate a first performance profiling result based on the first target flame graph, wherein the first sampling frequency is higher than a preset sampling frequency; the controller is further configured to reduce the first sampling frequency based on the first performance profiling result to obtain a second sampling frequency, acquire second stack trace data of the local profiler at the second sampling frequency, generate a second target flame graph from the second stack trace data, generate a second performance profiling result based on the second target flame graph, perform a similarity calculation based on the first performance profiling result and the second performance profiling result to obtain a similarity result, and take the first performance profiling result as the target performance profiling result if the similarity result is detected to be smaller than a preset similarity threshold.
  8. The performance profiling system of claim 7, further comprising: a monitoring server configured to receive hotspot function indexes sent by the local profiler, wherein the hotspot function indexes are generated by the local profiler based on the stack trace data; the monitoring server is further configured to generate tracking storage data for each hotspot function in the distributed service based on the hotspot function indexes.
  9. A storage medium, characterized in that a computer program is stored in the storage medium, wherein the computer program is configured to execute, when run, the performance profiling method for a distributed service according to any one of claims 1-6.
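
As a purely illustrative reading of the thread pruning recited in claim 6, the following minimal Python sketch keeps only the threads that contribute a meaningful share of all collected samples. The claims do not state the pruning criterion, so the share-based rule, the function name `prune_threads`, the `min_share` parameter, and the sample data are all assumptions made for this example.

```python
def prune_threads(samples_by_thread, min_share=0.05):
    """Keep threads whose sample count is at least `min_share` of all samples.

    `samples_by_thread` maps a thread name to the list of stack traces
    sampled from it. The share-based criterion is an assumption; the claim
    only says that pruning is based on the number of thread samples.
    """
    total = sum(len(stacks) for stacks in samples_by_thread.values())
    if total == 0:
        return {}
    return {
        thread: stacks
        for thread, stacks in samples_by_thread.items()
        if len(stacks) / total >= min_share
    }


# Hypothetical samples: two busy worker threads and one nearly idle thread.
samples_by_thread = {
    "worker-1": ["main;handle;db_query"] * 60,
    "worker-2": ["main;handle;parse"] * 38,
    "idle-timer": ["main;sleep"] * 2,
}
target = prune_threads(samples_by_thread)
```

In this sketch only the stack traces of the remaining target threads would be passed on, which reduces the volume of data a local profiler has to process and send.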

Description

Performance profiling method, system and storage medium for distributed service

Technical Field

The present application relates to the field of computers, and in particular to a performance profiling method, system, and storage medium for a distributed service.

Background

Performance profiling refers to analyzing the runtime behavior of a program by recording hardware or software metrics, such as function call frequency, execution time, and resource usage, in order to accurately locate performance bottlenecks, memory leaks, or other efficiency issues. Existing performance profiling tools can only profile a single machine and cannot cope with diverse, distributed workloads.

At present, no effective solution has been proposed in the related art for the problem that performance profiling is difficult to adapt to large-scale distributed services.

Disclosure of Invention

Embodiments of the present application provide a performance profiling method, system, and storage medium for a distributed service, which at least solve the problem in the related art that performance profiling is difficult to adapt to large-scale distributed services.

In a first aspect, an embodiment of the present application provides a performance profiling method for a distributed service, applied to a controller, wherein the distributed service includes a plurality of working nodes, each working node is deployed with a local profiler, and the local profiler is connected to the controller. The method includes: receiving stack trace data collected by each local profiler for its working node, and aggregating the stack trace data to obtain aggregated stack data; and generating a target flame graph from the aggregated stack data, and generating a target performance profiling result for the distributed service based on the target flame graph.

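The controller-side aggregation described in the first aspect can be sketched as follows. This is a minimal illustration, not the implementation of the application: it assumes each local profiler reports stacks in the semicolon-separated "folded" form used by common flame graph tooling, together with a sample count, and the function names, node names, and sample data are hypothetical.

```python
from collections import Counter


def aggregate_stacks(per_node_samples):
    """Merge per-node folded-stack counts into one service-wide Counter."""
    total = Counter()
    for node_samples in per_node_samples.values():
        total.update(node_samples)  # adds counts for identical stacks
    return total


def build_flame_tree(aggregated):
    """Convert folded stacks ("a;b;c" -> count) into a nested flame graph tree."""
    root = {"name": "all", "value": 0, "children": {}}
    for stack, count in aggregated.items():
        root["value"] += count
        node = root
        for frame in stack.split(";"):
            child = node["children"].setdefault(
                frame, {"name": frame, "value": 0, "children": {}}
            )
            child["value"] += count
            node = child
    return root


def hottest_leaf_frames(aggregated, top_n=3):
    """Rank leaf frames (the functions actually on-CPU) by sample count."""
    leaves = Counter()
    for stack, count in aggregated.items():
        leaves[stack.rsplit(";", 1)[-1]] += count
    return leaves.most_common(top_n)


# Hypothetical stack trace data reported by two working nodes.
samples = {
    "node-1": {"main;handle;parse": 40, "main;handle;db_query": 60},
    "node-2": {"main;handle;db_query": 90, "main;gc": 10},
}
aggregated = aggregate_stacks(samples)
tree = build_flame_tree(aggregated)
top = hottest_leaf_frames(aggregated)
```

Here the width of each frame in the flame graph corresponds to its `value`, and the ranked leaf frames stand in for a very simple profiling result; the application itself does not specify how the result is derived from the flame graph.
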
In some of these embodiments, the method further includes: acquiring first stack trace data of the local profiler at a first sampling frequency, generating a first target flame graph from the first stack trace data, and generating a first performance profiling result based on the first target flame graph, wherein the first sampling frequency is higher than a preset sampling frequency; and acquiring second stack trace data of the local profiler at a second sampling frequency, generating a second target flame graph from the second stack trace data, and generating a second performance profiling result based on the second target flame graph.

In some of these embodiments, after the generating the second performance profiling result, the method further includes: performing a similarity calculation based on the first performance profiling result and the second performance profiling result to obtain a similarity result; and, when the similarity result is detected to be smaller than a preset similarity threshold, taking the first performance profiling result as the target performance profiling result.

In some of these embodiments, the receiving stack trace data collected by each local profiler for its working node includes: receiving an initial flame graph sent by each local profiler, wherein the initial flame graph is generated by the local profiler from the stack trace data; and parsing the initial flame graph to obtain the stack trace data.

In some embodiments, the local profiler further includes an agent and an acquisition kernel, wherein the initial flame graph is generated by the acquisition kernel collecting the stack trace data and the agent processing the stack trace data.

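The comparison between the result obtained at the first sampling frequency and the result obtained at the second sampling frequency can be illustrated with a short sketch. The text does not specify the similarity measure, so cosine similarity over per-function sample shares is an assumption, as are the function names, the threshold value of 0.9, and the sample profiles. The text only states what happens when the similarity is below the threshold; keeping the second result otherwise is also an assumption.

```python
import math


def function_shares(profile):
    """Normalize per-function sample counts so that profiles taken at
    different sampling frequencies are comparable."""
    total = sum(profile.values())
    return {fn: n / total for fn, n in profile.items()} if total else {}


def cosine_similarity(a, b):
    """Cosine similarity between two sparse share vectors (assumed metric)."""
    dot = sum(a[k] * b.get(k, 0.0) for k in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def choose_target_result(first, second, threshold=0.9):
    """Return (label, profile, similarity).

    Below the threshold, the first (higher-frequency) result is used as the
    target, as described in the text. Otherwise the second result is kept,
    which is an assumption of this sketch.
    """
    sim = cosine_similarity(function_shares(first), function_shares(second))
    if sim < threshold:
        return "first", first, sim
    return "second", second, sim


# Hypothetical per-function sample counts.
high = {"db_query": 1500, "parse": 400, "gc": 100}    # first sampling frequency
low_ok = {"db_query": 148, "parse": 42, "gc": 10}     # lower frequency, same shape
low_bad = {"db_query": 20, "parse": 5, "gc": 175}     # lower frequency, distorted

label_ok, _, _ = choose_target_result(high, low_ok)
label_bad, _, _ = choose_target_result(high, low_bad)
```

With these sample profiles, `low_ok` preserves the distribution of the high-frequency profile, while `low_bad` diverges from it, so the sketch falls back to the first result in the second case.
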
In some embodiments, there are a plurality of acquisition kernels, and the receiving the initial flame graph sent by each local profiler includes: detecting operation information of the distributed service, and determining a corresponding target kernel from among the acquisition kernels based on the operation information, wherein the agent processes the stack trace data collected by the target kernel and generates the initial flame graph.

In a third aspect, the present application provides a performance profiling method for a distributed service, applied to a local profiler, wherein the distributed service includes a plurality of working nodes and each working node is deployed with a local profiler. The method includes: collecting stack trace data for the working node; and sending the stack trace data to a controller, wherein the controller aggregates the stack trace data to obtain aggregated stack data, generates a target flame graph from the aggregated stack data, and generates a target performance profiling result for the distributed service based on the target flame graph.

In some of these embodiments, the collecting stack trace data for the working node includes: Performing stack tracking sa