CN-121070750-B - Performance counter, method for the same, electronic device and storage medium

Abstract

The present disclosure provides a performance counter, a method for the performance counter, an electronic device, and a storage medium. The method for the performance counter of an artificial intelligence processor comprises: receiving count period configuration information to configure a count period of events of the performance counter for a work module of a processor core of the artificial intelligence processor; configuring a count bit width for counting the number of events based on the count period; counting the number of events based on the count bit width; and outputting the count value every count period and resetting the count value. The method lets a user obtain event count values over a desired count period with a small data volume. This enables fine-grained sampling and analysis with the performance counter and potentially low bandwidth occupation, thereby reducing the impact on the operation of the artificial intelligence processor.

Inventors

  • Request for anonymity
  • Request for anonymity
  • Request for anonymity
  • Request for anonymity

Assignees

  • 上海壁仞科技股份有限公司

Dates

Publication Date
20260508
Application Date
20251104

Claims (20)

  1. A method for a performance counter of an artificial intelligence processor, the method comprising: receiving count period configuration information to configure a count period of events of the performance counter for a work module of a processor core of the artificial intelligence processor; configuring a count bit width for counting the number of occurrences of the event based on the count period, wherein the count bit width is the number of bits used to represent the count; counting the number of occurrences of the event based on the count bit width; and outputting the count value every count period, and resetting the count value.
  2. The method according to claim 1, wherein the method further comprises: receiving compression mode configuration information to configure a compression mode of the count value; and processing the count value based on the compression mode, and sending the processed count value to a memory of the artificial intelligence processor.
  3. The method of claim 2, wherein processing the count value based on the compression mode comprises: in response to the performance counter being configured in a compressed mode, encoding the count value into a format having a smaller bit width as the processed count value.
  4. The method according to claim 2, wherein the method further comprises: in response to the performance counter being configured in a non-compressed mode, organizing the count value into a data packet in a predetermined format as the processed count value.
  5. The method of claim 2, wherein the memory comprises a shared memory and a global memory of the artificial intelligence processor that are provided separately for the performance counter and the processor core, and sending the processed count value to the memory of the artificial intelligence processor comprises: writing the processed count value into the shared memory in response to the processor core being in an operating state; and writing the processed count value from the shared memory to the global memory in response to the processor core being in a work completion state.
  6. The method of claim 1, wherein counting the number of occurrences of the event based on the count bit width comprises: counting input signals from the work module based on the count bit width, wherein the input signals indicate whether the event occurs.
  7. The method of claim 6, wherein the method further comprises: receiving signal selection configuration information to configure the performance counter to select a plurality of input signals to be counted at the same time, and wherein counting input signals from the work module based on the count bit width comprises: counting the plurality of input signals respectively based on a plurality of count bit widths.
  8. The method of claim 7, wherein the plurality of input signals are associated with respective characterized events.
  9. The method of claim 7, wherein counting the plurality of input signals respectively based on a plurality of count bit widths comprises: counting at a first count bit width for each of a first set of the plurality of input signals; and counting at a second count bit width for each of a second set of the plurality of input signals.
  10. The method of claim 8, wherein counting the plurality of input signals respectively based on a plurality of count bit widths comprises: counting with the same count bit width for each of the plurality of input signals.
  11. The method of claim 1, wherein the count period does not cause overflow of the counting cells having the count bit width.
  12. The method of claim 11, wherein configuring a count bit width for counting the number of occurrences of the event based on the count period comprises: configuring the count bit width based on the count period according to the following configuration rules: if 0 < count period ≤ 60, count bit width = 6 bits; if 60 < count period ≤ 250, count bit width = 8 bits; if 250 < count period ≤ 960, count bit width = 10 bits; if 960 < count period ≤ 16000, count bit width = 14 bits; if 16000 < count period, count bit width = 32 bits.
  13. The method according to any one of claims 1-12, further comprising: counting the number of programs currently being executed by the work module to obtain a program count value; and in response to the program count value being 0, putting the performance counter to sleep, or in response to the program count value not being 0, starting the performance counter.
  14. A performance counter for an artificial intelligence processor, the performance counter comprising: a configuration register configured to receive count period configuration information to configure a count period of events of the performance counter for a work module of a processor core of the artificial intelligence processor; a control unit configured to configure a count bit width for counting the number of occurrences of the event based on the count period, wherein the count bit width is the number of bits used to represent the count; and a counting unit configured to: count the number of occurrences of the event based on the count bit width, and output the count value every count period and reset the count value.
  15. The performance counter of claim 14, wherein the configuration register is further configured to: receive compression mode configuration information to configure a compression mode of the count value, and the performance counter further comprises: a compression unit configured to process the count value based on the compression mode and send the processed count value to a memory of the artificial intelligence processor.
  16. The performance counter of claim 15, wherein the memory comprises a shared memory and a global memory of the artificial intelligence processor that are provided separately for the performance counter and the processor core, and the performance counter further comprises: a storage unit configured to: receive the count value from the compression unit, and write the processed count value into the shared memory in response to the processor core being in an operating state; and the shared memory is configured to: write the processed count value from the shared memory to the global memory in response to the processor core being in a work completion state.
  17. The performance counter of claim 14, wherein, to count the number of occurrences of the event based on the count bit width, the counting unit is configured to: count input signals from the work module based on the count bit width, wherein the input signals indicate whether the event occurred; the configuration register is further configured to: receive signal selection configuration information to configure the performance counter to select a plurality of input signals to be counted at the same time; and, to count the input signals from the work module based on the count bit width, the counting unit is configured to: count the plurality of input signals respectively based on a plurality of count bit widths.
  18. The performance counter of claim 17, wherein, to count the plurality of input signals respectively based on a plurality of count bit widths, the counting unit is configured to: count at a first count bit width for each of a first set of the plurality of input signals and at a second count bit width for each of a second set of the plurality of input signals; or count with the same count bit width for each of the plurality of input signals.
  19. The performance counter of claim 14, wherein the performance counter further comprises: a program counting unit configured to: count the number of programs currently being executed by the work module to obtain a program count value; and in response to the program count value being 0, put the performance counter to sleep, or in response to the program count value not being 0, start the performance counter.
  20. An artificial intelligence processor, the artificial intelligence processor comprising: the performance counter of any one of claims 14-19.

Description

Performance counter, method for the same, electronic device and storage medium

Technical Field

Embodiments of the present disclosure relate to the field of artificial intelligence chips, and more particularly, to a performance counter, a method for the same, an electronic device, and a storage medium.

Background

A performance counter is a tool for collecting and recording the running-state data of a system or component in real time. Its core function is to provide a quantitative basis for performance analysis, fault detection, and optimization, and it is widely used in processors, servers, operating systems, application programs, and similar settings. Performance counters are typically integrated within a processor and used to count processor and related activities. For example, these activities may include instruction-related operations, memory-related cache activity, and other hardware activities.

Disclosure of Invention

At least one embodiment of the present disclosure provides a method for a performance counter of an artificial intelligence processor, including: receiving count period configuration information to configure a count period of events of the performance counter for a work module of a processor core of the artificial intelligence processor; configuring a count bit width for counting the number of events based on the count period; counting the number of events based on the count bit width; and outputting the count value every count period and resetting the count value.
For example, in some embodiments, the method further includes receiving compression mode configuration information to configure a compression mode of the count value, processing the count value based on the compression mode, and sending the processed count value to a memory of the artificial intelligence processor.
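
The basic method described above (count events, output the count value once per count period, then reset) can be sketched in software as follows. This is a minimal illustration under stated assumptions, not the disclosed hardware: the generator name `count_events` is hypothetical, the input is assumed to be one sample of the event signal per cycle, and the wrap-around on overflow models a fixed-width register rather than any behavior stated in the source.

```python
from typing import Iterable, Iterator


def count_events(signal: Iterable[int], count_period: int,
                 bit_width: int) -> Iterator[int]:
    """Yield one count value per count period, resetting after each output.

    signal: assumed to hold one sample per cycle, truthy when the event occurred.
    A trailing partial period produces no output in this sketch.
    """
    if count_period <= 0 or bit_width <= 0:
        raise ValueError("count_period and bit_width must be positive")
    mask = (1 << bit_width) - 1  # largest value the counter can represent
    count = 0
    elapsed = 0
    for sample in signal:
        if sample:
            # Wrap like a fixed-width register (an assumption of this sketch).
            count = (count + 1) & mask
        elapsed += 1
        if elapsed == count_period:
            yield count   # output the count value for this period
            count = 0     # reset the count value
            elapsed = 0
```

For instance, `list(count_events([1, 0, 1, 1, 0, 0, 1, 0, 1, 1], 4, 6))` gives `[3, 1]`: two full periods of four samples are reported and the two leftover samples are dropped.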
For example, in some embodiments, processing the count value based on the compression mode includes, in response to the performance counter being configured in a compressed mode, encoding the count value into a format with a smaller bit width as the processed count value.
For example, in some embodiments, processing the count value based on the compression mode includes, in response to the performance counter being configured in a non-compressed mode, organizing the count value into a data packet in a predetermined format as the processed count value.
For example, in some embodiments, the memory includes a shared memory and a global memory of the artificial intelligence processor that are provided separately for the performance counter and the processor core, and sending the processed count value to the memory of the artificial intelligence processor includes writing the processed count value to the shared memory in response to the processor core being in an operating state, and writing the processed count value from the shared memory to the global memory in response to the processor core being in a work completion state.
For example, in some embodiments, counting the number of events based on the count bit width includes counting input signals from the work module based on the count bit width, wherein the input signals indicate whether an event occurred.
For example, in some embodiments, the method further includes receiving signal selection configuration information to configure the performance counter to select a plurality of input signals to be counted at the same time, and counting the input signals from the work module based on the count bit width includes counting the plurality of input signals respectively based on a plurality of count bit widths.
For example, in some embodiments, the plurality of input signals are associated with respectively characterized events.
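
The two-stage storage path described above, where processed count values go to shared memory while the processor core is operating and are moved to global memory once the core reaches its work completion state (as recited in claim 5), can be modeled roughly as below. This is a sketch under assumptions: the class `CounterStorage` and its methods are hypothetical, and both memories are modeled as plain Python lists rather than hardware buffers.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CounterStorage:
    """Toy model of the shared-memory and global-memory staging."""
    shared_memory: List[int] = field(default_factory=list)
    global_memory: List[int] = field(default_factory=list)

    def store(self, processed_count: int) -> None:
        # Processor core in the operating state: stage the value in shared memory.
        self.shared_memory.append(processed_count)

    def complete(self) -> None:
        # Processor core in the work completion state: move staged values
        # from shared memory to global memory, then clear the staging area.
        self.global_memory.extend(self.shared_memory)
        self.shared_memory.clear()
```

A typical sequence would call `store` once per count period while a program runs and `complete` once when the core finishes, so that global memory is written only after the work is done.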
For example, in some embodiments, counting the plurality of input signals respectively based on the plurality of count bit widths includes counting at a first count bit width for each of a first set of input signals of the plurality of input signals and counting at a second count bit width for each of a second set of input signals of the plurality of input signals.
For example, in some embodiments, counting the plurality of input signals respectively based on the plurality of count bit widths includes counting with the same count bit width for each of the plurality of input signals.
For example, in some embodiments, the count period does not cause overflow of the counting cells having the count bit width.
For example, in some embodiments, configuring a count bit width for counting the number of events based on the count period includes configuring the count bit width based on the count period according to the following configuration rule: count bit width = 6 bits if 0 < count period is less than or equal to 60; count bit width = 8 bits if 60 < count period is less than or equal to 250; count bit width = 10 bits if 250 < count period is less than or equal to 960; count bit width = 14 bits if 960 < count period is les