CN-122019418-A - Method and device for processing data, electronic equipment and non-transitory computer readable storage medium

CN 122019418 A

Abstract

Embodiments of the present disclosure provide a method and apparatus for processing data, an electronic device, and a non-transitory computer-readable storage medium, used for concurrently accessing service data and debug data in the last-level cache of an artificial intelligence processor. By isolating service data from debug data at the hardware level within the last-level cache, the read/write performance of service data in the cache is unaffected throughout the period in which the artificial intelligence chip has its debug function enabled and continuously generates large amounts of debug data, improving developers' debugging efficiency.

Inventors

  • Request for anonymity
  • Request for anonymity

Assignees

  • 上海壁仞科技股份有限公司

Dates

Publication Date
2026-05-12
Application Date
2026-04-16

Claims (20)

  1. A method of processing data, for concurrently accessing service data and debug data in a last-level cache of an artificial intelligence processor, the method comprising: receiving a service request through a first port of the last-level cache, processing the service request through a first hardware path connected to the first port, and routing service data in the service request to a first storage space of a main storage area; and receiving a debug write request through a second port of the last-level cache, processing the debug write request through a second hardware path connected to the second port, and routing debug data in the debug write request to a second storage space of the main storage area; wherein the second hardware path is independent of the first hardware path, and the second storage space and the first storage space correspond to different physical address ranges in the main storage area.
  2. The method of claim 1, wherein processing the service request through the first hardware path connected to the first port comprises: storing the service request in a service data access queue on the first hardware path; determining, by a hit detection unit on the first hardware path, whether data corresponding to the service request hits in a cache array; in response to a hit of the data corresponding to the service request in the cache array, reading or writing the data in a cache line of the cache array by an execution unit on the first hardware path; and in response to a miss of the data corresponding to the service request in the cache array, retrieving a corresponding cache line from the main storage area by the execution unit.
  3. The method of claim 2, wherein processing the debug write request through the second hardware path connected to the second port comprises: storing the debug write request in a debug data access queue on the second hardware path; mapping, by an address calculation unit on the second hardware path, a debug data packet sequence number in the debug write request to a target memory address in the second storage space; and sending, based on the target memory address, the debug data to the second storage space via a bypass of the cache array of the last-level cache.
  4. The method of claim 3, further comprising: allocating, by a first request management unit in the execution unit, a first request number on a bus for the first hardware path, wherein the first request management unit is configured with a request number pool of a first size; and allocating, by a second request management unit on the second hardware path, a second request number on the bus for the second hardware path, wherein the second request management unit is configured with a request number pool of a second size, the first size being larger than the second size.
  5. The method of claim 4, wherein sending the debug data to the second storage space via the bypass of the cache array of the last-level cache comprises: in response to the target memory address being obtained and the second request number being allocated, sending the debug data into a debug request queue, to be sent to the second storage space through the debug request queue; and in response to the debug request queue receiving a write response to the debug data returned by the second storage space, releasing the second request number in the second request management unit.
  6. The method of claim 1, further comprising: in response to the service request and the debug write request having an access conflict on the main storage area, preferentially sending the service request to the main storage area through an arbitration unit.
  7. The method of claim 1, further comprising: receiving a read request for the debug data through the first port; and reading corresponding debug data from the second storage space through the first hardware path and returning it.
  8. The method of claim 1, wherein the second storage space is a storage space of preset capacity in the main storage area.
  9. The method of claim 1, wherein, during execution of a service program by the artificial intelligence processor, receiving the service request and receiving the debug write request are performed in parallel.
  10. An apparatus for processing data, for concurrently accessing service data and debug data in a last-level cache of an artificial intelligence processor, the apparatus comprising: a first port configured to receive a service request; a first hardware path, connected to the first port, configured to route service data in the service request to a first storage space of a main storage area; a second port configured to receive a debug write request; and a second hardware path, connected to the second port and independent of the first hardware path, configured to route debug data in the debug write request to a second storage space of the main storage area; wherein the second storage space and the first storage space correspond to different physical address ranges in the main storage area.
  11. The apparatus of claim 10, wherein the first hardware path comprises: a service data access queue configured to store the service request; a hit detection unit configured to determine whether data corresponding to the service request hits in a cache array; and an execution unit configured to: in response to a hit of the data corresponding to the service request in the cache array, read or write the data in a cache line of the cache array; and in response to a miss of the data corresponding to the service request in the cache array, retrieve a corresponding cache line from the main storage area.
  12. The apparatus of claim 11, wherein the second hardware path comprises: a debug data access queue configured to store the debug write request; and an address calculation unit configured to map a debug data packet sequence number in the debug write request to a target memory address in the second storage space; wherein the debug data is sent, based on the target memory address, to the second storage space via a bypass of the cache array of the last-level cache.
  13. The apparatus of claim 12, wherein the execution unit further comprises a first request management unit configured to allocate a first request number on a bus for the first hardware path, the first request management unit being configured with a request number pool of a first size; and the second hardware path further comprises a second request management unit configured to allocate a second request number on the bus for the second hardware path, the second request management unit being configured with a request number pool of a second size, the first size being larger than the second size.
  14. The apparatus of claim 13, wherein the second hardware path further comprises a debug request queue configured to: in response to the target memory address being obtained and the second request number being allocated, receive the debug write request and forward it to the second storage space; and in response to receiving a write response to the debug data returned by the second storage space, request release of the second request number in the second request management unit.
  15. The apparatus of claim 10, further comprising: an arbitration unit configured to, in response to the service request and the debug write request having an access conflict on the main storage area, preferentially send the service request to the main storage area.
  16. The apparatus of claim 10, wherein the first port is further configured to receive a read request for the debug data, and the first hardware path is further configured to read the corresponding debug data from the second storage space and return it.
  17. The apparatus of claim 10, wherein the second storage space is a storage space of preset capacity in the main storage area.
  18. The apparatus of claim 10, wherein receiving a service request via the first port and receiving a debug write request via the second port are configured to occur in parallel during execution of a service program by the artificial intelligence processor.
  19. An electronic device, comprising: a main storage area; and an artificial intelligence processor connected to the main storage area, the artificial intelligence processor comprising a last-level cache, the last-level cache comprising the apparatus of any one of claims 10 to 18.
  20. A non-transitory computer-readable storage medium having stored thereon computer instructions which, when executed by a processor, implement the method of any one of claims 1 to 9.
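As an illustration of the request-number scheme described in claims 4, 5, 13, and 14, the following minimal Python sketch models a fixed pool of bus request numbers per hardware path. This is purely illustrative: the claims describe hardware, and all class names, method names, and the concrete pool sizes (16 and 4) are assumptions; the claims only require that the service-path pool be larger than the debug-path pool.

```python
class RequestNumberPool:
    """Hypothetical software model of a fixed-size pool of bus request
    numbers: a path stalls when its pool is exhausted."""

    def __init__(self, size):
        self.free = list(range(size))  # all request numbers start free

    def allocate(self):
        # Returns a request number, or None when the path must wait
        # for an outstanding request to complete.
        return self.free.pop() if self.free else None

    def release(self, req_no):
        # Called when the write response for this request number returns
        # from the storage area (cf. claims 5 and 14).
        self.free.append(req_no)


# Per claims 4 and 13, the first (service) pool is larger than the
# second (debug) pool; the sizes below are arbitrary assumptions.
service_pool = RequestNumberPool(16)
debug_pool = RequestNumberPool(4)
```

Giving the debug path a smaller pool bounds how many debug writes can be in flight on the bus at once, which is one plausible reading of how the claims keep debug traffic from crowding out service traffic.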

Description

Method and device for processing data, electronic equipment and non-transitory computer readable storage medium

Technical Field

The present disclosure relates to the field of artificial intelligence chips, and more particularly to a method of processing data, an apparatus for processing data, an electronic device, and a non-transitory computer-readable storage medium.

Background

The rapid development of artificial intelligence places higher demands on computing power, promoting artificial intelligence chips dedicated to accelerating artificial intelligence workloads. During operation of an artificial intelligence processor, besides processing normal service data computation tasks, software-hardware cooperation is needed to carry out debugging, verification, and diagnosis of the chip. Conventional debug schemes typically have hardware write performance counter data from across the chip to main memory, and have software read the main memory and parse the performance data to obtain chip operating state information. However, the last-level cache, as the last cache level between the processor cores and main memory, receives performance data from other blocks while also generating performance data itself. These performance data are written indiscriminately into the last-level cache, so that the performance data themselves occupy the capacity and bandwidth of the last-level cache, degrading the cache's handling of service data. This not only reduces the execution efficiency of the service program, but also leaves the collected performance data subject to interference, so that they lose their accurate reference value.
In addition, even if the above problem is alleviated by reducing the bandwidth occupied by the performance data and compressing it as far as possible, the performance data must usually still be written to memory through the regular path of the last-level cache. Such a scheme inevitably occupies hardware resources meant for caching and processing service data, severely limits the volume and collection precision of performance data, and cannot meet the increasingly complex debugging requirements of artificial intelligence chips. Improvements to the data processing scheme of the last-level cache are therefore needed.

Disclosure of Invention

Embodiments of the present disclosure provide a method of processing data, an apparatus for processing data, an electronic device, and a non-transitory computer-readable storage medium. An embodiment of the disclosure provides a method of processing data for concurrently accessing service data and debug data in a last-level cache of an artificial intelligence processor, comprising: receiving a service request through a first port of the last-level cache, processing the service request through a first hardware path connected to the first port, and routing service data in the service request to a first storage space of a main storage area; and receiving a debug write request through a second port of the last-level cache, processing the debug write request through a second hardware path connected to the second port, and routing debug data in the debug write request to a second storage space of the main storage area; wherein the second hardware path is independent of the first hardware path, and the second storage space and the first storage space correspond to different physical address ranges in the main storage area.
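The two-path scheme summarized above can be sketched as a small software model: service writes pass through the cache array, debug writes bypass it into a reserved region of main memory addressed by packet sequence number, and arbitration favors the service path on conflict. This is a minimal illustrative sketch only; the disclosure describes hardware paths, and every name, address, and size below (the class, `DEBUG_BASE`, entry size, capacity) is an assumption, not part of the patent.

```python
from collections import deque

DEBUG_BASE = 0x8000       # start of the reserved debug region (assumed)
DEBUG_ENTRY_SIZE = 64     # bytes per debug packet slot (assumed)


class LastLevelCacheModel:
    """Hypothetical software model of the dual-path last-level cache."""

    def __init__(self, debug_capacity=256):
        self.main_memory = {}         # physical address -> data
        self.cache_array = {}         # address -> data (service path only)
        self.service_queue = deque()  # first hardware path: access queue
        self.debug_queue = deque()    # second hardware path: access queue
        self.debug_capacity = debug_capacity

    def service_write(self, addr, data):
        # First port: service requests go through the cache array.
        self.service_queue.append((addr, data))

    def debug_write(self, seq_no, data):
        # Second port: debug writes bypass the cache array entirely.
        self.debug_queue.append((seq_no, data))

    def _debug_addr(self, seq_no):
        # Address calculation unit: map a packet sequence number to a
        # slot in the fixed-capacity debug region (wrapping when full).
        return DEBUG_BASE + (seq_no % self.debug_capacity) * DEBUG_ENTRY_SIZE

    def step(self):
        # Arbitration: on a conflict for main memory, service wins.
        if self.service_queue:
            addr, data = self.service_queue.popleft()
            self.cache_array[addr] = data   # update the cache line
            self.main_memory[addr] = data   # write-through, for simplicity
        elif self.debug_queue:
            seq_no, data = self.debug_queue.popleft()
            self.main_memory[self._debug_addr(seq_no)] = data
```

In this sketch the debug data never enters `cache_array`, mirroring the claimed property that debug traffic consumes no cache capacity and leaves service-data read/write performance unaffected.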
An embodiment of the disclosure provides an apparatus for processing data, for concurrently accessing service data and debug data in a last-level cache of an artificial intelligence processor, the apparatus comprising: a first port configured to receive a service request; a first hardware path connected to the first port and configured to route service data in the service request to a first storage space of a main storage area; a second port configured to receive a debug write request; and a second hardware path connected to the second port, independent of the first hardware path, and configured to route debug data in the debug write request to a second storage space of the main storage area, wherein the second storage space and the first storage space correspond to different physical address ranges in the main storage area. Embodiments of the present disclosure provide an artificial intelligence processor configured to perform the above method. An embodiment of the disclosure provides an electronic device comprising a main storage area and an artificial intelligence processor connected to the main storage area, the artificial intelligence processor comprising a last-level cache, the last-level cache comprising the above apparatus. Embodiments of the present disclosure provide a computer readable storage medium having stored thereon