EP-4735999-A1 - HARDWARE-BASED ACCELERATOR SIGNALING
Abstract
A processor [102] employs a hardware signal monitor [110] to manage signaling for accelerators [103, 104]. The hardware signal monitor monitors designated memory addresses assigned to accelerator signals. In response to a memory write [112] to one of the designated memory addresses, the hardware signal monitor executes a set of one or more operations (referred to as a callback). The hardware signal monitor thereby enables improved and enhanced signaling features, such as asynchronous signaling between agents, inter-accelerator signaling, and inter-process signaling.
Inventors
- WYSE, MARK UNRUH
- GUTIERREZ, ANTHONY THOMAS
- BLINZER, PAUL
- BAYLISS, SAMUEL RICHARD
Assignees
- Advanced Micro Devices, Inc.
- Xilinx, Inc.
Dates
- Publication Date: 2026-05-06
- Application Date: 2024-04-18
Claims (20)
- 1. A method comprising: based on a signal issued by a first accelerator, asynchronously notifying, by a hardware signal monitor, a first process of the signal.
- 2. The method of claim 1, wherein asynchronously notifying comprises: executing, at the hardware signal monitor, a set of operations based on the signal.
- 3. The method of claim 2, wherein the set of operations includes one or more of a memory transfer of one or more bytes of data, an atomic memory operation, an enqueuing of a first packet, a dequeuing of a second packet, a task dispatch operation, a set of signal operations, and an instruction to be executed by a processor.
- 4. The method of claim 2, wherein the set of operations includes the hardware signal monitor sending a task to a second accelerator.
- 5. The method of claim 4, wherein sending the task comprises the hardware signal monitor sending the task based on a value generated by one of a plurality of accelerators.
- 6. The method of claim 4, wherein sending the task comprises enqueuing the task at a work queue of the second accelerator.
- 7. The method of claim 2, wherein the set of operations is programmable.
- 8. The method of any of claims 1-7, wherein the signal is issued by a second process executing at the first accelerator, the second process independent of the first process.
- 9. The method of any of claims 1-8, wherein the signal is issued by the first accelerator in response to a write operation to a specified memory address.
- 10. A method, comprising: receiving, at a hardware signal monitor, a signal from a first process, the signal associated with a first accelerator; and indicating, by the hardware signal monitor, the signal to a second process, the second process independent of the first process.
- 11. The method of claim 10, wherein the second process is associated with a second accelerator of the processor.
- 12. A system, comprising: a first accelerator; and hardware signal monitor circuitry to: based on a signal associated with the first accelerator, asynchronously notify a first process of the signal.
- 13. The system of claim 12, wherein the hardware signal monitor circuitry includes: signal handling circuitry to execute a set of operations based on the signal.
- 14. The system of claim 13, wherein the set of operations includes one or more of a memory transfer of one or more bytes of data, an atomic memory operation, an enqueuing of a first packet, a dequeuing of a second packet, a task dispatch operation, a set of signal operations, and an instruction to be executed by a processor.
- 15. The system of claim 14, wherein the set of operations includes the hardware signal monitor circuitry sending a task to a second accelerator.
- 16. The system of claim 15, wherein the hardware signal monitor circuitry is to send the task based on a value generated by one of a plurality of accelerators.
- 17. The system of claim 15, wherein the hardware signal monitor circuitry is to send the task to a work queue of the second accelerator.
- 18. The system of any of claims 13-17, wherein the set of operations is programmable.
- 19. The system of any of claims 13-18, wherein the signal is issued by a second process, the second process independent of the first process.
- 20. The system of any of claims 13-18, wherein the signal is based on a write operation to a specified memory address.
Description
BACKGROUND
[0001] To improve processing efficiency and conserve power, some processing systems employ one or more accelerators to perform designated operations on behalf of a central processing unit (CPU). For example, some processing systems employ a graphics processing unit (GPU) to perform graphics operations, an artificial intelligence (AI) accelerator to perform AI operations, a digital signal processor (DSP) to perform signal processing operations, and the like. To facilitate communication between the accelerators and the CPU, some processing systems employ signals, wherein each signal is a shared memory object that can be accessed by the CPU and one or more accelerators to share information. Examples of signals include doorbell signals that notify agents (e.g., one or more accelerators) that work is available, and completion signals that notify agents (e.g., a CPU or accelerator) when assigned work has been completed. However, existing signal implementations are not well-suited for asynchronous communication, and require a relatively high amount of overhead, such as software polling or interrupts to observe the state of each signal.
BRIEF DESCRIPTION OF THE DRAWINGS
[0002] The present disclosure may be better understood, and its numerous features and advantages made apparent to those skilled in the art by referencing the accompanying drawings. The use of the same reference symbols in different drawings indicates similar or identical items.
[0003] FIG. 1 is a block diagram of a system that employs a hardware signal monitor to manage signals at the system in accordance with some embodiments.
[0004] FIG. 2 is a block diagram illustrating aspects of the hardware signal monitor of FIG. 1 in accordance with some embodiments.
[0005] FIG. 3 is a block diagram illustrating an example of the hardware signal monitor of FIG. 1 enabling signaling between accelerators in accordance with some embodiments.
[0006] FIG. 4 is a block diagram illustrating an example of the hardware signal monitor of FIG. 1 enabling inter-process signaling in accordance with some embodiments.
[0007] FIG. 5 is an example of a CPU of FIG. 1 employing page tables to manage the hardware signal monitor of FIG. 1 in accordance with some embodiments.
[0008] FIG. 6 is a flow diagram of a method of managing signals at a hardware signal monitor in accordance with some embodiments.
[0009] FIG. 7 is a flow diagram of a method of employing a hardware signal monitor to implement a wait process at a system in accordance with some embodiments.
DETAILED DESCRIPTION
[0010] FIGs. 1-7 illustrate circuitry and techniques for employing a hardware signal monitor to manage signaling for accelerators at a system in accordance with some embodiments. The hardware signal monitor monitors designated memory addresses assigned to accelerator signals. In response to a memory write to one of the designated memory addresses, the hardware signal monitor executes a set of one or more operations (referred to as a callback). The hardware signal monitor thereby enables improved and enhanced signaling features, such as asynchronous signaling between agents, inter-accelerator signaling, and inter-process signaling.
[0011] To illustrate, in some embodiments a system includes a number of agents, including at least one CPU and two or more accelerators. To communicate, the agents employ a set of signals, wherein each signal is a shared memory-backed object assigned a corresponding memory address. Each signal includes both a signal value and a signal condition. A signal is typically waited on by one or more agents, wherein each agent takes action when the signal condition is met by the corresponding signal value (e.g., the signal condition is met when the signal value is less than one). A signal is sent by an agent when the agent performs a write using an atomic memory operation to the corresponding address.
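The signal model described above (a value, a condition, and a callback executed in response to a write to a designated address) can be illustrated with a small software sketch. This is only an illustrative model under stated assumptions, not the disclosed hardware: the names Signal, SignalMonitorModel, register, and atomic_add, and the address 0x1000, are hypothetical; a lock stands in for the atomic memory operation; and the callback here runs in the writer's thread, whereas the disclosure describes the hardware signal monitor performing it.

```python
import threading
from typing import Callable, Dict, List, Optional


class Signal:
    """A signal value paired with the condition that is checked against it."""

    def __init__(self, value: int, condition: Callable[[int], bool]) -> None:
        self.value = value
        self.condition = condition


class SignalMonitorModel:
    """Hypothetical software stand-in for a monitor of designated addresses."""

    def __init__(self) -> None:
        self._lock = threading.Lock()  # models the atomicity of each write
        self._signals: Dict[int, Signal] = {}
        self._callbacks: Dict[int, List[Callable[[int], None]]] = {}

    def register(self, address: int, signal: Signal,
                 callback: Callable[[int], None]) -> None:
        """Designate an address as a signal and attach a callback to it."""
        with self._lock:
            self._signals[address] = signal
            self._callbacks.setdefault(address, []).append(callback)

    def atomic_add(self, address: int, delta: int) -> Optional[int]:
        """Model an atomic read-modify-write; return the new value, or None
        if the address is not a designated signal address."""
        with self._lock:
            signal = self._signals.get(address)
            if signal is None:
                return None  # writes to other addresses are not monitored
            signal.value += delta
            new_value = signal.value
            met = signal.condition(new_value)
            to_run = list(self._callbacks[address]) if met else []
        # Run callbacks outside the lock so they may send further signals.
        for callback in to_run:
            callback(new_value)
        return new_value


# Usage: a completion signal that starts at 2 and is met when value < 1.
COMPLETION_ADDR = 0x1000  # hypothetical designated address
log = []
monitor = SignalMonitorModel()
monitor.register(COMPLETION_ADDR,
                 Signal(2, lambda v: v < 1),
                 lambda v: log.append(("work complete", v)))
monitor.atomic_add(COMPLETION_ADDR, -1)  # value becomes 1, condition not met
monitor.atomic_add(COMPLETION_ADDR, -1)  # value becomes 0, callback runs
```

In this sketch no agent repeatedly reads the signal value; the waiting side is notified only when a write causes the condition to be met, which is the behavior the polling and interrupt approaches discussed next are contrasted against.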
An example of a signal is a doorbell signal, wherein the signal is used by one agent (e.g., a CPU) to indicate to another agent (e.g., an accelerator) that work (e.g., one or more commands) is available to be executed. Another example of a signal is a completion signal, wherein the signal is used by one agent to indicate to another agent that assigned work has been completed.
[0012] Conventionally, signals are managed by software polling, interrupts, or a combination thereof. With software polling, software executes a polling loop that repeatedly checks a signal value until the signal condition is met. However, this approach requires a relatively high number of memory accesses (to check the signal value), consuming energy and memory bandwidth. Furthermore, this approach does not allow for true asynchronous signaling, as the software synchronously checks the signal value. With interrupts, the system is configured to trigger a specified interrupt when a corresponding signal condition is met. However, this approach suffers from a