EP-4738121-A2 - HARDWARE ACCELERATOR SERVICE AGGREGATION


Abstract

The present disclosure includes systems, methods, and computer-readable mediums for discovering capabilities of local and remote hardware (HW) accelerator cards. A local hardware (HW) accelerator card may provide, via a communication interface, a listing of acceleration services from the local HW accelerator card. The listing of acceleration services may include a first set of acceleration services provided by one or more accelerators of the local HW accelerator card and a second set of acceleration services provided by one or more accelerators of a remote HW accelerator card. A workload instruction defining a workload for processing by at least one of the acceleration services of the second set of acceleration services may be received from a processor of a computing device. The workload instruction may be forwarded to the remote HW accelerator card.
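
For illustration only, the aggregated listing and the forwarding step described above can be sketched in Python. The class `AcceleratedServicesManager`, its methods, and the service names `compress` and `encrypt` are hypothetical and do not come from the disclosure. In-process method calls stand in for the communication interface a real card would use.

```python
from dataclasses import dataclass, field


@dataclass
class AcceleratedServicesManager:
    """Hypothetical stand-in for the software managing one HW accelerator card."""

    card_id: str
    local_services: set[str]  # first set: services offered by this card
    peers: dict[str, "AcceleratedServicesManager"] = field(default_factory=dict)

    def connect(self, remote: "AcceleratedServicesManager") -> None:
        """Register a remote card whose services should be aggregated."""
        self.peers[remote.card_id] = remote

    def listing(self) -> dict[str, str]:
        """Map each service name to the id of the card that provides it."""
        table: dict[str, str] = {}
        for remote in self.peers.values():
            for name in sorted(remote.local_services):
                table.setdefault(name, remote.card_id)  # second set: remote services
        for name in self.local_services:
            table[name] = self.card_id  # assumption: prefer the local card
        return table

    def submit(self, service: str, payload: bytes) -> str:
        """Process locally, or forward the workload to the owning remote card."""
        owner = self.listing()[service]
        if owner == self.card_id:
            return f"{self.card_id} ran {service} on {len(payload)} bytes"
        return self.peers[owner].submit(service, payload)


local = AcceleratedServicesManager("local", {"compress"})
remote = AcceleratedServicesManager("remote", {"encrypt"})
local.connect(remote)
print(local.listing())
print(local.submit("encrypt", b"data"))
```

In this sketch the host only ever talks to `local`, yet it can see and use `encrypt`, which only the remote card provides.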

Inventors

KELKAR, Shrikant
ADHAV, Gargi
SHARMA, Lakshmi
JAYADEVAN, Manoj
PATEL, Parveen
RANGANATHAN, Parthasarathy

Assignees

Google LLC

Dates

Publication Date: 2026-05-06
Application Date: 2022-10-27

Claims (15)

A method comprising: generating, by an accelerated services manager (ASM) executing on one or more processors of a local hardware (HW) accelerator card, a first set of acceleration services provided by one or more accelerators of the local HW accelerator card; receiving, by the ASM, from another ASM executing on a remote HW accelerator card, a second set of acceleration services provided by one or more accelerators of the remote HW accelerator card; providing, by one or more processors of a local hardware (HW) accelerator card, via a communication interface, a listing of acceleration services from the local HW accelerator card, the listing of acceleration services including the first set of acceleration services and the second set of acceleration services; receiving, by the one or more processors, a workload instruction from a processor of a computing device, the workload instruction defining a workload for processing by at least one of the acceleration services of the second set of acceleration services; forwarding, by the ASM, the workload instruction to the remote HW accelerator card; and after determining a failure to process the workload instruction by the remote HW accelerator card, sending, by the ASM, an updated workload instruction to a different HW accelerator card for processing by at least one acceleration service of the different HW accelerator.
The method of claim 1, further comprising: receiving, by the one or more processors, a processed workload from the remote HW accelerator card, the processed workload being the workload after processing by the at least one of the acceleration services of the second set of acceleration services.
The method of claim 2, further comprising: forwarding, by the one or more processors, the processed workload to the processor of the computing device.
The method of claim 1, wherein forwarding the workload instruction to the remote HW accelerator card comprises sending the workload instruction to the other ASM.
The method of claim 1, wherein prior to receiving the second set of acceleration services, the ASM requests a listing of the second set of acceleration services from the other ASM.
The method of claim 1, wherein the ASM identifies and prunes unhealthy acceleration services from the listing of acceleration services.
The method of claim 6, wherein identifying the unhealthy acceleration services includes: determining, by the ASM, the failure to process the workload instruction by the at least one of the acceleration services of the second set of acceleration services.
The method of claim 7, wherein pruning the unhealthy acceleration services includes: marking the at least one of the acceleration services of the second set of acceleration services as unhealthy; or removing the at least one of the acceleration services of the second set of acceleration services from the listing of acceleration services.
The method of claim 1, wherein the workload instruction further defines processing by at least one acceleration service of at least one other remote HW accelerator card.
A system comprising: a communication interface; a local hardware (HW) accelerator card including one or more processors and one or more accelerators, the one or more processors configured to: generate, by an accelerated services manager (ASM) executing on the one or more processors, a first set of acceleration services provided by the one or more accelerators; receive, by the ASM, from another ASM executing on a remote HW accelerator card, a second set of acceleration services provided by one or more accelerators of the remote HW accelerator card; provide, via the communication interface, a listing of acceleration services from the local HW accelerator card, the listing of acceleration services including the first set of acceleration services and the second set of acceleration services; receive a workload instruction from a processor of a computing device, the workload instruction defining a workload for processing by at least one of the acceleration services of the second set of acceleration services; forward, by the ASM, the workload instruction to the remote HW accelerator card; and send, by the ASM after determining a failure to process the workload instruction by the remote HW accelerator card, an updated workload instruction to a different HW accelerator card for processing by at least one acceleration service of the different HW accelerator.
The system of claim 10, wherein the one or more processors are further configured to: receive a processed workload from the remote HW accelerator card, the processed workload being the workload after processing by the at least one of the acceleration services of the second set of acceleration services.
The system of claim 11, wherein the one or more processors are further configured to: forward the processed workload to the processor of the computing device.
The system of claim 10, wherein forwarding the workload instruction to the remote HW accelerator card comprises sending the workload instruction to the other ASM.
The system of claim 10, wherein prior to receiving the second set of acceleration services, the ASM requests a listing of the second set of acceleration services from the other ASM.
The system of claim 10, wherein the ASM identifies and prunes unhealthy acceleration services from the listing of acceleration services.
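
As a non-authoritative sketch, the failover step of claims 1 and 10 and the pruning of claims 6 to 8 could look like the following Python. The names `Card`, `FailoverASM`, `ServiceFailure`, and the `broken` parameter are hypothetical and exist only to simulate a failing service. The sketch takes the "marking as unhealthy" option of claim 8 rather than removing the entry from the listing.

```python
class ServiceFailure(Exception):
    """Raised by a (hypothetical) card that cannot process a workload."""


class Card:
    def __init__(self, card_id, services, broken=()):
        self.card_id = card_id
        self.services = set(services)
        self.broken = set(broken)  # services that fail, for demonstration only

    def process(self, instruction):
        service = instruction["service"]
        if service in self.broken:
            raise ServiceFailure(f"{self.card_id}:{service}")
        return f"{self.card_id} processed {instruction['workload']} with {service}"


class FailoverASM:
    def __init__(self, cards):
        self.cards = {c.card_id: c for c in cards}
        # Listing entries: (card id, service name) -> healthy flag.
        self.listing = {
            (c.card_id, s): True for c in cards for s in sorted(c.services)
        }

    def healthy_targets(self, service):
        """Card ids that still offer a healthy instance of the service."""
        return [cid for (cid, s), ok in self.listing.items() if s == service and ok]

    def dispatch(self, instruction):
        """Try each healthy card in turn; mark a card's service unhealthy on failure."""
        service = instruction["service"]
        for card_id in self.healthy_targets(service):
            updated = {**instruction, "target": card_id}  # updated workload instruction
            try:
                return self.cards[card_id].process(updated)
            except ServiceFailure:
                self.listing[(card_id, service)] = False  # prune by marking unhealthy
        raise ServiceFailure(f"no healthy card offers {service}")


asm = FailoverASM([
    Card("remote-a", {"encrypt"}, broken={"encrypt"}),
    Card("remote-b", {"encrypt"}),
])
print(asm.dispatch({"service": "encrypt", "workload": "job-1"}))
print(asm.healthy_targets("encrypt"))
```

Here the first attempt on `remote-a` fails, the entry is marked unhealthy, and the updated instruction is sent to `remote-b`. Later dispatches skip `remote-a` for that service.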

Description

CROSS-REFERENCE TO RELATED APPLICATIONS

The present application is a continuation of U.S. Patent Application No. 17/525,300, filed on November 12, 2021, the disclosure of which is hereby incorporated herein by reference.

BACKGROUND

In most systems, it is difficult for a computing device, including the components of the computing device and/or the software executing on the computing device, including the operating system, to discover the functionality and capabilities provided by hardware accelerator cards connected to the computing device by a communication interface, such as a PCIe bus. To avoid these issues, processors may be hardcoded with software, such as drivers, to communicate with particular hardware accelerator cards. However, hardcoding processors with the necessary software to communicate with particular hardware accelerator cards limits the processors to only those particular hardware accelerator cards. Thus, processors are not able to leverage the functions and capabilities of other hardware accelerator cards or hardware accelerator cards that were developed after the processor was produced.

Additionally, some hardware accelerator cards may expose their functionalities and capabilities as separate devices within the operating system of a computing device. In this regard, when a hardware accelerator card is connected to a computing device by a communication interface, such as a PCIe bus, the operating system may detect or otherwise be notified of the connection and list each function and capability of the hardware accelerator card as a discrete device within the operating system according to predefined classes and subclasses. Based on the devices listed in the operating system, the computing device may use the capabilities and functionalities of the hardware accelerator card.

As the capabilities and functionalities of hardware accelerator cards have increased and become more specialized, these new capabilities and functionalities are not clearly identified by the classes and subclasses provided for by current operating systems. Thus, some operating systems may indicate the capabilities and functionalities provided by hardware accelerator cards but may not be able to identify all of the capabilities and functionalities of the hardware accelerator cards. Further, some of the capabilities and functionalities of the hardware accelerator cards may not be recognized and/or clearly identified within the operating systems. As such, computing devices may not be able to leverage or even be made aware of all of the features and capabilities of available hardware accelerator cards.

Systems are typically provided with a limited number of connections to a communication interface. For instance, systems that include PCIe buses may only have a few PCIe slots that connect hardware accelerator cards to the PCIe buses. The limited number of connections may be due to cost constraints. In this regard, each additional connection added to a system may increase physical hardware expenses and add to the overall manufacturing costs. In addition, technical limitations, such as power availability, may also limit the number of devices that may be connected to a system. For example, a system may include five connections for hardware accelerator cards; however, the power supply may only be able to provide power to two hardware accelerator cards at a time. Thus, systems may be limited in their ability to access acceleration services offered by hardware accelerator cards due to the limited number of hardware accelerator cards that may be connected to the systems.

BRIEF SUMMARY

The technology described herein relates to systems and methods for service aggregation that aggregates and exposes acceleration services provided by accelerators of hardware accelerator cards.
With service aggregation, a hardware accelerator card may communicate with other hardware accelerator cards to aggregate and expose the acceleration services provided by accelerators of these other hardware accelerator cards, which may be connected locally or remotely. The aggregated and exposed acceleration services may also include acceleration services offered by the accelerators of the hardware accelerator card performing the service aggregation. The acceleration services offered by the hardware accelerator card, as well as those of other locally or remotely connected hardware accelerator cards, may then be leveraged by the system.

One aspect of the disclosure relates to a method. The method may comprise providing, by one or more processors of a local hardware (HW) accelerator card, via a communication interface, a listing of acceleration services from the local HW accelerator card, the listing of acceleration services including a first set of acceleration services provided by one or more accelerators of the local HW accelerator card and a second set of acceleration services provided by one or more accelerators of a remote HW accelerator card; receiving, by the one or more pr