US-12627589-B2 - Network collective offloading cost management
Abstract
The disclosed device includes a collective engine that can select a communication cost model from multiple communication cost models for a collective operation and configure a topology of a collective network for performing the collective operation using the selected communication cost model. Various other methods, systems, and computer-readable media are also disclosed.
Inventors
- Josiah I. Clark
Assignees
- ADVANCED MICRO DEVICES, INC.
Dates
- Publication Date
- 20260512
- Application Date
- 20231214
Claims (19)
- 1. A device comprising: a control circuit configured to: select a communication cost model for a collective operation; and configure, based on the selected communication cost model, a topology of a collective network for performing the collective operation, wherein configuring the topology includes configuring communication connections between nodes of a level of the collective network with nodes of neighboring levels of the collective network based on a plurality of parameters.
- 2. The device of claim 1, wherein the control circuit is configured to select the communication cost model by: optimizing a communication cost of the collective operation by evaluating a plurality of communication cost models for the collective operation; and selecting the communication cost model from the plurality of communication cost models corresponding to the optimized communication cost.
- 3. The device of claim 2, wherein optimizing the communication cost includes determining optimized parameters for the plurality of parameters.
- 4. The device of claim 3, wherein configuring the topology is based on using the optimized parameters as topology parameters.
- 5. The device of claim 3, wherein the plurality of parameters includes at least one of: a number of upstream ports; a number of downstream ports; a number of processors; a number of ports per processor; a tree depth; and a stride value.
- 6. The device of claim 5, wherein the plurality of parameters includes at least an optimized stride value that corresponds to a number of nodes connected to in a next level.
- 7. The device of claim 2, wherein optimizing the communication cost includes flattening a tree associated with the communication cost model.
- 8. The device of claim 1, wherein the control circuit is further configured to configure a portion of the topology based on the selected communication cost model.
- 9. The device of claim 8, wherein the control circuit is further configured to configure a second portion of the topology based on a second communication cost model.
- 10. A system comprising: a memory; a processor; and a control circuit configured to: evaluate cost parameters of a communication cost model for a collective operation; and configure, based on the cost parameters, a topology of a collective network for performing the collective operation, wherein configuring the topology includes configuring communication connections between nodes of a level of the collective network with nodes of neighboring levels of the collective network based on a plurality of topology parameters.
- 11. The system of claim 10, wherein the control circuit is configured to evaluate the cost parameters of the communication cost model by optimizing a communication cost of the collective operation.
- 12. The system of claim 11, wherein optimizing the communication cost includes determining optimized parameter values for the cost parameters, and configuring the topology is based on using the optimized parameter values as topology parameters.
- 13. The system of claim 12, wherein the cost parameters include at least one of: a number of upstream ports; a number of downstream ports; a number of processors; a number of ports per processor; a tree depth; and a stride value.
- 14. The system of claim 13, wherein the plurality of topology parameters include at least an optimized stride value that corresponds to a number of nodes connected to in a next level.
- 15. The system of claim 11, wherein optimizing the communication cost includes flattening a tree associated with the communication cost model.
- 16. The system of claim 10, wherein the control circuit is further configured to configure a portion of the topology based on the evaluated cost parameters of the communication cost model and configuring a second portion of the topology based on a second communication cost model.
- 17. A method comprising: evaluating a plurality of communication cost models for a collective operation; determining values for a plurality of cost model parameters based on the evaluation; and configuring, using a plurality of topology parameters corresponding to the values for the plurality of cost model parameters, a topology of a collective network for performing the collective operation, wherein configuring the topology includes configuring communication connections between nodes of a level of the collective network with nodes of neighboring levels of the collective network based on the plurality of topology parameters.
- 18. The method of claim 17, wherein evaluating the plurality of communication cost models includes optimizing a communication cost of the collective operation and the values for the plurality of cost model parameters are determined based on the optimized communication cost.
- 19. The method of claim 17, further comprising configuring a portion of the topology based on the plurality of topology parameters and configuring a second portion of the topology based on a second communication cost model.
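As an informal illustration of the selection step recited in claims 2 and 17 (evaluating several communication cost models and choosing the one with the lowest cost), the following minimal Python sketch may help. It is not taken from the disclosure. It assumes the widely used latency-bandwidth (alpha-beta) approximation for an allreduce, with one ring model and one binary-tree model; the names `Link`, `ring_cost`, `tree_cost`, `COST_MODELS`, and `select_cost_model`, as well as the numeric values in the usage note, are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class Link:
    """Assumed link characteristics (not specified by the source)."""
    alpha: float  # per-message latency, in seconds
    beta: float   # per-byte transfer time, in seconds


def ring_cost(p: int, n: int, link: Link) -> float:
    """Textbook alpha-beta estimate for a ring allreduce of n bytes on p nodes."""
    return 2 * (p - 1) * link.alpha + 2 * (p - 1) / p * n * link.beta


def tree_cost(p: int, n: int, link: Link) -> float:
    """Textbook alpha-beta estimate for a binary-tree reduce followed by a broadcast."""
    steps = math.ceil(math.log2(p))
    return 2 * steps * (link.alpha + n * link.beta)


# Hypothetical registry of candidate cost models for one collective operation.
COST_MODELS: Dict[str, Callable[[int, int, Link], float]] = {
    "ring": ring_cost,
    "tree": tree_cost,
}


def select_cost_model(p: int, n: int, link: Link) -> Tuple[str, float]:
    """Evaluate every candidate model and return the cheapest one with its cost."""
    costs = {name: model(p, n, link) for name, model in COST_MODELS.items()}
    best = min(costs, key=costs.get)
    return best, costs[best]
```

Under these assumptions, with 16 nodes, `alpha = 1e-6`, and `beta = 1e-9`, calling `select_cost_model(16, 8, Link(1e-6, 1e-9))` picks the tree model for an 8-byte message, while a 100 MB message picks the ring model. A real collective engine would presumably fold in the topology parameters named in claims 5 and 13 (port counts, tree depth, stride), which this sketch omits.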
Description
CROSS REFERENCE TO RELATED APPLICATION
This application claims the benefit of U.S. Provisional Application No. 63/432,657, filed 14 Dec. 2022, the disclosure of which is incorporated, in its entirety, by this reference.
BACKGROUND
Various improvements to computing performance, such as increasing the number of processing cores, provide increased performance but can reach scalability limits. Collective communications allow for global communications operations amongst all processes/nodes in a system (e.g., a collective network), including networked nodes. As the number of nodes increases, collective communications can suffer from scalability issues. To ensure better scalability, certain communications processing can be offloaded from the nodes (e.g., processors thereof) to other nodes (e.g., a network adapter, switch, etc.) of the collective network, which can be managed by a collective engine that can reside in a server or other connected computing device.
BRIEF DESCRIPTION OF THE DRAWINGS
The accompanying drawings illustrate a number of exemplary implementations and are a part of the specification. Together with the following description, these drawings demonstrate and explain various principles of the present disclosure.
- FIG. 1 is a block diagram of an exemplary system for a collective engine.
- FIG. 2 is a block diagram of an exemplary collective network.
- FIG. 3 is a simplified diagram of a broadcast operation and a reduce operation.
- FIG. 4 is a simplified diagram of a scatter operation and a gather operation.
- FIG. 5 is a simplified diagram of an allgather operation and a reduce-scatter operation.
- FIG. 6 is a simplified diagram of an allreduce operation.
- FIG. 7A is a network topology diagram of an initial phase of an allreduce operation.
- FIG. 7B is a network topology diagram of an intermediate phase of an allreduce operation.
- FIG. 7C is a network topology diagram of a final phase of an allreduce operation.
- FIG. 8 is a table of network topology cost models for an allreduce operation.
- FIGS. 9A-G illustrate a network topology as chunks are routed through.
- FIG. 10 is a flow diagram of an exemplary method for network collective offload cost management.
- FIG. 11 is a flow diagram of an exemplary method for network collective offload routing management.
- FIG. 12 is a flow diagram of an exemplary method for network collective offload message chunking management.
Throughout the drawings, identical reference characters and descriptions indicate similar, but not necessarily identical, elements. While the exemplary implementations described herein are susceptible to various modifications and alternative forms, specific implementations have been shown by way of example in the drawings and will be described in detail herein. However, the exemplary implementations described herein are not intended to be limited to the particular forms disclosed. Rather, the present disclosure covers all modifications, equivalents, and alternatives falling within the scope of the appended claims.
DETAILED DESCRIPTION
The present disclosure is generally directed to network collective offloading management. As will be explained in greater detail below, implementations of the present disclosure provide a collective engine that can manage various aspects of collective offloading onto networked nodes that communicate with each other to share data and/or processing of data.
The present disclosure is generally directed to network collective offloading cost management.
As will be explained in greater detail below, implementations of the present disclosure configure a topology of a collective network for performing a collective operation based on a communication cost model, which can be evaluated and/or otherwise optimized for communication cost (e.g., latency with respect to sending data), such as reducing and/or minimizing costs for communicating between nodes of the collective network when performing the collective operation. The systems and methods provided herein can improve efficiency of the collective network, for instance by establishing a more efficient topology that reduces network communication costs.
In one implementation, a device for network collective offloading cost management includes a control circuit configured to select a communication cost model for a collective operation, and configure, based on the selected communication cost model, a topology of a collective network for performing the collective operation.
In some examples, the control circuit is configured to select the communication cost model by optimizing a communication cost of the collective operation by evaluating a plurality of communication cost models for the collective operation, and selecting the cost model from the plurality of communication cost models corresponding to the optimized communication cost. In some examples, the communication cost model includes a plurality of parameters and optimizing the communication cost includes determining optimized parameters for the plur