EP-4738786-A1 - RESOURCE MANAGEMENT IN A NETWORK INTERFACE CONTROLLER WITH HARDWARE LINK AGGREGATION

Abstract

An example network interface controller (NIC) includes: first resources configured to supply first traffic; second resources configured to supply second traffic; a load balancer, coupled to the first and second resources, configured to balance the first traffic and the second traffic between first and second port circuits of a link aggregation group (LAG) using a hash function; and remote direct memory access (RDMA) logic configured to, using the hash function, divide work requests into a first set of work requests for first packets that hash to the first port circuit and a second set of work requests for second packets that hash to the second port circuit, the RDMA logic configured to supply the first set of work requests to the first resources and the second set of work requests to the second resources.

Inventors

  • Huang, Jeffrey Wei
  • Muppavarapu, Sudheer

Assignees

  • Avago Technologies International Sales Pte. Limited

Dates

Publication Date
20260506
Application Date
20251031

Claims (15)

  1. A network interface controller (NIC), comprising: first resources configured to supply first traffic; second resources configured to supply second traffic; a load balancer, coupled to the first and second resources, configured to balance the first traffic and the second traffic between first and second port circuits of a link aggregation group (LAG) using a hash function; and remote direct memory access (RDMA) logic configured to, using the hash function, divide work requests into a first set of work requests for first packets that hash to the first port circuit and a second set of work requests for second packets that hash to the second port circuit, the RDMA logic configured to supply the first set of work requests to the first resources and the second set of work requests to the second resources.
  2. The NIC of claim 1, further comprising: a bus interface having a first physical function and a second physical function; wherein the RDMA logic, in response to first RDMA work received via the first physical function, is configured to add a portion of the first set of work requests for a portion of the first packets to a first group and add a portion of the second set of work requests for a portion of the second packets to a second group; wherein the RDMA logic, in response to second RDMA work received via the second physical function, is configured to add another portion of the first set of work requests for another portion of the first packets to a third group and add another portion of the second set of work requests for another portion of the second packets to a fourth group.
  3. The NIC of claim 2, wherein the first and third groups are configured to input to the first resources, and wherein the second and fourth groups are configured to input to the second resources.
  4. The NIC of any one of the claims 1 to 3, wherein the first set of resources include a first queue configured to receive the first set of work requests and a first buffer configured to assemble the first packets based on the first set of work requests, and wherein the second resources include a second queue configured to receive the second set of work requests and a second buffer configured to assemble the second packets based on the second set of work requests.
  5. The NIC of any one of the claims 1 to 4, wherein the RDMA logic includes a hash calculator configured to use the hash function to divide the work requests.
  6. The NIC of claim 5, wherein the hash calculator comprises firmware executed by a central processing unit (CPU).
  7. The NIC of claim 5, wherein the hash calculator is configured to obtain a table from the load balancer and divide the work requests into the first set of work requests for the first packets that hash to the first port, and the second set of work requests for the second packets that hash to the second port, using results of the hash function as applied to the table.
  8. A method of managing resources in a network interface controller (NIC) in a computer, the method comprising: supplying first traffic from first resources of the NIC; supplying second traffic from second resources of the NIC; balancing the first traffic and the second traffic between first and second port circuits of a link aggregation group (LAG) using a hash function; dividing, using the hash function at remote direct memory access (RDMA) logic of the NIC, work requests into a first set of work requests for first packets that hash to the first port circuit and a second set of work requests for second packets that hash to the second port circuit; supplying the first set of work requests to the first resources; and supplying the second set of work requests to the second resources.
  9. The method of claim 8, further comprising: receiving first RDMA work via a first physical function of the NIC; adding a portion of the first set of work requests for a portion of the first packets to a first group and adding a portion of the second set of work requests for a portion of the second packets to a second group; receiving second RDMA work via a second physical function of the NIC; and adding another portion of the first set of work requests for another portion of the first packets to a third group and adding another portion of the second set of work requests for another portion of the second packets to a fourth group; wherein in particular the method further comprises inputting the first and third groups to the first resources; and inputting the second and fourth groups to the second resources.
  10. The method of claim 8 or 9, comprising at least one of the following features (A) and (B): (A) the first set of resources include a first queue configured to receive the first set of work requests and a first buffer configured to assemble the first packets based on the first set of work requests, and wherein the second resources include a second queue configured to receive the second set of work requests and a second buffer configured to assemble the second packets based on the second set of work requests; and (B) the RDMA logic includes a hash calculator configured to use the hash function to divide the work requests, wherein in particular the hash calculator comprises firmware executed by a central processing unit (CPU) and/or the hash calculator is configured to obtain a table from the load balancer and divide the work requests into the first set of work requests for the first packets that hash to the first port, and the second set of work requests for the second packets that hash to the second port, using results of the hash function as applied to the table.
  11. A computer, comprising: a hardware platform including a central processing unit (CPU), memory, and a network interface controller (NIC); and software executing on the hardware platform; wherein the NIC includes: first resources configured to supply first traffic; second resources configured to supply second traffic; a load balancer, coupled to the first and second resources, configured to balance the first traffic and the second traffic between first and second port circuits of a link aggregation group (LAG) using a hash function; and remote direct memory access (RDMA) logic configured to, using the hash function, divide work requests from the software into a first set of work requests for first packets that hash to the first port circuit and a second set of work requests for second packets that hash to the second port circuit, the RDMA logic configured to supply the first set of work requests to the first resources and the second set of work requests to the second resources.
  12. The computer of claim 11, further comprising: a bus configured to couple the NIC to the CPU and the memory; wherein the NIC further includes a bus interface, coupled to the bus, having a first physical function and a second physical function; wherein the RDMA logic, in response to first RDMA work received via the first physical function, is configured to add a portion of the first set of work requests for a portion of the first packets to a first group and add a portion of the second set of work requests for a portion of the second packets to a second group; and wherein the RDMA logic, in response to second RDMA work received via the second physical function, is configured to add another portion of the first set of work requests for another portion of the first packets to a third group and add another portion of the second set of work requests for another portion of the second packets to a fourth group.
  13. The computer of claim 12, wherein the first and third groups are configured to input to the first resources, and wherein the second and fourth groups are configured to input to the second resources.
  14. The computer of any one of the claims 11 to 13, wherein the first set of resources include a first queue configured to receive the first set of work requests and a first buffer configured to assemble the first packets based on the first set of work requests, and wherein the second resources include a second queue configured to receive the second set of work requests and a second buffer configured to assemble the second packets based on the second set of work requests.
  15. The computer of any one of the claims 11 to 14, wherein the RDMA logic includes a hash calculator configured to use the hash function to divide the work requests; wherein in particular the hash calculator comprises firmware executed by another central processing unit (CPU) of the NIC.
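
The division of work requests recited in the claims can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the claims do not name a hash function, so CRC32 stands in for it; `LAG_TABLE` is a hypothetical stand-in for the table obtained from the load balancer; and the flow keys and physical function numbers are invented. Work requests are first grouped per (physical function, port), and the groups for a given port are then fed to that port's resources regardless of which physical function they arrived on.

```python
import zlib
from collections import defaultdict

NUM_PORTS = 2
# Hypothetical table from the load balancer: hash bucket -> port index.
LAG_TABLE = [0, 1, 0, 1]

def port_for(flow_key: bytes) -> int:
    """Port a packet with this flow key would hash to (CRC32 is an assumption)."""
    return LAG_TABLE[zlib.crc32(flow_key) % len(LAG_TABLE)]

def divide_work_requests(work_requests):
    """Group (pf, flow_key) work requests by (physical function, port)."""
    groups = defaultdict(list)
    for pf, flow_key in work_requests:
        groups[(pf, port_for(flow_key))].append(flow_key)
    return groups

def feed_resources(groups):
    """Send every group for a port to that port's resources, whatever its PF."""
    resources = {port: [] for port in range(NUM_PORTS)}
    for (pf, port), keys in sorted(groups.items()):
        resources[port].extend((pf, key) for key in keys)
    return resources

if __name__ == "__main__":
    # Two physical functions, each submitting work for several invented flows.
    requests = [(pf, f"qp-{i}".encode()) for pf in (0, 1) for i in range(6)]
    for port, items in feed_resources(divide_work_requests(requests)).items():
        print(f"resources for port {port}: {items}")
```

Because the same function decides both the resource set and the egress port, each resource set in this sketch only ever holds work for packets bound for its own port.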

Description

BACKGROUND
A network interface controller (NIC) may be a hardware component in a computer that connects the computer to a computer network. A computer may be an electronic device for storing and processing data. A computer network (hereinafter referred to as a network) may be a system that connects computers. A NIC can include a port circuit that couples the NIC to a transmission medium of the network. A port circuit (hereinafter referred to as a port) can be a circuit that provides a point of data ingress (e.g., data input), data egress (e.g., data output), or both. For example, a port of a NIC can include a physical layer circuit (PHY) among other circuits (examples discussed below). A PHY may be a circuit, such as a transceiver, which implements physical layer functions, e.g., layer 1 of the Open Systems Interconnection (OSI) model.
Some NICs can include multiple ports. A NIC with multiple ports can include multiple connections to the network (where the connections can be referred to as links). Link aggregation may be the combining (referred to as aggregating) of multiple links. A link aggregation group (LAG) may be a logical entity representing an aggregation of multiple links. A NIC can group ports thereof to provide one end of the LAG. A network device connected to the NIC can group some of its ports to provide the other end of the LAG. Other terms known in the art to describe the concept of link aggregation include trunking, bundling, bonding, channeling, and teaming. For clarity by example, the description herein will use the term link aggregation. Link aggregation can increase total throughput with respect to use of a single link and can provide redundancy, where all but one of the links can fail without losing network connectivity. The NIC can include hardware that supports a LAG (hereinafter referred to as LAG hardware). The LAG hardware can balance the transmission of traffic among the ports of the LAG.
A NIC can include a set of transmission resources for each port. Types of transmission resources are discussed further below. At least a portion of the transmission resources can be a pipeline. A pipeline may be a set of resources connected in series, where the input of one resource depends on the output of another resource. The LAG hardware can balance traffic supplied by sets of transmission resources among ports of a LAG. For example, a NIC can include two ports designated A and B as part of a LAG and two transmission resource sets designated 1 and 2. The LAG hardware can balance traffic supplied by transmission resource set 1 between the ports A and B (e.g., packets from transmission resource set 1 can sometimes be transmitted by port A and other times be transmitted by port B). The LAG hardware can also balance packets supplied by transmission resource set 2 between the ports A and B (e.g., packets from transmission resource set 2 can sometimes be transmitted by port A and other times be transmitted by port B).
The NIC can be coupled to an expansion bus of the computer. Peripheral Component Interconnect Express (PCIe) is a well-known and widely used standard for an expansion bus in a computer. In PCIe architecture, a peripheral device can present as multiple logical devices, where each logical device can be referred to as a function (or PCIe function). Each function can have its own configuration space, resources, and capabilities presented to software in the computer. A NIC, for example, can have separate functions for each port, where each of the functions includes a separate set of transmission resources. Thus, continuing the example above, the NIC can have a function A for the port A having, for example, the transmission resource set 1, and a function B for the port B having, for example, the transmission resource set 2. Software in a computer can be unaware that some ports of a NIC are part of a LAG.
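
The balancing behavior in the example above can be sketched as follows. This is a minimal illustration under stated assumptions, not the LAG hardware's actual method: the text does not specify the hash, so CRC32 over an invented flow key stands in for it, and the flow names are hypothetical. The point is that the egress port depends only on the packet's hash, not on which transmission resource set (1 or 2) supplied the packet, so each resource set can end up feeding both ports A and B.

```python
import zlib

PORTS = ["A", "B"]  # the two LAG ports from the example above

def lag_port(flow_key: bytes) -> str:
    """Select a LAG port from a hash of the flow key (CRC32 is an assumption)."""
    return PORTS[zlib.crc32(flow_key) % len(PORTS)]

def balance(packets):
    """Map each (resource_set, flow_key) packet to the port its hash selects."""
    return [(rs, key, lag_port(key)) for rs, key in packets]

# Hypothetical traffic: resource sets 1 and 2 each supply packets of eight flows.
packets = [(rs, f"flow-{i}".encode()) for rs in (1, 2) for i in range(8)]
result = balance(packets)

# Collect the set of ports that each transmission resource set reaches.
ports_by_set = {}
for rs, _key, port in result:
    ports_by_set.setdefault(rs, set()).add(port)

if __name__ == "__main__":
    for rs, ports in sorted(ports_by_set.items()):
        print(f"transmission resource set {rs} -> ports {sorted(ports)}")
```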
Continuing the example above, the software can provide a sequence of first packets and then second packets to the NIC via function A to be transmitted via the port A. The NIC can process the first and second packets through transmission resource set 1 associated with function A. The LAG hardware can determine that the second packets are to be transmitted via the port A as expected by the software. The LAG hardware, however, can determine that the first packets are to be transmitted via the port B of the LAG unbeknownst to the software. Thus, the software cannot arrange ahead of time which port of the NIC transmits which packets.
The NIC can include flow control functionality, where ports can temporarily pause transmission of packets to prevent congestion in the network. In the example, the port B can pause transmission of packets. In such case, the first packets stall in transmission resource set 1 waiting for port B to resume packet transmission. The second packets can be behind the first packets in the transmission resource set 1 waiting for further processing (e.g., the second packets can be behi