US-20260127134-A1 - RING DATA-STRUCTURE FOR DATA OPERATIONS
Abstract
Systems, methods, and techniques are directed to data operations utilizing ring data-structures. A device comprises a network interface controller (NIC) and a ring data-structure comprising a first ring comprising pointers indicating memory regions that are empty and a second ring comprising pointers indicating regions that store data. In an aspect, the NIC receives a write request and determines a region to write data to based on the first ring. The NIC writes the data to the region and updates the second region to include a pointer to the region. In another aspect, the NIC receives a read request to read data and determines a region to read data from based on a pointer in the second ring. The NIC reads the data from the region. In a further aspect, the NIC is a one-sided remote direct memory access (RDMA) NIC and reads/writes utilizing low-level operations.
Inventors
- Matteo INTERLANDI
- Carlo Aldo Curino
- Karla Jean Saur
- Matthias Christian JASNY
Assignees
- MICROSOFT TECHNOLOGY LICENSING, LLC
Dates
- Publication Date
- 20260507
- Application Date
- 20241107
Claims (20)
- 1 . A system, comprising: a remote direct memory access (RDMA) memory device comprising: a plurality of memory regions, a first ring buffer comprising a first pointer indicating an address of a first memory region of the plurality of memory regions, the first memory region being empty, and a second ring buffer; and an RDMA network interface controller (NIC) that: receives, from a first computing device, a write request for writing first data to one of the plurality of memory regions, determines, based on the first pointer, the address of the first memory region, writes the first data to the first memory region based on the address of the first memory region, and updates the second ring buffer to comprise a second pointer indicating the address of the first memory region storing the first data.
- 2 . The system of claim 1 , wherein: the system further comprises an overflow manager that: determines a storage capacity of the plurality of memory regions satisfies a first storage criterion, and transfers the first data from the first memory region to a spill storage device; and the RDMA NIC causes the first ring buffer to comprise a third pointer indicating the address of the first memory region, the first memory region being empty.
- 3 . The system of claim 2 , wherein: the overflow manager further: determines the storage capacity of the plurality of memory regions satisfies a second storage criterion, determines, based on the first ring buffer, an address of a second memory region, the second memory region being empty, and transfers the first data from the spill storage device to the second memory region; and the RDMA NIC causes the second ring buffer to comprise a fourth pointer indicating the address of the second memory region storing the first data.
- 4 . The system of claim 2 , wherein: the first data is associated with a first entity account; the RDMA memory device further comprises: a first ring pair comprising the first ring buffer and the second ring buffer, and a second ring pair comprising a third ring buffer and a fourth ring buffer, the fourth ring buffer comprising fourth pointer indicating an address of a second memory region storing second data associated with a second entity account; and the overflow manager further: prioritizes transferring the first data from the first memory region to the spill storage device over transferring the second data from the second memory region to the spill storage device.
- 5 . The system of claim 1 , wherein to determine, based on the first pointer, the address of the first memory region, the RDMA NIC further: accesses the first ring buffer to obtain the first pointer; and provides the first pointer to the first computing device.
- 6 . The system of claim 5 , wherein to write the first data to the first memory region, the RDMA NIC further: receives, from the first computing device, a write instruction indicating the first data is to be written to the address of the first memory region; and responsive to receiving the write instruction, writes the first data to the first memory region.
- 7 . The system of claim 1 , wherein the RDMA NIC further: receives, from a second computing device, a read request for reading the first data; determines, based on the second pointer, the address of the first memory region; reads the first data from the first memory region based on the address of the first memory region; and provides the first data to the second computing device.
- 8 . The system of claim 7 , wherein the second ring buffer comprises a third pointer indicating an address of a second memory region storing second data, and to determine, based on the second pointer, the address of the first memory region, the RDMA NIC further: accesses the second ring buffer to obtain the second pointer and the third pointer; and provides the second pointer and the third pointer to the second computing device.
- 9 . The system of claim 1 , wherein to update the second ring buffer to comprise the second pointer, the RDMA NIC further: transfers the first pointer from the first ring buffer to the second ring buffer as the second pointer.
- 10 . A method for facilitating data operations in a computing system, the method comprising: receiving, from a first computing device, a write request for writing data to a memory device, the memory device comprising a first memory region, a first ring buffer, and a second ring buffer, the first ring buffer comprising a first pointer indicating an address of the first memory region; determining the first memory region is an empty memory region based on the first ring buffer comprising the first pointer; writing the data to the first memory region based on the address of the first memory region; and updating the second ring buffer to comprise a second pointer indicating the address of the first memory region storing the first data.
- 11 . The method of claim 10 , further comprising: determining a storage capacity of the memory device satisfies a first storage criterion, transferring the first data from the first memory region to a spill storage device, and causing the first ring buffer to comprise a third pointer indicating the address of the first memory region, the first memory region being empty.
- 12 . The method of claim 11 , further comprising: determining the storage capacity of the memory device satisfies a second storage criterion; determining, based on the first ring buffer, an address of a second memory region, the second memory region being empty; transferring the first data from the spill storage device to the second memory region; and causing the second ring buffer to comprise a fourth pointer indicating the address of the second memory region storing the first data.
- 13 . The method of claim 10 , wherein said determining the first memory region is an empty region comprises: accessing the first ring buffer to obtain the first pointer; and providing the first pointer to the first computing device.
- 14 . The method of claim 13 , wherein: the memory device is an RDMA memory device; said providing the first pointer to the first computing device causes the first computing device to generate a write instruction specifying the address of the first memory region as a target of the write instruction; and said writing the data to the first memory region comprises: receiving the write instruction from the first computing device, and utilizing a write operation to write the data to the first memory region.
- 15 . The method of claim 11 , further comprising: receiving, from a second computing device, a read request for reading the first data; determining, based on the second pointer, the address of the first memory region; reading the first data from the first memory region based on the address of the first memory region; and providing the first data to the second computing device.
- 16 . A remote direct memory access (RDMA) device comprising: a ring data-structure comprising: a first ring buffer comprising a first pointer indicating an address of a first memory region of a memory device, the first memory region storing first data, and a second ring buffer; and an RDMA network interface controller (NIC) that: receives, from a first computing device, a read request for reading the first data, receives the first pointer from the first ring buffer, reads the first data from the first memory region based on the address indicated by the first pointer, provides the first data to the first computing device, and updates the second ring buffer to comprise a second pointer indicating the address of the first memory region, the first memory region being empty.
- 17 . The RDMA device of claim 16 , wherein the first ring buffer comprises a third pointer indicating an address of a second memory region storing second data, and to receive the first pointer from the first ring buffer, the RDMA NIC further: receives the first and third pointer from the first ring buffer.
- 18 . The RDMA device of claim 17 , wherein RDMA NIC further: provides the first pointer and the third pointer to the first computing device; and receives, from the first computing device, a read instruction indicating the first data is to be read from the first memory region.
- 19 . The system of claim 1 , wherein the RDMA NIC further: determines a storage capacity of the plurality of memory regions satisfies a first storage criterion; transfers the first data from the first memory region to a spill storage device; and causes the first ring buffer to comprise a third pointer indicating the address of the first memory region, the first memory region being empty.
- 20 . The RDMA device of claim 16 , wherein the RDMA NIC further: receives, from a second computing device, a write request for writing second data to the memory device; determines, based on the second pointer, the address of the first memory region; writes the second data to the first memory region based on the address of the first memory region; and updates the first ring buffer to comprise a third pointer indicating the address of the first memory region storing the second data.
Description
BACKGROUND In computing systems, data transfers occur between different devices and/or services executed by devices. Transferring between devices or services (also referred to as “shuffling”) takes time and compute resources. These transfer operations can rely on communication patterns in some implementations. Furthermore, computing devices transferring data to another computing device can operate at different rates than the receiving device. This can lead to a bottleneck where the receiving device is unable to receive additional data and the providing device has to wait for available bandwidth to continue sending data to the receiving device. SUMMARY This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter. Embodiments described herein provide a ring data-structure for use in data operations. In an aspect, a device comprises a ring data-structure comprising a first ring buffer (also referred to as a “first ring” herein) and a second ring buffer (also referred to as a “second ring” herein). The first ring is configured to comprise pointers that indicate a respective address of a memory region that is empty. The second ring is configured to comprise pointers that indicate a respective address of a memory region that stores data. A network interface controller (NIC) receives a write request for writing data to a memory region. The NIC determines, based on a pointer of the first ring, an address of an empty memory region. The NIC causes data to be written to the empty memory region based on the address indicated by the pointer. The NIC updates the second ring to include the address of the memory region. In a further aspect, the NIC is a target-side NIC of the data operation (e.g., a NIC of a consumer device or an intermediary device). In another aspect, the NIC is an initiator-side NIC of the data operation (e.g., a NIC of a producer device). In a further aspect, the device comprising the ring data-structure is a remote direct memory access (RDMA) device and the data operation is an RDMA operation. In a further aspect, the data operation is a transmission control protocol (TCP) operation. In a further aspect, the NIC receives the write request from a computing device. To determine the address of the empty memory region, the NIC causes the computing device to determine the address of the empty memory region. In a further aspect, the NIC updates the first ring to no longer include the address of the memory region. In another aspect, the NIC receives a read request for reading data. The NIC determines an address of a memory region that stores the data and reads the data from the memory region based on the address. The NIC provides the data to the requesting computing device (also referred to as a “consumer device”). The NIC updates a first ring buffer to comprise a pointer indicating the address of the memory region the data was stored in. In another aspect, to determine the address of the memory region that stores the data, the NIC causes the consumer device to determine the address. In another aspect, the NIC determines a storage capacity of a memory device satisfies a first storage criterion. The NIC transfers data from the memory device to a spill storage. The NIC updates the first ring to include a pointer indicating an address of the memory region the data was transferred from. In a further aspect, the NIC determines the storage capacity of the memory device satisfies a second storage criterion. The NIC determines a memory region data stored in the spill storage is to be transferred to based on a pointer in the first ring. The NIC transfers the data stored in the spill storage to the memory region. The NIC updates the second ring to include a pointer to the memory region the data was transferred to. BRIEF DESCRIPTION OF THE DRAWINGS/FIGURES The accompanying drawings, which are incorporated herein and form a part of the specification, illustrate embodiments and, together with the description, further serve to explain the principles of the embodiments and to enable a person skilled in the pertinent art to make and use the embodiments. FIG. 1 shows a block diagram of an example system for transferring data utilizing a ring data-structure, in accordance with an example embodiment. FIG. 2 shows a block diagram of an example system comprising the computing device of FIG. 1, in accordance with an example embodiment. FIG. 3A shows a flowchart of a process for writing data utilizing a computing device with a ring data-structure, in accordance with an example embodiment. FIG. 3B shows a flowchart of a process for reading data utilizing a computing device with a ring data-structure, in accordance with an example embodiment. FIG. 4 shows a block diag