EP-4449237-B1 - NEAR MEMORY PROCESSING MODULE FOR DISTRIBUTION OF ALLOCATION UNITS

Inventors

  • GOLANDER, AMIT
  • GOODMAN, DANIEL

Dates

Publication Date
20260513
Application Date
20220111

Claims (8)

  1. A near memory processing, NMP, module (202A-N, 302A-N, 406A-N) for a centralised logging storage array (204, 304) which includes a plurality of servers (206A-N, 306A-N) configured to host a plurality of allocation units (208A-N, 308A-N, 404A-N, 502A-N, 604A-N) assigned to a logical volume, where the NMP module (202A-N, 302A-N, 406A-N) is associated with the plurality of allocation units (208A-N, 308A-N, 404A-N, 502A-N, 604A-N); wherein the NMP module (202A-N, 302A-N, 406A-N) is configured to return a current state of the associated allocation unit in response to one or more commands received from a plurality of client devices (402A-N, 602A-N), where the state of the allocation unit includes a mode of the allocation unit which is one of a set of modes comprising: ACTIVE, UNINITIALISED and SEALED; wherein, when the current mode of the allocation unit is UNINITIALISED, the NMP module (202A-N, 302A-N, 406A-N) is configured to change the mode of the associated allocation unit to ACTIVE in response to receiving an append command from a client device (310) including a current state of a previous allocation unit, which includes the SEALED mode; wherein, when the current mode of the allocation unit is ACTIVE and the NMP module (202A-N, 302A-N, 406A-N) determines that the allocation unit is full, the NMP module (202A-N, 302A-N, 406A-N) is configured to change the mode of the allocation unit to SEALED, and reject any incoming append commands, wherein changing the mode of the associated allocation unit from UNINITIALISED to ACTIVE further comprises setting a log offset of the associated allocation unit based on a log offset and a last valid byte of the previous allocation unit included in the current state of the previous allocation unit including the SEALED mode received with the append command from the client device (310), wherein each unit represents a segment of the log, and a final size of each of the allocation units is dynamic, wherein a first allocation unit begins at a beginning of the log, and a next allocation unit logically starts from a next byte after the size of the previous allocation unit, wherein once the first allocation unit is sealed, a final size is determined, enabling the next allocation unit to begin writing, and wherein a size of the allocation units is recorded to enable the next allocation unit to know its logical offset.
  2. The NMP module (202A-N, 302A-N, 406A-N) of claim 1, further configured to: determine if a remaining storage capacity of the associated allocation unit is below a first threshold, change the mode of the allocation unit to SEALING and reject any further incoming append commands, and change the mode of the allocation unit to SEALED once any pending append actions have been completed.
  3. The NMP module (202A-N, 302A-N, 406A-N) of claim 2, further configured to: determine if a remaining storage capacity of the associated allocation unit is below a second threshold larger than the first threshold, change the mode of the current allocation unit to FULL, and reject any further incoming append commands associated with a payload which is larger than a predetermined payload size threshold.
  4. The NMP module (202A-N, 302A-N, 406A-N) of any preceding claim, further configured to return the current state of the associated allocation unit in response to a getState command received from the plurality of client devices (402A-N, 602A-N).
  5. A method of operating a centralised logging storage array (204, 304) which includes a plurality of servers (206A-N, 306A-N) configured to host a plurality of allocation units (208A-N, 308A-N, 404A-N, 502A-N, 604A-N) assigned to a logical volume, and a near memory processing, NMP, module (202A-N, 302A-N, 406A-N) associated with the plurality of allocation units (208A-N, 308A-N, 404A-N, 502A-N, 604A-N), the method comprising: returning, by the NMP module (202A-N, 302A-N, 406A-N) in response to one or more commands received from a plurality of client devices (402A-N, 602A-N), a current state of the associated allocation unit, where the state of the allocation unit includes a mode of the allocation unit which is one of a set of modes comprising: ACTIVE, UNINITIALISED and SEALED; changing, by the NMP module (202A-N, 302A-N, 406A-N) when the current mode of the allocation unit is UNINITIALISED, the mode of the associated allocation unit to ACTIVE in response to receiving an append command from a client device (310) including a current state of a previous allocation unit, which includes the SEALED mode; and changing, by the NMP module (202A-N, 302A-N, 406A-N) when the current mode of the allocation unit is ACTIVE and the NMP module (202A-N, 302A-N, 406A-N) determines that the allocation unit is full, the mode of the allocation unit to SEALED, and rejecting any incoming append commands, wherein changing the mode of the associated allocation unit from UNINITIALISED to ACTIVE further comprises setting a log offset of the associated allocation unit based on a log offset and a last valid byte of the previous allocation unit included in the current state of the previous allocation unit including the SEALED mode received with the append command from the client device (310), wherein each unit represents a segment of the log, and a final size of each of the allocation units is dynamic, wherein a first allocation unit begins at a beginning of the log, and a next allocation unit logically starts from a next byte after the size of the previous allocation unit, wherein once the first allocation unit is sealed, a final size is determined, enabling the next allocation unit to begin writing, and wherein a size of the allocation units is recorded to enable the next allocation unit to know its logical offset.
  6. The method of claim 5, further comprising: determining, by the NMP module (202A-N, 302A-N, 406A-N), if a remaining storage capacity of the associated allocation unit is below a first threshold, changing, by the NMP module (202A-N, 302A-N, 406A-N), the mode of the allocation unit to SEALING and rejecting any further incoming append commands, and changing, by the NMP module (202A-N, 302A-N, 406A-N), the mode of the allocation unit to SEALED once any pending append actions have been completed.
  7. The method of claim 6, further comprising: determining, by the NMP module (202A-N, 302A-N, 406A-N), if a remaining storage capacity of the associated allocation unit is below a second threshold larger than the first threshold, changing, by the NMP module (202A-N, 302A-N, 406A-N), the mode of the current allocation unit to FULL, and rejecting, by the NMP module (202A-N, 302A-N, 406A-N), any further incoming append commands associated with a payload which is larger than a predetermined payload size threshold.
  8. The method of any one of claims 5 to 7, further comprising returning, by the NMP module (202A-N, 302A-N, 406A-N), the current state of the associated allocation unit in response to a getState command received from the plurality of client devices (402A-N, 602A-N).
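
The mode transitions recited in the claims above can be illustrated with a minimal Python sketch. It is not the claimed implementation. The class and parameter names (`AllocationUnit`, `first_threshold`, `second_threshold`, `small_payload`, `first_in_log`) are hypothetical, the first unit of the log is assumed to start in ACTIVE mode at offset 0, the "last valid byte" is assumed to be a count of valid bytes, and appends are assumed to complete synchronously, so SEALING passes straight to SEALED.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Mode(Enum):
    UNINITIALISED = 1
    ACTIVE = 2
    FULL = 3
    SEALING = 4
    SEALED = 5


@dataclass(frozen=True)
class State:
    mode: Mode
    log_offset: int       # logical log address of the unit's first byte
    last_valid_byte: int  # assumption: number of valid bytes in the unit


class AllocationUnit:
    """Hypothetical model of one allocation unit managed by an NMP module."""

    def __init__(self, capacity, first_threshold, second_threshold,
                 small_payload, first_in_log=False):
        assert first_threshold < second_threshold <= capacity
        self.capacity = capacity
        self.first_threshold = first_threshold    # below this: seal
        self.second_threshold = second_threshold  # below this: FULL
        self.small_payload = small_payload        # largest payload accepted in FULL
        # Assumption: the first unit of the log starts ACTIVE at offset 0.
        self.mode = Mode.ACTIVE if first_in_log else Mode.UNINITIALISED
        self.log_offset = 0
        self.used = 0

    def get_state(self) -> State:
        return State(self.mode, self.log_offset, self.used)

    def append(self, payload: bytes, prev: Optional[State] = None) -> Optional[int]:
        """Return the log address of the appended record, or None if rejected."""
        if self.mode is Mode.UNINITIALISED:
            # Activation needs the SEALED state of the previous unit.
            if prev is None or prev.mode is not Mode.SEALED:
                return None
            self.log_offset = prev.log_offset + prev.last_valid_byte
            self.mode = Mode.ACTIVE
        if self.mode in (Mode.SEALING, Mode.SEALED):
            return None
        if self.mode is Mode.FULL and len(payload) > self.small_payload:
            return None
        if len(payload) > self.capacity - self.used:
            self._seal()  # the unit is treated as full
            return None
        address = self.log_offset + self.used
        self.used += len(payload)
        remaining = self.capacity - self.used
        if remaining < self.first_threshold:
            self._seal()
        elif remaining < self.second_threshold:
            self.mode = Mode.FULL
        return address

    def _seal(self):
        self.mode = Mode.SEALING  # no pending appends in this synchronous sketch
        self.mode = Mode.SEALED
```

In this sketch a client that receives a rejection from a SEALED unit would call `get_state()` on it and pass the result as `prev` with its next append to the following unit, which is what lets that unit derive its own log offset.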

Description

TECHNICAL FIELD

The disclosure relates generally to a near memory processing, NMP, module for a centralised logging storage array, and more particularly, the disclosure relates to a client device arranged to communicate with the centralised logging storage array. Moreover, the disclosure relates to a method of operating the centralised logging storage array.

BACKGROUND

FIG. 1 illustrates an architecture 100 for continuous multi-writer logging in accordance with the prior art. The architecture 100 includes one or more client devices 102A-N, an RDMA fabric 104, and a storage array 106. The one or more client devices 102A-N include a first client device 102A, a second client device 102B and an Nth client device 102N, each including an application with commands including pmWAL and pmAddr. The RDMA fabric 104 allows data transfers with much less overhead and a faster response time owing to lower latency. The storage array 106 includes one or more data structures 108A-N, each including a plugin and an allocation unit. The architecture 100 segments the log into 1 GB allocation units (i.e. logic is split between a client element and the allocation unit plugin). One or more plugins 110A-N are able to write to one or more allocation units 112A-N.

Logging applications served by the one or more plugins 110A-N, such as database write-ahead logs (WAL), require that each append to a log return an address of the appended record and, critically, that all data up to that address is valid and continuous, with no holes or torn data. Success responses from the one or more plugins 110A-N are sequenced with specific addresses. Further, the architecture 100 is based on programmable memory and the RDMA fabric 104, which provides latency performance with a quality of service (QoS) 10x that of current solutions for specific Input/Output (IO) sizes. Even while maintaining this latency and QoS, usage of these logs requires fail-fast behaviour, i.e. an append operation that fails must fail as quickly as possible.
The response latency is critical to application performance, regardless of whether an append succeeds or fails. The architecture 100 may continue to write valid data to the log, even if the client or the application has failed, i.e. optimistic writing of data to the log is allowed as long as the log is valid.

Each allocation unit represents a segment of the log, and a final size of each of the allocation units is dynamic as append payload size is not fixed. The first allocation unit 112A begins at the beginning of the log, and a next allocation unit 112N will logically start from a next byte after the size of the previous allocation unit (i.e. the first allocation unit 112A). Once the first allocation unit 112A is sealed, the final size can be determined, which enables the next allocation unit to begin writing. The size of the allocation units must be recorded in order to enable the next allocation unit to know its logical offset.

Switching to the next allocation unit raises several problems, including deciding when and by whom the next allocation unit is allocated when there is insufficient remaining space in the previous allocation unit, and ensuring that all IOs to the current allocation unit have either completed or failed before writing to the next allocation unit begins. As the allocation unit represents a logical segment of the log, the next allocation unit cannot begin to service appends until the final size of the previous allocation unit is known, as this is required for the allocation unit to know its logical offset within the log. Moreover, a single client cannot reliably be used to send data to the next allocation unit, as the client may never return or a new client may appear at any time.

Therefore, there arises a need to address the aforementioned technical problems and drawbacks by developing a solution that enables filling of allocation units and automatic allocation of the next allocation unit with fast and efficient service in a storage array.
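
The offset bookkeeping described above, in which each allocation unit starts at the byte after the final size of the previous one, can be made concrete with a short Python sketch. It is only a worked example under assumptions: the function names `logical_offsets` and `locate` and the sizes used below are hypothetical, and the final size of a sealed unit is taken to be its count of valid bytes.

```python
from bisect import bisect_right
from itertools import accumulate


def logical_offsets(sealed_sizes):
    """Starting log offset of each sealed unit, plus that of the next unit.

    The first unit starts at 0; every later unit starts where the
    previous one ended, so the offsets are running sums of the sizes.
    """
    return list(accumulate(sealed_sizes, initial=0))


def locate(offsets, address):
    """Map a log address to (unit index, byte position inside that unit)."""
    if not 0 <= address < offsets[-1]:
        raise ValueError("address is outside the sealed part of the log")
    unit = bisect_right(offsets, address) - 1
    return unit, address - offsets[unit]


# Three sealed units with different final sizes (illustrative values).
offsets = logical_offsets([1000, 750, 1200])
print(offsets)                # [0, 1000, 1750, 2950]
print(locate(offsets, 1000))  # (1, 0): first byte of the second unit
```

The last entry of `offsets` is the logical offset at which a fourth unit would start, which is why that unit cannot serve appends until the third unit has been sealed and its final size is known.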
The document US 2019/0007206 A1 shows a system and method for encrypting object index in a distributed storage environment.

SUMMARY

The present invention is defined by the independent claims. Further advantageous developments are shown by the dependent claims.

It is an object of the disclosure to provide a near memory processing, NMP, module for a centralised logging storage array, a client device arranged to communicate with the centralised logging storage array, and a method of operating the centralised logging storage array, that provides distribution of allocation units by filling of allocation units and automatically allocating the next allocation unit with fast and efficient service in a storage array while avoiding one or more disadvantages of prior art approaches.

This object is achieved by the features of the independent claims. Further, implementation forms are apparent from the dependent claims, the description, and the figures.

The disclosure provides a near memory processing, NMP, module for a centralised logging storage array,