US-12619485-B1 - Initiating storage volume health tests via container orchestration systems
Abstract
Architectures and techniques are described that can leverage a container storage interface (CSI) driver to initiate volume health tests for a storage system in accordance with certain embodiments of this disclosure. The CSI driver can conform to a CSI specification or standard and can be used to expose the storage system to a container orchestration system. The volume health tests can be tailored to reduce or mitigate the impact of testing based on a status or state of the volumes. For instance, enforcement can be provided to ensure that only shorter, low-impact testing is performed on volumes having a published state, whereas longer, more intensive testing can be performed on volumes that are not in the published state.
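The enforcement rule summarized in the abstract can be sketched as a small lookup. This is a minimal illustration under stated assumptions, not the implementation of the disclosure: `VolumeState`, `SmartTest`, and `most_intensive_allowed` are hypothetical names, and the state names are borrowed from the wording of the claims.

```python
from enum import Enum


class VolumeState(Enum):
    # Hypothetical enumeration; state names follow the wording of the claims.
    CREATED = "created"
    NODE_READY = "node-ready"
    VOLUME_READY = "volume-ready"
    PUBLISHED = "published"


class SmartTest(Enum):
    SHORT = "short"  # shorter, low-impact test
    LONG = "long"    # longer, more intensive test


def most_intensive_allowed(state: VolumeState) -> SmartTest:
    """Return the most intensive test permitted for a volume in `state`."""
    # Published volumes are restricted to the shorter, low-impact test.
    if state is VolumeState.PUBLISHED:
        return SmartTest.SHORT
    # Volumes that are not published may receive the longer test.
    return SmartTest.LONG
```

For example, `most_intensive_allowed(VolumeState.PUBLISHED)` yields `SmartTest.SHORT`, while any other state in this sketch yields `SmartTest.LONG`.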
Inventors
- Mikhail BORISOV
- Danil Safronov
- Mikhail Edkov
Assignees
- DELL PRODUCTS, L.P.
Dates
- Publication Date
- 20260505
- Application Date
- 20241105
Claims (20)
- 1 . A device, comprising: at least one processor; and at least one memory that stores executable instructions that, when executed by the at least one processor, facilitate performance of operations, comprising: receiving, via a control plane of a container orchestration (CO) system, a health request that requests a health test of a storage system that is exposed to the CO system via a container storage interface (CSI) driver that conforms to a specified CSI standard, wherein the health request comprises: volume data that identifies a portion of the storage system to be tested; and mode data that identifies an operating mode of the health test; executing, using the CSI driver, the health test indicated by the health request; and returning, using the CSI driver, results of the health test to the control plane of the CO system, wherein the results of the health test represent testing results from a self-monitoring, analysis, and reporting technology (SMART) procedure implemented by storage devices of the storage system, and wherein the SMART procedure is implemented as a SMART short test in response to the mode data indicating a background operating mode, and a SMART long test in response to the mode data indicating a foreground operating mode.
- 2 . The device of claim 1 , wherein the mode data comprises an indication of at least one of: the background operating mode indicating that the health test is to operate as a background operation having a limited impact on resource utilization of the storage system relative to an unrestricted impact on the resource utilization of the storage system; or the foreground operating mode indicating that the health test is to operate as a foreground operation with the unrestricted impact on the resource utilization of the storage system.
- 3 . The device of claim 2 , wherein the operations further comprise determining a service status of a target volume of the storage system that is identified by the volume data, and wherein the health test executed using the CSI driver compares the mode data to the service status of the target volume and, in response to the mode data indicating the foreground operating mode and the service status of the target volume being determined to be an in-use state or a published state, rejecting the health request.
- 4 . The device of claim 3 , wherein the service status is at least one of: an idle state or the in-use state, wherein the idle state is determined to exist in response to the target volume being determined to be in a created state, a node-ready state, or a volume-ready state, and wherein the in-use state is determined to exist in response to the target volume being determined to be in the published state.
- 5 . The device of claim 2 , wherein the operations further comprise: determining a number of volumes identified by the volume data; and in response to the number of volumes being greater than a defined threshold and the mode data indicating the background operating mode, rejecting the health request.
- 6 . The device of claim 1 , wherein the operations further comprise generating response data indicative of a response to the health request, the response data comprising at least one of: a status field that indicates a status of the health test, a mode field that indicates the operating mode of the health test, a progress field that indicates an execution progress of the health test, or a completion field that indicates a time associated with a completion of the health test according to a type of completion indicated by the status field.
- 7 . The device of claim 6 , wherein the status field represents an indication that the health test is in at least one of: an in-progress state, a failed state, an aborted state, a completed state, or a rejected state.
- 8 . The device of claim 1 , wherein the control plane or the CSI driver comprises an interface structure, comprising an application programming interface (API) or a remote procedure call (RPC) interface, that is configured to process the health request, and wherein the interface structure is further configured to process: a request to identify volumes currently being tested; a request to abort the health test; a request to restart the health test; or a request to schedule the health test.
- 9 . The device of claim 8 , wherein the request to schedule the health test comprises the volume data identifying the portion of the storage system to be tested, a maximum number of volumes to be tested in parallel per storage class, test options data indicating configurable options for the health test, test interval data indicating intervals between testing and a timeout data indicating a threshold time in which to abort the test.
- 10 . The device of claim 1 , wherein the operations further comprise forwarding, using the CSI driver, the health test to the storage system and receiving, using the CSI driver, the results from the storage system.
- 11 . A method, comprising: receiving, by a device comprising at least one processor, a health request that requests a health test of a storage system that is exposed to a container orchestration (CO) system via a container storage interface (CSI) driver that conforms to a CSI standard; determining, by the device, that the health request comprises volume data that identifies at least one volume of the storage system and comprises mode data that identifies an operating mode of the health test, wherein the operating mode of the health test is subject to a service status of that at least one volume of the storage system; executing, by the device, the health test indicated by the health request; reporting, by the device, results of the health test to a control plane of the CO system; determining, by the device, the service status of the at least one volume of the storage system that is identified by the volume data, wherein the service status is at least one of an idle status or an in-use status; and in response to determining that any one of the at least one volume has the in-use status or any different volume on a same drive as the at least one volume has the in-use status rejecting, by the device, the health test in response to determining that the operating mode indicated by the mode data is a foreground operating mode, or rejecting, by the device, the health test in response to determining that the operating mode indicated by the mode data is a background operating mode and that a number of volumes being tested exceeds a defined threshold.
- 12 . A non-transitory computer-readable medium comprising instructions that, in response to execution, cause a system comprising at least one processor to perform operations, comprising: receiving, via a control plane of a container orchestration (CO) system, a health request that requests a health test of a storage system that is exposed to the CO system via a container storage interface (CSI) driver that conforms to a CSI standard, wherein the health request comprises: volume data that identifies a portion of the storage system to be tested; and mode data that identifies an operating mode of the health test; executing, by the CSI driver, the health test indicated by the health request; and returning, by the CSI driver, results of the health test to the control plane of the CO system, wherein the results of the health test represent testing results from a self-monitoring, analysis, and reporting technology (SMART) procedure implemented by storage devices of the storage system, and wherein the SMART procedure is implemented as a SMART short test in response to the mode data indicating a background operating mode, and a SMART long test in response to the mode data indicating a foreground operating mode.
- 13 . The non-transitory computer-readable medium of claim 12 , wherein the operations further comprise forwarding, by the CSI driver, the health test to the storage system and receiving, by the CSI driver, the results from the storage system.
- 14 . The non-transitory computer-readable medium of claim 12 , wherein the mode data comprises an indication of at least one of: the background operating mode indicating that the health test is to operate as a background operation having a limited impact on resource utilization of the storage system relative to an unrestricted impact on the resource utilization of the storage system; or the foreground operating mode indicating that the health test is to operate as a foreground operation with the unrestricted impact on the resource utilization of the storage system.
- 15 . The non-transitory computer-readable medium of claim 14 , wherein the operations further comprise determining a service status of a target volume of the storage system that is identified by the volume data, and wherein the health test executed using the CSI driver compares the mode data to the service status of the target volume and, in response to the mode data indicating the foreground operating mode and the service status of the target volume being determined to be an in-use state or a published state, rejecting the health request.
- 16 . The non-transitory computer-readable medium of claim 15 , wherein the service status is at least one of: an idle state or the in-use state, wherein the idle state is determined to exist in response to the target volume being determined to be in a created state, a node-ready state, or a volume-ready state, and wherein the in-use state is determined to exist in response to the target volume being determined to be in the published state.
- 17 . The non-transitory computer-readable medium of claim 14 , wherein the operations further comprise: determining a number of volumes identified by the volume data; and in response to the number of volumes being greater than a defined threshold and the mode data indicating the background operating mode, rejecting the health request.
- 18 . The non-transitory computer-readable medium of claim 12 , wherein the operations further comprise generating response data indicative of a response to the health request, the response data comprising at least one of: a status field that indicates a status of the health test, a mode field that indicates the operating mode of the health test, a progress field that indicates an execution progress of the health test, or a completion field that indicates a time associated with a completion of the health test according to a type of completion indicated by the status field.
- 19 . The non-transitory computer-readable medium of claim 18 , wherein the status field represents an indication that the health test is in at least one of: an in-progress state, a failed state, an aborted state, a completed state, or a rejected state.
- 20 . The non-transitory computer-readable medium of claim 12 , wherein the control plane or the CSI driver comprises an interface structure, comprising an application programming interface (API) or a remote procedure call (RPC) interface, that is configured to process the health request, and wherein the interface structure is further configured to process: a request to identify volumes currently being tested; a request to abort the health test; a request to restart the health test; or a request to schedule the health test.
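The request handling recited in the claims can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: every name (`HealthRequest`, `HealthResponse`, `handle`, `max_background_volumes`) is hypothetical, the default threshold of 4 is arbitrary, the `smart_test` field is added only for illustration, and treating an unknown volume as in use is an assumption of this sketch. It shows the two rejection rules (foreground mode on an in-use volume; background mode with more volumes than a defined threshold) and the mapping of background mode to a SMART short test and foreground mode to a SMART long test.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional


class Mode(Enum):
    BACKGROUND = "background"  # limited impact; maps to a SMART short test
    FOREGROUND = "foreground"  # unrestricted impact; maps to a SMART long test


class Status(Enum):
    IN_PROGRESS = "in-progress"
    FAILED = "failed"
    ABORTED = "aborted"
    COMPLETED = "completed"
    REJECTED = "rejected"


# Lifecycle states treated as in use; created, node-ready, and volume-ready
# are treated as idle.
IN_USE_STATES = {"published"}


@dataclass
class HealthRequest:
    volume_ids: List[str]  # volume data: the portion of storage to test
    mode: Mode             # mode data: operating mode of the health test


@dataclass
class HealthResponse:
    status: Status
    mode: Mode
    progress: int = 0                      # execution progress, in percent
    completion_time: Optional[str] = None  # set once the test finishes
    smart_test: Optional[str] = None       # hypothetical field, illustration only


def handle(request: HealthRequest,
           volume_states: Dict[str, str],
           max_background_volumes: int = 4) -> HealthResponse:
    """Apply the rejection rules, then select the SMART test type."""
    # Assumption: a volume with no known state is treated as in use.
    in_use = any(volume_states.get(v, "published") in IN_USE_STATES
                 for v in request.volume_ids)
    # Rule 1: no foreground (intensive) test on an in-use volume.
    if request.mode is Mode.FOREGROUND and in_use:
        return HealthResponse(Status.REJECTED, request.mode)
    # Rule 2: no background test over more volumes than the threshold.
    if (request.mode is Mode.BACKGROUND
            and len(request.volume_ids) > max_background_volumes):
        return HealthResponse(Status.REJECTED, request.mode)
    smart = "short" if request.mode is Mode.BACKGROUND else "long"
    return HealthResponse(Status.IN_PROGRESS, request.mode, smart_test=smart)
```

In this sketch, a foreground request against a published volume is rejected, while a background request against the same volume is accepted and mapped to a short test. A real driver would forward the accepted test to the storage system and later report the results to the control plane, which is omitted here.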
Description
BACKGROUND
Containerization is a lightweight virtualization technique that provides operating system distribution portability, efficient resource management, and consistency across multiple environments. Thus, applications or application programming interfaces (APIs) can be containerized to provide numerous benefits to service providers and their subscribers. Due to the many benefits of containerization, many container orchestration (CO) systems and associated products have entered the marketplace to help automate and orchestrate containerization. One such example product is Kubernetes. Kubernetes is an open-source software tool that can effectively manage containerized applications with reduced manual intervention.
Container Storage Interface (CSI) is a standard for exposing arbitrary block and file storage systems to containerized workloads on CO systems like Kubernetes or others. CSI enables storage vendors to develop a plugin once and have the plugin work across multiple CO systems, without requiring modifications to the core CO code.
BRIEF DESCRIPTION OF THE DRAWINGS
Numerous aspects, embodiments, objects, and advantages of the present embodiments will be apparent upon consideration of the following detailed description, taken in conjunction with the accompanying drawings, in which like reference characters refer to like parts throughout, and in which:
FIG. 1 shows a schematic block diagram illustrating an example container orchestration (CO) system using a container storage interface driver to interface to a storage system in accordance with certain embodiments of this disclosure;
FIG. 2 depicts a schematic diagram illustrating CSI volume lifecycle APIs in accordance with certain embodiments of this disclosure;
FIG. 3 depicts a schematic block diagram illustrating an example CO system that can leverage a CSI driver to initiate volume health tests for a storage system in accordance with certain embodiments of this disclosure;
FIG. 4 depicts a schematic block diagram illustrating an example process flow relating to a CO system initiating a volume health test for a storage system in accordance with certain embodiments of this disclosure;
FIG. 5A depicts a schematic block diagram illustrating various examples of volume data in accordance with certain embodiments of this disclosure;
FIG. 5B depicts a schematic block diagram illustrating an example format of response 337 in accordance with certain embodiments of this disclosure;
FIG. 6 depicts a schematic block diagram illustrating an example device that can leverage a CSI driver to initiate volume health tests for a storage system and enforce rules for the volume health test to be performed in accordance with certain embodiments of this disclosure;
FIG. 7 depicts a schematic block diagram illustrating additional elements or aspects of the example device that can leverage a CSI driver to initiate volume health tests for a storage system and enforce rules for the volume health test to be performed in accordance with certain embodiments of this disclosure;
FIG. 8 illustrates an example method that can leverage a CSI driver to initiate volume health testing for a storage system exposed to a CO system in accordance with certain embodiments of this disclosure;
FIG. 9 illustrates an example method that can provide additional aspects or elements relating to leveraging a CSI driver to initiate volume health testing for a storage system exposed to a CO system in accordance with certain embodiments of this disclosure;
FIG. 10 illustrates a block diagram of an example distributed file storage system that employs tiered cloud storage in accordance with certain embodiments of this disclosure; and
FIG. 11 illustrates an example block diagram of a computer operable to execute certain embodiments of this disclosure.
DETAILED DESCRIPTION
Overview
The disclosed subject matter is now described with reference to the drawings, wherein like reference numerals are used to refer to like elements throughout. In the following description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the disclosed subject matter. It may be evident, however, that the disclosed subject matter may be practiced without these specific details. In other instances, well-known structures and devices are shown in block diagram form in order to facilitate describing the disclosed subject matter.
To provide additional context, consider FIG. 1. FIG. 1 shows a schematic block diagram illustrating an example container orchestration (CO) system using a container storage interface driver to interface to a storage system in accordance with certain embodiments of this disclosure. For example, CO system 100 can represent a container orchestration platform cluster or the like. As a representative example used for the remainder of this document, CO system 100 is presented in the context (e.g., operation and nomenclature