US-20260127137-A1 - TECHNIQUE FOR CREATING AN IN-MEMORY COMPACT STATE OF SNAPSHOT METADATA


Abstract

A technique creates a compact state of snapshot metadata and associated selected snapshots that are frequently used and maintained in memory of a node of a cluster to facilitate processing of workflow operations associated with a logical entity in a disaster recovery (DR) environment. The compact state represents a minimal subset of snapshot metadata that is frequently used to perform the DR workflow operations. In addition, metadata associated with the progress of the DR workflow operations processed by the node is periodically consolidated within the compact state. Illustratively, the selected frequently used snapshots of the logical entity include (i) a recently created snapshot; (ii) one or more reference snapshots; (iii) a snapshot scheduled for replication; and (iv) any snapshot that is queued for a current or future-scheduled operation. The technique is also directed to a snapshot and metadata eviction policy that is configured to evict infrequently used snapshots and snapshot metadata to reduce memory consumption.
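
The four retention categories listed in the abstract can be read as a simple filter over the snapshots of a logical entity. The following Python is a minimal sketch under stated assumptions, not the implementation described in this application: the `Snapshot` fields and the functions `select_retained` and `select_evictable` are hypothetical names introduced here for illustration only.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Snapshot:
    """Hypothetical in-memory record for one snapshot of a logical entity."""
    snap_id: str
    created_at: float                        # creation timestamp, in seconds
    is_reference: bool = False               # (ii) reference snapshot
    scheduled_for_replication: bool = False  # (iii) scheduled for replication
    queued_for_operation: bool = False       # (iv) queued for a current or future operation


def select_retained(snapshots: List[Snapshot]) -> List[Snapshot]:
    """Return the snapshots that categories (i) through (iv) keep in memory."""
    if not snapshots:
        return []
    # (i) the most recently created snapshot is always retained
    latest = max(snapshots, key=lambda s: s.created_at)
    return [
        s for s in snapshots
        if s is latest
        or s.is_reference
        or s.scheduled_for_replication
        or s.queued_for_operation
    ]


def select_evictable(snapshots: List[Snapshot]) -> List[Snapshot]:
    """Return the snapshots that none of the four categories protect."""
    retained_ids = {s.snap_id for s in select_retained(snapshots)}
    return [s for s in snapshots if s.snap_id not in retained_ids]
```

In this sketch a snapshot becomes a candidate for eviction only when no category applies to it, so an older snapshot with no pending use is the first to leave memory.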

Inventors

Abhishek Gupta
Freddy James
Pranab Patnaik
Ranjan MN

Assignees

Nutanix, Inc.

Dates

Publication Date: 20260507
Application Date: 20241107
Priority Date: 20210508

Claims (20)

1. A method comprising: creating a compact state of snapshots and associated metadata used to process workflow operations on a computer node, wherein the snapshots include a reference snapshot and one or more incremental snapshots used for replication, and wherein the metadata includes timestamps used for time-ordered scans of the snapshots; and retaining the compact state of the snapshots and the metadata in a memory of the computer node based on a status in a workflow hierarchy as represented by a DR state indicating progress of the workflow operations performed using the snapshots, wherein the compact state of snapshots and metadata is reduced to an amount for performing the workflow operations using the snapshots with additional metadata accessible via on-demand paging from storage devices of a backing store of the computer node.
2. The method of claim 1 wherein the compact state represents a reduced subset of the metadata in accordance with actual or expected performance of the workflow operations.
3. The method of claim 1 wherein the metadata associated with progress of the workflow operations processed on the computer node is periodically consolidated within the compact state.
4. The method of claim 1 further comprising dynamically allocating memory for the additional metadata needed to perform the workflow operations.
5. The method of claim 1 further comprising evicting the additional metadata from the memory to the backing store once the workflow operations are completed.
6. The method of claim 1 wherein the snapshots further include a snapshot scheduled for replication and a snapshot queued for a current or future-scheduled operation.
7. The method of claim 1 wherein the status is defined by eviction rules and their reference to current or future application operations.
8. The method of claim 1 wherein the state includes meta-information indicating a current or future progress of the workflow operations and the snapshots needed to process the workflow operations.
9. A non-transitory computer readable medium including program instructions for execution on a processor of a node, the program instructions configured to: create a compact state of snapshots and associated metadata used to process workflow operations on the node, wherein the snapshots include a reference snapshot and one or more incremental snapshots used for replication, and wherein the metadata includes timestamps used for time-ordered scans of the snapshots; and retain the compact state of the snapshots and the metadata in a memory of the node based on a status in a workflow hierarchy as represented by a state indicating progress of the workflow operations performed using the snapshots, wherein the compact state of snapshots and metadata is reduced to an amount for performing the workflow operations using the snapshots with additional metadata accessible via on-demand paging from storage devices of a backing store of the node.
10. The non-transitory computer readable medium of claim 9 wherein the compact state represents a reduced subset of the metadata in accordance with actual or expected performance of the workflow operations.
11. The non-transitory computer readable medium of claim 9 wherein the metadata associated with progress of the workflow operations processed on the node is periodically consolidated within the compact state.
12. The non-transitory computer readable medium of claim 9 wherein the program instructions are further configured to dynamically allocate memory for the additional metadata needed to perform the workflow operations.
13. The non-transitory computer readable medium of claim 9 wherein the program instructions are further configured to evict the additional metadata from the memory to the backing store once the workflow operations are completed.
14. The non-transitory computer readable medium of claim 9 wherein the snapshots further include a snapshot scheduled for replication and a snapshot queued for a current or future-scheduled operation.
15. The non-transitory computer readable medium of claim 9 wherein the status is defined by eviction rules and their reference to current or future application operations.
16. The non-transitory computer readable medium of claim 9 wherein the state includes meta-information indicating a current or future progress of the workflow operations and the snapshots needed to process the workflow operations.
17. An apparatus comprising: a node having a memory and a processor configured to execute program instructions to: create a compact state of snapshots and associated metadata used to process workflow operations on the node, wherein the snapshots include a reference snapshot and one or more incremental snapshots used for replication, and wherein the metadata includes timestamps used for time-ordered scans of the snapshots; and retain the compact state of the snapshots and the metadata in a memory of the node based on a status in a workflow hierarchy as represented by a state indicating progress of the workflow operations performed using the snapshots, wherein the compact state of snapshots and metadata is reduced to an amount for performing the workflow operations using the snapshots with additional metadata accessible via on-demand paging from storage devices of a backing store of the node.
18. The apparatus of claim 17, wherein the compact state represents a reduced subset of the metadata in accordance with actual or expected performance of the workflow operations.
19. The apparatus of claim 17, wherein the status is defined by eviction rules and their reference to current or future application operations.
20. The apparatus of claim 17, wherein the state includes meta-information indicating a current or future progress of the workflow operations and the snapshots needed to process the workflow operations.

Description

BACKGROUND

CROSS-REFERENCE TO RELATED APPLICATIONS

The present application is a continuation of U.S. patent application Ser. No. 17/376,822, entitled TECHNIQUE FOR CREATING AN IN-MEMORY COMPACT STATE OF SNAPSHOT METADATA, filed on Jul. 15, 2021 by Abhishek Gupta et al., which claims the benefit of India Provisional Patent Application Serial No. 202141020922, which was filed on May 8, 2021, by Abhishek Gupta, et al. for TECHNIQUE FOR CREATING AN IN-MEMORY COMPACT STATE OF SNAPSHOT METADATA, which are hereby incorporated by reference.

TECHNICAL FIELD

The present disclosure relates to snapshots and, more specifically, to use of snapshots and snapshot metadata to facilitate processing of workflow operations in a disaster recovery (DR) environment.

Background Information

Data failover generally involves copying or replicating data of workloads among one or more nodes of clusters embodied as, e.g., datacenters to enable continued data processing operations in a multi-site data replication environment, such as disaster recovery (DR). Such data replication may involve a large number of point-in-time images or “snapshots” of workloads that include data of the snapshot (e.g., a virtual disk exported to a virtual machine (VM)) as well as snapshot metadata. However, not all snapshots and snapshot metadata may be needed for all snapshot operations, as many snapshots are created to support arbitrary roll-back, which is rarely used. Yet all of the snapshot metadata associated with each snapshot is typically maintained in memory even if some of the snapshots and metadata are infrequently used. Maintenance of infrequently used snapshots and snapshot metadata needlessly increases consumption of resources such as memory (i.e., the memory footprint).

BRIEF DESCRIPTION OF THE DRAWINGS

The above and further advantages of the embodiments herein may be better understood by referring to the following description in conjunction with the accompanying drawings in which like reference numerals indicate identically or functionally similar elements, of which:

FIG. 1 is a block diagram of a plurality of nodes interconnected as a cluster in a virtualized environment;
FIG. 2 is a block diagram of a virtualization architecture executing on a node to implement the virtualization environment;
FIG. 3 is a block diagram of a controller virtual machine of the virtualization architecture;
FIG. 4 is a block diagram of an exemplary multi-site data replication environment configured for use in various deployments such as for disaster recovery (DR); and
FIG. 5 is a block diagram illustrating a technique for creating a compact state of snapshots and associated metadata that is maintained in memory of a node of the cluster to facilitate processing of workflow operations in a DR environment.

OVERVIEW

The embodiments described herein are directed to a technique for creating a compact state of snapshot metadata and associated selected snapshots that are frequently used (or expected to be frequently used) and thus maintained in memory of a node of a cluster to facilitate processing of workflow operations associated with a logical entity, such as a virtual machine, in a disaster recovery (DR) environment. The compact state represents a reduced (e.g., minimal) subset of snapshot metadata in accordance with actual or expected performance of operations, such as frequently used DR workflow operations. In addition, metadata associated with the progress of the DR workflow operations processed by the node is periodically consolidated within the compact state.

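
The compact state described above, together with the on-demand paging and post-workflow eviction recited in the claims, can be pictured as a small cache in front of a backing store. The Python below is a minimal sketch under stated assumptions rather than the implementation of this application: the class `MetadataCache`, its methods, and the use of a plain dictionary to stand in for the storage devices of the backing store are all hypothetical.

```python
from typing import Dict, Set


class MetadataCache:
    """Hypothetical compact state: a resident subset plus on-demand paging."""

    def __init__(self, backing_store: Dict[str, dict], resident_ids: Set[str]):
        # The dictionary stands in for storage devices of the backing store.
        self._store = backing_store
        # Compact state: metadata of the selected snapshots, always in memory.
        self._resident = {sid: backing_store[sid] for sid in resident_ids}
        # Additional metadata paged in only while a workflow needs it.
        self._paged: Dict[str, dict] = {}
        # Progress of workflow operations, consolidated within the compact state.
        self.progress: Dict[str, str] = {}

    def get(self, snap_id: str) -> dict:
        """Return metadata, paging it in from the backing store if needed."""
        if snap_id in self._resident:
            return self._resident[snap_id]
        if snap_id not in self._paged:
            self._paged[snap_id] = self._store[snap_id]  # on-demand page-in
        return self._paged[snap_id]

    def record_progress(self, operation: str, status: str) -> None:
        """Record the current status of a workflow operation."""
        self.progress[operation] = status

    def complete_workflow(self, operation: str) -> None:
        """Mark the operation done and evict the additional metadata."""
        self.progress[operation] = "done"
        # Metadata is never modified in this sketch, so dropping the paged-in
        # copies is sufficient; the backing store still holds them.
        self._paged.clear()

    def in_memory_ids(self) -> Set[str]:
        """Return the ids of all snapshots whose metadata is in memory."""
        return set(self._resident) | set(self._paged)
```

The design choice illustrated here is that the memory footprint returns to the size of the compact state as soon as a workflow completes, while rarely used metadata remains reachable at the cost of a read from the backing store.
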
Illustratively, the selected, frequently used snapshots of the logical entity (usually associated with DR of the logical entity) include (i) a recently created (latest) snapshot; (ii) one or more reference snapshots; (iii) a snapshot scheduled for replication; and (iv) any snapshot that is queued for a current or future-scheduled operation. The technique is also directed to a snapshot and metadata eviction policy that is configured to evict infrequently used snapshots and snapshot metadata to reduce memory consumption (i.e., the memory footprint). Eviction rules of the eviction policy are applied to the snapshots of the logical entity to ensure that the selected snapshots are not evicted from (i.e., are retained in) memory. In essence, the eviction policy retains snapshots essential for expected near-term use (e.g., based on a time threshold) and for DR operations (e.g., snapshot replication to other sites). As such, the eviction policy is application aware (e.g., aware of DR workflow processing) and predictive of application object use.

DESCRIPTION

FIG. 1 is a block diagram of a plurality of nodes 110 interconnected as a cluster 100 and configured to provide compute and storage services for information, i.e., data and metadata, stored on storage devices of a virtualization environment. Each node 110 is illustratively embodied as a physical computer having hardware resources, such as one or more processor