EP-4736019-A1 - PERFORMING SNAPSHOTTING OPERATIONS USING DATA STRUCTURES STORING WEAK REFERENCES

EP4736019A1EP 4736019 A1EP4736019 A1EP 4736019A1EP-4736019-A1

Abstract

A processing system manages an ordered set of hierarchical data structures representing snapshots that define the state of a filesystem at different points in time. The data structures include nodes that store strong references, which can be utilized to read or write data, or weak references that can be only utilized to read data. In order to create a new snapshot, a root node is created for a new data structure representing the new snapshot by copying the root node of the data structure representing an active view of the filesystem. Any strong references stored in the copy of the root node are then changed to weak references. Snapshotting operations can then be performed with respect to the new snapshot utilizing the weak references stored in the root node of the new data structure.

Inventors

SINGH, Neeraj Kumar

Assignees

Microsoft Technology Licensing, LLC

Dates

Publication Date: 20260506
Application Date: 20240612

Claims (15)

1. A computer-implemented method, comprising: storing a first hierarchical data structure defining an active view of a filesystem, the first hierarchical data structure comprising a first root node storing a strong reference to a first internal node in the first hierarchical data structure; receiving a request to create a snapshot of the filesystem defined by the first hierarchical data structure; creating a second hierarchical data structure defining a new active view of the filesystem, the second hierarchical data structure comprising a second root node created by copying the first root node; and changing the strong reference to the first internal node in the first hierarchical data structure in the second root node to a weak reference to the first internal node in the first hierarchical data structure.
2. The computer-implemented method of claim 1, further comprising performing a search operation with respect to the active view of the filesystem by following the weak reference stored by the second root node of the second hierarchical data structure to the first internal node in the first hierarchical data structure.
3. The computer-implemented method of claim 1, further comprising: receiving a request to modify the second hierarchical data structure defining the new active view of the filesystem; traversing layers of the second hierarchical data structure, at each of the layers of the second hierarchical data structure, performing a copy- on-write operation to create a copy of a node from the first hierarchical data structure in the second hierarchical data structure, modifying weak references and strong references in nodes of the second hierarchical data structure to reference the first hierarchical data structure or the second hierarchical data structure, and modifying a record in a leaf node of the second hierarchical data structure according to the request.
4. The computer-implemented method of claim 1, further comprising: responsive to receiving the request to create the snapshot of the filesystem defined by the first hierarchical data structure, identifying an internal node in the first hierarchical data structure that is dirty 7 , and changing a strong reference in the second root node to the first internal node of the first hierarchical data structure that is dirty to a dirty weak reference.
5. The computer-implemented method of claim 4, further comprising: receiving a request to persist the snapshot of the filesystem defined by the second hierarchical data structure; and responsive to the request to persist the snapshot of the filesystem defined by the second hierarchical data structure, utilizing the dirty weak reference to obtain checksums for nodes in the first hierarchical data structure, using the checksums to compute and store new checksums for nodes in the second hierarchical data structure, and persisting the second hierarchical data structure to mass storage.
6. A computer-readable storage medium having computer-executable instructions stored thereupon which, when executed by a processing system, cause the processing system to: receive a request to create a snapshot of filesystem represented by a first B+ tree, the first B+ tree comprising a first root node storing a strong reference to a first internal node in the first B+ tree; and responsive to receiving the request to create the snapshot of the filesystem represented by the first B+ tree, create a second B+ tree representing an active view of the filesystem, the second B+ tree comprising a second root node produced by creating a copy of the first root node, and change the strong reference in the second root node to a weak reference to the first internal node in the first B+ tree.
7. The computer-readable storage medium of claim 6, having further computerexecutable instructions stored thereupon to: perform a search operation with respect to the active view of the filesystem by following the weak reference stored by the second root node of the second B+ tree to the first internal node in the first B+ tree.
8. The computer-readable storage medium of claim 6, having further computerexecutable instructions stored thereupon to: receive a request to modify the second B+ tree defining the active view of the filesystem; and responsive to the request to modify the second B+ tree, traverse layers of the second B+ tree, at each of the layers of the second B+ tree, perform a copy-on-write operation to create a copy of a node from the first B+ tree in the second B+ tree. modify weak references and strong references in nodes of the second B+ tree to reference the first B+ tree or the second B+ tree, and modify 7 a record in a leaf node of the second B+ tree according to the request.
9. The computer-readable storage medium of claim 6, having further computerexecutable instructions stored thereupon to: identify 7 a first internal node in the first B+ tree that is dirty; and change a strong reference in the second root node to the first internal node of the first Entree that is dirty 7 to a dirty 7 weak reference.
10. The computer-readable storage medium of claim 6. wherein the filesystem comprises data defining a mapping betyveen virtual cluster numbers (VCNs) and logical cluster numbers (LCNs) for a file.
11. A processing system, comprising: a processor; and a computer-readable storage medium having computer-executable instructions stored thereupon which, yvhen executed, cause the processing system to: receive a request to create a snapshot of filesystem represented by a first B+ tree, the first B+ tree comprising a first root node storing a strong reference to a first internal node in the first B+ tree; and responsive to receiving the request to create the snapshot of the filesystem represented by the first B+ tree, create a second B+ tree representing an active view of the filesystem, the second B+ tree comprising a second root node produced by creating a copy of the first root node, and change the strong reference in the second root node to a weak reference to the first internal node in the first B+ tree.
12. The processing system of claim 11, wherein the computer-readable storage medium has further computer-executable instructions stored thereupon to: perform a search operation with respect to the active view of the filesystem by following the weak reference stored by the second root node of the second B+ tree to the first internal node in the first B+ tree.
13. The processing system of claim 11, wherein the computer-readable storage medium has further computer-executable instructions stored thereupon to: receive a request to modify 7 the second B+ tree defining the active view of the filesystem; and responsive to the request to modify the second B+ tree, traverse layers of the second B+ tree, at each of the layers of the second B+ tree, perform a copy-on-\\rite operation to create a copy of a node from the first B+ tree in the second B+ tree, modify weak references and strong references in nodes of the second B+ tree to reference the first B+ tree or the second B+ tree, and modify a record in a leaf node of the second B+ tree according to the request.
14. The processing system of claim 11, wherein the computer-readable storage medium has further computer-executable instructions stored thereupon to: receive a request to delete the snapshot of the filesystem represented by the first B+ tree; and responsive to the request to delete the snapshot of the filesystem represented by the first B+ tree, identify leaf nodes in the first B+ tree having an associated strong reference from another node in the first B+ tree and a corresponding leaf node in the second B+ tree having an associated weak reference from another node in the second B+ tree, store data referencing the identified leaf nodes, and utilize the stored data to determine when the identified leaf nodes can be deleted.
15. The processing system of claim 11. wherein the filesystem comprises data defining a mapping between virtual cluster numbers (VCNs) and logical cluster numbers (LCNs) for a file.

Description

PERFORMING SNAPSHOTTING OPERATIONS USING DATA STRUCTURES STORING WEAK REFERENCES BACKGROUND [0001] Many types of processing systems support snapshotting functionality that enable data, such as files, filesystems, volumes, pools, and database tables, to be rolled back to a prior state or to be read as a historical version. Using this functionality, a read-only snapshot instance can be created that captures a consistent version of the data at a point in time. As the active view, or views, of the data changes over time, the data for a snapshot instance remains readable by explicit request. [0002] Some filesystems implement aspects of snapshotting functionality using hierarchical reference counting of data and metadata. Using hierarchical reference counting and copy-on-write (“COW”), a hierarchical (e.g., tree-shaped) data structure is utilized to organize the data and metadata stored by the filesystem. Additional metadata is also maintained that counts the number of in-use references to a particular portion (e.g., a block or a cluster) of a storage device. So long as the count is non-zero, the associated portion of the storage device is considered to be in use by the filesystem. If the count becomes zero, the portion is no longer considered to be in use by the filesystem and can be freed. [0003] When a snapshot instance is created, a new tree root is created that references any existing data, which requires incrementing the reference count. Further modification of the active view requires hierarchically copying nodes of the tree that lead to the data being modified. Also, unmodified portions of the tree pointed to by copied nodes must have their reference counts incremented, since copying a node creates new incoming references. Hierarchical reference counting in this manner can impose significant processing overhead when updating filesystem metadata. [0004] Some processing systems provide snapshotting functionality without the use of hierarchical reference counting by utilizing hierarchical data structures corresponding to snapshots that store nodes along with strong or weak references. A strong reference in a snapshot instance indicates that referenced data is “owned” by that snapshot instance and that it is inaccessible by reading from an older snapshot. A weak reference in a snapshot instance indicates that the referenced data is also accessible from an older snapshot instance. A weak reference to data in an active view implies that the data cannot be modified in place, since an older snapshot instance is also referencing that data, and that data may be requested at some point in the future. [0005] Current snapshotting implementations that utilize weak and strong references also suffer from technical drawbacks. For example, searching for records using implementations such as these can require progressively searching multiple hierarchical data structures corresponding to multiple point-in-time snapshots, which can be computationally intensive. SUMMARY [0006] Technologies are disclosed herein for performing snapshotting operations using data structures representing snapshot instances (referred to herein as ‘‘snapshots”) with weak references to other data structures representing prior snapshots. Through implementations of the disclosed technologies, snapshotting operations can be performed in a manner that does not utilize hierarchical reference counting and that matches or reduces the algorithmic complexity of previous solutions that rely on the use of strong and weak references. In particular, using the disclosed technologies, new filesystem snapshots can be created in 0(1) time complexity, filesystem snapshots can be modified in 0(1 ogN) time complexity, and adjacent filesystem snapshots can be merged to delete a snapshot in O(size of A) time complexity. Other technical benefits not specifically mentioned herein might also be realized through implementations of the disclosed subject matter. [0007] In order to provide aspects of the functionality disclosed herein, a processing system manages an ordered set of hierarchical data structures corresponding to individual pointin-time snapshots and one or more active views. In an embodiment, the hierarchical data structures are B+ trees. B+ trees are self-balancing hierarchical tree structures having a variable, but often large, number of children per node. B+ trees include nodes storing keys, with leaf nodes including records for key-value mappings corresponding to at least a subset of an assigned key range. The data stored in B+ trees for active views are logically modifiable by a client, while B+ trees for snapshots are logically immutable. [0008] In an embodiment, the B+ trees representing snapshots define the state of a filesystem at a particular point in time. For instance, a B+ tree can define the state of filesystem metadata, such as a streams table that specifies a mapping between virtual cluster numbers (“VCNs”) and logical cluster numbers (“LCNs”) for