US-12619504-B2 - Opportunistic setting of relationships when cloning files across namespaces in a deduplication filesystem

Abstract

Data replication performance is enhanced by maintaining parent-child file relationships when cloning all or a portion of a first namespace to a second namespace, for backups cloned across namespaces in a deduplication filesystem. File copies may be made either as virtual synthetic (VS) copies, which keep a one-way relationship between parent and child files, or as fast copy overwrite (FCOW) copies, which keep a two-way parent-child relationship. Extended attributes are used as metadata for the files to indicate a path or recipe to generate the child file. The extended attributes of the cloned parent and child files are fixed to reference the second namespace, so that the relative relationship between the parent and child files is maintained after cloning to the second namespace.
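
The attribute fix-up described in the abstract can be pictured as re-rooting a stored path from the first namespace to the second. The following is a minimal sketch only, not the patented implementation: the function names `rebase_path` and `fix_xattrs`, the attribute key `target`, and the example paths are all hypothetical, and extended attributes are modeled as a plain dictionary of path strings.

```python
from pathlib import PurePosixPath


def rebase_path(path: str, src_ns: str, dst_ns: str) -> str:
    """Re-root `path` from src_ns to dst_ns.

    Paths that do not lie under src_ns are returned unchanged
    (an assumption of this sketch, not a statement of the patent).
    """
    try:
        relative = PurePosixPath(path).relative_to(PurePosixPath(src_ns))
    except ValueError:
        return path  # path is outside the source namespace
    return str(PurePosixPath(dst_ns) / relative)


def fix_xattrs(xattrs: dict, src_ns: str, dst_ns: str) -> dict:
    """Return a copy of the attributes with every stored path re-rooted."""
    return {key: rebase_path(value, src_ns, dst_ns) for key, value in xattrs.items()}


# Hypothetical child-file attributes that point at a parent in /ns1.
child_xattrs = {"target": "/ns1/backups/full.img"}
print(fix_xattrs(child_xattrs, "/ns1", "/ns2"))
```

Comparing whole path components rather than raw string prefixes means that a path under a sibling namespace such as `/ns10` is not mistaken for one under `/ns1`.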

Inventors

  • Nitin Madan
  • Salil Dangi
  • Alok Katiyar

Assignees

  • DELL PRODUCTS L.P.

Dates

Publication Date
2026-05-05
Application Date
2024-07-31

Claims (18)

  1. A computer-implemented method of maintaining parent-child file relationships when cloning an entire first namespace to a second namespace, comprising: fastcopying a parent file to a child file within the first namespace, wherein the child file has target extended attributes indicating a path relationship to the parent file; cloning the first namespace to the second namespace including the parent file and child file to generate a cloned parent file and cloned child file in the second namespace; finding the path relationship between the cloned parent file and cloned child file in the second namespace; and amending the found path relationship in target extended attributes of the cloned child file to reference the cloned parent file in the second namespace.
  2. The method of claim 1 wherein the fastcopying comprises making a virtual synthetic (VS) copy of the parent file to generate the child file, and wherein the target extended attributes of the child file comprise a synthetic recipe consisting of metadata specifying how to generate the child file.
  3. The method of claim 1 wherein the fastcopying comprises making a fastcopy overwrite (FCOW) copy of the parent file to generate the child file, and wherein the target extended attributes of the child file consist of metadata specifying how to generate the child file.
  4. The method of claim 3 wherein the parent file has base extended attributes indicating a path relationship to the child file, the method further comprising amending a path relationship in the parent extended attributes of the cloned parent file to reference the second namespace for the cloned parent file in the second namespace.
  5. The method of claim 1 wherein the cloning comprises a namespace fastcopy operation.
  6. The method of claim 5 further comprising cloning each file in the first namespace to the second namespace and fixing respective extended attributes for each pair of parent and child copied files as new inodes are created in the second namespace for corresponding cloned parent and child files.
  7. The method of claim 5 wherein the amending maintains an original relative file relationship between the cloned parent and child files in the second namespace to leverage optimizations provided by the copying step with respect to subsequent replication or differencing operations.
  8. The method of claim 7 wherein the replication or differencing operations are performed by backup software in a deduplication backup system comprising a data storage server running a Data Domain File System (DDFS).
  9. The method of claim 8 wherein files of the deduplication backup system are stored in a Merkle tree structure with content data stored in a bottom level of the tree and indexed by fingerprints, and further wherein the copying step copies metadata of a parent file comprising inode information and a reference to the file L6 fingerprint in a first Merkle tree to a second directory for a child file.
  10. The method of claim 9 wherein the amending step comprises fixing metadata of at least one of the target extended attributes of the cloned child file or base extended attributes.
  11. A computer-implemented method of maintaining parent-child file relationships when cloning a portion of a first namespace to a second namespace, comprising: virtual synthetic (VS) copying a parent file to a child file within the first namespace wherein the first namespace contains additional files not copied to the second namespace, wherein the child file has target extended attributes indicating a path relationship to the parent file; cloning the first namespace to the second namespace including the parent file and child file to generate a cloned parent file and cloned child file in the second namespace; verifying that each of the parent file and child file are identical to the respective cloned parent file and cloned child file; finding that the path relationship between the cloned parent file and cloned child file in the second namespace; and amending the found path relationship in target extended attributes of the cloned child file to reference the second namespace for the cloned child file in the second namespace.
  12. The method of claim 11 wherein the target extended attributes of the child file comprise a synthetic recipe consisting of metadata specifying how to generate the child file.
  13. The method of claim 12 wherein the cloning comprises a namespace fastcopy operation.
  14. The method of claim 12 wherein the amending comprises modifying metadata of the target extended attributes of the child file to maintain an original relative file relationship between parent and child files in the second namespace to leverage optimizations provided by the copying step with respect to subsequent replication or differencing operations.
  15. A computer-implemented method of maintaining parent-child file relationships when cloning a portion of a first namespace to a second namespace, comprising: fast copy overwrite (FCOW) copying a parent file to a child file within the first namespace, wherein the child file has target extended attributes indicating a path relationship to the parent file, and the parent file has base extended attributes indicating a path relationship to the child file; cloning the first namespace to the second namespace including the parent file and child file to generate a cloned parent file and cloned child file in the second namespace; verifying that each of the parent file and child file are identical to the respective cloned parent file and cloned child file; finding that the path relationship between the cloned parent file and cloned child file in the second namespace; first amending the found path relationship in target extended attributes of the cloned parent file to reference the second namespace for the cloned child file in the second namespace; and second amending the path relationship in the parent extended attributes of the cloned parent file to reference the cloned parent file in the second namespace.
  16. The method of claim 15 wherein the target extended attributes of the child file consist of metadata specifying how to generate the child file.
  17. The method of claim 15 wherein the cloning comprises a namespace fastcopy operation.
  18. The method of claim 17 wherein the amending comprises modifying metadata of the target extended attributes of the child file to maintain an original relative file relationship between parent and child files in the second namespace to leverage optimizations provided by the copying step with respect to subsequent replication or differencing operations.
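
The sequence of steps recited in the independent claims (fastcopy within the first namespace, clone to the second namespace, verify, find the relationship, amend the extended attributes) can be sketched with an in-memory model. This is an illustrative sketch under stated assumptions, not the claimed implementation: the `File` class, the dictionary used as a filesystem, the attribute keys `target` and `base`, and the function names are all hypothetical. A single fingerprint string stands in for the file's content reference, and a reference to a file that was not cloned is left unchanged, which is a choice made for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class File:
    fingerprint: str                      # stand-in for the shared content reference
    xattrs: dict = field(default_factory=dict)


def fastcopy(fs: dict, parent: str, child: str, two_way: bool = False) -> None:
    """Create `child` as a metadata-only copy that shares the parent's content."""
    fs[child] = File(fs[parent].fingerprint, {"target": parent})
    if two_way:
        # FCOW-style copy: the parent also records its child.
        fs[parent].xattrs["base"] = child


def clone_namespace(fs: dict, src_ns: str, dst_ns: str) -> dict:
    """Clone every file under src_ns into dst_ns and amend path relationships."""
    src_prefix = src_ns.rstrip("/") + "/"
    dst_prefix = dst_ns.rstrip("/") + "/"
    mapping = {}
    for path in list(fs):                 # snapshot keys: fs grows while cloning
        if path.startswith(src_prefix):
            new_path = dst_prefix + path[len(src_prefix):]
            fs[new_path] = File(fs[path].fingerprint, dict(fs[path].xattrs))
            mapping[path] = new_path
    for old_path, new_path in mapping.items():
        clone = fs[new_path]
        # Verify the clone matches its source before touching its attributes.
        if clone.fingerprint != fs[old_path].fingerprint:
            continue
        for key, ref in clone.xattrs.items():
            if ref in mapping:            # the related file was cloned as well
                clone.xattrs[key] = mapping[ref]
    return mapping


fs = {"/ns1/a/full": File("fp-1")}
fastcopy(fs, "/ns1/a/full", "/ns1/a/incr", two_way=True)
clone_namespace(fs, "/ns1", "/ns2")
print(fs["/ns2/a/incr"].xattrs, fs["/ns2/a/full"].xattrs)
```

In this model the attributes of the original files in the first namespace are left untouched, since each clone receives its own copy of the attribute dictionary before any amendment.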

Description

TECHNICAL FIELD

Embodiments relate to deduplication backup systems, and specifically to opportunistically setting or repairing relationships for backups cloned across namespaces.

BACKGROUND OF THE INVENTION

Data deduplication is a form of single-instance storage that eliminates redundant copies of data to reduce storage use. Data compression methods are used to store only unique instances of data by replacing redundant data blocks with pointers to the unique copies. As new data is written, duplicate chunks are replaced with these pointer references to previously stored data.

Deduplication systems support various backup operations such as full, differential, and incremental backups. A synthetic backup is the process of generating a file from a complete copy of a file created in the past and one or more incremental copies created later, and backups may be referred to as virtual synthetic backups of these various backup types.

Backup programs use namespaces to ensure that a given set of objects have unique names so that they can be easily identified. Namespaces are commonly structured as hierarchies to allow use of names in different contexts.

Deduplication filesystems like the PowerProtect Data Domain File System (DDFS) have very efficient file cloning methods, such as the fastcopy process. A fastcopy creates a new inode (a new namespace entity) that points to the same content as the original file, so that the original file and the copied file share content. It is a significant challenge, however, to maintain relationships between the files when cloning across namespaces.

What is needed, therefore, is a process to opportunistically establish, set, or repair file relationships in the context of data cloning operations across different namespaces.

The subject matter discussed in the background section should not be assumed to be prior art merely as a result of its mention in the background section. Similarly, a problem mentioned in the background section or associated with the subject matter of the background section should not be assumed to have been previously recognized in the prior art. The subject matter in the background section merely represents different approaches, which in and of themselves may also be inventions. EMC, Data Domain, Data Domain Restorer, and Data Domain Boost are trademarks of Dell Technologies, Inc.

BRIEF DESCRIPTION OF THE DRAWINGS

In the following drawings, like reference numerals designate like structural elements. Although the figures depict various examples, the one or more embodiments and implementations described herein are not limited to the examples depicted in the figures.

FIG. 1 is a diagram of a computing network implementing a method for enhancing data replication performance by preserving fastcopy-overwrite optimization for backups cloned across namespace subdivisions, under some embodiments.
FIG. 2 illustrates an example Merkle tree representation of files in a deduplication backup system, under some embodiments.
FIG. 3 illustrates a Data Domain filesystem Merkle tree accessed by a file under an example embodiment.
FIG. 4 illustrates the composition of a virtual synthetic backup file, under some embodiments.
FIG. 5 illustrates generating a virtual synthetic backup file, under some embodiments.
FIG. 6 is a diagram illustrating the fastcopy method of copying data, under some embodiments.
FIG. 7 illustrates an example of a fastcopy overwrite method, under some embodiments.
FIG. 8 illustrates the use of extended attributes with cloned files, under some embodiments.
FIG. 9 illustrates example file relationships generated in an Mtree during an FCOW process, under some embodiments.
FIG. 10 illustrates the fastcopy of files across namespaces, under some embodiments.
FIG. 11 illustrates the virtual synthetic copying of files across namespaces, under some embodiments.
FIG. 12 illustrates fixing or maintaining file relationships for cloned files implicitly for VS copies, under some embodiments.
FIG. 13 illustrates fixing or maintaining file relationships for cloned files implicitly for FCOW copies, under some embodiments.
FIG. 14 is a flowchart that illustrates a method of fixing file relationships for cloned files implicitly for VS copies, under some embodiments.
FIG. 15 is a flowchart that illustrates a method of fixing file relationships for cloned files implicitly for FCOW copies, under some embodiments.
FIG. 16 illustrates a file copy operation by VS using opportunistic fixing of file relationships, under some embodiments.
FIG. 17 illustrates the effect of an opportunistic fix operation on the files of FIG. 16, under an example embodiment.
FIG. 18 illustrates a file copy operation by FCOW using opportunistic fixing of file relationships, under some embodiments.
FIG. 19 illustrates the effect of an opportunistic fix operation on the files of FIG. 18, under an example embodiment.
FIG. 20 is a flowchart illustrating a method of opportunistically fixing file relationships for either VS or FCOW c