CN-115374059-B - Priority-based management method and device for life cycle of stored data

Abstract

The invention provides a priority-based management method and device for the life cycle of stored data. The method comprises: binding at least one migration policy to be executed with an inode in a file system, wherein the inode corresponds to a directory or a file in a distributed file system; and executing the corresponding migration policy according to the inode bound thereto. The priority-based management method and device for the life cycle of stored data enable flexible binding of policies and directories.

Inventors

  • JIN HAODONG
  • YIN WEI
  • WEN ZHONGLING
  • ZHOU ZEXIANG

Assignees

  • 北京同有飞骥科技股份有限公司

Dates

Publication Date
20260512
Application Date
20220916

Claims (6)

  1. A priority-based management method for a life cycle of stored data, comprising: binding at least one migration policy to be executed with an inode in a file system, wherein the inode corresponds to a directory or a file in a distributed file system; and executing a corresponding migration policy according to the inode bound thereto, including: judging, according to the migration policy, whether a file conforms to the migration policy, and if so, migrating the file; creating an empty file under the current directory, and setting storeid of the file inode of the empty file to storeid of the migration policy; copying the content and metadata information of the original file into the newly created file; renaming the new file to the original file name; and when the file migration fails, recording the name of the failed file and the offset of the data failure in a key-value database.
  2. The priority-based management method for a life cycle of stored data according to claim 1, further comprising: taking the inode id of the directory having the policy as a key and the migration state of the directory as a value to form a key-value pair, and storing the key-value pair in a key-value database for recording.
  3. The priority-based management method for a life cycle of stored data according to claim 2, wherein when a plurality of migration tasks need to access a key-value database, a mutual exclusion lock is set on the accessed key-value database.
  4. The priority-based management method for a life cycle of stored data according to claim 1, wherein the placement function implicit in the migration policy is implemented in an internal API of the distributed file system.
  5. The priority-based management method for a life cycle of stored data according to claim 4, wherein the implementation of the placement function comprises: when a file is created, the open_file interface obtains the migration policy of the directory, and if the directory has no migration policy, the parent directories are searched upward until a directory containing a migration policy is found; and comparing the metadata information of the file with the migration policy, and setting storeid of the file to the id designated by the migration policy if the metadata information of the file conforms to the migration policy.
  6. A priority-based management device for a life cycle of stored data, comprising: one or more processors; and storage means for storing one or more programs, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the priority-based management method for a life cycle of stored data according to any one of claims 1 to 5.
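
The migration steps recited in claim 1, together with the key-value records of claims 1 to 3, might be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: it runs on a local file system, the `store_ids` dictionary stands in for the storeid field of an inode, the `kv_db` dictionary stands in for the key-value database, and the function names, the `matches` predicate, and the state strings ("migrating", "done", "failed") are all hypothetical.

```python
import os
import shutil
import tempfile
import threading

# Hypothetical stand-ins: a real distributed file system would keep storeid
# in the inode and would use a persistent key-value database.
kv_db = {}                   # simulated key-value database
kv_lock = threading.Lock()   # mutual exclusion lock for concurrent tasks
store_ids = {}               # path -> storeid (simulated inode attribute)

CHUNK_SIZE = 4096            # assumed copy granularity


def kv_put(key, value):
    """Write one record while holding the lock (claim 3)."""
    with kv_lock:
        kv_db[key] = value


def migrate_file(path, policy_store_id):
    """Migrate one file: create empty file, tag it, copy, then rename."""
    directory = os.path.dirname(path) or "."
    # Create an empty file under the current directory.
    fd, new_path = tempfile.mkstemp(dir=directory)
    os.close(fd)
    # Set the new file's storeid to the storeid of the migration policy.
    store_ids[new_path] = policy_store_id
    offset = 0
    try:
        # Copy the content, tracking how many bytes were copied so far.
        with open(path, "rb") as src, open(new_path, "wb") as dst:
            while True:
                chunk = src.read(CHUNK_SIZE)
                if not chunk:
                    break
                dst.write(chunk)
                offset += len(chunk)
        # Copy metadata (permission bits, timestamps) to the new file.
        shutil.copystat(path, new_path)
        # Rename the new file to the original file name.
        os.replace(new_path, path)
    except OSError:
        # Record the failed file name and the offset reached. The partial
        # copy is left in place; cleanup or resume is outside this sketch.
        kv_put(("failed", path), offset)
        return False
    store_ids[path] = store_ids.pop(new_path)
    return True


def migrate_directory(directory, policy_store_id, matches):
    """Migrate matching files and record the directory state (claim 2)."""
    dir_key = os.stat(directory).st_ino   # directory inode id as the key
    kv_put(dir_key, "migrating")
    ok = True
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        # Judge whether the file conforms to the migration policy.
        if os.path.isfile(path) and matches(path):
            ok = migrate_file(path, policy_store_id) and ok
    kv_put(dir_key, "done" if ok else "failed")
    return ok
```

Writing into a new file and renaming it over the original only at the end means readers see either the old file or the complete new one. The recorded offset is the number of bytes copied before the failure.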

Description

Priority-based management method and device for life cycle of stored data

Technical Field

The present invention relates to the field of data storage technologies, and in particular to a priority-based method and device for managing the life cycle of stored data.

Background

In the field of data storage in distributed file systems, a file, as the basic unit of data storage, has various metadata characteristics (such as the user to which the file belongs and the file access time), and some files need a higher data redundancy level (such as 3 copies).

A storage policy sets a series of conditions on certain characteristics of a file (such as the file owner, the file access time, and the file suffix), classifies the files that meet the conditions into one type, and stores them in a specific storage area. A storage area refers to a collection of storage media, with different areas having different levels of data redundancy and consisting of storage media of different performance, availability, and reliability.

Data classification means storing files by region according to a storage policy: files of a given type are stored in the same region, which makes files of the same type convenient to manage.

Data migration arises because certain attributes of a file can change (such as the owning user, the modification time, and the access time). Files may be periodically migrated to a particular storage area according to policies built from these file attributes. A single policy often cannot meet a user's data management requirements across multiple scenarios, so multiple policies are needed, and to control the order in which they execute, the concept of policy priority is needed. To meet the data management requirements of different users for files in different scenarios, it is necessary to provide such a multi-user, multi-policy data management method and device.
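
The idea of policies as conditions over file attributes, evaluated in priority order, might be sketched as follows. This is an illustration only: the `FileMeta` and `Policy` types, the field names, the convention that a lower number means higher priority, and the "first matching policy wins" rule are all assumptions and are not taken from the patent text.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class FileMeta:
    """Hypothetical subset of file metadata used by policy conditions."""
    owner: str
    suffix: str
    days_since_access: int


@dataclass
class Policy:
    """Hypothetical policy: a condition, a target storage area, a priority."""
    name: str
    priority: int                         # assumption: lower value runs first
    store_id: int                         # id of the target storage area
    condition: Callable[[FileMeta], bool]


def select_policy(meta: FileMeta, policies: Iterable[Policy]) -> Optional[Policy]:
    """Return the first policy, in priority order, whose condition matches."""
    for policy in sorted(policies, key=lambda p: p.priority):
        if policy.condition(meta):
            return policy
    return None
```

For example, given one policy with priority 1 that matches files owned by a particular user and another with priority 2 that matches files not accessed for 90 days, a file satisfying both conditions would be handled by the priority 1 policy under these assumptions.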

Chinese patent application CN103605728A discloses a data hierarchical storage method and system. The system consists of a preset placement policy and migration policy, together with a placement module and a migration module. The basic unit of data tiering is a file (i.e., an inode), each of which is bound to a storage device tier identifier. The placement policy and the migration policy are bound to the mount point (mount directory) of the hierarchical storage device. Through client-defined placement and migration policies and the placement and migration modules, inodes are bound to storage devices to realize hierarchical storage of data.

This technical scheme has the following defects:

  1. The placement policy and the migration policy are bound to the mount point of the hierarchical storage device, i.e., policy -> hierarchical storage device mount point. This works when one policy corresponds to one mount point, but when multiple placement policies and migration policies correspond to one mount point, the policy priority problem cannot be solved.
  2. The placement policy and the migration policy are defined as two separate policies, which is unreasonable because both perform the same function: migrating a particular file to the target storage device. They should be uniformly defined as a migration policy, where the migration policy has both the initial file placement function and the file migration function.
  3. Binding the placement policy and the migration policy to the mount point of the hierarchical storage device is an unreasonable design.
  4. In the migration process, no optimization is made for the migration of large files or for the handling of failed file migrations.

Disclosure of Invention

The invention aims to provide a priority-based management method and device for the life cycle of stored data, which can realize flexible binding of policies and directories.

To solve the above technical problem, the invention provides a priority-based management method for the life cycle of stored data, comprising: binding at least one migration policy to be executed with an inode in a file system, wherein the inode corresponds to a directory or a file in a distributed file system; and executing the corresponding migration policy according to the inode bound thereto.

In some embodiments, executing a corresponding migration policy according to the inode bound thereto comprises: judging, according to the migration policy, whether the file conforms to the migration policy and, if so, migrating it; creating an empty file under the current directory; setting storeid of the file inode as storeid of the