CN-120610849-B - Data disaster recovery method supporting multi-backup path switching


Abstract

The invention discloses a data disaster recovery method supporting multi-backup path switching. The method comprises: generating data fragments of dynamic granularity based on the sensitivity of the input data and the characteristics of the storage medium; generating metadata tags comprising a data fragment identifier, a storage preference and a recovery priority; constructing a performance model of heterogeneous transmission paths from the metadata tags and generating a path quality map; allocating an optimal transmission path to each data fragment; monitoring path performance, detecting abnormalities through the standard deviation of the accumulated transmission delay and migrating fragments accordingly; performing incremental integrity verification, with parallel verification using Merkle proofs stored on a blockchain; and retransmitting with priority the fragments that fail verification.
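The abnormality detection and fragment migration step can be illustrated with a minimal Python sketch. It assumes, since the abstract does not state a decision rule, that a path counts as abnormal when the population standard deviation of its accumulated delay samples exceeds a fixed threshold, and that fragments on an abnormal path move to the healthy path with the highest score in the path quality map. The names `is_abnormal`, `migrate`, `threshold_ms` and the dictionary layouts are hypothetical.

```python
from statistics import pstdev


def is_abnormal(delays_ms, threshold_ms):
    """Flag a path whose accumulated delay samples fluctuate too much.

    Assumption: "abnormal" means the population standard deviation of the
    samples exceeds threshold_ms. The source does not fix this rule.
    """
    if len(delays_ms) < 2:  # a single sample has no measurable spread
        return False
    return pstdev(delays_ms) > threshold_ms


def migrate(assignment, delay_log, quality_map, threshold_ms=20.0):
    """Return a new fragment -> path mapping with fragments moved off abnormal paths.

    assignment:  {fragment_id: path_id}
    delay_log:   {path_id: [delay samples in ms]}
    quality_map: {path_id: quality score, higher is better} (hypothetical layout)
    """
    abnormal = {p for p, samples in delay_log.items()
                if is_abnormal(samples, threshold_ms)}
    healthy = [p for p in quality_map if p not in abnormal]
    if not healthy:  # nowhere better to go, keep the current assignment
        return dict(assignment)
    best = max(healthy, key=lambda p: quality_map[p])
    return {frag: (best if path in abnormal else path)
            for frag, path in assignment.items()}
```

With delay samples of `[5, 80, 3, 120]` ms on one path and a 20 ms threshold, the sketch flags that path and moves its fragments to the best remaining path, while fragments on stable paths stay where they are.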

Inventors

ZHOU HAOYUAN
DONG FENG
GUO CHANGRONG

Assignees

连云港中锦燊信息科技有限公司

Dates

Publication Date: 20260508
Application Date: 20250529

Claims (6)

1. A data disaster recovery method supporting multi-backup path switching, characterized by comprising the following steps:
acquiring input data, generating data fragments of dynamic granularity based on the sensitivity of the input data and the characteristics of a storage medium, and generating a metadata tag for each data fragment;
wherein acquiring the input data, generating the data fragments of dynamic granularity based on the sensitivity of the input data and the characteristics of the storage medium, and generating the metadata tag for each data fragment comprises: generating a sensitivity score based on the entropy density, access pattern scale and structural association of the input data; calculating the degree of adaptation between the data fragments and the storage medium based on the sequential read bandwidth, the random write delay and the maximum IOPS of the storage medium; and determining the data fragment granularity according to the sensitivity score and the degree of adaptation, and generating a metadata tag containing a data fragment identifier, a storage preference and a recovery priority;
and further comprises: the data fragment identifier in the metadata tag is generated as a hash value, the storage preference in the metadata tag is determined by the storage medium type and the I/O mode of the data fragment, and the recovery priority in the metadata tag is calculated based on the sensitivity score and a service SLA coefficient;
constructing a performance model of heterogeneous transmission paths from the metadata tag, and generating a path quality map;
allocating, based on the metadata tag and the path quality map, an optimal transmission path to each data fragment, and monitoring the performance of the allocated optimal transmission path;
wherein allocating, based on the metadata tag and the path quality map, an optimal transmission path to each data fragment comprises: constructing a multidimensional decision matrix comprising medium matching degree, network stability and recovery priority, based on the storage preference and the recovery priority in the metadata tag and on the constructed path quality map; and allocating an optimal transmission path to each data fragment according to the multidimensional decision matrix, wherein the allocation is treated as a constrained optimization problem that is solved by an improved Hungarian algorithm; the improved Hungarian algorithm comprises: converting the medium matching degree, the network stability and the recovery priority in the multidimensional decision matrix into a single weight matrix through weighted summation, and establishing limit constraints on transmission path load, recovery time and resources;
and, according to the monitoring result, performing incremental integrity verification on the recovered data fragments, performing parallel verification using Merkle proofs that are generated using the metadata tag and stored on a blockchain, and retransmitting the data fragments that fail verification, thereby achieving efficient backup and prioritized recovery of the data fragments.
2. The data disaster recovery method supporting multi-backup path switching as claimed in claim 1, wherein constructing a performance model of heterogeneous transmission paths from the metadata tag and generating a path quality map comprises: defining a plurality of transmission paths from a source node to a backup storage node based on the storage preference in the metadata tag, and constructing the performance model of the heterogeneous transmission paths; predicting, through the performance model of the heterogeneous transmission paths, the response time of the path medium characteristics based on the storage medium characteristics and real-time queue requests in the storage preference, and at the same time predicting network stability from the transmission success rate and delay fluctuation of the transmission paths; and forming the path quality map from the obtained response time of the path medium characteristics and the network stability.
3. The data disaster recovery method supporting multi-backup path switching as claimed in claim 1, wherein the medium matching degree is the degree of matching between the I/O mode of a data fragment and the path medium characteristics.
4. The data disaster recovery method supporting multi-backup path switching as claimed in claim 1, wherein monitoring the performance of the allocated optimal transmission path comprises: judging, through the standard deviation of the accumulated transmission delay, whether the performance of the allocated optimal transmission path is abnormal, and, when a performance abnormality of the allocated optimal transmission path is detected, migrating the data fragments to a new optimal transmission path based on the path quality map.
5. The data disaster recovery method supporting multi-backup path switching as claimed in claim 1 or 4, wherein performing incremental integrity verification on the recovered data fragments according to the monitoring result comprises: locating the data fragments in the backup storage according to the data fragment identifiers in the metadata tags, reading the content of the data fragments, generating a Merkle proof for each data fragment whose content has been read, and writing the Merkle root hash into a lightweight blockchain.
6. The data disaster recovery method supporting multi-backup path switching as claimed in claim 4, wherein performing parallel verification using the Merkle proofs generated using the metadata tag and stored on the blockchain, and retransmitting the data fragments that fail verification, comprises: using a thread pool to verify in parallel each data fragment whose content has been read, matching the data fragment identifier against the generated Merkle proof; and, if any data fragment fails the parallel verification performed with the thread pool, triggering retransmission based on the recovery priority in the metadata tag.
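The path allocation recited in claim 1 (weighted summation into a single weight matrix, then a constrained assignment) can be illustrated with a small Python sketch. Several things here are assumptions, not the claimed method: the three criteria are supplied as precomputed fragment-by-path score matrices, the weights `(0.5, 0.3, 0.2)` are arbitrary, the only constraint modelled is a load limit of one fragment per path, and an exhaustive search over permutations stands in for the improved Hungarian algorithm, whose modifications the claim does not detail. The names `weight_matrix` and `assign_paths` are hypothetical.

```python
from itertools import permutations


def weight_matrix(match, stability, priority, weights=(0.5, 0.3, 0.2)):
    """Collapse three fragment-by-path score matrices into one by weighted sum.

    match, stability, priority: lists of rows, one row per fragment and one
    column per path (assumed layout). weights are illustrative only.
    """
    wm, ws, wp = weights
    return [
        [wm * m + ws * s + wp * p for m, s, p in zip(m_row, s_row, p_row)]
        for m_row, s_row, p_row in zip(match, stability, priority)
    ]


def assign_paths(scores):
    """Pick one distinct path per fragment so the total score is maximal.

    Brute force over permutations: fine for a handful of paths, and a
    stand-in for a Hungarian-style solver, not an implementation of one.
    """
    n_frag, n_path = len(scores), len(scores[0])
    if n_frag > n_path:
        raise ValueError("this sketch needs at least one path per fragment")
    best_total, best_perm = float("-inf"), None
    for perm in permutations(range(n_path), n_frag):
        total = sum(scores[i][perm[i]] for i in range(n_frag))
        if total > best_total:
            best_total, best_perm = total, perm
    return list(best_perm), best_total
```

When two fragments both score highest on the same path, the search gives that path to the fragment for which the combined total is larger and sends the other fragment to its next-best path. Limits on recovery time and resources would have to be added as extra feasibility checks inside the loop.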

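The verification steps of claims 5 and 6 (Merkle proofs per fragment, parallel checking in a thread pool, retransmission ordered by recovery priority) can be sketched in Python as follows. This is an illustration under assumptions: SHA-256 is chosen as the hash, an odd node is paired with itself when building the tree, and the `root` value passed in stands in for the Merkle root hash read back from the lightweight blockchain, which is not modelled. All function names are hypothetical.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(fragments):
    """Return every tree level, leaf hashes first, single-root level last."""
    levels = [[h(f) for f in fragments]]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        if len(cur) % 2:  # odd count: pair the last node with itself
            cur = cur + [cur[-1]]
        levels.append([h(cur[i] + cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels


def proof_for(levels, index):
    """Sibling hashes from leaf to root, each tagged with 'sibling is on the left'."""
    proof = []
    for level in levels[:-1]:
        if len(level) % 2:
            level = level + [level[-1]]
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify(fragment, proof, root):
    """Recompute the root from a fragment and its proof, then compare."""
    node = h(fragment)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


def failed_fragments(received, proofs, root, priority, workers=4):
    """Verify fragments in a thread pool; return failing ids, highest priority first."""
    ids = list(received)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        ok = list(pool.map(lambda i: verify(received[i], proofs[i], root), ids))
    failed = [i for i, good in zip(ids, ok) if not good]
    return sorted(failed, key=lambda i: priority[i], reverse=True)
```

The list returned by `failed_fragments` can serve directly as a retransmission queue: the fragment with the highest recovery priority is resent first.
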
Description

Data disaster recovery method supporting multi-backup path switching

Technical Field

The invention relates to the technical field of data disaster recovery, and in particular to a data disaster recovery method supporting multi-backup path switching.

Background

With the rapid development of information technology, data has become an important asset for enterprises and organizations, and data disaster recovery technology has become an important means of guaranteeing data security and business continuity. Traditional data disaster recovery methods generally use a single backup path for data transmission and storage, with a fixed data fragment granularity. They do not consider differences in data sensitivity, storage medium characteristics and network path performance, and so in practice struggle to meet the differentiated requirements of different service scenarios for data recovery speed and reliability.

In recent years, with the rise of cloud computing, distributed storage and blockchain technologies, data disaster recovery technology has been developing toward finer granularity, greater intelligence and higher efficiency. However, existing data disaster recovery methods still have some shortcomings. First, conventional methods generally fail to consider the sensitivity differences of the data itself and the performance characteristics of storage media, so the data fragment granularity and the path selection policy are poorly targeted and optimal backup and recovery performance is hard to achieve. Second, conventional methods often ignore real-time performance fluctuations of the transmission path and the response characteristics of the storage medium during path selection, and cannot establish an effective path quality evaluation model, so paths are selected blindly.
Third, conventional methods lack an integrity verification mechanism in the data recovery process, and in particular fail to make effective use of blockchain technology for parallel verification and fast retransmission, so data recovery takes longer and requirements for rapid recovery are hard to meet. Therefore, there is a need for a data disaster recovery method that takes data sensitivity, storage medium characteristics and transmission path performance into account and supports multi-backup path switching.

CN118152186A discloses a data recovery method and device for a distributed object storage system. It uses an object recovery component to restore backup data from a tape library to a distributed file system to obtain the full content of the recovered objects, scans that content to parse out the metadata of each object and writes the metadata into a distributed database, and then, on receiving an object acquisition request initiated by a client, acquires the corresponding object from the distributed file system according to its metadata. Although metadata is mentioned, this scheme makes no further use of the metadata, and so fails to take into account the sensitivity differences of the data itself and the performance characteristics of the storage medium.
CN117149526A discloses a data recovery method, system and device suitable for an MPP database. A client initiates a data backup request and imports the information to be backed up; the method connects to the MPP database, reads the metadata information, and, by obtaining the backup table of the MPP database, the metadata information and the data files stored in HDFS, backs up the data files from HDFS to object storage. It then selects the corresponding backup table, obtains the path of the backup file on the object storage from the recorded backup information, accesses the designated path by calling the object storage API, and restores the file to the MPP database. This scheme mainly backs up to an object storage file system rather than to a storage medium; it considers the backup path but not the storage medium, so it struggles to meet the data recovery speed requirements of different service scenarios.

Disclosure of Invention

This section is intended to outline some aspects of embodiments of the application and to briefly introduce some preferred embodiments. Some simplifications or omissions may be made in this section, in the description of the application and in the title of the application; such simplifications or omissions may not be used to limit the scope of the application.

The present invention has been made in view of the above-described problems in the prior art. Therefore, the invention provides a data disaster recovery method supporting multi-backup path switching, which is used for solving the problems in the background art. In order to solve the technical problems, the invention provides a data disaster