CN-122019264-A - Data service continuity guaranteeing method based on multi-level disaster recovery strategy


Abstract

The invention relates to a data service continuity guaranteeing method based on a multi-level disaster recovery strategy. The method comprises: assigning service grades according to key indexes of a service system and generating differentiated disaster recovery strategy templates; according to the disaster recovery strategy template, deploying a same-city dual-active cluster for high-priority services, primary-standby asynchronous replication for medium-priority services, and scheduled snapshot backup for low-priority services; and, according to the disaster recovery strategy template, constructing a remote disaster recovery center through asynchronous log replication and incremental snapshots, supporting hybrid snapshot-and-log recovery so as to guarantee cross-region service continuity. Grading the services matches resources to demand and balances cost against efficiency; combining the same-city dual-active mechanism with a distributed consistency protocol provides second-level switchover with zero data loss; hybrid snapshot-and-log recovery at the remote site improves cross-region recovery efficiency; and automatic fault sensing with closed-loop response significantly reduces manual intervention and improves system reliability and availability.

Inventors

LI RUI
ZHONG JING
MA YANPENG
CHENG RAN
GUO SHUAI
LIU DERAN
WU ZHENGZHONG

Assignees

Unit 61618 of the Chinese People's Liberation Army (中国人民解放军61618部队)

Dates

Publication Date: 20260512
Application Date: 20251230

Claims (10)

1. A data service continuity guaranteeing method based on a multi-level disaster recovery strategy, characterized by comprising the following steps: assigning service grades according to key indexes of a service system and generating differentiated disaster recovery strategy templates; according to the disaster recovery strategy template, deploying a same-city dual-active cluster for high-priority services, deploying primary-standby asynchronous replication for medium-priority services, and deploying scheduled snapshot backup for low-priority services; and constructing a remote disaster recovery center through asynchronous log replication and incremental snapshots according to the disaster recovery strategy template, and supporting hybrid snapshot-and-log recovery so as to ensure cross-region service continuity.
2. The data service continuity guaranteeing method based on a multi-level disaster recovery strategy according to claim 1, wherein assigning service grades and generating differentiated disaster recovery strategy templates comprises: acquiring key indexes of the service system, including service importance, recovery time objective (RTO), recovery point objective (RPO), concurrent access volume, and data throughput; performing a weighted calculation on the key indexes based on an analytic hierarchy process, fuzzy comprehensive evaluation, or a machine learning model to generate a composite score, and assigning the corresponding high, medium, or low service grade; and, according to the service grade, calling a disaster recovery strategy template library to generate a disaster recovery strategy template with the corresponding network topology, database architecture, and data synchronization mechanism.
3. The data service continuity guaranteeing method based on a multi-level disaster recovery strategy according to claim 2, wherein deploying the same-city dual-active cluster for high-priority services comprises: deploying service nodes in two physically isolated, geographically adjacent data centers, and configuring a distributed database to support real-time service synchronization; calculating weights based on network delay or node health state through a load balancer, distributing user requests, and adopting an optimized two-phase commit protocol to ensure data consistency among nodes; and configuring heartbeat detection and a virtual IP failover mechanism, and automatically switching traffic to a healthy service node when a fault is detected, so as to achieve second-level switchover.
4. The data service continuity guaranteeing method based on a multi-level disaster recovery strategy according to claim 3, wherein deploying primary-standby asynchronous replication for medium-priority services comprises: deploying a primary node in a primary data center, deploying a standby node in a geographically adjacent standby data center, and configuring an asynchronous replication channel based on the database service log; monitoring the state of the primary node through a health detection mechanism, and supporting switchover to the standby node either automatically or after manual confirmation; and updating the routing rules during switchover, ensuring that the service recovery time is kept at the minute level.
5. The data service continuity guaranteeing method based on a multi-level disaster recovery strategy according to claim 4, wherein deploying scheduled snapshot backup for low-priority services comprises: configuring a scheduled snapshot generation mechanism in the production environment, and backing up data images to object storage or to a remote data center; manually or semi-automatically loading snapshot images under disaster conditions to restore the state of the service system; and updating backup metadata and recovery logs, ensuring that the recovery point objective is kept at the hour level.
6. The data service continuity guaranteeing method based on a multi-level disaster recovery strategy according to claim 5, wherein constructing the remote disaster recovery center through asynchronous log replication and incremental snapshots comprises: configuring a read-only replica database and a snapshot repository in a cross-region disaster recovery center, and transmitting service logs through asynchronous log replication over high-priority channels; periodically generating incremental snapshots and combining them with log compensation to achieve hybrid snapshot-and-log recovery; and, when the local nodes are unavailable, loading the latest snapshot and replaying the logs, switching the disaster recovery center to writable mode, and updating the access path; wherein generating the incremental snapshots comprises: converting the business operations of the primary data center into database service logs, and optimizing cross-region transmission through compression and priority queues; generating incremental snapshots based on a differential analysis algorithm to reduce the volume of data transmitted, and storing them in the snapshot repository of the disaster recovery center; and, during recovery, loading the snapshot after integrity verification and compensating for the latest data changes through log replay.
7. The data service continuity guaranteeing method based on a multi-level disaster recovery strategy according to claim 1, further comprising: using an intelligent scheduling platform to perform multidimensional monitoring, fault sensing, strategy matching, and automatic switchover, and dynamically optimizing the disaster recovery strategy and resource allocation based on real-time and historical data, which specifically comprises: collecting indexes such as heartbeat signals, database delay, and business error rate through multidimensional monitoring, and judging the fault state with a multi-factor weighting model; based on the service grade and the fault scope, invoking a strategy engine to match a switching strategy and generate an executable task sequence; issuing switchover, rebuild, or switchback instructions through a scheduling execution module, and executing them automatically on a cloud computing platform; and analyzing historical fault data with a machine learning model to dynamically optimize strategy parameters and scheduling efficiency.
8. A data service continuity guaranteeing system based on a multi-level disaster recovery strategy, characterized by comprising a disaster recovery level planning module, a same-city disaster recovery deployment module, and a remote disaster recovery construction module; the disaster recovery level planning module is used for assigning service grades and generating differentiated disaster recovery strategy templates according to key indexes of the service system; the same-city disaster recovery deployment module is used for deploying, according to the disaster recovery strategy template, a same-city dual-active cluster for high-priority services, primary-standby asynchronous replication for medium-priority services, and scheduled snapshot backup for low-priority services; and the remote disaster recovery construction module is used for constructing a remote disaster recovery center through asynchronous log replication and incremental snapshots according to the disaster recovery strategy template, and supporting hybrid snapshot-and-log recovery so as to ensure cross-region service continuity.
9. An electronic device comprising a memory and a processor, wherein the memory stores a computer program executable on the processor, and the processor, when executing the program, implements the steps of the data service continuity guaranteeing method based on a multi-level disaster recovery strategy according to any one of claims 1 to 7.
10. A storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the data service continuity guaranteeing method based on a multi-level disaster recovery strategy according to any one of claims 1 to 7.
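
As an illustration of the grading step recited in claim 2, the following minimal Python sketch computes a weighted composite score from the five key indexes and maps it to a high, medium, or low grade and a strategy template. It is a sketch under stated assumptions, not the claimed implementation: the weight values, the normalization of every index to the range 0 to 1, the thresholds 0.7 and 0.4, and all names (`WEIGHTS`, `TEMPLATES`, `ServiceIndexes`, `assign_grade`) are hypothetical. The claims name an analytic hierarchy process, fuzzy comprehensive evaluation, or a machine learning model as the source of the weighting; the fixed weight table here merely stands in for the output of such a method.

```python
from dataclasses import dataclass

# Hypothetical weights (they sum to 1); the patent gives no numeric values.
WEIGHTS = {
    "importance": 0.35,
    "rto_urgency": 0.25,
    "rpo_urgency": 0.20,
    "concurrency": 0.10,
    "throughput": 0.10,
}

# Hypothetical template library keyed by service grade.
TEMPLATES = {
    "high": "same-city dual-active cluster",
    "medium": "primary-standby asynchronous replication",
    "low": "scheduled snapshot backup",
}


@dataclass
class ServiceIndexes:
    """Key indexes, each assumed to be pre-normalized to the range 0..1."""
    importance: float
    rto_urgency: float   # closer to 1 means a tighter recovery time objective
    rpo_urgency: float   # closer to 1 means a tighter recovery point objective
    concurrency: float
    throughput: float


def composite_score(ix: ServiceIndexes) -> float:
    """Weighted sum of the normalized indexes."""
    return sum(weight * getattr(ix, name) for name, weight in WEIGHTS.items())


def assign_grade(score: float, high: float = 0.7, medium: float = 0.4) -> str:
    """Map a composite score to a grade using assumed thresholds."""
    if score >= high:
        return "high"
    if score >= medium:
        return "medium"
    return "low"


def strategy_template(ix: ServiceIndexes) -> str:
    """Look up the template that matches the service's grade."""
    return TEMPLATES[assign_grade(composite_score(ix))]


if __name__ == "__main__":
    payments = ServiceIndexes(1.0, 1.0, 1.0, 0.8, 0.8)
    archive = ServiceIndexes(0.2, 0.1, 0.1, 0.1, 0.1)
    print(strategy_template(payments))
    print(strategy_template(archive))
```

In practice the thresholds and weights would be tuned per organization; keeping them as data rather than code makes it straightforward to swap in weights derived by any of the three methods the claim lists.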

Description

Data service continuity guaranteeing method based on multi-level disaster recovery strategy

Technical Field

The invention belongs to the technical field of cloud computing and distributed systems, and particularly relates to a data service continuity guaranteeing method based on a multi-level disaster recovery strategy.

Background

With the rapid development of the digital economy, key business systems in finance, telecommunications, energy, government affairs, e-commerce and similar fields are becoming increasingly important, and high availability and data persistence are core requirements. The widespread use of cloud computing, edge computing, and distributed systems has made system architectures increasingly complex, distributed, and cross-domain, which places greater demands on disaster recovery capabilities. Traditional disaster recovery schemes mainly rely on primary-standby deployment, asynchronous backup, or scheduled snapshots to cope with equipment faults, network interruptions, natural disasters and the like. However, these schemes have obvious defects. Response is slow: a switchover often takes several minutes or longer, which cannot meet core services' requirements for zero data loss and second-level recovery. Manual intervention is heavy: switching depends on static scripts or manual operation and is inefficient. Data consistency is weak: asynchronous replication between primary and standby frequently causes data lag or loss. Finally, disaster recovery strategies lack fine-grained classification, so resources are hard to allocate according to service importance, and cost and efficiency are hard to balance.
With the trend toward distributed, multi-site deployment, existing schemes fall short in same-city dual-active data consistency, remote disaster recovery switching efficiency, and automated scheduling capability, and struggle to meet high-priority services' requirements for seamless switchover and cross-region recovery. A multi-level, automated, and intelligent disaster recovery system is therefore needed to solve the problems of slow response, complex switching, and poor reliability, and to provide efficient service continuity guarantees.

Disclosure of Invention

The invention aims to provide a data service continuity guaranteeing method based on a multi-level disaster recovery strategy, so as to solve the problems of coarse strategy granularity, difficult same-city dual-active deployment, delayed remote disaster recovery switching, and weak automated scheduling capability in existing disaster recovery schemes.
In order to achieve one of the above objects, an embodiment of the present invention provides a data service continuity guaranteeing method based on a multi-level disaster recovery strategy, the method comprising: assigning service grades according to key indexes of a service system and generating differentiated disaster recovery strategy templates; according to the disaster recovery strategy template, deploying a same-city dual-active cluster for high-priority services, deploying primary-standby asynchronous replication for medium-priority services, and deploying scheduled snapshot backup for low-priority services; and constructing a remote disaster recovery center through asynchronous log replication and incremental snapshots according to the disaster recovery strategy template, and supporting hybrid snapshot-and-log recovery so as to ensure cross-region service continuity.
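
The snapshot and log mixed recovery named in the last step can be sketched as follows. This is a minimal, hedged illustration in Python rather than the patented implementation: the key-value state model, the SHA-256 checksum used as the integrity check, the `(lsn, op, key, value)` log record layout, and every function name are assumptions introduced here. The idea shown is that an incremental snapshot stores only the keys changed or deleted since the previous snapshot, and recovery applies the snapshots in order and then replays only the log records newer than the last snapshot.

```python
import hashlib
import json


def checksum(state):
    """SHA-256 over a canonical JSON encoding (assumed integrity check)."""
    payload = json.dumps(state, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def take_incremental_snapshot(previous, current):
    """Record only the keys that changed or disappeared since `previous`."""
    changed = {
        key: value
        for key, value in current.items()
        if key not in previous or previous[key] != value
    }
    deleted = [key for key in previous if key not in current]
    return {"changed": changed, "deleted": deleted, "checksum": checksum(current)}


def apply_snapshot(base, delta):
    """Apply one incremental snapshot and verify the resulting state."""
    state = dict(base)
    state.update(delta["changed"])
    for key in delta["deleted"]:
        state.pop(key, None)
    if checksum(state) != delta["checksum"]:
        raise ValueError("snapshot failed integrity verification")
    return state


def replay_log(state, log, after_lsn):
    """Replay log records whose sequence number is newer than the snapshot."""
    state = dict(state)
    for lsn, op, key, value in log:
        if lsn <= after_lsn:
            continue  # already reflected in the snapshot
        if op == "put":
            state[key] = value
        elif op == "delete":
            state.pop(key, None)
    return state


def recover(base, deltas, log, snapshot_lsn):
    """Hybrid recovery: snapshots first, then log compensation."""
    state = dict(base)
    for delta in deltas:
        state = apply_snapshot(state, delta)
    return replay_log(state, log, snapshot_lsn)


if __name__ == "__main__":
    base = {"a": 1, "b": 2}
    delta = take_incremental_snapshot(base, {"a": 1, "b": 3, "c": 4})
    log = [(1, "put", "b", 3), (2, "put", "c", 4),
           (3, "put", "d", 5), (4, "delete", "a", None)]
    print(recover(base, [delta], log, snapshot_lsn=2))
```

Replaying only the records after the snapshot's sequence number is what keeps recovery time bounded by the snapshot interval rather than by the full log length, which is the usual motivation for combining the two mechanisms.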
As a further improvement of an embodiment of the present invention, assigning service grades and generating differentiated disaster recovery strategy templates comprises: acquiring key indexes of the service system, including service importance, recovery time objective (RTO), recovery point objective (RPO), concurrent access volume, and data throughput; performing a weighted calculation on the key indexes based on an analytic hierarchy process, fuzzy comprehensive evaluation, or a machine learning model to generate a composite score, and assigning the corresponding high, medium, or low service grade; and, according to the service grade, calling a disaster recovery strategy template library to generate a disaster recovery strategy template with the corresponding network topology, database architecture, and data synchronization mechanism.
As a further improvement of an embodiment of the present invention, deploying the same-city dual-active cluster for high-priority services comprises: deploying service nodes in two physically isolated, geographically adjacent data centers, and configuring a distributed database to support real-time service synchronization; calculating weights based on network delay or node health state through a load balancer, distributing user requests, and adopting an