CN-122001899-A - Cross-domain data synchronization method, system, equipment and medium for dual-active arbitration
Abstract
The application relates to a cross-domain data synchronization method, system, equipment and medium for dual-active arbitration. The method comprises: obtaining a write request and generating an operation log record of a source site according to the write request; sharding the operation log record by entity identifier through a replicator and pushing it to a remote site to obtain an operation log record copy of the remote site; performing conflict candidate detection on the compressed causal summary in the operation log record copy and on the locally concurrent pending operation log records with the corresponding timestamps to obtain a conflict candidate set; determining an arbitration policy for each conflict pair according to the operation semantic tags in the conflict candidate set, and merging the conflict pairs according to the arbitration policy to obtain an arbitration certificate; and writing the final state of the write request indicated by the arbitration certificate into the local materialized views of the source site and the remote site and persisting it to obtain a result in which the visible data is consistent. The method enables self-arbitrating synchronization under dynamic conflict conditions.
Inventors
- YANG MENG
- CAI XIAOYU
- ZHOU YI
- LIU SHUXIAN
- PAN LEI
- ZHANG BING
- CHENG JIE
- SHEN LIUQING
- XIE LEI
- CHENG RUIYING
Assignees
- 国家电网有限公司信息通信中心
Dates
- Publication Date
- 20260508
- Application Date
- 20251210
Claims (10)
- 1. A dual-active arbitration cross-domain data synchronization method, the method comprising: obtaining a write request and generating an operation log record of a source site according to the write request, wherein the operation log record comprises an entity identifier, an operation identifier, the source site, an operation type, payload data, a timestamp, a compressed causal summary, an operation hash value and an operation semantic tag; sharding the operation log record by the entity identifier through a replicator and pushing it to a remote site to obtain an operation log record copy of the remote site; in response to determining that a locally concurrent pending operation log record corresponding to the entity identifier of the operation log record copy exists in a local write-ahead log of the remote site, performing conflict candidate detection on the compressed causal summary in the operation log record copy and on the locally concurrent pending operation log record with the corresponding timestamp to obtain a conflict candidate set, wherein the conflict candidate set comprises conflict pairs and their operation semantic tags; determining an arbitration policy for each conflict pair according to the operation semantic tags in the conflict candidate set, and performing a merge operation on each conflict pair according to the arbitration policy to obtain an arbitration certificate; and writing the final state of the write request indicated by the arbitration certificate into the local materialized views of the source site and the remote site, and performing persistence processing to obtain a result in which the visible data is consistent.
- 2. The method of claim 1, wherein before obtaining the write request and generating the operation log record of the source site according to the write request, the method further comprises: classifying the entities to be managed according to service consistency requirements and semantic features to obtain a consistency grouping mapping table, wherein the consistency grouping mapping table comprises entity identifiers and the operation semantic tags corresponding to the entity identifiers, and the operation semantic tags comprise a strong consistency class, a mergeable class and an eventual consistency class.
- 3. The method of claim 2, wherein generating the operation log record of the source site according to the write request comprises: generating an operation identifier for the write request; generating a compressed causal summary based on a local recent-interaction site set of the source site, wherein the compressed causal summary comprises a short vector summary of hot sites, a compressed summary of long-tail sites, and the timestamps corresponding to the short vector summary and the compressed summary, the local recent-interaction site set is obtained from a site activity list within a preset time window, and the site activity list comprises the hot sites and the long-tail sites; calculating an operation hash value for the operation log record, and writing the operation hash value, the operation identifier and the compressed causal summary into the operation log record; and writing the operation log record into a local write-ahead log of the source site, and receiving a returned local commit acknowledgement.
- 4. The method of claim 2, wherein determining the arbitration policy for each conflict pair according to the operation semantic tags in the conflict candidate set comprises: when the operation semantic tag is the mergeable class, the arbitration policy is a fast merge mode, in which an arbiter performs a server-side merge on the operation types and payload data of the conflict pair based on a merge algorithm to generate a merge result and obtain the arbitration certificate; when the operation semantic tag is the eventual consistency class, the arbitration policy is a deterministic semantic arbitration mode, in which, once votes from the source site, the remote site and a witness node have been obtained, the arbiter executes a deterministic semantic merge function on the operation types and payload data of the conflict pair to generate a deterministic output and obtain the arbitration certificate; and when the operation semantic tag is the strong consistency class, the arbitration policy is a short-term local consensus mode, in which the source site, the remote site and a witness node form a local-scale consensus group, and the final data state is determined from the operation types and payload data of the conflict pair based on an arbitration vote of the local-scale consensus group to obtain the arbitration certificate.
- 5. The method of claim 3, wherein generating the operation log record of the source site according to the write request further comprises: when the operation semantic tag is the mergeable class, performing a merge operation on the corresponding operation log records in the local write-ahead log based on an idempotent convergence algorithm to obtain an updated operation log record; and when the operation semantic tag is the eventual consistency class or the strong consistency class, marking the corresponding operation log record in the local write-ahead log as being in a temporary storage state, wherein the temporary storage state instructs the source site to lock the entity identifier so as to block local concurrent write requests.
- 6. The method according to claim 1, wherein the method further comprises: performing difference detection based on an anti-entropy synchronization algorithm using entity-level Merkle trees by comparing the Merkle trees of the sites, and, when a difference is detected, replaying the corresponding operation log records according to the decision sequence recorded in the arbitration certificate and the certificate sequence, and updating the Merkle tree of each site.
- 7. The method of claim 5, wherein the method further comprises: when the operation log record is in the temporary storage state, conducting a commit-ready vote by an arbiter and the witness node to obtain a voting result; when the voting result is that a majority of responses agree, generating, by the arbiter, an arbitration certificate containing the decisions of the participating sites and the commit-ready agreement, wherein the arbitration certificate instructs each site to convert the operation log record from the temporary storage state to a final commit according to the arbitration certificate; and when the voting result is that the commit condition is not met, generating, by the arbiter, an arbitration certificate containing a compensation instruction, wherein the compensation instruction instructs each site to perform a compensation action on the operation log record.
- 8. A dual-active arbitration cross-domain data synchronization system, the system comprising: a data modification request module, configured to obtain a write request and generate an operation log record of a source site according to the write request, wherein the operation log record comprises an entity identifier, an operation identifier, the source site, an operation type, payload data, a timestamp, a compressed causal summary, an operation hash value and an operation semantic tag; a cross-domain pushing module, configured to shard the operation log record by the entity identifier through a replicator and push it to a remote site to obtain an operation log record copy of the remote site; a request conflict module, configured to, in response to determining that a locally concurrent pending operation log record corresponding to the entity identifier of the operation log record copy exists among the operation log records of the remote site, perform conflict candidate detection on the compressed causal summary in the operation log record copy and on the locally concurrent pending operation log record with the corresponding timestamp to obtain a conflict candidate set; an arbitration module, configured to determine an arbitration policy for each conflict pair according to the operation semantic tags in the conflict candidate set, and to perform a merge operation on each conflict pair according to the arbitration policy to obtain an arbitration certificate; and a data synchronization module, configured to write the arbitration certificate into the local materialized views of the source site and the remote site and perform persistence processing to obtain a result in which the visible data is consistent.
- 9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the method of any one of claims 1 to 7 when executing the computer program.
- 10. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the method of any of claims 1 to 7.
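As an illustration of claims 4 and 7, the mapping from operation semantic tag to arbitration policy and the handling of a commit-ready vote can be sketched in Python. This is a minimal sketch under stated assumptions, not the claimed implementation: the names `SemanticTag`, `ArbitrationPolicy`, `select_policy` and `issue_certificate`, the dictionary layout of the certificate, and the strict-majority rule are all hypothetical.

```python
from enum import Enum


class SemanticTag(Enum):
    """Operation semantic tags from the consistency grouping mapping table."""
    STRONG = "strong_consistency"
    MERGEABLE = "mergeable"
    EVENTUAL = "eventual_consistency"


class ArbitrationPolicy(Enum):
    FAST_MERGE = "fast_merge"
    DETERMINISTIC_SEMANTIC = "deterministic_semantic_arbitration"
    SHORT_TERM_LOCAL_CONSENSUS = "short_term_local_consensus"


# Tag-to-policy mapping as described in claim 4.
POLICY_BY_TAG = {
    SemanticTag.MERGEABLE: ArbitrationPolicy.FAST_MERGE,
    SemanticTag.EVENTUAL: ArbitrationPolicy.DETERMINISTIC_SEMANTIC,
    SemanticTag.STRONG: ArbitrationPolicy.SHORT_TERM_LOCAL_CONSENSUS,
}


def select_policy(tag: SemanticTag) -> ArbitrationPolicy:
    """Return the arbitration policy for a conflict pair's semantic tag."""
    return POLICY_BY_TAG[tag]


def issue_certificate(op_id: str, votes: dict) -> dict:
    """Build a hypothetical arbitration certificate from commit-ready votes.

    `votes` maps a voter name (for example "source", "remote", "witness")
    to True (agree) or False. Assumption: a strict majority of agreeing
    votes means commit; anything else yields a compensation instruction.
    """
    agree = sum(1 for ok in votes.values() if ok)
    decision = "commit" if agree * 2 > len(votes) else "compensate"
    return {"op_id": op_id, "decision": decision, "votes": dict(votes)}
```

With three voters, two agreeing votes produce a `"commit"` certificate and one agreeing vote produces a `"compensate"` certificate, which mirrors the two branches of claim 7.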
Description
Cross-domain data synchronization method, system, equipment and medium for dual-active arbitration
Technical Field
The invention belongs to the technical field of data processing, and in particular relates to a cross-domain data synchronization method, system, equipment and medium for dual-active arbitration.
Background
With the development of distributed storage and wide-area multi-active systems, dual-active data synchronization and arbitration technology has emerged. Each site can accept writes locally to reduce latency, multiple sites can serve requests in parallel to improve availability, and a conflict detection and decision mechanism is relied on to avoid split-brain and restore global consistency. Conventional approaches fall into two classes. One class targets strong consistency or distributed transactions and guarantees the atomicity and ordering of cross-domain writes through distributed consensus (Paxos/Raft) or two-phase commit (2PC). The other class prioritizes availability and low latency and adopts eventual consistency, committing locally and merging in the background.
However, approaches based on strongly consistent distributed transactions or global consensus suffer from high latency, limited throughput and a tendency toward long blocking over a wide area network, which seriously degrades write latency and availability in dual-active scenarios. In addition, the vector clocks or version vectors used for causal tracking grow their metadata as O(N) with the number of nodes N, which significantly increases storage, transmission and computation overhead.
Disclosure of Invention
Accordingly, there is a need for a dual-active arbitration cross-domain data synchronization method, system, equipment and medium that can maintain low-latency local writes while improving cross-domain consistency decisions.
In a first aspect, the present application provides a dual-active arbitration cross-domain data synchronization method, including: obtaining a write request and generating an operation log record of a source site according to the write request, wherein the operation log record comprises an entity identifier, an operation identifier, the source site, an operation type, payload data, a timestamp, a compressed causal summary, an operation hash value and an operation semantic tag; sharding the operation log record by the entity identifier through a replicator and pushing it to a remote site to obtain an operation log record copy of the remote site; in response to determining that a locally concurrent pending operation log record corresponding to the entity identifier of the operation log record copy exists in a local write-ahead log of the remote site, performing conflict candidate detection on the compressed causal summary in the operation log record copy and on the locally concurrent pending operation log record with the corresponding timestamp to obtain a conflict candidate set; determining an arbitration policy for each conflict pair according to the operation semantic tags in the conflict candidate set, and performing a merge operation on each conflict pair according to the arbitration policy to obtain an arbitration certificate; and writing the final state of the write request indicated by the arbitration certificate into the local materialized views of the source site and the remote site, and performing persistence processing to obtain a result in which the visible data is consistent.
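The record fields and the conflict candidate detection step described above can be illustrated with a short Python sketch. It is a sketch under stated assumptions rather than the method of the application: the compressed causal summary is reduced to a plain per-site counter map (the long-tail compressed summary is omitted), two records are treated as conflict candidates when neither summary dominates the other, and the names `OpLogRecord`, `dominates` and `detect_conflict_candidates` are hypothetical.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass
class OpLogRecord:
    entity_id: str
    op_id: str
    source_site: str
    op_type: str
    payload: dict          # payload (load) data of the write request
    timestamp: int
    causal_summary: dict   # assumption: site name -> last counter seen from that site
    semantic_tag: str      # e.g. "strong", "mergeable" or "eventual"

    @property
    def op_hash(self) -> str:
        """SHA-256 over a canonical JSON encoding of the operation fields."""
        body = json.dumps(
            [self.entity_id, self.op_id, self.source_site,
             self.op_type, self.payload, self.timestamp],
            sort_keys=True,
        )
        return hashlib.sha256(body.encode("utf-8")).hexdigest()


def dominates(a: dict, b: dict) -> bool:
    """True if summary `a` has seen at least everything summary `b` has seen."""
    return all(a.get(site, 0) >= count for site, count in b.items())


def detect_conflict_candidates(replica: OpLogRecord, pending: list) -> list:
    """Pair the incoming replica with each concurrent local pending record.

    A local record qualifies when it targets the same entity and neither
    causal summary dominates the other (assumed concurrency test).
    Returns (replica, local_record, semantic_tag) tuples.
    """
    candidates = []
    for local in pending:
        if local.entity_id != replica.entity_id:
            continue
        a, b = replica.causal_summary, local.causal_summary
        if not dominates(a, b) and not dominates(b, a):
            candidates.append((replica, local, replica.semantic_tag))
    return candidates
```

A pending record whose summary is already covered by the replica's summary is causally ordered before the replica and is therefore not reported, which keeps the candidate set limited to genuinely concurrent writes on the same entity.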
In one embodiment, before obtaining the write request and generating the operation log record of the source site according to the write request, the method further includes: classifying the entities to be managed according to service consistency requirements and semantic features to obtain a consistency grouping mapping table, wherein the consistency grouping mapping table comprises entity identifiers and their corresponding operation semantic tags, and the operation semantic tags comprise a strong consistency class, a mergeable class and an eventual consistency class.
In one embodiment, generating the operation log record of the source site according to the write request includes: generating an operation identifier for the write request; generating a compressed causal summary based on a local recent-interaction site set of the source site, wherein the compressed causal summary comprises a short vector summary of hot sites, a compressed summary of long-tail sites and their corresponding timestamps; calculating an operation hash value for the operation log record, and adding the operation hash value, the operation identifier and the compressed causal summary to the operation log record; and writing the operation log record into the local write-ahead log of the source site and receiving a returned local commit acknowledgement.
In one embodiment, determining an arbitration policy for each conflict pair based on the operation semantic tags in the conflict candidate set includes: When the operation semantic tag