CN-115438723-B - Data fusion method, device, equipment and storage medium

Abstract

Embodiments of the application provide a data fusion method, device, equipment and storage medium, relating to the technical field of data processing. The method comprises: obtaining a plurality of first data to be compared within a preset period from a first data center, obtaining a plurality of second data to be compared within the preset period from a second data center, and determining each data unique identifier shared by the plurality of first data to be compared and the plurality of second data to be compared as a first data identifier; for any first data identifier, acquiring the first data to be compared corresponding to the first data identifier as first target data, and acquiring the second data to be compared corresponding to the first data identifier as second target data; and updating the second target data based on the first transaction state of the first target data and the second transaction state of the second target data. Because the application takes into account the precedence relationship among the transaction states in a transaction scenario, the updated second target data is more accurate.

Inventors

FENG GUANJUN
LI JING
WEN YUTIAN

Assignees

China UnionPay Co., Ltd. (中国银联股份有限公司)

Dates

Publication Date: 20260512
Application Date: 20220819

Claims (12)

1. A method of data fusion, comprising: acquiring a plurality of first data to be compared in a preset period from a first data center, and acquiring a plurality of second data to be compared in the preset period from a second data center, wherein the preset period is determined based on a data center switching time point; determining the same data unique identifiers in the first data to be compared and the second data to be compared as first data identifiers; and for any first data identifier, obtaining, from the plurality of first data to be compared, the first data to be compared corresponding to the first data identifier as first target data, obtaining, from the plurality of second data to be compared, the second data to be compared corresponding to the first data identifier as second target data, and, if a first transaction state of the first target data is not null and a second transaction state of the second target data is not null, updating the second target data based on a transaction sequence relation between the first transaction state and the second transaction state.
2. The method as recited in claim 1, further comprising: if the first transaction state in the first target data is not null, the second transaction state in the second target data is not null, and the first transaction state and the second transaction state are different, determining, based on a preset transaction state machine, a first state position corresponding to the first transaction state and a second state position corresponding to the second transaction state; and if the first state position is located after the second state position, updating the second target data with the first target data.
3. The method as recited in claim 2, further comprising: if the first transaction state is the same as the second transaction state, comparing the transaction time point of the first target data with the transaction time point of the second target data; and if the transaction time point of the first target data is later than the transaction time point of the second target data, updating the second target data with the first target data.
4. The method as recited in claim 1, further comprising: if the first transaction state in the first target data is null and the second transaction state in the second target data is null, respectively determining M first subsequent transaction data corresponding to the first target data and N second subsequent transaction data corresponding to the second target data, wherein M >= 0 and N >= 0; respectively determining the data unique identifiers corresponding to the M first subsequent transaction data as first subsequent identifiers, and respectively determining the data unique identifiers corresponding to the N second subsequent transaction data as second subsequent identifiers; based on a preset transaction state machine, respectively determining first subsequent state positions corresponding to the M first subsequent transaction data, and respectively determining second subsequent state positions corresponding to the N second subsequent transaction data; determining a target subsequent transaction data chain based on the obtained M first subsequent identifiers and N second subsequent identifiers, and the M first subsequent state positions and the N second subsequent state positions; and updating the second transaction state in the second target data based on the target subsequent transaction data chain.
5. The method of claim 4, wherein determining the target subsequent transaction data chain based on the obtained M first subsequent identifiers and N second subsequent identifiers, and the M first subsequent state positions and the N second subsequent state positions comprises: if no identical subsequent identifier exists among the M first subsequent identifiers and the N second subsequent identifiers, taking the first subsequent transaction data corresponding to the M first subsequent state positions and the second subsequent transaction data corresponding to the N second subsequent state positions as target subsequent transaction data; and sorting the target subsequent transaction data by transaction time point to obtain the target subsequent transaction data chain.
6. The method as recited in claim 5, further comprising: if identical subsequent identifiers exist among the M first subsequent identifiers and the N second subsequent identifiers, grouping each first subsequent identifier with the second subsequent identifier identical to it to obtain at least one identifier matching group; for any identifier matching group, determining a first subsequent state position corresponding to the first matching identifier and a second subsequent state position corresponding to the second matching identifier; deleting the subsequent transaction data corresponding to the subsequent state position in the first subsequent state position and the second subsequent state position; taking the first subsequent transaction data corresponding to the remaining P first subsequent state positions and the second subsequent transaction data corresponding to the remaining Q second subsequent state positions as target subsequent transaction data, wherein 0 <= P <= M and 0 <= Q <= N; and sorting the target subsequent transaction data by transaction time point to obtain the target subsequent transaction data chain.
7. The method of claim 4, wherein updating the second transaction state in the second target data based on the target subsequent transaction data chain comprises: determining, based on the preset transaction state machine, a first position relation between any two adjacent target subsequent transaction data in the target subsequent transaction data chain; determining a second position relation of the two adjacent target subsequent transaction data within the target subsequent transaction data chain; and if the first position relation is the same as the second position relation, determining the second transaction state in the second target data based on the transaction states corresponding to the target subsequent transaction data in the target subsequent transaction data chain.
8. The method of claim 1, wherein the data unique identifier comprises an application service unique identifier and a center service unique identifier, the method further comprising: determining, from the plurality of first data to be compared and the plurality of second data to be compared, at least one data pair to be compared whose application service unique identifiers differ and whose center service unique identifiers are the same, wherein each data pair to be compared comprises one first data to be compared and one second data to be compared; and if the transaction time point of the first data to be compared in the data pair is earlier than the transaction time point of the second data to be compared in the data pair, updating the second data to be compared in the data pair with the first data to be compared in the data pair.
9. The method as recited in claim 1, further comprising: for a first attribute identifier in the second target data, judging whether a first attribute value corresponding to the first attribute identifier is within a preset range, and if not, adding the second target data to an abnormal file; and for the first attribute identifier in the second target data, determining a second attribute identifier associated with the first attribute identifier, judging whether the first attribute value corresponding to the first attribute identifier and a second attribute value corresponding to the second attribute identifier meet a preset relation, and if so, adding the second target data to the abnormal file, wherein the abnormal file is used for manual review.
10. A data fusion device, comprising: an acquisition module, configured to acquire a plurality of first data to be compared in a preset period from a first data center and to acquire a plurality of second data to be compared in the preset period from a second data center, wherein the preset period is determined based on a data center switching time point; a determining module, configured to determine the same data unique identifiers in the plurality of first data to be compared and the plurality of second data to be compared as first data identifiers; and an updating module, configured to, for any first data identifier, acquire the first data to be compared corresponding to the first data identifier from the plurality of first data to be compared as first target data, acquire the second data to be compared corresponding to the first data identifier from the plurality of second data to be compared as second target data, and, if the first transaction state of the first target data is not null and the second transaction state of the second target data is not null, update the second target data based on the transaction sequence relation between the first transaction state and the second transaction state.
11. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the method of any of claims 1-9 when executing the computer program.
12. A computer readable storage medium, characterized in that it stores a computer program executable by a computer device, which program, when run on the computer device, causes the computer device to perform the steps of the method according to any one of claims 1-9.
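
The update rule of claims 2 and 3 can be illustrated with a minimal Python sketch. The state names in `STATE_ORDER`, the `Record` type, and the function names are hypothetical and chosen only for illustration; the claims do not name concrete transaction states or data structures. The sketch also assumes that the preset transaction state machine is linear, so that a state position is simply an index.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

# Hypothetical linear transaction state machine (assumption: the patent does
# not list concrete states). A state's position is its index in this list.
STATE_ORDER = ["CREATED", "AUTHORIZED", "SETTLED", "REFUNDED"]
STATE_POSITION = {state: index for index, state in enumerate(STATE_ORDER)}


@dataclass
class Record:
    """Illustrative record type; the field names are assumptions."""
    data_id: str
    state: Optional[str]
    transaction_time: datetime


def should_overwrite(first: Record, second: Record) -> bool:
    """Return True if the second target data should be updated with the first."""
    if first.state is None or second.state is None:
        # Null states are handled by a separate branch (claim 4), not here.
        return False
    if first.state != second.state:
        # Claim 2: the record whose state lies later in the state machine wins.
        return STATE_POSITION[first.state] > STATE_POSITION[second.state]
    # Claim 3: identical states, so the later transaction time point wins.
    return first.transaction_time > second.transaction_time


def fuse(first: Record, second: Record) -> Record:
    """Return the record that the second data center should hold after fusion."""
    return first if should_overwrite(first, second) else second
```

Under these assumptions, a `SETTLED` record from the first data center replaces an `AUTHORIZED` record in the second data center even when the `AUTHORIZED` record carries the later timestamp, which is the case a comparison based on update time alone would get wrong.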

Description

Data fusion method, device, equipment and storage medium

Technical Field

Embodiments of the invention relate to the technical field of data processing, and in particular to a data fusion method, device, equipment and storage medium.

Background

With the rapid development of internet technology, business systems keep growing in scale, and the losses caused by a single technical failure can be immeasurable. To improve the disaster recovery capability of a business system, current business systems generally adopt a geo-distributed multi-active architecture: data centers are deployed at different geographic locations, and each data center can provide services externally. The data stored in the different data centers back each other up. Because backup between data centers is subject to synchronization delay, the data stored in different data centers is not guaranteed to be fully consistent at any given point in time.
Suppose data center A provides a given business service externally. When data center A suffers an equipment failure, the service is typically switched to another data center, which continues to provide it. However, because of the synchronization delay between data centers, the other data center may lack the data corresponding to the service, or its data may be inconsistent with the data of data center A, which can cause errors in the service that the other data center provides.
At present, a preset period that includes the switching time point is generally determined, the data within the preset period is acquired from data center A as source data, and the source data is copied to the other data centers. When the source data is inconsistent with the data of another data center, the update times of the data are compared, and the data with the later update time is used for the update.
This approach can cause problems such as data omission, and cannot guarantee the integrity and continuity of the data in the data center.

Disclosure of Invention

Embodiments of the application provide a data fusion method, device, equipment and storage medium for guaranteeing the integrity and continuity of the data in a data center.
In one aspect, an embodiment of the present application provides a data fusion method, the method including:
acquiring a plurality of first data to be compared in a preset period from a first data center, and acquiring a plurality of second data to be compared in the preset period from a second data center, wherein the preset period is determined based on a data center switching time point;
determining the same data unique identifiers in the first data to be compared and the second data to be compared as first data identifiers; and
for any first data identifier, obtaining, from the plurality of first data to be compared, the first data to be compared corresponding to the first data identifier as first target data, obtaining, from the plurality of second data to be compared, the second data to be compared corresponding to the first data identifier as second target data, and updating the second target data based on a first transaction state of the first target data and a second transaction state of the second target data.
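
The acquisition and identifier-matching steps can be sketched as follows. This is an illustration under stated assumptions rather than the applicant's implementation: records are modelled as plain dictionaries with hypothetical `data_id` and `transaction_time` keys, and the preset period is assumed to be a symmetric margin around the switching time point, whereas the text only says that the period is determined based on that time point.

```python
from datetime import datetime, timedelta


def comparison_window(switch_time, margin):
    """Preset period around the switching time point (symmetric margin is an assumption)."""
    return switch_time - margin, switch_time + margin


def records_in_window(records, start, end):
    """Index by data unique identifier the records whose transaction time lies in the window."""
    return {
        record["data_id"]: record
        for record in records
        if start <= record["transaction_time"] <= end
    }


def pair_by_identifier(first_records, second_records, start, end):
    """Return (first target data, second target data) pairs for every shared identifier."""
    first = records_in_window(first_records, start, end)
    second = records_in_window(second_records, start, end)
    shared_ids = sorted(first.keys() & second.keys())  # the "first data identifiers"
    return [(first[data_id], second[data_id]) for data_id in shared_ids]
```

Each returned pair is then passed to the state-based update rule. Records whose identifier appears in only one data center are not paired by this step.
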
Optionally, updating the second target data based on the first transaction state of the first target data and the second transaction state of the second target data includes:
if the first transaction state in the first target data is not null, the second transaction state in the second target data is not null, and the first transaction state and the second transaction state are different, determining, based on a preset transaction state machine, a first state position corresponding to the first transaction state and a second state position corresponding to the second transaction state; and
if the first state position is located after the second state position, updating the second target data with the first target data.
Optionally, the method further comprises:
if the first transaction state is the same as the second transaction state, comparing the transaction time point of the first target data with the transaction time point of the second target data; and
if the transaction time point of the first target data is later than the transaction time point of the second target data, updating the second target data with the first target data.
Optionally, updating the second target data based on the first transaction state of the first target data and the second transaction state of the second target data includes:
if the first transaction state in the first target data is null and the second transaction state in the second target data is null, respectively determining M first subsequent transaction data corresponding to the first target data and N second subsequent transacti