CN-121979852-A - Data migration method and system for multiple data sources
Abstract
The invention relates to the technical field of electric data migration, and in particular to a data migration method and system for multiple data sources. The method comprises the following steps: identifying a stream processing platform and a plurality of electric data streams; acquiring a plurality of connected data source connectors; acquiring an electric data stream data packet set; acquiring a normal node set and a migrating node set; acquiring a node topology structure diagram and marking it to obtain an identification topology structure diagram; sequentially extracting migrating nodes from the migrating node set; confirming a target migration position; acquiring a candidate receiving node set; acquiring an optimal receiving node according to the migrating node and the candidate receiving node set; receiving a data migration instruction; performing data migration on the migrating node according to the optimal receiving node and the data migration instruction to obtain a migrated node; summarizing the migrated nodes to obtain a migrated node set; and completing the data migration of the multiple data sources based on the migrated node set. The invention can improve the efficiency and timeliness of data migration.
Inventors
- QU WEIQIANG
- GUI NING
- LI PENGFEI
- CAO CHEN
- CHEN ZEYANG
- TAN XIAOGANG
- LIU CHUNQI
- XU LIMING
- WANG XINYAO
- LIU YIWEN
- LIU BO
Assignees
- 黄河水利水电开发集团有限公司
- 北京华科同安监控技术有限公司
Dates
- Publication Date: 2026-05-05
- Application Date: 2025-11-30
Claims (10)
- 1. A data migration method for multiple data sources, the method comprising: identifying a stream processing platform and a plurality of electric data streams, wherein the stream processing platform comprises a plurality of data source connectors; acquiring a plurality of connected data source connectors based on the plurality of data source connectors and the plurality of electric data streams, and acquiring an electric data stream data packet set based on the plurality of connected data source connectors; acquiring a normal node set and a migrating node set based on the electric data stream data packet set, acquiring a node topology structure diagram, and marking the node topology structure diagram by using the normal node set and the migrating node set to obtain an identification topology structure diagram; sequentially extracting migrating nodes from the migrating node set, confirming a target migration position in the identification topology structure diagram according to the extracted migrating node, and acquiring a candidate receiving node set from the identification topology structure diagram according to the target migration position; obtaining an optimal receiving node according to the migrating node and the candidate receiving node set, receiving a data migration instruction, and performing data migration on the migrating node according to the optimal receiving node and the data migration instruction to obtain a migrated node; and summarizing the migrated nodes to obtain a migrated node set, and completing the data migration of the multiple data sources based on the migrated node set.
- 2. The data migration method for multiple data sources of claim 1, wherein said acquiring an electric data stream data packet set based on the plurality of connected data source connectors comprises performing the following for each of the plurality of connected data source connectors: monitoring the connected data source connector by using a pre-constructed load monitor to obtain an electric data flow rate sequence, and acquiring an electric data stream ID according to the connected data source connector, wherein the electric data flow rate sequence comprises a plurality of electric data flow rates; and packaging the electric data stream ID and the electric data flow rate sequence to obtain an electric data stream data packet, and summarizing the electric data stream data packets to obtain the electric data stream data packet set, wherein the electric data stream data packet set comprises a plurality of electric data stream data packets, and the electric data stream data packets, the connected data source connectors and the electric data streams are in one-to-one correspondence.
- 3. The data migration method for multiple data sources of claim 2, wherein said acquiring a normal node set and a migrating node set based on the electric data stream data packet set comprises: sequentially extracting electric data stream data packets from the electric data stream data packet set, and performing the following operations on each extracted electric data stream data packet: performing a stationarity test on the electric data flow rate sequence in the electric data stream data packet to obtain a detection result, wherein the detection result is stationary or non-stationary; if the detection result is non-stationary, acquiring a differential flow rate sequence based on the electric data flow rate sequence, taking the differential flow rate sequence as the electric data flow rate sequence, and returning to the step of performing the stationarity test on the electric data flow rate sequence in the electric data stream data packet until the detection result is stationary, taking the differential flow rate sequence corresponding to the stationary detection result as a stationary data flow rate sequence, confirming the corresponding differential order according to the stationary data flow rate sequence, and acquiring an optimal prediction model according to the stationary data flow rate sequence and the differential order; if the detection result is stationary, acquiring an optimal prediction model according to the electric data flow rate sequence; predicting the electric data stream corresponding to the extracted electric data stream data packet by using the optimal prediction model to obtain a predicted resource occupation parameter set, wherein the predicted resource occupation parameter set comprises a predicted CPU occupancy rate, a predicted memory occupancy rate, a predicted network bandwidth occupancy rate and a predicted disk I/O occupancy rate; judging the predicted resource occupation parameter set by using a pre-constructed resource judging module to obtain a migrating node or a normal node; and summarizing the normal nodes and the migrating nodes respectively to obtain the normal node set and the migrating node set.
- 4. The data migration method for multiple data sources of claim 3, wherein said acquiring a differential flow rate sequence based on the electric data flow rate sequence comprises: sequentially extracting electric data flow rates from the electric data flow rate sequence, and performing the following operations on each extracted electric data flow rate: acquiring an adjacent electric data flow rate from the electric data flow rate sequence based on the extracted electric data flow rate, wherein the adjacent electric data flow rate is the flow rate that is adjacent to and lags the extracted electric data flow rate in the electric data flow rate sequence; calculating an electric data flow rate difference value according to the electric data flow rate and the adjacent electric data flow rate, wherein the electric data flow rate difference value is obtained by subtracting the adjacent electric data flow rate from the electric data flow rate; and summarizing the electric data flow rate difference values to obtain the differential flow rate sequence.
- 5. The data migration method for multiple data sources of claim 4, wherein said acquiring an optimal prediction model according to the stationary data flow rate sequence and the differential order comprises: determining an autoregressive order candidate value set and a moving average order candidate value set, sequentially extracting autoregressive order candidate values from the autoregressive order candidate value set, and acquiring an order combination set based on each extracted autoregressive order candidate value and the moving average order candidate value set, wherein the order combination set comprises a plurality of order combinations, and each order combination comprises an autoregressive order candidate value and a moving average order candidate value; summarizing the order combination sets to obtain a total order combination set, sequentially extracting order combinations from the total order combination set, and performing the following operations on each extracted order combination: forming an ARIMA model according to the differential order and the order combination, and training the ARIMA model by using the stationary data flow rate sequence to obtain a maximum likelihood function value; obtaining the total number of samples according to the stationary data flow rate sequence, and calculating a model BIC value according to the total number of samples, the maximum likelihood function value and the differential order; and summarizing the model BIC values to obtain a model BIC value set, taking the order combination corresponding to the minimum model BIC value in the model BIC value set as the optimal combination, and confirming the optimal prediction model based on the optimal combination.
- 6. The data migration method for multiple data sources of claim 5, wherein the model BIC value is calculated as follows: [formula not preserved in the source text], wherein the symbols of the formula denote, respectively, the model BIC value, the autoregressive order candidate value, the moving average order candidate value, the total number of samples, the logarithmic function, and the maximum likelihood function value.
- 7. The data migration method for multiple data sources of claim 6, wherein said judging the predicted resource occupation parameter set by using the pre-constructed resource judging module to obtain a migrating node or a normal node comprises: determining a resource lower limit and a resource upper limit according to the resource judging module, and judging whether the predicted resource occupation parameters in the predicted resource occupation parameter set are smaller than the resource lower limit; if the predicted resource occupation parameters in the predicted resource occupation parameter set are smaller than the resource lower limit, taking the electric data stream corresponding to the extracted electric data stream data packet as a resource surplus node; otherwise, judging whether the predicted resource occupation parameter set contains a predicted resource occupation parameter larger than the resource upper limit; if the predicted resource occupation parameter set contains a predicted resource occupation parameter larger than the resource upper limit, taking the electric data stream as a resource overload node; if the predicted resource occupation parameter set contains no predicted resource occupation parameter larger than the resource upper limit, taking the electric data stream as a normal node; and taking the resource surplus node or the resource overload node as a migrating node.
- 8. The data migration method for multiple data sources of claim 7, wherein said obtaining an optimal receiving node according to the migrating node and the candidate receiving node set comprises: if the migrating node is a resource surplus node, acquiring the total amount of data to be migrated from the resource surplus node, confirming a surplus candidate receiving node set from the candidate receiving node set according to the resource surplus node, and confirming an optimal surplus receiving node from the surplus candidate receiving node set according to the total amount of data to be migrated; if the migrating node is a resource overload node, confirming a node predicted resource occupation parameter set according to the resource overload node, confirming a maximum resource occupancy rate according to the node predicted resource occupation parameter set, and calculating a load to be reduced according to the maximum resource occupancy rate and a preset target resource occupancy rate, wherein the load to be reduced is the value obtained by subtracting the target resource occupancy rate from the maximum resource occupancy rate; calculating a data flow rate value to be migrated according to the load to be reduced and the maximum resource occupancy rate, acquiring overload migration data from the resource overload node according to the data flow rate value to be migrated, acquiring an overload candidate receiving node set from the candidate receiving node set according to the resource overload node, and confirming an optimal overload receiving node from the overload candidate receiving node set according to the overload migration data; and taking the optimal surplus receiving node or the optimal overload receiving node as the optimal receiving node.
- 9. The data migration method for multiple data sources of claim 8, wherein said confirming an optimal surplus receiving node from the surplus candidate receiving node set according to the total amount of data to be migrated comprises: sequentially extracting surplus candidate receiving nodes from the surplus candidate receiving node set, acquiring a surplus candidate node resource occupancy rate set according to the extracted surplus candidate receiving node, and predicting a receiving occupancy rate set according to the total amount of data to be migrated and the surplus candidate node resource occupancy rate set; if the receiving occupancy rate set contains a receiving occupancy rate larger than a preset resource upper limit, removing the surplus candidate receiving node from the surplus candidate receiving node set to obtain an updated surplus candidate receiving node set, taking the updated surplus candidate receiving node set as the surplus candidate receiving node set, and returning to the step of sequentially extracting surplus candidate receiving nodes from the surplus candidate receiving node set until all surplus candidate receiving nodes in the surplus candidate receiving node set have been extracted; if the receiving occupancy rate set contains no receiving occupancy rate larger than the resource upper limit, taking the surplus candidate receiving node as an effective receiving node; and summarizing the effective receiving nodes to obtain an effective receiving node set, and obtaining the optimal surplus receiving node according to the effective receiving node set.
- 10. A data migration system for multiple data sources, the system comprising: a data source connection module, used for identifying a stream processing platform and a plurality of electric data streams, wherein the stream processing platform comprises a plurality of data source connectors, acquiring a plurality of connected data source connectors based on the plurality of data source connectors and the plurality of electric data streams, and acquiring an electric data stream data packet set based on the plurality of connected data source connectors; a node identification module, used for acquiring a normal node set and a migrating node set based on the electric data stream data packet set, acquiring a node topology structure diagram, marking the node topology structure diagram by using the normal node set and the migrating node set to obtain an identification topology structure diagram, sequentially extracting migrating nodes from the migrating node set, confirming a target migration position in the identification topology structure diagram according to the extracted migrating node, and acquiring a candidate receiving node set from the identification topology structure diagram according to the target migration position; an optimal receiving node confirmation module, used for acquiring an optimal receiving node according to the migrating node and the candidate receiving node set, receiving a data migration instruction, and performing data migration on the migrating node according to the optimal receiving node and the data migration instruction to obtain a migrated node; and a data migration module, used for summarizing the migrated nodes to obtain a migrated node set, and completing the data migration of the multiple data sources based on the migrated node set.
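The decision rules recited in claims 4 and 6 to 9 can be illustrated with a minimal Python sketch. It is not the patented implementation. All function names, the resource keys, and the threshold values are hypothetical. Three points are assumptions rather than statements from the claims: the BIC uses the standard form k·ln(n) − 2·ln(L) with k = p + q + 1, because the claim's own formula is not preserved in this text; "smaller than the resource lower limit" is read as applying to every predicted parameter; and the receiving occupancy is predicted by adding one fixed load increment to each occupancy rate.

```python
import math
from typing import Dict, List


def difference(rates: List[float]) -> List[float]:
    """Claim 4: each flow rate minus the adjacent, lagging flow rate."""
    return [rates[i] - rates[i - 1] for i in range(1, len(rates))]


def bic(p: int, q: int, n: int, log_likelihood: float) -> float:
    """Standard BIC, k*ln(n) - 2*ln(L).

    ASSUMPTION: k = p + q + 1 free parameters; the patent's exact
    formula is missing from the source text.
    """
    k = p + q + 1
    return k * math.log(n) - 2.0 * log_likelihood


def classify(predicted: Dict[str, float], lower: float, upper: float) -> str:
    """Claim 7: label a node as surplus, overload, or normal.

    ASSUMPTION: a node is 'surplus' only if every predicted
    occupancy is below the lower limit.
    """
    if all(v < lower for v in predicted.values()):
        return "surplus"
    if any(v > upper for v in predicted.values()):
        return "overload"
    return "normal"


def load_to_reduce(predicted: Dict[str, float], target: float) -> float:
    """Claim 8: maximum predicted occupancy minus the target occupancy."""
    return max(predicted.values()) - target


def valid_receivers(
    candidates: Dict[str, Dict[str, float]], added_load: float, upper: float
) -> List[str]:
    """Claim 9: keep candidates whose predicted receiving occupancy
    never exceeds the upper limit.

    ASSUMPTION: receiving the data adds `added_load` to every
    occupancy rate of the candidate.
    """
    return [
        name
        for name, occupancy in candidates.items()
        if all(v + added_load <= upper for v in occupancy.values())
    ]


# Hypothetical usage with made-up thresholds (lower 0.2, upper 0.8).
node = {"cpu": 0.9, "mem": 0.5, "net": 0.4, "disk": 0.3}
print(classify(node, lower=0.2, upper=0.8))
print(load_to_reduce(node, target=0.7))
```

Selecting the order combination in claim 5 would then amount to fitting one ARIMA model per (p, q) pair and keeping the pair with the smallest `bic` value; the fitting itself is left out here because it needs a time-series library.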
Description
Data migration method and system for multiple data sources
Technical Field
The invention relates to the technical field of electric data migration, and in particular to a data migration method and system for multiple data sources.
Background
Multiple data sources refers to data that originates from several different places. Data migration refers to the process of transferring data from one location or system to another. Although the demand for data migration is increasing, existing data migration methods have many limitations. First, conventional data migration methods often rely on manual operations or simple scripting tools, which are inefficient and error-prone when handling multiple data sources. Second, existing data migration methods often do not fully consider resource management and cannot dynamically adjust resource allocation during migration, which leads to wasted resources or system overload. Therefore, how to improve the efficiency and timeliness of data migration is a technical problem that urgently needs to be solved.
Disclosure of Invention
The invention provides a data migration method for multiple data sources and a computer readable storage medium, and mainly aims to improve the efficiency and timeliness of data migration.
In order to achieve the above object, the present invention provides a data migration method for multiple data sources, including: identifying a stream processing platform and a plurality of electric data streams, wherein the stream processing platform comprises a plurality of data source connectors; acquiring a plurality of connected data source connectors based on the plurality of data source connectors and the plurality of electric data streams, and acquiring an electric data stream data packet set based on the plurality of connected data source connectors; acquiring a normal node set and a migrating node set based on the electric data stream data packet set, acquiring a node topology structure diagram, and marking the node topology structure diagram by using the normal node set and the migrating node set to obtain an identification topology structure diagram; sequentially extracting migrating nodes from the migrating node set, confirming a target migration position in the identification topology structure diagram according to the extracted migrating node, and acquiring a candidate receiving node set from the identification topology structure diagram according to the target migration position; obtaining an optimal receiving node according to the migrating node and the candidate receiving node set, receiving a data migration instruction, and performing data migration on the migrating node according to the optimal receiving node and the data migration instruction to obtain a migrated node; and summarizing the migrated nodes to obtain a migrated node set, and completing the data migration of the multiple data sources based on the migrated node set.
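The overall flow of the method can be outlined with a short Python sketch. It is only an illustration under stated assumptions: the topology is modeled as an adjacency dictionary, candidate receiving nodes are taken to be the normal neighbors of the migrating node, and the optimal receiving node is taken to be the candidate with the lowest current load. None of these rules is specified in the passage above, and all names (`candidate_receivers`, `pick_receiver`, `plan_migrations`) are hypothetical. The actual data transfer is omitted; the sketch only returns the (migrating node, receiving node) pairs.

```python
from typing import Dict, List, Optional, Tuple


def candidate_receivers(
    topology: Dict[str, List[str]], labels: Dict[str, str], node: str
) -> List[str]:
    """ASSUMPTION: candidates are neighbors of `node` labeled 'normal'."""
    return [n for n in topology.get(node, []) if labels.get(n) == "normal"]


def pick_receiver(candidates: List[str], load: Dict[str, float]) -> Optional[str]:
    """ASSUMPTION: the optimal receiver is the least-loaded candidate."""
    if not candidates:
        return None
    return min(candidates, key=lambda n: load[n])


def plan_migrations(
    topology: Dict[str, List[str]],
    labels: Dict[str, str],
    load: Dict[str, float],
) -> List[Tuple[str, str]]:
    """Walk the migrating nodes in order and pair each with a receiver."""
    migrated: List[Tuple[str, str]] = []
    for node, label in labels.items():
        if label != "migrating":
            continue
        receiver = pick_receiver(candidate_receivers(topology, labels, node), load)
        if receiver is None:
            continue  # no suitable receiver in the labeled topology
        # A real system would move the data here on receipt of the
        # data migration instruction; the sketch only records the pair.
        migrated.append((node, receiver))
    return migrated


# Hypothetical labeled topology with two migrating nodes.
topology = {"a": ["b", "c"], "b": ["a"], "c": ["a", "d"], "d": ["c"]}
labels = {"a": "migrating", "b": "normal", "c": "normal", "d": "migrating"}
load = {"a": 0.9, "b": 0.6, "c": 0.3, "d": 0.95}
print(plan_migrations(topology, labels, load))
```

The sketch does not update `load` after each pairing, so several migrating nodes can be assigned to the same receiver; a fuller version would re-estimate the receiver's occupancy after every migration.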
Optionally, the acquiring an electric data stream data packet set based on the plurality of connected data source connectors includes performing the following for each of the plurality of connected data source connectors: monitoring the connected data source connector by using a pre-constructed load monitor to obtain an electric data flow rate sequence, and acquiring an electric data stream ID according to the connected data source connector, wherein the electric data flow rate sequence comprises a plurality of electric data flow rates; and packaging the electric data stream ID and the electric data flow rate sequence to obtain an electric data stream data packet, and summarizing the electric data stream data packets to obtain the electric data stream data packet set, wherein the electric data stream data packet set comprises a plurality of electric data stream data packets, and the electric data stream data packets, the connected data source connectors and the electric data streams are in one-to-one correspondence.
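The packaging step above can be sketched in Python as follows. This is an illustrative sketch only: `StreamPacket` and `build_packets` are hypothetical names, and the load monitor is modeled as a plain callable that returns the sampled flow rates of one connector, which is an assumption rather than the interface described in the source.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class StreamPacket:
    """One packet per connected connector: stream ID plus flow rate sequence."""
    stream_id: str
    rates: Tuple[float, ...]


def build_packets(
    monitors: Dict[str, Callable[[], List[float]]]
) -> List[StreamPacket]:
    """Package each stream ID with its monitored flow rate sequence.

    `monitors` maps a stream ID to a hypothetical load-monitor callable.
    Dictionary keys are unique, so packets, connectors, and streams stay
    in one-to-one correspondence.
    """
    packets: List[StreamPacket] = []
    for stream_id, sample_rates in monitors.items():
        packets.append(StreamPacket(stream_id, tuple(sample_rates())))
    return packets


# Hypothetical monitors returning fixed samples.
packets = build_packets({
    "stream-1": lambda: [120.0, 130.5, 128.0],
    "stream-2": lambda: [40.0, 42.5],
})
for packet in packets:
    print(packet.stream_id, len(packet.rates))
```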
Optionally, the acquiring a normal node set and a migrating node set based on the electric data stream data packet set includes: sequentially extracting electric data stream data packets from the electric data stream data packet set, and performing the following operations on each extracted electric data stream data packet: performing a stationarity test on the electric data flow rate sequence in the electric data stream data packet to obtain a detection result, wherein the detection result is stationary or non-stationary; if the detection result is non-stationary, acquiring a differential flow rate sequence based on the electric data flow rate sequence, taking the differential flow rate sequence as the electric data flow rate sequence, and returning to the step of performing the stationarity test on the electric data flow rate sequence in the electric data stream data packet until the detection result is stationary, taking the differential flow rate sequence corresponding to the stationary detection result as a stationary