CN-122027230-A - Cross-region data sharing method and device
Abstract
The invention discloses a cross-region data sharing method and device. The method comprises: receiving access requests from heterogeneous data sources in different regions; determining an optimal data transmission path according to the encryption grade identifier and the communication metrics of each data transmission path in the communication network, acquired in real time; determining the data prefetching amount allocated to each node in the optimal data transmission path according to the available bandwidth of that path; executing a data prefetching task according to the determined data prefetching amount, and caching the prefetched data and corresponding metadata locally; updating a pre-built global data resource catalog according to the cached data and metadata; and, in response to a data operation request from a user, querying the updated global data resource catalog to locate target data and executing an access operation on the target data. The invention can effectively maintain data consistency and integrity while ensuring access performance, and enables flexible scheduling and on-demand sharing of global data.
Inventors
- ZENG JINGYONG
- XU HUIRU
- LIU YAJING
- WANG HUA
- XU DINGYU
- WANG YUNFEI
- WANG LIHUA
Assignees
- 昆仑数智科技有限责任公司
- 中国石油天然气集团有限公司
Dates
- Publication Date
- 20260512
- Application Date
- 20260116
Claims (13)
- 1. A cross-region data sharing method, characterized by comprising the following steps: receiving access requests from heterogeneous data sources in different regions, wherein the access requests carry encryption grade identifiers of the heterogeneous data sources; determining an optimal data transmission path according to the encryption grade identifier and the communication metrics of each data transmission path in the communication network, acquired in real time; determining the data prefetching amount allocated to each node in the optimal data transmission path according to the available bandwidth of the optimal data transmission path, executing a data prefetching task according to the determined data prefetching amount, and caching the prefetched data and corresponding metadata locally; updating a pre-constructed global data resource catalog according to the cached data and metadata; and, in response to a data operation request of a user, querying the updated global data resource catalog to locate target data, and executing an access operation on the target data.
- 2. The method of claim 1, wherein determining the optimal data transmission path according to the encryption grade identifier and the communication metrics of each data transmission path in the communication network, acquired in real time, comprises: determining the path security weight corresponding to the encryption grade identifier according to a preset mapping relation between encryption grades and security weights; calculating a comprehensive evaluation value for each path by weighted summation, based on the path security weight and the communication metrics of each data transmission path; and selecting the optimal data transmission path according to the comprehensive evaluation values.
- 3. The method of claim 1, wherein the communication metrics comprise at least one of a delay rate, a bandwidth utilization rate, and a packet loss rate.
- 4. The method of claim 1, wherein determining the data prefetching amount allocated to each node in the optimal data transmission path according to the available bandwidth of the optimal data transmission path comprises: acquiring the current available bandwidth of the optimal data transmission path and the historical data access characteristics of each node; and calculating the data prefetching amount of each node based on the current available bandwidth and the historical data access characteristics, wherein the data prefetching amount allocated to a node is proportional to the access frequency of that node.
- 5. The method of claim 1, wherein, in response to a data operation request of a user, querying the updated global data resource catalog to locate target data and executing an access operation on the target data comprises: parsing the data operation request to obtain a target data identifier and an operation type; querying the updated global data resource catalog to obtain the target data location information and data state corresponding to the target data identifier; and, based on the target data location information, the data state and the operation type, selecting either direct access to the local cache or cross-node cooperative access to execute the access operation on the target data.
- 6. A cross-region data sharing apparatus, comprising: an access request receiving module, configured to receive access requests from heterogeneous data sources in different regions, wherein the access requests carry encryption grade identifiers of the heterogeneous data sources; an optimal data transmission path determining module, configured to determine an optimal data transmission path according to the encryption grade identifier and the communication metrics of each data transmission path in the communication network, acquired in real time; a data prefetching amount determining module, configured to determine the data prefetching amount allocated to each node in the optimal data transmission path according to the available bandwidth of the optimal data transmission path, execute a data prefetching task according to the determined data prefetching amount, and cache the prefetched data and corresponding metadata locally; a global data resource catalog updating module, configured to update a pre-constructed global data resource catalog according to the cached data and metadata; and an access module of the target data, configured to, in response to a data operation request of a user, query the updated global data resource catalog to locate target data and execute an access operation on the target data.
- 7. The apparatus of claim 6, wherein the optimal data transmission path determining module is specifically configured to: determine the path security weight corresponding to the encryption grade identifier according to a preset mapping relation between encryption grades and security weights; calculate a comprehensive evaluation value for each path by weighted summation, based on the path security weight and the communication metrics of each data transmission path; and select the optimal data transmission path according to the comprehensive evaluation values.
- 8. The apparatus of claim 6, wherein the communication metrics comprise at least one of a delay rate, a bandwidth utilization rate, and a packet loss rate.
- 9. The apparatus of claim 6, wherein the data prefetching amount determining module is specifically configured to: acquire the current available bandwidth of the optimal data transmission path and the historical data access characteristics of each node; and calculate the data prefetching amount of each node based on the current available bandwidth and the historical data access characteristics, wherein the data prefetching amount allocated to a node is proportional to the access frequency of that node.
- 10. The apparatus of claim 6, wherein the access module of the target data is specifically configured to: parse the data operation request to obtain a target data identifier and an operation type; query the updated global data resource catalog to obtain the target data location information and data state corresponding to the target data identifier; and, based on the target data location information, the data state and the operation type, select either direct access to the local cache or cross-node cooperative access to execute the access operation on the target data.
- 11. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the method of any of claims 1 to 5 when executing the computer program.
- 12. A computer readable storage medium, characterized in that the computer readable storage medium stores a computer program which, when executed by a processor, implements the method of any of claims 1 to 5.
- 13. A computer program product, characterized in that the computer program product comprises a computer program which, when executed by a processor, implements the method of any of claims 1 to 5.
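Claims 2 and 7 select the path by weighted summation of a path security weight and the communication metrics. The following is a minimal Python sketch of one possible reading, not the claimed implementation. The grade labels and values in `SECURITY_WEIGHTS`, the per-path `security_rating` field, the normalization of every metric to the range 0 to 1, and the equal-weight quality average are all assumptions introduced for illustration; the claims do not specify them.

```python
from dataclasses import dataclass
from typing import Iterable

# Assumed preset mapping from encryption grade identifier to security weight.
# The labels and values are hypothetical.
SECURITY_WEIGHTS = {"L1": 0.2, "L2": 0.5, "L3": 0.8}


@dataclass(frozen=True)
class PathMetrics:
    path_id: str
    security_rating: float        # assumption: 0..1, higher is more secure
    delay: float                  # normalized 0..1, lower is better
    bandwidth_utilization: float  # normalized 0..1, lower is better
    packet_loss: float            # normalized 0..1, lower is better


def communication_quality(path: PathMetrics) -> float:
    """Equal-weight average of the three metrics, inverted so higher is better."""
    cost = (path.delay + path.bandwidth_utilization + path.packet_loss) / 3.0
    return 1.0 - cost


def evaluate(path: PathMetrics, encryption_grade: str) -> float:
    """Weighted sum of path security and communication quality."""
    w = SECURITY_WEIGHTS[encryption_grade]
    return w * path.security_rating + (1.0 - w) * communication_quality(path)


def select_path(paths: Iterable[PathMetrics], encryption_grade: str) -> PathMetrics:
    """Return the path with the highest comprehensive evaluation value."""
    return max(paths, key=lambda p: evaluate(p, encryption_grade))


if __name__ == "__main__":
    candidates = [
        PathMetrics("fast", 0.3, 0.1, 0.2, 0.0),
        PathMetrics("secure", 0.9, 0.4, 0.5, 0.1),
    ]
    for grade in SECURITY_WEIGHTS:
        print(grade, select_path(candidates, grade).path_id)
```

Under these assumptions, a higher encryption grade shifts the evaluation toward the path's security rating, while a lower grade favors raw communication quality.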
Description
Cross-region data sharing method and device
Technical Field
The present invention relates to the field of data management technologies, and in particular to a method and an apparatus for cross-region data sharing.
Background
As the digital transformation of enterprises deepens, unified management and efficient use of cross-region, multi-source heterogeneous data have become urgent needs. Currently, data is typically dispersed among multiple independent systems in different geographical areas, and these systems use different storage types and access protocols, such as databases, file systems, and various cloud storage services. This decentralized, heterogeneous architecture creates serious data silos, which makes it extremely difficult to build a real-time, unified global data view and in turn restricts collaboration efficiency and real-time decision making across business units.
To address these challenges, a common prior-art approach in the industry is to build a global resource catalog. Each branch node in a different region must first synchronize its local data, in full or incrementally, to a centralized headquarters data center. After the data has been gathered, a global data resource catalog is built up gradually at the central node by extracting metadata from the data sources, optionally combined with manual configuration. Users at each node can then query and access the centrally gathered data through the catalog.
However, this conventional centralized scheme has inherent drawbacks in practical deployment. First, the delay before data is ready is high, so business requirements for timely data cannot be met. Second, dynamic changes to metadata at the synchronization source cannot be perceived in real time, so the catalog information becomes stale.
In a cross-region distributed read-write scenario, existing mechanisms struggle to maintain data consistency and integrity while ensuring access performance, which makes flexible scheduling and on-demand sharing of global data difficult to achieve. Accordingly, a method is needed to solve the above problems.
Disclosure of Invention
An embodiment of the invention provides a cross-region data sharing method, which effectively maintains data consistency and integrity while ensuring access performance and enables flexible scheduling and on-demand sharing of global data. The method comprises the following steps:
receiving access requests from heterogeneous data sources in different regions, wherein the access requests carry encryption grade identifiers of the heterogeneous data sources;
determining an optimal data transmission path according to the encryption grade identifier and the communication metrics of each data transmission path in the communication network, acquired in real time;
determining the data prefetching amount allocated to each node in the optimal data transmission path according to the available bandwidth of the optimal data transmission path, executing a data prefetching task according to the determined data prefetching amount, and caching the prefetched data and corresponding metadata locally;
updating a pre-constructed global data resource catalog according to the cached data and metadata;
and, in response to a data operation request of a user, querying the updated global data resource catalog to locate target data, and executing an access operation on the target data.
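The prefetch allocation, catalog update, and access steps of the method can be sketched in Python as follows. This is an illustrative sketch under stated assumptions rather than the embodiment itself: the budget rule (available bandwidth multiplied by a hypothetical `window_s`), the in-memory `GlobalCatalog` class, the `"valid"` state label, and the rule that only reads of valid local entries are served from the local cache are all assumptions. The source states only that a node's prefetching amount is proportional to its access frequency and that the access mode depends on location, data state, and operation type.

```python
from typing import Dict, Optional


def allocate_prefetch(available_bandwidth_mbps: float,
                      access_frequency: Dict[str, int],
                      window_s: float = 1.0) -> Dict[str, float]:
    """Split a prefetch budget across nodes in proportion to access frequency."""
    # Assumption: the budget is the bandwidth available over one time window.
    budget = available_bandwidth_mbps * window_s
    total = sum(access_frequency.values())
    if total == 0:
        return {node: 0.0 for node in access_frequency}
    return {node: budget * freq / total
            for node, freq in access_frequency.items()}


class GlobalCatalog:
    """Hypothetical in-memory stand-in for the global data resource catalog."""

    def __init__(self) -> None:
        self._entries: Dict[str, dict] = {}

    def update(self, data_id: str, node: str, metadata: dict,
               state: str = "valid") -> None:
        """Record where a cached item lives, its metadata, and its state."""
        self._entries[data_id] = {"node": node, "metadata": metadata,
                                  "state": state}

    def locate(self, data_id: str) -> Optional[dict]:
        """Return the catalog entry for data_id, or None if it is unknown."""
        return self._entries.get(data_id)


def choose_access_mode(catalog: GlobalCatalog, local_node: str,
                       data_id: str, operation: str) -> str:
    """Pick local cache access or cross-node cooperative access."""
    entry = catalog.locate(data_id)
    if entry is None:
        raise KeyError(f"{data_id} is not in the catalog")
    is_local_valid = entry["node"] == local_node and entry["state"] == "valid"
    # Assumed rule: only reads of valid local entries bypass other nodes.
    if is_local_valid and operation == "read":
        return "local_cache"
    return "cross_node"


if __name__ == "__main__":
    print(allocate_prefetch(100.0, {"node_a": 3, "node_b": 1}))
    catalog = GlobalCatalog()
    catalog.update("well_log_001", "node_a", {"size_mb": 12})
    print(choose_access_mode(catalog, "node_a", "well_log_001", "read"))
    print(choose_access_mode(catalog, "node_b", "well_log_001", "read"))
```

The identifiers `node_a`, `node_b`, and `well_log_001` are placeholders. A real deployment would also need a policy for invalidating stale entries, which this sketch leaves out.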
An embodiment of the invention also provides a cross-region data sharing device, which effectively maintains data consistency and integrity while ensuring access performance and enables flexible scheduling and on-demand sharing of global data. The device comprises the following modules:
an access request receiving module, used for receiving access requests from heterogeneous data sources in different regions, wherein the access requests carry encryption grade identifiers of the heterogeneous data sources;
an optimal data transmission path determining module, used for determining an optimal data transmission path according to the encryption grade identifier and the communication metrics of each data transmission path in the communication network, acquired in real time;
a data prefetching amount determining module, used for determining the data prefetching amount allocated to each node in the optimal data transmission path according to the available bandwidth of the optimal data transmission path, executing a data prefetching task according to the determined data prefetching amount, and caching the prefetched data and corresponding metadata locally;
a global data resource catalog updating module, used for updating a pre-constructed global data resource catalog according to the cached data and metadata;
and the access module of the target data is u