CN-121996140-A - Data partitioning method, device, equipment and medium in distributed storage


Abstract

The application relates to the technical field of data storage, and in particular to a data partitioning method, device, equipment and medium in distributed storage. The method comprises: determining a partition table corresponding to the master copy according to the codes corresponding to a plurality of nodes and the total partition number; and generating a partition table corresponding to the slave copy in a rotating magnetic levitation mode according to the physical node identifier corresponding to the value at each position in the partition table corresponding to the master copy, so that the physical node identifiers corresponding to the same position in the partition table corresponding to the slave copy and in the partition table corresponding to the master copy are different. The master copy and the slave copy of the same data are thereby prevented from being placed on the same logical node or the same physical node, which protects data integrity when a single physical node loses power.

Inventors

YU MINGCHEN
PENG CONG
WANG GANG
WANG XINGEN
WANG XINYU

Assignees

浙江邦盛科技股份有限公司

Dates

Publication Date: 2026-05-08
Application Date: 2024-11-01

Claims (10)

1. A method for partitioning data in distributed storage, comprising: acquiring codes corresponding to a plurality of nodes in a cluster, physical node identifiers corresponding to the nodes, and a total partition number, wherein each physical node identifier represents information about the physical node to which a node belongs; determining a partition table corresponding to the master copy according to the codes corresponding to the plurality of nodes and the total partition number, wherein each position (digit) of the partition table corresponds to one partition, and the value at each position corresponds to one node; and generating a partition table corresponding to the slave copy in a rotating magnetic levitation mode according to the physical node identifier corresponding to the value at each position in the partition table corresponding to the master copy, so that the physical node identifiers corresponding to the same position in the partition table corresponding to the slave copy and in the partition table corresponding to the master copy are different.
2. The method for partitioning data in distributed storage according to claim 1, wherein generating the partition table corresponding to the slave copy in the rotating magnetic levitation mode according to the physical node identifier corresponding to the value at each position in the partition table corresponding to the master copy comprises: rotating the array in the partition table corresponding to the master copy by a first preset number of positions in a preset direction to obtain an initial partition table corresponding to the slave copy; for the slave copy, sequentially determining whether the physical node identifier corresponding to each position in the initial partition table corresponding to the slave copy is the same as the physical node identifier corresponding to the same position in the partition table corresponding to the master copy; if the number of copies is greater than 2, rotating the array in the partition table corresponding to the slave copy by a third preset number of positions in the preset direction to obtain initial partition tables corresponding to the other slave copies; and for the other slave copies, sequentially determining whether the physical node identifier corresponding to each position in the initial partition tables corresponding to the other slave copies is the same as the physical node identifier corresponding to the same position in the partition table corresponding to the slave copy, and if so, rotating the array after that position by a fourth preset number of positions in the preset direction, until all positions have been traversed, to obtain the partition tables corresponding to the other slave copies.
3. The method for partitioning data in distributed storage according to claim 1, wherein determining the partition table corresponding to the master copy according to the codes corresponding to the plurality of nodes and the total partition number comprises: obtaining code hash values corresponding to the plurality of nodes in the cluster, wherein each code hash value is determined based on the code corresponding to a node, and one code corresponds to a unique node; concatenating the code hash value of a target node with the suffix corresponding to the master copy to obtain seed information, and determining a preference sequence result corresponding to the target node based on the seed information and the total partition number; and determining the partition corresponding to each of the plurality of nodes according to the preference sequence result corresponding to each of the plurality of nodes, so as to obtain the partition table corresponding to the master copy.
4. The method for partitioning data in distributed storage according to claim 3, wherein the preference sequence result comprises an offset, a step size, a displacement value, and the total partition number; correspondingly, determining the preference sequence result corresponding to the target node based on the seed information and the total partition number comprises: performing a first digest calculation on the seed information to obtain a first hash value; performing a second digest calculation on the seed information to obtain a second hash value; taking the modulus of the first hash value with respect to the total partition number to obtain the offset corresponding to the target node; calculating the step size corresponding to the target node from the second hash value and the total partition number according to a preset formula, the preset formula being: step size = second hash value mod (total partition number - 1) + 1; and determining the displacement value according to the offset, the step size and the total partition number.
5. The method for partitioning data in distributed storage according to claim 3, wherein determining the partition corresponding to each of the plurality of nodes according to the preference sequence result corresponding to each of the plurality of nodes comprises: taking the modulus of the offset corresponding to a target node with respect to the total partition number to obtain a position, wherein the target node is any one of the plurality of nodes; updating the offset according to the step size, and repeating the step of taking the modulus of the offset corresponding to the target node with respect to the total partition number to obtain a position, until the offset equals the displacement value, thereby obtaining all positions corresponding to the target node; and determining the partition corresponding to each of the plurality of nodes according to all the positions corresponding to each of the plurality of nodes.
6. The method for partitioning data in distributed storage according to claim 5, wherein after taking the modulus of the offset corresponding to the target node with respect to the total partition number to obtain a position, the method further comprises: if the position obtained for the target node is the same as a position already obtained for a preceding node, ignoring that position for the target node.
7. The method for partitioning data in distributed storage according to claim 5, wherein before determining the partition corresponding to each of the plurality of nodes according to all the positions corresponding to each of the plurality of nodes, the method further comprises: determining whether there is a target partition to which no node has been assigned; and if the target partition exists, determining the number of partitions corresponding to each node, and taking the node with the smallest number of partitions as the node corresponding to the target partition.
8. A data partitioning apparatus in distributed storage, comprising: an acquisition module, configured to acquire codes corresponding to a plurality of nodes in a cluster, physical node identifiers corresponding to the nodes, and a total partition number, wherein each physical node identifier represents information about the physical node to which a node belongs; a master copy partition table determining module, configured to determine a partition table corresponding to the master copy according to the codes corresponding to the plurality of nodes and the total partition number, wherein each position (digit) of the partition table corresponds to one partition, and the value at each position corresponds to one node; and a slave copy partition table determining module, configured to generate a partition table corresponding to the slave copy in a rotating magnetic levitation mode according to the physical node identifier corresponding to the value at each position in the partition table corresponding to the master copy, so that the physical node identifiers corresponding to the same position in the partition table corresponding to the slave copy and in the partition table corresponding to the master copy are different.
9. An electronic device, comprising: one or more processors; a memory; and one or more applications, wherein the one or more applications are stored in the memory and configured to be executed by the one or more processors, the one or more applications being configured to perform the steps of the method for partitioning data in distributed storage according to any one of claims 1 to 7.
10. A computer readable storage medium, characterized in that the storage medium stores at least one instruction, at least one program, code set or instruction set, the at least one instruction, at least one program, code set or instruction set being loaded by a processor and performing the steps of the data partitioning method in a distributed storage according to any of claims 1 to 7.
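As an illustration of the preference sequence described in claims 3 to 5, the following minimal Python sketch derives the offset and step size from the seed information and then walks the positions. Several details are assumptions made only for this example and are not stated in the claims: the function name `preference_sequence`, the suffix string `"-master"`, the use of MD5 and SHA-1 as the first and second digest calculations, and the displacement value being taken as offset + step size × total partition number.

```python
import hashlib
from typing import List, Tuple


def preference_sequence(code_hash: str, total_partitions: int,
                        suffix: str = "-master") -> Tuple[int, int, int, List[int]]:
    """Return (offset, step, displacement, positions) for one node.

    Assumptions (not given in the text): the suffix value, MD5/SHA-1 as the
    two digests, and displacement = offset + step * total_partitions.
    """
    if total_partitions < 2:
        raise ValueError("the step formula needs at least 2 partitions")

    # Seed information: code hash value concatenated with the master-copy suffix.
    seed = f"{code_hash}{suffix}".encode("utf-8")

    # Two independent digests of the same seed.
    first_hash = int.from_bytes(hashlib.md5(seed).digest(), "big")
    second_hash = int.from_bytes(hashlib.sha1(seed).digest(), "big")

    # offset = first hash mod total partition number
    offset = first_hash % total_partitions
    # step size = second hash mod (total partition number - 1) + 1
    step = second_hash % (total_partitions - 1) + 1
    # Assumed stopping value: one full lap of total_partitions steps.
    displacement = offset + step * total_partitions

    positions = []
    current = offset
    while current != displacement:
        positions.append(current % total_partitions)  # position for this step
        current += step                               # update offset by step
    return offset, step, displacement, positions


if __name__ == "__main__":
    print(preference_sequence("node-1", 7))
```

Because the step size lies between 1 and the total partition number minus 1, the walk visits every position exactly once when the total partition number is prime; in general it visits total partition number / gcd(step size, total partition number) distinct positions.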

Description

Data partitioning method, device, equipment and medium in distributed storage

Technical Field

The present application relates to the field of data storage technologies, and in particular to a method, an apparatus, a device, and a medium for partitioning data in distributed storage.

Background

With the rapid development of modern information technology, data storage and management have become a critical area. With the advent of the big data era, the dramatic increase in data volume poses a significant challenge to traditional storage solutions. Distributed storage systems have received extensive attention for their high scalability and fault tolerance, and have become an effective solution to the problem of large-scale data storage.

By storing data across a plurality of nodes, a distributed storage system can improve data access speed and system reliability to some extent. However, because the different copies of the data are distributed across nodes with some randomness, data is easily lost if a physical node loses power. How to ensure data integrity in the event of a power failure has therefore become a technical problem to be solved by those skilled in the art.

Disclosure of Invention

The application aims to provide a data partitioning method, device, equipment and medium in distributed storage that can ensure data integrity in the event of a power failure.
In a first aspect, a method for partitioning data in distributed storage is provided, including: acquiring codes corresponding to a plurality of nodes in a cluster, physical node identifiers corresponding to the nodes, and a total partition number, wherein each physical node identifier represents information about the physical node to which a node belongs; determining a partition table corresponding to the master copy according to the codes corresponding to the plurality of nodes and the total partition number, wherein each position (digit) of the partition table corresponds to one partition, and the value at each position corresponds to one node; and generating a partition table corresponding to the slave copy in a rotating magnetic levitation mode according to the physical node identifier corresponding to the value at each position in the partition table corresponding to the master copy, so that the physical node identifiers corresponding to the same position in the partition table corresponding to the slave copy and in the partition table corresponding to the master copy are different.
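To make the target property of the first aspect concrete, the following minimal Python sketch models a partition table as a list whose index is the partition and whose value is a node, and checks whether any partition has its master copy and slave copy on the same physical node. The function name `colocated_positions`, the node numbers and the host names are hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def colocated_positions(master: List[int], slave: List[int],
                        physical_of: Dict[int, str]) -> List[int]:
    """Return the positions whose master and slave nodes share a physical node.

    master / slave: partition tables (index = partition, value = node).
    physical_of:    maps each node to its physical node identifier.
    An empty result means the property described in the first aspect holds.
    """
    if len(master) != len(slave):
        raise ValueError("partition tables must have the same length")
    return [
        position
        for position, (m_node, s_node) in enumerate(zip(master, slave))
        if physical_of[m_node] == physical_of[s_node]
    ]


if __name__ == "__main__":
    # Hypothetical cluster: four nodes spread over two physical hosts.
    physical_of = {0: "host-a", 1: "host-a", 2: "host-b", 3: "host-b"}
    master = [0, 1, 2, 3]
    print(colocated_positions(master, [2, 3, 0, 1], physical_of))  # no overlap
    print(colocated_positions(master, [1, 0, 3, 2], physical_of))  # all overlap
```

A slave table that only avoids the same node, such as `[1, 0, 3, 2]` here, still fails the check, because nodes 0 and 1 sit on the same physical host; this is the case the physical node identifiers are meant to rule out.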
In a preferred example, the application may be further configured such that generating the partition table corresponding to the slave copy in the rotating magnetic levitation mode according to the physical node identifier corresponding to the value at each position in the partition table corresponding to the master copy comprises: rotating the array in the partition table corresponding to the master copy by a first preset number of positions in a preset direction to obtain an initial partition table corresponding to the slave copy; for the slave copy, sequentially determining whether the physical node identifier corresponding to each position in the initial partition table corresponding to the slave copy is the same as the physical node identifier corresponding to the same position in the partition table corresponding to the master copy; if the number of copies is greater than 2, rotating the array in the partition table corresponding to the slave copy by a third preset number of positions in the preset direction to obtain initial partition tables corresponding to the other slave copies; and for the other slave copies, sequentially determining whether the physical node identifier corresponding to each position in the initial partition tables corresponding to the other slave copies is the same as the physical node identifier corresponding to the same position in the partition table corresponding to the slave copy, and if so, rotating the array after that position by a fourth preset number of positions in the preset direction, until all positions have been traversed, to obtain the partition tables corresponding to the other slave copies.
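The rotation procedure above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the preset direction is taken as left, both preset shift amounts default to 1, and on a conflict the rotated sub-array is assumed to start at the conflicting position itself. The passage states the conflict action only for the other slave copies; the sketch applies the same action to the first slave copy. The names `rotate_left` and `build_slave_table` are hypothetical.

```python
from typing import Dict, List


def rotate_left(items: List[int], shift: int) -> List[int]:
    """Return a copy of items rotated left by shift positions."""
    if not items:
        return []
    shift %= len(items)
    return items[shift:] + items[:shift]


def build_slave_table(reference: List[int], physical_of: Dict[int, str],
                      first_shift: int = 1, conflict_shift: int = 1) -> List[int]:
    """Derive a copy's partition table from a reference table.

    reference:   table to differ from (the master-copy table for the first
                 slave copy).
    physical_of: maps each node to its physical node identifier.
    Assumed behaviour: rotate the whole table once, then scan positions in
    order; on a physical-node clash, rotate the tail starting at that
    position, giving up on a position after one full lap of the tail.
    """
    table = rotate_left(reference, first_shift)   # initial partition table
    for position in range(len(table)):
        tail_length = len(table) - position
        attempts = 0
        while (physical_of[table[position]] == physical_of[reference[position]]
               and attempts < tail_length):
            table[position:] = rotate_left(table[position:], conflict_shift)
            attempts += 1
    return table


if __name__ == "__main__":
    physical_of = {0: "host-a", 1: "host-a", 2: "host-b", 3: "host-b"}
    master = [0, 1, 2, 3, 0, 1, 2, 3]
    print(build_slave_table(master, physical_of))
```

Because every step is a rotation, the result is a permutation of the reference table, so each node keeps the same number of partitions in the derived table as in the reference table. The attempt cap only guarantees termination; when too few distinct physical nodes exist, some positions may remain in conflict.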
In a preferred example, the application may be further configured such that determining the partition table corresponding to the master copy according to the codes corresponding to the plurality of nodes and the total partition number comprises: obtaining code hash values corresponding to the plurality of nodes in the cluster, wherein each code hash value is determined based on the code corresponding to a node, and one code corresponds to a unique node; concatenating the code hash value of a target node with the suffix corresponding to the master copy to obtain seed information, and determining a preference sequence result corresponding to the target node based on the seed information and the total partition number; and determining