CN-122027692-A - Cloud disk data fragment self-healing method, device, equipment and storage medium based on entropy flow engine
Abstract
In the method, a user terminal node collects the signal strength of peripheral edge aggregation nodes and the TCP window change rate, calculates corresponding entropy values, divides the file to be uploaded into a plurality of slice segments according to those entropy values, and distributes the segments to the edge aggregation nodes. Each edge aggregation node constructs a segment chaos matrix through a multi-layer chaotic entropy flow aggregation algorithm, generates a monomer segment entropy diagram, and sends it to the cloud. A cloud reconstruction node fuses the monomer segment entropy diagrams into a global entropy fusion diagram, determines an optimal data block splicing path based on a path energy function minimization strategy, and generates a reconstructed data stream. A feedback verification node then performs dynamic entropy matching processing, verifies the stream against the original file, and overwrites the damaged file. By realizing dynamic, self-adaptive data blocking through multidimensional entropy evaluation, the method adapts to weak network environments and improves the transmission stability and recovery reliability of cloud disk data.
Inventors
- ZHANG DINGDING
- TAN MINGWU
- LI XIAOQING
- ZHANG YUN
- WANG CHAOPENG
- WANG KAI
- CHENG WENQIANG
- Zhan Chaosheng
- HU JUNHUA
- Zeng Xianxun
Assignees
- China Mobile Internet Co., Ltd. (中移互联网有限公司)
- China Mobile Communications Group Co., Ltd. (中国移动通信集团有限公司)
Dates
- Publication Date
- 2026-05-12
- Application Date
- 2025-12-31
Claims (13)
- 1. A cloud disk data fragment self-healing method based on an entropy flow engine, applied to a client and comprising the following steps: S1, a user terminal node collects the signal strength and the TCP window change rate of peripheral edge aggregation nodes, calculates corresponding entropy values, divides the file to be uploaded into a plurality of slice segments according to the entropy values, and distributes the slice segments to the edge aggregation nodes; S2, each edge aggregation node receives its slice segments, executes a multi-layer chaotic entropy flow aggregation algorithm to construct a segment chaos matrix, performs entropy correlation aggregation to generate a monomer segment entropy diagram, and sends the monomer segment entropy diagram to a cloud reconstruction node; S3, the cloud reconstruction node fuses the received monomer segment entropy diagrams into a global entropy fusion diagram, determines an optimal data block splicing path from the diagram based on a path energy function minimization strategy, and generates a reconstructed data stream; and S4, a feedback verification node receives the reconstructed data stream, performs dynamic entropy matching processing, carries out pattern matching verification against the original file, and overwrites the damaged file of the user terminal node after verification passes.
- 2. The cloud disk data fragment self-healing method based on the entropy flow engine according to claim 1, wherein in step S1, calculating the corresponding entropy value specifically comprises: normalizing the acquired signal strength to obtain a signal stability index, and performing dynamic window fluctuation modeling on the TCP window change rate to obtain a transmission stability index, wherein the transmission stability index is negatively correlated with the TCP window change rate; constructing a multidimensional entropy evaluation model based on the signal stability index, the transmission stability index, and a network jitter index; and taking as the entropy value the weighted sum of the complement of the signal stability index, the complement of the transmission stability index, and the logarithm of the deviation of the network jitter index.
- 3. The cloud disk data fragment self-healing method based on the entropy flow engine according to claim 1, wherein in step S1, dividing the file to be uploaded into a plurality of slice segments specifically comprises: judging whether the entropy value and the signal strength of the current upload path meet an entropy-sensitive condition; if the entropy-sensitive condition is met, calculating, by the user terminal node, a block granularity adjustment factor, the factor being the weighted sum of the complement of the signal stability index and the complement of the transmission stability index; and determining the data block size of the entropy-sensitive blocking according to the block granularity adjustment factor, wherein the data block size decays exponentially with the block granularity adjustment factor, so that the more unstable the network environment, the smaller the block granularity.
- 4. The cloud disk data fragment self-healing method based on the entropy flow engine according to claim 1, further comprising, after step S1: calculating, by the edge aggregation node, a signal entropy difference between itself and the peripheral adjacent edge aggregation nodes, the difference being the node's current entropy value minus the average entropy value of the adjacent edge aggregation nodes; calculating, by the edge aggregation node, a transmission stability disturbance index, being the node's TCP window fluctuation rate minus the average TCP window fluctuation rate of the adjacent edge aggregation nodes; and if the signal entropy difference or the transmission stability disturbance index exceeds its preset threshold, generating, by the edge aggregation node, a sub-blocking trigger instruction and returning it to the user terminal node to instruct the user terminal node to re-execute the data blocking process.
- 5. The cloud disk data fragment self-healing method based on the entropy flow engine according to claim 1, wherein in step S2, the fragment chaos matrix specifically comprises: a file block ID, a block upload progress, an upload source terminal identifier, the entropy value of each block, and a signal entropy marker bit of the edge aggregation node; and performing entropy correlation aggregation specifically comprises: calculating the average entropy value over all blocks of the current node; and if that average entropy value is higher than the overall segment entropy value of the adjacent edge aggregation nodes, further splitting the current node's segments into a plurality of sub-blocks and recursively distributing them to the adjacent edge aggregation nodes until the current node's all-block average entropy value equals the overall segment entropy value of the adjacent edge aggregation nodes.
- 6. The cloud disk data fragment self-healing method based on the entropy flow engine according to claim 1, wherein in step S2, an edge in the monomer fragment entropy diagram represents the entropy correlation weight between two data blocks; the entropy correlation weight is inversely related to the sum of the absolute entropy difference and the absolute upload-timestamp difference between the two data blocks, so that data blocks with closer entropy values and closer upload times receive larger weights.
- 7. The cloud disk data fragment self-healing method based on the entropy flow engine according to claim 1, wherein in step S3, before determining the optimal data block splicing path, the method further comprises: traversing, by the cloud reconstruction node, the global entropy fusion graph to determine candidate data block splicing paths; calculating, by the cloud reconstruction node, a path entropy continuity index for each candidate path, the index being negatively correlated with the entropy differences between adjacent data blocks in the path and normalized by the total logical length of the path; and if the path entropy continuity index is smaller than a preset continuity threshold, indicating that the corresponding path carries a risk of data disorder or fragmentation, triggering, by the cloud reconstruction node, the path energy function minimization strategy.
- 8. The cloud disk data fragment self-healing method based on the entropy flow engine according to claim 1, wherein in step S3, the path energy function is used to calculate a total energy value of a path, and the calculation specifically comprises: for each pair of data blocks in the path, calculating the product term of the logical matching degree and the entropy correlation weight; for each pair of data blocks in the path, calculating a distance penalty term, defined as the product of the logical position distance between the data blocks and a logarithmic function of the entropy correlation weight, the logarithmic function making the distance penalty term grow logarithmically as the entropy correlation weight between the data blocks decreases; and taking as the total energy value the weighted sum, over all data block pairs, of the product terms and the distance penalty terms.
- 9. The cloud disk data fragment self-healing method based on the entropy flow engine according to claim 1, wherein in step S3, determining the optimal data block splicing path specifically comprises: sorting, by the cloud reconstruction node, all candidate paths by total energy value, and selecting the several paths with the lowest energy values as candidate data block splicing paths; calculating a reconstructed data entropy value for each candidate data block splicing path, the value being a weighted average of the entropy values of all data blocks in the path, with weights decaying exponentially with the logical distance between data blocks; and if the reconstructed data entropy value of a path is smaller than a preset final entropy threshold and the hash value of the path's reconstructed data matches the hash value of the original file, determining that path to be the optimal data block splicing path; if several paths satisfy these conditions, selecting the path with the minimum total energy value.
- 10. The cloud disk data fragment self-healing method based on the entropy flow engine according to claim 1, wherein in step S4, performing dynamic entropy matching processing comprises calculating a spectrum matching degree between a reconstructed entropy flow spectrum and an initial entropy spectrum; the spectrum matching degree is a weighted sum of a spectrum structural similarity and a content hash matching rate, the spectrum structural similarity being determined by the cosine similarity between the reconstructed entropy flow graph spectrum vector constructed from the reconstructed data stream and the initial entropy graph spectrum vector, and the content hash matching rate being the proportion of data blocks in the reconstructed data stream that pass the hash check.
- 11. A cloud disk data fragment self-healing device based on an entropy flow engine, configured at a client, the device comprising: a user terminal node module, used for collecting the signal strength and the TCP window change rate of peripheral edge aggregation nodes and calculating corresponding entropy values; an edge aggregation node module, connected with the user terminal node module and used for receiving slice segments, executing a multi-layer chaotic entropy flow aggregation algorithm to construct a fragment chaos matrix, performing entropy correlation aggregation to generate a monomer fragment entropy diagram, and transmitting the diagram to a cloud reconstruction node; a cloud reconstruction node module, connected with the edge aggregation node module and used for fusing the received monomer fragment entropy diagrams into a global entropy fusion diagram, determining an optimal data block splicing path from the diagram based on a path energy function minimization strategy, and generating a reconstructed data stream; and a feedback verification node module, connected with the cloud reconstruction node module and the user terminal node module and used for receiving the reconstructed data stream, performing dynamic entropy matching processing, carrying out pattern matching verification against the original file, and overwriting the damaged file of the user terminal node after verification passes.
- 12. An apparatus, comprising a processor and a memory communicatively coupled to the processor, wherein the memory stores computer-executable instructions, and the processor executes the computer-executable instructions stored in the memory to implement the cloud disk data fragment self-healing method based on an entropy flow engine according to any one of claims 1 to 10, or to implement the cloud disk data fragment self-healing device based on an entropy flow engine according to claim 11.
- 13. A computer readable storage medium, wherein computer executable instructions are stored in the computer readable storage medium, and when executed by a processor, the computer executable instructions are configured to implement a cloud disk data fragment self-healing method based on an entropy flow engine according to any one of claims 1 to 10, or implement a cloud disk data fragment self-healing device based on an entropy flow engine according to claim 11.
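To make the entropy evaluation of claims 2 and 3 concrete, the sketch below implements one plausible reading in Python. The weight vector, the log(1 + x) form of the jitter term, the base block size, and the decay constant are illustrative assumptions; the claims fix only the structure (weighted sum of complements plus a logarithmic jitter term, and exponential decay of block size in the granularity factor).

```python
import math

def path_entropy(signal_stability, transmission_stability, jitter_deviation,
                 weights=(0.4, 0.4, 0.2)):
    """Entropy of an upload path (claim 2): weighted sum of the complements
    of the two stability indices plus the log of the jitter deviation.
    The weights and the log(1 + x) form are assumed, not given by the claim."""
    return (weights[0] * (1.0 - signal_stability)
            + weights[1] * (1.0 - transmission_stability)
            + weights[2] * math.log(1.0 + jitter_deviation))

def block_size(base_size, granularity_factor, decay=2.0):
    """Entropy-sensitive block size (claim 3): exponential decay in the
    granularity factor, so a more unstable network yields smaller blocks.
    base_size and decay are hypothetical parameters."""
    return base_size * math.exp(-decay * granularity_factor)
```

On a stable path (both stability indices near 1, jitter near 0) the entropy is near zero and blocks stay close to `base_size`; as stability drops, the granularity factor and entropy rise and the blocks shrink.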
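Claims 6 and 8 together define how the entropy diagram is weighted and how a splicing path is scored. The sketch below is one plausible reading: the reciprocal form of the weight, the epsilon guard, the sign convention (good matches lower the energy, consistent with minimization), and the `-log(w)` penalty are assumptions layered on the claimed structure.

```python
import math

def entropy_correlation_weight(entropy_i, entropy_j, ts_i, ts_j, eps=1e-6):
    """Edge weight in the monomer fragment entropy diagram (claim 6):
    inversely related to |entropy difference| + |upload-timestamp
    difference|. The reciprocal form and eps are assumptions."""
    return 1.0 / (abs(entropy_i - entropy_j) + abs(ts_i - ts_j) + eps)

def path_total_energy(pairs, alpha=1.0, beta=1.0):
    """Path energy (claim 8). Each pair is (match_degree, weight, distance).
    The product term rewards well-matched, strongly correlated pairs; the
    distance penalty grows logarithmically as the correlation weight
    shrinks. Signs and the -log(w) form are one plausible reading."""
    energy = 0.0
    for match_degree, weight, distance in pairs:
        product_term = match_degree * weight              # matching x correlation
        penalty = distance * (-math.log(min(weight, 1.0)))  # low weight -> big penalty
        energy += -alpha * product_term + beta * penalty
    return energy
```

Under this reading, a path of closely matched, strongly correlated block pairs scores a lower total energy than one of poorly matched, weakly correlated pairs, so minimizing the energy prefers the former.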
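The reconstructed data entropy of claim 9 is a weighted average whose weights decay exponentially with logical distance. The sketch below assumes the distance is measured from the path head and picks an arbitrary decay rate; the claim fixes only the weighted-average-with-exponential-decay shape.

```python
import math

def reconstructed_data_entropy(block_entropies, decay=0.5):
    """Reconstructed data entropy (claim 9): weighted average of block
    entropies, with weights decaying exponentially in logical distance.
    Measuring distance from the path head and decay=0.5 are assumptions."""
    weights = [math.exp(-decay * i) for i in range(len(block_entropies))]
    return sum(w * e for w, e in zip(weights, block_entropies)) / sum(weights)
```

Because it is a weighted average, the result always lies between the smallest and largest block entropy, and under this reading earlier blocks in the path influence it more than later ones.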
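Claim 10's spectrum matching degree combines a cosine similarity with a hash pass rate. A minimal sketch, assuming an illustrative mixing weight `lam` (the claim specifies a weighted sum but not the weight):

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity of two spectrum vectors; 0.0 for zero-norm input."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def spectrum_matching_degree(recon_spectrum, init_spectrum,
                             blocks_passed, blocks_total, lam=0.6):
    """Spectrum matching degree (claim 10): weighted sum of the structural
    similarity between the reconstructed and initial entropy spectrum
    vectors and the content hash matching rate. lam is an assumed weight."""
    structural = cosine_similarity(recon_spectrum, init_spectrum)
    hash_rate = blocks_passed / blocks_total if blocks_total else 0.0
    return lam * structural + (1.0 - lam) * hash_rate
```

A perfect reconstruction (identical spectrum, all blocks passing the hash check) scores 1.0; structural drift or hash failures pull the degree below whatever acceptance threshold the verification step uses.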
Description
Cloud disk data fragment self-healing method, device, equipment and storage medium based on entropy flow engine

Technical Field
The disclosure relates to the technical field of data transmission and cloud storage, and in particular to a cloud disk data fragment self-healing method, device, equipment and storage medium based on an entropy flow engine.

Background
With the popularization of the mobile internet and edge computing technology, cloud disk services have evolved from pure data storage toward multi-terminal collaboration and real-time data synchronization. In large-file uploading scenarios, file slice uploading is generally adopted to improve transmission efficiency: a large file is divided into fixed-size data blocks that are transmitted in parallel, then merged and checked at the cloud. This mechanism works well in a stable broadband network environment, but in mobile or edge access environments, network signals are often accompanied by high-frequency fluctuation and bursty jitter, leaving the channel state highly uncertain. Most existing cloud disk data transmission schemes adopt a static blocking strategy: a fixed block size is set when the upload begins and cannot be adjusted to real-time changes in the network environment during transmission. When the network degrades, for example when signal strength weakens or the TCP window fluctuates severely, uploads of large fixed-granularity data blocks are prone to failure through timeouts or packet loss, and the frequent retransmission mechanism further aggravates network congestion, reducing transmission success rate and efficiency.
In addition, existing redundancy check mechanisms (such as traditional CRC checks or MD5 hash verification) focus only on the static consistency of data content; they lack dynamic perception of transmission link stability and cannot adaptively tune the data flow at the transmission source. At the receiving and reconstruction end, existing cloud merging mechanisms generally rely on data block sequence numbers for linear splicing. When differing network transmission paths or retransmission delays cause data slices to arrive severely out of order, or fragments to go missing, traditional reconstruction algorithms struggle to accurately restore the original logical structure of the file. Most existing fault-tolerance mechanisms perform a one-time hash comparison after all data are received; once verification fails, the system can only declare the file damaged and require the user to upload again, lacking any mechanism for logical self-healing and intelligent reorganization at the data fragment level. Facing a complex, dynamic network environment, this verification mode can hardly guarantee data integrity and recovery efficiency, and cannot meet the high-reliability requirements of cloud storage.

Disclosure of Invention
The present disclosure aims to solve, at least to some extent, one of the technical problems in the related art. To this end, the first aspect of the present disclosure proposes a cloud disk data fragment self-healing method based on an entropy flow engine, applied to a client. The method addresses the fragmentation and reconstruction difficulties of data transmission in dynamic network environments and achieves automatic data restoration through multi-node cooperative processing. In the method, a user terminal node first performs context awareness and data preprocessing.
The user terminal node collects the signal strength of the peripheral edge aggregation nodes and the change rate of the Transmission Control Protocol (TCP) window. Based on the collected data, it calculates a corresponding entropy value, which quantifies the uncertainty of the current communication path. According to the calculated entropy value, the user terminal node divides the file to be uploaded into a plurality of slice segments and distributes them to the corresponding edge aggregation nodes. As an intermediate hierarchy, the edge aggregation nodes then receive the slice segments from the user terminal nodes. Each edge aggregation node executes a multi-layer chaotic entropy flow aggregation algorithm to construct a fragment chaos matrix containing file block identifiers and entropy value information. Meanwhile, it performs entropy correlation aggregation on the received slice segments to generate a monomer fragment entropy diagram, which characterizes the logical association among local data blocks and is sent to the cloud reconstruction node. The cloud reconstruction node is responsible for lo