
CN-122019475-A - Merkle tree-based file hash addressing method and system


Abstract

The invention relates to the technical field of data processing, and in particular to a Merkle tree-based file hash addressing method and system. The method comprises: obtaining file attribute information of a target file and dividing the target file into a plurality of key data blocks; determining whether the segmentation process meets the expected standard and, if so, performing hash calculation on each key data block to generate a leaf node hash set corresponding to the target file; constructing a target Merkle tree structure; determining path information of each key data block based on the target Merkle tree structure; encrypting the path information of each key data block to generate an addressing identifier corresponding to each key data block; and determining a verification root hash value based on the addressing identifier corresponding to a data block to be verified, so as to determine whether the data block to be verified meets the data integrity standard. The invention can improve the efficiency and security of integrity verification for any data block of a file.

Inventors

  • LIU FEIFEI
  • FANG JIE

Assignees

  • 杭州光宇云科技有限公司

Dates

Publication Date
20260512
Application Date
20260415

Claims (10)

  1. A Merkle tree-based file hash addressing method, comprising: acquiring file attribute information of a target file to determine a file attribute characterization value, and dividing the target file into a plurality of key data blocks according to the file attribute characterization value, wherein the file attribute information comprises a file type and a data size; encrypting each key data block, and determining an encryption characterization value according to the data size of each encrypted key data block, so as to determine whether the segmentation process meets the expected standard; if the expected standard is met, performing hash calculation on each key data block to determine a hash value corresponding to each key data block, so as to generate a leaf node hash set corresponding to the target file; generating a storage root hash value based on the leaf node hash set to construct a target Merkle tree structure; determining path information of each key data block based on the target Merkle tree structure, and encrypting the path information of each key data block to generate an addressing identifier corresponding to each key data block; and determining a verification root hash value based on the addressing identifier corresponding to a data block to be verified, and determining whether the data block to be verified meets a data integrity standard based on a comparison result of the verification root hash value and the storage root hash value, wherein the data block to be verified is any one of the key data blocks.
  2. The Merkle tree-based file hash addressing method as claimed in claim 1, wherein determining the file attribute characterization value comprises: determining a file type characterization value based on the file type of the target file and a preset type mapping table; determining a file data characterization value based on a comparison result of the data size of the target file and a preset data size; and determining the file attribute characterization value based on the file type characterization value and the file data characterization value.
  3. The Merkle tree-based file hash addressing method as claimed in claim 2, wherein dividing the target file into a plurality of key data blocks according to the file attribute characterization value comprises: determining a target segmentation number according to the file attribute characterization value and a preset segmentation number; and dividing the target file based on the target segmentation number to obtain the plurality of key data blocks.
  4. The Merkle tree-based file hash addressing method as claimed in claim 3, wherein determining the encryption characterization value comprises: determining a first comparison value based on a comparison result of the data sizes of each key data block before and after encryption; determining a second comparison value based on a comparison result of the data size of each key data block after encryption; and determining the encryption characterization value based on the first comparison value and the second comparison value.
  5. The Merkle tree-based file hash addressing method as claimed in claim 4, wherein determining whether the segmentation process meets the expected standard comprises: determining whether the segmentation process meets the expected standard based on a comparison result of the encryption characterization value and a preset characterization value.
  6. The Merkle tree-based file hash addressing method as claimed in claim 5, wherein constructing the target Merkle tree structure comprises: grouping the leaf nodes in the leaf node hash set in pairs to obtain a plurality of leaf node hash groups; concatenating the hash values in each leaf node hash group and performing hash calculation on the result to obtain hash values of a plurality of parent nodes; and recursively performing the above steps to generate the storage root hash value, thereby constructing the target Merkle tree structure.
  7. The Merkle tree-based file hash addressing method as claimed in claim 6, wherein generating the addressing identifier corresponding to any one of the key data blocks comprises: determining the leaf node of the key data block based on meta information of the key data block, wherein the meta information comprises a data block serial number of the key data block; determining the branch direction from the leaf node of the key data block to the root node based on the target Merkle tree structure, so as to determine the path information of the key data block; and encrypting the path information of the key data block to generate the addressing identifier corresponding to the key data block.
  8. The Merkle tree-based file hash addressing method as claimed in claim 7, wherein determining the verification root hash value comprises: determining sibling node hash values and a verification path corresponding to the data block to be verified based on the addressing identifier corresponding to the data block to be verified; and calculating hash values layer by layer based on the sibling node hash values and the verification path to obtain the verification root hash value.
  9. The Merkle tree-based file hash addressing method as claimed in claim 8, wherein determining whether the data block to be verified meets the data integrity standard comprises: if the verification root hash value is consistent with the storage root hash value, determining that the data block to be verified meets the data integrity standard.
  10. A file hash addressing system employing the Merkle tree-based file hash addressing method of any one of claims 1-9, comprising: a data acquisition module, configured to acquire file attribute information of a target file, wherein the file attribute information comprises a file type and a data size; a file segmentation module, configured to determine a file attribute characterization value based on the file attribute information of the target file, so as to divide the target file into a plurality of key data blocks; a segmentation verification module, configured to encrypt each key data block and determine an encryption characterization value according to the data size of each encrypted key data block, so as to determine whether the segmentation process meets the expected standard; a tree construction module, configured to, when the expected standard is met, perform hash calculation on each key data block to determine a hash value corresponding to each key data block, so as to generate a leaf node hash set corresponding to the target file, and to generate a storage root hash value based on the leaf node hash set so as to construct a target Merkle tree structure; an addressing identifier generation module, configured to determine path information of each key data block based on the target Merkle tree structure and to encrypt the path information of each key data block to generate an addressing identifier corresponding to each key data block; and an integrity verification module, configured to determine a verification root hash value based on the addressing identifier corresponding to a data block to be verified, and to determine whether the data block to be verified meets a data integrity standard based on a comparison result of the verification root hash value and the storage root hash value, wherein the data block to be verified is any one of the key data blocks.
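
The tree construction, path determination, and verification steps of claims 6 to 9 can be illustrated with a minimal sketch. It is not the claimed implementation: SHA-256, the duplication of the last hash on odd-sized layers, and the names `build_levels`, `path_info`, and `verification_root` are all assumptions made for illustration. The encryption of the path information is omitted because the claims do not name a cipher, so the plain path list below stands in for a decrypted addressing identifier.

```python
import hashlib


def h(data: bytes) -> bytes:
    # Assumption: SHA-256; the claims do not name a hash function.
    return hashlib.sha256(data).digest()


def build_levels(blocks):
    """Return every tree layer, leaf hashes first and the root layer last."""
    if not blocks:
        raise ValueError("at least one data block is required")
    levels = [[h(b) for b in blocks]]
    while len(levels[-1]) > 1:
        cur = list(levels[-1])
        if len(cur) % 2:
            cur.append(cur[-1])  # assumption: pair an unpaired node with itself
        # Concatenate each pair of hashes and hash the result (parent nodes).
        levels.append([h(cur[i] + cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels


def path_info(levels, index):
    """Branch direction and sibling hash per layer, from leaf `index` to the root."""
    path = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling >= len(level):
            sibling = index  # unpaired node on an odd-sized layer
        side = "L" if index % 2 else "R"  # side on which the sibling sits
        path.append((side, level[sibling]))
        index //= 2
    return path


def verification_root(block, path):
    """Recompute the root layer by layer from one block and its path."""
    acc = h(block)
    for side, sibling in path:
        acc = h(sibling + acc) if side == "L" else h(acc + sibling)
    return acc


blocks = [b"block-0", b"block-1", b"block-2", b"block-3", b"block-4"]
levels = build_levels(blocks)
stored_root = levels[-1][0]
path = path_info(levels, 4)  # 4 plays the role of the data block serial number

print(verification_root(blocks[4], path) == stored_root)    # True
print(verification_root(b"tampered", path) == stored_root)  # False
```

Only the block under test and one sibling hash per layer are needed to recompute the root, which is why the check does not require reading the whole file.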

Description

Merkle tree-based file hash addressing method and system

Technical Field

The invention relates to the technical field of data processing, and in particular to a Merkle tree-based file hash addressing method and system.

Background

In today's digital information age, data has become a core asset for businesses and individuals. With the rapid development of cloud computing, big data, and distributed storage technology, massive data is usually stored across multiple physical nodes as fragments or replicas to improve storage efficiency and data availability. However, while distributed storage environments provide convenience, they also introduce serious data security challenges. Hardware failures, software errors, operator mistakes, and malicious network attacks on storage nodes may all result in loss or corruption of the stored data. Against this background, how to verify the integrity of remotely stored data efficiently and accurately, without frequently transmitting complete copies of the data, has become a research hotspot in the field of information security.

Conventional integrity checking methods, such as cyclic redundancy checks or one-way hash functions, typically require the complete file to be obtained for computation and comparison. In a distributed system, however, files are large and their storage locations are scattered, so frequently reading complete files consumes a large amount of network bandwidth and I/O resources; verification efficiency is low, and these methods provide no fault localization capability.

To solve the above problems, the Merkle tree data structure has been introduced. As a binary hash tree, a Merkle tree divides the file data into blocks, calculates hash values layer by layer, and finally generates a unique root hash.
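
The layer-by-layer hashing described in the background can be sketched as follows. This is a generic illustration under stated assumptions, not the method of the invention: SHA-256, the fixed 8-byte block size, the duplication of the last hash on odd-sized layers, and the helper names `split_fixed` and `merkle_root` are all hypothetical choices.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    # Assumption: SHA-256 as the hash function.
    return hashlib.sha256(data).digest()


def split_fixed(data: bytes, block_size: int) -> list[bytes]:
    """Fixed-size blocking, the conventional policy the background refers to."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def merkle_root(blocks: list[bytes]) -> bytes:
    """Hash every block, then hash pairs layer by layer until one root remains."""
    if not blocks:
        raise ValueError("at least one block is required")
    level = [sha256(b) for b in blocks]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumption: duplicate the last hash
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


data = b"example file contents for a Merkle tree sketch"
root = merkle_root(split_fixed(data, 8))
tampered_root = merkle_root(split_fixed(data.replace(b"file", b"FILE"), 8))

print(root.hex())
print(root != tampered_root)  # a change in any block changes the root
```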
When a data block needs to be verified, verification can be completed quickly by providing only that data block and the hash nodes adjacent to its path; the whole file does not need to be acquired. However, some technical blind spots remain in actual deployment. Existing schemes often adopt a fixed block size when dividing data into blocks and do not adapt to the attributes of the file. For example, for files of different formats or different sizes, a fixed blocking policy may cut across critical data boundaries, affecting the efficiency and security of subsequent encryption and verification.

Disclosure of Invention

Therefore, the invention provides a Merkle tree-based file hash addressing method and system, to solve the problems in the prior art that verification efficiency is low because the complete file must be read, and that verification security is affected by the way file data is divided into blocks.

In order to achieve the above object, in one aspect, the present invention provides a Merkle tree-based file hash addressing method, including:

acquiring file attribute information of a target file to determine a file attribute characterization value, and dividing the target file into a plurality of key data blocks according to the file attribute characterization value, wherein the file attribute information comprises a file type and a data size;

encrypting each key data block, and determining an encryption characterization value according to the data size of each encrypted key data block, so as to determine whether the segmentation process meets the expected standard;

if the expected standard is met, performing hash calculation on each key data block to determine a hash value corresponding to each key data block, so as to generate a leaf node hash set corresponding to the target file;

generating a storage root hash value based on the leaf node hash set to construct a target Merkle tree structure;

determining path information of each key data block based on the target Merkle tree structure, and encrypting the path information of each key data block to generate an addressing identifier corresponding to each key data block;

determining a verification root hash value based on the addressing identifier corresponding to a data block to be verified, and determining whether the data block to be verified meets a data integrity standard based on a comparison result of the verification root hash value and the storage root hash value, wherein the data block to be verified is any one of the key data blocks.

Further, determining the file attribute characterization value includes: determining a file type characterization value based on the file type of the target file and a preset type mapping table; determining a file data characterization value based on a comparison result of the data size of the target file and a preset data size; and determining the file attribute characterization value based on the file type characterization value and the file data characterization value.

Further, the target file is divided