CN-121659349-B - Distributed-type-based financial data storage method and system

Abstract

The invention relates to the field of data processing, and in particular to a distributed-type-based financial data storage method and system. The method comprises: obtaining a financial transaction data stream to be stored and dividing it into a plurality of data blocks; for each data block, calculating the distribution change degree of the block and, based on it, determining the weighted probabilities used to construct a compression coding table, the number of encryption rounds of an encryption algorithm, and the replication factor in a distributed storage system; compressing, encrypting, and storing the data block in sequence using these three parameters; and performing the compression, encryption, and storage on all data blocks to complete the distributed secure storage of the financial transaction data stream. By driving the compression, encryption, and storage stages from a single shared index, the invention improves the resistance of the encrypted data to statistical analysis and adaptively balances storage security against resource cost.

Inventors

  • MA QINGZHE
  • WANG JIE
  • YANG DONG

Assignees

  • 浙江优创信息技术有限公司

Dates

Publication Date
20260508
Application Date
20260209

Claims (8)

  1. A distributed-based financial data storage method, comprising: acquiring a financial transaction data stream to be stored, the data stream being a byte sequence, and dividing the data stream into consecutive data blocks of a preset size; for each data block, calculating the distribution change degree of the data block, determining, based on the distribution change degree, the weighted probabilities for constructing a compression coding table, the number of encryption rounds of an encryption algorithm, and the replication factor in a distributed storage system, adaptively compressing the data block based on the determined weighted probabilities to obtain a compressed data block, encrypting the compressed data block based on the determined number of encryption rounds to obtain an encrypted data block, and storing the encrypted data block based on the determined replication factor; and performing the compression, encryption, and storage on all the data blocks to complete the distributed secure storage of the financial transaction data stream; wherein calculating the distribution change degree of the data block comprises calculating the distribution change degree of each character in the data block, and the distribution change degree of any character in the current data block is calculated as follows: calculating the occurrence probability of the character in the current data block; obtaining the mean and standard deviation of the occurrence probability of the character over all data blocks; calculating the difference between the occurrence probability of the character in the current data block and its occurrence probability in the previous data block to obtain a probability difference; obtaining the maximum and minimum of the probability difference of the character over all data blocks; performing a normalization calculation based on the absolute value of the probability difference and the maximum and minimum to obtain a first factor; performing a deviation-weighted calculation based on the occurrence probability of the character in the current data block and the mean and standard deviation of its occurrence probability over all data blocks to obtain a second factor; and multiplying the first factor by the second factor to obtain the distribution change degree of the character in the current data block.
  2. The distributed-based financial data storage method of claim 1, wherein obtaining the weighted probabilities comprises: for any character in the current data block, acquiring the historical occurrence probability of the character in historical financial data, and multiplying the historical occurrence probability of the character by the distribution change degree of the character to obtain the weighted probability of the character.
  3. The distributed-based financial data storage method of claim 2, wherein the adaptive compression comprises: constructing a Huffman tree with a Huffman coding algorithm based on the weighted probabilities of all characters in the current data block, and compressing the current data block using the constructed Huffman tree.
  4. The distributed-based financial data storage method of claim 1, wherein obtaining the number of encryption rounds comprises: calculating the mean of the distribution change degrees of all characters in the current data block; calculating the product of this mean and a preset first adjustment amplitude; rounding the product down; and subtracting the rounded-down result from a preset base number of encryption rounds to obtain the number of encryption rounds.
  5. The distributed-based financial data storage method of claim 4, wherein the encryption algorithm is the AES algorithm.
  6. The distributed-based financial data storage method of claim 1, wherein obtaining the replication factor comprises: calculating the product of the mean of the distribution change degrees of all characters and a preset second adjustment amplitude; rounding the product down; subtracting the rounded-down result from a preset maximum replication factor to obtain an intermediate value; and taking the larger of the intermediate value and a preset minimum replication factor as the replication factor.
  7. The distributed-based financial data storage method of claim 6, wherein the encrypted data blocks are stored on a plurality of data nodes in accordance with the replication factor.
  8. A distributed-based financial data storage system, comprising a processor and a memory storing computer program instructions which, when executed by the processor, implement the distributed-based financial data storage method of any one of claims 1-7.
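
The parameter rules in claims 2, 3, 4, and 6 can be illustrated with a minimal Python sketch. It is not the patented implementation: all function and parameter names are hypothetical, the claims give no numeric values for the base rounds, amplitudes, or replication bounds, and the Huffman builder is an ordinary textbook construction fed with the weighted probabilities. Standard AES fixes the round count by key length (10, 12, or 14), so the sketch only computes the number of rounds and does not perform encryption.

```python
import heapq
import math
from itertools import count


def weighted_probabilities(historical: dict[int, float],
                           degrees: dict[int, float]) -> dict[int, float]:
    """Claim 2: historical probability times change degree, per character."""
    return {c: historical.get(c, 0.0) * d for c, d in degrees.items()}


def huffman_codes(weights: dict[int, float]) -> dict[int, str]:
    """Claim 3 (sketch): build prefix codes from the weighted probabilities."""
    tie = count()  # unique tie-breaker so the dicts are never compared
    heap = [(w, next(tie), {sym: ""}) for sym, w in weights.items()]
    heapq.heapify(heap)
    if len(heap) == 1:  # a single symbol still needs a one-bit code
        return {sym: "0" for sym in heap[0][2]}
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in left.items()}
        merged.update({s: "1" + code for s, code in right.items()})
        heapq.heappush(heap, (w1 + w2, next(tie), merged))
    return heap[0][2] if heap else {}


def encryption_rounds(mean_degree: float, base_rounds: int,
                      first_amplitude: float) -> int:
    """Claim 4: base rounds minus floor(mean degree * first amplitude)."""
    return base_rounds - math.floor(mean_degree * first_amplitude)


def replication_factor(mean_degree: float, max_factor: int,
                       min_factor: int, second_amplitude: float) -> int:
    """Claim 6: max factor minus floor(mean degree * second amplitude),
    but never below the preset minimum replication factor."""
    intermediate = max_factor - math.floor(mean_degree * second_amplitude)
    return max(intermediate, min_factor)
```

With illustrative values (a mean change degree of 0.5, a base of 14 rounds, and a first amplitude of 4), `encryption_rounds(0.5, 14, 4)` returns 12; with a maximum factor of 5, a minimum of 2, and a second amplitude of 6, `replication_factor(0.5, 5, 2, 6)` returns 2.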

Description

Distributed-type-based financial data storage method and system

Technical Field

The present invention relates to the field of data processing, and more particularly to a distributed-based financial data storage method and system.

Background

With the rapid development of financial technology, applications such as mobile payment and internet banking have become increasingly popular, and the volume of financial data is growing explosively. The traditional centralized storage architecture suffers from storage bottlenecks, single points of failure, and limited scalability, and struggles to meet massive, high-concurrency financial business demands. Distributed storage can in principle improve system availability and reliability through multi-node cooperation, but its multi-replica redundancy also brings significant storage cost pressure.

Existing schemes generally follow a 'compress, then encrypt, then store' pipeline: data is compressed with an algorithm such as Huffman coding, protected with an encryption algorithm, and finally written to distributed storage. However, financial data is typically numerically dense and strongly locally correlated, so after static Huffman compression the data distribution is often not uniform enough and local statistical patterns remain. When encryption is applied directly on this basis, these distribution characteristics are easily exploited by statistical analysis methods such as differential attacks, which weakens the actual security of the encryption.
Therefore, how to improve data distribution uniformity in the compression stage so as to strengthen encryption against attack, while adaptively balancing security and cost in the storage stage, has become a key problem in financial distributed storage.

Disclosure of Invention

In order to solve the above technical problems, the present invention provides the following aspects.

In a first aspect, a distributed-based financial data storage method includes: acquiring a financial transaction data stream to be stored, the data stream being a byte sequence, and dividing the data stream into consecutive data blocks of a preset size; for each data block, calculating the distribution change degree of the data block, determining, based on the distribution change degree, the weighted probabilities for constructing a compression coding table, the number of encryption rounds of an encryption algorithm, and the replication factor in a distributed storage system, adaptively compressing the data block based on the determined weighted probabilities to obtain a compressed data block, encrypting the compressed data block based on the determined number of encryption rounds to obtain an encrypted data block, and storing the encrypted data block based on the determined replication factor; and performing the compression, encryption, and storage on all the data blocks to complete the distributed secure storage of the financial transaction data stream; wherein the distribution change degree of each data block is calculated based on the probability of each character in the current block, the change in probability between adjacent blocks, and the global probability statistics over the whole data stream.
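
As an illustration of the first aspect, the following Python sketch splits a byte stream into fixed-size blocks and computes a per-character distribution change degree for each block. The text names the inputs (current-block probability, change between adjacent blocks, global statistics) but does not give closed-form expressions, so the formulas here are assumptions: the first factor is taken as a min-max normalization of the absolute probability difference, the second factor as the absolute deviation from the global mean divided by the population standard deviation, the first block is given a difference of 0 because it has no predecessor, and a zero denominator yields 0. Characters are treated as byte values, and all function names are hypothetical.

```python
from collections import Counter
from statistics import mean, pstdev


def split_blocks(stream: bytes, block_size: int) -> list[bytes]:
    """Divide a byte sequence into consecutive blocks of a preset size."""
    return [stream[i:i + block_size] for i in range(0, len(stream), block_size)]


def byte_probabilities(block: bytes) -> list[float]:
    """Occurrence probability of each of the 256 byte values in one block."""
    counts = Counter(block)
    return [counts.get(b, 0) / len(block) for b in range(256)]


def change_degrees(blocks: list[bytes]) -> list[list[float]]:
    """Per-block, per-byte change degree = first factor * second factor.

    Both factor formulas are assumptions made for this sketch.
    """
    if not blocks:
        return []
    probs = [byte_probabilities(b) for b in blocks]
    result = [[0.0] * 256 for _ in blocks]
    for c in range(256):
        series = [p[c] for p in probs]
        mu, sigma = mean(series), pstdev(series)
        # Absolute difference to the previous block; 0 for block 0 (assumption).
        diffs = [0.0] + [abs(series[k] - series[k - 1])
                         for k in range(1, len(series))]
        lo, hi = min(diffs), max(diffs)
        for k, p in enumerate(series):
            # Assumed first factor: min-max normalized absolute difference.
            first = (diffs[k] - lo) / (hi - lo) if hi > lo else 0.0
            # Assumed second factor: deviation from the mean in units of sigma.
            second = abs(p - mu) / sigma if sigma > 0 else 0.0
            result[k][c] = first * second
    return result
```

For example, `change_degrees(split_blocks(stream, 4096))` returns one list of 256 values per block; averaging a block's values over the characters that occur in it gives a single per-block figure that the later parameter rules can consume.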
Preferably, calculating the distribution change degree of the data block comprises calculating the distribution change degree of each character in the data block, and the distribution change degree of any character in the current data block is calculated as follows: calculating the occurrence probability of the character in the current data block; obtaining the mean and standard deviation of the occurrence probability of the character over all data blocks; calculating the difference between the occurrence probability of the character in the current data block and its occurrence probability in the previous data block to obtain a probability difference; obtaining the maximum and minimum of the probability difference of the character over all data blocks; performing a normalization calculation based on the absolute value of the probability difference and the maximum and minimum to obtain a first factor; performing a deviation-weighted calculation based on the occurrence probability of the character in the current data block and the mean and standard deviation of its occurrence probability over all data blocks to obtain a second factor; and multiplying the first factor by the second factor to obtain the distribution change degree of the character in the current data block.

Preferably, the obtaining of the weighted probability includes