CN-115630029-B - Data compression method, device, equipment and storage medium

Abstract

The invention provides a data compression method, device, equipment and storage medium. The method comprises: compressing original data to obtain a first compression result; obtaining first skip frame data based on the first compression result, the first skip frame data being used to locate the original data; compressing the first skip frame data and splicing the compressed first skip frame data after the first compression result; obtaining second skip frame data based on the first compression result and the compressed first skip frame data, the second skip frame data being used to locate the first skip frame data and the original data; and compressing the second skip frame data, splicing the compressed second skip frame data before the first compression result, and outputting a second compression result. A compressed file produced by this method supports random access to a given section of its data without decompressing the entire file.

Inventors

  • Chen Hualin
  • Mo Yingsheng
  • Ou Shunyin

Assignees

  • Shenzhen Pango Microsystems Co., Ltd. (深圳市紫光同创电子有限公司)

Dates

Publication Date: 2026-05-05
Application Date: 2022-10-09

Claims (7)

  1. A data compression method, the method comprising:
     compressing original data to obtain a first compression result;
     acquiring first skip frame data based on the first compression result, wherein the first skip frame data is used for locating the original data, compressing the first skip frame data to obtain compressed first skip frame data, and splicing the compressed first skip frame data after the first compression result;
     acquiring second skip frame data based on the first compression result and the compressed first skip frame data, wherein the second skip frame data is used for locating the first skip frame data and the original data; and
     compressing the second skip frame data to obtain compressed second skip frame data, splicing the compressed second skip frame data before the first compression result, and outputting a second compression result;
     wherein compressing the original data to obtain the first compression result comprises: splitting core data in the original data into frame data, and compressing the frame data to obtain compressed frame data, wherein the core data is the data to be accessed; recording first information corresponding to the compressed frame data to obtain index data, compressing the index data to obtain compressed index data, and splicing the compressed index data after the compressed frame data; and recording second information corresponding to the compression of the index data into the compressed index data to obtain data head data, compressing the data head data to obtain compressed data head data, and splicing the compressed data head data before the compressed frame data to obtain the first compression result;
     wherein acquiring the first skip frame data based on the first compression result comprises: recording third information corresponding to the compression of the frame data into the compressed frame data, recording fourth information corresponding to the compression of the index data into the compressed index data, and generating first metadata based on the third information and the fourth information; and storing the first metadata in a first skip frame structure corresponding to the first skip frame data, determining a first identifier that identifies the skip frame storing the first metadata, and storing the first identifier in the first skip frame structure where the first metadata is located to obtain the first skip frame data; and
     wherein acquiring the second skip frame data based on the first compression result and the compressed first skip frame data comprises: recording fifth information corresponding to the compression of the first skip frame data into the compressed first skip frame data, recording sixth information corresponding to the compressed data head data, and generating second metadata based on the fifth information and the sixth information; and storing the second metadata in a second skip frame structure corresponding to the second skip frame data, determining a second identifier that identifies the skip frame storing the second metadata, and storing the second identifier in the second skip frame structure where the second metadata is located to obtain the second skip frame data.
  2. The method of claim 1, wherein recording the third information corresponding to the compression of the frame data into the compressed frame data, recording the fourth information corresponding to the compression of the index data into the compressed index data, and generating the first metadata based on the third information and the fourth information comprises:
     recording a first position and a first data size of the frame data as compressed into the compressed frame data, and generating first positioning information of the frame data according to the first position and the first data size;
     recording a second position, in the compressed index data, of the index data corresponding to the frame data, and generating second positioning information of the index data according to the second position; and
     obtaining the first metadata based on the first positioning information and the second positioning information.
  3. The method of claim 1, wherein recording the fifth information corresponding to the compression of the first skip frame data into the compressed first skip frame data, recording the sixth information corresponding to the compressed data head data, and generating the second metadata based on the fifth information and the sixth information comprises:
     recording a third position and a third data size, in the compressed first skip frame data, of the first skip frame data corresponding to the frame data, and generating third positioning information of the first skip frame data according to the third position and the third data size;
     recording a fourth position and a fourth data size, in the compressed data head data, of the data head data corresponding to the frame data, and generating fourth positioning information of the data head data according to the fourth position and the fourth data size; and
     obtaining the second metadata based on the third positioning information and the fourth positioning information.
  4. The method of claim 1, wherein the first skip frame data and the second skip frame data are skippable frames as provided in the Zstd standard.
  5. A data compression apparatus, the apparatus comprising:
     a first acquisition module, configured to compress original data to obtain a first compression result;
     a second acquisition module, configured to acquire first skip frame data based on the first compression result, wherein the first skip frame data is used for locating the original data, to compress the first skip frame data to obtain compressed first skip frame data, and to splice the compressed first skip frame data after the first compression result;
     a third acquisition module, configured to acquire second skip frame data based on the first compression result and the compressed first skip frame data, wherein the second skip frame data is used for locating the first skip frame data and the original data; and
     a compression output module, configured to compress the second skip frame data to obtain compressed second skip frame data, to splice the compressed second skip frame data before the first compression result, and to output a second compression result;
     wherein compressing the original data to obtain the first compression result comprises: splitting core data in the original data into frame data, and compressing the frame data to obtain compressed frame data, wherein the core data is the data to be accessed; recording first information corresponding to the compressed frame data to obtain index data, compressing the index data to obtain compressed index data, and splicing the compressed index data after the compressed frame data; and recording second information corresponding to the compression of the index data into the compressed index data to obtain data head data, compressing the data head data to obtain compressed data head data, and splicing the compressed data head data before the compressed frame data to obtain the first compression result;
     wherein acquiring the first skip frame data based on the first compression result comprises: recording third information corresponding to the compression of the frame data into the compressed frame data, recording fourth information corresponding to the compression of the index data into the compressed index data, and generating first metadata based on the third information and the fourth information; and storing the first metadata in a first skip frame structure corresponding to the first skip frame data, determining a first identifier that identifies the skip frame storing the first metadata, and storing the first identifier in the first skip frame structure where the first metadata is located to obtain the first skip frame data; and
     wherein acquiring the second skip frame data based on the first compression result and the compressed first skip frame data comprises: recording fifth information corresponding to the compression of the first skip frame data into the compressed first skip frame data, recording sixth information corresponding to the compressed data head data, and generating second metadata based on the fifth information and the sixth information; and storing the second metadata in a second skip frame structure corresponding to the second skip frame data, determining a second identifier that identifies the skip frame storing the second metadata, and storing the second identifier in the second skip frame structure where the second metadata is located to obtain the second skip frame data.
  6. A computer device, the computer device comprising: a memory for storing a computer program; and a processor for executing the computer program to implement the data compression method according to any one of claims 1-4.
  7. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the data compression method according to any one of claims 1-4.
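
Claim 4 relies on the skippable frames of the Zstd format, which consist of a 4-byte little-endian magic number in the range 0x184D2A50 to 0x184D2A5F, a 4-byte little-endian payload size, and the payload itself. The following Python sketch shows how such a frame could be built and parsed. The function names are hypothetical, and mapping the claimed "identifier" onto the low 4 bits of the magic number is only an assumption; the claims do not say how the identifier is encoded.

```python
import struct

# Zstd skippable frames use any of 16 magic numbers, 0x184D2A50..0x184D2A5F.
SKIPPABLE_MAGIC_BASE = 0x184D2A50


def pack_skippable(payload: bytes, ident: int = 0) -> bytes:
    """Build one skippable frame: 4-byte LE magic, 4-byte LE size, user data.

    `ident` (0..15) is stored in the low 4 bits of the magic number. Using it
    as the claimed skip frame identifier is an assumption of this sketch.
    """
    if not 0 <= ident <= 15:
        raise ValueError("ident must fit in the low 4 bits of the magic number")
    header = struct.pack("<II", SKIPPABLE_MAGIC_BASE | ident, len(payload))
    return header + payload


def parse_skippable(buf: bytes, pos: int = 0):
    """Return (ident, payload, next_pos) for the skippable frame at `pos`."""
    magic, size = struct.unpack_from("<II", buf, pos)
    if magic & 0xFFFFFFF0 != SKIPPABLE_MAGIC_BASE:
        raise ValueError("not a skippable frame")
    start = pos + 8
    if start + size > len(buf):
        raise ValueError("truncated skippable frame")
    return magic & 0xF, buf[start:start + size], start + size
```

Because the frame carries its own size, a reader can step over it, or read only its payload, without touching the compressed data around it. That property is what makes skippable frames a natural container for positioning metadata.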

Description

Data compression method, device, equipment and storage medium

Technical Field

The invention belongs to the technical field of field programmable gate array (FPGA) integrated circuits, and particularly relates to a data compression method, device, equipment and storage medium.

Background

Compression algorithms are generally judged on two factors: compression ratio and compression efficiency. Some algorithms achieve a very high compression ratio but poor efficiency, while others are efficient but achieve only a modest ratio. Zstandard (zstd for short) is a fast lossless compression algorithm open-sourced by Facebook that balances compression ratio and compression efficiency well.

However, a compressed file produced by the zstd algorithm does not support random access. When a particular section of the data needs to be accessed, the file can only be read sequentially from its head, which lowers read efficiency in some application scenarios and narrows the range of applications of the zstd algorithm.

Disclosure of Invention

In view of this, a data compression method, device, equipment and storage medium are provided to solve the problem that compressed files produced by the existing zstd algorithm do not support random access to specific data within the file, which lowers read efficiency in some application scenarios and narrows the range of applications of the zstd algorithm.

The invention provides a data compression method, comprising:

  • compressing original data to obtain a first compression result;
  • acquiring first skip frame data based on the first compression result, wherein the first skip frame data is used for locating the original data, compressing the first skip frame data to obtain compressed first skip frame data, and splicing the compressed first skip frame data after the first compression result;
  • acquiring second skip frame data based on the first compression result and the compressed first skip frame data, wherein the second skip frame data is used for locating the first skip frame data and the original data;
  • compressing the second skip frame data to obtain compressed second skip frame data, splicing the compressed second skip frame data before the first compression result, and outputting a second compression result.

Further, compressing the original data to obtain the first compression result comprises:

  • splitting core data in the original data into frame data, and compressing the frame data to obtain compressed frame data, wherein the core data is the data to be accessed;
  • recording first information corresponding to the compressed frame data to obtain index data, compressing the index data to obtain compressed index data, and splicing the compressed index data after the compressed frame data;
  • recording second information corresponding to the compression of the index data into the compressed index data to obtain data head data, compressing the data head data to obtain compressed data head data, and splicing the compressed data head data before the compressed frame data to obtain the first compression result.

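The splicing order described above can be illustrated with a minimal Python sketch. It is not the patented implementation: zlib stands in for zstd so that the example needs only the standard library, the metadata is encoded as JSON, and the names `pack_meta`, `read_meta`, `build` and `read_frame`, as well as the metadata fields, are hypothetical. Positions are stored relative to the start of the compressed frame data, which is one possible choice among several.

```python
import json
import struct
import zlib

SKIP_MAGIC = 0x184D2A50  # Zstd skippable-frame magic number (low 4 bits free)


def pack_meta(meta: dict) -> bytes:
    """Hypothetical encoding: JSON -> zlib -> payload of a skippable frame."""
    payload = zlib.compress(json.dumps(meta).encode())
    return struct.pack("<II", SKIP_MAGIC, len(payload)) + payload


def read_meta(buf: bytes, pos: int):
    """Decode the skip frame at `pos`; return (meta, offset just past it)."""
    magic, size = struct.unpack_from("<II", buf, pos)
    if magic & 0xFFFFFFF0 != SKIP_MAGIC:
        raise ValueError("expected a skip frame")
    end = pos + 8 + size
    return json.loads(zlib.decompress(buf[pos + 8:end])), end


def build(core: bytes, frame_size: int) -> bytes:
    # Split the core data into frames and compress each one independently.
    frames = [zlib.compress(core[i:i + frame_size])
              for i in range(0, len(core), frame_size)]
    body = b"".join(frames)
    # Index data: position and size of every compressed frame within body.
    index, pos = [], 0
    for f in frames:
        index.append([pos, len(f)])
        pos += len(f)
    index_blob = zlib.compress(json.dumps(index).encode())
    # Data head data: where the compressed index sits relative to body.
    head_blob = zlib.compress(json.dumps(
        {"index_pos": len(body), "index_size": len(index_blob)}).encode())
    first_result = head_blob + body + index_blob
    # First skip frame data is spliced after the first compression result.
    first_skip = pack_meta({"frames": index})
    # Second skip frame data is spliced before it and locates the rest.
    second_skip = pack_meta({"head_size": len(head_blob),
                             "first_skip_pos": len(first_result)})
    return second_skip + first_result + first_skip


def read_frame(buf: bytes, n: int) -> bytes:
    """Decompress only frame n, using the two skip frames to find it."""
    meta2, base = read_meta(buf, 0)  # base = start of first compression result
    meta1, _ = read_meta(buf, base + meta2["first_skip_pos"])
    pos, size = meta1["frames"][n]
    start = base + meta2["head_size"] + pos
    return zlib.decompress(buf[start:start + size])
```

For example, `read_frame(build(data, 1024), 3)` returns bytes 3072 to 4095 of `data` while decompressing only the two small skip frames and the one requested frame. With real zstd frames in place of the zlib stand-in, a standard decoder would skip the two skippable frames while decoding the remaining frames in sequence.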
Further, acquiring the first skip frame data based on the first compression result comprises:

  • recording third information corresponding to the compression of the frame data into the compressed frame data, recording fourth information corresponding to the compression of the index data into the compressed index data, and generating first metadata based on the third information and the fourth information;
  • storing the first metadata in a first skip frame structure corresponding to the first skip frame data, determining a first identifier that identifies the skip frame storing the first metadata, and storing the first identifier in the first skip frame structure where the first metadata is located to obtain the first skip frame data.

Further, recording the third information, recording the fourth information, and generating the first metadata based on the third information and the fourth information comprises:

  • recording a first position and a first data size of the frame data as compressed into the compressed frame data, and generating first positioning information of the frame data according to the first position and the first data size;
  • recording a second position, in the compressed index data, of the index data corresponding to the frame data, and generating second positioning information of the index data according to the second position;
  • obtaining the first metadata based on the first positioning information and the second positioning information.

Further, the acquiring second skip frame data based on the first compression result and the com