CN-121979553-A - Differential upgrading method for compressed file

Abstract

The invention belongs to the technical field of computer software and data processing and provides a differential upgrade method for compressed files, comprising a differential stage and a restoration stage. In the differential stage, the original compressed packages of the starting version and the target version are first decompressed; the file permissions and the modification, access, and change timestamps of the two decompressed folders are normalized to uniform values; and the folders are recompressed to generate a byte-level deterministic first new compressed file and second new compressed file. A first differential file is generated from the differences in the normalized directory structure and content. At the same time, the second new compressed file is compared with the original compressed file of the target version, the ZIP metadata is parsed losslessly, and the difference information is modeled in structured form to generate a second differential file. The restoration stage performs, in order, normalization, restoration using the first differential file, recompression, and exact metadata reconstruction driven by the second differential file, and outputs a result that is byte-for-byte identical to the original compressed file of the target version. The method can improve the differential compression ratio and the restoration fidelity.

Inventors

  • HAN LIGUANG
  • LIU JIANWEI

Assignees

  • 红石阳光(北京)科技股份有限公司

Dates

Publication Date
20260505
Application Date
20260115

Claims (10)

  1. A differential upgrade method for compressed files, characterized by comprising a differential stage and a restoration stage;
     the differential stage comprises:
     S1, decompressing the original compressed file of a starting version and the original compressed file of a target version respectively to obtain a first folder and a second folder;
     S2, setting uniform file attributes and timestamps for all files in the first folder and the second folder respectively to obtain a first new folder and a second new folder;
     S3, compressing the first new folder and the second new folder using a fixed compression command to obtain a first new compressed file and a second new compressed file respectively;
     S4, generating a first differential file based on the differences in file content and structure between the first new folder and the second new folder;
     S5, generating a second differential file based on the differences in binary data and metadata between the second new compressed file and the original compressed file of the target version;
     the restoration stage comprises:
     R1, decompressing the original compressed file of the starting version to obtain a first folder;
     R2, setting uniform file attributes and timestamps for all files in the first folder to obtain a first new folder;
     R3, restoring the first new folder using the first differential file to obtain a second reconstructed folder;
     R4, compressing the second reconstructed folder using the fixed compression command to obtain a second new compressed file;
     R5, restoring the second new compressed file using the second differential file to obtain the original compressed file of the target version.
  2. The differential upgrade method for compressed files according to claim 1, wherein in step S4 the first differential file is generated using a binary differential algorithm.
  3. The differential upgrade method for compressed files according to claim 1, wherein in step S5 the second differential file is generated using a binary differential algorithm.
  4. The differential upgrade method for compressed files according to claim 1, wherein setting uniform file attributes and timestamps comprises: setting the modification time, access time, and change time of all files in the folder to the same fixed value, and setting the file permission attributes to the same fixed configuration.
  5. The differential upgrade method for compressed files according to claim 1, wherein the fixed compression command specifies a compression tool, a compression level, and compression parameters, so that compressing the same set of file contents and directory structure always outputs compressed files that are identical at the byte level.
  6. The differential upgrade method for compressed files according to claim 1, wherein the original compressed file of the starting version and the original compressed file of the target version are both files in ZIP or ZIP64 format.
  7. The differential upgrade method for compressed files according to claim 1, wherein the original compressed file is an OTA upgrade package for automotive ECU (electronic control unit) firmware.
  8. The differential upgrade method for compressed files according to claim 1, wherein step S5 further comprises reconstructing the complete metadata of the original compressed file of the target version, specifically comprising: before the binary differential operation between the second new compressed file and the original compressed file of the target version is executed, performing a lossless parsing operation on the original compressed file of the target version, the lossless parsing operation being implemented by directly reading the raw byte stream of the original compressed file of the target version and verifying syntactic and semantic compliance against the ZIP format specification; and, in the process of generating the second differential file, performing differential calculation on the binary data of corresponding fields, based on the metadata set of the second new compressed file and the metadata set of the original compressed file of the target version, to generate a compact differential file dedicated to metadata reconstruction.
  9. The differential upgrade method for compressed files according to claim 8, wherein performing the lossless parsing operation on the original compressed file of the target version comprises performing difference analysis and instruction encoding on the metadata set, with the following specific steps:
     comparing, entry by entry, the metadata set extracted from the original compressed file of the target version with the corresponding structure information extracted from the second new compressed file to identify the metadata difference types, the metadata difference types comprising entry addition and deletion, field value change, and byte padding;
     for entry addition and deletion, generating entry operation instructions according to the specific information of the added or deleted entries: for an added entry, extracting its metadata from the original compressed file of the target version and generating an entry operation instruction in the format of add instruction, entry identifier, and complete metadata; for a deleted entry, obtaining its central directory entry index position in the second new compressed file and generating an entry operation instruction in the format of delete instruction and entry identifier;
     for field value change, if the change occurs within a contiguous byte block that does not carry an independent semantic field boundary in the ZIP structure, applying a binary difference algorithm optimized on the basis of a rolling hash algorithm and a longest common subsequence (LCS) algorithm and outputting a compact binary patch stream; if the change occurs in a structured field with a clearly defined syntactic position and semantic type, encoding it as a structured field patch instruction in the format of field identifier and new byte value;
     for byte padding, performing a padding consistency judgment, namely comparing the start offset, padding length, and padding byte pattern of each padding region in the original compressed file of the target version and in the second new compressed file to judge whether the padding was introduced by the compression algorithm, the block alignment strategy, or toolchain implementation differences when the second new folder was recompressed into the second new compressed file; and, if padding is judged to have been added or its length changed, and the padding region is located at the end of a local file header or at a position where the ZIP format specification allows padding, such as immediately after a central directory entry or near the ZIP64 locator structure, generating a padding control instruction in the format of start offset address in the ZIP file byte stream, total length of padding bytes, and padding pattern;
     arranging the entry operation instructions, structured field patch instructions, and padding control instructions in byte-offset order within the ZIP file structure to obtain an instruction sequence; and
     packaging the instruction sequence together with an embedded lightweight virtual machine bytecode program to form the second differential file, wherein the virtual machine bytecode program provides instruction decoding, memory-mapped access, ZIP structure offset calculation, metadata byte writing, and CRC32 checksum recalculation, and, at run time, takes the complete byte stream of the original compressed file of the target version as the initial memory image, interprets and executes each instruction in sequence in a specified execution context, and finally reconstructs complete metadata fully consistent with the original compressed file of the target version.
  10. The differential upgrade method for compressed files according to claim 9, wherein in step R5 the specific steps of restoring the second new compressed file using the second differential file to obtain the original compressed file of the target version are as follows: compressing the second new folder using the fixed compression command to generate an intermediate ZIP file comprising the target file contents and a basic ZIP structure; and loading and running the embedded lightweight virtual machine bytecode program packaged in the second differential file, taking the complete byte stream of the intermediate ZIP file as input, and executing the metadata reconstruction instructions encoded in the byte stream to generate a final file that is identical at the byte level to the original compressed file of the target version, wherein the embedded lightweight virtual machine is a dedicated virtual machine based on a stack architecture whose instruction set comprises 16 core operation instructions covering core functions such as metadata reading and writing, offset calculation, and checksum recalculation.
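
The metadata reconstruction described in claims 9 and 10 can be illustrated with a minimal Python sketch. It is not the claimed stack-based virtual machine and does not reproduce its 16-instruction set: the `FieldPatch` and `Padding` classes, the use of absolute offsets as field identifiers, and the `apply_instructions` function are all hypothetical names chosen for illustration. The sketch only shows the idea of applying structured field patches and padding control instructions in byte-offset order to an intermediate byte stream.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldPatch:
    """Hypothetical structured field patch: overwrite a field in place."""
    offset: int   # position of the field in the target byte stream (assumed identifier)
    value: bytes  # new raw bytes for the field


@dataclass(frozen=True)
class Padding:
    """Hypothetical padding control instruction: insert a padding region."""
    offset: int     # start offset of the padding region in the target byte stream
    length: int     # total number of padding bytes
    pattern: bytes  # byte pattern repeated to fill the region


def apply_instructions(intermediate, instructions):
    """Apply instructions in ascending offset order and return the rebuilt bytes."""
    out = bytearray(intermediate)
    for ins in sorted(instructions, key=lambda i: i.offset):
        if isinstance(ins, FieldPatch):
            # Same-length overwrite: later offsets are not shifted.
            out[ins.offset:ins.offset + len(ins.value)] = ins.value
        elif isinstance(ins, Padding):
            # Insertion: repeat the pattern, then cut it to the exact length.
            reps = -(-ins.length // len(ins.pattern))  # ceiling division
            out[ins.offset:ins.offset] = (ins.pattern * reps)[:ins.length]
        else:
            raise TypeError(f"unknown instruction: {ins!r}")
    return bytes(out)


# Toy byte streams: a central directory header signature, the 2-byte
# "version made by" field, and a stand-in for the rest of the record.
intermediate = b"PK\x01\x02" + b"\x14\x00" + b"REST"
target = b"PK\x01\x02" + b"\x1e\x03" + b"\x00\x00\x00" + b"REST"

program = [
    Padding(offset=6, length=3, pattern=b"\x00"),
    FieldPatch(offset=4, value=b"\x1e\x03"),
]
rebuilt = apply_instructions(intermediate, program)
print(rebuilt == target)  # True
```

Because offsets are expressed in target coordinates and instructions run in ascending order, every insertion at a lower offset has already been applied by the time a higher-offset instruction executes, which is one plausible reason for sorting the instruction sequence by byte offset.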

Description

Differential upgrading method for compressed file

Technical Field

The invention relates to the technical field of computer software and data processing, in particular to over-the-air (OTA) upgrading of electronic control units (ECUs) in intelligent connected vehicles, and specifically to a differential upgrade method for compressed files.

Background

In the intelligent connected vehicle field, the demand for OTA upgrading of ECUs is increasingly urgent. The full OTA upgrade package for an ECU is usually provided by a supplier and has two main characteristics: first, the file is very large, so downloading the upgrade package takes too long, which seriously affects user experience and upgrade efficiency; second, the upgrade package is compressed and stored in ZIP format to suit the storage and transmission requirements of the in-vehicle system.

Differential upgrade is the mainstream technical choice for reducing download time, but differential processing of automotive ECU upgrade packages in ZIP format faces several technical bottlenecks. The ZIP format specification allows compression tools to introduce variables unrelated to content during generation, including file timestamps (modification, access, and change time), file permissions (Unix external attributes), the compression tool identifier (version made by), local file header padding bytes, central directory entry order, the position of the ZIP64 extended locator, and the checksum calculation context. Different toolchains (such as zip, 7z, Info-ZIP, and miniz), and even different versions or runtime environments of the same tool, can therefore produce ZIP outputs with different hash values from the same set of source files.
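
The effect of these non-content variables can be seen in a minimal Python sketch using the standard `zipfile` module. The fixed timestamp, permission bits, compression level, and the `build_zip` helper are assumptions made for illustration and are not taken from the patent; the sketch pins entry order, timestamp, host system, and permissions, then shows that changing only the timestamp changes the hash.

```python
import hashlib
import io
import zipfile

FIXED_TIME = (2000, 1, 1, 0, 0, 0)  # assumed fixed timestamp
FIXED_MODE = 0o644                  # assumed fixed Unix permission bits


def build_zip(files, date_time=FIXED_TIME, mode=FIXED_MODE):
    """Build a ZIP in memory from {name: bytes} with pinned metadata."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in sorted(files):  # fixed entry order
            info = zipfile.ZipInfo(name, date_time=date_time)
            info.compress_type = zipfile.ZIP_DEFLATED
            info.create_system = 3  # pin the "version made by" host to Unix
            info.external_attr = (0o100000 | mode) << 16  # regular file + permissions
            zf.writestr(info, files[name], compresslevel=6)
    return buf.getvalue()


def sha256(data):
    return hashlib.sha256(data).hexdigest()


files = {"ecu/app.bin": b"\x01\x02" * 512, "ecu/manifest.txt": b"version=2\n"}

a = build_zip(files)
b = build_zip(dict(reversed(list(files.items()))))       # same content, other insertion order
c = build_zip(files, date_time=(2024, 5, 1, 12, 0, 0))   # same content, other timestamp

print(sha256(a) == sha256(b))  # True: pinned metadata gives identical bytes
print(sha256(a) != sha256(c))  # True: only the timestamp differs
```

Byte-level equality between `a` and `b` holds within one environment; a different zlib build or compression tool may still emit a different compressed stream, which is the toolchain dependence the paragraph above describes.
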
Automotive OTA upgrading places extremely high integrity and consistency requirements on the upgrade package: the restored ZIP file must have exactly the same hash values (MD5, SHA256, and so on) as the original compressed file of the target version. Otherwise it fails the integrity check of the in-vehicle system, causing the upgrade to fail or even causing abnormal vehicle functions. In addition, ZIP metadata is not purely derived information. Fields such as the CRC32, compressed and uncompressed sizes, and file name length of each entry in the central directory depend on the actual layout and padding policy of the local file header, while the extra field length, compression method identifier, encryption flag, and similar fields in the local file header are dynamically affected by file content characteristics and compression parameters, forming cross-level coupling. Conventional schemes attempt normalization by presetting or ignoring the timestamp field, but they cannot resolve the semantic ambiguity of the external attribute field (such as inconsistent permission mapping between Windows and Linux), the dynamic offset of padding regions, the uncontrollable ordering of central directory entries, or the sensitive trigger conditions of ZIP64 extension structures. As a result, a ZIP file that has been decompressed and recompressed deviates systematically from the original upgrade package at the metadata level. A first-layer patch that relies only on content differences cannot support high-precision restoration, and, more directly, the MD5 value of the recompressed file changes, so the strict verification requirements of automotive OTA upgrading are not met.
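
The central directory fields mentioned above can be read directly from the raw byte stream, as the following minimal Python sketch shows. It assumes a single-disk archive without ZIP64 structures and without an archive comment containing the end-of-central-directory signature; the `read_central_directory` helper and the dictionary keys are hypothetical names. The record layouts follow the published ZIP format (22-byte end-of-central-directory record, 46-byte fixed part of each central directory header).

```python
import io
import struct
import zipfile

EOCD_SIG = b"PK\x05\x06"  # end of central directory record
CDFH_SIG = b"PK\x01\x02"  # central directory file header


def read_central_directory(raw):
    """Return one dict per central directory entry, parsed from raw bytes."""
    eocd = raw.rfind(EOCD_SIG)
    if eocd < 0:
        raise ValueError("no end-of-central-directory record found")
    # sig, disk, cd_disk, entries_on_disk, entries_total, cd_size, cd_offset, comment_len
    (_, _, _, _, total, _, cd_offset, _) = struct.unpack_from("<4s4H2LH", raw, eocd)

    entries, pos = [], cd_offset
    for _ in range(total):
        (sig, made_by, _need, flags, method, _mtime, _mdate, crc, csize, usize,
         nlen, elen, clen, _disk, _iattr, xattr, lho) = struct.unpack_from(
            "<4s6H3L5H2L", raw, pos)
        if sig != CDFH_SIG:
            raise ValueError(f"bad central directory signature at offset {pos}")
        name = raw[pos + 46:pos + 46 + nlen]
        entries.append({
            "name": name.decode("utf-8" if flags & 0x800 else "cp437"),
            "version_made_by": made_by,
            "method": method,
            "crc32": crc,
            "compressed_size": csize,
            "uncompressed_size": usize,
            "external_attr": xattr,
            "local_header_offset": lho,
        })
        pos += 46 + nlen + elen + clen  # skip name, extra field, and comment
    return entries


# Build a small archive in memory and parse it without using zipfile's reader.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("ecu/app.bin", b"\x00" * 1024)
    zf.writestr("ecu/manifest.txt", b"version=2\n")
raw = buf.getvalue()

entries = read_central_directory(raw)
for e in entries:
    print(e["name"], hex(e["crc32"]), e["compressed_size"], e["uncompressed_size"])
```

Comparing two such entry lists field by field, one from the recompressed archive and one from the original target archive, is one way to locate the metadata-level deviations described above.
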
Existing binary differential tools (such as bsdiff and xdelta) are not optimized for the syntactic structure constraints and semantic layering of the ZIP format or for the high consistency requirements of the automotive OTA scenario. They perform only coarse-grained byte-stream comparison on metadata differences, which produces redundant patches that increase the download load and cannot guarantee automatic recalculation and consistency of check fields such as CRC32, so the strict byte-level restoration accuracy required for ECU upgrade packages is hard to meet. A two-stage differential mechanism is therefore needed that accounts for file content determinism, directory structure stability, and ZIP metadata syntax compliance, and that, while reducing the size and download time of the upgrade package, ensures that the decompressed and recompressed ZIP file is byte-for-byte identical to the original target version, achieving efficient and reliable end-to-end differential upgrade of automotive ECU upgrade packages in ZIP format.

Disclosure of Invention

The invention provides a differential upgrade method for compressed files, which aims to solve the technical problem that, in differential upgrading of existing automotive ECU ZIP-format upgrade packages, the hash value of a target version ZI