US-12627316-B2 - Data processing method and data processing apparatus
Abstract
The present invention proposes a data processing method, a data processing apparatus, an electronic device, and a computer-readable storage medium. The method includes: in response to a data processing instruction, segmenting an original data to obtain multiple data segments of the original data to enable parallel processing of the multiple data segments, wherein the multiple data segments include a first data segment; determining whether the first data segment is suitable for using a preset run-length encoding; and when the first data segment is suitable for using the run-length encoding, using the run-length encoding to perform a run-length encoding processing on the first data segment. According to some embodiments, the original data is first segmented into multiple data segments, and each data segment can be processed independently and concurrently, thereby increasing the speed of data compression processing.
Inventors
- Chenchen Lu
- Haiqi Tang
- Kuen Hung Tsoi
- Xinyu NIU
Assignees
- SHENZHEN CORERAIN TECHNOLOGIES CO., LTD.
Dates
- Publication Date
- 20260512
- Application Date
- 20240430
- Priority Date
- 20231211
Claims (13)
- 1. A data processing method, characterized by comprising: in response to a data processing instruction, segmenting an original data to obtain multiple data segments of the original data to enable parallel processing of the multiple data segments, wherein the multiple data segments comprise a first data segment; determining whether the first data segment is suitable for using a preset run-length encoding; and when the first data segment is suitable for using the run-length encoding, using the run-length encoding to perform a run-length encoding processing on the first data segment; the multiple data segments further comprise a second data segment, and the data processing method further comprises: determining whether the second data segment is suitable for using the run-length encoding; and when the second data segment is not suitable for using the run-length encoding, not processing the second data segment or encoding the second data segment using a different encoding method than the run-length encoding.
- 2. The data processing method according to claim 1, characterized in that the step of in response to a data processing instruction, segmenting an original data to obtain multiple data segments of the original data comprises: segmenting the original data according to a length of the original data; segmenting the original data according to a data distribution of the original data; or segmenting the original data according to a preset segmentation constant.
- 3. The data processing method according to claim 1, characterized by further comprising: concatenating the first data segment and the second data segment.
- 4. The data processing method according to claim 3, characterized in that the step of concatenating the first data segment and the second data segment comprises: when both the first data segment and the second data segment are processed using or not using the run-length encoding, directly concatenating the first data segment and the second data segment.
- 5. The data processing method according to claim 3, characterized in that the step of concatenating the first data segment and the second data segment comprises: when only one of the first data segment and the second data segment is processed using the run-length encoding, marking a position of the first data segment after the run-length encoding processing.
- 6. The data processing method according to claim 5, characterized in that the step of marking a position of the first data segment after the run-length encoding processing comprises: adding a special character before and after the position of the first data segment after the run-length encoding processing to mark the position of the first data segment after the run-length encoding processing; or storing a position data of the first data segment after the run-length encoding processing to mark the position of the first data segment after the run-length encoding processing.
- 7. An electronic device, comprising: a processor; and a memory storing a computer program that, when executed by the processor, causes the processor to execute the data processing method as claimed in claim 1.
- 8. An electronic device, comprising: a processor; and a memory storing a computer program that, when executed by the processor, causes the processor to execute the data processing method as claimed in claim 2.
- 9. An electronic device, comprising: a processor; and a memory storing a computer program that, when executed by the processor, causes the processor to execute the data processing method as claimed in claim 3.
- 10. An electronic device, comprising: a processor; and a memory storing a computer program that, when executed by the processor, causes the processor to execute the data processing method as claimed in claim 4.
- 11. An electronic device, comprising: a processor; and a memory storing a computer program that, when executed by the processor, causes the processor to execute the data processing method as claimed in claim 5.
- 12. An electronic device, comprising: a processor; and a memory storing a computer program that, when executed by the processor, causes the processor to execute the data processing method as claimed in claim 6.
- 13. A data processing apparatus, characterized by comprising: a segmenting unit, configured to, in response to a data processing instruction, segment an original data to obtain multiple data segments of the original data to enable parallel processing of the multiple data segments, wherein the multiple data segments comprise a first data segment and a second data segment; a run-length encoding processing confirmation unit, configured to determine whether the first data segment is suitable for using a preset run-length encoding, and configured to determine whether the second data segment is suitable for using the run-length encoding; and a run-length encoding unit, configured to, when the first data segment is suitable for using the run-length encoding, use the run-length encoding to perform a run-length encoding processing on the first data segment, and configured to, when the second data segment is not suitable for using the run-length encoding, not process the second data segment or encode the second data segment using a different encoding method than the run-length encoding.
Description
CROSS REFERENCE TO RELATED APPLICATIONS
The present application claims the benefit of Chinese Patent Application No. 202311685329.1 filed on Dec. 11, 2023, the contents of which are incorporated herein by reference in their entirety.
TECHNICAL FIELD
The present invention relates to the field of data transmission, and in particular to a data processing method, a data processing apparatus, and an electronic device.
BACKGROUND TECHNIQUE
No matter how fast computers and networks transmit data, users always demand a faster experience. To reduce the volume of transmitted data, the data is usually compressed. There are numerous data compression algorithms, some lossless and some lossy, but all aim primarily to minimize storage space and transmission volume.
Run-length encoding replaces consecutive occurrences of the same data element with the element itself and a count of how many times it appears. Even if a data element appears only once, it still has to be represented by two parts, {K, n}, where K is the element and n is its count. Consequently, if the elements of the original data are rarely repeated, the encoded data becomes larger because of the extra n, which defeats the purpose of data compression. In particular, if some segments of the data to be compressed contain highly repetitive elements while other segments contain few repetitions, applying run-length encoding directly to the whole data may produce an encoded result that is larger than the original.
Additionally, when run-length encoding is performed on the original data, the current method compares each data element with its predecessor and successor, one at a time, to determine whether they are identical. While this encoding technique is straightforward, it can process only one data element at a time, which slows processing and hinders parallel processing of multiple data elements.
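The {K, n} representation described in the background can be illustrated with a minimal Python sketch. This is not taken from the patent: the function names `rle_encode`, `rle_decode`, and `encoded_size` are hypothetical, and the size measure (two stored values per pair) is a simplifying assumption used only to show how data with few repetitions grows after encoding.

```python
from itertools import groupby


def rle_encode(data):
    """Return a list of (K, n) pairs: element K followed by its run length n."""
    return [(key, len(list(run))) for key, run in groupby(data)]


def rle_decode(pairs):
    """Expand (K, n) pairs back into the original sequence."""
    out = []
    for key, n in pairs:
        out.extend([key] * n)
    return out


def encoded_size(pairs):
    """Assumed size measure: each pair stores two values, K and n."""
    return 2 * len(pairs)


repetitive = list("AAAAAABBBBCC")  # long runs: 12 elements
varied = list("ABCDEF")            # no repeats: 6 elements

print(rle_encode(repetitive), encoded_size(rle_encode(repetitive)))
print(encoded_size(rle_encode(varied)))
```

Under this size measure, the repetitive input shrinks from 12 stored values to 6, while the input without repeats grows from 6 stored values to 12, which is the expansion problem the background describes.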
Contents of the Invention
The present invention aims to propose a data processing method, a data processing apparatus, and an electronic device to address the poor compression efficiency of current run-length encoding.
According to an aspect of the present invention, a data processing method is proposed, including: in response to a data processing instruction, segmenting an original data to obtain multiple data segments of the original data to enable parallel processing of the multiple data segments, wherein the multiple data segments include a first data segment; determining whether the first data segment is suitable for using a preset run-length encoding; and when the first data segment is suitable for using the run-length encoding, using the run-length encoding to perform a run-length encoding processing on the first data segment.
According to some embodiments, the step of segmenting an original data, in response to a data processing instruction, to obtain multiple data segments of the original data includes: segmenting the original data according to a length of the original data; or segmenting the original data according to a data distribution of the original data; or segmenting the original data according to a preset segmentation constant.
According to some embodiments, the multiple data segments further include a second data segment, and the data processing method further includes: determining whether the second data segment is suitable for using the run-length encoding; and when the second data segment is not suitable for using the run-length encoding, not processing the second data segment or encoding the second data segment using a different encoding method than the run-length encoding.
According to some embodiments, the method further includes: concatenating the first data segment and the second data segment.
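The steps summarized above (segment, decide per segment, encode only where suitable, then concatenate) can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: the names `SEGMENT_LENGTH`, `split_into_segments`, `is_suitable_for_rle`, `process_segment`, `process`, and `restore` are hypothetical; segmentation uses a preset constant (one of the three listed options); the suitability test (encoded size smaller than the raw size) is an assumed criterion, since the text here does not define one; and the per-segment "rle"/"raw" tag is an assumed way of recording which segments were encoded.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import groupby

SEGMENT_LENGTH = 8  # hypothetical preset segmentation constant


def split_into_segments(data, segment_length=SEGMENT_LENGTH):
    """Cut the original data into fixed-length segments (the last may be shorter)."""
    return [data[i:i + segment_length] for i in range(0, len(data), segment_length)]


def rle_encode(segment):
    """Return (K, n) pairs for one segment."""
    return [(key, len(list(run))) for key, run in groupby(segment)]


def is_suitable_for_rle(segment):
    """Assumed criterion: two values per run must be fewer than the raw length."""
    runs = sum(1 for _ in groupby(segment))
    return 2 * runs < len(segment)


def process_segment(segment):
    """Encode a suitable segment; leave an unsuitable one unprocessed."""
    if is_suitable_for_rle(segment):
        return ("rle", rle_encode(segment))
    return ("raw", segment)


def process(data):
    """Process every segment independently; map() keeps the original order."""
    segments = split_into_segments(data)
    with ThreadPoolExecutor() as pool:
        return list(pool.map(process_segment, segments))


def restore(processed):
    """Concatenate the segments back into the original string."""
    out = []
    for kind, payload in processed:
        if kind == "rle":
            for key, n in payload:
                out.extend([key] * n)
        else:
            out.extend(payload)
    return "".join(out)


result = process("AAAAAAAABCDEFGHIBBBBBBBB")
print(result)
print(restore(result))
```

For the 24-character input above, the first and third segments are run-length encoded and the middle segment is left as raw data. The thread pool is used only to show that the segments have no dependencies on each other; in CPython a process pool or a hardware implementation would be needed for real CPU parallelism.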
According to some embodiments, the step of concatenating the first data segment and the second data segment includes: when both the first data segment and the second data segment are processed using or not using the run-length encoding, directly concatenating the first data segment and the second data segment.
According to some embodiments, the step of concatenating the first data segment and the second data segment includes: when only one of the first data segment and the second data segment is processed using the run-length encoding, marking a position of the first data segment after the run-length encoding processing.
According to some embodiments, the step of marking a position of the first data segment after the run-length encoding processing includes: adding a special character before and after the position of the first data segment after the run-length encoding processing to mark the position of the first data segment after the run-length encoding processing; or storing a position data of the first data segment after the run-length encoding processing to mark the position of the first data segment after the run-length encoding processing.
According to an aspect