CN-121979847-A - Data processing method and device
Abstract
The application provides a data processing method and device. The method comprises: obtaining data to be processed that is continuously output to a cache; analyzing the data to be processed and, in response to the data to be processed meeting a set data compression condition, extracting first data from the data to be processed, wherein the first data comprises at least constant information and variable information capable of characterizing cycle characteristics; generating second data based on the first data, wherein the second data comprises at least cyclic structure data corresponding to the constant information and the variable information, and the cyclic structure data represents cyclic characteristic information of repeatedly occurring elements in the first data; and storing the second data.
Inventors
- GAO HONGLI
- PAN YUYAN
Assignees
- Lenovo (Beijing) Co., Ltd. (联想(北京)有限公司)
Dates
- Publication Date: 2026-05-05
- Application Date: 2025-12-17
Claims (10)
- 1. A data processing method, the method comprising: obtaining data to be processed that is continuously output to a cache; analyzing the data to be processed and, in response to the data to be processed meeting a set data compression condition, extracting first data from the data to be processed, wherein the first data comprises at least constant information and variable information capable of characterizing cycle characteristics; generating second data based on the first data, wherein the second data comprises at least cyclic structure data corresponding to the constant information and the variable information, and the cyclic structure data is used to represent cyclic characteristic information of repeatedly occurring elements in the first data; and storing the second data.
- 2. The method of claim 1, wherein the extracting first data from the data to be processed in response to the data to be processed meeting a set data compression condition comprises: in response to the same text structure information existing in a consecutive first quantity of data to be processed, extracting the text structure information as the constant information in the first data; wherein the same text structure information characterizes the same content and/or description content in the cyclic process.
- 3. The method of claim 1, wherein the extracting first data from the data to be processed in response to the data to be processed meeting a set data compression condition comprises: in response to the value change frequency at a set field position in a consecutive second quantity of data to be processed exceeding a preset threshold, extracting the name of the changed data and the corresponding value set as the variable information; wherein the name of the changed data represents the variable identifier of the changed data in the cyclic process, and the value set represents the values taken by the changed data in the cyclic process.
- 4. The method of claim 1, the method further comprising: determining the data to be processed as cyclic data in response to the data to be processed meeting the set data compression condition; obtaining a time stamp corresponding to each piece of cyclic data; in response to the next piece of data after the cyclic data not meeting the set data compression condition, determining a cycle start time and a cycle end time corresponding to the cyclic data based on the time stamp corresponding to each piece of cyclic data; and generating the second data based on the first data, the cycle start time, and the cycle end time in the cyclic data.
- 5. The method of claim 1, wherein the generating second data based on the first data comprises: in response to constant information being included in at least two pieces of first data, determining a constant data structure corresponding to the constant information, wherein the constant data structure characterizes the fixed description content of the plurality of pieces of first data in the cyclic process; in response to variable information being included in at least two pieces of first data, determining a variable data structure based on the variable information and a cycle rule corresponding to the variable information, wherein the variable data structure represents the changing content of the plurality of pieces of first data in the cyclic process, and the changing content comprises at least the name of the changed data and a corresponding value set; and ordering the constant data structure and the variable data structure based on the time sequence of the data to be processed to obtain the second data.
- 6. The method of claim 5, wherein determining a variable data structure based on the variable information and the cycle rule corresponding to the variable information comprises: determining the field position of the variable information in the data to be processed; determining the change rule of the data at the field position with respect to the time stamps of the data to be processed; in response to the change rule satisfying the cycle rule, determining the variable data structure based on the change rule of the time stamps and the variable information; and in response to the change rule not satisfying the cycle rule, determining the number of cycles of the variable information in the cyclic process, and determining the variable data structure based on the variable information and the number of cycles.
- 7. The method of claim 1, wherein after the storing the second data, the method further comprises: deleting, from the cache, the data to be processed associated with the second data.
- 8. The method of claim 6, the method further comprising: in response to the change rule satisfying the cycle rule, determining a time period of the data to be processed, wherein the time period is a fixed interval between adjacent time stamps; and recording the time period in the second data to represent a cycle characteristic of the first data in the cyclic process.
- 9. The method of claim 6, the method further comprising: in response to the change rule not satisfying the cycle rule, recording the number of cycles in the second data to represent the cycle characteristics of the first data in the cyclic process.
- 10. A data processing apparatus, the apparatus comprising: an obtaining module configured to obtain data to be processed that is continuously output to a cache; a processing module configured to analyze the data to be processed and, in response to the data to be processed meeting a set data compression condition, extract first data from the data to be processed, wherein the first data comprises at least constant information and variable information capable of characterizing cycle characteristics; a generation module configured to generate second data based on the first data, wherein the second data comprises at least cyclic structure data corresponding to the constant information and the variable information, and the cyclic structure data is used to represent cyclic characteristic information of repeatedly occurring elements in the first data; and a storage module configured to store the second data.
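As an illustration of the branching described in claims 6, 8, and 9, the following Python sketch builds a variable data structure from time-stamped values. It assumes, for illustration only, that the cycle rule is satisfied when adjacent time stamps are separated by one fixed interval (the reading suggested by claim 8). The function name `variable_structure` and the dictionary keys are hypothetical and do not come from the application.

```python
def variable_structure(name, samples):
    """Build a variable record from (timestamp, value) pairs given in time order.

    Assumption: the "cycle rule" holds when every gap between adjacent
    timestamps is the same. Integer timestamps keep the comparison exact.
    """
    times = [t for t, _ in samples]
    values = [v for _, v in samples]
    gaps = {later - earlier for earlier, later in zip(times, times[1:])}

    record = {"name": name, "values": values}
    if len(gaps) == 1:
        # Regular spacing: record the fixed period (claim 8).
        record["period"] = gaps.pop()
    else:
        # Irregular spacing: fall back to the number of cycles (claim 9).
        record["count"] = len(samples)
    return record
```

With regularly spaced samples the record carries a `period`; otherwise it carries a `count`. Either field lets a reader of the second data recover how many times, or how often, the value recurred.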
Description
Data processing method and device
Technical Field
The present application relates to the field of computer technologies, and in particular to a data processing method and apparatus.
Background
In current data processing, a system typically generates a large amount of log data, which often contains repeated information. This repeated information not only occupies a large amount of memory space but also reduces program operating efficiency. The redundancy in log data therefore lowers storage efficiency, and because a large amount of data must be retrieved when the log data is analyzed, it also lowers analysis efficiency.
Disclosure of Invention
The embodiments of the application provide a data processing method, a data processing device, an electronic device, and a storage medium.
The application provides a data processing method, comprising: obtaining data to be processed that is continuously output to a cache; analyzing the data to be processed and, in response to the data to be processed meeting a set data compression condition, extracting first data from the data to be processed, wherein the first data comprises at least constant information and variable information capable of characterizing cycle characteristics; generating second data based on the first data, wherein the second data comprises at least cyclic structure data corresponding to the constant information and the variable information, and the cyclic structure data is used to represent cyclic characteristic information of repeatedly occurring elements in the first data; and storing the second data.
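The overall flow of the method can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the application's implementation: log entries are assumed to be whitespace-separated text lines, and the "set data compression condition" is assumed to mean at least `min_run` consecutive lines with the same number of fields and at least one field that never changes. The names `compress_run`, `expand`, and `min_run` are hypothetical.

```python
from typing import Optional


def compress_run(lines: list[str], min_run: int = 3) -> Optional[dict]:
    """Fold consecutive, similarly shaped log lines into one loop-like record.

    Returns None when the (assumed) compression condition is not met.
    """
    rows = [line.split() for line in lines]
    # Assumed condition: enough lines, all with the same number of fields.
    if len(rows) < min_run or len({len(row) for row in rows}) != 1:
        return None

    template: list[Optional[str]] = []    # constant tokens; None marks a variable slot
    variables: dict[int, list[str]] = {}  # field position -> values in arrival order
    for pos, column in enumerate(zip(*rows)):
        if len(set(column)) == 1:
            template.append(column[0])    # constant information
        else:
            template.append(None)
            variables[pos] = list(column)  # variable information

    # If no field is constant, nothing repeats and there is nothing to fold.
    if all(token is None for token in template):
        return None
    return {"template": template, "variables": variables, "count": len(rows)}


def expand(record: dict) -> list[str]:
    """Rebuild the lines from a record (whitespace normalised to single spaces)."""
    return [
        " ".join(
            token if token is not None else record["variables"][pos][i]
            for pos, token in enumerate(record["template"])
        )
        for i in range(record["count"])
    ]
```

For example, the three lines `fan speed 1200 rpm`, `fan speed 1250 rpm`, and `fan speed 1300 rpm` fold into one record holding the constant tokens once and the three speed values as a list, and `expand` reproduces the original lines. Storing the record in place of the raw lines is what removes the redundancy described in the Background.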
According to one embodiment of the application, extracting the first data from the data to be processed in response to the data to be processed meeting the set data compression condition comprises: in response to the same text structure information existing in a consecutive first quantity of data to be processed, extracting the text structure information as the constant information in the first data, wherein the same text structure information represents the content and/or the description content with the same position in the cyclic process.
According to one embodiment of the application, extracting the first data from the data to be processed in response to the data to be processed meeting the set data compression condition comprises: in response to the value change frequency at a set field position in a consecutive second quantity of data to be processed exceeding a preset threshold, extracting the name of the changed data and a corresponding value set as the variable information, wherein the name of the changed data represents the variable identifier of the changed data in the cyclic process, and the value set represents the values taken by the changed data in the cyclic process.
According to one embodiment of the application, the method further comprises: determining the data to be processed as cyclic data in response to the data to be processed meeting the set data compression condition; obtaining a time stamp corresponding to each piece of cyclic data; in response to the next piece of data after the cyclic data not meeting the set data compression condition, determining a cycle start time and a cycle end time corresponding to the cyclic data based on the time stamp corresponding to each piece of cyclic data; and generating the second data based on the first data, the cycle start time, and the cycle end time in the cyclic data.
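The change-frequency test and the cycle start and end times can be illustrated with a short Python sketch. The assumptions are labeled here and in the comments: variable fields are taken to be numeric substrings, two entries belong to the same cycle when their text is identical once numbers are masked, and change frequency is the fraction of adjacent pairs whose values differ. The names `change_frequency`, `cycle_windows`, `min_run`, and `threshold` are hypothetical.

```python
import re
from itertools import groupby

# Assumption: variable fields are the numeric substrings of each entry.
NUMBER = re.compile(r"\d+(?:\.\d+)?")


def change_frequency(values):
    """Fraction of adjacent pairs whose values differ (0.0 for fewer than two values)."""
    if len(values) < 2:
        return 0.0
    changes = sum(a != b for a, b in zip(values, values[1:]))
    return changes / (len(values) - 1)


def _skeleton(entry):
    """Text of a (timestamp, text) entry with every number masked as {}."""
    return NUMBER.sub("{}", entry[1])


def cycle_windows(entries, min_run=2, threshold=0.5):
    """Fold consecutive (timestamp, text) entries that share a skeleton.

    groupby only merges adjacent entries, so a cycle ends as soon as the
    next entry stops matching; its first and last timestamps give the
    cycle start and end times.
    """
    result = []
    for template, group in groupby(entries, key=_skeleton):
        run = list(group)
        # One column of values per numeric field position.
        columns = [list(col) for col in zip(*(NUMBER.findall(text) for _, text in run))]
        variable_fields = [i for i, col in enumerate(columns)
                           if change_frequency(col) > threshold]
        if len(run) >= min_run and variable_fields:
            result.append({
                "template": template,
                "values": columns,
                "variable_fields": variable_fields,
                "start": run[0][0],
                "end": run[-1][0],
            })
        else:
            # Not cyclic under the assumed condition: keep the entries as-is.
            result.extend({"time": ts, "raw": text} for ts, text in run)
    return result
```

Given the entries `(0, "boot ok")`, `(1, "temp 40")`, `(2, "temp 41")`, `(3, "temp 43")`, `(4, "shutdown")`, the three `temp` entries fold into a single record with `start` 1 and `end` 3, while the other two entries pass through unchanged.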
According to one embodiment of the application, generating the second data based on the first data comprises: in response to constant information being included in at least two pieces of first data, determining a constant data structure corresponding to the constant information, wherein the constant data structure represents the fixed description content of the plurality of pieces of first data in the cyclic process; in response to variable information being included in at least two pieces of first data, determining a variable data structure based on the variable information and the cycle rule corresponding to the variable information, wherein the variable data structure represents the changing content of the plurality of pieces of first data in the cyclic process, and the changing content comprises at least the name of the changed data and a corresponding value set; and ordering the constant data structure and the variable data structure based on the time sequence of the data to be processed to obtain the second data.
According to one embodiment of the application, determining the variable data structure based on the variable information and the cycle rule corresponding to the variable information comprises: determining a field position of the variable information in the data to be processed; determining a change rule of the data of the field position along with a ti