CN-122001388-A - Adaptive Internet of Things data compression method and system based on edge computing
Abstract
The invention relates to the field of data transmission, and in particular to an adaptive Internet of Things (IoT) data compression method and system based on edge computing. The method comprises: acquiring a historical data set and a data set to be compressed; calculating the correlation degree between a target dimension and a comparison dimension; segmenting the data set to be compressed and clustering all segments; calculating the data anomaly probability of the target dimension in a target cluster; calculating the adaptive compression deviation of the target dimension in a target segment; calculating the upper gate slope and the lower gate slope of a target point in the target dimension based on the swinging door trending (revolving door) algorithm; calculating the effective upper gate slope of the target point in the target dimension and, in the same way, the effective lower gate slope; obtaining a compression start point of the target dimension from the effective upper and lower gate slopes and, in the same way, a compression end point; and compressing the data of the target dimension. This technical scheme can reduce the resources consumed by compression and improve compression efficiency.
Inventors
- CHEN XIN
- HU ZHI
- LIU JIE
- YAN JIE
Assignees
- 嘉杰科技有限公司
Dates
- Publication Date
- 20260508
- Application Date
- 20260409
Claims (8)
- 1. An adaptive Internet of Things data compression method based on edge computing, characterized by comprising the following steps: acquiring a standardized multi-dimensional historical data set and a standardized multi-dimensional data set to be compressed; taking any dimension as a target dimension and any dimension other than the target dimension as a comparison dimension, and calculating the correlation degree between the target dimension and the comparison dimension; dividing the multi-dimensional data set to be compressed into a plurality of segments, and clustering all the segments to obtain a plurality of clusters; taking any cluster as a target cluster, and calculating the data anomaly probability of the target dimension in the target cluster; taking any segment in the target cluster as a target segment, and calculating an adaptive compression deviation of the target dimension in the target segment based on the data anomaly probability of the target dimension and a preset compression deviation; taking any sampling time in the target segment as a target point, and calculating an upper gate slope and a lower gate slope of the target point in the target dimension based on the swinging door trending (revolving door) algorithm; calculating an effective upper gate slope of the target point in the target dimension from the upper gate slope and the correlation degree, and obtaining an effective lower gate slope of the target point in the target dimension in the same way; and obtaining a compression start point of the target dimension based on the effective upper gate slope and the effective lower gate slope, obtaining a compression end point in the same way, and compressing the data of the target dimension.
- 2. The adaptive Internet of Things data compression method based on edge computing according to claim 1, wherein calculating the correlation degree between the target dimension and the comparison dimension comprises: acquiring the data sequence of the standardized multi-dimensional historical data set in the target dimension as a first sequence; acquiring the data sequence of the standardized multi-dimensional historical data set in the comparison dimension as a second sequence; and taking the absolute value of the cross-correlation coefficient of the first sequence and the second sequence as the correlation degree between the target dimension and the comparison dimension.
- 3. The adaptive Internet of Things data compression method based on edge computing according to claim 1, wherein dividing the multi-dimensional data set to be compressed into a plurality of segments and clustering all the segments to obtain a plurality of clusters comprises: randomly generating a plurality of candidate segmentation schemes for the standardized multi-dimensional data set to be compressed, each candidate segmentation scheme yielding a plurality of initial segments; clustering the initial segments of each candidate segmentation scheme to obtain a clustering scheme, so that each candidate segmentation scheme corresponds to one clustering scheme; calculating the CH index of each clustering scheme, taking the candidate segmentation scheme corresponding to the maximum CH index as the optimal segmentation scheme, and obtaining the plurality of segments from the optimal segmentation scheme; and taking the clustering scheme corresponding to the optimal segmentation scheme as the optimal clustering scheme, and obtaining the plurality of clusters from the optimal clustering scheme.
- 4. The adaptive Internet of Things data compression method based on edge computing according to claim 1, wherein calculating the data anomaly probability comprises: acquiring the DTW distance between the data sequence, in the target dimension, of any segment in the target cluster and the data sequence of the target dimension in the historical data set, taking the DTW distance as the trend similarity of that segment in the target dimension, and obtaining in the same way the trend similarity of that segment in each comparison dimension; for each comparison dimension, calculating a first product of its correlation degree and its trend similarity, and summing the first products over all dimensions other than the target dimension to obtain a first product sum; calculating the sum of the trend similarity of the target dimension and the first product sum as a first sum, and obtaining in the same way the first sum of every segment in the target cluster; and calculating the mean of all the first sums, and taking the negative exponential of that mean (the exponential of the negated mean) as the data anomaly probability of the target dimension in the target cluster.
- 5. The adaptive Internet of Things data compression method based on edge computing according to claim 1, wherein calculating the adaptive compression deviation comprises: calculating a first standard deviation of the correlation degrees between the target dimension and each dimension other than the target dimension, and normalizing the first standard deviation to obtain the information criticality of the target dimension; calculating a second standard deviation of all data of the target segment in the target dimension; dividing the historical data set into a plurality of historical segments according to the length of the target segment, obtaining the second standard deviation of each historical segment in the same way, and calculating the mean of the second standard deviations of all historical segments; calculating a first difference between the second standard deviation of the target segment and the mean second standard deviation, obtaining the exponential of the first difference, calculating a second difference between 1 and the information criticality of the target dimension, and calculating a second product of the exponential of the first difference and the second difference; obtaining the data anomaly probability of the target dimension in the target cluster, obtaining in the same way the data anomaly probability of the target dimension in each cluster, calculating the mean data anomaly probability of the target dimension over all clusters, calculating a first ratio of the mean data anomaly probability to the data anomaly probability of the target dimension in the target cluster, and calculating a third product of the information criticality of the target dimension and the first ratio; and calculating a second sum of the second product and the third product, and taking the product of the second sum and the preset compression deviation as the adaptive compression deviation of the target dimension in the target segment.
- 6. The adaptive Internet of Things data compression method based on edge computing according to claim 1, wherein calculating the adaptive compression deviation comprises: acquiring the data anomaly probability of the target dimension in the target cluster, obtaining in the same way the data anomaly probability of the target dimension in each cluster, calculating the mean data anomaly probability of the target dimension over all clusters, and calculating a first ratio of the mean data anomaly probability to the data anomaly probability of the target dimension in the target cluster; acquiring the difference sequence of the data sequence of the target segment in the target dimension, and calculating a first mean of the absolute values of the elements of the difference sequence as the trend change rate of the target segment in the target dimension; obtaining the position of the target segment in the data set to be compressed, taking the segment immediately preceding the target segment in the data set to be compressed as an adjacent segment, obtaining the trend change rate of the adjacent segment in the target dimension in the same way, and calculating a second ratio of the trend change rate of the adjacent segment in the target dimension to the trend change rate of the target segment in the target dimension; and calculating a second mean of the first ratio and the second ratio, and taking the product of the second mean and the preset compression deviation as the adaptive compression deviation of the target dimension in the target segment.
- 7. The adaptive Internet of Things data compression method based on edge computing according to claim 1, wherein calculating the effective upper gate slope comprises: taking the normalized correlation degree between the target dimension and the comparison dimension as a slope weight; calculating a fourth product of the slope weight and the upper gate slope of the target point in the comparison dimension; obtaining in the same way the fourth product of the target point for each dimension other than the target dimension, and summing the fourth products to obtain a fourth product sum; and taking the sum of the fourth product sum and the upper gate slope of the target point in the target dimension as the effective upper gate slope.
- 8. An adaptive Internet of Things data compression system based on edge computing, characterized by comprising a processor and a memory, wherein the memory stores computer program instructions which, when executed by the processor, implement the adaptive Internet of Things data compression method based on edge computing according to any one of claims 1-7.
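The correlation degree of claim 2 and the effective upper gate slope of claim 7 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: it assumes the "cross-correlation coefficient" is the zero-lag Pearson coefficient, and that "normalized" means dividing each correlation degree by the sum over all comparison dimensions; the claims specify neither. The function names `correlation_degrees` and `effective_slope` are hypothetical.

```python
import numpy as np


def correlation_degrees(history: np.ndarray, target: int) -> np.ndarray:
    """Absolute correlation between the target column and every column.

    history has shape (samples, dimensions). Assumption: zero-lag Pearson
    coefficient; a constant column would yield NaN and is not handled here.
    """
    corr = np.corrcoef(history, rowvar=False)  # (dimensions, dimensions)
    return np.abs(corr[target])


def effective_slope(slopes: np.ndarray, degrees: np.ndarray, target: int) -> float:
    """Target-dimension gate slope plus correlation-weighted slopes of the
    comparison dimensions (claim 7).

    Assumption: slope weights are the correlation degrees of the comparison
    dimensions divided by their sum.
    """
    others = np.arange(len(slopes)) != target
    total = degrees[others].sum()
    if total > 0:
        weights = degrees[others] / total
    else:
        weights = np.zeros(int(others.sum()))
    return float(slopes[target] + np.dot(weights, slopes[others]))
```

Because the claims obtain the effective lower gate slope "in the same way", the same `effective_slope` call could be applied to the lower gate slopes of each dimension.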
Description
Adaptive Internet of Things data compression method and system based on edge computing
Technical Field
The present invention relates to the field of data transmission, and in particular to an adaptive Internet of Things (IoT) data compression method and system based on edge computing.
Background
With the rapid development of IoT technology, massive numbers of sensors and intelligent terminal devices are continuously connected to the network, generating high-frequency packet data characterized by strong temporal correlation, high redundancy, abundant noise and outliers, strong non-stationarity, and low value density. In a traditional cloud-centric architecture, such data must be uploaded to the cloud in full for processing and storage, which consumes excessive network bandwidth, significantly increases end-to-end transmission delay, and introduces privacy-disclosure and security risks during transmission. Existing fixed-parameter compression methods adapt poorly to the dynamic, non-stationary character of IoT data and struggle to balance compression efficiency, real-time response, and data fidelity, which restricts the large-scale deployment of edge computing in high-concurrency, low-latency IoT applications.
The prior Chinese patent application published as CN114900191A discloses an improved swinging door trending (revolving door) algorithm for compressing differential protection data. It comprises: initializing the parameters of the improved algorithm; executing an abnormal-point recording strategy, in which a point judged to be abnormal is stored directly; for a point that is not abnormal, judging whether the data meets the compression condition and, if so, compressing the differential protection data and executing a dynamic threshold adjustment strategy; executing an adaptive variable-frequency data storage strategy on all data at abnormal points; and finally collecting the next data point and judging whether compression of all data is complete. However, that method adapts the compression deviation of the algorithm only through a compression-ratio threshold or compression-ratio feedback. It therefore ignores both the criticality of the information carried by different data and the differences in data state transitions between consecutive data segments, cannot adjust compression promptly according to the content of each data segment, and yields compression results of low precision and efficiency.
Disclosure of Invention
In order to solve the above technical problems, the present invention provides the following aspects.
The invention provides an adaptive Internet of Things data compression method based on edge computing, which comprises: obtaining a standardized multi-dimensional historical data set and a standardized multi-dimensional data set to be compressed; taking any dimension as a target dimension and any dimension other than the target dimension as a comparison dimension, and calculating the correlation degree between the target dimension and the comparison dimension; dividing the multi-dimensional data set to be compressed into a plurality of segments, and clustering all the segments to obtain a plurality of clusters; taking any cluster as a target cluster, and calculating the data anomaly probability of the target dimension in the target cluster; taking any segment in the target cluster as a target segment, and calculating the adaptive compression deviation of the target dimension in the target segment based on the data anomaly probability of the target dimension and a preset compression deviation; taking any sampling time in the target segment as a target point, and calculating the upper gate slope and the lower gate slope of the target point in the target dimension based on the swinging door trending (revolving door) algorithm; calculating the effective upper gate slope of the target point in the target dimension from the upper gate slope and the correlation degree, and obtaining the effective lower gate slope of the target point in the target dimension in the same way; and obtaining a compression start point of the target dimension based on the effective upper gate slope and the effective lower gate slope, obtaining a compression end point in the same way, and compressing the data of the target dimension.
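For orientation, the classic single-dimension swinging door test that the method builds on can be sketched as follows. This is the textbook algorithm with a fixed deviation, not the patented variant: in the method described above, the deviation would be the adaptive compression deviation of each segment, and the gate slopes would be replaced by their effective, correlation-weighted counterparts. The function name `swinging_door` and its parameters are hypothetical, and timestamps are assumed to be strictly increasing.

```python
from typing import List, Sequence


def swinging_door(times: Sequence[float], values: Sequence[float],
                  deviation: float) -> List[int]:
    """Return the indices of the points retained by the classic swinging door test."""
    n = len(values)
    if n <= 2:
        return list(range(n))

    kept = [0]                      # the first point always opens a segment
    start = 0                       # index of the current segment's start point
    max_upper = float("-inf")       # widest opening of the upper gate so far
    min_lower = float("inf")        # widest opening of the lower gate so far

    i = 1
    while i < n:
        dt = times[i] - times[start]
        # Slopes from the two pivots (start value +/- deviation) to point i.
        upper = (values[i] - (values[start] + deviation)) / dt
        lower = (values[i] - (values[start] - deviation)) / dt
        max_upper = max(max_upper, upper)
        min_lower = min(min_lower, lower)

        if max_upper > min_lower:
            # The gates have opened past parallel: close the segment at the
            # previous point and re-test point i against the new start.
            start = i - 1
            kept.append(start)
            max_upper, min_lower = float("-inf"), float("inf")
            continue
        i += 1

    kept.append(n - 1)              # the last point always closes a segment
    return kept
```

A larger `deviation` keeps fewer points at the cost of reconstruction accuracy, which is why adapting it per segment, as the method proposes, trades fidelity against compression ratio locally rather than globally.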
Preferably, calculating the correlation degree between the target dimension and the comparison dimension comprises: obtaining the data sequence of the standardized multi-dimensional historical data set in the target dimension as a first sequence; and obtaining the data sequence of the standardized multi-dimensional historical data set in the comparison