CN-121705035-B - Container elastic scaling method based on asymmetric risk perception and dual-channel spatio-temporal feature fusion


Abstract

The invention relates to the technical field of cloud computing container orchestration and artificial-intelligence operations and maintenance, and in particular to a container elastic scaling method based on asymmetric risk perception and dual-channel spatio-temporal feature fusion. The method comprises the following steps: S1, constructing a real-time load data stream enhanced with multi-order differences; S2, constructing a load prediction model with dual-channel spatio-temporal feature fusion; S3, training the model with an asymmetrically weighted loss function; S4, constructing a prediction-aware elastic scaling decision system; and S5, performing proactive resource scheduling with anti-jitter design. The method overcomes the symmetry assumption of conventional prediction models, significantly reduces the risk of SLA violations caused by resource under-provisioning, builds a proactive defense against business risk at the algorithm level, and meets the stringent stability requirements of enterprise production environments. It captures features comprehensively, generalizes well, and can eliminate the cold-start lag to achieve zero-delay response.

Inventors

GAO XIAO
MEI MENG
XU ZHONGWEI
HUANG XINLIN
PEI XILONG

Assignees

Tongji University (同济大学)

Dates

Publication Date: 20260508
Application Date: 20260212

Claims (10)

1. A container elastic scaling method based on asymmetric risk perception and dual-channel spatio-temporal feature fusion, characterized by comprising the following steps:
S1, constructing a real-time load data stream based on multi-order difference enhancement: collecting historical load time-series data of a container cluster, introducing a time-series feature engineering method, computing a first-order difference representing the rate of load change and a second-order difference representing the acceleration of that change, and cleaning missing values with an adaptive hierarchical interpolation strategy, so as to construct a time-series dataset that reflects not only the current load magnitude but also the trend and intensity of load change;
S2, constructing a load prediction model with dual-channel spatio-temporal feature fusion: constructing a dual-channel parallel deep learning model comprising a local channel for local abrupt-change feature extraction and a global channel for global long-range temporal memory, wherein the local channel extracts high-frequency traffic spike features using a feature-enhanced temporal convolutional network combined with dilated causal convolution;
S3, training the model based on an asymmetric weighted loss function: dividing the processed dataset into a training set and a test set, and introducing a risk penalty coefficient during training to construct the asymmetric loss function;
S4, constructing a prediction-aware elastic scaling decision system: monitoring in real time the end-to-end time a container takes from scheduling to service readiness, updating a dynamic startup-latency profile with an exponentially weighted moving average algorithm, computing the target number of container replicas with a multi-metric parallel evaluation strategy in combination with the load prediction output in step S2, and determining the early trigger time of the scale-out operation according to the startup-latency profile, so that proactive resource warm-up completes before the traffic peak arrives;
S5, executing proactive resource scheduling and anti-jitter processing: when the target number of replicas exceeds the number of currently running replicas, calling the cluster API to modify the replicas field and performing seamless scale-out in combination with a readiness probe mechanism; and introducing an asymmetric cooldown window mechanism, in which a scale-in decision threshold and a cooldown lock period after scale-in execution are set, so that the hysteresis in the time dimension filters transient load jitter and ensures high availability of the system.
2. The container elastic scaling method based on asymmetric risk perception and dual-channel spatio-temporal feature fusion according to claim 1, wherein step S1 specifically comprises:
S11, multi-dimensional monitoring metric collection: deploying monitoring components in the Kubernetes cluster and configuring a high-frequency data acquisition period (in seconds); the raw metric vector $x_t$ acquired at time $t$ comprises: basic resource metrics, namely CPU utilization and memory usage; service performance metrics, namely requests per second (QPS) and P90 response latency; and a network metric, namely the network I/O rate;
S12, missing-value filling and outlier cleaning: short gaps are filled by linear interpolation to repair transient packet loss; long gaps are filled by holding the previous value, and a confidence flag is attached to the filled data to avoid misleading the model; values falling outside the outlier interval defined around $\mu$ in terms of $\sigma$ are treated as monitoring noise and smoothed by replacement with the sliding-window median, where $\mu$ is the arithmetic mean of the load data in the current sliding window and $\sigma$ is the standard deviation of the data in that window;
S13, feature derivation and normalization: time-series difference features are introduced by computing the first-order difference $\Delta x_t$ and the second-order difference $\Delta^2 x_t$ of the observation $x_t$: $\Delta x_t = x_t - x_{t-1}$, $\Delta^2 x_t = \Delta x_t - \Delta x_{t-1}$; for the starting moment in particular, since no history data exist, the differences are set to fixed initial values;
S14, feature vector concatenation and expansion: the raw collected metric set $x_t$ has dimension 5; the first-order and second-order differences of each metric are computed to obtain the velocity vector $v_t = \Delta x_t$ and the acceleration vector $a_t = \Delta^2 x_t$, and the extended feature vector at time $t$ is constructed by vector concatenation: $\tilde{x}_t = [x_t;\ v_t;\ a_t]$; the feature dimension of the input data is thereby expanded from 5 to 15; the expanded feature matrix is normalized independently per column, the $j$-th feature dimension ($j \in [1, 15]$) being mapped to the interval [0, 1] by Max-Min normalization: $x'_{t,j} = (\tilde{x}_{t,j} - \min_j) / (\max_j - \min_j)$, where $\min_j$ and $\max_j$ are the minimum and maximum of column $j$; all normalized elements are recombined to form the model input feature matrix $X \in \mathbb{R}^{T \times 15}$;
S15, time-series tensor construction: the model input is constructed by a sliding-window method with window size $W$ and prediction horizon $H$; based on the feature matrix $X \in \mathbb{R}^{T \times 15}$ constructed in step S14, where $T$ is the total number of time steps, for any sampling instant $t$ at which a full window is available, the consecutive feature segment of length $W$ is taken to construct the local feature matrix $X_t = [X_{t-W+1};\ \dots;\ X_t]$, where $X_i$ denotes the feature vector in row $i$ of the matrix $X$; sampling slides along the time axis with a step of 1, and the resulting local feature matrices are stacked along the batch dimension, so that the final time-series tensor is expressed as $\mathcal{X} \in \mathbb{R}^{B \times W \times 15}$, where $B$ is the batch size.
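The feature pipeline of steps S13 to S15 can be illustrated with a minimal pure-Python sketch. The function names are hypothetical, and two points are assumptions rather than statements of the claim: the differences at the first time step are taken as zero, and a constant column is normalized to 0.

```python
def add_difference_features(series):
    """Append first- and second-order differences to each row (T x D -> T x 3D)."""
    rows = []
    prev_x = prev_v = None
    for x in series:
        # Velocity: x_t - x_{t-1}. ASSUMPTION: zero at the first time step.
        v = [0.0] * len(x) if prev_x is None else [a - b for a, b in zip(x, prev_x)]
        # Acceleration: v_t - v_{t-1}. ASSUMPTION: zero at the first time step.
        a = [0.0] * len(x) if prev_v is None else [p - q for p, q in zip(v, prev_v)]
        rows.append(list(x) + v + a)
        prev_x, prev_v = x, v
    return rows


def min_max_normalize(matrix):
    """Scale every column independently into [0, 1] (Max-Min normalization)."""
    cols = list(zip(*matrix))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        # ASSUMPTION: a constant column (max == min) maps to 0.0.
        [(val - lo) / (hi - lo) if hi > lo else 0.0
         for val, lo, hi in zip(row, lows, highs)]
        for row in matrix
    ]


def sliding_windows(matrix, window):
    """Cut consecutive segments of `window` rows with a stride of 1."""
    return [matrix[i:i + window] for i in range(len(matrix) - window + 1)]


# One metric observed at four time steps, for illustration only.
raw = [[1.0], [3.0], [6.0], [10.0]]
features = add_difference_features(raw)      # value, velocity, acceleration
normalized = min_max_normalize(features)
windows = sliding_windows(normalized, window=2)
```

With five raw metrics per time step instead of one, the same functions yield the 15-dimensional rows described in S14.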
3. The container elastic scaling method based on asymmetric risk perception and dual-channel spatio-temporal feature fusion according to claim 1, wherein in step S2 the load prediction model with dual-channel spatio-temporal feature fusion comprises a local abrupt-change feature extraction module, a global long-range temporal memory module and a dynamic gated fusion module; the input of the load prediction model is the high-dimensional time-series tensor generated in step S1; the local abrupt-change feature extraction module captures high-frequency fluctuation features of the load data using a feature-enhanced temporal convolutional network (FE-TCN); the global long-range temporal memory module captures the periodic patterns of the load data using a self-attention gated recurrent unit (SA-GRU); and the dynamic gated fusion module performs adaptive weighted fusion of the two feature streams and finally outputs the load prediction for a future moment.
4. The container elastic scaling method based on asymmetric risk perception and dual-channel spatio-temporal feature fusion according to claim 3, wherein the local abrupt-change feature extraction module is intended to extract, from the input time-series tensor, local features reflecting millisecond-level traffic spikes and transient fluctuations, and achieves adaptive screening of key load metrics by introducing a feature channel attention mechanism into the residual block;
S211, dilated causal convolution processing: 4 residual blocks are stacked, the dilation factor growing exponentially from layer to layer; for layer $l$ with input sequence $x^{(l)}$, the convolution output $y^{(l)}_t$ at time step $t$ is computed as: $y^{(l)}_t = \sum_{i=0}^{k-1} W^{(l)}_i \, x^{(l)}_{t - d_l \cdot i} + b^{(l)}$, where $k$ is the convolution kernel size; $i$ is the position index within the kernel ($i \in \{0, \dots, k-1\}$), used to traverse the convolution window; $d_l$ is the dilation factor of layer $l$; $W^{(l)}_i$ is the learnable weight matrix at position $i$ of layer $l$; and $b^{(l)}$ is a bias term;
S212, feature channel attention screening: to achieve adaptive filtering of features, an attention module is embedded after the convolution output $y^{(l)}$:
Global information compression: the features are compressed along the time dimension by global average pooling, mapping the convolution output $y^{(l)}$ to a channel descriptor $z^{(l)}$, whose $c$-th channel is computed as: $z^{(l)}_c = \frac{1}{T} \sum_{t=1}^{T} y^{(l)}_{c,t}$, where $T$ is the length of the time dimension of the current layer's feature map; $C$ is the total number of feature channels, that is, the number of convolution kernels; $c$ is the channel index ($c \in \{1, \dots, C\}$); and $y^{(l)}_{c,t}$ is the feature value of channel $c$ of layer $l$ at time step $t$;
Weight generation: the nonlinear dependencies between channels are captured by two fully connected layers, generating a normalized weight vector $s^{(l)}$: $s^{(l)} = \sigma\big(W_2 \, \delta(W_1 z^{(l)})\big)$, where $\delta$ is the ReLU activation function and $\sigma$ is the Sigmoid function; $W_1 \in \mathbb{R}^{(C/r) \times C}$ is the dimension-reduction weight matrix, $r$ is the reduction ratio, and $W_2 \in \mathbb{R}^{C \times (C/r)}$ is the dimension-expansion weight matrix;
Feature weight calibration: the generated weight vector $s^{(l)}$ is treated as a feature selector and multiplied channel by channel with the convolution output $y^{(l)}$ to obtain the weighted features $\tilde{y}^{(l)}$: $\tilde{y}^{(l)}_c = s^{(l)}_c \cdot y^{(l)}_c$;
Residual connection: to prevent network degradation and speed up convergence, the weighted features $\tilde{y}^{(l)}$ are added element-wise to the original input $x^{(l)}$ of the residual block of that layer, and the final result is output through an activation function: $x^{(l+1)} = \mathrm{Activation}\big(\tilde{y}^{(l)} + x^{(l)}\big)$;
after the stacked residual blocks have been processed, the output of the last layer at the current time step is taken and denoted as the local abrupt-change feature vector $h_{local}$, which serves as one of the inputs to the subsequent dual-channel fusion module.
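The two building blocks of this claim, dilated causal convolution and channel attention, can be sketched in pure Python as follows. This is an illustration under stated assumptions, not the patented network: the convolution is shown for a single channel with zero padding before the start of the sequence (an assumption), the channel gate follows the squeeze, excite and rescale pattern described in S212, and all function names and weights are hypothetical.

```python
import math


def dilated_causal_conv(x, weights, dilation, bias=0.0):
    """y[t] = sum_i weights[i] * x[t - dilation * i] + bias (single channel)."""
    out = []
    for t in range(len(x)):
        acc = bias
        for i, w in enumerate(weights):
            src = t - dilation * i
            # ASSUMPTION: positions before the sequence start are zero-padded.
            if src >= 0:
                acc += w * x[src]
        out.append(acc)
    return out


def channel_attention(features, w_reduce, w_expand):
    """Reweight channels of a C x T feature map with learned gates in (0, 1)."""
    # Global average pooling over time gives one descriptor per channel.
    z = [sum(ch) / len(ch) for ch in features]
    # First fully connected layer (dimension reduction) followed by ReLU.
    hidden = [max(0.0, sum(w * v for w, v in zip(row, z))) for row in w_reduce]
    # Second fully connected layer (dimension expansion) followed by Sigmoid.
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w_expand]
    # Channel-wise rescaling of the original features.
    return [[g * v for v in ch] for g, ch in zip(gates, features)]


conv_out = dilated_causal_conv([1.0, 2.0, 3.0, 4.0], weights=[1.0, 1.0], dilation=2)
attended = channel_attention(
    [[1.0, 1.0], [3.0, 3.0]],
    w_reduce=[[1.0, 0.0]],      # 2 channels reduced to 1 hidden unit
    w_expand=[[1.0], [0.0]],    # expanded back to 2 channel gates
)
```

Because each output only reads positions at or before `t`, changing a later input never alters an earlier output, which is the causality property the prediction model relies on.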
5. The container elastic scaling method based on asymmetric risk perception and dual-channel spatio-temporal feature fusion according to claim 4, wherein the global long-range temporal memory module is intended to capture the macroscopic, day-scale tidal pattern of the load data and to solve the long-sequence dependency problem; the module works in parallel with the S21 module and processes the same input time-series tensor;
S221, gated recurrent unit processing: a gated recurrent unit (GRU) is used as the backbone; for time step $t$, the input is $x_t$ and the hidden state of the previous moment is $h_{t-1}$, and the computation proceeds as follows:
the reset gate $r_t$ controls how much historical information is discarded, with weight matrix $W_r$: $r_t = \sigma(W_r [h_{t-1}, x_t])$;
the update gate $z_t$ controls how much historical information is retained, with weight matrix $W_z$: $z_t = \sigma(W_z [h_{t-1}, x_t])$;
hidden state update, with weight matrix $W_h$: based on the gating results, the candidate hidden state $\tilde{h}_t$ and the final hidden state $h_t$ are computed: $\tilde{h}_t = \tanh(W_h [r_t \odot h_{t-1}, x_t])$, $h_t = z_t \odot h_{t-1} + (1 - z_t) \odot \tilde{h}_t$;
a hidden state sequence containing long-range dependency information is obtained through this recursive computation;
S222, masked multi-head self-attention enhancement: a multi-head self-attention mechanism is applied to the GRU output sequence, directly establishing the dependency between the current moment and distant historical moments: $\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{Q K^{\top}}{\sqrt{d_k}} + M\right) V$, where $Q$, $K$ and $V$ denote the query, key and value vectors respectively, generated by linear mapping of the hidden state sequence output by the GRU; $\sqrt{d_k}$ is the scaling factor, which adjusts the numerical range of the dot products to prevent vanishing gradients; and $M$ is an upper triangular mask matrix;
the final output is the global feature vector $h_{global}$ containing long-range dependency information, which serves as the second input to the subsequent dual-channel fusion module.
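The masking step in S222 can be illustrated with a single-head sketch in pure Python. As simplifying assumptions, the query, key and value vectors are taken to be the hidden states themselves (the learned linear mappings are omitted), and future positions are excluded by restricting the softmax to positions at or before the current one, which has the same effect as adding an upper triangular mask of negative infinity. The function name is hypothetical.

```python
import math


def causal_self_attention(hidden_states):
    """Single-head scaled dot-product attention with a causal (upper triangular) mask."""
    n = len(hidden_states)
    d = len(hidden_states[0])
    scale = math.sqrt(d)
    outputs, weights = [], []
    for i in range(n):
        # Scores only for positions j <= i; future positions are masked out.
        scores = [
            sum(a * b for a, b in zip(hidden_states[i], hidden_states[j])) / scale
            for j in range(i + 1)
        ]
        # Numerically stable softmax over the visible positions.
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        row = [e / total for e in exps] + [0.0] * (n - i - 1)
        weights.append(row)
        outputs.append(
            [sum(row[j] * hidden_states[j][k] for j in range(n)) for k in range(d)]
        )
    return outputs, weights


states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended, attn = causal_self_attention(states)
```

The first position can only attend to itself, so its output equals its own hidden state; later positions mix in progressively more history.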
6. The container elastic scaling method based on asymmetric risk perception and dual-channel spatio-temporal feature fusion according to claim 5, wherein the dynamic gated fusion module receives the local abrupt-change feature vector $h_{local}$ from S21 and the global feature vector $h_{global}$ from S22, dynamically adjusts the contribution ratio of the two feature streams with an adaptive gating mechanism, and finally generates the load prediction;
S231, gating coefficient generation: the two feature vectors are first concatenated along the channel dimension to construct the joint feature vector $h_{cat} = [h_{local};\ h_{global}]$; the joint feature is then mapped to a scalar gating coefficient $g$ by a multi-layer perceptron, which includes a nonlinear transformation to enhance expressiveness: $g = \sigma\big(W_2 \, \mathrm{ReLU}(W_1 h_{cat} + b_1) + b_2\big)$, where $\mathrm{ReLU}$ is the ReLU activation function; $\sigma$ is the Sigmoid activation function, which ensures the output coefficient $g \in (0, 1)$; $W_1$ and $W_2$ are learnable weight matrices; and $b_1$, $b_2$ are bias terms;
S232, adaptive feature fusion: using the generated gating coefficient $g$ as the adjusting factor, soft-switching weighted fusion is applied to the two feature streams, and the fused feature vector $h_{fused}$ is computed as: $h_{fused} = g \cdot h_{local} + (1 - g) \cdot h_{global}$;
S233, load prediction output: finally, the fused feature vector $h_{fused}$ is fed into a linear regression layer, which maps the high-dimensional feature space back to the target load space and outputs the prediction $\hat{y}$ for a future moment: $\hat{y} = W_o h_{fused} + b_o$, where $W_o$ and $b_o$ are the weight and bias of the output layer; the prediction $\hat{y}$ contains the predicted values of key metrics including CPU utilization, memory usage and QPS, and is used for the subsequent elastic scaling decisions.
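A minimal sketch of the gated fusion in S231 and S232 follows, in pure Python. The function name, the layer sizes and the all-zero weights in the usage example are hypothetical; which of the two streams is multiplied by the gate rather than by its complement is a labeling assumption, since the gate is learned either way.

```python
import math


def gated_fusion(h_local, h_global, w1, b1, w2, b2):
    """Fuse two equal-length feature vectors with a learned scalar gate g in (0, 1)."""
    joint = list(h_local) + list(h_global)          # channel-wise concatenation
    # Hidden layer of the perceptron with ReLU activation.
    hidden = [max(0.0, sum(w * x for w, x in zip(row, joint)) + b)
              for row, b in zip(w1, b1)]
    # Output layer with Sigmoid, producing the scalar gating coefficient.
    logit = sum(w * h for w, h in zip(w2, hidden)) + b2
    gate = 1.0 / (1.0 + math.exp(-logit))
    # ASSUMPTION: the gate weights the local stream, its complement the global one.
    fused = [gate * a + (1.0 - gate) * b for a, b in zip(h_local, h_global)]
    return gate, fused


# With all-zero weights the gate is exactly 0.5 and the result is a plain average.
gate, fused = gated_fusion(
    h_local=[2.0, 4.0],
    h_global=[0.0, 0.0],
    w1=[[0.0, 0.0, 0.0, 0.0]], b1=[0.0],
    w2=[0.0], b2=0.0,
)
```

A gate near 1 lets the spike-sensitive local channel dominate, while a gate near 0 defers to the long-range channel; training decides the balance per input.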
7. The container elastic scaling method based on asymmetric risk perception and dual-channel spatio-temporal feature fusion according to claim 1, wherein in step S3 the asymmetric loss function is defined over a training batch in terms of the following quantities: $B$ denotes the size of the training batch; $y_i$ denotes the actual load observation; $\hat{y}_i$ denotes the predicted output of the model; $\mathbb{I}(\cdot)$ is the indicator function, which takes the value 1 when the condition in brackets is satisfied and 0 otherwise; and $\lambda$ is the risk penalty coefficient.
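The exact formula of the loss does not survive in this text, so the following is only one plausible form consistent with the quantities listed in the claim: a squared error averaged over the batch, in which samples where the model under-predicts the load (the case that leads to under-provisioning) are multiplied by the risk penalty coefficient. The squared-error base, the direction of the indicator condition and the function name are all assumptions.

```python
def asymmetric_loss(actual, predicted, risk_penalty):
    """Batch-mean squared error with under-predictions weighted by `risk_penalty`.

    ASSUMED form (not taken from the patent text):
        L = (1/B) * sum_i [ penalty * I(y_i > yhat_i) + I(y_i <= yhat_i) ] * (y_i - yhat_i)^2
    """
    total = 0.0
    for y, y_hat in zip(actual, predicted):
        # Indicator: 1 when the true load exceeds the prediction (under-prediction).
        weight = risk_penalty if y > y_hat else 1.0
        total += weight * (y - y_hat) ** 2
    return total / len(actual)


# Same absolute error of 2 in both samples, but the under-prediction costs 3x more.
loss = asymmetric_loss(actual=[10.0, 10.0], predicted=[8.0, 12.0], risk_penalty=3.0)
symmetric = asymmetric_loss(actual=[10.0, 10.0], predicted=[8.0, 12.0], risk_penalty=1.0)
```

With a penalty of 1 the function reduces to ordinary mean squared error; a penalty above 1 pushes the trained model toward predictions that err on the high side, which is the safety margin the method aims for.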
8. The container elastic scaling method based on asymmetric risk perception and dual-channel spatio-temporal feature fusion according to claim 1, wherein in step S3 fast fitting and global optimization of the complex non-stationary objective function are achieved by jointly configuring the optimization algorithm, the learning rate scheduling mechanism and the early-stopping criterion, specifically:
parameter optimization: an Adam optimizer is used to update the parameters, with the first-moment exponential decay rate $\beta_1$ and the second-moment exponential decay rate $\beta_2$ set so as to accommodate sparse gradients and non-stationary objective functions;
learning rate scheduling: a cosine annealing strategy is introduced to escape local optima, with initial learning rate $\eta_0$ and minimum learning rate $\eta_{\min}$; the learning rate $\eta_e$ varies with the current epoch $e$ as: $\eta_e = \eta_{\min} + \frac{1}{2}(\eta_0 - \eta_{\min})\left(1 + \cos\frac{\pi e}{E}\right)$, where $E$ is the annealing period in epochs;
early stopping: the asymmetric loss on the validation set is monitored, and if the validation loss does not decrease for 10 consecutive epochs, training is terminated early and the current best weights are saved, so as to prevent overfitting.
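The schedule and stopping rule of this claim can be sketched without any deep learning framework. The learning rate values and the class and function names below are hypothetical, since the concrete values do not survive in this text; only the patience of 10 epochs comes from the claim.

```python
import math


def cosine_annealing_lr(epoch, period, lr_initial, lr_min):
    """Cosine decay from lr_initial (epoch 0) to lr_min (epoch == period)."""
    return lr_min + 0.5 * (lr_initial - lr_min) * (1.0 + math.cos(math.pi * epoch / period))


class EarlyStopper:
    """Signal a stop once the validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=10):
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        if val_loss < self.best:
            # New best: this is where the current weights would be saved.
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Hypothetical rates for illustration only.
schedule = [cosine_annealing_lr(e, period=100, lr_initial=1e-3, lr_min=1e-5)
            for e in (0, 50, 100)]

stopper = EarlyStopper(patience=3)
decisions = [stopper.step(v) for v in [1.0, 0.9, 0.95, 0.92, 0.91]]
```

In the usage example the stopper is given a patience of 3 to keep the trace short; the claim's setting corresponds to `EarlyStopper(patience=10)`.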
9. The container elastic scaling method based on asymmetric risk perception and dual-channel spatio-temporal feature fusion according to claim 1, wherein in step S4:
the target replica count is calculated as follows: after the load prediction is obtained, three metrics directly related to the physical resource quota of the container, namely CPU utilization, memory usage and QPS, are extracted from the prediction as reference dimensions; for each load dimension $k$, the load value predicted by the model for the future moment is combined with the current resource utilization and the corresponding target threshold to calculate the required replica count $N_k$ for that dimension; to offset the uncertainty of the model prediction, a safety redundancy coefficient is introduced to build a deterministic resource buffer pool; the calculation uses the number of replicas currently running and a round-up (ceiling) function; the final target replica count $N_{target}$ is the maximum over all dimensions, namely: $N_{target} = \max_k N_k$;
full-link latency compensation decision: to achieve proactive resource warm-up and minimize response delay, the system must calculate an advance trigger time window $T_{lead}$ that offsets the physical time consumed by resource startup: $T_{lead} = \hat{T}_{start} + T_{buf}$, where $\hat{T}_{start}$ is the Pod cold-start time; since this value is not fixed, the actual time each Pod takes from scheduling to the ready state is monitored continuously by the open-source Prometheus monitoring and alerting system and updated dynamically with an exponentially weighted moving average algorithm: $\hat{T}_{start}(t) = \beta \, T_{obs}(t) + (1 - \beta) \, \hat{T}_{start}(t-1)$, where $\hat{T}_{start}(t)$ is the startup latency estimate updated at the current moment $t$; $T_{obs}(t)$ is the physical time from scheduling to readiness actually observed for a Pod at the current moment; $\hat{T}_{start}(t-1)$ is the startup latency estimate of the previous moment $t-1$; $\beta$ is the smoothing factor, which balances the weight of the historical trend against the current observation so as to adapt to fluctuations in image pull speed or changes in node load; and $T_{buf}$ is the system safety buffer time, which covers the Kubernetes API scheduling delay and the synchronization delay of service registration and discovery;
based on this time window, the system selects, from the future load prediction sequence output in step S3, the prediction point whose time span is closest to $T_{lead}$ as the basis for the decision; if the predicted load exceeds the current capacity, a scale-out instruction is triggered immediately at the current moment $t$.
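The decision logic of this claim can be sketched in pure Python. The per-dimension replica formula does not survive in this text, so the form used here, the current replica count scaled by the ratio of predicted load to target threshold and by one plus the safety coefficient, then rounded up, is an assumption modeled on common autoscaler practice. All function names, metric names and numeric values are hypothetical.

```python
import math


def target_replicas(current, predicted, thresholds, safety=0.1):
    """Per-metric replica needs and their maximum.

    ASSUMED form: ceil(current * predicted / threshold * (1 + safety)).
    """
    per_metric = {
        name: math.ceil(current * predicted[name] / thresholds[name] * (1.0 + safety))
        for name in predicted
    }
    return max(per_metric.values()), per_metric


def update_startup_estimate(previous, observed, smoothing):
    """Exponentially weighted moving average of the Pod startup latency."""
    return smoothing * observed + (1.0 - smoothing) * previous


def pick_forecast_index(num_points, step_seconds, lead_seconds):
    """Index of the forecast point whose horizon is closest to the lead time."""
    return min(range(num_points),
               key=lambda i: abs((i + 1) * step_seconds - lead_seconds))


# Hypothetical per-replica predictions and target thresholds.
needed, detail = target_replicas(
    current=4,
    predicted={"cpu": 0.9, "memory": 0.5, "qps": 300.0},
    thresholds={"cpu": 0.6, "memory": 0.8, "qps": 250.0},
)
startup = update_startup_estimate(previous=10.0, observed=20.0, smoothing=0.5)
lead_time = startup + 5.0                      # plus a 5 s safety buffer
index = pick_forecast_index(num_points=4, step_seconds=15.0, lead_seconds=lead_time)
```

Taking the maximum across metrics means the most constrained resource drives the decision, and the lead time ensures the forecast consulted is the one for the moment new Pods would actually become ready.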
10. The container elastic scaling method based on asymmetric risk perception and dual-channel spatio-temporal feature fusion according to claim 1, wherein in step S5:
S51, seamless scale-out execution: the system monitors in real time the final target replica count $N_{target}$ output in step S41; when $N_{target}$ is greater than the number of currently running replicas, the system immediately triggers a scale-out instruction; to prevent resource exhaustion caused by extreme prediction deviation, a cluster maximum replica limit $N_{max}$ is set, and the target replica count actually issued, $N_{final}$, is the smaller of the calculated value and the upper limit, namely: $N_{final} = \min(N_{target}, N_{max})$;
S52, hysteretic scale-in cooldown strategy: to prevent Pods from being created and destroyed frequently when the load fluctuates slightly around a threshold, an asymmetric cooldown window mechanism is introduced; for scale-out operations, the cooldown time is set to 0 seconds, that is, the scale-out instruction has the highest priority and is executed immediately once a risk of insufficient resources is detected, so as to guarantee high service availability; for scale-in operations, the scale-in decision threshold is set to 30 percent, and a scale-in instruction is triggered only when the future load value predicted by the model stays continuously below this threshold; after a scale-in instruction is executed, the system enters a locked state of 300 seconds, during which new scale-in instructions are prohibited but scale-out instructions are still allowed to preempt and execute.
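The asymmetric cooldown behavior of S51 and S52 can be sketched as a small state machine. The 30 percent threshold and the 300 second lock come from the claim; the class name, the method signature and the number of consecutive low samples required before scale-in are hypothetical, the last being an assumption about how "continuously lower" is measured.

```python
class AsymmetricScaler:
    """Scale out immediately; scale in only after sustained low load and outside the lock."""

    def __init__(self, max_replicas, scale_in_threshold=0.30,
                 lock_seconds=300, low_samples_required=3):
        self.max_replicas = max_replicas
        self.scale_in_threshold = scale_in_threshold
        self.lock_seconds = lock_seconds
        # ASSUMPTION: "continuously lower" means this many consecutive samples.
        self.low_samples_required = low_samples_required
        self.low_streak = 0
        self.locked_until = float("-inf")

    def decide(self, now, current, target, predicted_load):
        """Return the replica count to apply at time `now` (seconds)."""
        if target > current:
            # Scale-out has zero cooldown and preempts any scale-in lock.
            self.low_streak = 0
            return min(target, self.max_replicas)
        # Track how long the predicted load has stayed below the threshold.
        if predicted_load < self.scale_in_threshold:
            self.low_streak += 1
        else:
            self.low_streak = 0
        if (target < current
                and self.low_streak >= self.low_samples_required
                and now >= self.locked_until):
            # Execute scale-in and start the lock period.
            self.locked_until = now + self.lock_seconds
            self.low_streak = 0
            return target
        return current


scaler = AsymmetricScaler(max_replicas=10, low_samples_required=2)
trace = [
    scaler.decide(0, 4, 20, 0.9),     # scale-out, capped at the maximum
    scaler.decide(10, 10, 6, 0.2),    # first low sample, no action yet
    scaler.decide(20, 10, 6, 0.2),    # second low sample, scale-in executes
    scaler.decide(30, 6, 4, 0.1),     # low again, streak restarts
    scaler.decide(40, 6, 4, 0.1),     # blocked by the 300 s lock
    scaler.decide(50, 6, 8, 0.8),     # scale-out preempts the lock
]
```

The asymmetry is deliberate: a delayed scale-out risks an SLA violation, whereas a delayed scale-in only costs some idle capacity.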

Description

Container elastic scaling method based on asymmetric risk perception and dual-channel spatio-temporal feature fusion

Technical Field

The invention relates to the technical field of cloud computing container orchestration and artificial-intelligence operations and maintenance, and in particular to a container elastic scaling method based on asymmetric risk perception and dual-channel spatio-temporal feature fusion.

Background

With the spread of cloud-native technology, Kubernetes has become the de facto standard for container orchestration. In scenarios with extremely demanding real-time requirements, such as major e-commerce sales promotions, high-frequency financial trading and the industrial internet, application load typically shows a periodic pattern overlaid with high-frequency noise and sharp transient fluctuations. Elastic scaling is a core capability of cloud-native systems; its goal is to minimize resource cost while guaranteeing Service Level Agreements (SLAs). However, existing container elastic scaling techniques still have the following three shortcomings:

First, the scaling strategy lags. The mainstream Kubernetes HPA (Horizontal Pod Autoscaler) uses a threshold-based reactive strategy: scale-out is triggered only after resource utilization exceeds a preset value. From monitoring data collection through aggregate computation to the trigger decision there is usually a minute-level delay, so this passive response mechanism cannot cope with millisecond-level traffic bursts and can easily cause a service avalanche.

Second, prediction models rest on a flawed symmetry assumption.
Existing proactive scaling schemes mostly use models such as long short-term memory networks (LSTM) and the autoregressive integrated moving average model (ARIMA), and training is usually based on symmetric loss functions such as mean squared error (MSE). This implicitly assumes that over-prediction and under-prediction cost the same. In production, however, an SLA violation caused by resource under-provisioning costs far more than the waste caused by over-provisioning. The prior art lacks algorithm-level modeling of this asymmetric risk, so the model cannot output predictions with a safety margin, the system easily runs short of resources under load fluctuation, and enterprise-grade high-availability requirements are hard to meet.

Third, dynamic compensation for the full-link startup latency is missing. Real traffic often mixes long-period tidal patterns with high-frequency transient noise, and a single model struggles to capture both kinds of spatio-temporal features accurately. Furthermore, existing approaches often predict only when the load will rise and ignore the full-link startup latency a container needs to go from the issuing of the scheduling instruction to the ready state. If this dynamically changing startup time is not taken into account, then even with accurate load prediction the resources cannot be ready before the traffic peak arrives: scale-out completes too late, resources and demand are mismatched, and service requests time out.
Therefore, how to provide an effective container elastic scaling method that solves the problems of response lag, symmetric treatment of asymmetric risk, single-feature capture and resource mismatch caused by cold-start delay in existing work has become a technical problem to be solved by those skilled in the art.

Disclosure of Invention

In view of the above drawbacks of the prior art, the present invention aims to provide a container elastic scaling method and system based on asymmetric risk perception and dual-channel spatio-temporal feature fusion.

First, the invention constructs a real-time load data stream based on multi-order difference enhancement. Unlike conventional monitoring, which collects only the value at the current moment, the invention introduces a time-series feature engineering method: it computes the first-order difference, representing the rate of load change, and the second-order difference, representing the acceleration of that change, and cleans the data with an adaptive hierarchical interpolation strategy. The result is a time-series dataset that reflects not only the current load magnitude but also, with high sensitivity, the trend and intensity of load change.

Second, the invention constructs a load prediction model with dual-channel spatio-temporal feature fusion. The model adopts a dual-channel parallel feature extraction architecture, wherein a local abrupt-change channel is used for extract