CN-121984747-A - Cloud-native resource dynamic scheduling security isolation method based on cloud computing
Abstract
The invention discloses a cloud-native resource dynamic scheduling security isolation method based on cloud computing, relating to the technical field of industrial cloud computing. The method comprises: collecting multi-source heterogeneous runtime feature data in real time and preprocessing it to generate a joint state characterization vector; inputting the joint state characterization vector into a dual-objective reinforcement learning model and performing action inference and policy mapping to generate a collaborative scheduling decision; injecting standardized probe traffic into a target service based on an isolation execution status report and calculating an SLA-security balance index to generate a quantitative evaluation result of service quality and isolation effectiveness; and incrementally optimizing the dual-objective reinforcement learning model according to that evaluation result to generate an updated decision model parameter set. The invention achieves a dynamic balance between resource efficiency and defense-in-depth capability in an industrial cloud computing environment.
Inventors
- CAO YU
- WU YUEZHOU
- LI WEN
- SI GANGJUN
- HU HAIGANG
Assignees
- 北京数智星通科技有限公司
Dates
- Publication Date
- 20260505
- Application Date
- 20260130
Claims (10)
- 1. A cloud-native resource dynamic scheduling security isolation method based on cloud computing, characterized by comprising the following steps: collecting multi-source heterogeneous runtime feature data in real time, preprocessing the feature data, and generating a joint state characterization vector; inputting the joint state characterization vector into a dual-objective reinforcement learning model, and performing action inference and policy mapping to generate a collaborative scheduling decision; synchronously performing target node scheduling and isolation policy deployment according to the collaborative scheduling decision, synchronously updating isolation rules, and generating an isolation execution status report; injecting standardized probe traffic into a target service based on the isolation execution status report, calculating an SLA-security balance index, and generating a quantitative evaluation result of service quality and isolation effectiveness; and performing incremental optimization on the dual-objective reinforcement learning model according to the quantitative evaluation result of service quality and isolation effectiveness to generate an updated decision model parameter set.
- 2. The cloud-native resource dynamic scheduling security isolation method based on cloud computing of claim 1, wherein the multi-source heterogeneous runtime feature data comprises container resource dynamic indicators, service security attribute indicators, node physical resource utilization, and security configuration compliance status.
- 3. The cloud-native resource dynamic scheduling security isolation method based on cloud computing of claim 2, wherein generating the joint state characterization vector comprises the following steps: performing feature extraction, feature construction, and feature normalization on the multi-source heterogeneous runtime feature data to generate a standardized service portrait vector and a standardized node portrait vector; and concatenating the standardized service portrait vector and the standardized node portrait vector into the joint state characterization vector.
- 4. The cloud-native resource dynamic scheduling security isolation method based on cloud computing of claim 3, wherein the dual-objective reinforcement learning model is trained by the following steps: extracting a historical joint state characterization vector set, a historical scheduling action set, and a historical dual-objective return label set from historical collaborative scheduling decision samples to obtain a training data set; inputting the historical joint state characterization vectors into a policy network component to obtain predicted action probability distributions, and inputting the historical joint state characterization vectors and the historical scheduling actions into a value network component to obtain predicted resource utilization return estimates and predicted security risk return estimates, thereby obtaining a dual-objective prediction output set; constructing a resource utilization objective loss term and a security risk objective loss term from the dual-objective prediction output set and the historical dual-objective return label set, and computing their weighted sum to obtain a joint optimization objective function; and iteratively updating the network parameters of the policy network component and the value network component based on the joint optimization objective function to obtain the dual-objective reinforcement learning model.
- 5. The cloud-native resource dynamic scheduling security isolation method based on cloud computing of claim 1, wherein generating the collaborative scheduling decision comprises the following steps: performing forward propagation through the dual-objective reinforcement learning model based on the joint state characterization vector to obtain an action probability distribution; selecting, by the dual-objective reinforcement learning model, the maximum-probability action according to the action probability distribution to obtain candidate scheduling actions; calculating a resource utilization return estimate and a security risk return estimate based on the candidate scheduling actions and the joint state characterization vector, and obtaining a comprehensive return estimate through return fusion; and taking the candidate scheduling action with the highest comprehensive return estimate as the final scheduling action, mapping it to a target node identifier and an isolation policy type according to a mapping rule, and generating the collaborative scheduling decision.
- 6. The cloud-native resource dynamic scheduling security isolation method based on cloud computing of claim 5, wherein synchronously performing target node scheduling and isolation policy deployment according to the collaborative scheduling decision comprises the following steps: issuing, according to the target node identifier in the collaborative scheduling decision, a service instance scheduling instruction to the target computing node to obtain a target node scheduling execution result; and issuing, according to the isolation policy type in the collaborative scheduling decision, an isolation policy configuration instruction to the target computing node to obtain an isolation policy deployment execution result.
- 7. The cloud-native resource dynamic scheduling security isolation method based on cloud computing of claim 6, wherein generating the isolation execution status report comprises the following steps: writing the network micro-isolation rule set and the call filtering rule set contained in the isolation policy configuration instruction into a distributed isolation rule library to obtain updated isolation rules; and structurally encapsulating the target node scheduling execution result, the isolation policy deployment execution result, and the updated isolation rules to generate the isolation execution status report.
- 8. The cloud-native resource dynamic scheduling security isolation method based on cloud computing of claim 7, wherein injecting standardized probe traffic into the target service based on the isolation execution status report comprises the following steps: extracting a target service identifier, an isolation rule effective timestamp, and a rule version identifier from the isolation execution status report; and injecting standardized probe traffic into the target service based on the target service identifier and the isolation rule effective timestamp to obtain a service response data set.
- 9. The cloud-native resource dynamic scheduling security isolation method based on cloud computing of claim 8, wherein generating the quantitative evaluation result of service quality and isolation effectiveness comprises the following steps: calculating the request success rate, the average response latency, and the throughput from the service response data set to generate a service quality indicator set; comparing, based on the isolation rule effective timestamp and the rule version identifier, the external network connection behavior statistics and the sensitive call frequency statistics before and after the standardized probe traffic is injected to obtain an isolation effectiveness indicator set; linearly normalizing and aggregating the indicators in the service quality indicator set and in the isolation effectiveness indicator set, respectively, to obtain a service quality score and an isolation effectiveness score; and linearly combining the service quality score and the isolation effectiveness score to obtain the SLA-security balance index, and generating the quantitative evaluation result of service quality and isolation effectiveness.
- 10. The cloud-native resource dynamic scheduling security isolation method based on cloud computing of claim 9, wherein generating the updated decision model parameter set comprises the following steps: extracting the SLA-security balance index from the quantitative evaluation result of service quality and isolation effectiveness; generating an incremental optimization trigger signal when the SLA-security balance index is below a performance threshold; when the incremental optimization trigger signal is true, storing the current joint state characterization vector, the candidate scheduling action, the resource utilization return estimate, and the security risk return estimate as an experience sample in an experience replay buffer to obtain an augmented experience data set; performing gradient updates on the policy network parameters and value network parameters of the dual-objective reinforcement learning model based on the augmented experience data set while applying an update magnitude constraint, to obtain an intermediate updated parameter set; and performing a parameter soft update between the intermediate updated parameter set and the current parameter set of the dual-objective reinforcement learning model to generate the updated decision model parameter set.
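The evaluation and update steps recited in claims 9 and 10 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the use of a plain mean for aggregation, the equal default weighting, and the soft-update coefficient `tau` are all hypothetical, since the claims only specify "linear normalization and aggregation", a "linear combination", a "performance threshold", and a "parameter soft update". The gradient step and the update magnitude constraint are omitted.

```python
from typing import List


def linear_normalize(value: float, low: float, high: float,
                     higher_is_better: bool = True) -> float:
    """Map a raw indicator linearly onto [0, 1], clipping to [low, high].

    When smaller raw values are better (e.g. latency), the score is inverted.
    """
    if high <= low:
        raise ValueError("high must be greater than low")
    clipped = min(max(value, low), high)
    score = (clipped - low) / (high - low)
    return score if higher_is_better else 1.0 - score


def aggregate(scores: List[float]) -> float:
    """Aggregate normalized indicators into one score (plain mean: an assumption)."""
    return sum(scores) / len(scores)


def balance_index(qos_score: float, isolation_score: float,
                  qos_weight: float = 0.5) -> float:
    """Linear combination of the two scores; the weight is a hypothetical parameter."""
    return qos_weight * qos_score + (1.0 - qos_weight) * isolation_score


def update_triggered(index: float, performance_threshold: float) -> bool:
    """Incremental optimization is triggered when the index falls below the threshold."""
    return index < performance_threshold


def soft_update(current: List[float], intermediate: List[float],
                tau: float = 0.1) -> List[float]:
    """Blend intermediate parameters into the current ones element by element.

    tau = 0 keeps the current parameters; tau = 1 adopts the intermediate ones.
    """
    if len(current) != len(intermediate):
        raise ValueError("parameter sets must have the same length")
    return [(1.0 - tau) * c + tau * i for c, i in zip(current, intermediate)]
```

In use, the success rate, latency, and throughput would each pass through `linear_normalize` before `aggregate` produces the service quality score; the isolation effectiveness score would be built the same way from the before/after comparison statistics. A small `tau` keeps each incremental update close to the currently deployed parameters.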
Description
Cloud-native resource dynamic scheduling security isolation method based on cloud computing
Technical Field
The invention relates to the technical field of industrial cloud computing, and in particular to a cloud-native resource dynamic scheduling security isolation method based on cloud computing.
Background
Cloud-native resource dynamic scheduling and security isolation based on cloud computing plays a vital role in today's industrial cloud computing. As microservice architectures and containerized deployment are applied in depth to key scenarios such as intelligent manufacturing, energy monitoring, and the industrial Internet of Things, cloud-native applications must meet strict security isolation requirements while maintaining high resource utilization. Current industrial cloud computing environments commonly adopt a Kubernetes-based scheduling framework combined with network policies (such as Calico and Cilium) to implement resource allocation and basic isolation, using declarative configuration to control service deployment topology and Layer 3 network micro-isolation; this has become standard practice in large-scale industrial applications.
Traditional cloud-native resource dynamic scheduling security isolation methods generally decouple resource scheduling decisions from the execution of security policy configuration, so the isolation policy takes effect only after the scheduling action has completed, and real-time risk blocking during service migration is difficult to guarantee. In addition, existing methods rely on static thresholds or single-objective optimization to generate scheduling decisions and do not include the runtime security attributes of services (such as sensitive interface exposure and authentication strength) or node compliance status in a joint state representation. As a result, high-risk services may be scheduled onto weakly protected nodes, which weakens defense-in-depth capability.
Disclosure of Invention
The present invention has been made in view of the above problems in the prior art. Accordingly, the invention provides a cloud-native resource dynamic scheduling security isolation method based on cloud computing, which solves the problems that security isolation lags behind resource scheduling and that scheduling decisions lack runtime security awareness.
To solve the above technical problems, the invention provides the following technical solution: a cloud-native resource dynamic scheduling security isolation method based on cloud computing, comprising: collecting multi-source heterogeneous runtime feature data in real time and preprocessing it to generate a joint state characterization vector; inputting the joint state characterization vector into a dual-objective reinforcement learning model and performing action inference and policy mapping to generate a collaborative scheduling decision; synchronously performing target node scheduling and isolation policy deployment according to the collaborative scheduling decision, and synchronously updating isolation rules to generate an isolation execution status report; injecting standardized probe traffic into a target service based on the isolation execution status report and calculating an SLA-security balance index to generate a quantitative evaluation result of service quality and isolation effectiveness; and performing incremental optimization on the dual-objective reinforcement learning model according to the quantitative evaluation result of service quality and isolation effectiveness to generate an updated decision model parameter set.
As a preferred scheme of the cloud-native resource dynamic scheduling security isolation method based on cloud computing, the multi-source heterogeneous runtime feature data comprises container resource dynamic indicators, service security attribute indicators, node physical resource utilization, and security configuration compliance status.
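As an illustration of how the four feature categories above could be turned into a joint state characterization vector (the step recited in claim 3), the following Python sketch applies min-max normalization and concatenates a service portrait vector with a node portrait vector. The feature names, value ranges, and function names (`min_max`, `portrait`, `joint_state`) are hypothetical; the text does not specify concrete features or a normalization formula.

```python
from typing import Dict, List, Tuple

Bounds = Dict[str, Tuple[float, float]]

# Hypothetical (min, max) ranges per feature; illustrative values only.
SERVICE_BOUNDS: Bounds = {
    "cpu_millicores": (0.0, 4000.0),         # container resource dynamic indicator
    "memory_mib": (0.0, 8192.0),             # container resource dynamic indicator
    "exposed_sensitive_apis": (0.0, 10.0),   # service security attribute indicator
    "auth_strength": (0.0, 3.0),             # service security attribute indicator
}
NODE_BOUNDS: Bounds = {
    "cpu_utilization": (0.0, 1.0),           # node physical resource utilization
    "memory_utilization": (0.0, 1.0),        # node physical resource utilization
    "compliance_score": (0.0, 1.0),          # security configuration compliance status
}


def min_max(value: float, low: float, high: float) -> float:
    """Clip value to [low, high] and scale it linearly onto [0, 1]."""
    if high <= low:
        raise ValueError("high must be greater than low")
    clipped = min(max(value, low), high)
    return (clipped - low) / (high - low)


def portrait(raw: Dict[str, float], bounds: Bounds) -> List[float]:
    """Build a normalized portrait vector in the fixed key order of bounds."""
    return [min_max(raw[name], low, high) for name, (low, high) in bounds.items()]


def joint_state(service_raw: Dict[str, float],
                node_raw: Dict[str, float]) -> List[float]:
    """Concatenate the service portrait and node portrait into one state vector."""
    return portrait(service_raw, SERVICE_BOUNDS) + portrait(node_raw, NODE_BOUNDS)
```

Fixing the feature order through the bounds dictionaries keeps every state vector aligned position by position, which a downstream model needs when it consumes vectors from different services and nodes.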
As a preferred scheme of the cloud-native resource dynamic scheduling security isolation method based on cloud computing, generating the joint state characterization vector comprises the following steps: performing feature extraction, feature construction, and feature normalization on the multi-source heterogeneous runtime feature data to generate a standardized service portrait vector and a standardized node portrait vector; and concatenating the standardized service portrait vector and the standardized node portrait vector into the joint state characterization vector.
As a preferred scheme of the cloud-native resource dynamic scheduling security isolation method based on cloud computing, the training steps of the dual-objective reinforcement learning model are as follows: Splicing the standardized service portrait vector and the standardized node portrait