
CN-118034971-B - Method and device for recovering container state, computer equipment and storage medium


Abstract

The application relates to a method and a device for recovering a container state, computer equipment, a storage medium and a software robot. The method comprises: detecting, in real time, the container state of a target container running in a service environment and the resource usage of the service environment; when the container state is abnormal, accessing a self-healing environment configured in advance in the service environment and performing state self-recovery on the target container according to the resource usage; starting a standby container corresponding to the target container in the self-healing environment by using state data of the target container, and transferring the traffic of the target container to the standby container for processing; and continuously detecting the container state of the target container and, when the container state returns to normal, re-registering the target container in the service environment and distributing the traffic to the target container for processing. With this method, a faulty container can recover intelligently and automatically while traffic continues to be served without interruption, which reduces the labor cost of operation and maintenance and improves the efficiency of container state recovery.
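The failover and fail-back sequence summarized in the abstract can be sketched as a single monitoring step. This is a minimal illustration under stated assumptions, not the patented implementation: `State`, `Router`, `recovery_step` and the `-standby` naming are hypothetical, and starting a real container is replaced by producing a name.

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    NORMAL = "normal"
    ABNORMAL = "abnormal"


@dataclass
class Router:
    """Hypothetical traffic router: maps a service name to the container serving it."""
    routes: dict = field(default_factory=dict)

    def route(self, service, container):
        self.routes[service] = container


def recovery_step(service, target, state, router, standby=None):
    """Run one monitoring tick and return the current standby name (or None).

    - abnormal and no standby yet: start a standby and move traffic to it
    - normal and a standby exists: re-register the target and move traffic back
    """
    if state is State.ABNORMAL and standby is None:
        standby = f"{target}-standby"   # stands in for starting a real container
        router.route(service, standby)
    elif state is State.NORMAL and standby is not None:
        router.route(service, target)   # re-register the recovered target
        standby = None                  # the standby is no longer needed
    return standby


# Usage: one abnormal tick followed by one normal tick.
router = Router({"orders": "orders-1"})
standby = recovery_step("orders", "orders-1", State.ABNORMAL, router)
standby = recovery_step("orders", "orders-1", State.NORMAL, router, standby)
```

Keeping the step free of side effects other than the routing change makes it easy to see that traffic always has exactly one destination, which is the property the abstract relies on for uninterrupted service.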

Inventors

  • LI YIMING
  • WANG XIMIN
  • LI ZONGSHI
  • LAN KAIMEI
  • TAN KE

Assignees

  • China Life Insurance Company Limited (中国人寿保险股份有限公司)

Dates

Publication Date
20260508
Application Date
20240103

Claims (8)

  1. A method of recovering a container state, the method comprising:
     detecting, in real time, the container state of a target container running in the service environment of a terminal and the resource usage of the service environment;
     when the container state is abnormal, accessing a self-healing environment pre-configured by a server in the service environment, determining a fault cause of the target container according to the resource usage, wherein the fault cause comprises any one or more of insufficient host resources, insufficient container instances, network circuit breaking and rate limiting, or abnormal service requirements, and performing state self-recovery on the target container according to a self-healing strategy that corresponds to the fault cause and is stored in the self-healing environment;
     creating image information corresponding to the target container by using the state data of the target container, starting a standby container corresponding to the target container in the self-healing environment according to the image information, allocating container resources in the self-healing environment to the standby container, and transferring the traffic of the target container to the standby container for processing;
     continuously detecting the container state of the target container and, when the container state returns to normal, re-registering the target container in the service environment, distributing the traffic to the target container for processing, and reclaiming the container resources in the self-healing environment;
     detecting the load states of host resources in the service environment and the self-healing environment;
     when a host resource whose load state is idle exists in the service environment, configuring the corresponding idle host resource into the self-healing environment; and
     when a host resource whose load state is overloaded exists in the self-healing environment, separating the corresponding overloaded host resource from the self-healing environment.
  2. The method of claim 1, wherein transferring the traffic of the target container to the standby container for processing comprises: dividing the traffic of the target container by using a preset service mesh policy to obtain a division result of the traffic; and transferring the traffic to the standby container to instruct the standby container to allocate, according to the division result, corresponding container resources to the traffic for processing.
  3. The method according to any one of claims 1 to 2, further comprising: when the container state is abnormal, in response to a triggering operation on a node switch, restarting the processing node corresponding to the node switch in the service environment so as to perform state self-recovery on the target container.
  4. A device for recovering a container state, the device comprising:
     a monitoring module, configured to detect, in real time, the container state of a target container running in the service environment of a terminal and the resource usage of the service environment;
     a self-healing module, configured to: access, when the container state is abnormal, a self-healing environment pre-configured by a server in the service environment; determine a fault cause of the target container according to the resource usage, wherein the fault cause comprises any one or more of insufficient host resources, insufficient container instances, network circuit breaking and rate limiting, or abnormal service requirements; perform state self-recovery on the target container according to a self-healing strategy that corresponds to the fault cause and is stored in the self-healing environment; create image information corresponding to the target container by using state data of the target container; start a standby container corresponding to the target container in the self-healing environment according to the image information; allocate container resources in the self-healing environment to the standby container; and transfer the traffic of the target container to the standby container for processing;
     a recovery module, configured to continuously detect the container state of the target container and, when the container state returns to normal, re-register the target container in the service environment, distribute the traffic to the target container for processing, and reclaim the container resources in the self-healing environment; and
     a dynamic scaling module, configured to detect the load states of host resources in the service environment and the self-healing environment, configure the corresponding idle host resource into the self-healing environment when a host resource whose load state is idle exists in the service environment, and separate the corresponding overloaded host resource from the self-healing environment when a host resource whose load state is overloaded exists in the self-healing environment.
  5. The device of claim 4, wherein the self-healing module is further configured to divide the traffic of the target container by using a preset service mesh policy to obtain a division result of the traffic, and to transfer the traffic to the standby container to instruct the standby container to allocate, according to the division result, corresponding container resources to the traffic for processing.
  6. The device according to claim 4 or 5, characterized in that the device further comprises: an intervening self-healing module, configured to restart, when the container state is abnormal and in response to a triggering operation on a node switch, the processing node corresponding to the node switch in the service environment so as to perform state self-recovery on the target container.
  7. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any one of claims 1 to 3 when executing the computer program.
  8. A computer readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the steps of the method of any one of claims 1 to 3.
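The host-resource adjustment recited at the end of claim 1 (and performed by the dynamic module of claim 4) can be illustrated with a short sketch. It rests on several assumptions that the claims do not state: load is a fraction between 0 and 1, the thresholds `IDLE_THRESHOLD` and `OVERLOAD_THRESHOLD` are invented for illustration, and the self-healing environment is modeled as a set of host names drawn from the service environment. The function name `rebalance` is hypothetical.

```python
IDLE_THRESHOLD = 0.20      # assumed: load below this counts as idle
OVERLOAD_THRESHOLD = 0.85  # assumed: load above this counts as overloaded


def rebalance(loads, healing_pool):
    """Return the updated set of hosts assigned to the self-healing environment.

    loads:        host name -> current load in [0, 1] for every monitored host
    healing_pool: host names currently configured in the self-healing environment
    """
    pool = set(healing_pool)
    for host, load in loads.items():
        if host in pool and load > OVERLOAD_THRESHOLD:
            # Overloaded host inside the self-healing environment: separate it.
            pool.discard(host)
        elif host not in pool and load < IDLE_THRESHOLD:
            # Idle host in the service environment: configure it for self-healing.
            pool.add(host)
    return pool


# Usage: h1 is idle and joins the pool, h3 is overloaded and leaves it.
loads = {"h1": 0.10, "h2": 0.50, "h3": 0.95, "h4": 0.40}
new_pool = rebalance(loads, {"h3", "h4"})
```

Returning a new set instead of mutating the input keeps each monitoring pass easy to compare against the previous one, which may help when auditing why a host was moved.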

Description

Method and device for recovering container state, computer equipment and storage medium

Technical Field

The present application relates to the field of intelligent operation and maintenance technologies, and in particular, to a method and apparatus for recovering a container state, a computer device, a storage medium, and a software robot.

Background

With the advancement of digital transformation across industries, resources are increasingly pooled and applications are increasingly moved to the cloud. The conventional IT operation and maintenance mode cannot keep up with the large-scale growth in operation and maintenance work brought by digital transformation. AIOps (intelligent operation and maintenance) technology has therefore emerged, supported by new technologies such as big data, cloud computing and artificial intelligence. AIOps, short for Artificial Intelligence for IT Operations, combines the capabilities of artificial intelligence with operation and maintenance, and improves operation and maintenance efficiency by means of machine learning.

At present, an enterprise's operation and maintenance deployment team can use AIOps technology to achieve semi-manual container state detection and operation and maintenance: micro-services are deployed to the cloud through AIOps technology, and the alarm state of the micro-services is detected in real time. Operation and maintenance personnel then judge the health state of the micro-service container manually based on the alarm state, and faults are handled by manual intervention. However, because the number of micro-services is large, emergency handling of sudden failures still incurs high labor costs.

Disclosure of Invention

Based on the foregoing, it is necessary to provide an intelligent container state recovery method, device, computer equipment, computer readable storage medium and software robot that reduce the labor cost of operation and maintenance.

In a first aspect, the present application provides a method for recovering a container state, the method comprising:
detecting, in real time, the container state of a target container running in a service environment and the resource usage of the service environment;
when the container state is abnormal, accessing a self-healing environment configured in advance in the service environment, and performing state self-recovery on the target container according to the resource usage;
starting a standby container corresponding to the target container in the self-healing environment by using the state data of the target container, and transferring the traffic of the target container to the standby container for processing; and
continuously detecting the container state of the target container and, when the container state returns to normal, re-registering the target container in the service environment and distributing the traffic to the target container for processing.

In one embodiment, accessing, when the container state is abnormal, the self-healing environment configured in advance in the service environment and performing state self-recovery on the target container according to the resource usage comprises:
when the container state is abnormal, determining a fault cause of the target container according to the resource usage, wherein the fault cause comprises any one or more of insufficient host resources, insufficient container instances, network circuit breaking and rate limiting, or abnormal service requirements; and
performing state self-recovery on the target container according to a self-healing strategy that corresponds to the fault cause and is stored in the self-healing environment.
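The embodiment above maps resource usage to one or more fault causes and then looks up a stored strategy for each cause. A minimal sketch of that two-step lookup follows. The four fault causes come from the text; everything else is an assumption made for illustration, including the metric names (`host_cpu`, `ready_replicas`, and so on), the thresholds, the strategy descriptions, and the function names `diagnose` and `select_strategies`.

```python
from enum import Enum


class FaultCause(Enum):
    INSUFFICIENT_HOST_RESOURCES = "insufficient host resources"
    INSUFFICIENT_CONTAINER_INSTANCES = "insufficient container instances"
    CIRCUIT_BREAK_OR_RATE_LIMIT = "network circuit breaking and rate limiting"
    ABNORMAL_SERVICE_REQUIREMENTS = "abnormal service requirements"


# Hypothetical strategy table. The source only states that a strategy for
# each fault cause is stored in the self-healing environment.
STRATEGIES = {
    FaultCause.INSUFFICIENT_HOST_RESOURCES: "reschedule onto a host with free capacity",
    FaultCause.INSUFFICIENT_CONTAINER_INSTANCES: "start additional container instances",
    FaultCause.CIRCUIT_BREAK_OR_RATE_LIMIT: "reset the breaker and revisit the limit",
    FaultCause.ABNORMAL_SERVICE_REQUIREMENTS: "throttle the abnormal demand",
}


def diagnose(usage):
    """Return the set of fault causes suggested by a resource-usage snapshot.

    Metric names and thresholds are illustrative assumptions.
    """
    causes = set()
    if max(usage.get("host_cpu", 0.0), usage.get("host_mem", 0.0)) > 0.90:
        causes.add(FaultCause.INSUFFICIENT_HOST_RESOURCES)
    if usage.get("ready_replicas", 0) < usage.get("desired_replicas", 0):
        causes.add(FaultCause.INSUFFICIENT_CONTAINER_INSTANCES)
    if usage.get("throttled_requests", 0) > 0:
        causes.add(FaultCause.CIRCUIT_BREAK_OR_RATE_LIMIT)
    # With no baseline supplied, the comparison below is never true.
    baseline = usage.get("baseline_request_rate", float("inf"))
    if usage.get("request_rate", 0.0) > 3 * baseline:
        causes.add(FaultCause.ABNORMAL_SERVICE_REQUIREMENTS)
    return causes


def select_strategies(usage):
    """Look up the stored strategy for every diagnosed fault cause."""
    return {cause: STRATEGIES[cause] for cause in diagnose(usage)}


# Usage: a hot host with too few ready replicas yields two causes.
snapshot = {"host_cpu": 0.95, "ready_replicas": 1, "desired_replicas": 3}
plan = select_strategies(snapshot)
```

Because the text allows any one or more causes at once, `diagnose` returns a set instead of a single value, and the strategy lookup is applied to each member.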

In one embodiment, starting a standby container corresponding to the target container in the self-healing environment by using the state data of the target container, and transferring the traffic of the target container to the standby container for processing, comprises:
creating image information corresponding to the target container by using the state data of the target container;
starting the standby container in the self-healing environment according to the image information, and allocating container resources in the self-healing environment to the standby container;
dividing the traffic of the target container by using a preset service mesh policy to obtain a division result of the traffic; and
transferring the traffic to the standby container to instruct the standby container to allocate, according to the division result, corresponding container resources to the traffic for processing.

In one embodiment, the method further comprises:
detecting the load states of host resources in the service environment and the self-healing environment;
when a host resource whose load state is idle exists in the service environment, configuring the corresponding idle host resource into the self-healing environment; and when