EP-4742032-A1 - SYSTEM AND COMPUTER-IMPLEMENTED METHOD FOR FAILOVER OF NETWORK CONNECTIONS OF A CONTAINER INSTANCE HAVING AN ASSIGNED CRITICALITY


Abstract

Computer-implemented method for failover of network interfaces provided to a container instance (113) with an assigned criticality in a managed container runtime environment (112) by a guest computer (110, 120), comprising the steps: - Configuring (S1) at least one network interface bundle by means of a central management facility (100) for a group of managed runtime environments (112), comprising a) Assigning (S11) at least two physical network interfaces (117a, 117b, 117c) to each of the network interface bundles (117), and b) Setting up (S12) at least one traffic class that assigns to the different network interfaces (117a, 117b, 117c) of the network interface bundle (117) a utilization level of the transmission bandwidth of the network interface (117a, 117b, 117c) under a given failure condition, in the runtime environment (112) in which the container instance (113) is executed, - Assigning (S2) a traffic class to the container instance (113) that varies depending on the criticality of the container instance (113), - dynamic provisioning (S3) of the network interface (117a, 117b, 117c) of the network interface bundle (117) for use by the container instance (113) which is determined by the assigned traffic class and a currently determined failure condition.

Inventors

Knierim, Christian

Assignees

Siemens Aktiengesellschaft

Dates

Publication Date: 20260513
Application Date: 20241111

Claims (15)

Computer-implemented method for failover of network interfaces provided to a container instance (113) with an assigned criticality in a managed container runtime environment (112) by a guest computer (110, 120), comprising the steps: - Configuring (S1) at least one network interface bundle by means of a central management facility (100) for a group of managed runtime environments (112), comprising a) Assigning (S11) at least two physical network interfaces (117a, 117b, 117c) to each of the network interface bundles (117), and b) Setting up (S12) at least one traffic class that assigns to the different network interfaces (117a, 117b, 117c) of the network interface bundle (117) a utilization level of the transmission bandwidth of the network interface (117a, 117b, 117c) under a given failure condition, in the runtime environment (112) in which the container instance (113) is executed, - Assigning (S2) a traffic class to the container instance (113) that varies depending on the criticality of the container instance (113), - dynamic provisioning (S3) of the network interface (117a, 117b, 117c) of the network interface bundle (117) for use by the container instance (113) which is determined by the assigned traffic class and a currently determined failure condition.
Computer-implemented method according to claim 1, wherein for all managed runtime environments (112) from the group of managed runtime environments similar aliases are defined for the respective physical network interfaces (117a, 117b, 117c) and the aliases of the network interfaces (117a, 117b, 117c) are used for the assignment to a network interface bundle (117).
A computer-implemented method according to any of the preceding claims, wherein each configured network interface bundle (117) is assigned a different bundle name that is unique within the group of managed runtime environments (112).
Computer-implemented method according to any of the preceding claims, wherein each network interface bundle (117) comprises the following information: - a preferred network interface (117a, 117b, 117c) as the main interface (HS), - at least one further network interface (117a, 117b, 117c) of the network interface bundle (117) as a fallback interface (RS), and the traffic classes established for the further network interfaces (117a, 117b, 117c), and/or - Information specifying the failure conditions for determining a failsafe event.
Computer-implemented method according to claim 4, wherein, in the case of more than one fallback interface (RS), a priority is assigned to the different fallback interfaces (RS) which specifies the order in which the respective fallback interfaces (RS) are activated.
Computer-implemented method according to one of claims 4-5, wherein each fallback interface (RS) is assigned at least one traffic class and a utilization rate of the transmission bandwidth provided by the fallback interface (RS).
Computer-implemented method according to one of claims 4-6, wherein the utilization rate is a relative specification with respect to the total transmission bandwidth of the fallback interface (RS) or an absolute bandwidth specification.
Computer-implemented method according to one of the preceding claims, wherein several failure conditions are specified within the traffic class, and the network interface (117a, 117b, 117c) is activated with the utilization level corresponding to the respective failure conditions or a combination of the failure conditions.
A computer-implemented method according to one of the preceding claims, wherein a configuration of the network interface bundles (117) is forwarded from the central management facility (100) to an interface control unit (115) in the runtime environment (112), and the interface control unit (115) sets up the network interface bundle (117) according to the configuration.
Computer-implemented method according to claim 9, wherein the interface control unit (115) derives upper and lower limits for monitoring parameters from the failure conditions and monitors these monitoring parameters via at least one fallback interface (RS).
Computer-implemented method according to claim 10, wherein, when a monitoring parameter exceeds its upper limit or falls below its lower limit, the configuration of the fallback interface (RS) is adapted accordingly by the interface control unit (115).
Computer-implemented method according to claim 11, wherein, when a monitoring parameter exceeds its upper limit or falls below its lower limit, the network interfaces (117a, 117b, 117c) of the container instance (113) are switched from the main interface (HS) to the highest-ranking fallback interface (RS).
Computer-implemented method according to claim 9, wherein, when a monitoring parameter exceeds its upper limit or falls below its lower limit, at least one fallback interface (RS) is assigned to the main interface (HS).
Arrangement for failover of network interfaces provided to a container instance (113) with an assigned criticality in a managed container runtime environment (112) by a guest computer (110, 120), comprising a central management facility (100) that is designed - to configure at least one network interface bundle (117) for a group of managed runtime environments (112), comprising a) Assigning at least two physical network interfaces (117a, 117b, 117c) to each of the network interface bundles (117), and b) Setting up at least one traffic class that assigns to the different network interfaces (117a, 117b, 117c) of the network interface bundle (117) a utilization level of the transmission bandwidth of the network interface (117a, 117b, 117c) under a given failure condition, and a runtime environment (112) in which the container instance (113) is executed and which is configured - to assign a traffic class to the container instance (113) that varies depending on the criticality of the container instance (113), and - to dynamically provide the network interface (117a, 117b, 117c) of the network interface bundle (117) for use by the container instance (113), which is determined by the assigned traffic class of the container instance (113) and a currently determined failure condition.
A computer program product comprising a non-volatile, computer-readable medium that can be directly loaded into the memory of at least one digital computer, comprising program code segments that, when executed by the at least one digital computer, cause it to perform the steps of the method according to any one of claims 1 to 13.
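
The bundle information of claims 4 to 7, the limit monitoring of claim 10 and the provisioning step (S3) of claim 1 can be illustrated with a minimal Python sketch. It is not part of the application. All class, function, alias, parameter and condition names (`Bundle`, `Fallback`, `Limit`, `detect_condition`, `provision`, `eth0`, `latency_ms`, `main_degraded` and so on) are hypothetical, utilization levels are modelled as relative shares, and the unrestricted use of the main interface when no failure condition is detected is an assumption of the sketch.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Fallback:
    alias: str      # interface alias (hypothetical name)
    priority: int   # lower value is activated first (assumed convention)
    # traffic class -> failure condition -> relative bandwidth share (0.0 to 1.0)
    shares: Dict[str, Dict[str, float]] = field(default_factory=dict)


@dataclass
class Limit:
    condition: str   # failure condition reported when the limits are violated
    parameter: str   # monitored parameter (hypothetical, e.g. "latency_ms")
    lower: float
    upper: float


@dataclass
class Bundle:
    name: str                  # bundle name, unique within the group
    main: str                  # alias of the main interface (HS)
    fallbacks: List[Fallback]  # fallback interfaces (RS)
    limits: List[Limit] = field(default_factory=list)


def detect_condition(bundle: Bundle, measured: Dict[str, float]) -> Optional[str]:
    """Return the first failure condition whose limits are violated, else None."""
    for limit in bundle.limits:
        value = measured.get(limit.parameter)
        if value is not None and not (limit.lower <= value <= limit.upper):
            return limit.condition
    return None


def provision(bundle: Bundle, traffic_class: str,
              condition: Optional[str]) -> Tuple[Optional[str], float]:
    """Return (interface alias, bandwidth share) for a container's traffic class."""
    if condition is None:
        return bundle.main, 1.0  # assumption: main interface used unrestricted
    # Try the fallback interfaces in priority order.
    for fb in sorted(bundle.fallbacks, key=lambda f: f.priority):
        share = fb.shares.get(traffic_class, {}).get(condition, 0.0)
        if share > 0.0:
            return fb.alias, share
    return None, 0.0  # this traffic class gets no bandwidth under the condition


bundle = Bundle(
    name="bundle-1",
    main="eth0",
    fallbacks=[
        Fallback("wlan0", priority=2, shares={"critical": {"main_degraded": 0.5}}),
        Fallback("eth1", priority=1, shares={
            "critical": {"main_degraded": 0.8},
            "best_effort": {"main_degraded": 0.0},
        }),
    ],
    limits=[Limit("main_degraded", "latency_ms", lower=0.0, upper=50.0)],
)

condition = detect_condition(bundle, {"latency_ms": 120.0})
print(provision(bundle, "critical", condition))     # ('eth1', 0.8)
print(provision(bundle, "best_effort", condition))  # (None, 0.0)
```

In this sketch the container instance with the critical traffic class is moved to the highest-ranking fallback interface with an 80 % share, while the less critical one receives no bandwidth under the same failure condition.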

Description

Technical field

The invention relates to a computer-implemented method for failover of network connections provided to a container instance with an assigned criticality in a managed container runtime environment by a guest computer, as well as a corresponding arrangement and a computer program product.

Technical background

Container virtualization is an operating system-level virtualization method. It provides computer programs with a complete runtime environment virtually within an isolated or self-contained software container. This runtime environment can be used by multiple containers and accesses the operating system kernel of a guest computer. The operating system kernel can restrict resource access depending on the user and context under which a process is running. Software containers, hereinafter referred to simply as containers, thus represent a resource-efficient form of virtualization compared to virtual machines, which are allocated hardware resources of the underlying system via a hypervisor and run their own operating system with its own kernel. Container virtualization encapsulates a software component running within a container from the underlying guest computer.

To start a container on the guest computer, a container instance is created from a container image using a deployment configuration (hereinafter also referred to as deployment information) and executed in the guest computer's runtime environment. A deployment configuration could be, for example, a Docker Compose file or a Kubernetes manifest.

An orchestrated runtime environment comprises an orchestrator, preferably implemented as orchestration software running on a computer, and at least one guest computer, often multiple guest computers, associated with the orchestrator. The orchestrator and the at least one orchestrated guest computer form a cluster. The orchestrator starts, manages, and terminates container instances on the assigned guest computers.
Typical orchestrated runtime environments are Kubernetes-based container environments that, for example, manage cloud-based Container-as-a-Service environments or virtual instances operated in a cloud as nodes of an orchestrated runtime environment.

Container instances communicate with other container instances running locally on the same underlying guest computer, also known as a host or node, via a network interface assigned by the container runtime environment. This network interface is also used to communicate with container instances on other nodes or with external services outside the potentially orchestrated runtime environment. This communication typically occurs via IP addresses, which are dynamically managed by the container runtime environment and/or the orchestrator and assigned to the instances upon startup.

A failover, or switchover for redundancy, in network interfaces refers to the process of automatically switching to a redundant network interface when the primary network interface fails. This ensures that a network connection is maintained continuously, even if one network interface stops working. If the primary network interface fails, a secondary network interface automatically takes over the network connection without requiring manual intervention.

Devices that provide a container runtime environment but do not operate in the cloud are connected to physical networks and, without additional measures, are not inherently equipped with highly available network connectivity in the way that virtual network connections are. It is known that this problem can be solved by combining physical network interfaces on the underlying host computer using a bonding or teaming method, and implementing the failover mechanism in this way. For example, two physical interfaces are connected to different, redundant network switches, and if one interface or switch fails, communication is handled via the other, remaining interface.
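
The host-wide behaviour of such a bonding or teaming setup can be illustrated with a minimal Python sketch of an active-backup selection. Real bonding is performed by the operating system's network stack rather than in application code. The function name `select_active` and the interface names are hypothetical.

```python
from typing import Dict, List, Optional


def select_active(interfaces: List[str], link_up: Dict[str, bool]) -> Optional[str]:
    """Return the first interface in preference order whose link is up."""
    for name in interfaces:
        if link_up.get(name, False):
            return name
    return None  # no interface of the bond is usable


bonded = ["eth0", "eth1"]  # each connected to a different, redundant switch
print(select_active(bonded, {"eth0": True, "eth1": True}))   # eth0
print(select_active(bonded, {"eth0": False, "eth1": True}))  # eth1
```

The selection takes no container-specific input, so every container instance on the host follows the same switchover.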
The transmission bandwidth of the physical interfaces is not constant, but can vary, for example due to differing network loads or because environment-dependent factors can significantly alter the quality of a wireless interface, especially with moving guest computers. Container instances running in the host machine's container runtime environment are often assigned different priorities reflecting varying levels of criticality. The problem with the bonding method is that the criticality of individual container instances running on the host cannot be considered; instead, bonding acts globally on the entire host machine. Therefore, if resources on the failover interface become scarce, a less critical container instance cannot be given reduced bandwidth, or none at all, in order to provide higher bandwidth to a critical container instance.

Summary of the invention

The object of the present invention is therefore to ensure a sufficient transmission bandwidth for the container instance depending on its criticality. Th