US-20260129054-A1 - DETERMINING MICROSERVICE RESOURCE AVAILABILITIES BASED ON THREAT INTELLIGENCE AND HEALTH METRIC VALUES


Abstract

A technique includes monitoring health metric values associated with a collection of monitored resources associated with a microservice. The technique includes determining, based on the health metric values, whether each resource of the collection of monitored resources is healthy or unhealthy. The determination of whether each resource is healthy or unhealthy includes determining that a given resource of the collection of resources is healthy. The technique includes, for each resource of the collection of resources, monitoring an associated security status of the resource; and determining availability statuses for the collection of resources. Determining the availability statuses includes classifying each resource that is unhealthy as being unavailable and classifying the given resource as being unavailable responsive to the security status associated with the given resource. The technique includes determining a resource availability of the microservice based on the availability statuses.
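
As a rough illustration of the flow summarized above, the following Python sketch classifies a resource as available only when it is both healthy and not flagged by its security status, and then computes the resource availability of the microservice as the fraction of available resources (the ratio recited in claim 2). The names `Resource`, `is_available`, and `resource_availability` are hypothetical and do not come from the application. How the health and security flags are derived is left out, and returning 0.0 for an empty collection is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """Hypothetical record for one monitored resource (e.g., a container)."""
    name: str
    healthy: bool       # outcome of the health-metric evaluation
    compromised: bool   # outcome of the security-status evaluation


def is_available(resource: Resource) -> bool:
    # An unhealthy resource is unavailable; a healthy resource is also
    # unavailable when its security status marks it as compromised.
    return resource.healthy and not resource.compromised


def resource_availability(resources: list[Resource]) -> float:
    """Fraction of resources that are available (assumed 0.0 when empty)."""
    if not resources:
        return 0.0
    available = sum(1 for r in resources if is_available(r))
    return available / len(resources)


resources = [
    Resource("container-a", healthy=True, compromised=False),
    Resource("container-b", healthy=False, compromised=False),
    Resource("container-c", healthy=True, compromised=True),
    Resource("container-d", healthy=True, compromised=False),
]
print(resource_availability(resources))  # 0.5
```

In this example, `container-c` is healthy yet counts as unavailable because of its security status, so two of the four containers are available.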

Inventors

Phanidhar Koganti
Vidya R. Gudlavalleti

Assignees

HEWLETT PACKARD ENTERPRISE DEVELOPMENT LP

Dates

Publication Date: 20260507
Application Date: 20240827

Claims (20)

1. A method comprising: monitoring, by a processor-based operations monitoring agent, health metric values associated with a collection of monitored resources associated with a microservice, wherein the collection of resources comprises a plurality of containers that collectively provide the microservice; determining, by the processor-based operations monitoring agent and based on the health metric values, whether each container of the plurality of containers is healthy or unhealthy, wherein the determining whether each container is healthy or unhealthy comprises determining that a given container of the plurality of containers is healthy; for each container of the plurality of containers, monitoring, by the processor-based operations monitoring agent, an associated security status of the container; determining availability statuses for the plurality of containers, wherein the determining the availability statuses comprises: classifying each container of the plurality of containers which is unhealthy as being unavailable; and classifying the given container as being unavailable responsive to the security status associated with the given container; determining, by the processor-based operations monitoring agent, a resource availability of the microservice based on the availability statuses; and selectively initiating, by the processor-based operations monitoring agent, a remedial action based on the resource availability.
2. The method of claim 1, wherein: determining the resource availability comprises determining a ratio of a number of containers of the plurality of containers which are available to the number of the containers of the plurality of containers.
3. The method of claim 1, wherein: the monitoring comprises: receiving a threat intelligence; and determining, based on the threat intelligence, that the given resource is security compromised; and classifying the given resource as being unavailable comprises determining that the given resource is unavailable responsive to the determination that the given resource is security compromised.
4. The method of claim 3, wherein determining that the given resource is security compromised comprises determining that the threat intelligence represents that the given resource has an associated security intrusion.
5. The method of claim 3, wherein determining that the given resource is security compromised comprises determining that the threat intelligence represents that the given resource has an associated security vulnerability.
6. The method of claim 3, wherein determining that the given resource is security compromised comprises determining that the threat intelligence represents that the given resource has an associated security vulnerability and determining that the threat intelligence represents a security risk score greater than a predefined threshold.
7. The method of claim 1, wherein selectively initiating the remedial action comprises: comparing the resource availability of the microservice to a predefined resource availability threshold; and responsive to a result of the comparison, initiating the remedial action.
8. The method of claim 1, wherein the given resource comprises a container, and selectively initiating the remedial action comprises at least one of: generating data representing a monitoring dashboard alert; stopping the container; or restarting the container.
9. The method of claim 1, wherein the given resource comprises a container, and selectively initiating the remedial action comprises at least one of: patching an image associated with the container; or replacing the image.
10. The method of claim 1, wherein: the health metric values comprise a subset of health metric values associated with the given resource; and determining that the given resource is healthy comprises: determining whether the health metric values of the subset are expected; and applying a rule to a result of determining whether the health metric values of the subset are expected.
11. The method of claim 10, wherein applying the rule comprises one of: determining whether any of the health metric values of the subset is unexpected and marking the given resource as being healthy based on none of the health metric values of the subset being unexpected; or determining a number of the health metric values of the subset as being unexpected and marking the given resource as being healthy based on the number being less than a predefined number threshold.
12. An information technology (IT) operations management system comprising: a health monitoring engine comprising a hardware processor to determine, based on metric values associated with containers of a collection of containers, whether each container of the collection is healthy or unhealthy, wherein the collection of containers provides a microservice; a security monitoring engine comprising a hardware processor to determine, based on threat intelligence, whether each container of the collection is compromised; an availability determination engine comprising a hardware processor to: determine availability statuses for respective containers of the collection of containers, wherein determining the availability statuses comprises determining that a given container of the collection of containers is unavailable responsive to the given container being security compromised, and wherein the given container is healthy; and determine an availability of the microservice based on the availability statuses.
13. The IT operations management system of claim 12, wherein: the availability determination engine determines the availability of the microservice based on a ratio of a first number of the containers of the collection indicated as being available by the associated availability statuses to the total number of containers of the collection.
14. The IT operations management system of claim 13, further comprising: a remediation engine comprises a hardware processor to initiate a remedial action responsive to a comparison of the availability of the microservice to a predetermined availability threshold.
15. The IT operations management system of claim 14, wherein the hardware processor of the remediation engine to further, responsive to the comparison, generate data to display an alert on a monitoring dashboard associated with the microservice.
16. The IT operations management system of claim 14, wherein: the hardware processor of the security monitoring engine to further determine that a second container of the collection of containers is security compromised based on the threat intelligence representing that the second container is either associated with a security intrusion or vulnerable to a security intrusion.
17. A non-transitory system-readable storage medium that stores hardware processor-readable instructions that, when executed by a hardware processor of an information technology (IT) operations management system, cause the IT operations management system to: based on metric data provided by a computer system, determine health statuses of associated respective containers of a computer system, wherein the containers provide a plurality of microservices, and the plurality of microservices is associated with an application; based on threat intelligence data provided by a threat intelligence source, determine an associated security status of each container of the containers, wherein the security status represents whether the associated container is security compromised; determine, for each container of the collection, an associated availability status representing whether the container is available or unavailable based on the associated health status and the associated security status; and determine a resource availability of each microservice based on the availability statuses.
18. The storage medium of claim 17, wherein the instructions, when executed by the hardware processor, further cause the IT operations management system to: compare, for each resource availability, the resource availabilities to a resource availability threshold to provide a comparison result associated with the resource availability; and initiate a remedial action responsive to a given comparison result of the comparison results.
19. The storage medium of claim 17, wherein the instructions, when executed by the hardware processor, further cause the IT operations management system to generate data to display the resource availabilities on a dashboard.
20. The storage medium of claim 17, wherein the instructions, when executed by the hardware processor, further cause the IT operations management system to receive, for a given container of the containers and from a kubelet of the given container, health metric values corresponding to health data for the given container.
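
The decision rules recited in claims 4 through 7, 10 and 11 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the `ThreatIntel` record, the function names, the representation of "expected" values as numeric ranges, the metric names, and every threshold value are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ThreatIntel:
    """Hypothetical threat-intelligence summary for one resource."""
    intrusion: bool = False
    vulnerability: bool = False
    risk_score: Optional[float] = None  # None when no score is reported


def is_healthy(values: dict[str, float],
               expected: dict[str, tuple[float, float]],
               number_threshold: int = 1) -> bool:
    """Health rule in the style of claims 10 and 11.

    A value is "unexpected" when it falls outside its (low, high) range
    (an assumed representation). The resource is healthy when the count of
    unexpected values is less than number_threshold; the default of 1
    means no unexpected value is tolerated.
    """
    unexpected = sum(
        1 for name, value in values.items()
        if not (expected[name][0] <= value <= expected[name][1])
    )
    return unexpected < number_threshold


def is_compromised(intel: ThreatIntel,
                   risk_threshold: Optional[float] = None) -> bool:
    """Security rule in the style of claims 4 through 6."""
    if intel.intrusion:            # claim 4: an associated intrusion
        return True
    if not intel.vulnerability:
        return False
    if risk_threshold is None:     # claim 5: any vulnerability counts
        return True
    # claim 6: vulnerability plus a risk score above the threshold
    return intel.risk_score is not None and intel.risk_score > risk_threshold


def needs_remediation(availability: float, availability_threshold: float) -> bool:
    # Claim 7 only ties the action to "a result of the comparison";
    # triggering when availability is below the threshold is an assumption.
    return availability < availability_threshold


expected = {"cpu_pct": (0.0, 90.0), "restarts": (0.0, 3.0)}
observed = {"cpu_pct": 42.0, "restarts": 5.0}
print(is_healthy(observed, expected))                       # False
print(is_healthy(observed, expected, number_threshold=2))   # True
print(is_compromised(ThreatIntel(vulnerability=True, risk_score=4.0),
                     risk_threshold=7.0))                   # False
print(needs_remediation(0.5, 0.8))                          # True
```

Folding both alternatives of claim 11 into a single count-versus-threshold check is a design choice of this sketch: a threshold of 1 reproduces the "none unexpected" alternative, and larger values reproduce the "fewer than a predefined number" alternative.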

Description

BACKGROUND

A business enterprise may rely on any of a number of different computing environments to provide its services. In examples, the computing environments for a particular business enterprise may be confined to a private cloud (e.g., an on-premise datacenter), confined to a public cloud, or distributed across a hybrid cloud that includes both public and private clouds. A business enterprise may subscribe to an information technology (IT) operations management (ITOM) platform (e.g., a public cloud-based, software-as-a-service (SaaS) platform) for such purposes as monitoring service availabilities; and detecting, predicting and remediating service issues.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of a computer network that includes a threat intelligence-aware operations management service to monitor microservice resource availabilities according to an example implementation.
FIG. 2 is a block diagram of a threat intelligence-aware operations management system according to an example implementation.
FIG. 3 is an example snapshot of a dashboard of a threat intelligence-aware operations management system, illustrating use of the dashboard to monitor and manage microservice resource availabilities according to an example implementation.
FIG. 4 is a sequence diagram depicting communications among components of a threat intelligence-aware operations management system according to an example implementation.
FIG. 5 is a flow diagram depicting a technique to determine a security status of a resource based on threat intelligence according to an example implementation.
FIG. 6 is a flow diagram depicting a technique to determine microservice resource availability based on resource health metric values and resource security statuses according to an example implementation.
FIG. 7 is a block diagram of an information technology (IT) operations management system to determine resource availabilities based on resource health metric values and resource security statuses according to an example implementation.
FIG. 8 is an illustration of instructions that are stored on a non-transitory hardware processor-readable storage medium, which when executed by a hardware processor, cause the IT operations management system to determine microservice resource availability based on metric data and threat intelligence according to an example implementation.

DETAILED DESCRIPTION

In one type of application architecture, an application may be monolithic and correspond to a single unit. In another type of application architecture, an application may be formed from multiple, autonomous parts called “microservices.” As compared to the monolithic architecture, the microservice architecture provides greater agility, elasticity and greater control for software quality assurance. Moreover, the microservice architecture may be better suited for a cloud deployment of an application.

A microservice may be provided by a container environment. In this context, a “container environment” refers to a collection of one or multiple instantiated containers (also referred to herein as “containers”). For a container environment that includes multiple containers, the containers may collaborate for a particular purpose (e.g., providing a microservice).

A container environment may be orchestrated or non-orchestrated (or “self-managed”). An orchestrated container environment has an orchestrator that manages the lifecycles and workloads of the environment's containers. In examples, an orchestrator may manage provisioning and resource allocation for the containers. In other examples, an orchestrator may manage container replication, when containers start and stop, container scaling, workload distribution among the containers, or other lifecycle phase or workload aspects of the container environment. In examples, an orchestrated container environment may have a KUBERNETES orchestrator or a DOCKER SWARM orchestrator. In an example, an orchestrated container environment may be a container cluster (e.g., a KUBERNETES cluster) that has a control plane and worker nodes.

Regardless of its particular architecture, a microservice has a number of supporting resources. In the context that is used herein, a “resource” refers to a component, such as a container or a group of containers (called a “container pod” or “pod”). Depending on its complexity (e.g., the degree of scaling, fault tolerance features, the number of entities communicating with the microservice, as well as other features), a given microservice may have hundreds or even thousands of resources.

For purposes of managing its microservices, a business entity customer may subscribe to an information technology (IT) operations management (ITOM) platform (a platform provided by a public cloud provider “as-a-service”). The ITOM platform monitors metrics (e.g., kube metrics) of the microservice resources for purposes of assessing resource health and through a user graphical user interface (GUI), or