US-12619932-B2 - Value chain workload autoscaling in an industry cloud

Abstract

The technology described herein is directed towards automatically scaling cloud provider resources allocated for one enterprise's service based on the resources being used or expected to be used by another enterprise's service, in which there is a value chain-based load-dependency relationship between the two enterprises' services. In one example implementation, a first prediction engine determines a predicted workload for a first enterprise service, and sends that information to a load manager that allocates resources for the first enterprise service based on the prediction. Based on the load-dependency relationship, the first prediction engine sends the predicted workload to a second prediction engine, which predicts a second predicted workload for a second enterprise's service, and sends the second prediction information to a load manager that allocates resources for the second enterprise service based on the second prediction. The automatic scaling is done without sharing any enterprise-sensitive data among the enterprises.
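
The chained prediction described above can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and does not come from the patent: the `LoadManager` and `PredictionEngine` classes, the moving-average predictor, the `fan_out` ratio (downstream queries per upstream request), and the per-instance capacities. Only an aggregate predicted workload passes between the engines, which mirrors the statement that no enterprise-sensitive data is shared.

```python
import math
from dataclasses import dataclass, field


@dataclass
class LoadManager:
    """Hypothetical load manager: turns a predicted workload into an instance count."""
    requests_per_instance: int  # assumed capacity of one instance
    allocated: int = 0

    def allocate(self, predicted_requests: float) -> int:
        # Always keep at least one instance running.
        self.allocated = max(1, math.ceil(predicted_requests / self.requests_per_instance))
        return self.allocated


@dataclass
class PredictionEngine:
    """Hypothetical prediction engine for one enterprise's service."""
    load_manager: LoadManager
    # (downstream engine, fan_out) pairs; fan_out is an assumed ratio of
    # downstream queries generated per upstream request.
    downstream: list = field(default_factory=list)

    def predict(self, history: list) -> float:
        # Placeholder predictor (assumption): mean of the last three samples.
        recent = history[-3:]
        return sum(recent) / len(recent)

    def publish(self, predicted_requests: float) -> None:
        # Allocate for this service, then forward only the aggregate number
        # to each load-dependent service's engine.
        self.load_manager.allocate(predicted_requests)
        for engine, fan_out in self.downstream:
            engine.publish(predicted_requests * fan_out)


if __name__ == "__main__":
    lender_lm = LoadManager(requests_per_instance=50)
    credit_lm = LoadManager(requests_per_instance=150)
    credit_engine = PredictionEngine(credit_lm)
    lender_engine = PredictionEngine(lender_lm, downstream=[(credit_engine, 2.0)])

    predicted = lender_engine.predict([100, 200, 300, 400])
    lender_engine.publish(predicted)
    print(predicted, lender_lm.allocated, credit_lm.allocated)
```

With these assumed numbers, the first engine predicts 300 requests, so the first service receives 6 instances and the second service, facing 600 induced queries, receives 4. A real predictor would replace the moving average; the point of the sketch is only the hand-off of a predicted workload from one engine to the next.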

Inventors

Ofir Ezrielev
Shahed Yousef
Sharon Vitek

Assignees

DELL PRODUCTS L.P.

Dates

Publication Date: 2026-05-05
Application Date: 2024-04-24

Claims (20)

1. A cloud system, comprising: at least one processor; and at least one memory that stores executable instructions that, when executed by the at least one processor, facilitate performance of operations, the operations comprising: for a group of services on the cloud system associated with a common value chain of a group of enterprises, wherein there are interdependencies between respective subgroups of the services associated with the enterprises, wherein the cloud system has visibility to respective telemetry data and respective architectures associated with the respective subgroups of the services associated with the enterprises, and wherein each of the enterprises does not have visibility to the respective telemetry data and the respective architectures associated with the respective subgroups of services associated with other enterprises of the group of enterprises: constructing, based on the respective telemetry data and the respective architectures associated with the respective subgroups of the services, a workflow graph of the group of services and the interdependencies between the respective subgroups of the services associated with the enterprises; obtaining, from the respective telemetry data, respective workload data representative of workloads of the subgroups of services; allocating, based on the respective workload data, server resources of the cloud system to run the respective subgroups of the services; based on the respective telemetry data, determining first predicted workload data associated with a first subgroup of the services associated with a first enterprise of the group of enterprises, wherein the first predicted workload data is indicative of a first predicted changed workload associated with at least one first service of the first subgroup of the services; based on the first predicted changed workload, determining a first minimum resource scaling for the first subgroup of the services; based on the at least one first service of the first subgroup of the services and the interdependencies between the respective subgroups of the services, determining a circulation flow through the workflow graph, wherein the circulation flow comprises: the at least one first service, and at least one second service of at least one second subgroup of the services, different than the first subgroup, respectively associated with second enterprises of the group of enterprises; based on the circulation flow and the first minimum resource scaling, determining respective second minimum resource scalings for the at least one second subgroup of the services; allocating first server resources of the server resources of the cloud system to run the first subgroup of the services based on the first minimum resource scaling; and allocating respective second server resources of the server resources of the cloud system to run the at least one second subgroup of the services based on the respective second minimum resource scalings.
2. The cloud system of claim 1, wherein the respective telemetry data comprises at least one of: current state data representative of respective current states associated with the group of services, or expected state data of respective information for processing by the group of services.
3. The cloud system of claim 2, wherein a first service of the group of services sends request data to a second service of the group of services, wherein the first service, in response to the request data, receives result data from the second service, and wherein first data of the current state data associated with the first service or second data of the expected state data associated with the first service corresponds to an amount of the request data to be sent from the first service to the second service.
4. The cloud system of claim 1, wherein the first predicted changed workload is an increased workload from a current workload associated with the first subgroup of the services.
5. The cloud system of claim 1, wherein the first predicted changed workload is a decreased workload from a current workload associated with the first subgroup of the services.
6. The cloud system of claim 1, wherein determining the first minimum resource scaling is based further on a service level agreement associated with the first enterprise.
7. The cloud system of claim 1, wherein determining the first minimum resource scaling is based further on a service level objective associated with the first enterprise.
8. The cloud system of claim 1, wherein the common value chain is associated with healthcare-related workloads.
9. The cloud system of claim 1, wherein the common value chain is associated with financial data-related workloads.
10. A method, comprising: for a group of services on a cloud system, comprising at least one processor, associated with a common value chain of a group of enterprises, wherein there are interdependencies between respective subgroups of the services associated with the enterprises, wherein the cloud system has visibility to respective telemetry data and respective architectures associated with the respective subgroups of the services associated with the enterprises, and wherein each enterprise of the enterprises does not have visibility to the respective telemetry data and the respective architectures associated with the respective subgroups of services associated with other enterprises of the group of enterprises: constructing, by the cloud system, based on the respective telemetry data and the respective architectures associated with the respective subgroups of the services, a workflow graph of the group of services and the interdependencies between the respective subgroups of the services associated with the enterprises; obtaining, by the cloud system, respective workload data representative of workloads of the subgroups of services; allocating, by the cloud system, based on the respective workload data, server resources of the cloud system to run the respective subgroups of the services; based on the respective telemetry data, determining, by the cloud system, first predicted workload data associated with a first subgroup of the services associated with a first enterprise of the group of enterprises, wherein the first predicted workload data is indicative of a first predicted changed workload associated with at least one first service of the first subgroup of the services; based on the first predicted changed workload, determining, by the cloud system, a first minimum resource scaling for the first subgroup of the services; based on the at least one first service of the first subgroup of the services and the interdependencies between the respective subgroups of the services, determining, by the cloud system, a circulation flow through the workflow graph, wherein the circulation flow comprises: the at least one first service, and at least one second service of at least one second subgroup of the services respectively associated with second enterprises of the group of enterprises; based on the circulation flow and the first minimum resource scaling, determining, by the cloud system, respective second minimum resource scalings for the at least one second subgroup of the services; allocating, by the cloud system, first server resources of the server resources of the cloud system to run the first subgroup of the services based on the first minimum resource scaling; and allocating, by the cloud system, respective second server resources of the server resources of the cloud system to run the at least one second subgroup of the services based on the respective second minimum resource scalings.
11. The method of claim 10, wherein the respective telemetry data comprises at least one of: current state data representative of respective current states associated with the group of services, or expected state data of respective information for processing by the group of services.
12. The method of claim 11, wherein a first service of the group of services sends request data to a second service of the group of services, wherein the first service, in response to the request data, receives result data from the second service, and wherein a first part of the current state data associated with the first service or a second part of the expected state data associated with the first service corresponds to an amount of the request data to be sent from the first service to the second service.
13. The method of claim 10, wherein the first predicted changed workload is an increased workload from a current workload associated with the first subgroup of the services.
14. The method of claim 10, wherein the first predicted changed workload is a decreased workload from a current workload associated with the first subgroup of the services.
15. The method of claim 10, wherein determining the first minimum resource scaling is based further on a service level agreement associated with the first enterprise.
16. The method of claim 10, wherein determining the first minimum resource scaling is based further on a service level objective associated with the first enterprise.
17. The method of claim 10, wherein the common value chain is associated with healthcare-related workloads.
18. The method of claim 10, wherein the common value chain is associated with financial data-related workloads.
19. A non-transitory machine-readable medium, comprising executable instructions that, when executed by at least one processor of a cloud system, facilitate performance of operations, the operations comprising: for a group of services on the cloud system associated with a common value chain of a group of enterprises, wherein there are interdependencies between respective subgroups of the services associated with the enterprises, wherein the cloud system has visibility to respective telemetry data and respective architectures associated with the respective subgroups of the services associated with the enterprises, and wherein each enterprise of the enterprises does not have visibility to the respective telemetry data and the respective architectures associated with the respective subgroups of services associated with other enterprises of the group of enterprises: constructing, based on the respective telemetry data and the respective architectures associated with the respective subgroups of the services, a workflow graph of the group of services and the interdependencies between the respective subgroups of the services associated with the enterprises; obtaining, from the respective telemetry data, respective workload data representative of workloads of the subgroups of services; allocating, based on the respective workload data, server resources of the cloud system to run the respective subgroups of the services; based on the respective telemetry data, determining first predicted workload data associated with a first subgroup of the services associated with a first enterprise of the group of enterprises, wherein the first predicted workload data is indicative of a first predicted changed workload associated with at least one first service of the first subgroup of the services; based on the first predicted changed workload, determining a first minimum resource scaling for the first subgroup of the services; based on the at least one first service of the first subgroup of the services and the interdependencies between the respective subgroups of the services, determining a circulation flow through the workflow graph, wherein the circulation flow comprises: the at least one first service, and at least one second service of at least one second subgroup of the services respectively associated with second enterprises of the group of enterprises; based on the circulation flow and the first minimum resource scaling, determining respective second minimum resource scalings for the at least one second subgroup of the services; allocating first server resources of the server resources of the cloud system to execute the first subgroup of the services based on the first minimum resource scaling; and allocating respective second server resources of the server resources of the cloud system to execute the at least one second subgroup of the services based on the respective second minimum resource scalings.
20. The non-transitory machine-readable medium of claim 19, wherein the respective telemetry data comprises at least one of: current state data representative of respective current states associated with the group of services, or expected state data of respective information for processing by the group of services.
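
The independent claims recite a workflow graph, a flow through that graph starting at a first service, and minimum resource scalings derived for the other enterprises' subgroups. The following is a rough sketch of one way such a propagation could work, under stated assumptions. The claims do not say how the circulation flow is computed; the sketch simply treats it as the services reachable from the first service along dependency edges, with load multiplied by an assumed per-edge call ratio. The service names, enterprise names, ratios, and per-server capacities are all hypothetical.

```python
import math
from collections import defaultdict, deque

# Hypothetical workflow graph held by the cloud provider only:
# caller -> [(callee, assumed calls made per caller request)].
EDGES = {
    "lender.apply": [("bureau.score", 2.0), ("bank.verify", 1.0)],
    "bureau.score": [("bank.verify", 0.5)],
    "bank.verify": [],
}
# Which enterprise (subgroup) owns each service, and assumed requests per server.
OWNER = {"lender.apply": "lender", "bureau.score": "bureau", "bank.verify": "bank"}
CAPACITY = {"lender": 100, "bureau": 150, "bank": 200}


def topological_order(edges):
    """Kahn's algorithm; every node must appear as a key in edges."""
    indegree = {node: 0 for node in edges}
    for callees in edges.values():
        for callee, _ in callees:
            indegree[callee] += 1
    ready = deque(node for node, degree in indegree.items() if degree == 0)
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for callee, _ in edges[node]:
            indegree[callee] -= 1
            if indegree[callee] == 0:
                ready.append(callee)
    if len(order) != len(edges):
        raise ValueError("cycle detected; this sketch handles acyclic graphs only")
    return order


def induced_load(edges, first_service, predicted_requests):
    """Push the first service's predicted workload along dependency edges."""
    load = defaultdict(float)
    load[first_service] = predicted_requests
    for node in topological_order(edges):
        for callee, ratio in edges[node]:
            load[callee] += load[node] * ratio
    return {service: value for service, value in load.items() if value > 0}


def minimum_scalings(edges, owner, capacity, first_service, predicted_requests):
    """Aggregate induced load per enterprise and convert it to a server count."""
    per_enterprise = defaultdict(float)
    for service, requests in induced_load(edges, first_service, predicted_requests).items():
        per_enterprise[owner[service]] += requests
    return {ent: math.ceil(req / capacity[ent]) for ent, req in per_enterprise.items()}


if __name__ == "__main__":
    print(minimum_scalings(EDGES, OWNER, CAPACITY, "lender.apply", 300))
```

With these assumed figures, a predicted 300 requests at `lender.apply` induces 600 at `bureau.score` and 600 at `bank.verify`, giving 3, 4, and 3 servers for the three enterprises. The graph and telemetry stay with the provider, and each enterprise sees only its own allocation. A graph with feedback loops would need a different treatment than the acyclic traversal used here.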

Description

BACKGROUND

An industry (vertical) cloud is a set of packaged cloud products, full-service offerings or frameworks configured for an industry or several industries across a value chain, where “value chain” refers to a combination of different services that provide a full product solution. A value chain may encompass more than one company (e.g., a financial technology, or ‘fintech’, company that draws data from banks and credit providers to provide insights to end customers). Industry cloud platforms are designed to cater to the specific needs of vertical industry segments inadequately served by generic solutions. On these platforms, it is common to find applications that are based on services by different companies merged into a single workflow, where “workflow” includes the structure and amount, or number, of resources and time that are used in a system in the context of completing a task.

BRIEF DESCRIPTION OF THE DRAWINGS

The technology described herein is illustrated by way of example and not limited in the accompanying figures in which like reference numerals indicate similar elements and in which:

FIG. 1 is a block diagram showing an example system for automatically scaling services based on dependency relationships between those services, in accordance with various embodiments and implementations of the subject disclosure.

FIG. 2 is a block diagram showing an example workflow for two enterprises operating as part of a value chain, in which one enterprise's service workload impacts the other enterprise's service workload, in accordance with various embodiments and implementations of the subject disclosure.

FIG. 3 is a flow diagram showing example operations for predicting a first workload for a first service, and using that predicted first workload to predict a second workload for a second service, in accordance with various embodiments and implementations of the subject disclosure.

FIG. 4 is a flow diagram showing example operations related to predicting a load for a first service and informing at least one other load-dependent service of the predicted load, in accordance with various embodiments and implementations of the subject disclosure.

FIG. 5 is a flow diagram showing example operations related to obtaining a predicted load from another service and taking action based on that predicted load, in accordance with various embodiments and implementations of the subject disclosure.

FIG. 6 is a flow diagram showing example operations related to determining predicted workload data based on first workload data of a first service running via a provider platform, and allocating second resources for a second running service based on the predicted workload data, in accordance with various embodiments and implementations of the subject disclosure.

FIG. 7 is a flow diagram showing example operations related to allocating resources to a second service based on second predicted workload data that is determined based on first predicted workload data of a first service, in accordance with various embodiments and implementations of the subject disclosure.

FIG. 8 is a flow diagram showing example operations related to determining first predicted workload data of a first service and allocating resources to a second service based on the first predicted workload data, in accordance with various embodiments and implementations of the subject disclosure.

FIG. 9 is a block diagram representing an example computing environment into which embodiments of the subject matter described herein may be incorporated.

FIG. 10 depicts an example schematic block diagram of a computing environment with which the disclosed subject matter can interact/be implemented at least in part, in accordance with various embodiments and implementations of the subject disclosure.

DETAILED DESCRIPTION

The technology described herein is generally directed towards automatically scaling cloud provider resources allocated for one enterprise's service based on the resources being used or expected to be used by another enterprise's service, in which there is a load-dependency relationship between the two enterprises' services.

By way of example, consider that one enterprise's service (e.g., that of a lending company) makes queries to another enterprise (e.g., a credit monitoring company) to obtain credit scores for potential borrowers. If the lending enterprise needs its cloud resources scaled up to handle a large number of actual or expected requests, then the cloud resources currently allocated to the credit monitoring company may be inadequate to efficiently process and respond to the incoming queries from the lending company. Because spinning up additional resources takes time, rather than waiting for the credit monitoring company's service to become a bottleneck due to the inadequate cloud resources currently allocated to it, the technology described herein predicts the need for additional cloud resources by the credit monitoring c