
US-20260128987-A1 - Generalized Edge Compute (GEC) architecture with egress link safety


Abstract

A Generalized Edge Compute (GEC) architecture that enables customers to deploy their applications in a VM environment hosted on overlay network hardware and software, thereby leveraging all of the advantages provided by a widely-distributed overlay. The architecture also includes a link safety mechanism to ensure that GEC traffic does not over-consume link resources associated with an edge host.

Inventors

  • Utkarsh Goel
  • Igor B. Lubashev
  • Anna R. Blasiak
  • Kevin P. Fuerst

Assignees

  • AKAMAI TECHNOLOGIES, INC.

Dates

Publication Date
2026-05-07
Application Date
2024-10-25

Claims (18)

  1. A method for traffic management over a link of a set of one or more links that is utilized for traffic that is a blend of controllable traffic and non-controllable traffic, comprising: deploying a cloud compute control plane to a network host, the network host being one of a set of distributed hosts comprising a multi-tenant shared infrastructure; instantiating one or more virtual machines on the network host; monitoring traffic being served over the link and associated with the one or more virtual machines, the traffic corresponding to at least the non-controllable traffic; and throttling the non-controllable traffic associated with the one or more virtual machines on the link to ensure that the link does not become congested for the controllable traffic.
  2. The method as described in claim 1, further including: defining a link capacity for the one or more virtual machines; determining whether the non-controllable traffic associated with the one or more virtual machines is within a configurable threshold of the link capacity; and based on a determination that the non-controllable traffic associated with the one or more virtual machines is within the configurable threshold of the link capacity, raising an alert.
  3. The method as described in claim 2, wherein, in response to the alert, and for each of the one or more virtual machines: computing a fair share of bandwidth on the link for the virtual machine; identifying whether the virtual machine is egressing more than the fair share computed for the virtual machine; and applying a rate limit to the traffic being served over the link by the virtual machine when the virtual machine is identified as egressing more than its fair share for the link.
  4. The method as described in claim 3, further including modifying at least one rate limit previously applied to a virtual machine upon a determination that the monitored traffic is within the configurable threshold of the link capacity.
  5. The method as described in claim 3, wherein the fair share of the bandwidth on the link is computed based on the link capacity, a current virtual machine utilization of the link, and one or more characteristics of a virtual machine.
  6. The method as described in claim 5, wherein the one or more characteristics of a virtual machine are one of: a virtual machine plan value, a cost, a number of virtual CPUs, an amount of RAM, a bandwidth cap, and combinations thereof.
  7. The method as described in claim 4, wherein the multi-tenant shared infrastructure is a content delivery network, the host is an edge machine, and the controllable traffic is associated with customers of the content delivery network.
  8. The method as described in claim 2, wherein the one or more virtual machines are permitted to operate without bandwidth constraints while the monitored traffic is below the configurable threshold of the link capacity.
  9. The method as described in claim 1, further including prioritizing bandwidth on the link for a first virtual machine over a second virtual machine when the traffic is throttled, wherein the first virtual machine has higher bandwidth allowance and requirement than the second virtual machine as reflected in a virtual machine plan size or vCPU count.
  10. The method as described in claim 1, wherein the virtual machines are instantiated on the network host for multiple distinct tenants.
  11. The method as described in claim 1, wherein the link is an egress link of a data center.
  12. The method as described in claim 11, wherein the data center is associated with a collection of edge regions that together comprise an Equivalence-Class-Of-Region (ECOR).
  13. An apparatus, comprising: one or more hardware processors; computer memory holding computer program instructions, the computer program instructions comprising program code configured to provide traffic management over a link of a set of one or more links that is utilized for traffic that is a blend of controllable traffic and non-controllable traffic, the program code configured to: monitor traffic being served over the link and associated with the one or more virtual machines, the traffic corresponding to at least the non-controllable traffic; and throttle the non-controllable traffic associated with the one or more virtual machines on the link to ensure that the link does not become congested for the controllable traffic.
  14. The apparatus as described in claim 13, wherein the program code is further configured to: determine whether the non-controllable traffic associated with the one or more virtual machines is within a configurable threshold of the link capacity; and based on a determination that the non-controllable traffic associated with the one or more virtual machines is within the configurable threshold of the link capacity, raise an alert.
  15. The apparatus as described in claim 14, wherein the program code is further configured, in response to the alert, and for each of the one or more virtual machines, to: compute a fair share of bandwidth on the link for the virtual machine; identify whether the virtual machine is egressing more than the fair share computed for the virtual machine on the link; and apply a rate limit to the traffic being served over the link by the virtual machine when the virtual machine is identified as egressing more than its fair share for the link.
  16. The apparatus as described in claim 15, wherein the rate limit is applied using an operating system traffic control subsystem.
  17. A method for traffic management over a link associated with an overlay network, wherein at least a portion of a capacity of the link is anticipated to be required to handle traffic that is not directly controllable by a traffic manager associated with the network, comprising: deploying a cloud compute control plane to a network host, the network host being one of a set of distributed hosts comprising a multi-tenant shared infrastructure; instantiating a set of virtual machines on the network host; monitoring the traffic being served over the link and associated with the set of virtual machines; and in response to the monitoring, selectively throttling the traffic associated with the set of virtual machines on the link by adjusting how the portion of the link capacity is allocated to at least one or more of the virtual machines of the set.
  18. The method as described in claim 17, wherein each of the virtual machines of the set is allocated a determined fair share for that virtual machine.
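
The threshold-triggered, fair-share throttling recited in claims 1-3, 8, and 9 can be sketched as follows. This is an illustrative sketch only: the function and field names, the vCPU-based weighting, and the 80%/90% fractions are assumptions chosen for the example, not values taken from the disclosure, which requires only that the fair share reflect link capacity, current utilization, and one or more virtual machine characteristics.

```python
from dataclasses import dataclass

@dataclass
class VM:
    name: str
    vcpus: int          # one possible per-VM "characteristic" (claim 5)
    egress_bps: float   # currently measured egress on the link

def fair_shares(vms, link_capacity_bps, gec_fraction=0.8):
    """Split a configurable fraction of link capacity among VMs,
    weighted here by vCPU count (illustrative weighting)."""
    budget = link_capacity_bps * gec_fraction
    total_weight = sum(vm.vcpus for vm in vms)
    return {vm.name: budget * vm.vcpus / total_weight for vm in vms}

def throttle_decisions(vms, link_capacity_bps, threshold=0.9):
    """Return per-VM rate limits only when aggregate egress comes
    within the configurable threshold of link capacity (claim 2)."""
    total = sum(vm.egress_bps for vm in vms)
    if total < threshold * link_capacity_bps:
        return {}  # below threshold: no bandwidth constraints (claim 8)
    shares = fair_shares(vms, link_capacity_bps)
    # Rate-limit only VMs egressing more than their fair share (claim 3).
    return {vm.name: shares[vm.name]
            for vm in vms if vm.egress_bps > shares[vm.name]}
```

On a 10 Gbps link with the defaults above, a 4-vCPU VM egressing 9 Gbps alongside a 1-vCPU VM egressing 0.5 Gbps would trip the 9 Gbps threshold, and only the larger VM would be limited, to its 6.4 Gbps share; a VM within its share runs unconstrained, matching the prioritization in claim 9.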

Description

BACKGROUND OF THE INVENTION

Distributed computer systems are well-known in the prior art. One such distributed computer system is a “content delivery network” (CDN) or “overlay network” that is operated and managed by a service provider. The service provider typically provides the content delivery service on behalf of third parties (customers) who use the service provider's shared infrastructure. A distributed system of this type typically refers to a collection of autonomous computers linked by a network or networks, together with the software, systems, protocols and techniques designed to facilitate various services, such as content delivery, web application acceleration, or other support of outsourced origin site infrastructure. A CDN service provider typically provides service delivery through digital properties (such as a website), which are provisioned in a customer portal and then deployed to the network.

Cloud computing is an information technology delivery model by which shared resources, software and information are provided on-demand over a network (e.g., the publicly-routed Internet) to computers and other devices. This type of delivery model has significant advantages in that it reduces information technology costs and complexities, while at the same time improving workload optimization and service delivery. In a typical use case, an application is hosted from network-based resources and is accessible through a conventional browser or mobile application. Cloud compute resources typically are deployed and supported in data centers that run one or more network applications, typically using a virtualized architecture wherein applications run inside virtual servers, or virtual machines (VMs), which are mapped onto physical servers in the data center. The virtual machines typically run on top of a hypervisor, which allocates physical resources to the virtual machines.
Traditional cloud providers typically support VMs and containers in a relatively small number of core data centers. Recently, the notion of “generalized edge compute” (GEC) has been proposed, wherein the capabilities of a multi-tenant cloud compute infrastructure are extended to edge Points of Presence (PoPs) of an overlay such as a CDN. By enabling full stack computing power to be brought to hundreds of previously hard-to-reach locations, deploying a cloud compute infrastructure control plane on overlay network edge machines would provide significant advantages. Indeed, deploying compute into an edge platform would also take advantage of existing overlay network operational tools, processes, and observability, enabling developers to innovate across the entire continuum of compute and providing a consistent experience from centralized cloud to distributed edge.

While a GEC solution such as described could provide significant advantages, by enabling customers to deploy applications in a VM environment hosted in CDN edge hardware, the potential integration of these solutions raises traffic management concerns. In particular, a GEC solution would allow customers to host bandwidth-intensive applications, generate web-like traffic, mix both traffic patterns, or bring new traffic profiles, all without prior knowledge or approval of the CDN provider that is responsible for managing traffic delivered from its edge infrastructure. Indeed, customers on the GEC network are expected to run any type of workload at any time, to use their own load balancers (for a typical multi-tenant compute implementation), and to do so without knowledge or visibility about the CDN's own traffic demands and available link capacities. This has the potential to cause congestion on the overlay network, thereby potentially impacting CDN and other services running on the platform, and to overload links, potentially leaving minimal bandwidth for other overlay network services.
SUMMARY OF THE INVENTION

A cloud compute infrastructure control plane is deployed on overlay network edge machines. As noted above, this generalized edge compute (GEC) solution combines the computing power of the cloud compute infrastructure with the proximity and efficiency of the edge to put workloads closer to users. According to this disclosure, this generalized edge compute architecture is enhanced with a link safety mechanism configured to limit egress traffic for customers, especially high-bandwidth customers. In non-saturation scenarios, the GEC network is allowed to use as much capacity as it needs, while leaving bandwidth for other services and preventing congestion on the link. When, however, a link is determined to be approaching saturation, the link safety mechanism is engaged. In one embodiment, and as a consequence, GEC traffic on the link is reduced to a configurable amount of link capacity. The GEC network may be configured in association with a single edge machine region, or in a set of such regions and their associated network infrastructure (e.g., within a metropolitan area or “metro”