
US-20260127126-A1 - Heterogeneous Compute Platform Architecture For Efficient Hosting Of Network Functions

US 20260127126 A1

Abstract

The present disclosure provides for a converged compute platform architecture, including a first infrastructure processing unit (IPU)-only configuration and a second configuration wherein the IPU is coupled to a central processing unit, such as an x86 processor. Connectivity between the two configurations may be accomplished with a PCIe switch, or the two configurations may communicate through remote direct memory access (RDMA) techniques. Both configurations may use ML acceleration through a single converged architecture.
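The two configurations summarized above can be illustrated with a minimal Python sketch. This is a hypothetical model for illustration only; the function and field names are assumptions, not anything recited in the disclosure:

```python
# Hypothetical sketch of the two converged-platform configurations from the
# abstract: an IPU-only ("headless") mode, and an IPU coupled to an x86 CPU,
# both sharing a single ML-acceleration path.
def build_platform(headless: bool) -> dict:
    platform = {
        "ipu": "IPU (embedded Arm cores)",
        "ml_accel": "shared ML accelerator",
    }
    if headless:
        # First configuration: the IPU alone hosts the OS and applications.
        platform["host"] = platform["ipu"]
    else:
        # Second configuration: an x86 CPU is coupled to the IPU, e.g. via a
        # PCIe switch, or the two communicate through RDMA techniques.
        platform["host"] = "x86 CPU"
        platform["link"] = "PCIe switch / RDMA"
    return platform

print(build_platform(headless=True)["host"])   # IPU (embedded Arm cores)
print(build_platform(headless=False)["link"])  # PCIe switch / RDMA
```

Both variants return the same `ml_accel` entry, reflecting the single converged ML-acceleration architecture shared by the two configurations.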

Inventors

  • Santanu Dasgupta
  • Bok Knun Randolph Chung
  • Ankur Jain
  • Prashant Chandra
  • Bor Chan
  • Durgaprasad V. Ayyadevara
  • Ian Kenneth Coolidge
  • Muzammil Mueen Butt

Assignees

  • GOOGLE LLC

Dates

Publication Date
2026-05-07
Application Date
2026-01-06

Claims (15)

  1. A method of converging one or more processors including an infrastructure processing unit (IPU) and a central processing unit (CPU), comprising: coupling the IPU and the CPU to a programmable interconnect; coupling one or more peripheral components to the programmable interconnect, the one or more peripheral components accessible by each of the IPU and the CPU via the programmable interconnect; and utilizing one or more of the one or more peripheral components by one of the IPU or the CPU, independent of the other processor.
  2. The method of claim 1, wherein the one or more peripheral components comprise at least one of a network interface card (NIC) or an accelerator.
  3. The method of claim 1, wherein the one or more processors are implemented in a system on chip (SoC).
  4. The method of claim 1, further comprising: accessing, by the IPU or the CPU, at least one storage device via the programmable interconnect.
  5. The method of claim 4, wherein the at least one storage device is directly coupled to the programmable interconnect.
  6. The method of claim 5, further comprising: accessing, by the CPU, the at least one storage device via the IPU.
  7. The method of claim 6, further comprising: accessing the at least one storage device using remote direct memory access.
  8. The method of claim 1, further comprising: accessing, by the IPU or the CPU, at least one machine learning (ML) accelerator via the programmable interconnect.
  9. The method of claim 8, wherein the at least one ML accelerator is directly coupled to the programmable interconnect.
  10. The method of claim 9, further comprising: accessing, by the CPU, the at least one ML accelerator via the IPU.
  11. The method of claim 10, further comprising: accessing the at least one ML accelerator via remote direct memory access.
  12. The method of claim 1, further comprising: coupling a first root of trust to the IPU; and coupling a second root of trust to the CPU, wherein the first root of trust is different from the second root of trust.
  13. The method of claim 12, further comprising: coupling a peripheral component interconnect express (PCIe) switch between the IPU and the CPU.
  14. The method of claim 13, further comprising: directly connecting one or more accelerators to the PCIe switch, the one or more accelerators being accessible by each of the one or more processors.
  15. The method of claim 1, further comprising: directly coupling a plurality of network interface cards (NICs) to the CPU.
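The topology recited in claim 1, an IPU and a CPU that share peripherals through a programmable interconnect, each able to use a peripheral independent of the other, can be sketched in Python. All class and device names here are hypothetical stand-ins chosen for illustration:

```python
# Hypothetical sketch of the claim-1 topology: an IPU and a CPU are both
# coupled to a programmable interconnect, and peripherals (e.g. a NIC or an
# accelerator) attached to the interconnect are accessible by either processor.
class ProgrammableInterconnect:
    def __init__(self):
        self.peripherals = {}

    def attach(self, name, peripheral):
        self.peripherals[name] = peripheral

    def access(self, name):
        # Either processor reaches a peripheral through the interconnect,
        # independent of the other processor.
        return self.peripherals[name]

class Processor:
    def __init__(self, kind, interconnect):
        self.kind = kind                  # "IPU" or "CPU"
        self.interconnect = interconnect

    def use(self, peripheral_name):
        device = self.interconnect.access(peripheral_name)
        return f"{self.kind} -> {device}"

fabric = ProgrammableInterconnect()
fabric.attach("nic0", "NIC")
fabric.attach("ml0", "ML accelerator")

ipu = Processor("IPU", fabric)
cpu = Processor("CPU", fabric)

print(ipu.use("ml0"))   # IPU -> ML accelerator
print(cpu.use("nic0"))  # CPU -> NIC
```

The single `fabric` object shared by both `Processor` instances mirrors the converged arrangement of the claims, where one set of peripherals serves both the IPU-only and the IPU-plus-CPU paths.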

Description

CROSS-REFERENCE TO RELATED APPLICATIONS

The present application is a continuation of U.S. Non-provisional application Ser. No. 18/213,028, filed Jun. 22, 2023, which claims the benefit of the filing date of U.S. Provisional application Ser. No. 63/355,848, filed Jun. 27, 2022, the disclosures of which are hereby incorporated herein by reference.

BACKGROUND

Communication Service Providers (CSPs) worldwide are embracing disaggregation, cloud, automation, and machine learning (ML)/artificial intelligence (AI) to achieve software centricity and become agile and customer-experience centric. CSPs are virtualizing various network functions, deploying them on general-purpose servers, and leveraging cloud-native technologies across all domains of the end-to-end systems architecture. An initial phase started with the operations support system and business support system (OSS/BSS), which are typically deployed centrally in a CSP network; in later phases, virtualization expanded to the core network at regional data centers or the service edge of the CSP.

Traffic over the Internet is doubling almost every two years, and to maintain a proper balance between supply and demand, the computing infrastructure also needs to double every two years. However, the density of transistors within the same-sized integrated circuit (IC) at the same power footprint is no longer doubling, which can create an imbalance where supply may not be able to keep up with traffic demand in a cost- and power-efficient manner.

BRIEF SUMMARY

The present application relates to the deployment of virtualized/containerized network functions. An example relates to a virtualized distributed unit (vDU) and virtualized centralized unit (vCU) of a 4G or 5G Radio Access Network (RAN).
Virtual distributed unit (vDU) and virtual centralized unit (vCU) network functions of 4G/5G radio access networks (RANs) involve deployment of the physical layer, scheduler, data link layer, and packet processing layers, including the control components of the data link. Given the involvement of lower-layer components of the protocol stack, the vDU poses extremely stringent computing requirements around high bandwidth with no packet loss, extremely low latency, predictability, reliability, and security. Some of these requirements create the need for the cloud infrastructure to deliver real-time performance. Wireline access networks, such as a cable modem termination system (CMTS) in a cable network, may have similar system requirements.

To address such requirements in existing systems, vDUs and vCUs are deployed on top of x86 general-purpose processors (GPPs), often alongside a lookaside or inline acceleration building block (for the vDU) to offload very compute-intensive processing such as the computation of forward error correction. In such arrangements, the incoming traffic arrives through a dedicated network interface controller (NIC); the GPP-based central processing unit (CPU) then processes the physical layer functions (Hi-PHY), including lookaside acceleration to process channel coding or forward error correction (FEC); and the GPP-based CPU again processes the scheduler and data link layer functions.

The present disclosure provides a common and horizontal telecommunications (telco) cloud infrastructure that can form the foundation for virtualization of both wireless networks, such as 4G, 5G, and other radio access networks (RANs) and the 5G Core network (5GC), and wireline access networks, such as cable/fiber-based broadband networks. Such infrastructure can be deployed in a highly distributed manner across hundreds of thousands of sites.
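The existing lookaside data path described above, NIC ingress, Hi-PHY processing on the GPP CPU with FEC offloaded to an accelerator, then scheduler and data-link processing back on the CPU, can be sketched as a simple staged pipeline. All function names below are hypothetical; this is a structural illustration, not an implementation of any real vDU stack:

```python
# Hypothetical sketch of the existing vDU lookaside data path: NIC ingress,
# Hi-PHY on the GPP CPU with FEC offloaded to an accelerator, then scheduler
# and data-link (MAC) processing back on the CPU.
def fec_accelerator(payload):
    # Stand-in for the compute-intensive FEC offload performed "lookaside"
    # by a dedicated acceleration block.
    return f"decoded({payload})"

def nic_ingress(frames):
    # Dedicated NIC delivers incoming frames to the host.
    for f in frames:
        yield {"payload": f}

def hi_phy(frames):
    # GPP CPU performs Hi-PHY processing, handing channel coding / FEC to
    # the accelerator and resuming once the result returns.
    for f in frames:
        f["fec"] = fec_accelerator(f["payload"])
        yield f

def scheduler_and_mac(frames):
    # GPP CPU again processes the scheduler and data link layer functions.
    return [f"{f['fec']}->scheduled" for f in frames]

out = scheduler_and_mac(hi_phy(nic_ingress(["frameA", "frameB"])))
print(out)  # ['decoded(frameA)->scheduled', 'decoded(frameB)->scheduled']
```

The point of the sketch is the round trip: the CPU touches every frame both before and after the lookaside offload, which is the overhead the converged IPU-based architecture of this disclosure is meant to reduce.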
Such infrastructure may provide an agile, secure, and efficient platform to deploy all network and information technology (IT) functions in a seamless manner. It may also provide higher performance and lower power consumption, while bringing in newer capabilities to address artificial intelligence and security challenges. The compute platform architecture described herein provides for secure and efficient deployment of CSP network functions, particularly for access networking such as 4G and 5G RAN, 5G NSA (Non-Standalone) and SA (Standalone) core, and cable and fiber broadband.

The compute platform architecture may be modular, with a host computer as the main building block along with an optional L1 processor as a PCIe device. This architecture may include a first configuration leveraging an infrastructure processing unit (IPU) in a headless mode, without a dedicated host central processing unit (CPU). Embedded Arm CPU cores within the IPU may be used to deploy an operating system and network function applications. In other examples, the architecture may include a second configuration using an x86 or Arm GPP CPU along with the IPU. The host operating system and the network function application's control and management planes may be hosted on the CPU in the second configuration. Moreover, the application's user plane or