EP-4738143-A1 - BULK API WITH CHUNKED TRANSFER ENCODING FOR CLOUD APPLICATIONS


Abstract

Example methods and systems are directed to a bulk API with chunked transfer encoding for cloud applications. The Replication Management Service (RMS) is a cloud-based application designed to replicate data for a variety of data sources. The RMS provides a collection of RESTful APIs to facilitate the creation and execution of replication tasks. The RMS receives multiple chunks in a REST communication and generates a replication flow in response. Since all replication tasks in a replication flow use the same constellation, the application using the RMS only sends the constellation data once regardless of the number of replication tasks in the replication flow. As a result, the total amount of data transferred between the application and the RMS is reduced whenever multiple replication tasks are performed on the same constellation, reducing network congestion and improving the performance of the cloud-based system.

Inventors

HUANG, XINRONG

Assignees

SAP SE

Dates

Publication Date: 20260506
Application Date: 20251028

Claims (15)

A system comprising: a memory that stores instructions; and one or more processors coupled to the memory to execute the instructions to perform operations comprising: receiving a first chunk of a hypertext transport protocol (HTTP) payload, the first chunk identifying a source space; receiving a second chunk of the HTTP payload, the second chunk identifying a target space; receiving a third chunk of the HTTP payload, the third chunk identifying a source table; receiving a fourth chunk of the HTTP payload, the fourth chunk identifying a target table; and based on the first chunk, the second chunk, the third chunk, and the fourth chunk, creating a replication task for copying data from the source table in the source space to the target table in the target space.
The system of claim 1, wherein the operations further comprise: receiving a fifth chunk of the HTTP payload, the fifth chunk identifying a constellation that comprises the source space and the target space.
The system of claim 1 or 2, wherein the operations further comprise: attempting to create a first data object that represents the source space; attempting to create a second data object that represents the target space; attempting to create a third data object that represents the source table; and prior to the task creation for copying of the data from the source table in the source space to the target table in the target space, determining that the attempts to create the first data object, the second data object, and the third data object were all successful.
The system of any one of claims 1 to 3, wherein the operations further comprise: receiving a fifth chunk of the HTTP payload, the fifth chunk identifying a second source table; receiving a sixth chunk of the HTTP payload, the sixth chunk identifying a second target table; and based on the first chunk, the second chunk, the fifth chunk, and the sixth chunk, creating a task for copying data from the second source table in the source space to the second target table in the target space.
The system of any one of claims 1 to 4, wherein the first chunk, the second chunk, the third chunk, and the fourth chunk each comprise an identifier with a same value; and/or wherein the first chunk comprises a name of the source space, a connection identifier that identifies a connection to the source space, a connection type that identifies a type of the connection to a source space, and a maximum number of simultaneous connections to the source space.
The system of any one of claims 1 to 5, wherein the third chunk identifies a column of the source table; and/or wherein the fourth chunk identifies a column of the target table.
A non-transitory computer-readable medium that stores instructions that, when executed by one or more processors, cause the one or more processors to perform operations comprising: receiving a first chunk of a hypertext transport protocol (HTTP) payload, the first chunk identifying a source space; receiving a second chunk of the HTTP payload, the second chunk identifying a target space; receiving a third chunk of the HTTP payload, the third chunk identifying a source table; receiving a fourth chunk of the HTTP payload, the fourth chunk identifying a target table; and based on the first chunk, the second chunk, the third chunk, and the fourth chunk, creating a task for copying data from the source table in the source space to the target table in the target space.
The non-transitory computer-readable medium of claim 7, wherein the operations further comprise: receiving a fifth chunk of the HTTP payload, the fifth chunk identifying a constellation that comprises the source space and the target space.
The non-transitory computer-readable medium of claim 7 or 8, wherein the operations further comprise: attempting to create a first data object that represents the source space; attempting to create a second data object that represents the target space; attempting to create a third data object that represents the source table; and prior to the task creation for copying of the data from the source table in the source space to the target table in the target space, determining that the attempts to create the first data object, the second data object, and the third data object were all successful.
The non-transitory computer-readable medium of any one of claims 7 to 9, wherein the operations further comprise: receiving a fifth chunk of the HTTP payload, the fifth chunk identifying a second source table; receiving a sixth chunk of the HTTP payload, the sixth chunk identifying a second target table; and based on the first chunk, the second chunk, the fifth chunk, and the sixth chunk, creating a task for copying data from the second source table in the source space to the second target table in the target space.
The non-transitory computer-readable medium of any one of claims 7 to 10, wherein the first chunk, the second chunk, the third chunk, and the fourth chunk each comprise an identifier with a same value; and/or wherein the first chunk comprises a name of the source space, a connection identifier that identifies a connection to the source space, a connection type that identifies a type of the connection to a source space, and a maximum number of simultaneous connections to the source space; and/or wherein the third chunk identifies a column of the source table; and/or wherein the fourth chunk identifies a column of the target table.
A method comprising: receiving, by one or more processors, a first chunk of a hypertext transport protocol (HTTP) payload, the first chunk identifying a source space; receiving a second chunk of the HTTP payload, the second chunk identifying a target space; receiving a third chunk of the HTTP payload, the third chunk identifying a source table; receiving a fourth chunk of the HTTP payload, the fourth chunk identifying a target table; and based on the first chunk, the second chunk, the third chunk, and the fourth chunk, creating a task for copying data from the source table in the source space to the target table in the target space.
The method of claim 12, further comprising: receiving a fifth chunk of the HTTP payload, the fifth chunk identifying a constellation that comprises the source space and the target space.
The method of claim 12 or 13, further comprising: attempting to create a first data object that represents the source space; attempting to create a second data object that represents the target space; attempting to create a third data object that represents the source table; and prior to the task creation for copying of the data from the source table in the source space to the target table in the target space, determining that the attempts to create the first data object, the second data object, and the third data object were all successful.
The method of any one of claims 12 to 14, further comprising: receiving a fifth chunk of the HTTP payload, the fifth chunk identifying a second source table; receiving a sixth chunk of the HTTP payload, the sixth chunk identifying a second target table; and based on the first chunk, the second chunk, the fifth chunk, and the sixth chunk, creating a task for copying data from the second source table in the source space to the second target table in the target space.

Description

TECHNICAL FIELD

The subject matter disclosed herein generally relates to systems for communicating with cloud applications, and more specifically to a bulk application programming interface (API) with chunked transfer encoding for cloud applications.

BACKGROUND

Existing applications use multiple API calls to perform multiple tasks. Managing complex sets of tasks requires a substantial number of API calls. Error handling may further increase the number of API calls performed.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows a network diagram illustrating an example network environment suitable for using a bulk API with chunked transfer encoding for cloud applications.
FIG. 2 shows a block diagram of components suitable for performing database replication using a bulk API with chunked transfer encoding.
FIG. 3 shows a flowchart illustrating a method of implementing a bulk API with chunked transfer encoding for cloud applications, according to some example embodiments.
FIG. 4 shows a flowchart illustrating a method of generating a request for a cloud application using a bulk API with chunked transfer encoding, according to some example embodiments.
FIG. 5 shows a flowchart illustrating a method of handling a request for a cloud application using a bulk API with chunked transfer encoding, according to some example embodiments.
FIG. 6 shows a flowchart illustrating a method of execution of a task created using a bulk API with chunked transfer encoding, according to some example embodiments.
FIG. 7 shows a block diagram showing one example of a software architecture for a computing device.
FIG. 8 shows a block diagram of a machine in the example form of a computer system within which instructions may be executed for causing the machine to perform any one or more of the methodologies discussed herein.

DETAILED DESCRIPTION

Example methods and systems are directed to a bulk API with chunked transfer encoding for cloud applications.
In a microservices application, independent small services provide specific functions. The small services communicate with each other using remote call protocols such as hypertext transfer protocol (HTTP) or Google remote procedure call (gRPC). Each service provides a set of APIs using representational state transfer (REST) to communicate with other services. REST is a valuable architectural style for microservices, thanks to its simplicity, flexibility, and scalability. However, the communication between services can be complex. The performance of communication between services appears to be the bottleneck of overall performance in microservices-based systems. Improving communication performance is therefore critical to achieving high performance in microservices-based systems.

The Replication Management Service (RMS) is a cloud-based application designed to replicate data for a variety of data sources. It provides a collection of RESTful APIs to facilitate the creation and execution of replication tasks. Task creation refers to the creation of a task definition with essential information. Task execution refers to the action of copying a table from the source to a designated target.

Task creation necessitates multiple API calls to create related objects. With potentially hundreds of tables requiring replication, a considerable number of HTTP requests would need to take place. In addition, if a failure occurs during creation, cleanup requires additional API calls to delete the objects already created. This presents a challenge for the performance and stability of RMS as a cloud application.

An RMS is a central service that manages replication task definitions such as source connection identifier (ID), target connection ID, source table metadata, target table metadata, and the like. The RMS orchestrates replication worker tasks and manages the replication task lifecycle.
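The contrast between per-object API calls with cleanup and a single chunked request can be illustrated with a minimal Python sketch. It frames four JSON chunks using standard HTTP/1.1 chunked transfer coding (hexadecimal length, CRLF, data, CRLF, terminated by a zero-length chunk) and creates a task only when all four chunks are present and share one identifier. The JSON field names (request_id, kind, name), the ReplicationTask class, and the function names are hypothetical and chosen for illustration only; they are not the RMS API.

```python
import json
from dataclasses import dataclass


def encode_chunked(chunks: list[bytes]) -> bytes:
    """Frame each chunk as '<hex length>CRLF<data>CRLF', ending with a zero-length chunk."""
    # Empty chunks are skipped because a zero-length chunk marks the end of the body.
    body = b"".join(b"%x\r\n%s\r\n" % (len(c), c) for c in chunks if c)
    return body + b"0\r\n\r\n"


def decode_chunked(body: bytes) -> list[bytes]:
    """Split a chunked body into its chunks (chunk extensions and trailers not handled)."""
    chunks, pos = [], 0
    while True:
        eol = body.index(b"\r\n", pos)
        size = int(body[pos:eol], 16)
        if size == 0:
            return chunks
        start = eol + 2
        chunks.append(body[start:start + size])
        pos = start + size + 2  # skip the CRLF that closes this chunk


@dataclass(frozen=True)
class ReplicationTask:  # hypothetical task definition
    source_space: str
    target_space: str
    source_table: str
    target_table: str


REQUIRED_KINDS = ("source_space", "target_space", "source_table", "target_table")


def create_task(body: bytes) -> ReplicationTask:
    """Create a task only if every required chunk arrived under one shared identifier."""
    parts = [json.loads(c) for c in decode_chunked(body)]
    if len({p["request_id"] for p in parts}) != 1:
        raise ValueError("chunks do not share one request identifier")
    by_kind = {p["kind"]: p["name"] for p in parts}
    missing = [k for k in REQUIRED_KINDS if k not in by_kind]
    if missing:
        # Nothing has been persisted yet, so this failure needs no cleanup calls.
        raise ValueError(f"missing chunks: {missing}")
    return ReplicationTask(**{k: by_kind[k] for k in REQUIRED_KINDS})


# Usage: one request body carrying four chunks (all values are made up).
chunks = [
    json.dumps({"request_id": "r1", "kind": kind, "name": name}).encode()
    for kind, name in [
        ("source_space", "SALES"),
        ("target_space", "ANALYTICS"),
        ("source_table", "ORDERS"),
        ("target_table", "ORDERS_COPY"),
    ]
]
task = create_task(encode_chunked(chunks))
```

Because validation happens before any object is created, a malformed or incomplete request fails as a whole, which is one way to avoid the follow-up delete calls described above. A real service would read chunks incrementally from the connection rather than from a complete byte string.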
As discussed herein, an operation- and transaction-based bulk API with chunked transfer encoding reduces the quantity of requests and the request body size. This improves error handling, thereby advancing overall performance and stability of cloud-based applications.

An RMS operates as a cloud-based application with a central role of replicating data across multiple source types. This is accomplished by deploying an assortment of RESTful APIs integral to generating replication tasks. For clarity, a "task" refers to the definition including essential information for transferring a single table's data from its original source to a designated target. The source and target can be any type of database or file supported by the RMS, such as Azure, HANA, Kafka, object store, or the like.

The RMS makes use of an RMS repository that stores both replication task definitions and replication worker task states. The RMS repository may be implemented as an in-memory database. RMS workers are pipeline graphs that are controlled by a pipeline service. The RMS workers execute replication tasks to perform d