US-12619602-B2 - Propagating resource scaling information between source and target data stores of a materialized view

Abstract

A materialized view management service (MVMS) is capable of monitoring resource allocation changes of a source data object at a source data store and responsively generating resource change alerts to the owner of a target data object (the materialized view) in the target data store. Resource allocation changes may include autoscaling changes to the source data object's partition scheme, throughput limit, storage limit, and the like. The MVMS generates resource change alerts in response to these detected events and pushes the alerts to interested subscribers. Depending on the embodiment, the alerts may be pushed to human administrators, or the target data store itself, which may be configured to automatically adjust the resource allocation of the target data object to match the source data object. Advantageously, the disclosed alerts allow view owners to gain real time visibility of resource auto-scaling at the data source and appropriately react to such changes.
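
The alert flow in the abstract can be illustrated with a minimal Python sketch. It is not taken from the patent: the class names (`ResourceChange`, `ResourceChangeAlert`, `AlertPublisher`), the view and attribute names, and the rule that the target simply mirrors the source's new value are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class ResourceChange:
    """Hypothetical resource metadata reported by the source data store."""
    source_object: str
    attribute: str      # e.g. "write_throughput_limit" or "partition_count"
    old_value: int
    new_value: int


@dataclass(frozen=True)
class ResourceChangeAlert:
    """Hypothetical alert pushed to subscribers of a materialized view."""
    view_name: str
    change: ResourceChange
    suggested_value: int   # suggested allocation for the target data object


class AlertPublisher:
    """Fans out alerts to subscribers (administrators or the target data store)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[ResourceChangeAlert], None]]] = {}

    def subscribe(self, view_name: str,
                  callback: Callable[[ResourceChangeAlert], None]) -> None:
        self._subscribers.setdefault(view_name, []).append(callback)

    def on_source_change(self, view_name: str,
                         change: ResourceChange) -> ResourceChangeAlert:
        # Assumed rule: suggest that the target mirror the source's new value.
        alert = ResourceChangeAlert(view_name, change,
                                    suggested_value=change.new_value)
        for callback in self._subscribers.get(view_name, []):
            callback(alert)
        return alert


# Usage: a target-side handler that adjusts its own allocation automatically.
target_allocation = {"write_throughput_limit": 1000}


def auto_adjust(alert: ResourceChangeAlert) -> None:
    target_allocation[alert.change.attribute] = alert.suggested_value


publisher = AlertPublisher()
publisher.subscribe("orders_view", auto_adjust)
publisher.on_source_change(
    "orders_view",
    ResourceChange("orders", "write_throughput_limit",
                   old_value=1000, new_value=4000),
)
print(target_allocation)
```

A human administrator could subscribe with a callback that only logs or notifies, which corresponds to the other delivery option the abstract mentions.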

Inventors

  • Akshat Vig
  • Sharatkumar Nagesh Kuppahally
  • Bradley James Curran

Assignees

  • AMAZON TECHNOLOGIES, INC.

Dates

Publication Date
20260505
Application Date
20250102

Claims (20)

  1. A system, comprising: at least one processor; and a memory, storing program instructions that when executed on or across the at least one processor, cause the at least one processor to: create, in a target data store of a second type of database system, a materialized view of a source data object stored in a source data store of a first type of database system; receive, via a first interface, resource metadata about the source data object from the source data store, wherein the resource metadata indicates a change to a resource allocation of the source data object; determine, based at least in part on a view definition of the materialized view, to update a resource allocation of the materialized view to handle an amount of data from the source data store after the change to the resource allocation of the source data object; and generate, via a second interface, a resource change alert based at least in part on the change to the resource allocation of the source data object, wherein the resource change alert includes an instruction that specifies the update to a resource allocation of the materialized view in response to the change to the resource allocation of the source data object.
  2. The system of claim 1, further comprising: program instructions stored on the memory or other memory, that when executed on or across the at least one processor or another processor, cause a target connector to: obtain changes to be made to the materialized view; and perform one or more translations on the changes, comprising: data-type conversion, operation conversion, or generation of one or more parameters needed to perform an update request to make a corresponding change in the materialized view.
  3. The system of claim 1, further comprising: program instructions stored on the memory or other memory, that when executed on or across the at least one processor or another processor, cause a target connector to, for one or more obtained changes to be made to the materialized view: enforce one or more ordering constraints; perform deduplication; or track whether the changes are performed successfully.
  4. The system of claim 1, wherein the source data object is stored in a source data store of a first type of database system of a service provider network that provides at least compute and storage services, comprising the first type of database system, over a service provider network to clients; and wherein the materialized view of the source data object is stored in a target data store of a second type of database system of a client premise network.
  5. The system of claim 1, further comprising: a service provider network configured to provide compute and storage service, comprising the first and second types of database systems, over a service provider network to clients; wherein the source data store and the target data store are implemented on the service provider network, via the first and second types of database systems.
  6. The system of claim 1, wherein: the at least one processor and the memory storing program instructions are provisioned in a network-accessible service provider network; and at least one of the first data store and the second data store is provisioned outside the network-accessible service provider network.
  7. The system of claim 1, wherein the change to the resource allocation of the source data object includes a change to one or more of: a partition or sharding scheme of the source data object; an amount of compute or storage nodes allocated to the source data object; a write throughput limit of the source data object; and a data storage limit of the source data object.
  8. A method, comprising: performing, by one or more processors and associated memory: creating, in a target data store of a second type of database system, a materialized view of a source data object stored in a source data store of a first type of database system; receiving, via a first interface, resource metadata about the source data object from the source data store, wherein the resource metadata indicates a change to a resource allocation of the source data object; determining, based at least in part on a view definition of the materialized view, to update a resource allocation of the materialized view to handle an amount of data from the source data store after the change to the resource allocation of the source data object; and generating, via a second interface, a resource change alert based at least in part on the change to the resource allocation of the source data object, wherein the resource change alert includes an instruction that specifies the update to a resource allocation of the materialized view in response to the change to the resource allocation of the source data object.
  9. The method of claim 8, further comprising: obtaining, by a target connector, changes to be made to the materialized view; and performing, by the target connector, translation on the changes, comprising: data-type conversion, operation conversion, or generation of one or more parameters needed to perform an update request to make a corresponding change in the materialized view.
  10. The method of claim 8, further comprising, for one or more obtained changes to be made to the materialized view: enforcing one or more ordering constraints; performing deduplication; or tracking whether the changes are performed successfully.
  11. The method of claim 8, wherein the source data object is stored in the source data store of the first type of database system of a service provider network that provides at least compute and storage services, comprising the first type of database system, over a service provider network to clients; and wherein the materialized view of the source data object is stored in a target data store of a second type of database system of a client premise network.
  12. The method of claim 8, wherein the change to the resource allocation of the source data object includes a change to one or more of: a partition or sharding scheme of the source data object; an amount of compute or storage nodes allocated to the source data object; a write throughput limit of the source data object; and a data storage limit of the source data object.
  13. The method of claim 8, further comprising: identifying the resource metadata in a stream of data changes on the source data; detecting, from the resource metadata, a change in a partition scheme of the source data object; changing, in response to the detection of the partition scheme change, a stream partition scheme used to send view data changes to the target data store; and indicating the change of the stream partition scheme in the resource change alert.
  14. The method of claim 8, further comprising: tracking one or more performance metrics of the materialized view, including a backlog metric or a latency metric; and indicating the one or more performance metrics in the resource change alert.
  15. One or more non-transitory computer-readable storage media storing program instructions that when executed on or across one or more computing devices cause the computing devices to: create, in a target data store of a second type of database system, a materialized view of a source data object stored in a source data store of a first type of database system; receive, via a first interface, resource metadata about the source data object from the source data store, wherein the resource metadata indicates a change to a resource allocation of the source data object; determine, based at least in part on a view definition of the materialized view, to update a resource allocation of the materialized view to handle an amount of data from the source data store after the change to the resource allocation of the source data object; and generate, via a second interface, a resource change alert based at least in part on the change to the resource allocation of the source data object, wherein the resource change alert includes an instruction that specifies the update to a resource allocation of the materialized view in response to the change to the resource allocation of the source data object.
  16. The one or more non-transitory computer-readable storage media of claim 15, wherein the program instructions, when executed, implement a target connector to: obtain changes to be made to the materialized view; perform target-specific translation on the changes, comprising: data-type conversion, operation conversion, or generation of one or more parameters needed to perform an update request to make a corresponding change in the materialized view.
  17. The one or more non-transitory computer-readable storage media of claim 15, wherein the program instructions, when executed, implement a target connector to, for one or more obtained changes to be made to the materialized view: enforce one or more ordering constraints; perform deduplication; or track whether the changes are performed successfully.
  18. The one or more non-transitory computer-readable storage media of claim 15, wherein the program instructions, when executed, cause the resource change alert to be generated via a programmatic interface subscribed to by the target data store.
  19. The one or more non-transitory computer-readable storage media of claim 15, wherein the program instructions when executed on or across the one or more computing devices to: detect, from the resource metadata, a change in a partition scheme of the source data object; in response to the detection of partition scheme change, change a stream partition scheme used to send the view data changes to the target data store; and indicate the change of the stream partition scheme in the resource change alert.
  20. The one or more non-transitory computer-readable storage media of claim 15, wherein the resource change alert is generated based at least in part on a configured policy of the materialized view.
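
Claims 3, 10, and 17 recite a target connector that enforces ordering constraints, performs deduplication, or tracks whether changes are performed successfully. The Python sketch below shows one possible way these three duties could fit together. It is an illustration under stated assumptions, not the patented implementation: the `ViewChange` record, the per-key `sequence` number, and the `TargetConnector` class are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Set


@dataclass(frozen=True)
class ViewChange:
    """Hypothetical change record destined for the materialized view."""
    key: str
    sequence: int   # assumed per-key, monotonically increasing sequence number
    value: str


@dataclass
class TargetConnector:
    """Applies changes per key in sequence order and skips duplicates."""
    apply: Callable[[ViewChange], None]   # writes one change to the target store
    last_applied: Dict[str, int] = field(default_factory=dict)
    failed: List[ViewChange] = field(default_factory=list)

    def process(self, changes: Iterable[ViewChange]) -> int:
        applied = 0
        blocked: Set[str] = set()   # keys with a failed change in this batch
        # Ordering constraint: apply each key's changes in sequence order.
        for change in sorted(changes, key=lambda c: (c.key, c.sequence)):
            if change.key in blocked:
                # Do not overtake an earlier failed change for the same key.
                self.failed.append(change)
                continue
            # Deduplication: skip anything already applied for this key.
            if change.sequence <= self.last_applied.get(change.key, -1):
                continue
            try:
                self.apply(change)
            except Exception:
                # Success tracking: remember the change so it can be retried.
                self.failed.append(change)
                blocked.add(change.key)
                continue
            self.last_applied[change.key] = change.sequence
            applied += 1
        return applied


# Usage: an in-memory dict stands in for the target data store.
target: Dict[str, str] = {}
connector = TargetConnector(apply=lambda c: target.__setitem__(c.key, c.value))
batch = [
    ViewChange("a", 2, "a2"),
    ViewChange("a", 1, "a1"),
    ViewChange("a", 2, "a2"),   # duplicate delivery
    ViewChange("b", 1, "b1"),
]
applied_count = connector.process(batch)
print(applied_count, target)
```

Tracking the last applied sequence number per key makes redelivery harmless, at the cost of assuming the source assigns such sequence numbers.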

Description

PRIORITY CLAIM

This application is a continuation of U.S. patent application Ser. No. 17/548,373, filed Dec. 10, 2021, which is hereby incorporated by reference herein in its entirety.

BACKGROUND

Customers of cloud-based service provider networks frequently use purpose-built databases to build applications. These purpose-built data stores may be implemented within the same cloud or across multiple clouds. Typically, customers building these applications write custom code to move data from one data store to another, which requires long-term maintenance. As organizations accelerate the growth of application data in source data stores, they find that the custom-built solutions to move data become less reliable and do not scale with the needs of their business. Reliability issues often lead to data backlog in the pipeline, which incurs additional developer and scaling costs. Processing terabyte-scale data sets moving at thousands of requests per second per pipeline with these custom solutions requires significant upfront planning and ongoing management of infrastructure at the source, targets and data transformation pipeline. As more and more of these data pipelines are built, the aggregate throughput can reach hundreds of millions of requests per second and the aggregate size of the pipelines can reach multiple petabytes. At such scale, the building, evolution, and operation of these data pipelines become significant challenges for the typical customer.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 illustrates a view management system that propagates resource scaling information between a source data store and a target data store of a materialized view, according to some embodiments.

FIG. 2 illustrates a service provider network offering a materialized view management service and other services, including various data storage and processing services that generates materialized views according to received view, according to some embodiments.

FIG. 3 illustrates an embodiment of a materialized view management service that implements managed materialized views created from data sources, according to some embodiments.

FIG. 4 is a sequence diagram illustrating a process of propagating resource scaling information from a source data store of a materialized view to a target data store, according to some embodiments.

FIG. 5 illustrates a materialized view management service that allows users to configure custom resource change alert policies for materialized views, according to some embodiments.

FIGS. 6A and 6B illustrate example resource change alert policies for materialized views used by a materialized view management service, according to some embodiments.

FIG. 7 illustrates various functionalities associated with configuration of a materialized view managed by a materialized view management service, according to some embodiments.

FIG. 8 is a flowchart illustrating a process of propagating resource scaling information from a source data store to a target data store of a materialized view, according to some embodiments.

FIG. 9 is a flowchart illustrating a process of adding source and target data objects to a materialized view managed by a materialized view management service, according to some embodiments.

FIG. 10 illustrates an example computer system configured to implement one or more portions of the materialized view management service described herein, according to some embodiments.

While embodiments are described herein by way of example for several embodiments and illustrative drawings, those skilled in the art will recognize that embodiments are not limited to the embodiments or drawings described. It should be understood that the drawings and detailed description thereto are not intended to limit embodiments to the particular form disclosed, but on the contrary, the intention is to cover all modifications, equivalents and alternatives falling within the spirit and scope as defined by the appended claims. The headings used herein are for organizational purposes only and are not meant to be used to limit the scope of the description or the claims. As used throughout this application, the word “may” is used in a permissive sense (i.e., meaning having the potential to), rather than the mandatory sense (i.e., meaning must). Similarly, the words “include,” “including,” and “includes” mean including, but not limited to.

It will also be understood that, although the terms first, second, etc. may be used herein to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element from another. For example, a first contact could be termed a second contact, and, similarly, a second contact could be termed a first contact, without departing from the scope of the present invention. The first contact and the second contact are both contacts, but they are not the same contact.

DETAILED DESCRIPTION OF EMBODIMENTS

Customers of cloud-based service provider networks