US-12619631-B2 - Caching systems and methods
Abstract
Example caching systems and methods are described. In one implementation, a method receives, at an execution platform, a query directed to data stored across a plurality of shared storage devices, the execution platform comprising one or more execution nodes, an execution node comprising a plurality of processors. The method processes the query using the one or more execution nodes of the execution platform and, in response to a determination of a change in a number of execution nodes of the execution platform, wherein the change is creating a new execution node, wherein a first subset of the plurality of processors comprises a minimal cache and a second subset of the plurality of processors comprises a cache providing faster input-output operations, reassigns processing of the query among the changed number of execution nodes of the execution platform.
Inventors
- Benoit Dageville
- Thierry Cruanes
- Marcin Zukowski
Assignees
- SNOWFLAKE INC.
Dates
- Publication Date: 2026-05-05
- Application Date: 2023-09-28
Claims (20)
- 1 . A method comprising: receiving, at an execution platform of a data processing platform, a query directed to data stored across a plurality of shared storage devices in a multi-tenant database that isolates computing resources and data storage resources between different customers, wherein the execution platform comprises a plurality of virtual warehouses including a virtual warehouse; selecting a computing system from amongst a plurality of computing systems to implement a first execution node in the virtual warehouse based on communication capabilities of networks within a geographic location, wherein the first execution node comprises a first cache, wherein the first cache stores at least a portion of the data, wherein each of the plurality of virtual warehouses is able to access all of a plurality of data storage resources of a storage platform that is independent from the execution platform, and wherein the virtual warehouse is able to access a data storage resource in the plurality of data storage resources at a same time as a second virtual warehouse in the plurality of virtual warehouses; accessing, by a resource manager of the data processing platform and via a communication link implemented using a data communication network, a metadata database storing metadata that includes information regarding how the at least the portion of the data is organized in the first cache and in the storage platform, wherein the metadata is stored separately from the execution platform, and wherein the metadata identifies a subset of a plurality of rows of a table associated with the data and a subset of a plurality of columns associated with the data; processing, based on the accessed metadata, the query using the first execution node of the execution platform to identify expected tasks of the first execution node related to the query; creating a second execution node in the virtual warehouse based on the expected tasks of the first execution node related to the query,
wherein the second execution node comprises a second cache, wherein a size of the first cache differs from a size of the second cache, wherein a speed of input-output operations of the first cache differs from a speed of input-output operations of the second cache; and in response to creating the second execution node, reassigning, by a processing device, processing of at least a portion of the query to the second execution node based on at least one of the size of the first cache, the size of the second cache, the speed of the input-output operations of the first cache, or the speed of the input-output operations of the second cache.
- 2 . The method of claim 1 , wherein the multi-tenant database comprises a set of tables.
- 3 . The method of claim 2 , wherein at least one table of the set of tables is encrypted and is subsequently decrypted before executing the query.
- 4 . The method of claim 2 , wherein at least one table of the set of tables is compressed and is subsequently decompressed before executing the query.
- 5 . The method of claim 2 , wherein: the first execution node comprises a plurality of processors and each processor of the plurality of processors processes one table of the set of tables; and data from the processed one table of the set of tables is stored in a cache associated with a processor.
- 6 . The method of claim 1 , wherein the multi-tenant database is a relational database.
- 7 . The method of claim 6 , wherein the relational database is a structured query language database.
- 8 . The method of claim 1 , wherein the query is received from a client, the method further comprising: generating a set of results; and returning the set of results to the client.
- 9 . The method of claim 1 , further comprising: optimizing the query.
- 10 . The method of claim 1 , wherein processing the query is based at least in part on a set of statistics.
- 11 . The method of claim 10 , wherein the set of statistics is automatically accumulated.
- 12 . The method of claim 10 , wherein the set of statistics is automatically updated.
- 13 . The method of claim 1 , wherein the metadata includes a summary of the at least the portion of the data.
- 14 . The method of claim 1 , wherein selecting the computing system from amongst the plurality of computing systems to implement the first execution node is additionally based on first communication capabilities of first networks between geographic locations.
- 15 . The method of claim 1 , wherein selecting the computing system from amongst the plurality of computing systems to implement the first execution node is additionally based on which computing systems in the plurality of computing systems are currently implementing other execution nodes.
- 16 . A system comprising: a memory; and a processing device operatively coupled to the memory, the processing device to: receive, at an execution platform of a data processing platform, a query directed to data stored across a plurality of shared storage devices in a multi-tenant database that isolates computing resources and data storage resources between different customers, wherein the execution platform comprises a plurality of virtual warehouses including a virtual warehouse; select a computing system from amongst a plurality of computing systems to implement a first execution node in the virtual warehouse based on communication capabilities of networks within a geographic location, wherein the first execution node comprises a first cache, wherein the first cache stores at least a portion of the data, wherein each of the plurality of virtual warehouses is able to access all of a plurality of data storage resources of a storage platform that is independent from the execution platform, and wherein the virtual warehouse is able to access a data storage resource in the plurality of data storage resources at a same time as a second virtual warehouse in the plurality of virtual warehouses; access, by a resource manager of the data processing platform and via a communication link implemented using a data communication network, a metadata database storing metadata that includes information regarding how the at least the portion of the data is organized in the first cache and in the storage platform, wherein the metadata is stored separately from the execution platform, and wherein the metadata identifies a subset of a plurality of rows of a table associated with the data and a subset of a plurality of columns associated with the data; process, based on the accessed metadata, the query using the first execution node of the execution platform to identify expected tasks of the first execution node related to the query; create a second execution node in the virtual 
warehouse based on the expected tasks of the first execution node related to the query, wherein the second execution node comprises a second cache, wherein a size of the first cache differs from a size of the second cache, wherein a speed of input-output operations of the first cache differs from a speed of input-output operations of the second cache; and in response to the creation of the second execution node, reassign processing of at least a portion of the query to the second execution node based on at least one of the size of the first cache, the size of the second cache, the speed of the input-output operations of the first cache, or the speed of the input-output operations of the second cache.
- 17 . The system of claim 16 , wherein the first cache includes a memory device and a disk storage device.
- 18 . A non-transitory computer-readable medium having instructions stored thereon that, when executed by a processing device, cause the processing device to: receive, at an execution platform of a data processing platform, a query directed to data stored across a plurality of shared storage devices in a multi-tenant database that isolates computing resources and data storage resources between different customers, wherein the execution platform comprises a plurality of virtual warehouses including a virtual warehouse; select a computing system from amongst a plurality of computing systems to implement a first execution node in the virtual warehouse based on communication capabilities of networks within a geographic location, wherein the first execution node comprises a first cache, wherein the first cache stores at least a portion of the data, wherein each of the plurality of virtual warehouses is able to access all of a plurality of data storage resources of a storage platform that is independent from the execution platform, and wherein the virtual warehouse is able to access a data storage resource in the plurality of data storage resources at a same time as a second virtual warehouse in the plurality of virtual warehouses; access, by a resource manager of the data processing platform and via a communication link implemented using a data communication network, a metadata database storing metadata that includes information regarding how the at least the portion of the data is organized in the first cache and in the storage platform, wherein the metadata is stored separately from the execution platform, and wherein the metadata identifies a subset of a plurality of rows of a table associated with the data and a subset of a plurality of columns associated with the data; process, based on the accessed metadata, the query using the first execution node of the execution platform to identify expected tasks of the first execution node related to the query; create a
second execution node in the virtual warehouse based on the expected tasks of the first execution node related to the query, wherein the second execution node comprises a second cache, wherein a size of the first cache differs from a size of the second cache, wherein a speed of input-output operations of the first cache differs from a speed of input-output operations of the second cache; and in response to the creation of the second execution node, reassign, by the processing device, processing of at least a portion of the query to the second execution node based on at least one of the size of the first cache, the size of the second cache, the speed of the input-output operations of the first cache, or the speed of the input-output operations of the second cache.
- 19 . The non-transitory computer-readable medium of claim 18 , wherein the query comprises a set of tables.
- 20 . The non-transitory computer-readable medium of claim 19 , wherein at least one table of the set of tables is encrypted and is subsequently decrypted before executing the query.
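Claims 1, 16, and 18 reassign part of a query between execution nodes based on relative cache size and cache input-output speed, but they do not recite a specific placement policy. The sketch below is one hypothetical heuristic consistent with that language; the `ExecutionNode` fields, the task dictionaries, and the working-set threshold are assumptions made for illustration, not details taken from the patent.

```python
from dataclasses import dataclass, field

@dataclass
class ExecutionNode:
    name: str
    cache_size_gb: float   # capacity of the node-local cache
    cache_io_mbps: float   # I/O speed of the node-local cache
    assigned_tasks: list = field(default_factory=list)

def reassign(tasks, nodes):
    """Greedily place each task on the node whose cache best fits it.

    A task whose working set overflows the smallest cache favors cache
    capacity; otherwise the task favors cache I/O speed.
    """
    for task in tasks:
        if task["working_set_gb"] > min(n.cache_size_gb for n in nodes):
            best = max(nodes, key=lambda n: n.cache_size_gb)
        else:
            best = max(nodes, key=lambda n: n.cache_io_mbps)
        best.assigned_tasks.append(task["name"])
    return nodes
```

Under this policy, a large table scan lands on the node with the big cache while a small lookup lands on the node with the fast cache, mirroring the claims' contrast between cache size and cache input-output speed.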
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
This application is a continuation of co-pending U.S. patent application Ser. No. 14/518,971, filed Oct. 20, 2014, which claims the benefit under 35 U.S.C. § 119(e) of U.S. Provisional Patent Application No. 61/941,986, filed Feb. 19, 2014, both of which are hereby incorporated by reference herein in their entirety.
TECHNICAL FIELD
The present disclosure relates to resource management systems and methods that manage the caching of data.
BACKGROUND
Many data storage and retrieval systems are available today. For example, in a shared-disk system, all data is stored on a shared storage device that is accessible from all of the processing nodes in a data cluster. In this type of system, all data changes are written to the shared storage device to ensure that all processing nodes in the data cluster access a consistent version of the data. As the number of processing nodes increases in a shared-disk system, the shared storage device (and the communication links between the processing nodes and the shared storage device) becomes a bottleneck that slows data read and data write operations, and each additional processing node aggravates the bottleneck further. Thus, existing shared-disk systems have limited scalability. Another existing data storage and retrieval system is referred to as a “shared-nothing architecture.” In this architecture, data is distributed across multiple processing nodes such that each node stores a subset of the data in the entire database. When a processing node is added or removed, the shared-nothing architecture must rearrange data across the multiple processing nodes. This rearrangement can be time-consuming and disruptive to data read and write operations executed while it is under way. Additionally, the affinity of data to a particular node can create “hot spots” in the data cluster for popular data.
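The rearrangement cost described in the background can be made concrete. The sketch below assumes naive modulo-based partitioning, a deliberate simplification (real shared-nothing systems use more elaborate placement schemes): growing a cluster from four to five nodes changes the home node of most keys, which is exactly the data movement the passage above describes.

```python
def node_for_key(key: int, num_nodes: int) -> int:
    # Naive hash partitioning: each key lives on exactly one node.
    return key % num_nodes

def moved_fraction(keys, old_nodes: int, new_nodes: int) -> float:
    # Fraction of keys whose home node changes when the cluster resizes.
    moved = sum(1 for k in keys
                if node_for_key(k, old_nodes) != node_for_key(k, new_nodes))
    return moved / len(keys)

# Growing from 4 to 5 nodes relocates 80% of the keys.
print(moved_fraction(range(10_000), 4, 5))  # 0.8
```

Schemes such as consistent hashing reduce this churn, but avoiding it entirely is one motivation for decoupling storage from compute as described in this disclosure.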
Further, since each processing node also performs the storage function, this architecture requires at least one processing node to store data; the shared-nothing architecture therefore fails to store data if all processing nodes are removed. Additionally, management of data in a shared-nothing architecture is complex because the data is distributed across many different processing nodes. The systems and methods described herein provide an improved approach to data storage and data retrieval that alleviates the above-identified limitations of existing systems.
BRIEF DESCRIPTION OF THE DRAWINGS
Non-limiting and non-exhaustive embodiments of the present disclosure are described with reference to the following figures, wherein like reference numerals refer to like parts throughout the various figures unless otherwise specified.
FIG. 1 is a block diagram depicting an example embodiment of the systems and methods described herein.
FIG. 2 is a block diagram depicting an embodiment of a resource manager.
FIG. 3 is a block diagram depicting an embodiment of an execution platform.
FIG. 4 is a block diagram depicting an example operating environment with multiple users accessing multiple databases through multiple virtual warehouses.
FIG. 5 is a block diagram depicting another example operating environment with multiple users accessing multiple databases through a load balancer and multiple virtual warehouses contained in a virtual warehouse group.
FIG. 6 is a block diagram depicting another example operating environment having multiple distributed virtual warehouses and virtual warehouse groups.
FIG. 7 is a flow diagram depicting an embodiment of a method for managing data storage and retrieval operations.
FIG. 8 is a flow diagram depicting an embodiment of a method for managing a data cache.
FIG. 9 is a block diagram depicting an example computing device.
DETAILED DESCRIPTION
The systems and methods described herein provide a new platform for storing and retrieving data without the problems faced by existing systems. For example, the new platform supports the addition of new nodes without rearranging data files, as the shared-nothing architecture requires, and nodes can be added without creating the bottlenecks that are common in shared-disk systems. The platform remains available for data read and data write operations even when some of the nodes are offline for maintenance or have suffered a failure. The described platform separates the data storage resources from the computing resources, so data can be stored without requiring the use of dedicated computing resources. This is an improvement over the shared-nothing architecture, which fails to store data if all computing resources are removed: the new platform continues to store data even when the computing resources are no longer available or are performing other tasks. In the following description, reference is made to the accompanying drawings that form a part hereof, and
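Claim 1's metadata, which identifies a subset of the rows and columns associated with the data, resembles file-level pruning metadata kept separately from the execution platform. The sketch below is an assumed illustration of how such metadata could let a resource manager skip files before any node touches the shared storage platform; the `FILE_METADATA` layout and the `prune` signature are invented for this example, not taken from the patent.

```python
# Hypothetical per-file metadata: the minimum and maximum value of one
# column in each data file stored on the shared storage platform.
FILE_METADATA = [
    {"file": "part-0", "col_min": 0,   "col_max": 99},
    {"file": "part-1", "col_min": 100, "col_max": 199},
    {"file": "part-2", "col_min": 200, "col_max": 299},
]

def prune(metadata, lo, hi):
    """Return only the files whose value range overlaps [lo, hi]."""
    return [m["file"] for m in metadata
            if m["col_max"] >= lo and m["col_min"] <= hi]

# A predicate on values 150..250 needs only two of the three files.
print(prune(FILE_METADATA, 150, 250))  # ['part-1', 'part-2']
```

Because the metadata lives outside the execution platform, this pruning can happen before task assignment, shrinking both the shared-storage reads and the portion of the data each node-local cache must hold.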