US-20260127159-A1 - Mechanisms for Testing Updates to Database Query Optimizer System

US20260127159A1US 20260127159 A1US20260127159 A1US 20260127159A1US-20260127159-A1

Abstract

Techniques are disclosed that relate to capturing and replaying database queries to assess the impacts of updates to a database system. A system may receive a plurality of queries from a set of users to execute against a database that stores data. The system identifies one or more of the queries that are deemed relevant to updates being made to the database system. The system executes the received queries and captures query execution information for the one or more identified queries. The system replays, based on the execution information, the one or more queries using the database system with the one or more updates enabled to determine a first performance of the database system. The system may generate a report indicating whether the first performance represents a reduction in performance relative to a second performance of the database system with the one or more updates disabled.

Inventors

Prateek Swamy
Yi Xia
Bradley Glasbergen
Jesse Collins
Lars Hofhansl
Jeena Vinod
Vaira Selvakani
Alfa Savla
Jeffrey Freschl

Assignees

SALESFORCE, INC.

Dates

Publication Date: 20260507
Application Date: 20241107

Claims (20)

1 . A method, comprising: receiving, by a computer system from a set of users, a plurality of queries to execute against a database that stores data for the set of users; identifying, by the computer system, a subset of the plurality of queries that is deemed relevant to one or more updates being made to a database system of the computer system, wherein the identifying includes detecting that a given query of the subset of queries triggered a capture function inserted into code of the database system at a location associated with the one or more updates; executing, by the computer system, the plurality of queries, wherein the executing includes capturing query execution information corresponding to the subset of queries that enables the computer system to replay the subset of queries; replaying, by the computer system based on the query execution information, the subset of queries against the database with the one or more updates enabled; and based on the replaying, the computer system providing an indication of a first performance of the database system with the one or more updates enabled.
2 . The method of claim 1 , further comprising: replaying, by the computer system based on the query execution information, the subset of queries against the database with the one or more updates disabled to derive a second performance of the database system; comparing, by the computer system, the first performance of the database system with the one or more updates enabled with the second performance of the database system with the one or more updates disabled; and generating, by the computer system, a report indicating an effect of the one or more updates on the database system based on the comparison between the first and second performances.
3 . The method of claim 1 , wherein the capturing query execution information includes: storing, by the database system of the computer system, the query execution information in a memory buffer accessible to a client system of the computer system, wherein the client system is operable to access the query execution information and issue, as a part of the replaying, the subset of against the database system with the one or more updates enabled.
4 . The method of claim 3 , wherein the storing the query execution information includes: determining, via a bloom filter, whether the query execution information is already stored in the memory buffer to prevent multiple instances of the query execution information from being stored in the memory buffer.
5 . The method of claim 1 , wherein the plurality of queries include both read-only queries and queries that change the data stored in the database, and wherein the subset of queries includes only read-only queries.
6 . The method of claim 1 , further comprising: providing, by the computer system, an indication that one or more errors occurred when executing at least one of the subset of queries against the database with the one or more updates enabled.
7 . The method of claim 1 , wherein the database system includes a set of primary nodes and a set of standby nodes, and wherein the replaying the subset of queries against the database with the one or more updates enabled occurs using the set of standby nodes.
8 . The method of claim 7 , wherein the executing the plurality of queries occurs using the set of primary nodes.
9 . The method of claim 1 , wherein the query execution information specifies, for a particular one of the subset of queries, query text, parameter values, and a set of configuration values.
10 . A non-transitory computer-readable medium having program instructions stored thereon that are capable of causing a computer system to perform operations comprising: receiving a plurality of queries from a set of users to execute against a database that stores data for the set of users; identifying a subset of the plurality of queries that is deemed relevant to one or more updates being made to a database system of the computer system, wherein the identifying includes detecting that a given query of the subset of queries triggered a capture function inserted into code of the database system at a location associated with the one or more updates; executing the plurality of queries; capturing query execution information corresponding to the subset of queries that enables the computer system to replay the subset of queries; replaying the subset of queries with the one or more updates enabled to derive a first performance of the database system; and determining whether the first performance represents a reduction in performance relative to a second performance of the database system with the one or more updates disabled.
11 . The non-transitory computer-readable medium of claim 10 , wherein the operations further comprise, after the executing, replaying the subset of queries against the database with the one or more updates disabled to determine the second performance of the database system.
12 . The non-transitory computer-readable medium of claim 10 , wherein the operations further comprise determining the second performance of the database system as part of the executing the plurality of queries.
13 . The non-transitory computer-readable medium of claim 10 , wherein the subset of queries includes read-only queries and exclude any queries that change the data stored in the database.
14 . The non-transitory computer-readable medium of claim 10 , wherein the database system includes a set of primary nodes and a set of standby nodes, and wherein the replaying the subset of queries includes issuing the subset of queries to the set of standby nodes.
15 . The non-transitory computer-readable medium of claim 10 , wherein the operations further comprise storing the query execution information in a memory buffer accessible to a client system operable to access the query execution information and issue the subset of against the database system with the one or more updates enabled.
16 . A system, comprising: one or more processors; memory having program instructions stored therein that are executable by the one or more processors to cause the system to perform operations comprising: receiving a plurality of queries from a set of users to execute against a database that stores data for the set of users; identifying a subset queries of the plurality of queries that is deemed relevant to one or more updates being made to a database system of the system, wherein the identifying includes detecting that a given query of the subset of queries triggered a capture function inserted into code of the database system at a location associated with the one or more updates; executing the plurality of queries, wherein the executing includes capturing query execution information corresponding to the subset of queries; replaying, based on the query execution information, the subset of using the database system with the one or more updates enabled; and based on the replaying, determining a first performance of the database system with the one or more updates enabled; and generating a report indicating whether the first performance represents a reduction in performance relative to a second performance of the database system with the one or more updates disabled.
17 . The system of claim 16 , wherein the database system includes a set of primary nodes and a set of standby nodes, and wherein the replaying the subset of queries occurs on the set of standby nodes.
18 . The system of claim 16 , wherein the operations further comprise replaying the subset of queries using the database system with the one or more updates disabled to determine the second performance of the database system.
19 . The system of claim 16 , wherein only queries of the plurality of queries that do not change the data stored in the database are captured in the query execution information.
20 . The system of claim 16 , wherein operations further comprise: determining, via a bloom filter, whether the query execution information is already stored in a memory buffer that is accessible to a client system to prevent multiple instances of the query execution information from being stored in the memory buffer; and storing the query execution information in the memory buffer in response to determining that an instance of the query execution information is not already stored in the memory buffer.

Description

BACKGROUND Technical Field This disclosure relates generally to database systems, and more specifically, to mechanisms for capturing and replaying queries for testing and evaluating updates to a database system. Description of the Related Art Testing computer systems is an important process in software and hardware development that ensures that the computer systems perform as expected under various conditions. This process often involves validating functionality, performance, security, and/or reliability by executing tests designed to identify defects, inconsistencies, or potential vulnerabilities. Various types of testing, such as unit, integration, system, and user acceptance testing, can be employed to evaluate different aspects of a system. Through these tests, developers can detect and address issues early, ensuring that the system meets user requirements, operates efficiently, and minimizes the risk of failure in real-world scenarios. Effective system testing can ultimately improve the quality and robustness of a system before deployment. BRIEF DESCRIPTION OF THE DRAWINGS FIG. 1 is a block diagram illustrating one embodiment of a system capable of capturing and replaying queries to test and evaluate updates to a database system. FIG. 2 is a block diagram illustrating one embodiment of an overview process for testing a database system with updates enabled and disabled. FIG. 3 is a block diagram illustrating one embodiment relating to capturing query execution information that involves an exchange between a database system and a client system. FIG. 4 is a block diagram illustrating one embodiment of a query replay process to replay captured queries to test and evaluate updates to a database system. FIG. 5 is a block diagram illustrating one embodiment of a client system replaying queries on standby nodes of a database system. FIGS. 6A and 6B are flow diagrams illustrating example methods that implement techniques described herein. FIG. 7 is a block diagram illustrating one embodiment of an exemplary multi-tenant system for implementing various systems described herein. DETAILED DESCRIPTION Modern database systems normally include a component, referred to as a query optimizer, that is responsible for selecting efficient execution plans for processing queries, aiming to reduce response time and resource usage. The query optimizer analyzes various ways to execute a given query and attempts to determine the most efficient execution plan for that given query, considering factors like the structure of the database, available indexes, and the size of the data. Accordingly, the optimizer attempts to select the optimal approach by evaluating the cost of different strategies, such as choosing the optimal join order, access paths, and execution methods. By minimizing the resources (e.g., time and memory) required to process a query, the optimizer can enhance overall database performance, allowing for faster retrieval and manipulation of data. The effectiveness of a database system may thus largely depend on the quality of its query optimizer. But in large distributed database systems, updating the optimizer can present significant challenges due its scope and surface area. For example, while an update may improve performance for a particular execution plan, it can unintentionally degrade performance or cause wrong results in queries that are likely unrelated to the particular plan that was intended to be fixed by the update. Also, multiple tenants (e.g., individuals, companies, etc.) may utilize the same database system. In some cases, an update to the query optimizer improves performance for one tenant but degrades performance for other tenants whose data structures and query patterns are different. Accordingly, making changes to the query optimizer is risky due to the complexity of queries and the variety of data handled by different tenants of the database system. A/B testing and predefined test cases are often used to assess the impact of updates, but these methods can expose systems to performance risks and errors, such as in multi-tenant environments with complex and diverse queries. Existing testing methods also struggle to replicate real-world query behavior, leading to issues in production environments when updates are implemented. Accordingly, this disclosure addresses, among other things, the technical problem of how to test updates to a database system (or, more particularly, a query optimizer) in a manner that may reduce the risk of performance regressions and incorrect results occurring when the updates are deployed for users/tenants. The present disclosure discusses various techniques for testing and validating updates to a database system (e.g., updates to an optimizer) that involve capturing and replaying user-submitted queries. In various embodiments described below, a system includes a database system and a client system that issues queries to the database system to execute. The queries may be i