US-12619608-B1 - Query system

US12619608B1US 12619608 B1US12619608 B1US 12619608B1US-12619608-B1

Abstract

A system and method for efficient query processing using a real index of a queried table are described. In one embodiment, the real index is used in an offset query type in order to reduce the number of rows that are sorted and thereby increases efficiency for processing offset query types. In another embodiment, the real index is used in a set operation query type where existing systems utilize a table scan and thereby increases efficiency of set operation query types.

Inventors

Raja Sekhar Chunduru

Assignees

PROGRESS SOFTWARE CORPORATION

Dates

Publication Date: 20260505
Application Date: 20231027

Claims (20)

1 . A computer-implemented method comprising: receiving, using one or more processors, a set operation query including: a join type operator, a union type operator, an identification of a first queried table, a second queried table, and a third queried table, and wherein the first query table and the second queried table are unioned tables; adding, using the one or more processors, a plurality of set operator nodes to a query tree, the plurality of set operator nodes including a first set operator node associated with the join type operator and a second set operator node associated with the union type operator; inserting, using the one or more processors, a first retrieval node into the query tree associated with the first queried table, a second retrieval node associated with the second queried table, and a third retrieval node associated with the third queried table, wherein the first retrieval node corresponds to an index scan of the first queried table using one or more real indices associated with the first queried table, wherein the second retrieval node is associated with the second queried table and one of: a same order of index scan as the first retrieval node, a different order of index scan than that of the first retrieval node, and a table scan, and wherein the second set operator node, the first retrieval node, and the second retrieval node comprise a union subtree; and selecting, using the one or more processors, the query tree based on a cost, wherein a portion of the cost associated with the union subtree of the query tree is based on a sum of one or more real indices associated with the union subtree.
2 . The computer-implemented method of claim 1 including: determining a largest order real index among the unioned tables associated with the second set operator node associated with the union type operator; and responsive to determining the first retrieval node associated with the first queried table is associated with a higher order index than the second retrieval node, inserting a restriction node between the second retrieval node and the second set operator node associated with the union type operator.
3 . The computer-implemented method of claim 2 , wherein the restriction node has an order equal to the difference between the largest order real index among the unioned tables associated with the second set operator node and the order of the real index associated with the first queried table.
4 . The computer-implemented method of claim 2 , wherein the second retrieval node is a table scan retrieval node when the second queried table lacks a real index and is an index scan node when the second queried table includes a real index.
5 . The computer-implemented method of claim 1 , wherein the union type operator is one or more of a union and union all.
6 . The computer-implemented method of claim 1 , wherein the join type operator is associated with a root node in the query tree.
7 . The computer-implemented method of claim 1 , wherein the second set operator node associated with the union type operator, when executed, uses a dynamically generated virtual index, wherein the dynamically generated virtual index does not persist after the set operation query is processed.
8 . The computer-implemented method of claim 1 , wherein the join type operator includes one or more of an inner join, left outer join, right outer join, full outer join, and cross join.
9 . The computer-implemented method of claim 1 including: calculating the cost associated with the query tree, wherein the second set operator node associated with the union type operator uses a virtual index.
10 . The computer-implemented method of claim 1 including: determining, responsive to comparing the cost associated with the query tree to a cost associate with another query tree, that the cost associated with the query tree has a lower cost; and executing the query based on the query tree.
11 . A system comprising: one or more processors; and a memory storing instructions that, when executed by the one or more processors, cause the system to: receive a set operation query including: a join type operator, a union type operator, an identification of a first queried table, a second queried table, and a third queried table, and wherein the first query table and the second queried table are unioned tables; add a plurality of set operator nodes to a query tree, the plurality of set operator nodes including a first set operator node associated with the join type operator and a second set operator node associated with the union type operator; insert a first retrieval node into the query tree associated with the first queried table, a second retrieval node associated with the second queried table, and a third retrieval node associated with the third queried table, wherein first retrieval node corresponds to an index scan of the first queried table using one or more real indices associated with the first queried table, wherein the second retrieval node is associated with the second queried table and one of: a same order of index scan as the first retrieval node, a different order of index scan than that of the first retrieval node, and a table scan, and wherein the second set operator node, the first retrieval node, and the second retrieval node comprise a union subtree; and select the query tree based on a cost, wherein a portion of the cost associated with the union subtree of the query tree is based on a sum of one or more real indices associated with the union subtree.
12 . The system of claim 11 , wherein the instructions, when executed, cause the system to: determine a largest order real index among the unioned tables associated with second set operator node associated with the union type operator; and responsive to determining the first retrieval node associated with the first queried table is associated with a higher order index than the second retrieval node, insert a restriction node between the second retrieval node and the second set operator node associated with the union type operator.
13 . The system of claim 12 , wherein the restriction node has an order equal to the difference between the largest order real index among the unioned tables associated with the second set operator node and the order of the real index associated with the first queried table.
14 . The system of claim 12 , wherein the second retrieval node is a table scan retrieval node when the second queried table lacks a real index and is an index scan node when the second queried table includes a real index.
15 . The system of claim 11 , wherein the union type operator is one or more of a union and union all.
16 . The system of claim 11 , wherein the join type operator is associated with a root node in the query tree.
17 . The system of claim 11 , wherein the second set operator node associated with the union type operator, when executed, uses a dynamically generated virtual index, wherein the dynamically generated virtual index does not persist after the set operation query is processed.
18 . The system of claim 11 , wherein the join type operator includes one or more of an inner join, left outer join, right outer join, full outer join, and cross join.
19 . The system of claim 11 , wherein the instructions, when executed, cause the system to: calculate the cost associated with the query tree, wherein the second set operator node associated with the union type operator uses a virtual index.
20 . The system of claim 11 , wherein the instructions, when executed, cause the system to: determining, responsive to comparing the cost associated with the query tree to a cost associate with another query tree, that the cost associated with the query tree has a lower cost; and executing the query based on the query tree.

Description

CROSS-REFERENCE TO RELATED APPLICATIONS The present application is a continuation of and claims priority to U.S. patent application Ser. No. 16/875,773, filed May 15, 2020, titled “Query System,” which is a continuation of and claims priority to U.S. patent application Ser. No. 14/189,199, filed Feb. 25, 2014, titled “Query System,” which claims priority, under 35 U.S.C. § 119, of U.S. Provisional Patent Application No. 61/791,716, filed Mar. 15, 2013 and entitled “Query System,” the entirety of which is hereby incorporated by reference. Applicants hereby notify the USPTO that the claims of the present application are different from those of the parent application and any other related applications. Therefore, Applicants rescind any disclaimer of claim scope made in the parent application or any other predecessor application in relation to the present application. The Examiner is therefore advised that any such disclaimer and the cited reference that it was made to avoid may need to be revisited at this time. Furthermore, the Examiner is also reminded that any disclaimer made in the present application should not be read into or against the parent application, the grandparent application or any other related application FIELD OF INVENTION The present disclosure relates to query systems. Specifically, the present disclosure relates to providing optimization and efficient processing of queries. BACKGROUND Over the years, the cost of storing data has been reduced as memory technologies have advanced. This reduction in the cost of storing data coincides with a trend to store more data in databases as well as a trend to store data in larger databases. With a growth in the size of databases, the need for efficient methods and systems for locating and retrieving, or otherwise accessing, data therefrom is of increasing importance. Accordingly, what is needed are efficient methods for query processing and optimization. Data access may use mechanisms such as a table scans and index scans. Generally, an index scan is faster and more efficient than a table scan. Therefore, what is needed are methods for query processing that utilize an index scan, or similar mechanism, to provide greater efficiency, whether in terms of time or hardware resources (e.g. processing, bandwidth, etc.), to process a given query. SUMMARY In general, an innovative aspect of the subject matter described in this disclosure may be embodied in methods that include responsive to receiving an offset query, index scanning, using one or more processors, a queried table on a first column, wherein the first column comprises an index of the queried table; determining, using the one or more processors, a subset of one or more rows from the index scanned query table; sorting, using the one or more processors, the subset of rows based in part on a second column; and determining, using the one or more processors, based on the offset query, one or more output rows to be fetched from the sorted subset of rows. In general, another innovative aspect of the subject matter described in this disclosure may be embodied in methods that include responsive to receiving an offset query, index scanning, using one or more processors, a queried table on a first sort key, the index scan generating a first indexed table, the first indexed table ordered according to the first sort key, the offset query including an offset value and a fetch value; for each row in the first indexed table beginning with a first row of the first indexed table and proceeding sequentially through the rows to a final row: (1) retrieving, using the one or more processors, a row from the indexed table; and (2) determining, using the one or more processors, the row's position in the indexed table relative to an offset value and a fetch value and determining whether to add the row to a working set based in part on the row's position prior to retrieving a next row; responsive to determining the final row is reached, sorting, using the one or more processors, the working set based in part on a second sort key; and determining, using the one or more processors, based on the offset query and skipped row count, one or more output rows to be fetched from the sorted working set. In general, another innovative aspect of the subject matter described in this disclosure may be embodied in methods that include receiving, using one or more processors, a set operation query including at least one join type operator and at least on union type operator and identifying at least three queried tables including at least two unioned tables; adding, using the one or more processors, one or more set operator nodes to a query tree, a set operator node corresponding to a set operator in the set operation query; inserting, using the one or more processors, a retrieval node for each queried table into the query tree, wherein a retrieval node for at least one unioned table corresponds to an index scan of that unioned table; and se