US-20260127174-A1 - DATA QUERY METHOD AND APPARATUS, ELECTRONIC DEVICE, AND STORAGE MEDIUM
Abstract
The present application relates to the field of database technologies, and provides a data query method and apparatus, an electronic device, and a storage medium. The data query method includes: obtaining a first query statement to be queried; determining, based on the first query statement, a first data table and a second data table to be used for an inner join operation, a first join column to be used for the inner join operation in the first data table, and a second join column to be used for the inner join operation in the second data table; determining, based on the first query statement, whether the first join column and the second join column satisfy a data pushdown condition; in response to determining that the data pushdown condition is satisfied, updating the first query statement to obtain a second query statement; and performing data query on the first data table and the second data table based on the second query statement, to obtain a data query result. As such, a data table can be first filtered, and then an inner join operation is performed, thereby reducing a data processing amount and improving data query performance.
Inventors
- Qinliang XUE
- Yifei TIAN
- Tao Zhu
- Guoping Wang
Assignees
- Beijing Oceanbase Technology Co., Ltd.
Dates
- Publication Date
- 20260507
- Application Date
- 20251230
- Priority Date
- 20231201
Claims (20)
- 1 . A data query method, the method comprising: obtaining a first query statement to be queried, the first query statement being a statement to be used to perform inner join on two data tables and then perform data limitation based on a limit clause; determining, based on the first query statement, a first data table and a second data table to be used for an inner join operation, a first join column to be used for the inner join operation in the first data table, and a second join column to be used for the inner join operation in the second data table; determining, based on the first query statement, whether the first join column and the second join column satisfy a data pushdown condition, the data pushdown condition being determined based on whether there is an inclusion relationship between a domain of the first join column and a domain of the second join column; in response to determining that the data pushdown condition is satisfied, updating the first query statement to obtain a second query statement, the second query statement to be used to perform data limitation and an inner join operation on the data tables; and performing data query on the first data table and the second data table based on the second query statement, to obtain a data query result.
- 2 . The method according to claim 1 , wherein the determining, based on the first query statement, whether the first join column and the second join column satisfy the data pushdown condition includes: determining, based on the first query statement, whether an inclusion relationship determining condition is satisfied, the inclusion relationship determining condition being determined based on a data table composition manner; in response to determining that the inclusion relationship determining condition is satisfied and there is a foreign key constraint relationship between the first join column and the second join column or the inner join operation is a self join operation, determining that there is the inclusion relationship; and determining, based on a predicate operator included in the first query statement in addition to an inner join clause, whether the data pushdown condition is satisfied, the inner join clause being a clause to be used for the inner join operation in the first query statement.
- 3 . The method according to claim 2 , wherein the determining, based on the first query statement, whether the inclusion relationship determining condition is satisfied includes: in response to determining, based on the first query statement, that both the first data table and the second data table are base tables, determining that the inclusion relationship determining condition is satisfied; and in response to determining, based on the first query statement, that there is a derived table in one or more of the first data table or the second data table, determining, based on a subquery statement corresponding to the derived table, whether the inclusion relationship determining condition is satisfied, the derived table being a virtual table formed based on a subquery statement in the first query statement.
- 4 . The method according to claim 3 , wherein the determining, based on the subquery statement corresponding to the derived table, whether the inclusion relationship determining condition is satisfied includes: in response to determining that the subquery statement includes a join operator, determining that the inclusion relationship determining condition is not satisfied; in response to determining that the subquery statement includes an aggregation operator for a join column operation and does not include an extremum operator for an output operation of the aggregation operator, determining that the inclusion relationship determining condition is not satisfied; in response to determining that the subquery statement includes a filter operator and a filter range of the filter operator does not satisfy a filter condition, determining that the inclusion relationship determining condition is not satisfied; in response to determining that the subquery statement includes none of the join operator, the aggregation operator, and the filter operator, determining that the inclusion relationship determining condition is satisfied; in response to determining that the subquery statement includes the aggregation operator and the extremum operator and includes neither of the join operator and the filter operator, determining that the inclusion relationship determining condition is satisfied; or in response to determining that the subquery statement includes the aggregation operator, the extremum operator, and the filter operator and does not include the join operator, and the filter condition is satisfied, determining that the inclusion relationship determining condition is satisfied.
- 5 . The method according to claim 4 , wherein the in response to determining that the subquery statement includes the filter operator and the filter range of the filter operator does not satisfy the filter condition, determining that the inclusion relationship determining condition is not satisfied includes: separately determining filter ranges of the first data table and the second data table based on the filter operator; and in response to that there is no coverage relationship between the filter ranges respectively corresponding to the first data table and the second data table, determining that the filter condition is not satisfied.
- 6 . The method according to claim 2 , wherein the determining, based on the predicate operator included in the first query statement in addition to the inner join clause, whether the data pushdown condition is satisfied includes: in response to determining that the first query statement does not include the predicate operator, determining that the data pushdown condition is satisfied, or in response to determining that the first query statement includes the predicate operator, determining that the data pushdown condition is not satisfied; or in response to determining that the first query statement further includes the predicate operator and an associated data table of the predicate operator is consistent with a data table to which an included join column belongs, determining that the data pushdown condition is satisfied, or in response to determining that the first query statement does not include the predicate operator or the associated data table of the predicate operator is inconsistent with the data table to which the included join column belongs, determining that the data pushdown condition is not satisfied, the associated data table being a data table operated by the predicate operator.
- 7 . The method according to claim 1 , wherein the updating the first query statement, to obtain the second query statement includes: using a data table, to which an included join column belongs, as a target data table based on the inclusion relationship; generating, based on the limit clause in the first query statement, a target clause configured to perform data filtering on the target data table; generating a conditional clause configured to restrain that a join column of the target data table is not null; and combining the first query statement, the target clause, and the conditional clause to obtain the second query statement.
- 8 . An electronic device, comprising: one or more processors; and one or more memory devices, individually or collectively, having computer instructions stored thereon, the computer instructions, when executed by the one or more processors, enabling the one or more processors to, individually or collectively, perform actions including: obtaining a first query statement to be queried, the first query statement being a statement to be used to perform inner join on two data tables and then perform data limitation based on a limit clause; determining, based on the first query statement, a first data table and a second data table to be used for an inner join operation, a first join column to be used for the inner join operation in the first data table, and a second join column to be used for the inner join operation in the second data table; determining, based on the first query statement, whether the first join column and the second join column satisfy a data pushdown condition, the data pushdown condition being determined based on whether there is an inclusion relationship between a domain of the first join column and a domain of the second join column; in response to determining that the data pushdown condition is satisfied, updating the first query statement to obtain a second query statement, the second query statement to be used to perform data limitation and an inner join operation on the data tables; and performing data query on the first data table and the second data table based on the second query statement, to obtain a data query result.
- 9 . The electronic device according to claim 8 , wherein the determining, based on the first query statement, whether the first join column and the second join column satisfy the data pushdown condition includes: determining, based on the first query statement, whether an inclusion relationship determining condition is satisfied, the inclusion relationship determining condition being determined based on a data table composition manner; in response to determining that the inclusion relationship determining condition is satisfied and there is a foreign key constraint relationship between the first join column and the second join column or the inner join operation is a self join operation, determining that there is the inclusion relationship; and determining, based on a predicate operator included in the first query statement in addition to an inner join clause, whether the data pushdown condition is satisfied, the inner join clause being a clause to be used for the inner join operation in the first query statement.
- 10 . The electronic device according to claim 9 , wherein the determining, based on the first query statement, whether the inclusion relationship determining condition is satisfied includes: in response to determining, based on the first query statement, that both the first data table and the second data table are base tables, determining that the inclusion relationship determining condition is satisfied; and in response to determining, based on the first query statement, that there is a derived table in one or more of the first data table or the second data table, determining, based on a subquery statement corresponding to the derived table, whether the inclusion relationship determining condition is satisfied, the derived table being a virtual table formed based on a subquery statement in the first query statement.
- 11 . The electronic device according to claim 10 , wherein the determining, based on the subquery statement corresponding to the derived table, whether the inclusion relationship determining condition is satisfied includes: in response to determining that the subquery statement includes a join operator, determining that the inclusion relationship determining condition is not satisfied; in response to determining that the subquery statement includes an aggregation operator for a join column operation and does not include an extremum operator for an output operation of the aggregation operator, determining that the inclusion relationship determining condition is not satisfied; in response to determining that the subquery statement includes a filter operator and a filter range of the filter operator does not satisfy a filter condition, determining that the inclusion relationship determining condition is not satisfied; in response to determining that the subquery statement includes none of the join operator, the aggregation operator, and the filter operator, determining that the inclusion relationship determining condition is satisfied; in response to determining that the subquery statement includes the aggregation operator and the extremum operator and includes neither of the join operator and the filter operator, determining that the inclusion relationship determining condition is satisfied; or in response to determining that the subquery statement includes the aggregation operator, the extremum operator, and the filter operator and does not include the join operator, and the filter condition is satisfied, determining that the inclusion relationship determining condition is satisfied.
- 12 . The electronic device according to claim 11 , wherein the in response to determining that the subquery statement includes the filter operator and the filter range of the filter operator does not satisfy the filter condition, determining that the inclusion relationship determining condition is not satisfied includes: separately determining filter ranges of the first data table and the second data table based on the filter operator; and in response to that there is no coverage relationship between the filter ranges respectively corresponding to the first data table and the second data table, determining that the filter condition is not satisfied.
- 13 . The electronic device according to claim 9 , wherein the determining, based on the predicate operator included in the first query statement in addition to the inner join clause, whether the data pushdown condition is satisfied includes: in response to determining that the first query statement does not include the predicate operator, determining that the data pushdown condition is satisfied, or in response to determining that the first query statement includes the predicate operator, determining that the data pushdown condition is not satisfied; or in response to determining that the first query statement further includes the predicate operator and an associated data table of the predicate operator is consistent with a data table to which an included join column belongs, determining that the data pushdown condition is satisfied, or in response to determining that the first query statement does not include the predicate operator or the associated data table of the predicate operator is inconsistent with the data table to which the included join column belongs, determining that the data pushdown condition is not satisfied, the associated data table being a data table operated by the predicate operator.
- 14 . The electronic device according to claim 8 , wherein the updating the first query statement, to obtain the second query statement includes: using a data table, to which an included join column belongs, as a target data table based on the inclusion relationship; generating, based on the limit clause in the first query statement, a target clause configured to perform data filtering on the target data table; generating a conditional clause configured to restrain that a join column of the target data table is not null; and combining the first query statement, the target clause, and the conditional clause to obtain the second query statement.
- 15 . A storage medium, having computer instructions stored thereon, the computer instructions, when executed by one or more processors, enabling the one or more processors to, individually or collectively, perform actions comprising: obtaining a first query statement to be queried, the first query statement being a statement to be used to perform inner join on two data tables and then perform data limitation based on a limit clause; determining, based on the first query statement, a first data table and a second data table to be used for an inner join operation, a first join column to be used for the inner join operation in the first data table, and a second join column to be used for the inner join operation in the second data table; determining, based on the first query statement, whether the first join column and the second join column satisfy a data pushdown condition, the data pushdown condition being determined based on whether there is an inclusion relationship between a domain of the first join column and a domain of the second join column; in response to determining that the data pushdown condition is satisfied, updating the first query statement to obtain a second query statement, the second query statement to be used to perform data limitation and an inner join operation on the data tables; and performing data query on the first data table and the second data table based on the second query statement, to obtain a data query result.
- 16 . The storage medium according to claim 15 , wherein the determining, based on the first query statement, whether the first join column and the second join column satisfy the data pushdown condition includes: determining, based on the first query statement, whether an inclusion relationship determining condition is satisfied, the inclusion relationship determining condition being determined based on a data table composition manner; in response to determining that the inclusion relationship determining condition is satisfied and there is a foreign key constraint relationship between the first join column and the second join column or the inner join operation is a self join operation, determining that there is the inclusion relationship; and determining, based on a predicate operator included in the first query statement in addition to an inner join clause, whether the data pushdown condition is satisfied, the inner join clause being a clause to be used for the inner join operation in the first query statement.
- 17 . The storage medium according to claim 16 , wherein the determining, based on the first query statement, whether the inclusion relationship determining condition is satisfied includes: in response to determining, based on the first query statement, that both the first data table and the second data table are base tables, determining that the inclusion relationship determining condition is satisfied; and in response to determining, based on the first query statement, that there is a derived table in one or more of the first data table or the second data table, determining, based on a subquery statement corresponding to the derived table, whether the inclusion relationship determining condition is satisfied, the derived table being a virtual table formed based on a subquery statement in the first query statement.
- 18 . The storage medium according to claim 17 , wherein the determining, based on the subquery statement corresponding to the derived table, whether the inclusion relationship determining condition is satisfied includes: in response to determining that the subquery statement includes a join operator, determining that the inclusion relationship determining condition is not satisfied; in response to determining that the subquery statement includes an aggregation operator for a join column operation and does not include an extremum operator for an output operation of the aggregation operator, determining that the inclusion relationship determining condition is not satisfied; in response to determining that the subquery statement includes a filter operator and a filter range of the filter operator does not satisfy a filter condition, determining that the inclusion relationship determining condition is not satisfied; in response to determining that the subquery statement includes none of the join operator, the aggregation operator, and the filter operator, determining that the inclusion relationship determining condition is satisfied; in response to determining that the subquery statement includes the aggregation operator and the extremum operator and includes neither of the join operator and the filter operator, determining that the inclusion relationship determining condition is satisfied; or in response to determining that the subquery statement includes the aggregation operator, the extremum operator, and the filter operator and does not include the join operator, and the filter condition is satisfied, determining that the inclusion relationship determining condition is satisfied.
- 19 . The storage medium according to claim 18 , wherein the in response to determining that the subquery statement includes the filter operator and the filter range of the filter operator does not satisfy the filter condition, determining that the inclusion relationship determining condition is not satisfied includes: separately determining filter ranges of the first data table and the second data table based on the filter operator; and in response to that there is no coverage relationship between the filter ranges respectively corresponding to the first data table and the second data table, determining that the filter condition is not satisfied.
- 20 . The storage medium according to claim 16 , wherein the determining, based on the predicate operator included in the first query statement in addition to the inner join clause, whether the data pushdown condition is satisfied includes: in response to determining that the first query statement does not include the predicate operator, determining that the data pushdown condition is satisfied, or in response to determining that the first query statement includes the predicate operator, determining that the data pushdown condition is not satisfied; or in response to determining that the first query statement further includes the predicate operator and an associated data table of the predicate operator is consistent with a data table to which an included join column belongs, determining that the data pushdown condition is satisfied, or in response to determining that the first query statement does not include the predicate operator or the associated data table of the predicate operator is inconsistent with the data table to which the included join column belongs, determining that the data pushdown condition is not satisfied, the associated data table being a data table operated by the predicate operator.
Description
TECHNICAL FIELD The present application relates to the field of database technologies, and specifically to a data query method and apparatus, an electronic device, and a storage medium. BACKGROUND In actual service processing scenarios, an inner join operation usually needs to be performed on two data tables to obtain a query result set that includes joined rows, and then data limitation is performed on the query result set via limit clauses such as limit (limit). The inner join operation is to query rows that satisfy an equi-matching condition in the two data tables, and combine matched rows. SUMMARY Implementations of the present application provide a data query method and apparatus, an electronic device, and a storage medium, which improve data query performance during data query. Inventors recognized that when the inner join operation and data limitation are performed on two data tables, there are usually a relatively large quantity of invalid calculations, which consume a large quantity of system resources and time costs, resulting in poor data query performance. An implementation of the present application provides a data query method, including: obtaining a first query statement to be queried, the first query statement being a statement to be used to perform inner join on two data tables and then perform data limitation based on a limit clause; determining, based on the first query statement, a first data table and a second data table to be used for an inner join operation, a first join column to be used for the inner join operation in the first data table, and a second join column to be used for the inner join operation in the second data table; determining, based on the first query statement, whether the first join column and the second join column satisfy a data pushdown condition, the data pushdown condition being determined based on whether there is an inclusion relationship between a domain of the first join column and a domain of the second join column; in response to determining that the data pushdown condition is satisfied, updating the first query statement to obtain a second query statement, the second query statement to be used to perform data limitation and then perform an inner join operation on the data tables; and performing data query on the first data table and the second data table based on the second query statement, to obtain a data query result. In some implementations, the determining, based on the first query statement, whether the first join column and the second join column satisfy the data pushdown condition includes: determining, based on the first query statement, whether an inclusion relationship determining condition is satisfied, the inclusion relationship determining condition being determined based on a data table composition manner; in response to determining that the inclusion relationship determining condition is satisfied and there is a foreign key constraint relationship between the first join column and the second join column or the inner join operation is a self join operation, determining that there is the inclusion relationship; and determining, based on a specified predicate operator included in the first query statement in addition to an inner join clause, whether the data pushdown condition is satisfied, the inner join clause being a clause to be used for the inner join operation in the first query statement. In some implementations, the determining, based on the first query statement, whether the inclusion relationship determining condition is satisfied includes: in response to determining, based on the first query statement, that both the first data table and the second data table are base tables, determining that the inclusion relationship determining condition is satisfied; and in response to determining, based on the first query statement, that there is a derived table in one or more of the first data table or the second data table, determining, based on a subquery statement corresponding to the derived table, whether the inclusion relationship determining condition is satisfied, the derived table being a virtual table formed based on a subquery statement in the first query statement. In some implementations, the determining, based on the subquery statement corresponding to the derived table, whether the inclusion relationship determining condition is satisfied includes: in response to determining that the subquery statement includes a specified join operator, determining that the inclusion relationship determining condition is not satisfied; in response to determining that the subquery statement includes an aggregation operator for a join column operation and does not include an extremum operator for an output operation of the aggregation operator, determining that the inclusion relationship determining condition is not satisfied; in response to determining that the subquery statement includes a specified filter operator and a filter range of the sp