US-20260127150-A1 - INDEX TABLE CREATION AND DATA QUERY

Abstract

One or more implementations of the present specification provide an index table creation method, a data query method, and apparatuses, and relate to the field of database technologies. In the method, an index column to be used for creating an index and a redundant column associated with the index column can be determined in a data table, and an index table is created, where the index table includes the index column and the redundant column, the index column is an index key of the index table, data in the index column is stored in a row-based storage manner, data in the redundant column is stored in a column-based storage manner, and the redundant column in the index table is used to accelerate a data query process for the data table. The solution provided in the present specification can improve both OLAP performance and OLTP performance of a database by using the index table, so that the database can support an application scenario of HTAP.

Inventors

  • Jianyun Sun
  • Zhenjiang Xie

Assignees

  • BEIJING OCEANBASE TECHNOLOGY CO., LTD.

Dates

Publication Date
20260507
Application Date
20251229
Priority Date
20231120

Claims (20)

  1. A method, comprising: determining, in a data table, an index column and a redundant column associated with the index column; and creating an index table, the index table including the index column and the redundant column, the index column being an index key of the index table, data in the index column being stored in a row-based storage manner, data in the redundant column being stored in a column-based storage manner.
  2. The method according to claim 1, wherein the index table further includes a primary key in the data table, and the primary key and the index column are stored together in the row-based storage manner.
  3. The method of claim 1, comprising: obtaining a data query command, the data query command configured to query a target column of the data table for target data that satisfies a query condition; in response to that the redundant column of the index table includes at least a part of the target column, querying the target column included in the redundant column for a first target row that satisfies the query condition, to obtain a first target row offset; and querying, based on the first target row offset, for the target data that satisfies the query condition.
  4. The method according to claim 3, wherein the querying the target column included in the redundant column for the first target row that satisfies the query condition, to obtain the first target row offset includes: in response to that the redundant column includes a plurality of target columns, querying each target column included in the redundant column for a candidate row that satisfies the query condition, to obtain a row offset set of the candidate row in each target column; and calculating an intersection set of row offset sets corresponding to target columns in the redundant column, to obtain the first target row offset.
  5. The method according to claim 3, wherein the querying, based on the first target row offset, for the target data that satisfies the query condition includes: in response to that the redundant column includes a first part of the target column and the index column includes a second part of the target column, screening out, based on the first target row offset, rows indicated by the first target row offset from the target column included in the index column; querying the screened-out rows for a second target row that satisfies the query condition, to obtain a second target row offset; and querying, based on the second target row offset, for the target data that satisfies the query condition.
  6. The method according to claim 3, wherein the index table further includes a primary key in the data table, and the primary key and the index column are together stored in the row-based storage manner; and the querying, based on the first target row offset, for the target data that satisfies the query condition includes: in response to that the index table includes a part of the target column, determining, based on the first target row offset, a primary key value corresponding to the first target row offset from a primary key of the index table; and querying, based on the primary key value corresponding to the first target row offset, the data table for the target data that satisfies the query condition.
  7. The method according to claim 5, wherein the index table further includes a primary key in the data table, and the primary key and the index column are together stored in the row-based storage manner; and the querying, based on the second target row offset, for the target data that satisfies the query condition includes: in response to that the index table includes a part of the target column, determining, based on the second target row offset, a primary key value corresponding to the second target row offset from a primary key of the index table; and querying, based on the primary key value corresponding to the second target row offset, the data table for the target data that satisfies the query condition.
  8. An electronic device, comprising: one or more processors; and one or more storage devices, individually or collectively, having processor-executable instructions stored thereon, the processor-executable instructions, when executed by the one or more processors, enabling the one or more processors to, individually or collectively, implement actions including: determining, in a data table, an index column and a redundant column associated with the index column; and creating an index table, the index table including the index column and the redundant column, the index column being an index key of the index table, data in the index column being stored in a row-based storage manner, data in the redundant column being stored in a column-based storage manner.
  9. The electronic device according to claim 8, wherein the index table further includes a primary key in the data table, and the primary key and the index column are stored together in the row-based storage manner.
  10. The electronic device of claim 8, wherein the actions include: obtaining a data query command, the data query command configured to query a target column of the data table for target data that satisfies a query condition; in response to that the redundant column of the index table includes at least a part of the target column, querying the target column included in the redundant column for a first target row that satisfies the query condition, to obtain a first target row offset; and querying, based on the first target row offset, for the target data that satisfies the query condition.
  11. The electronic device according to claim 10, wherein the querying the target column included in the redundant column for the first target row that satisfies the query condition, to obtain the first target row offset includes: in response to that the redundant column includes a plurality of target columns, querying each target column included in the redundant column for a candidate row that satisfies the query condition, to obtain a row offset set of the candidate row in each target column; and calculating an intersection set of row offset sets corresponding to target columns in the redundant column, to obtain the first target row offset.
  12. The electronic device according to claim 10, wherein the querying, based on the first target row offset, for the target data that satisfies the query condition includes: in response to that the redundant column includes a first part of the target column and the index column includes a second part of the target column, screening out, based on the first target row offset, rows indicated by the first target row offset from the target column included in the index column; querying the screened-out rows for a second target row that satisfies the query condition, to obtain a second target row offset; and querying, based on the second target row offset, for the target data that satisfies the query condition.
  13. The electronic device according to claim 10, wherein the index table further includes a primary key in the data table, and the primary key and the index column are together stored in the row-based storage manner; and the querying, based on the first target row offset, for the target data that satisfies the query condition includes: in response to that the index table includes a part of the target column, determining, based on the first target row offset, a primary key value corresponding to the first target row offset from a primary key of the index table; and querying, based on the primary key value corresponding to the first target row offset, the data table for the target data that satisfies the query condition.
  14. The electronic device according to claim 12, wherein the index table further includes a primary key in the data table, and the primary key and the index column are together stored in the row-based storage manner; and the querying, based on the second target row offset, for the target data that satisfies the query condition includes: in response to that the index table includes a part of the target column, determining, based on the second target row offset, a primary key value corresponding to the second target row offset from a primary key of the index table; and querying, based on the primary key value corresponding to the second target row offset, the data table for the target data that satisfies the query condition.
  15. A computer-readable storage medium, having computer instructions stored thereon, the computer instructions, when executed by one or more processors, enabling the one or more processors to, individually or collectively, implement actions including: determining, in a data table, an index column and a redundant column associated with the index column; and creating an index table, the index table including the index column and the redundant column, the index column being an index key of the index table, data in the index column being stored in a row-based storage manner, data in the redundant column being stored in a column-based storage manner.
  16. The storage medium according to claim 15, wherein the index table further includes a primary key in the data table, and the primary key and the index column are stored together in the row-based storage manner.
  17. The storage medium of claim 15, wherein the actions include: obtaining a data query command, the data query command configured to query a target column of the data table for target data that satisfies a query condition; in response to that the redundant column of the index table includes at least a part of the target column, querying the target column included in the redundant column for a first target row that satisfies the query condition, to obtain a first target row offset; and querying, based on the first target row offset, for the target data that satisfies the query condition.
  18. The storage medium according to claim 17, wherein the querying the target column included in the redundant column for the first target row that satisfies the query condition, to obtain the first target row offset includes: in response to that the redundant column includes a plurality of target columns, querying each target column included in the redundant column for a candidate row that satisfies the query condition, to obtain a row offset set of the candidate row in each target column; and calculating an intersection set of row offset sets corresponding to target columns in the redundant column, to obtain the first target row offset.
  19. The storage medium according to claim 17, wherein the querying, based on the first target row offset, for the target data that satisfies the query condition includes: in response to that the redundant column includes a first part of the target column and the index column includes a second part of the target column, screening out, based on the first target row offset, rows indicated by the first target row offset from the target column included in the index column; querying the screened-out rows for a second target row that satisfies the query condition, to obtain a second target row offset; and querying, based on the second target row offset, for the target data that satisfies the query condition.
  20. The storage medium according to claim 19, wherein the index table further includes a primary key in the data table, and the primary key and the index column are together stored in the row-based storage manner; and the querying, based on the second target row offset, for the target data that satisfies the query condition includes: in response to that the index table includes a part of the target column, determining, based on the second target row offset, a primary key value corresponding to the second target row offset from a primary key of the index table; and querying, based on the primary key value corresponding to the second target row offset, the data table for the target data that satisfies the query condition.
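
For illustration only, the multi-column flow recited in claims 4, 5, and 7 might be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the helper `matching_offsets`, the variables `row_store`, `col_store`, and `data_table`, and all sample values are hypothetical, and a row offset is modeled simply as a list position.

```python
def matching_offsets(values, predicate, candidates=None):
    """Return the set of row offsets whose value satisfies the predicate.

    When candidates is given, only those row offsets are examined.
    """
    offsets = range(len(values)) if candidates is None else candidates
    return {off for off in offsets if predicate(values[off])}


# Hypothetical index table on "city": the row-stored part holds
# (index key, primary key) pairs; the redundant columns are column-stored
# and aligned with the row-stored part by row offset.
row_store = [("Berlin", 2), ("Oslo", 3), ("Oslo", 5), ("Paris", 1), ("Paris", 4)]
col_store = {
    "amount": [75, 50, 20, 30, 90],
    "status": ["open", "paid", "paid", "paid", "paid"],
}
# Hypothetical data table keyed by primary key; "note" is not in the index table.
data_table = {
    1: {"city": "Paris", "amount": 30, "status": "paid", "note": "n1"},
    2: {"city": "Berlin", "amount": 75, "status": "open", "note": "n2"},
    3: {"city": "Oslo", "amount": 50, "status": "paid", "note": "n3"},
    4: {"city": "Paris", "amount": 90, "status": "paid", "note": "n4"},
    5: {"city": "Oslo", "amount": 20, "status": "paid", "note": "n5"},
}

# Query: note WHERE amount >= 50 AND status = 'paid' AND city = 'Paris'.

# Claim 4: one row offset set per target column in the redundant columns,
# then their intersection gives the first target row offsets.
per_column = [
    matching_offsets(col_store["amount"], lambda v: v >= 50),
    matching_offsets(col_store["status"], lambda v: v == "paid"),
]
first_offsets = set.intersection(*per_column)

# Claim 5: check the index column only at the rows already screened out.
city_values = [row[0] for row in row_store]
second_offsets = matching_offsets(
    city_values, lambda v: v == "Paris", candidates=first_offsets
)

# Claim 7: map the surviving row offsets to primary key values, then read the
# remaining target column from the data table.
primary_keys = [row_store[off][1] for off in sorted(second_offsets)]
result = [data_table[pk]["note"] for pk in primary_keys]
```

In this sketch only the rows that survive the column scans are compared against the index column, and only the final survivors trigger a lookup in the data table.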

Description

TECHNICAL FIELD

One or more implementations of the present specification relate to the field of database technologies, and in particular, to an index table creation method, a data query method, and apparatuses.

BACKGROUND

In a data processing system, as data volume increases substantially, service requirements of online analytical processing (OLAP) and online transaction processing (OLTP) may exist simultaneously for the same data table. Because OLAP and OLTP have distinct characteristics, it is difficult in related technologies for the data table to deliver both good OLAP performance and good OLTP performance. For example, OLAP usually queries one or several columns of data in the data table, whereas OLTP usually queries a single row of data in the data table. Therefore, it is difficult in the related technologies to achieve good query efficiency for both.

SUMMARY

One or more implementations of the present specification provide an index table creation method, a data query method, and apparatuses.

One or more implementations of the present specification provide the following technical solutions.

According to a first aspect of one or more implementations of the present specification, an index table creation method is proposed, including: determining, in a data table, an index column to be used for creating an index and a redundant column associated with the index column; and creating an index table, the index table including the index column and the redundant column, the index column being an index key of the index table, data in the index column being stored in a row-based storage manner, data in the redundant column being stored in a column-based storage manner, and the redundant column in the index table being used to accelerate a data query process for the data table.
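
As an illustration of the index table creation method of the first aspect, the hybrid layout might be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than the implementation of the present specification: the names `IndexTable`, `create_index_table`, `row_store`, and `col_store` are hypothetical, the data table is modeled as a list of dicts, and ordering the index table by index key is an assumption.

```python
from dataclasses import dataclass


@dataclass
class IndexTable:
    """Hypothetical in-memory model of the index table."""
    index_cols: list   # names of the index columns (the index key)
    row_store: list    # row-based part: one tuple of index-key values per row
    col_store: dict    # column-based part: one list per redundant column


def create_index_table(data_table, index_cols, redundant_cols):
    """Build an index table from a data table given as a list of dicts."""
    # Assumption: entries are ordered by index key, as in a typical sorted index.
    ordered = sorted(data_table, key=lambda r: tuple(r[c] for c in index_cols))
    # Index key values are kept together row by row.
    row_store = [tuple(r[c] for c in index_cols) for r in ordered]
    # Each redundant column is kept as its own list; position i in every list
    # is the row offset of entry i in row_store.
    col_store = {c: [r[c] for r in ordered] for c in redundant_cols}
    return IndexTable(list(index_cols), row_store, col_store)


orders = [
    {"id": 1, "city": "Paris", "amount": 30, "status": "paid"},
    {"id": 2, "city": "Berlin", "amount": 75, "status": "open"},
    {"id": 3, "city": "Oslo", "amount": 50, "status": "paid"},
]
index_table = create_index_table(orders, ["city"], ["amount", "status"])
```

Here `city` plays the role of the index column, while `amount` and `status` play the role of redundant columns that can be scanned column by column.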

According to a second aspect of one or more implementations of the present specification, a data query method is proposed, including: obtaining a data query command, the data query command to be used to query a target column of a data table for target data that satisfies a query condition; in response to a redundant column of an index table including at least a part of the target column, querying the target column included in the redundant column for a first target row that satisfies the query condition, to obtain a first target row offset, the index table including an index column and the redundant column, the index column being an index key of the index table, data in the index column being stored in a row-based storage manner, and data in the redundant column being stored in a column-based storage manner; and querying, based on the first target row offset, for the target data that satisfies the query condition.

According to a third aspect of one or more implementations of the present specification, an index table creation apparatus is proposed, including: a determining module, configured to determine, in a data table, an index column to be used for creating an index and a redundant column associated with the index column; and a creating module, configured to create an index table, the index table including the index column and the redundant column, the index column being an index key of the index table, data in the index column being stored in a row-based storage manner, data in the redundant column being stored in a column-based storage manner, and the redundant column in the index table being used to accelerate a data query process for the data table.
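
The data query method of the second aspect might be sketched in Python for the simplest case, where the whole target column is held in the redundant column. This is a minimal sketch under stated assumptions: the function `query_redundant_column`, the `redundant` dict, and the sample values are hypothetical, and the case in which the redundant column does not include the target column is left outside the sketch.

```python
def query_redundant_column(redundant, target_column, predicate):
    """Scan a column-stored redundant column for rows satisfying a condition.

    Returns (row_offsets, target_data), or None when the target column is not
    held in the redundant columns (a case this sketch does not handle).
    """
    if target_column not in redundant:
        return None
    values = redundant[target_column]
    # First step: obtain the row offsets of the rows that satisfy the condition.
    offsets = [off for off, v in enumerate(values) if predicate(v)]
    # Second step: read the target data at those row offsets.
    return offsets, [values[off] for off in offsets]


# Hypothetical column-stored redundant column of an index table.
redundant = {"amount": [75, 50, 30]}

# Query: target column "amount", query condition amount >= 50.
hit = query_redundant_column(redundant, "amount", lambda v: v >= 50)
miss = query_redundant_column(redundant, "note", lambda v: True)
```

Because only the list for `amount` is read, the scan touches no other column, which is the access pattern the column-based storage of the redundant column is meant to serve.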

According to a fourth aspect of one or more implementations of the present specification, a data query apparatus is provided, including: an acquisition module, configured to obtain a data query command, the data query command to be used to query a target column of a data table for target data that satisfies a query condition; a first query module, configured to: in response to a redundant column of an index table including at least a part of the target column, query the target column included in the redundant column for a first target row that satisfies the query condition, to obtain a first target row offset, the index table including an index column and the redundant column, the index column being an index key of the index table, data in the index column being stored in a row-based storage manner, and data in the redundant column being stored in a column-based storage manner; and a second query module, configured to query, based on the first target row offset, for the target data that satisfies the query condition.

According to a fifth aspect of one or more implementations of the present specification, an electronic device is provided, including: a processor; and a storage, configured to store processor-executable instructions. The processor runs the executable instructions to implement the method according to the first aspect and/or the method according to the second aspect.

According to a sixth aspect of one or more implementations of the