CN-122019585-A - Database query control method and system based on radix estimation
Abstract
The invention provides a database query control method and system based on radix (cardinality) estimation. Data in the database is calibrated based on the change log of the database, and bibliographic information of all calibration data is collected to generate corresponding statistical information, so that indexes of the database are collected and counted and the data distribution of the database is comprehensively captured. A plurality of query execution plans is generated based on a query statement from the user side and the statistical information; a part of the query execution plans is selected as alternative query execution plans based on the execution cost of all query execution plans; the query result set size of each alternative query execution plan is predicted; and one of the alternative query execution plans is selected to perform the data query. The computing cost, result size, and reliability that each query execution plan may incur on the database are thus predicted at the software level, so that the query efficiency and performance of the database can be improved without hardware upgrades.
Inventors
- YU DAN
- WANG DANXING
- XU HAORAN
Assignees
- 慧之安信息技术股份有限公司
Dates
- Publication Date
- 20260512
- Application Date
- 20251212
Claims (8)
- 1. A database query control method based on radix estimation, characterized by comprising the following steps: performing data calibration on a database based on a change log of the database; collecting bibliographic information of all calibration data so as to generate corresponding statistical information; generating a plurality of query execution plans based on a query statement from a user side and the statistical information; selecting a part of the query execution plans as alternative query execution plans based on the execution cost of all the query execution plans; predicting the alternative query execution plans to obtain the query result set scale of the alternative query execution plans; and selecting one of the alternative query execution plans to perform a data query based on the query result set scale.
- 2. The database query control method based on radix estimation according to claim 1, wherein performing data calibration on the database based on the change log of the database, and collecting bibliographic information of all calibration data so as to generate corresponding statistical information, comprises: based on the latest change time and a preset data effective time interval, calibrating the data in the database that changed within the preset data effective time interval; and collecting index information and table information about all calibration data based on the positions of all calibration data in the database, performing semantic association recognition on the index information and the table information, and generating statistical information about all calibration data at a semantic association level.
- 3. The database query control method based on radix estimation according to claim 1, wherein generating a plurality of query execution plans based on a query statement from the user side and the statistical information, and selecting a part of the query execution plans as alternative query execution plans based on the execution cost of all the query execution plans, comprises: parsing the query statement from the user side to obtain all keywords in the query statement, matching the keywords against the semantic association information among all calibration data contained in the statistical information, and generating a plurality of query execution plans; and obtaining the cost of each query execution plan by estimating the cost of disk IO operations and the cost of CPU computation during execution of that plan, comparing the respective costs of all query execution plans with a preset cost threshold, and selecting a part of the query execution plans as alternative query execution plans.
- 4. The database query control method based on radix estimation according to claim 1, wherein predicting the alternative query execution plans to obtain the query result set scale of the alternative query execution plans, and selecting one of the alternative query execution plans to perform the data query based on the query result set scale, comprises: predicting the alternative query execution plans based on the distribution structure of all calibration data in the database to obtain the query result set scale of each alternative query execution plan; and determining the reliability of the respective query results of all the alternative query execution plans based on the query result set scale, so as to select one of the alternative query execution plans to perform the data query.
- 5. A database query control system based on radix estimation, comprising: a data calibration module for performing data calibration on a database based on a change log of the database; a statistical information generation module for collecting bibliographic information of all calibration data so as to generate corresponding statistical information; an execution plan generation module for generating a plurality of query execution plans based on a query statement from a user side and the statistical information; an execution plan selection module for selecting a part of the query execution plans as alternative query execution plans based on the execution cost of all the query execution plans; a query result set scale determining module for predicting the alternative query execution plans to obtain the query result set scale of the alternative query execution plans; and an execution plan determining module for selecting one of the alternative query execution plans to perform a data query based on the query result set scale.
- 6. The radix-estimation-based database query control system of claim 5, wherein: the data calibration module performing data calibration on the database based on the change log of the database comprises: based on the latest change time and a preset data effective time interval, calibrating the data in the database that changed within the preset data effective time interval; and the statistical information generation module collecting bibliographic information of all calibration data so as to generate corresponding statistical information comprises: collecting index information and table information about all calibration data based on the positions of all calibration data in the database, performing semantic association recognition on the index information and the table information, and generating statistical information about all calibration data at a semantic association level.
- 7. The radix-estimation-based database query control system of claim 5, wherein: the execution plan generation module generating a plurality of query execution plans based on a query statement from the user side and the statistical information comprises: parsing the query statement from the user side to obtain all keywords in the query statement, matching the keywords against the semantic association information among all calibration data contained in the statistical information, and generating a plurality of query execution plans; and the execution plan selection module selecting a part of the query execution plans as alternative query execution plans based on the execution cost of all the query execution plans comprises: obtaining the cost of each query execution plan by estimating the cost of disk IO operations and the cost of CPU computation during execution of that plan, comparing the respective costs of all query execution plans with a preset cost threshold, and selecting a part of the query execution plans as alternative query execution plans.
- 8. The radix-estimation-based database query control system of claim 5, wherein: the query result set scale determining module predicting the alternative query execution plans to obtain the query result set scale of the alternative query execution plans comprises: predicting the alternative query execution plans based on the distribution structure of all calibration data in the database to obtain the query result set scale of each alternative query execution plan; and the execution plan determining module selecting one of the alternative query execution plans to perform the data query based on the query result set scale comprises: determining the reliability of the respective query results of all the alternative query execution plans based on the query result set scale, so as to select one of the alternative query execution plans to perform the data query.
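As a rough illustration of the calibration step recited in claims 2 and 6, the Python sketch below marks the rows whose change-log entries fall within a preset effective time interval ending at the latest change time, then counts the marked rows per table as a simple stand-in for the statistical information. The `ChangeLogEntry` fields, the function names `calibrate` and `table_statistics`, and the reading of the interval as a window ending at the latest change are all assumptions made for illustration; the claims do not specify them, and the semantic association recognition is not modeled here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class ChangeLogEntry:
    # All field names are hypothetical; the claims do not define a log format.
    table: str            # table that holds the changed row
    row_id: int           # identifier of the changed row
    changed_at: datetime  # time at which the change was logged


def calibrate(change_log, effective_interval):
    """Return the (table, row_id) pairs changed within the effective interval.

    Assumption: the interval is a window that ends at the latest change time.
    """
    if not change_log:
        return set()
    latest = max(entry.changed_at for entry in change_log)
    cutoff = latest - effective_interval
    return {(e.table, e.row_id) for e in change_log if e.changed_at >= cutoff}


def table_statistics(calibrated):
    """Count calibrated rows per table (a stand-in for richer statistics)."""
    stats = {}
    for table, _row_id in calibrated:
        stats[table] = stats.get(table, 0) + 1
    return stats


log = [
    ChangeLogEntry("orders", 1, datetime(2025, 1, 1, 9, 0)),
    ChangeLogEntry("orders", 2, datetime(2025, 1, 1, 11, 30)),
    ChangeLogEntry("users", 7, datetime(2025, 1, 1, 12, 0)),
]
marked = calibrate(log, timedelta(hours=1))
stats = table_statistics(marked)
```

With a one-hour interval, only the entries logged at 11:30 and 12:00 are marked; the 09:00 entry falls outside the window and does not contribute to the statistics.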
Description
Database query control method and system based on radix estimation
Technical Field
The present invention relates to the field of data query, and in particular to a database query control method and system based on radix estimation.
Background
When the volume of data in a database grows substantially, the operating efficiency of the database declines. To keep the database running normally and to prevent it from affecting normal use of the operating system, the database can be optimized by adding memory, among other measures. Additional memory allows the database to keep more data and indexes in memory and reduces IO operations on the disk, which improves the query and processing speed of the database. Memory is relatively expensive, however, so optimizing database performance by adding memory is not cost-effective in large-scale data storage scenarios. The scalability of memory is also limited: once the physical memory limit of the server is reached, the query performance of the database can no longer be improved by adding memory. Hardware changes such as adding memory therefore cannot, on their own, optimize the database effectively and in depth.
Disclosure of Invention
The invention aims to provide a database query control method and system based on radix estimation. Data in the database is calibrated based on the change log of the database, and bibliographic information of all calibration data is collected to generate corresponding statistical information, so that indexes of the database are collected and counted and the data distribution of the database is comprehensively captured. A plurality of query execution plans is generated based on a query statement from the user side and the statistical information, and a part of the query execution plans is selected as alternative query execution plans based on the execution cost of all the query execution plans, which ensures that the alternative query execution plans do not consume excessive computing cost. The query result set scale of each alternative query execution plan is then predicted, and one of the alternative query execution plans is selected to perform the data query. The computing cost, result scale, and reliability that each query execution plan may incur on the database are thus predicted at the software level, which provides a multi-dimensional reference for determining the final query execution plan and improves the query efficiency and performance of the database without hardware upgrades.
The invention is realized by the following technical scheme:
The database query control method based on radix estimation comprises the following steps: performing data calibration on a database based on a change log of the database; collecting bibliographic information of all calibration data so as to generate corresponding statistical information; generating a plurality of query execution plans based on a query statement from a user side and the statistical information; selecting a part of the query execution plans as alternative query execution plans based on the execution cost of all the query execution plans; predicting the alternative query execution plans to obtain the query result set scale of the alternative query execution plans; and selecting one of the alternative query execution plans to perform a data query based on the query result set scale.
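The plan selection steps of the method can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the total cost is assumed to be the sum of the disk IO and CPU estimates, the result set scale is estimated with the textbook independence assumption (row count multiplied by the product of predicate selectivities), and the final choice is assumed to prefer the smallest predicted result set, with ties broken by lower cost. The `Plan` class, its fields, and all function names are hypothetical.

```python
from dataclasses import dataclass
from math import prod


@dataclass(frozen=True)
class Plan:
    # Hypothetical plan description; units and fields are illustrative only.
    name: str
    io_cost: float        # estimated cost of disk IO operations
    cpu_cost: float       # estimated cost of CPU computation
    table_rows: int       # rows the plan starts from
    selectivities: tuple  # fraction of rows each predicate is expected to keep

    @property
    def cost(self) -> float:
        # Assumption: total execution cost is the sum of IO and CPU cost.
        return self.io_cost + self.cpu_cost


def estimate_rows(plan: Plan) -> int:
    """Predict the result set scale, assuming independent predicates."""
    return round(plan.table_rows * prod(plan.selectivities))


def select_alternatives(plans, cost_threshold):
    """Keep the plans whose estimated cost does not exceed the threshold."""
    return [p for p in plans if p.cost <= cost_threshold]


def choose_plan(alternatives):
    """Pick one alternative plan, or None if there are no alternatives.

    Assumption: the smallest predicted result set wins, then the lowest cost.
    """
    if not alternatives:
        return None
    return min(alternatives, key=lambda p: (estimate_rows(p), p.cost))


plans = [
    Plan("full_scan", 900.0, 200.0, 100_000, (0.5,)),
    Plan("index_a", 120.0, 40.0, 100_000, (0.5, 0.25)),
    Plan("index_b", 80.0, 60.0, 100_000, (0.5, 0.5)),
]
alternatives = select_alternatives(plans, cost_threshold=500.0)
best = choose_plan(alternatives)
```

In this example `full_scan` exceeds the cost threshold and is dropped, and `index_a` is chosen because its predicted result set (12,500 rows) is smaller than that of `index_b` (25,000 rows). A different reliability rule would only require replacing the key function in `choose_plan`.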
Optionally, performing data calibration on the database based on the change log of the database, and collecting bibliographic information of all calibration data so as to generate corresponding statistical information, includes: based on the latest change time and a preset data effective time interval, calibrating the data in the database that changed within the preset data effective time interval; and collecting index information and table information about all calibration data based on the positions of all calibration data in the database, performing semantic association recognition on the index information and the table information, and generating statistical information about all calibration data at a semantic association level.
Optionally, generating a plurality of query execution plans based on a query statement from the user side and the statistical information, and selecting a part of the query execution plans as alternative query execution plans based on the execution cost of all the query execution plans, includes: parsing the query statement from the user side to obtain all keywords in the query statement, matching the keywords against the semantic association information among all calibration data contained in the statistical information, and generating a plurality of query execution plans; and obtaining the cost of each query execution plan by estimating the cost of disk IO operations and the cost of CPU computation during execution of that plan, comparing the respective costs of all query execution plans with a preset cost threshold, and selecting a part of the query execution plans as alternative query