CN-121996690-A - Dynamic query load-oriented spatial index optimization method and system
Abstract
The invention provides a space index optimization method and a space index optimization system for dynamic query loads, which comprise the steps of S1, obtaining a load fluctuation quantization index by using KL divergence and used for quantizing the change degree of the dynamic query loads, S2, constructing a trigger condition based on the load fluctuation quantization index and a double rule of window query time, and S3, adjusting a dimension reduction index structure of the query loads when the trigger condition is met, so as to realize space index optimization. In order to ensure the effectiveness of a dynamic adjustment mechanism, the invention provides a quantization index for evaluating the degree of query load change, wherein adjustment is triggered only when the load change exceeds a set threshold value, and the analysis of query load characteristics optimizes part of structures in a BMTree dimension-reduction index structure to perform filtering, rewriting, partial construction and merging operations of data and query load, thereby flexibly adapting to the query load change and avoiding the performance bottleneck of global reconstruction.
Inventors
- ZHANG HAO
- Ying Wenzhong
- WANG ZONGYU
- ZHU PING
- WANG WENSHUN
- CAI ZUXIN
- LI JIAOJIAO
- Lv ziyang
- HOU ZHAOTANG
- QIAN SHIYOU
- LIAO ZHENGYU
- CHEN YICHAO
- CAO JIAN
Assignees
- 华能(福建)能源开发有限公司福州分公司
- 西安热工研究院有限公司
- 上海交通大学
- 苏州干诚数字科技有限公司
Dates
- Publication Date
- 20260508
- Application Date
- 20260115
Claims (10)
- 1. A space index optimization method facing dynamic query load is characterized by comprising the following steps: Step S1, obtaining a load fluctuation quantization index by using KL divergence for quantizing the change degree of the dynamic query load; S2, constructing a trigger condition based on the double rules of the load fluctuation quantization index and window query time; and step S3, when the triggering condition is met, the dimension reduction index structure of the query load is adjusted, and the space index optimization is realized.
- 2. The method for optimizing the spatial index for the dynamic query load according to claim 1, wherein the step S1 comprises the steps of quantizing query operations in two adjacent time windows into two probability distributions, and calculating the difference degree between the two distributions by using KL as an index, wherein the larger the KL value is, the larger the difference degree between the two distributions is; Wherein, the Load window for the previous query; A load window is queried for the current time; Is that The probability distribution condition of access of the load in the data space divided by the dimension-reduction index structure is queried; Is that The probability distribution condition of the access of the query load in the data space divided by the dimension-reduction index structure; The number of queries in the window; Wherein, the ; The number of the data spaces divided for the user definition; To express as Individual query workload at An array of distribution sizes in the different partitioned spaces; To express the first Number of queries in a block of data space.
- 3. The dynamic query load oriented spatial index optimization method according to claim 2, wherein the data space divided by the dimension reduction index structure is a division of a query workload data range; When a single SFC is adopted as a dimension reduction index structure, the data space directly and uniformly divides the range of the query load; When the segment SFC is adopted as the dimension reduction index structure, the data space divides the range of the query load based on the distribution condition of the minimum range nodes of the query load in the BMTree tree.
- 4. The method for optimizing spatial index for dynamic query load according to claim 1, wherein said step S2 comprises: When the query time change does not meet the preset requirement and the KL value does not reach the threshold value, not adjusting; When the query time change does not meet the preset requirement but the KL value reaches the threshold value, the adjustment is not carried out under the condition of not affecting the performance; When the query time change meets the preset requirement but the KL value does not reach the threshold value, the performance change is not caused by the load change, and no adjustment is performed; when the time variation meets the preset requirement and the KL value reaches the threshold value, triggering adjustment.
- 5. The method for optimizing spatial index for dynamic query load according to claim 1, wherein said step S3 comprises: The dimension-reducing index structure adjustment aiming at single SFC comprises analyzing influence of different space filling curves on page access times through QUILTS, generating an optimal SFC as the dimension-reducing index structure according to QUILTS analysis, and completing dynamic optimization of the index structure; The dimension reduction index structure adjustment for the segment SFC comprises the steps of determining Top K nodes with the largest load quantization index difference in two distributions as adjustment root nodes, filtering and correcting a dataset and a current query load window based on the adjustment root nodes to generate a new dataset and a new query load set which adapt to an adjustment range, constructing a part BMTree tree in the adjustment range by adopting a Monte Carlo tree search algorithm based on the new dataset and the new query load set, and merging the constructed part BMTree tree with an original BMTree tree to finish dynamic optimization of the index structure.
- 6. The method for optimizing spatial index for dynamic query load according to claim 5, wherein said determining Top K nodes with the largest difference in load quantization index among two distributions as the adjusted root node comprises: load window based on previous query Load window with current query Distribution in BMTree tree And (3) with And recording the complete BMTree tree as Obtaining And (3) with The difference value array arr is used for reflecting the minimum range node distribution change of each node in BMTree trees between two windows; selecting a previous node with the maximum value from the difference value array arr, sequencing according to the value, and initializing a selection array S; Traversing the ordered nodes, if a subtree relation exists between the current node and the nodes in the selected array S, selecting the nodes with higher value based on the node value to add S; The node in the selection array S is taken as the adjustment root node of BMTree tree and is recorded as The root node of BMTree tree that needs to be currently adjusted is noted as k.
- 7. The method for optimizing a spatial index for dynamic query load according to claim 5, wherein said filtering and correcting the data set and the current query load window based on the adjusted root node to generate a new data set and a new query load set adapting to the adjustment range comprises: When data set D or current window Deleting the data to obtain a preliminarily filtered data set D 'and a query load window wc' when the data does not pass through a traversing path from BMTree root nodes to a root node k of a tree structure which is currently required to be adjusted, deleting binary digits of the data in corresponding dimensions in the traversing path, and finishing data correction to obtain a final data set D '' and a final query load window wc ''.
- 8. The dynamic query load oriented spatial index optimization method of claim 5, wherein constructing the part BMTree tree within the adjustment range using a Monte Carlo tree search algorithm based on the new dataset and the new query load set comprises constructing the part BMTree structure T2 in the hierarchy of the tree using a Monte Carlo tree search algorithm based on dataset D″ and query load window wc ".
- 9. The method for optimizing spatial index for dynamic query load according to claim 5, wherein said merging the built part BMTree tree with the original BMTree tree comprises finding a subtree T3 to be replaced with an adjusted root node as a root in the original BMTree by using a hierarchical traversal algorithm, replacing the dimension selection of T3 with the dimension selection of the part BMTree structure T2, and merging the part BMTree with the original BMTree.
- 10. A dynamic query load oriented spatial index optimization system, comprising: the module M1 is used for obtaining a load fluctuation quantization index by adopting KL divergence and used for quantizing the change degree of the dynamic query load; a module M2, constructing a trigger condition based on the load fluctuation quantization index and the double rules of window query time; and the module M3 is used for adjusting the dimension reduction index structure of the query load when the triggering condition is met, so as to realize the optimization of the spatial index.
Description
Dynamic query load-oriented spatial index optimization method and system Technical Field The invention relates to the technical field of spatial index, in particular to a spatial index optimization method and a spatial index optimization system for dynamic query load. Background In the prior art, the dimension reduction index structure based on SFC only considers that the dimension reduction mapping precision is further optimized for the data set or the query performance is further improved by combining a learning algorithm under the condition of fixed query workload, but the dimension reduction index structure is not optimized in the query workload in a real-time state, namely in a dynamic change. The change of the query workload can lead to the reduced query performance of the original excellent dimension reduction index structure, and two solutions are adopted for reconstruction or partial reconstruction. On one hand, reconstructing the dimension-reducing index structure is a mode with higher cost performance aiming at some dimension-reducing index structures with simple structures and high building efficiency, but the dimension-reducing index structure which needs to be built with time is not a good solution for BMTree, on the other hand, partial reconstruction of the dimension-reducing index structure can weigh which part and most of index structures in the structure are reconstructed, and the dimension-reducing index structure is optimized with the minimum time cost. For this, the choice of solution is a factor to consider. An important premise for implementing the solution is to determine the optimization opportunities for the dimension-reduction index structure at dynamically changing query workloads. Therefore, the invention provides a space index optimization method and a space index optimization system for dynamic query load, which are real-time index structure optimization methods (Dya-BMTree) based on KL divergence, and the dimension reduction index structure is optimized through partial reconstruction so as to cope with the query workload which dynamically changes. The system can only carry out local adjustment on the affected part when the load changes, so that the high time cost of full reconstruction is avoided, and the query efficiency is improved. In addition, the system evaluates the load change through the quantization index, so that a dynamic adjustment mechanism is triggered, and the performance bottleneck caused by global reconstruction is avoided. Disclosure of Invention Aiming at the defects in the prior art, the invention aims to provide a space index optimization method and a system for dynamic query load. The space index optimization method facing to the dynamic query load provided by the invention comprises the following steps: Step S1, obtaining a load fluctuation quantization index by using KL divergence for quantizing the change degree of the dynamic query load; S2, constructing a trigger condition based on the double rules of the load fluctuation quantization index and window query time; and step S3, when the triggering condition is met, the dimension reduction index structure of the query load is adjusted, and the space index optimization is realized. Preferably, the step S1 comprises the steps of quantizing query operations in two adjacent time windows into two probability distributions and calculating the difference degree between the two distributions by using KL as an index, wherein the larger the KL value is, the larger the difference degree between the two distributions is; Wherein, the Load window for the previous query; A load window is queried for the current time; Is that The probability distribution condition of access of the load in the data space divided by the dimension-reduction index structure is queried; Is that The probability distribution condition of the access of the query load in the data space divided by the dimension-reduction index structure; The number of queries in the window; Wherein, the ;The number of the data spaces divided for the user definition; To express as Individual query workload atAn array of distribution sizes in the different partitioned spaces; To express the first Number of queries in a block of data space. Preferably, the data space divided by the dimension-reduction index structure is the division of the data range of the query workload; When a single SFC is adopted as a dimension reduction index structure, the data space directly and uniformly divides the range of the query load; When the segment SFC is adopted as the dimension reduction index structure, the data space divides the range of the query load based on the distribution condition of the minimum range nodes of the query load in the BMTree tree. Preferably, the step S2 includes: When the query time change does not meet the preset requirement and the KL value does not reach the threshold value, not adjusting; When the query time change does not meet the preset requir