CN-122019502-A - Database intelligent routing method, device, equipment and storage medium
Abstract
The invention relates to the field of data processing, and discloses a database intelligent routing method, device, equipment and storage medium. The method comprises the steps of receiving and extracting static grammar characteristics and dynamic contexts of a database query request, inputting the static grammar characteristics and the dynamic contexts into a query type classification model, obtaining a transaction and analysis mixed tendency index, judging preliminary node type preference based on the transaction and analysis mixed tendency index, screening nodes with corresponding type labels from a database cluster to be queried as candidate nodes, calculating the fitness scores of the database query request and the candidate nodes, inputting the static grammar characteristics, the dynamic context characteristics and the fitness scores into an intelligent routing decision model, generating a routing decision result, adaptively rewriting the database query request based on the routing decision result, then routing to a target node for execution, and collecting performance indexes reflecting query execution efficiency and resource consumption conditions.
Inventors
- HUANG XIAOYU
Assignees
- 上海乾臻信息科技有限公司
Dates
- Publication Date
- 20260512
- Application Date
- 20260126
Claims (10)
- 1. A database intelligent routing method, characterized in that the database intelligent routing method comprises: receiving a database query request, and extracting static grammar characteristics and dynamic context characteristics of the database query request; Inputting the static grammar features and the dynamic context features into a pre-trained query type classification model to obtain a transaction and analysis mixed tendency index corresponding to the database query request, and judging preliminary node type preference based on the transaction and analysis mixed tendency index; Based on the preliminary node type preference, selecting nodes with corresponding type labels from a database cluster to be queried as candidate nodes, and calculating the fitness score of the database query request and each candidate node; inputting the static grammar characteristics, the dynamic context characteristics and the fitness score into a pre-trained intelligent routing decision model to generate a routing decision result, wherein the routing decision result comprises a target node and routing confidence; And carrying out self-adaptive rewriting on the database query request based on the routing decision result, then routing the database query request to a target node for execution, and collecting performance indexes reflecting query execution efficiency and resource consumption conditions in the query execution process.
- 2. The method of claim 1, wherein the receiving a database query request, extracting static grammatical features and dynamic contextual features of the database query request, comprises: Receiving a database query request, and carrying out standardized analysis on SQL sentences of the database query request to obtain an abstract syntax tree; Traversing the abstract syntax tree, and extracting operation type weights, association table numbers, aggregation function numbers, complexity scores and sub-query nesting depths as static syntax features; and extracting a transaction boundary identifier, a request arrival time stamp and a user priority label from the encapsulation protocol or the application log of the database query request as dynamic context characteristics.
- 3. The method of claim 1, wherein inputting the static grammatical feature and the dynamic contextual feature into a pre-trained query type classification model to obtain a transaction and analysis mixed trend index corresponding to the database query request and determining a preliminary node type preference based on the transaction and analysis mixed trend index comprises: constructing and training based on a double-tower neural network structure to obtain a query type classification model; constructing a multidimensional query feature vector based on the static grammar feature and the dynamic context feature, and inputting the multidimensional query feature vector into a query type classification model to obtain a transaction and analysis mixed tendency index corresponding to a database query request; When the transaction and analysis mixed tendency index meets a transaction and analysis mixed tendency index < first preset threshold value, judging that the preliminary node type preference is a transaction type node, when the transaction and analysis mixed tendency index meets a transaction and analysis mixed tendency index > second preset threshold value, judging that the preliminary node type preference is an analysis type node, and when the transaction and analysis mixed tendency index meets the first preset threshold value or less and the transaction and analysis mixed tendency index or less and the second preset threshold value or less, judging that the preliminary node type preference is a mixed node, wherein the second preset threshold value is larger than the first preset threshold value.
- 4. The method according to claim 1, wherein the selecting nodes with corresponding type labels from the database cluster to be queried as candidate nodes based on the preliminary node type preference, and calculating the fitness score of the database query request and each candidate node, includes: Converting the transaction and analysis mixed tendency index into a multi-dimensional expected resource demand vector based on a preset resource demand mapping function, wherein each dimension corresponds to an expected consumption strength of a resource type, and the resource type comprises a computing resource, a memory resource and a storage I/O resource; Based on the preliminary node type preference, selecting nodes with corresponding type labels from a database cluster to be queried as candidate nodes, acquiring multi-dimensional load metrics of the candidate nodes in real time, and converting the multi-dimensional load metrics into node real-time load vectors with the same dimension as the multi-dimensional expected resource demand vector; And converting the node real-time load vector into a node available resource vector representing the node available resource capacity, and calculating cosine similarity between the expected resource demand vector and the node available resource vector of each candidate node, wherein the calculated cosine similarity is the suitability score of the database query request and each candidate node.
- 5. The method of claim 4, wherein the indexing the transaction and analysis mixture trend based on a predetermined resource demand mapping function into a multi-dimensional expected resource demand vector, each dimension corresponding to an expected consumption strength of a resource type, the resource type including a computing resource, a memory resource, and a storage I/O resource, comprises: Predefining a first benchmark demand vector that characterizes a typical demand pattern of a purely transactional query for a plurality of resource dimensions and a second benchmark demand vector that characterizes a typical demand pattern of a purely analytical query for a plurality of resource dimensions; And taking the transaction and analysis mixing tendency index as an interpolation coefficient, carrying out weighted calculation on the first reference demand vector and the second reference demand vector, and generating the multi-dimensional expected resource demand vector, wherein each dimension corresponds to the expected consumption strength of a resource type, and the resource type comprises a computing resource, a memory resource and a storage I/O resource.
- 6. The method of claim 1, wherein inputting the static grammar features, the dynamic context features, and the fitness scores into a pre-trained intelligent routing decision model generates a routing decision result, the routing decision result including a target node and a routing confidence level, comprising: Constructing and training based on a gradient lifting decision tree algorithm to obtain an intelligent routing decision model; combining the static grammar characteristics, the dynamic context characteristics and the fitness scores of the candidate nodes into decision feature vectors; And inputting the decision feature vector into the intelligent routing decision model to generate a routing decision result, wherein the routing decision result comprises a target node and routing confidence.
- 7. The method according to claim 1, wherein the adaptively rewriting the database query request based on the routing decision result is performed by the target node, and collecting performance indexes reflecting query execution efficiency and resource consumption in the query execution process, includes: According to the engine type of the target node, carrying out grammar rewriting or injection of execution prompt information on the SQL statement of the database query request to obtain a rewritten query request; routing the rewritten query request to the target node through a database connection pool; And allocating a tracking identifier for the query request route, and adopting a performance index reflecting the query execution efficiency and the resource consumption condition in the query execution process based on the tracking identifier.
- 8. A database intelligent routing device, comprising: the extraction module is used for receiving a database query request and extracting static grammar characteristics and dynamic context characteristics of the database query request; the judging module is used for inputting the static grammar characteristics and the dynamic context characteristics into a pre-trained query type classification model to obtain a transaction and analysis mixed tendency index corresponding to the database query request, and judging preliminary node type preference based on the transaction and analysis mixed tendency index; the scoring module is used for screening nodes with corresponding type labels from a database cluster to be queried as candidate nodes based on the preliminary node type preference, and calculating the suitability score of the database query request and each candidate node; The decision module is used for inputting the static grammar characteristics, the dynamic context characteristics and the fitness score into a pre-trained intelligent routing decision model to generate a routing decision result, wherein the routing decision result comprises a target node and a routing confidence; And the rewriting module is used for adaptively rewriting the database query request based on the routing decision result, then routing the database query request to a target node for execution, and collecting performance indexes reflecting query execution efficiency and resource consumption conditions in the query execution process.
- 9. A database intelligent routing device comprising a memory and at least one processor, the memory having computer readable instructions stored therein; the at least one processor invoking the computer readable instructions in the memory to perform the steps of the database intelligent routing method of any of claims 1-7.
- 10. A computer readable storage medium having computer readable instructions stored thereon, which when executed by a processor, implement the steps of the database intelligent routing method of any of claims 1-7.
Description
Database intelligent routing method, device, equipment and storage medium Technical Field The present invention relates to the field of data processing, and in particular, to a method, an apparatus, a device, and a storage medium for intelligent routing of a database. Background With the penetration of enterprise data driven decisions, business scenarios place mixed load demands on database systems that handle both high-concurrency online transactions (OLTP) and online analytical processing (OLAP), i.e., mixed transaction/analytical processing (HTAP) demands. The traditional solution thinking is mainly divided into two types, namely adopting a separate architecture, namely an OLTP system and an OLAP system which are independent, and carrying out data synchronization through an ETL, wherein the problems of high data delay, high operation and maintenance complexity and superposition of hardware cost exist in the method, and the two types adopt an integrated HTAP database which shares one data but still depends on two different calculation engines or storage formats of transaction processing and analysis processing in the interior. Currently, whether separate architecture or an integrated HTAP database, the core challenge when faced with a hybrid load is how to intelligently and efficiently route immediate query requests to the most appropriate computing resources. The existing routing strategy is mostly based on simple rules, for example, coarse-grained judgment is performed according to whether query contains syntax features such as complex aggregation, table connection and the like, or division is performed statically according to data slicing and service modules. The method fails to deeply mine the real-time context and the dynamic system state of the query, has obvious limitations that firstly, the rule is static and stiff, the method cannot adapt to the dynamic change of the query mode and the different mixing degree, secondly, the method lacks comprehensive consideration of real-time loads (such as CPU, memory and I/O) of all nodes of the database cluster during decision, hot spot unevenness is easy to cause, the whole throughput and the response time are influenced, thirdly, the method lacks a closed-loop mechanism for continuous learning and self optimization from a historical routing result, and the routing strategy is difficult to adjust along with the system operation. Accordingly, there is a need for improvement and development in the art. Disclosure of Invention The invention provides a database intelligent routing method, device, equipment and storage medium, which are used for intelligently extracting query feature matching database nodes. The first aspect of the invention provides an intelligent routing method for a database, which comprises the steps of receiving a database query request, extracting static grammar characteristics and dynamic context characteristics of the database query request, inputting the static grammar characteristics and the dynamic context characteristics into a pre-trained query type classification model to obtain a transaction and analysis mixed tendency index corresponding to the database query request, judging preliminary node type preference based on the transaction and analysis mixed tendency index, screening nodes with corresponding type labels from a database cluster to be queried based on the preliminary node type preference, calculating the suitability score of the database query request and each candidate node, inputting the static grammar characteristics, the dynamic context characteristics and the suitability score into a pre-trained intelligent routing decision model to generate a routing decision result, wherein the routing decision result comprises a target node and routing confidence, performing self-adaption routing on the database query request to the target node based on the routing decision result, and executing the routing to the target node based on the routing decision result, and reflecting the query execution performance and resource consumption performance index in the query execution rewrite process. Optionally, in a first implementation manner of the first aspect of the present invention, the receiving the database query request, extracting the static grammar feature and the dynamic context feature of the database query request includes receiving the database query request, performing standardized parsing on an SQL statement of the database query request to obtain an abstract syntax tree, traversing the abstract syntax tree, extracting an operation type weight, a number of association tables, an aggregation function number and a complexity score, and a sub-query nesting depth as the static grammar feature, and extracting a transaction boundary identifier, a request arrival timestamp, and a user priority tag from an encapsulation protocol or an application log of the database query request as the dynamic context feature. Optionally, in a sec