CN-121996755-A - Intelligent power data question-answering method and device based on efficient fine tuning and tool calling

Abstract

The invention discloses an intelligent power data question-answering method and device based on efficient fine-tuning and tool calling, and relates to the technical field of artificial intelligence. The method comprises: constructing question templates from historical high-frequency natural language queries on power business data, and writing an SQL code template corresponding to each question template; acquiring data values for filling the question templates and the SQL code templates and setting them as candidates; randomly selecting a question template and candidates, and filling the question template and its corresponding SQL code template to generate a fine-tuning corpus; based on the fine-tuning corpus, using the LoRA efficient fine-tuning method and a defined tool-calling mechanism to enable a Text-to-SQL model to automatically generate SQL query statements and complete time-parsing and region-mapping tasks; and, when an SQL query fails or returns an empty result, supplementing knowledge through a RAG compensation tool. The method thereby achieves natural language querying, automatic analysis, and visual display of power data, and improves the efficiency and intelligence of data analysis.
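
The template-filling step described in the abstract can be sketched as follows. This is only a minimal illustration, not the patent's implementation: the table name `power_usage`, its columns, the placeholder names, and the candidate values are all hypothetical.

```python
import random

# Hypothetical question/SQL template pairs; table and column names are
# illustrative assumptions, not taken from the patent.
TEMPLATES = [
    (
        "What was the total electricity consumption of {region} in {year}?",
        "SELECT SUM(kwh) FROM power_usage "
        "WHERE region = '{region}' AND year = {year};",
    ),
    (
        "How much electricity did the {industry} industry use in {region} in {year}?",
        "SELECT SUM(kwh) FROM power_usage WHERE industry = '{industry}' "
        "AND region = '{region}' AND year = {year};",
    ),
]

# Hypothetical candidate values for each placeholder.
CANDIDATES = {
    "region": ["Beijing", "Shanghai"],
    "industry": ["steel", "textile"],
    "year": [2022, 2023],
}


def make_sample(rng):
    """Pick one template pair and fill both sides with the same random values."""
    question_tpl, sql_tpl = rng.choice(TEMPLATES)
    values = {slot: rng.choice(options) for slot, options in CANDIDATES.items()}
    # str.format ignores keyword arguments a template does not use.
    return {
        "question": question_tpl.format(**values),
        "sql": sql_tpl.format(**values),
    }


def build_corpus(n, seed=0):
    """Generate n question/SQL pairs with a seeded random generator."""
    rng = random.Random(seed)
    return [make_sample(rng) for _ in range(n)]
```

Filling the question and the SQL template from the same value dictionary keeps each pair consistent, which is the property a fine-tuning corpus of this kind relies on.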

Inventors

ZHAO WEIBO
TIAN CHUANBO
ZHANG LINQING
WEN HUAJIE
DU CONGCONG
XU JING
WEI YUSI
WU YAJIE
JIN LU
BAI XUEFENG
QIU MIN
LUO XIONG
BAI XUE
QIU JINGYUE
YANG YUANDONG
ZHOU YING
CHEN KE
LIU YINGHUI
ZHOU YAJUN

Assignees

中国电力科学研究院有限公司
北京科技大学

Dates

Publication Date: 20260508
Application Date: 20251231

Claims (10)

1. An intelligent power data question-answering method based on efficient fine-tuning and tool calling, characterized by comprising the following steps: S1, constructing question templates from historical high-frequency natural language queries on power business data; S2, acquiring data values for filling the question templates and SQL code templates and setting them as candidates, randomly selecting one question template, randomly selecting candidates to fill the question template and its corresponding SQL code template, and generating a fine-tuning corpus for a pre-trained Text-to-SQL model; S3, based on the fine-tuning corpus, fine-tuning the pre-trained Text-to-SQL model with the LoRA algorithm to obtain a fine-tuned Text-to-SQL large model, wherein the fine-tuned Text-to-SQL large model is encapsulated as an SQL tool; S4, constructing a tool-calling mechanism, wherein the tool-calling mechanism comprises an SQL generation tool encapsulating the fine-tuned Text-to-SQL large model, and a RAG compensation tool; S5, acquiring a natural language question queried by a user and inputting it into a general large language model, wherein the general large language model performs intent recognition and parameter parsing on the user query and determines whether to call a tool; S6, the general large language model, based on intermediate results returned by the tools, calls the SQL generation tool encapsulating the fine-tuned Text-to-SQL large model, generates and executes a target SQL statement, and outputs a structured query result; if the structured query result is judged to be valid, an answer to the user's natural language question is generated based on the structured query result; if the structured query result is judged to be abnormal, the RAG compensation tool is called to supplement knowledge and obtain a retrieval result, and the answer to the user's natural language question is output according to the retrieval result.
2. The intelligent power data question-answering method based on efficient fine-tuning and tool calling according to claim 1, wherein the tool-calling mechanism further comprises a time parsing and standardization tool, an industry and region mapping tool, an SQL execution and security audit tool, and a charting tool.
3. The intelligent power data question-answering method based on efficient fine-tuning and tool calling according to claim 2, wherein the time parsing and standardization tool is used for stably mapping the user's semantic time expressions into executable time conditions and eliminating ambiguity; the industry and region mapping tool is used for maintaining a standard coding system and an alias library for industries and regions, so as to map natural language names to unique codes; the SQL generation tool encapsulating the fine-tuned Text-to-SQL large model is used for outputting the structured query result corresponding to the user's natural language question; the SQL execution and security audit tool is used for applying compliance checks and a read-only execution mechanism to SQL, including checking table, field, and function whitelists, intercepting dangerous statements, enforcing a limit on the size of returned results, and controlling timeouts; the charting tool is used for defining automatic mapping rules from results to chart types, unifying chart metadata and display conventions, supporting common visualization needs such as time series and categorical summaries, and outputting the structured description required for chart configuration and display, so that chart style and statistical conventions remain consistent across scenarios; and the RAG compensation tool is used for retrieving and supplementing, from a knowledge base, the information most relevant to the user's natural language question.
4. The intelligent power data question-answering method based on efficient fine-tuning and tool calling according to claim 1, wherein fine-tuning the pre-trained Text-to-SQL model with the LoRA algorithm based on the fine-tuning corpus in step S3 to obtain the fine-tuned Text-to-SQL large model comprises: inserting LoRA at the linear projection positions of the attention layers and feed-forward layers of the pre-trained Text-to-SQL model, freezing the base weights, and training only the low-rank increment parameters; and keeping the weight matrix of the pre-trained Text-to-SQL model fixed and adjusting the weights of the pre-trained Text-to-SQL model by adding a low-rank matrix, to obtain the fine-tuned Text-to-SQL large model.
5. The intelligent power data question-answering method based on efficient fine-tuning and tool calling according to claim 4, wherein the process of obtaining the fine-tuned Text-to-SQL large model is represented by the following formula (1): $$(W + \Delta W)\,x = W x + B A x, \qquad A \in \mathbb{R}^{r \times k},\; B \in \mathbb{R}^{d \times r} \tag{1}$$ wherein $x$ represents the feature representation of the user's natural language query input into the Text-to-SQL large model; $A$ represents a trainable first matrix; $B$ represents a trainable second matrix; $r$ represents the rank of the matrices; $d$ represents the number of rows of the weight matrix; $k$ represents the number of columns of the weight matrix, where the rank $r$ is far smaller than the minimum of the number of rows $d$ and the number of columns $k$ of the weight matrix, i.e. $r \ll \min(d, k)$; $\Delta W = BA$ represents the low-rank matrix; and $W$ represents the weight matrix of the Text-to-SQL large model.
6. The intelligent power data question-answering method based on efficient fine-tuning and tool calling according to claim 1, wherein, in S6, calling the RAG compensation tool to supplement knowledge and obtain a retrieval result if the structured query result is abnormal comprises: if the structured query result is abnormal, calling the RAG compensation tool and, based on a pre-constructed knowledge base, retrieving the information most relevant to the user's natural language question by means of semantic nearest-neighbor search and keyword matching, to obtain candidate retrieval results; applying rule-based filtering to the candidate retrieval results according to business constraints to obtain filtered retrieval results, wherein the filtering rules cover validity period, regional scope, industry statistical scope, and electricity type; and ranking the filtered results, and concatenating the top three ranked entries with a preset prompt template to obtain a standardized context.
7. The intelligent power data question-answering method based on efficient fine-tuning and tool calling according to claim 1, wherein outputting the answer to the user's natural language question according to the retrieval result comprises: performing structured parsing on the user's natural language question, and extracting the time type, electricity type, start and end dates, industry code, and region code parameters to generate a unique code used as a cache key, wherein two queries are judged to be the same query only when their code contents are completely identical; when a query request sent by a user is received, looking up the cache with the unique code and, if the cache is hit, inputting the cached historical query data into the general large language model to obtain the query result; and inputting the standardized context into the general large language model, and pushing the output to the client as incremental text in sentence-level, paragraph-level, or token-level fragments through SSE (Server-Sent Events) streaming until an end marker is sent, thereby outputting the incremental content of the final answer.
8. An intelligent power data question-answering device based on efficient fine-tuning and tool calling, used for implementing the intelligent power data question-answering method based on efficient fine-tuning and tool calling according to any one of claims 1-7, characterized in that the device comprises: a first construction unit, used for constructing question templates from historical high-frequency natural language queries on power business data; a generating unit, used for acquiring data values for filling the question templates and SQL code templates and setting them as candidates, randomly selecting one question template, randomly selecting candidates to fill the question template and its corresponding SQL code template, and generating a fine-tuning corpus for a pre-trained Text-to-SQL model; a fine-tuning unit, used for fine-tuning the pre-trained Text-to-SQL model with the LoRA algorithm based on the fine-tuning corpus to obtain a fine-tuned Text-to-SQL large model, wherein the fine-tuned Text-to-SQL large model is encapsulated as an SQL tool; a second construction unit, used for constructing a tool-calling mechanism, wherein the tool-calling mechanism comprises an SQL generation tool encapsulating the fine-tuned Text-to-SQL large model, and a RAG compensation tool; a first output unit, used for acquiring a natural language question queried by a user and inputting it into a general large language model; and a second output unit, used for calling, based on intermediate results returned by the tools, the SQL generation tool encapsulating the fine-tuned Text-to-SQL large model, generating and executing a target SQL statement, and outputting a structured query result; generating an answer to the user's natural language question based on the structured query result if the structured query result is judged to be valid; and, if the structured query result is judged to be abnormal, calling the RAG compensation tool to supplement knowledge and obtain a retrieval result, and outputting the answer to the user's natural language question according to the retrieval result.
9. Intelligent power data question-answering equipment based on efficient fine-tuning and tool calling, characterized in that the equipment comprises: a processor; and a memory having stored thereon computer-readable instructions which, when executed by the processor, implement the method of any one of claims 1 to 7.
10. A computer readable storage medium having stored therein program code which is callable by a processor to perform the method of any one of claims 1 to 7.
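
The execute-then-compensate flow of step S6 in claim 1 (run the generated SQL, fall back to retrieval when execution fails or the result is empty) can be sketched as below. This is a simplified illustration under stated assumptions rather than the claimed implementation: SQLite stands in for the power database, the read-only check is a bare `SELECT` prefix test instead of the whitelist audit of claim 3, and `rag_fallback` is a hypothetical keyword-overlap ranker standing in for the RAG compensation tool of claim 6.

```python
import sqlite3


def run_sql_tool(conn, sql):
    """Execute a query read-only; return (rows, error_message_or_None)."""
    # Assumption: a prefix check stands in for the full security audit.
    if not sql.lstrip().lower().startswith("select"):
        return [], "only SELECT statements are allowed"
    try:
        return conn.execute(sql).fetchall(), None
    except sqlite3.Error as exc:
        return [], str(exc)


def rag_fallback(question, knowledge_base, top_k=3):
    """Hypothetical stand-in for the RAG tool: rank documents by word overlap."""
    words = set(question.lower().split())
    ranked = sorted(
        knowledge_base,
        key=lambda doc: len(words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:top_k]


def answer(question, sql, conn, knowledge_base):
    """Return SQL rows when valid, otherwise retrieved knowledge."""
    rows, error = run_sql_tool(conn, sql)
    # A result counts as valid only if it holds at least one non-NULL value,
    # so an aggregate over zero rows (a single NULL) is treated as empty.
    valid = error is None and any(v is not None for row in rows for v in row)
    if valid:
        return {"source": "sql", "result": rows}
    return {"source": "rag", "result": rag_fallback(question, knowledge_base)}
```

Treating an all-NULL aggregate as empty is a design choice of this sketch; the claims only distinguish a valid result from an abnormal one and do not specify how that judgment is made.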

Description

Intelligent power data question-answering method and device based on efficient fine-tuning and tool calling

Technical Field

The invention relates to the technical field of artificial intelligence, and in particular to an intelligent power data question-answering method and device based on efficient fine-tuning and tool calling.

Background

As the data scale of the power industry keeps growing, the traditional query mode that relies on manually written SQL statements struggles to meet the requirements of real-time analysis and multidimensional statistics; query efficiency is low and the error rate is high, particularly when complex combinations of time, region, and industry conditions are involved. At present, data analysis and report statistics systems in the power industry generally use manual SQL-based querying. Such a system typically consists of a database management module, a visual display module, and a user input interface, and data retrieval and statistics are carried out through manually written query statements. Because power business data involves multidimensional structured information such as time, region, industry, and electricity consumption type, SQL statements are cumbersome to write and the query process depends on professional knowledge. As a result, system usability is low, query efficiency is poor, and natural language interaction and real-time response are difficult to support.

In recent years, Text-to-SQL models have been introduced into the database question-answering field, allowing users to generate SQL automatically from natural language. However, most existing models are trained on general corpora and do not incorporate the specialized fields and query logic of the power industry, so their semantic matching capability is insufficient and the accuracy of the generated results is low. Meanwhile, existing models generally use full-parameter training in the fine-tuning stage, which consumes substantial computing resources and makes deployment costly, so the requirements of enterprise scenarios for rapid adaptation and low-cost updating cannot be met. Some studies introduce LoRA for lightweight fine-tuning but do not perform domain fusion tailored to the characteristics of power data, which limits the generalization capability of the model in professional question answering.

In addition, the prior art lacks a complete tool-calling mechanism: the model cannot automatically complete time-condition parsing, region and industry field mapping, or visual output, so manual auxiliary processing is needed and the interaction chain is incomplete. A RAG retrieval compensation mechanism is also lacking, so when SQL execution fails or the result is empty the system cannot provide alternative answers or supplementary knowledge, and overall robustness is insufficient.

In summary, the prior art cannot provide a power data question-answering method that combines efficient fine-tuning, intelligent tool calling, and knowledge retrieval compensation, and it is difficult to meet the power industry's requirements for intelligent and accurate data analysis.

Disclosure of Invention

In order to solve the technical problems in the prior art of complex query statement writing, poor model adaptability, and the lack of automated tool coordination and result compensation in power data analysis, the embodiments of the invention provide an intelligent power data question-answering method and device based on efficient fine-tuning and tool calling.
The technical solution is as follows: In one aspect, an intelligent power data question-answering method based on efficient fine-tuning and tool calling is provided. The method is implemented by intelligent power data question-answering equipment based on efficient fine-tuning and tool calling, and comprises the following steps: S1, constructing question templates from historical high-frequency natural language queries on power business data; S2, acquiring data values for filling the question templates and SQL code templates and setting them as candidates, randomly selecting one question template, randomly selecting candidates to fill the question template and its corresponding SQL code template, and generating a fine-tuning corpus for a pre-trained Text-to-SQL model; S3, based on the fine-tuning corpus, fine-tuning the pre-trained Text-to-SQL model with the LoRA algorithm to obtain a fine-tuned Text-to-SQL large model, wherein the fine-tuned Text-to-SQL large model is encapsulated as an SQL tool; S4, constructing a tool-calling mechanism, wherein the tool-calling mechanism comprises an SQL generation tool encapsulating the fine-tuned Text-to-SQL large model, and a RAG compensation tool; S5, acquiring a natural language question queried by a user and inputting it into a general large language model, wherein the general large language model performs intent recognition and parameter parsing on the user query