CN-121786172-B - Question-answer type intelligent query method and system based on power distribution network knowledge base
Abstract
The application relates to the technical field of question-answer intelligent query, and in particular to a question-answer intelligent query method and system based on a power distribution network knowledge base. The method comprises: obtaining question-answer pairs, each consisting of a question text and an answer text, from historical question-answer records; classifying all question-answer pairs; screening to obtain high-frequency question sets and low-frequency question sets; obtaining the highly correlated high-frequency question sets of each low-frequency question set; obtaining the question association degree between each low-frequency question set and each of its highly correlated high-frequency question sets; obtaining the answer association degree between each question-answer pair of each low-frequency question set and each question-answer pair of each highly correlated high-frequency question set; obtaining the semantic consistency of each question-answer pair of each low-frequency question set; constructing an FAQ corpus; and performing intelligent question-answer queries. The application aims to improve the efficiency and accuracy of question-answer intelligent query based on a power distribution network knowledge base by constructing a comprehensive and accurate FAQ corpus.
Inventors
- OUYANG WEN
- WEN QIANG
- LIU LI
Assignees
- 湖南新天电数科技有限公司
Dates
- Publication Date
- 20260508
- Application Date
- 20260306
Claims (10)
- 1. A question-answer intelligent query method based on a power distribution network knowledge base, characterized by comprising the following steps: acquiring each question-answer pair, consisting of a question text and an answer text, from historical question-answer records; classifying all question-answer pairs according to the semantic information of the keywords in the question-answer pairs; for each class, acquiring question keyword sets from the keywords of the question text of each question-answer pair; screening the question keyword sets according to their occurrence to obtain high-frequency question sets and low-frequency question sets; acquiring the highly correlated high-frequency question sets of each low-frequency question set by comparing each low-frequency question set with all high-frequency question sets; acquiring the question association degree between each low-frequency question set and each of its highly correlated high-frequency question sets from the word-vector similarity of their keywords; acquiring the answer association degree between each question-answer pair belonging to each low-frequency question set and each question-answer pair belonging to each highly correlated high-frequency question set from the word-vector similarity of the keywords of their answer texts; acquiring the semantic consistency of each question-answer pair belonging to each low-frequency question set by combining the question association degree and the answer association degree; screening, according to the semantic consistency, the question-answer pairs used to construct an FAQ corpus; and performing intelligent question-answer queries using the FAQ corpus.
- 2. The question-answer intelligent query method based on a power distribution network knowledge base according to claim 1, wherein classifying all question-answer pairs comprises: concatenating the word vectors of all keywords of the question text in each question-answer pair with the word vectors of all keywords of the answer text to obtain a comprehensive vector for each question-answer pair; and classifying all question-answer pairs according to the similarity between their comprehensive vectors.
- 3. The question-answer intelligent query method based on a power distribution network knowledge base according to claim 1, wherein the high-frequency question sets and the low-frequency question sets are obtained as follows: counting the occurrence frequency of each question keyword set; recording a segmentation threshold of the occurrence frequencies of all question keyword sets as a frequency segmentation threshold; taking each question keyword set whose occurrence frequency is greater than or equal to the frequency segmentation threshold as a high-frequency question set; and taking each remaining question keyword set as a low-frequency question set.
- 4. The question-answer intelligent query method based on a power distribution network knowledge base according to claim 1, wherein the highly correlated high-frequency question sets of each low-frequency question set are obtained as follows: recording a segmentation threshold of the similarities between each low-frequency question set and all high-frequency question sets as a similarity segmentation threshold; and taking each high-frequency question set whose similarity to the low-frequency question set is greater than or equal to the similarity segmentation threshold as a highly correlated high-frequency question set of that low-frequency question set.
- 5. The question-answer intelligent query method based on a power distribution network knowledge base according to claim 1, wherein the question association degree is obtained as follows: recording the word-vector similarity between each keyword in each low-frequency question set and each keyword in each highly correlated high-frequency question set as a question keyword similarity; and obtaining the question association degree from the question keyword similarities between all keywords in the low-frequency question set and all keywords in the highly correlated high-frequency question set.
- 6. The question-answer intelligent query method based on a power distribution network knowledge base according to claim 5, wherein the question association degree is the mean of the question keyword similarities between all keywords in each low-frequency question set and all keywords in each highly correlated high-frequency question set.
- 7. The question-answer intelligent query method based on a power distribution network knowledge base according to claim 1, wherein the answer association degree is obtained as follows: for any first question-answer pair belonging to any low-frequency question set and any second question-answer pair belonging to any highly correlated high-frequency question set, recording the maximum of the word-vector similarities between any keyword of the answer text in the first question-answer pair and all keywords of the answer text in the second question-answer pair as the answer keyword similarity between that keyword and the second question-answer pair; calculating, in the same way, the answer keyword similarity between any keyword of the answer text in the second question-answer pair and the first question-answer pair; calculating a single mean over the answer keyword similarities of all keywords of the answer text in the first question-answer pair and the answer keyword similarities of all keywords of the answer text in the second question-answer pair; counting the proportion of answer-text keywords whose answer keyword similarity is greater than or equal to a preset threshold among all such keywords; and obtaining the answer association degree between the first question-answer pair and the second question-answer pair by fusing the mean and the proportion.
- 8. The question-answer intelligent query method based on a power distribution network knowledge base according to claim 1, wherein the semantic consistency is obtained as follows: acquiring a consistency coefficient between each question-answer pair of each low-frequency question set and each question-answer pair of each highly correlated high-frequency question set by combining the question association degree and the answer association degree; and taking the maximum of the consistency coefficients between each question-answer pair of each low-frequency question set and all question-answer pairs of all of its highly correlated high-frequency question sets as the semantic consistency of that question-answer pair.
- 9. The question-answer intelligent query method based on a power distribution network knowledge base according to claim 1, wherein screening, according to the semantic consistency, the question-answer pairs used to construct the FAQ corpus comprises: recording a segmentation threshold of the semantic consistencies of the question-answer pairs belonging to all low-frequency question sets in all classes as a semantic consistency segmentation threshold; and, when the semantic consistency of a question-answer pair belonging to a low-frequency question set is greater than the semantic consistency segmentation threshold, determining that question-answer pair to be a question-answer pair used to construct the FAQ corpus.
- 10. A question-answer intelligent query system based on a power distribution network knowledge base, comprising a memory, a processor and a computer program stored in the memory and running on the processor, characterized in that the processor implements the steps of the question-answer intelligent query method based on a power distribution network knowledge base according to any one of claims 1-9 when executing the computer program.
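For illustration only, the association degrees and semantic consistency recited in claims 5 to 8 might be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the function names, the use of cosine similarity as the word-vector similarity, the default threshold of 0.8, counting the proportion over the keywords of both answer texts, and the use of a product for both "fusing" and "combining" are all assumptions, since the claims do not fix them. Keyword vectors are assumed to be non-empty lists of floats.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def question_association(low_vecs, high_vecs):
    """Mean similarity over every keyword pair of the two question sets (claim 6)."""
    sims = [cosine(u, v) for u in low_vecs for v in high_vecs]
    return sum(sims) / len(sims)


def answer_association(ans_a, ans_b, threshold=0.8):
    """Answer association of two answer texts, given their keyword vectors (claim 7)."""
    # Answer keyword similarity: best match of each keyword in the other answer.
    best_a = [max(cosine(u, v) for v in ans_b) for u in ans_a]
    best_b = [max(cosine(v, u) for u in ans_a) for v in ans_b]
    both = best_a + best_b
    mean_sim = sum(both) / len(both)
    # ASSUMPTION: the proportion is counted over the keywords of both answers.
    ratio = sum(s >= threshold for s in both) / len(both)
    # ASSUMPTION: "fusing" is taken to be a simple product.
    return mean_sim * ratio


def semantic_consistency(candidates):
    """Maximum consistency coefficient over (question, answer) degree pairs (claim 8)."""
    # ASSUMPTION: the consistency coefficient is the product of the two degrees.
    return max(q * a for q, a in candidates)
```

Here `candidates` would hold one `(question_association, answer_association)` tuple for each question-answer pair of each highly correlated high-frequency question set, so the returned maximum corresponds to the best-matching high-frequency pair.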
Description
Question-answer type intelligent query method and system based on power distribution network knowledge base
Technical Field
The application relates to the technical field of question-answer intelligent query, and in particular to a question-answer intelligent query method and system based on a power distribution network knowledge base.
Background
In recent years, the structure of power distribution networks has become increasingly complex and the volume of operating data has grown massively, so traditional fault-knowledge queries that rely on expert experience are inefficient. The development of AI technology provides technical support for the digital and intelligent transformation of power distribution networks. Question-answer intelligent query based on a power distribution network knowledge base is one important application, and it can significantly improve knowledge retrieval efficiency and the operational stability of the power distribution network.
However, existing question-answer intelligent query methods mainly retrieve the answer to an input question from a massive power distribution network knowledge base. In practice, the knowledge base holds a very large amount of data of complex and varied types, and high-frequency and low-frequency questions are stored together, which lengthens the waiting time for queries about high-frequency questions. Building an FAQ corpus by counting high-frequency questions improves query efficiency, but this approach focuses mainly on question frequency and does not fully consider the semantic similarity of questions expressed in different forms, so questions of the same type that share the same meaning are difficult to identify accurately. For example, different people phrase the same question differently; although the answers are consistent, the counted frequency of each phrasing is low. As a result, the FAQ corpus is incomplete, which affects the efficiency and accuracy of question-answer intelligent query based on the power distribution network knowledge base.
Disclosure of Invention
In view of the above, it is necessary to provide a question-answer intelligent query method and system based on a power distribution network knowledge base which, compared with traditional question-answer intelligent query methods based on a power distribution network knowledge base, improve the efficiency and accuracy of such queries by constructing a comprehensive and accurate FAQ corpus.
In a first aspect, an embodiment of the present application provides a question-answer intelligent query method based on a power distribution network knowledge base, the method comprising the following steps: acquiring each question-answer pair, consisting of a question text and an answer text, from historical question-answer records; classifying all question-answer pairs according to the semantic information of the keywords in the question-answer pairs; for each class, acquiring question keyword sets from the keywords of the question text of each question-answer pair; screening the question keyword sets according to their occurrence to obtain high-frequency question sets and low-frequency question sets; acquiring the highly correlated high-frequency question sets of each low-frequency question set by comparing each low-frequency question set with all high-frequency question sets; acquiring the question association degree between each low-frequency question set and each of its highly correlated high-frequency question sets from the word-vector similarity of their keywords; acquiring the answer association degree between each question-answer pair belonging to each low-frequency question set and each question-answer pair belonging to each highly correlated high-frequency question set from the word-vector similarity of the keywords of their answer texts; acquiring the semantic consistency of each question-answer pair belonging to each low-frequency question set by combining the question association degree and the answer association degree; screening, according to the semantic consistency, the question-answer pairs used to construct an FAQ corpus; and performing intelligent question-answer queries using the FAQ corpus.
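As an illustration only, the frequency screening step and the selection of highly correlated high-frequency question sets in the method above might be sketched in Python as follows. The function names, the caller-supplied thresholds, and the Jaccard overlap used as the similarity between keyword sets are assumptions made for this sketch; the text above does not specify how the segmentation thresholds or the set similarity are computed.

```python
from collections import Counter


def split_by_frequency(keyword_sets, freq_threshold):
    """Split distinct question keyword sets into high- and low-frequency sets.

    keyword_sets: one iterable of keywords per question text in a class.
    freq_threshold: ASSUMPTION, supplied by the caller rather than derived.
    """
    counts = Counter(frozenset(ks) for ks in keyword_sets)
    high = [ks for ks, n in counts.items() if n >= freq_threshold]
    low = [ks for ks, n in counts.items() if n < freq_threshold]
    return high, low


def jaccard(a, b):
    """ASSUMPTION: Jaccard overlap stands in for the unspecified set similarity."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def highly_correlated(low_set, high_sets, sim_threshold):
    """High-frequency sets whose similarity to low_set reaches the threshold."""
    return [h for h in high_sets if jaccard(low_set, h) >= sim_threshold]


# Hypothetical question keyword sets for one class of question-answer pairs.
questions = (
    [["transformer", "overheating"]] * 3
    + [["transformer", "temperature", "overheating"]]
    + [["breaker", "trip"]]
)
high_sets, low_sets = split_by_frequency(questions, freq_threshold=2)
for low_set in low_sets:
    print(sorted(low_set), "->", highly_correlated(low_set, high_sets, 0.5))
```

In this hypothetical data, a rarely seen phrasing that shares most of its keywords with a frequent one is linked to that frequent set, while an unrelated rare question is linked to none.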
In one embodiment, all question-answer pairs are classified as follows: the word vectors of all keywords of the question text in each question-answer pair are concatenated with the word vectors of all keywords of the answer text to obtain a comprehensive vector for each question-answer pair; and all question-answer pairs are classified according to the similarity between their comprehensive vectors. In one embodiment, the acquiring process of the high-frequency set of each problem and the low-frequency set of each problem is as fo