CN-122019717-A - Meteorological question-answer matching method and device
Abstract
The invention provides a weather question-answer matching method and device, relating to the technical field of natural language processing. The method comprises: acquiring a natural language question input by a user; performing semantic parsing and intent recognition on the natural language question to obtain a classification tag vector; retrieving an entity relation set from a weather domain knowledge graph based on the classification tag vector, and retrieving text segments from a text library; calculating attention weights of the classification tag vector, the entity relation set and the text segments using a dynamic attention model; performing feature fusion on the classification tag vector, the entity relation set and the text segments based on the attention weights to obtain a guide vector; and generating response information corresponding to the natural language question based on the guide vector. With the method and device provided by the invention, guided reasoning based on the guide vector can be carried out more accurately and comprehensively, which helps ensure the accuracy of the response information and in turn improves the user's interactive experience.
Inventors
- ZHAO WENFANG
- MENG HUIFANG
- HUANG MINGMING
- FAN MIN
- MIAO YUPENG
- Mu Qizhan
- Tian Xuwei
Assignees
- 北京市气象数据中心(北京市气象档案馆)
Dates
- Publication Date
- 20260512
- Application Date
- 20260128
Claims (10)
- 1. A weather question-answer matching method, the method comprising: acquiring a natural language question input by a user; performing semantic parsing and intent recognition on the natural language question to obtain a classification tag vector of the natural language question, wherein each element of the classification tag vector represents the probability that the natural language question belongs to a corresponding question type; retrieving, based on the classification tag vector, a plurality of entities and a plurality of relations related to the natural language question from a pre-constructed weather domain knowledge graph to obtain an entity relation set comprising the plurality of entities and the plurality of relations; retrieving a preset number of text segments semantically related to the natural language question from a pre-constructed text library; calculating attention weights of the classification tag vector, the entity relation set and the text segments using a pre-constructed dynamic attention model, and performing feature fusion on the classification tag vector, the entity relation set and the text segments based on the attention weights to obtain a guide vector corresponding to the natural language question; and generating response information corresponding to the natural language question based on the guide vector.
- 2. The method of claim 1, wherein the step of performing semantic parsing and intent recognition on the natural language question to obtain a classification tag vector of the natural language question comprises: performing word segmentation preprocessing on the natural language question to obtain input information that corresponds to the natural language question and meets a preset format; inputting the input information into a pre-trained natural language processing model, and extracting semantic features of the natural language question through the natural language processing model; and predicting, based on the semantic features, the probability that the natural language question belongs to each question type, and generating the classification tag vector from the prediction result.
- 3. The method of claim 1, wherein the step of retrieving a plurality of entities related to the natural language question from a pre-constructed weather domain knowledge graph based on the classification tag vector comprises: acquiring a directed graph corresponding to the weather domain knowledge graph; mapping the weather-related entities in the directed graph and the relations among them into a pre-constructed low-dimensional embedding space to obtain weather entity vectors and corresponding relation vectors; calculating, in the low-dimensional embedding space, the degree of correlation between each weather entity vector and the classification tag vector; and determining each entity whose weather entity vector has a degree of correlation greater than a preset threshold to be an entity related to the natural language question.
- 4. The method of claim 3, wherein the step of retrieving a plurality of relations related to the natural language question from a pre-constructed weather domain knowledge graph based on the classification tag vector comprises: calculating the similarity between each relation vector and the classification tag vector; and selecting a preset number of relation vectors with the highest similarity as the plurality of relations related to the natural language question.
- 5. The method of claim 3, wherein the step of calculating the degree of correlation between each weather entity vector and the classification tag vector comprises: calculating the semantic similarity and the frequency correlation between each weather entity vector and the classification tag vector; weighting the semantic similarity and the frequency correlation according to preset weight parameters to obtain a correlation score between the weather entity vector and the classification tag vector; and weighting the correlation score by each element of the classification tag vector to obtain the degree of correlation.
- 6. The method of claim 1, wherein the step of retrieving a preset number of text segments semantically related to the natural language question from the pre-constructed text library comprises: constructing a comprehensive vector based on the natural language question and the classification tag vector; and inputting the comprehensive vector into a pre-constructed semantic matching model, so that the semantic matching model retrieves from the text library, based on the comprehensive vector, the preset number of text segments most semantically relevant to the natural language question.
- 7. The method of claim 1, wherein the step of calculating attention weights of the classification tag vector, the entity relation set and the text segments using a pre-constructed dynamic attention model comprises: extracting feature representations of the classification tag vector, the entity relation set and the text segments, respectively; and constructing a corresponding dynamic attention weight function for each feature representation, and calculating the attention weight of each feature representation according to its dynamic attention weight function.
- 8. The method of claim 7, wherein the step of performing feature fusion on the classification tag vector, the entity relation set and the text segments based on the attention weights to obtain a guide vector corresponding to the natural language question comprises: performing weighted fusion, using the attention weights, on the feature representations corresponding to the classification tag vector, the entity relation set and the text segments to obtain the guide vector corresponding to the natural language question.
- 9. The method of claim 1, wherein the method further comprises: if the classification tag vector indicates that the question type of the natural language question is a simple type, retrieving response information corresponding to the natural language question from the weather domain knowledge graph, or from the text library, based on the semantics of the natural language question.
- 10. A weather question-answer matching device, the device comprising: an acquisition module, configured to acquire a natural language question input by a user; a parsing module, configured to perform semantic parsing and intent recognition on the natural language question to obtain a classification tag vector of the natural language question, wherein each element of the classification tag vector represents the probability that the natural language question belongs to a corresponding question type; a retrieval module, configured to retrieve, based on the classification tag vector, a plurality of entities and a plurality of relations related to the natural language question from a pre-constructed weather domain knowledge graph to obtain an entity relation set comprising the plurality of entities and the plurality of relations; a fusion module, configured to calculate attention weights of the classification tag vector, the entity relation set and the text segments using a pre-constructed dynamic attention model, and to perform feature fusion on the classification tag vector, the entity relation set and the text segments based on the attention weights to obtain a guide vector corresponding to the natural language question; and a response module, configured to generate response information corresponding to the natural language question based on the guide vector.
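The entity and relation selection recited in claims 3 to 5 can be illustrated with a minimal Python sketch. The claims do not say how a probability vector over question types is compared with vectors in the embedding space, so this sketch assumes one hypothetical prototype vector per question type (`type_prototypes`). The weights `alpha` and `beta`, the `threshold`, the per-type frequency statistics and all function names are likewise illustrative assumptions and are not taken from the patent.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def entity_relevance(entity_vec, freq_by_type, type_prototypes, tag_vector,
                     alpha=0.7, beta=0.3):
    """Assumed reading of claim 5.

    For each question type, combine semantic similarity and frequency
    correlation with preset weights, then weight the per-type scores by the
    corresponding element of the classification tag vector.
    """
    total = 0.0
    for prob, proto, freq in zip(tag_vector, type_prototypes, freq_by_type):
        score = alpha * cosine(entity_vec, proto) + beta * freq
        total += prob * score
    return total


def select_entities(entities, type_prototypes, tag_vector, threshold=0.5):
    """Assumed reading of claim 3: keep entities whose relevance exceeds a threshold.

    `entities` maps a name to (embedding vector, per-type frequency list).
    """
    return [name for name, (vec, freqs) in entities.items()
            if entity_relevance(vec, freqs, type_prototypes, tag_vector) > threshold]


def tag_query(tag_vector, type_prototypes):
    """Project the tag vector into the embedding space (hypothetical helper)."""
    dim = len(type_prototypes[0])
    return [sum(p * proto[i] for p, proto in zip(tag_vector, type_prototypes))
            for i in range(dim)]


def top_k_relations(relations, query_vec, k=2):
    """Assumed reading of claim 4: the k relation vectors most similar to the query."""
    ranked = sorted(relations, key=lambda name: cosine(relations[name], query_vec),
                    reverse=True)
    return ranked[:k]
```

A caller would pass the output of `tag_query` to `top_k_relations`, so that entities and relations are both ranked against the same question-type distribution.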
Description
Meteorological question-answer matching method and device
Technical Field
The invention relates to the technical field of natural language processing, and in particular to a weather question-answer matching method and device.
Background
In intelligent question-answering systems for the meteorological field, the prior art mainly comprises two types of schemes. The first type is based on traditional information retrieval. It retrieves text segments or data related to the question from a structured weather database or from unstructured weather documents through keyword matching and returns them directly to the user. This approach is suitable for simple, factual questions such as "What is today's air temperature?". The second type is based on a general-purpose large language model. It relies on the natural language generation capability of a general-purpose large model that has not been optimized for the domain to generate answers directly to complex questions input by users.
However, the main drawback of the first type, the traditional retrieval scheme, is its lack of deep semantic understanding and reasoning capability. The system cannot understand complex logical relations implicit in the question, such as temporal order and causality; it can only return scattered, fragmented information and cannot automatically fuse that information into a coherent, structured answer, so the user experience is poor.
The main drawback of the second type, the general-purpose large model scheme, is the low reliability of the generated result. Specifically: it is prone to factual hallucination, for example the large model may generate an answer that sounds plausible but is meteorologically wrong; its knowledge lags and it lacks real-time capability, for example the knowledge in the large model's parameters is static, so the latest real-time weather data (such as a newly issued typhoon track forecast) cannot be accessed or used; and its domain expertise is insufficient, for example the lack of accurate weather domain knowledge support makes the reasoning process insufficiently rigorous, so the professional quality of its conclusions cannot be guaranteed. Therefore, when facing complex weather questions, prior-art intelligent question-answering systems in the meteorological field have difficulty fusing multi-source heterogeneous information and are weak at complex semantics and multi-step reasoning. As a result, it is difficult to balance the accuracy, timeliness and professionalism of answers, which further degrades the user experience.
Disclosure of Invention
Accordingly, an object of the invention is to provide a weather question-answer matching method and device to alleviate the above technical problems.
In a first aspect, an embodiment of the invention provides a weather question-answer matching method, the method comprising: acquiring a natural language question input by a user; performing semantic parsing and intent recognition on the natural language question to obtain a classification tag vector of the natural language question, wherein each element of the classification tag vector represents the probability that the natural language question belongs to a corresponding question type; retrieving, based on the classification tag vector, a plurality of entities and a plurality of relations related to the natural language question from a pre-constructed weather domain knowledge graph to obtain an entity relation set containing the entities and the relations; retrieving a preset number of text segments semantically related to the natural language question from a pre-constructed text library; calculating attention weights of the classification tag vector, the entity relation set and the text segments using a pre-constructed dynamic attention model, and performing feature fusion on the classification tag vector, the entity relation set and the text segments based on the attention weights to obtain a guide vector corresponding to the natural language question; and generating response information corresponding to the natural language question based on the guide vector.
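The attention-weighting and fusion steps of the method above can be illustrated with a minimal Python sketch. The source does not specify the form of the dynamic attention weight functions, so the sketch substitutes a standard scaled dot-product score against a hypothetical question embedding (`question_vec`), normalized with softmax. Mean pooling of the entity relation set and text segments into single feature vectors is also an assumption, as are all function names.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mean_pool(vectors):
    """Average a list of equal-length vectors into one feature vector (assumed pooling)."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def attention_weights(question_vec, features):
    """One weight per information source: scaled dot product with the question, then softmax.

    This is a stand-in for the dynamic attention model, whose exact form is
    not given in the source.
    """
    scale = math.sqrt(len(question_vec))
    scores = [sum(q * f for q, f in zip(question_vec, feat)) / scale
              for feat in features]
    return softmax(scores)


def fuse(features, weights):
    """Weighted sum of the source features, yielding the guide vector."""
    dim = len(features[0])
    return [sum(w * feat[i] for w, feat in zip(weights, features))
            for i in range(dim)]


def guide_vector(question_vec, tag_feature, entity_relation_vecs, text_segment_vecs):
    """Fuse the three sources into a guide vector; all inputs share one dimension."""
    features = [
        tag_feature,                       # feature of the classification tag vector
        mean_pool(entity_relation_vecs),   # feature of the entity relation set
        mean_pool(text_segment_vecs),      # feature of the retrieved text segments
    ]
    weights = attention_weights(question_vec, features)
    return fuse(features, weights), weights
```

Because the weights are recomputed for every question, a source whose feature aligns more closely with the question contributes more to the guide vector, which is one plausible sense in which the attention is "dynamic".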
With reference to the first aspect, an embodiment of the present invention provides a first possible implementation manner of the first aspect, where the step of performing semantic analysis and intent recognition on the natural language question to obtain a classification tag vector of the natural language question includes performing word segmentation preprocessing on the natural language question to obtain input information meeting a preset format corresponding to the natural language question, inputting the input information into a pre-trained natural language processing model, extracting semantic features of the natural language question through the natural language processing model, predicting a probabili