CN-116662515-B - Search type multi-round dialogue method and device, storage medium and electronic equipment
Abstract
The invention discloses a search-type multi-round dialogue method and device, a storage medium, and electronic equipment, belonging to the fields of natural language processing and artificial intelligence. It addresses the technical problem of how to effectively use the last sentence of a history dialogue to perform semantic matching and information screening on the remaining history dialogue sequence while simultaneously screening candidate responses, so as to improve the prediction accuracy of a multi-round dialogue model. The method comprises: constructing a multi-round dialogue model; for each piece of data in a multi-round dialogue data set, marking each sentence of the history dialogue in dialogue order to construct a history dialogue sequence; constructing candidate responses for each history dialogue sequence; encoding each sentence in the history dialogue sequence and each candidate response to obtain an encoded representation of each history sentence and a candidate response encoded representation; and training the multi-round dialogue model.
Inventors
- Du Yuehan
- Mou Beiqi
- Chen Weimin
- Gao Jun
- Zhang Xiang
- Lu Wenpeng
Assignees
- 山东省精神卫生中心 (Shandong Mental Health Center)
- 齐鲁工业大学(山东省科学院) (Qilu University of Technology (Shandong Academy of Sciences))
Dates
- Publication Date
- 20260505
- Application Date
- 20230615
Claims (8)
- 1. A search-type multi-round dialogue method, characterized in that the method comprises the following steps:
Obtaining a multi-round dialogue data set: crawling dialogue data from an online public medical question-answering platform to obtain a multi-round dialogue knowledge base, constructing positive-example and negative-example data for each piece of dialogue data, and dividing the data into a training set and a test set according to a set proportion.
Constructing a multi-round dialogue model: for each piece of data in the multi-round dialogue data set, marking each sentence of the history dialogue in dialogue order to construct a history dialogue sequence, and constructing candidate responses for each history dialogue sequence; encoding each sentence in the history dialogue sequence and each candidate response to obtain encoded representations of the history sentences and of the candidate response; taking the boundary sentence (the last sentence of the history dialogue) as the demarcation point, performing semantic matching and information screening on the encoded representation of each remaining dialogue sentence, and simultaneously performing information screening on the candidate response encoded representation; obtaining a dialogue semantic representation through an aggregation operation; mapping the dialogue semantic representation to floating-point data on a designated interval, taken as the matching degree between the candidate response and the history dialogue; and comparing the matching degrees of the different candidate responses, taking the candidate response with the highest matching degree as the correct response.
Training the multi-round dialogue model on the training set, constructing a loss function and an optimization function, and completing the prediction of candidate responses.
The demarcation matching and screening is specifically as follows (denoting the boundary-sentence encoded representation by H_n, the encoded representations of the remaining sentences by H_1, ..., H_{n-1}, and the candidate response encoded representation by H_r):
Couple the boundary-sentence encoded representation H_n with encoded representation H_1, and map the coupling result with a single fully connected layer Dense whose activation function is Sigmoid to obtain the matching degree m_1 between H_1 and H_n; multiply m_1 with H_1 to obtain screening representation S_1. The formulas are as follows:
m_1 = Sigmoid(Dense([H_n; H_1]));
S_1 = m_1 · H_1;
wherein H_n denotes the boundary-sentence encoded representation and H_1 denotes encoded representation 1.
Couple H_n with encoded representation H_2, and map the coupling result with a single fully connected layer Dense whose activation function is Sigmoid to obtain the matching degree m_2 between H_2 and H_n; multiply m_2 with H_2 to obtain screening representation S_2. The formulas are as follows:
m_2 = Sigmoid(Dense([H_n; H_2]));
S_2 = m_2 · H_2;
wherein H_n denotes the boundary-sentence encoded representation and H_2 denotes encoded representation 2.
Link H_n with encoded representation H_3; the subsequent operations are the same as those used to obtain screening representation S_2, yielding screening representation S_3; and so on, until H_n is coupled with encoded representation H_{n-1}, yielding screening representation S_{n-1}. The formulas are as follows:
m_{n-1} = Sigmoid(Dense([H_n; H_{n-1}]));
S_{n-1} = m_{n-1} · H_{n-1};
wherein H_n denotes the boundary-sentence encoded representation and H_{n-1} denotes encoded representation n-1.
Aggregate screening representations S_1 through S_{n-1}, the key information representation H_n, and the candidate response screening representation S_r by element-wise addition to obtain the dialogue semantic representation C:
C = S_1 + S_2 + ... + S_{n-1} + H_n + S_r;
wherein S_1, S_2, ..., S_{n-1} denote screening representation 1, screening representation 2, ..., screening representation n-1 respectively, H_n denotes the key information representation, and S_r denotes the candidate response screening representation.
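The demarcation matching and screening of claim 1 can be sketched in a few lines of pure Python. This is a minimal illustration with toy 2-dimensional vectors and hand-set Dense weights; the function names (`screen`, `aggregate`) and variable names (`H_n`, `H_1`, `S_1`, `C`) are illustrative, not taken from the patent.

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def dense(vec, weights, bias):
    # single-unit fully connected layer: one scalar output
    return sum(w * v for w, v in zip(weights, vec)) + bias

def screen(boundary, encoding, weights, bias):
    # couple (concatenate) the boundary-sentence encoding with a sentence
    # encoding, map the result to a scalar matching degree via Dense +
    # Sigmoid, then scale the sentence encoding by that degree
    coupled = boundary + encoding           # list concatenation
    degree = sigmoid(dense(coupled, weights, bias))
    return [degree * e for e in encoding]   # screening representation

def aggregate(representations):
    # element-wise addition of screening representations, the key
    # information (boundary) representation, and the screened response
    return [sum(vals) for vals in zip(*representations)]

# toy encodings; all-zero weights are chosen only to make the result exact
H_n = [1.0, 0.0]                        # boundary-sentence encoding
H_1 = [2.0, 2.0]                        # encoding of one remaining sentence
S_1 = screen(H_n, H_1, [0.0] * 4, 0.0)  # degree = sigmoid(0) = 0.5
C = aggregate([S_1, H_n])               # dialogue semantic representation
```

With zero weights the matching degree is exactly 0.5, so `S_1` is `[1.0, 1.0]` and `C` is `[2.0, 1.0]`; in a trained model the Dense weights would of course be learned.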
- 2. The method of claim 1, wherein the multi-round dialogue model is constructed as follows:
Constructing input data: for each piece of data in the data set, each sentence of the history dialogue is arranged in dialogue order to form a history dialogue sequence, denoted h_1, h_2, ..., h_n; one response is selected from the candidate responses as the current response, formalized as r; the label of the piece of data is determined by whether the response is the correct one, i.e. a correct response is labeled 1, otherwise 0; the history dialogue sequence, the current response, and the label jointly form the input data in the form (h_1, h_2, ..., h_n, r, label).
Encoding: the input data are encoded with the pre-trained language model BERT to obtain the encoded representation of each history sentence and the candidate response encoded representation, denoted H_1, H_2, ..., H_n and H_r, wherein H_n is the boundary-sentence encoded representation. The formulas are as follows:
H_i = BERT(h_i), i = 1, 2, ..., n;
H_r = BERT(r);
wherein h_1, h_2, ..., h_{n-1}, h_n denote sentence 1, sentence 2, ..., sentence n-1, sentence n of the history dialogue, and r denotes the candidate response.
Demarcation matching and screening: taking the boundary sentence as the demarcation point, semantic matching is computed between the boundary sentence and the encoded representation of each remaining dialogue sentence, and the information screening process is completed according to the matching degree between each remaining sentence and the boundary sentence.
Label prediction: the dialogue semantic representation is processed by a single fully connected layer of dimension 1 with Sigmoid activation to obtain the probability that the current response is the correct response.
When the multi-round dialogue model has not been trained, it is trained to optimize its parameters; once trained, the model predicts which of the candidate responses is the correct one.
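The label prediction step of claim 2 (a dimension-1 fully connected layer with Sigmoid activation, followed by picking the candidate with the highest matching degree) can be sketched as follows. The weights, toy dialogue representations, and candidate names are illustrative assumptions, not values from the patent.

```python
import math

def predict_label(dialogue_repr, weights, bias):
    # dimension-1 Dense layer with Sigmoid activation: maps the dialogue
    # semantic representation to the probability that the current
    # candidate response is the correct one
    logit = sum(w * v for w, v in zip(weights, dialogue_repr)) + bias
    return 1.0 / (1.0 + math.exp(-logit))

# toy dialogue semantic representations for two candidate responses
weights, bias = [1.0, -1.0], 0.0
scores = {
    "candidate_a": predict_label([2.0, 0.5], weights, bias),
    "candidate_b": predict_label([0.5, 2.0], weights, bias),
}
# the candidate with the highest matching degree is taken as correct
best = max(scores, key=scores.get)
```

Here `candidate_a` gets logit 1.5 and `candidate_b` gets logit -1.5, so the first is selected as the correct response.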
- 3. The method of claim 1, wherein the multi-round dialogue model is trained as follows:
Constructing the loss function: cross entropy is adopted as the loss function, with the formula
L = -[y_true · log(y_pred) + (1 - y_true) · log(1 - y_pred)];
wherein y_true is the true label and y_pred is the correctness probability output by the multi-round dialogue model.
Constructing the optimization function: after testing several optimizers, AdamW is finally selected as the optimization function; the learning rate is set to 2e-5, and the other hyperparameters of AdamW keep the PyTorch default values.
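The cross-entropy loss of claim 3 can be sketched in pure Python; the helper name `cross_entropy` and the clamping constant `eps` are illustrative additions for numerical safety.

```python
import math

def cross_entropy(y_true, y_pred, eps=1e-12):
    # binary cross entropy between the gold label (0 or 1) and the
    # model's predicted probability that the response is correct;
    # eps clamps the probability away from 0 and 1 to avoid log(0)
    p = min(max(y_pred, eps), 1.0 - eps)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))

loss_confident = cross_entropy(1, 0.99)  # near-correct prediction, small loss
loss_uncertain = cross_entropy(1, 0.5)   # maximally uncertain, loss = log 2
```

In a full PyTorch implementation this corresponds to `torch.nn.BCELoss` paired with `torch.optim.AdamW` at the learning rate 2e-5 stated in the claim.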
- 4. A search-type multi-round dialogue device, the device comprising:
a data set acquisition unit for crawling dialogue data from an online public medical question-answering platform to obtain a multi-round dialogue knowledge base, constructing positive-example and negative-example data for each piece of dialogue data, and dividing the data into a training set and a test set according to a set proportion;
a multi-round dialogue model construction unit for marking each sentence of the history dialogue in dialogue order to construct a history dialogue sequence and constructing candidate responses for each history dialogue sequence, and for encoding each sentence in the history dialogue sequence and each candidate response to obtain the encoded representation of each history sentence and the candidate response encoded representation; and
a multi-round dialogue model training unit for training the multi-round dialogue model on the training set, constructing a loss function and an optimization function, and completing the prediction of candidate responses;
wherein the demarcation matching and screening module is implemented as follows (denoting the boundary-sentence encoded representation by H_n, the encoded representations of the remaining sentences by H_1, ..., H_{n-1}, and the candidate response encoded representation by H_r):
(1) Couple the boundary-sentence encoded representation H_n with encoded representation H_1, and map the coupling result with a single fully connected layer Dense whose activation function is Sigmoid to obtain the matching degree m_1 between H_1 and H_n; multiply m_1 with H_1 to obtain screening representation S_1. The formulas are as follows:
m_1 = Sigmoid(Dense([H_n; H_1]));
S_1 = m_1 · H_1;
wherein H_n denotes the boundary-sentence encoded representation and H_1 denotes encoded representation 1.
(2) Couple H_n with encoded representation H_2, and map the coupling result with a single fully connected layer Dense whose activation function is Sigmoid to obtain the matching degree m_2 between H_2 and H_n; multiply m_2 with H_2 to obtain screening representation S_2. The formulas are as follows:
m_2 = Sigmoid(Dense([H_n; H_2]));
S_2 = m_2 · H_2;
wherein H_n denotes the boundary-sentence encoded representation and H_2 denotes encoded representation 2.
(3) Link H_n with encoded representation H_3; the subsequent operations are the same as those used to obtain screening representation S_2, yielding screening representation S_3; and so on, until H_n is coupled with encoded representation H_{n-1}, yielding screening representation S_{n-1}. The formulas are as follows:
m_{n-1} = Sigmoid(Dense([H_n; H_{n-1}]));
S_{n-1} = m_{n-1} · H_{n-1};
wherein H_n denotes the boundary-sentence encoded representation and H_{n-1} denotes encoded representation n-1.
(4) Aggregate screening representations S_1 through S_{n-1}, the key information representation H_n, and the candidate response screening representation S_r by element-wise addition to obtain the dialogue semantic representation C:
C = S_1 + S_2 + ... + S_{n-1} + H_n + S_r;
wherein S_1, S_2, ..., S_{n-1} denote screening representation 1, screening representation 2, ..., screening representation n-1 respectively, H_n denotes the key information representation, and S_r denotes the candidate response screening representation.
- 5. The search-type multi-round dialogue device of claim 4, wherein the multi-round dialogue model construction unit comprises: an input module for preprocessing the original data set and constructing input data; an encoding module for encoding the input data with the pre-trained language model BERT to obtain the encoded representation of each history sentence and the candidate response encoded representation; a demarcation matching and screening module for taking the boundary sentence as the demarcation point, computing semantic matching between the boundary sentence and the encoded representation of each remaining dialogue sentence, and completing the information screening process according to the matching degree between each remaining sentence and the boundary sentence; and a label prediction module for judging, based on the dialogue semantic representation, whether the current response is the correct response.
- 6. The search-type multi-round dialogue device of claim 4, wherein the multi-round dialogue model training unit comprises: a loss function construction module for computing the error between the prediction result and the true data with the cross-entropy loss function; and an optimization function construction module for adjusting parameters during model training to reduce the prediction error.
- 7. An electronic device comprising a memory and at least one processor, wherein the memory stores a computer program; the at least one processor executes the computer program stored in the memory, causing the at least one processor to perform the search-type multi-round dialogue method of any one of claims 1 to 3.
- 8. A computer-readable storage medium storing a computer program executable by a processor to implement the search-type multi-round dialogue method of any one of claims 1 to 3.
Description
Technical Field

The invention relates to the fields of natural language processing and artificial intelligence, and in particular to a search-type multi-round dialogue method and device, a storage medium, and electronic equipment.

Background

Statistics show that the number of patients with various mental disorders exceeds 100 million, of whom more than 6.4 million suffer from schizophrenia and 1.1 million from bipolar disorder. However, diagnosis and treatment of mental diseases are severely insufficient, with a treatment rate below 10%. Mental health professionals point out that the greatest hazard is often not the mental illness itself but people's attitude toward it. Simply put, patients with mental disorders often refuse to accept or face their illness, and are even more afraid of the people around them learning that they are ill. In such circumstances, patients dare not open their hearts to anyone, causing their condition to worsen. With the rapid development of internet technology, dialogue question-answering technology is also maturing. If this technology could be used for patient condition consultation, patients' resistance toward psychiatrists could be effectively reduced, and the embarrassment of discussing psychological problems face to face could be avoided. Dialogue question-answering comprises single-round dialogue and multi-round dialogue. Because general patients lack specialized medical knowledge and cannot describe their problem in a single utterance, multi-round dialogue technology is needed to understand the patient's description and then predict the patient's symptoms.
In a multi-round dialogue task, the last sentence of the history dialogue is often of special value to the whole dialogue. On the one hand, it distills all the preceding history sentences and is their result, i.e. it carries on from what came before; on the other hand, it plays the greatest role in selecting the correct response in terms of logical consistency and semantic continuity, i.e. it leads into what follows. Since the last sentence of the history dialogue sequence thus links the preceding and following content as a demarcation point, it plays an outsized role in information screening for both the history dialogue and the candidate responses. Existing methods, however, do not attend to this, which leaves model performance unsatisfactory. Therefore, how to effectively use the last sentence of the history dialogue to complete semantic matching and information screening on the remaining history dialogue sequence, while simultaneously screening the candidate responses, so as to improve the prediction accuracy of the multi-round dialogue model, is a technical problem to be solved urgently.

Disclosure of Invention

The technical task of the invention is to provide a search-type multi-round dialogue method and device, a storage medium, and electronic equipment that solve the problem of how to effectively use the last sentence of a history dialogue to complete semantic matching and information screening on the remaining history dialogue sequence while simultaneously screening candidate responses, thereby improving the prediction accuracy of the multi-round dialogue model.
The technical task of the invention is realized as follows. A search-type multi-round dialogue method comprises the following steps:
Obtaining a multi-round dialogue data set: crawling dialogue data from an online public medical question-answering platform to obtain a multi-round dialogue knowledge base, constructing positive-example and negative-example data for each piece of dialogue data, and dividing the data into a training set and a test set according to a set proportion.
Constructing a multi-round dialogue model: for each piece of data in the multi-round dialogue data set, marking each sentence of the history dialogue in dialogue order to construct a history dialogue sequence, and constructing candidate responses for each history dialogue sequence; encoding each sentence in the history dialogue sequence and each candidate response to obtain encoded representations of the history sentences and of the candidate response; taking the boundary sentence as the demarcation point, performing semantic matching and information screening on the encoded representation of each remaining dialogue sentence, and simultaneously carry