CN-121996746-A - Question-answering model training method, question-answering processing method and task platform

Abstract

Embodiments of this specification provide a training method for a question-answering model, a question-answering processing method, and a task platform. The training method comprises: obtaining first training data, which comprises first data to be analyzed, first answer information corresponding to a first cognitive question directed at the first data to be analyzed, and a first position of the first answer information in the first data to be analyzed; generating a first perception question based on the first cognitive question and the first position; inputting the first data to be analyzed and the first perception question into a pre-trained question-answering model, predicting first reference answer information corresponding to the first cognitive question, and predicting second reference answer information for the first perception question based on the first reference answer information; and adjusting the pre-trained question-answering model based on the first answer information, the first position, the first reference answer information, and the second reference answer information to obtain a trained question-answering model. A question-answering model obtained in this way executes question-answering tasks well.

Inventors

  • Shao Zirui
  • Luo Chuwei
  • ZHU ZHAOQING
  • ZHENG QI
  • ZHANG JI

Assignees

  • Alibaba (China) Co., Ltd. (阿里巴巴(中国)有限公司)

Dates

Publication Date
2026-05-08
Application Date
2024-11-08

Claims (20)

  1. A training method for a question-answering model, comprising: acquiring first training data, wherein the first training data comprises first data to be analyzed, first answer information corresponding to a first cognitive question directed at the first data to be analyzed, and a first position of the first answer information in the first data to be analyzed; generating a first perception question based on the first cognitive question and the first position, wherein the first perception question indicates determining the position, in the first data to be analyzed, of the answer information of the first cognitive question; inputting the first data to be analyzed and the first perception question into a pre-trained question-answering model, predicting first reference answer information corresponding to the first cognitive question, and predicting second reference answer information for the first perception question based on the first reference answer information; and adjusting the pre-trained question-answering model based on the first answer information, the first position, the first reference answer information, and the second reference answer information to obtain a trained question-answering model.
  2. The method of claim 1, wherein the first training data further comprises at least one first auxiliary position in the first data to be analyzed that is different from the first position, and wherein generating the first perception question based on the first cognitive question and the first position comprises: generating the first perception question based on the first cognitive question, the first position, and the at least one first auxiliary position, wherein the first perception question indicates determining the position of the answer information of the first cognitive question from among the first position and the at least one first auxiliary position.
  3. The method of claim 2, wherein generating the first perception question based on the first cognitive question, the first position, and the at least one first auxiliary position comprises: filling the first cognitive question, information of the first position, and information of the at least one first auxiliary position into corresponding positions in a query information template to obtain the first perception question, wherein the query information template comprises stem information indicating the cognitive question, fill-position information for the first cognitive question, fill-position information for the first position, and fill-position information for the first auxiliary position.
  4. The method of claim 1, wherein inputting the first data to be analyzed and the first perception question into the pre-trained question-answering model, predicting the first reference answer information corresponding to the first cognitive question, and predicting the second reference answer information for the first perception question based on the first reference answer information comprises: inputting the first data to be analyzed and the first perception question into the pre-trained question-answering model to obtain the first reference answer information corresponding to the first cognitive question and the second reference answer information for the first perception question, arranged according to a response information template; wherein the second reference answer information is predicted based on the first reference answer information, and the response information template comprises relation information between the first cognitive question and the first reference answer information, fill-position information for the second reference answer information, and relation information between the first reference answer information and the second reference answer information.
  5. The method of any one of claims 1 to 4, further comprising: acquiring second training data, wherein the second training data comprises second data to be analyzed, a second cognitive question directed at the second data to be analyzed, and second answer information corresponding to the second cognitive question; generating a third cognitive question based on the second cognitive question and the second answer information, wherein the third cognitive question indicates determining the answer information of the second cognitive question based on the second answer information; inputting the second data to be analyzed and the third cognitive question into the pre-trained question-answering model, and predicting third reference answer information for the second cognitive question; and adjusting the pre-trained question-answering model based on the relation between the third reference answer information and the second answer information to obtain a trained question-answering model.
  6. The method of claim 5, wherein the second training data further comprises at least one piece of auxiliary answer information in the second data to be analyzed that is different from the second answer information, and wherein generating the third cognitive question based on the second cognitive question and the second answer information comprises: generating the third cognitive question based on the second cognitive question, the second answer information, and the at least one piece of auxiliary answer information, wherein the third cognitive question indicates determining the answer information of the second cognitive question from among the second answer information and the at least one piece of auxiliary answer information.
  7. The method of any one of claims 1 to 4, further comprising: acquiring third training data, wherein the third training data comprises third data to be analyzed, third answer information corresponding to a second perception question directed at the third data to be analyzed, and a second position of the third answer information in the third data to be analyzed; generating a third perception question based on the third answer information, wherein the third perception question indicates determining the position of the third answer information in the third data to be analyzed; inputting the third data to be analyzed and the third perception question into the pre-trained question-answering model, and predicting fourth reference answer information for the third perception question; and adjusting the pre-trained question-answering model based on the relation between the fourth reference answer information and the second position to obtain a trained question-answering model.
  8. The method of claim 7, wherein the third training data further comprises at least one second auxiliary position in the third data to be analyzed that is different from the second position, and wherein generating the third perception question based on the third answer information comprises: generating the third perception question based on the third answer information, the second position, and the at least one second auxiliary position, wherein the third perception question indicates that the answer information of the third perception question is to be determined from among the second position and the at least one second auxiliary position.
  9. A question-answering processing method, comprising: obtaining task data of a question-answering task, wherein the task data comprises data to be analyzed and question information posed for the data to be analyzed; and calling a question-answering model to analyze the task data to obtain answer information corresponding to the question information, wherein the question-answering model is trained by the training method according to any one of claims 1 to 8.
  10. The method of claim 9, wherein obtaining the task data of the question-answering task comprises: receiving a question-answering request uploaded by a front-end device, wherein the question-answering request comprises the task data of the question-answering task; and wherein, after calling the question-answering model to analyze the task data to obtain the answer information corresponding to the question information, the method further comprises: sending the answer information corresponding to the question information to the front-end device.
  11. The method of claim 10, further comprising, after sending the answer information corresponding to the question information to the front-end device: upon receiving problem feedback information sent by the front-end device regarding the answer information, adjusting the question-answering model based on the problem feedback information.
  12. The method of claim 10 or 11, further comprising, after sending the answer information corresponding to the question information to the front-end device: sending recommended task information related to the data to be analyzed or the question information to the front-end device; and upon receiving a trigger instruction from the front-end device for the recommended task information, calling the question-answering model to generate an execution result for the recommended task information, and sending the execution result to the front-end device.
  13. A question-answering processing method, applied to a task platform, comprising: receiving a model request sent by a terminal device, wherein the model request comprises at least one of: a scene identifier of a target scene, scene input data of the target scene, and a target model specification parameter; and determining a corresponding question-answering model from at least one model based on the model request, wherein the question-answering model is trained by the training method according to any one of claims 1 to 8, and the question-answering model is used for outputting answer information corresponding to question information.
  14. The method of claim 13, wherein a model library stores at least one question-answering model adapted to different scenes and a plurality of question-answering models with different model specification parameters, and wherein determining the corresponding question-answering model from at least one model based on the model request comprises: searching the model library for a question-answering model corresponding to the at least one item of information in the model request, wherein the scene identifier of the target scene and the scene input data of the target scene both correspond to a model adapted to the target scene, and the target model specification parameter corresponds to a model whose model specification parameter is the same as the target model specification parameter; and, where the at least one item of information comprises the scene input data of the target scene, training the found question-answering model based on the scene input data of the target scene to obtain a trained question-answering model.
  15. The method of claim 13 or 14, further comprising, after determining the corresponding question-answering model from at least one model based on the model request: deploying the question-answering model, and constructing a question-answering interface based on the question-answering model, so that the terminal device invokes the question-answering model through the question-answering interface to generate answer information corresponding to the question information.
  16. A task platform, comprising a request interface and a response unit; wherein the request interface is configured to receive a model request sent by a terminal device, the model request comprising at least one of: a scene identifier of a target scene, scene input data of the target scene, and a model specification parameter; and the response unit is configured to determine a corresponding question-answering model from at least one model based on the model request, wherein the question-answering model is trained by the training method according to any one of claims 1 to 8.
  17. The task platform of claim 16, further comprising a question-answering interface, wherein the question-answering interface is constructed based on the question-answering model, and is configured to be invoked by the terminal device and to provide answer information for question information.
  18. A computing device, comprising a memory and a processor; wherein the memory is configured to store a computer program/instructions, and the processor is configured to execute the computer program/instructions, which, when executed by the processor, implement the steps of the method of any one of claims 1 to 15.
  19. A computer-readable storage medium storing a computer program/instructions which, when executed by a processor, implement the steps of the method of any one of claims 1 to 15.
  20. A computer program product, comprising a computer program/instructions which, when executed by a processor, implement the steps of the method of any one of claims 1 to 15.
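
As an illustration of the template filling recited in claims 2 and 3, the following minimal Python sketch builds a perception question from a cognitive question, the true position, and auxiliary positions. The template wording, the bounding-box position format, the option labels, and the function name `build_perception_question` are all assumptions made for illustration; the claims do not specify them.

```python
import random
from typing import List, Sequence, Tuple

# Hypothetical position format: (x0, y0, x1, y1) bounding box in the document.
Position = Tuple[int, int, int, int]

# Hypothetical query information template: a stem plus fill slots for the
# cognitive question and the candidate positions.
QUERY_TEMPLATE = (
    'Where in the document is the answer to the question "{question}" located? '
    "Choose one option: {options}"
)


def build_perception_question(
    cognitive_question: str,
    first_position: Position,
    auxiliary_positions: Sequence[Position],
    seed: int = 0,
) -> Tuple[str, str]:
    """Return (perception_question, label_of_the_correct_option)."""
    candidates: List[Position] = [first_position, *auxiliary_positions]
    # Shuffle so the true position is not always listed first.
    random.Random(seed).shuffle(candidates)
    labels = [chr(ord("A") + i) for i in range(len(candidates))]
    options = "; ".join(f"({lab}) {list(pos)}" for lab, pos in zip(labels, candidates))
    question = QUERY_TEMPLATE.format(question=cognitive_question, options=options)
    correct = labels[candidates.index(first_position)]
    return question, correct
```

The returned label can serve as the ground truth for the position prediction; presenting the auxiliary positions as distractor options is one possible reading of claim 2, not the only one.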

Description

Question-answering model training method, question-answering processing method and task platform
Technical Field
Embodiments of this specification relate to the technical fields of artificial intelligence and terminals, and in particular to a training method for a question-answering model, a question-answering processing method, and a task platform.
Background
With the rapid development of artificial intelligence technology, various models are widely used to help people analyze and process data, making work and daily life more convenient. For example, an artificial intelligence model may perform a question-answering task that answers a question posed by a user. The user may input a document to be analyzed into the model together with a question about the document. The model may then analyze the document based on the question and output an answer to the question. Currently, however, an artificial intelligence model may give different answer information for a cognitive question and a perception question that are directed at the same document and should correspond to the same answer information. The execution of question-answering tasks therefore still needs improvement.
Disclosure of Invention
In view of this, embodiments of this specification provide a training method for a question-answering model that can improve the execution of question-answering tasks. One or more embodiments of this specification also relate to a question-answering processing method, a task platform, a computing device, a computer-readable storage medium, and a computer program product.
According to a first aspect of embodiments of this specification, there is provided a training method for a question-answering model, comprising: acquiring first training data, wherein the first training data comprises first data to be analyzed, first answer information corresponding to a first cognitive question directed at the first data to be analyzed, and a first position of the first answer information in the first data to be analyzed; generating a first perception question based on the first cognitive question and the first position, wherein the first perception question indicates determining, based on the first position, the position of the answer information of the first cognitive question in the first data to be analyzed; inputting the first data to be analyzed and the first perception question into a pre-trained question-answering model, predicting first reference answer information corresponding to the first cognitive question, and predicting second reference answer information for the first perception question based on the first reference answer information; and adjusting the pre-trained question-answering model based on the first answer information, the first position, the first reference answer information, and the second reference answer information to obtain a trained question-answering model.
According to a second aspect of embodiments of this specification, there is provided a question-answering processing method, comprising: obtaining task data of a question-answering task, wherein the task data comprises data to be analyzed and question information posed for the data to be analyzed; and calling a question-answering model to analyze the task data to obtain answer information corresponding to the question information, wherein the question-answering model is trained by the training method described above.
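For the training method of the first aspect, one way to supervise both predictions at once is to arrange the ground-truth answer and position in a single target text, so that the position is stated as following from the answer. The minimal Python sketch below shows this under stated assumptions: the response template wording, the plain-text document, the bounding-box position format, and the names `FirstTrainingData` and `build_supervised_pair` are hypothetical, and the source does not state which loss is used to adjust the model.

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical position format: (x0, y0, x1, y1) bounding box in the document.
Position = Tuple[int, int, int, int]

# Hypothetical response information template: the answer to the cognitive
# question comes first, and the position is stated as following from it.
RESPONSE_TEMPLATE = (
    'The answer to "{question}" is "{answer}". '
    "Therefore, it is located at {position}."
)


@dataclass
class FirstTrainingData:
    document: str            # first data to be analyzed (plain text for simplicity)
    cognitive_question: str  # first cognitive question
    answer: str              # first answer information
    position: Position       # first position of the answer in the document


def build_supervised_pair(
    sample: FirstTrainingData, perception_question: str
) -> Tuple[str, str]:
    """Return (model_input, target_output) for one fine-tuning example."""
    model_input = f"{sample.document}\n\n{perception_question}"
    target_output = RESPONSE_TEMPLATE.format(
        question=sample.cognitive_question,
        answer=sample.answer,
        position=list(sample.position),
    )
    return model_input, target_output
```

In such a setup, the model's generated text would contain the first and second reference answer information, and comparing it with `target_output` (for example with a token-level loss) would use all four quantities named in the first aspect.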
According to a third aspect of embodiments of this specification, there is provided a question-answering processing method, applied to a task platform, comprising: receiving a model request sent by a terminal device, wherein the model request comprises at least one of: a scene identifier of a target scene, scene input data of the target scene, and a target model specification parameter; and determining a corresponding question-answering model from at least one model based on the model request, wherein the question-answering model is trained by the training method described above, and the question-answering model is used for outputting answer information corresponding to question information.
According to a fourth aspect of embodiments of this specification, there is provided a task platform comprising a request interface and a response unit; the request interface is configured to receive a model request sent by a terminal device, wherein the model request comprises at least one of: a scene identifier of a target scene, scene input data of the target scene, and a model specification parameter; and the response unit is configured to determine a corresponding question-answering model from at least one model based on the model request, wherein the question-answering model is trained by the training method described above.
According to a fifth aspect of embodiments o