CN-122019691-A - Data processing method and legal problem processing method

Abstract

Embodiments of this specification provide a data processing method and a legal question processing method. The data processing method comprises: determining a question to be processed and inputting it into a language generation model; and, in the language generation model, generating an answer to the question by acquiring reference data corresponding to the question from a question-answer retrieval database. The reference data is any target question-answer pair stored in the question-answer retrieval database. The target question-answer pairs are obtained by performing text processing on a plurality of different types of initial text data, which are determined from a plurality of data sources and correspond to the same target field.

Inventors

  • Qing Lizhi
  • SUN CHANGLONG
  • XIE PENGJUN

Assignees

  • 阿里巴巴(中国)有限公司

Dates

Publication Date
20260512
Application Date
20241108

Claims (16)

  1. A data processing method, comprising: determining a question to be processed, and inputting the question to be processed into a language generation model; and, in the language generation model, generating an answer to the question to be processed by acquiring reference data corresponding to the question to be processed from a question-answer retrieval database, wherein the reference data is any target question-answer pair stored in the question-answer retrieval database, the target question-answer pair is obtained by performing text processing on a plurality of different types of initial text data, the plurality of different types of initial text data are determined from a plurality of data sources, and the plurality of different types of initial text data correspond to the same target field.
  2. The data processing method according to claim 1, further comprising, before inputting the question to be processed into the language generation model: determining the plurality of different types of initial text data from the plurality of data sources, wherein the plurality of different types of initial text data correspond to the same target field; and performing text processing on the plurality of different types of initial text data to obtain a plurality of different types of target question-answer pairs, and storing the plurality of different types of target question-answer pairs in the question-answer retrieval database.
  3. The data processing method according to claim 2, wherein performing text processing on the plurality of different types of initial text data to obtain the plurality of different types of target question-answer pairs comprises: performing text analysis on the plurality of different types of initial text data by using a text processing module to obtain a text summary of each piece of initial text data and the text data corresponding to each text summary; and performing text processing on the text summaries and the corresponding text data to obtain the plurality of different types of target question-answer pairs, wherein each target question-answer pair comprises a target question and a target answer corresponding to the target question.
  4. The data processing method according to claim 3, wherein performing text processing on the text summary and the text data corresponding to the text summary to obtain the plurality of different types of target question-answer pairs comprises: combining the text summary, the text data corresponding to the text summary, and text reasoning prompt information to obtain text reasoning data, wherein the text reasoning prompt information is used to prompt a language generation model to perform text reasoning on the text summary and the text data; and inputting the text reasoning data into the language generation model, and performing, by the language generation model, text reasoning on the text summary and the text data based on the text reasoning prompt information to obtain the plurality of different types of target question-answer pairs.
  5. The data processing method according to any one of claims 2 to 4, wherein storing the plurality of different types of target question-answer pairs in the question-answer retrieval database comprises: determining the type of each of the plurality of different types of target question-answer pairs; determining, based on the type of each target question-answer pair, a target question-answer retrieval database corresponding to that target question-answer pair from a plurality of different types of question-answer retrieval databases; and storing the plurality of different types of target question-answer pairs in the corresponding target question-answer retrieval databases.
  6. The data processing method according to claim 5, wherein storing the plurality of different types of target question-answer pairs in the corresponding target question-answer retrieval databases comprises: determining label information of each target question-answer pair based on text attribute information of the initial text data corresponding to that target question-answer pair; and storing the plurality of different types of target question-answer pairs, together with the label information of each target question-answer pair, in the corresponding target question-answer retrieval databases.
  7. The data processing method according to any one of claims 1 to 4, wherein, in the language generation model, generating the answer to the question to be processed by acquiring reference data corresponding to the question to be processed from the question-answer retrieval database comprises: in the language generation model, performing a retrieval determination on the question to be processed to obtain a retrieval determination result, wherein the retrieval determination result is either a determination to retrieve or a determination not to retrieve; when the retrieval determination result is the determination to retrieve, rewriting the question to be processed according to a target question format of the target question in the target question-answer pair to obtain a rewritten question to be processed; acquiring reference data corresponding to the rewritten question to be processed from the question-answer retrieval database; and generating, by the language generation model, an answer to the rewritten question to be processed according to the reference data, to obtain the answer to the question to be processed.
  8. The data processing method according to claim 7, wherein acquiring reference data corresponding to the rewritten question to be processed from the question-answer retrieval database comprises: classifying the rewritten question to be processed to obtain a question type corresponding to the rewritten question to be processed; and acquiring, based on the question type, reference data corresponding to the rewritten question to be processed from the question-answer retrieval database.
  9. The data processing method according to claim 8, wherein the question-answer retrieval database comprises a plurality of different types of question-answer retrieval databases, and acquiring, based on the question type, reference data corresponding to the rewritten question to be processed from the question-answer retrieval database comprises: determining, according to the question type, the question-answer retrieval database corresponding to the rewritten question to be processed from the plurality of different types of question-answer retrieval databases; and acquiring, according to the rewritten question to be processed, reference data corresponding to the rewritten question to be processed from that question-answer retrieval database.
  10. The data processing method according to claim 7, wherein there are a plurality of pieces of reference data, and generating, by the language generation model, an answer to the rewritten question to be processed according to the reference data comprises: determining a data screening index for each of the plurality of pieces of reference data, and sorting the plurality of pieces of reference data based on the data screening index to obtain a reference data sequence; selecting, from the reference data sequence, target reference data at a preset target rank position; and generating, by the language generation model, an answer to the rewritten question to be processed according to the target reference data, to obtain the answer to the question to be processed.
  11. A data processing method, comprising: determining a plurality of different types of initial text data from a plurality of data sources, wherein the plurality of different types of initial text data correspond to the same target field; and performing text processing on the plurality of different types of initial text data to obtain a plurality of different types of target question-answer pairs, and storing the plurality of different types of target question-answer pairs in a question-answer retrieval database, wherein the question-answer retrieval database is used by a language generation model, when processing a question to be processed, to acquire reference data corresponding to the question to be processed and to generate an answer to the question to be processed based on the reference data, and the reference data is any target question-answer pair stored in the question-answer retrieval database.
  12. A legal question processing method, comprising: determining a legal question to be processed, and inputting the legal question to be processed into a language generation model; and, in the language generation model, generating a legal answer to the legal question to be processed by acquiring reference legal data corresponding to the legal question to be processed from a question-answer retrieval database, wherein the reference legal data is any target question-answer pair stored in the question-answer retrieval database, the target question-answer pair is obtained by performing text processing on a plurality of different types of initial legal text data, the plurality of different types of initial legal text data are determined from a plurality of legal data sources, and the plurality of different types of initial legal text data correspond to the legal field.
  13. A legal data processing method, comprising: determining a plurality of different types of initial legal text data from a plurality of legal data sources, wherein the plurality of different types of initial legal text data correspond to the legal field; and performing text processing on the plurality of different types of initial legal text data to obtain a plurality of different types of target legal question-answer pairs, and storing the plurality of different types of target legal question-answer pairs in a question-answer retrieval database, wherein the question-answer retrieval database is used by a language generation model, when processing a legal question to be processed, to acquire reference legal data corresponding to the legal question to be processed and to generate a legal answer to the legal question to be processed based on the reference legal data, and the reference legal data is any target legal question-answer pair stored in the question-answer retrieval database.
  14. A computing device, comprising: a memory and a processor; wherein the memory is configured to store a computer program/instruction, the processor is configured to execute the computer program/instruction, and the computer program/instruction, when executed by the processor, implements the steps of the method according to any one of claims 1 to 13.
  15. A computer-readable storage medium storing a computer program/instruction which, when executed by a processor, implements the steps of the method according to any one of claims 1 to 13.
  16. A computer program product comprising a computer program/instruction which, when executed by a processor, implements the steps of the method according to any one of claims 1 to 13.
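
The database-building steps of claims 2 to 6 can be illustrated with a minimal Python sketch. It is not the patented implementation. The names `InitialText`, `QAPair`, `summarize`, `stub_generate`, `build_qa_pair`, `ingest`, and the prompt wording are all hypothetical; the summarizer and the language generation model are replaced by trivial stand-ins so the example runs without external services.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class InitialText:
    source: str    # hypothetical data-source name
    doc_type: str  # hypothetical type, e.g. "statute" or "case"
    title: str
    body: str


@dataclass
class QAPair:
    question: str
    answer: str
    doc_type: str
    labels: dict = field(default_factory=dict)


# Assumed wording for the "text reasoning prompt information" of claim 4.
REASONING_PROMPT = "Write one question and its answer from the summary and text below."


def summarize(text: InitialText) -> str:
    """Stand-in for the text processing module: first sentence as the summary."""
    return text.body.split(".")[0].strip() + "."


def stub_generate(prompt: str) -> tuple:
    """Stand-in for the language generation model; returns (question, answer)."""
    summary = prompt.split("Summary: ", 1)[1].split("\n", 1)[0]
    body = prompt.split("Text: ", 1)[1]
    return f"What does the source say about: {summary}", body


def build_qa_pair(text: InitialText, generate=stub_generate) -> QAPair:
    # Claim 4: combine summary, text, and prompt into "text reasoning data".
    summary = summarize(text)
    prompt = f"{REASONING_PROMPT}\nSummary: {summary}\nText: {text.body}"
    question, answer = generate(prompt)
    # Claim 6: labels derived from attributes of the originating text.
    labels = {"source": text.source, "title": text.title}
    return QAPair(question, answer, text.doc_type, labels)


def ingest(texts, generate=stub_generate) -> dict:
    """Claim 5: store each pair in the retrieval database matching its type."""
    databases = defaultdict(list)
    for text in texts:
        pair = build_qa_pair(text, generate)
        databases[pair.doc_type].append(pair)
    return dict(databases)
```

Calling `ingest` on a list of `InitialText` items returns one list of question-answer pairs per type. A real system would presumably replace each list with a searchable index and `stub_generate` with a call to an actual model.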

Description

Data processing method and legal problem processing method

Technical Field

The embodiments of this specification relate to the technical field of artificial intelligence, and in particular to a data processing method. One or more embodiments of this specification also relate to a legal question processing method and a legal data processing method.

Background

With the continuous development of artificial intelligence technology, neural network models can be applied to a variety of tasks. For example, a neural network model can be applied in a question processing scenario to generate answers to questions. Because the knowledge learned by a neural network model is limited, current models cannot generate accurate answers for all questions. How to improve the accuracy of the answers output by a neural network model has therefore become an urgent problem.

Disclosure of Invention

In view of this, the embodiments of this specification provide a data processing method. One or more embodiments of this specification also relate to a legal question processing method, a legal data processing apparatus, a legal question processing apparatus, a computing device, a computer-readable storage medium, and a computer program product, to address the technical shortcomings of the prior art.
According to a first aspect of the embodiments of this specification, there is provided a data processing method, comprising: determining a question to be processed, and inputting the question to be processed into a language generation model; and, in the language generation model, generating an answer to the question to be processed by acquiring reference data corresponding to the question to be processed from a question-answer retrieval database, wherein the reference data is any target question-answer pair stored in the question-answer retrieval database, the target question-answer pair is obtained by performing text processing on a plurality of different types of initial text data, the plurality of different types of initial text data are determined from a plurality of data sources, and the plurality of different types of initial text data correspond to the same target field.
According to a second aspect of the embodiments of this specification, there is provided a data processing apparatus, comprising: a question determination module configured to determine a question to be processed and input the question to be processed into a language generation model; and a question processing module configured to generate, in the language generation model, an answer to the question to be processed by acquiring reference data corresponding to the question to be processed from a question-answer retrieval database, wherein the reference data is any target question-answer pair stored in the question-answer retrieval database, the target question-answer pair is obtained by performing text processing on a plurality of different types of initial text data, the plurality of different types of initial text data are determined from a plurality of data sources, and the plurality of different types of initial text data correspond to the same target field.
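The answer-generation flow of the first aspect, together with the refinements recited in claims 7 to 10 (retrieval determination, question rewriting, type-based database selection, and ranking of reference data), can be sketched in Python as follows. This is a hedged illustration only. The database contents are placeholders, and `needs_retrieval`, `rewrite`, `classify`, `score`, `retrieve`, `answer`, and `stub_generate` are hypothetical names. Keyword heuristics and Jaccard token overlap stand in for steps the source attributes to the language generation model and to an unspecified "data screening index".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QAPair:
    question: str
    answer: str


# Hypothetical per-type retrieval databases with placeholder content.
DATABASES = {
    "statute": [
        QAPair("what does the statute say about contract formation", "placeholder statute answer"),
        QAPair("what does the statute say about lease termination", "placeholder lease answer"),
    ],
    "case": [
        QAPair("how did the court rule in the sample case", "placeholder case answer"),
    ],
}

KEYWORDS = ("statute", "case", "contract", "court")


def needs_retrieval(question: str) -> bool:
    """Stand-in for the retrieval determination: crude keyword check."""
    return any(word in question.lower() for word in KEYWORDS)


def rewrite(question: str) -> str:
    """Stand-in for rewriting into the stored target question format."""
    return question.lower().strip(" ?!.")


def classify(question: str) -> str:
    """Stand-in for question classification (claim 8)."""
    tokens = set(question.split())
    return "case" if tokens & {"case", "court"} else "statute"


def score(query: str, pair: QAPair) -> float:
    """Assumed data screening index: Jaccard overlap of whitespace tokens."""
    q, p = set(query.split()), set(pair.question.split())
    return len(q & p) / len(q | p) if q | p else 0.0


def retrieve(query: str, qtype: str, position: int = 0) -> Optional[QAPair]:
    """Claims 9 and 10: pick the database by type, rank, take a preset position."""
    ranked = sorted(DATABASES.get(qtype, []), key=lambda p: score(query, p), reverse=True)
    return ranked[position] if position < len(ranked) else None


def stub_generate(question: str, reference: Optional[QAPair]) -> str:
    """Stand-in for the language generation model's answer step."""
    if reference is None:
        return f"[no reference] {question}"
    return f"{reference.answer} (based on: {reference.question})"


def answer(question: str, generate=stub_generate) -> str:
    if not needs_retrieval(question):
        return generate(question, None)
    rewritten = rewrite(question)
    reference = retrieve(rewritten, classify(rewritten))
    return generate(rewritten, reference)
```

For example, `answer("What does the statute say about lease termination?")` routes to the hypothetical "statute" database and selects the top-ranked pair as reference data, while a question with no matching keyword skips retrieval entirely.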
According to a third aspect of the embodiments of this specification, there is provided a data processing method, comprising: determining a plurality of different types of initial text data from a plurality of data sources, wherein the plurality of different types of initial text data correspond to the same target field; and performing text processing on the plurality of different types of initial text data to obtain a plurality of different types of target question-answer pairs, and storing the plurality of different types of target question-answer pairs in a question-answer retrieval database, wherein the question-answer retrieval database is used by a language generation model, when processing a question to be processed, to acquire reference data corresponding to the question to be processed and to generate an answer to the question to be processed based on the reference data, and the reference data is any target question-answer pair stored in the question-answer retrieval database.
According to a fourth aspect of the embodiments of this specification, there is provided a data processing apparatus, comprising: a data determination module configured to determine a plurality of different types of initial text data from a plurality of data sources, wherein the plurality of different types of initial text data correspond to the same target field; and a question-answer pair storage module configured to perform text processing on the plurality of different types of initial text data to obtain target question-answer pairs of the differe