CN-122019616-A - Text processing method, electronic device and computer readable storage medium

Abstract

The application discloses a text processing method, an electronic device and a computer readable storage medium, and relates to the fields of large model technology and data processing. The method comprises: obtaining first text data; retrieving, from a database, target knowledge data and target program code matched with the first text data, wherein the database stores a plurality of knowledge data and program codes corresponding to different knowledge data; executing the target program code based on target parameters contained in the first text data to obtain a numerical calculation result corresponding to the target parameters; and generating second text data corresponding to the first text data based on the first text data, the numerical calculation result and the target knowledge data. The application solves the technical problem of low numerical calculation accuracy in the related art.
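
The flow described in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation disclosed in the application: the overtime rule, the keyword-based `retrieve`, the regex-based `extract_parameters`, and the template in `generate_reply` that stands in for a text generation model are all hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class KnowledgeEntry:
    knowledge: str                  # knowledge data, stated in prose
    keywords: List[str]             # hypothetical retrieval keys
    program: Callable[..., float]   # program code paired with this knowledge


def overtime_pay(hourly_wage: float, hours: float) -> float:
    """Hypothetical rule: overtime is paid at 1.5 times the hourly wage."""
    return 1.5 * hourly_wage * hours


# Toy database: each knowledge item is stored together with its program code.
DATABASE = [
    KnowledgeEntry(
        knowledge="Overtime is paid at 1.5 times the hourly wage.",
        keywords=["overtime"],
        program=overtime_pay,
    )
]


def retrieve(first_text: str) -> KnowledgeEntry:
    """Return the first entry whose keyword occurs in the text (assumed matcher)."""
    lowered = first_text.lower()
    for entry in DATABASE:
        if any(keyword in lowered for keyword in entry.keywords):
            return entry
    raise LookupError("no matching knowledge data")


def extract_parameters(first_text: str) -> Dict[str, float]:
    """Toy extraction: the first number is the wage, the second is the hours."""
    numbers = [float(n) for n in re.findall(r"\d+(?:\.\d+)?", first_text)]
    if len(numbers) < 2:
        raise ValueError("expected a wage and a number of hours")
    return {"hourly_wage": numbers[0], "hours": numbers[1]}


def generate_reply(first_text: str, result_text: str, knowledge: str) -> str:
    """Stand-in for a text generation model: a fixed template."""
    return f"{knowledge} Computed amount: {result_text}."


def answer(first_text: str) -> str:
    entry = retrieve(first_text)               # target knowledge data + program code
    params = extract_parameters(first_text)    # target parameters
    result = entry.program(**params)           # the number comes from executed code
    return generate_reply(first_text, f"{result:.2f}", entry.knowledge)
```

The point of the design is that the number in the reply is produced by running stored code rather than by the language model, so rephrasing the question does not change the arithmetic.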

Inventors

  • LIU CHENGYUAN
  • Song Kaisong
  • LIN JUN
  • SUN CHANGLONG
  • ZHANG JI
  • WANG SHIHANG
  • Qing Lizhi
  • KUANG KUN

Assignees

  • Alibaba (China) Co., Ltd. (阿里巴巴(中国)有限公司)

Dates

Publication Date
2026-05-12
Application Date
2024-11-08

Claims (14)

  1. A text processing method, comprising: acquiring first text data, wherein the first text data comprises target parameters for numerical calculation; retrieving, from a database, target knowledge data and target program code matched with the first text data, wherein the database stores a plurality of knowledge data and program codes corresponding to different knowledge data, and the program codes are used for performing numerical calculation based on the corresponding knowledge data; executing the target program code based on the target parameters contained in the first text data to obtain a numerical calculation result corresponding to the target parameters; and generating second text data corresponding to the first text data based on the first text data, the numerical calculation result and the target knowledge data.
  2. The method of claim 1, wherein retrieving, from the database, the target knowledge data and target program code matched with the first text data comprises: matching the first text data with the knowledge data in the database to obtain the target knowledge data matched with the first text data; and acquiring the program code corresponding to the target knowledge data from the database to obtain the target program code.
  3. The method of claim 1, wherein executing the target program code based on the target parameters contained in the first text data to obtain the numerical calculation result corresponding to the target parameters comprises: converting the target parameters based on target format requirements corresponding to the target program code to obtain converted parameters; and executing the target program code based on the converted parameters to obtain the numerical calculation result.
  4. The method of claim 1, wherein generating the second text data corresponding to the first text data based on the first text data, the numerical calculation result, and the target knowledge data corresponding to the target program code comprises: converting the numerical calculation result into text form to obtain result text data; and inputting the first text data, the result text data and the target knowledge data into a text generation model, and generating the second text data using the text generation model.
  5. The method of any one of claims 1 to 4, wherein the target program code is obtained by performing code generation on the target knowledge data using a code generation model, wherein the target program code contains at least knowledge reference data describing the target knowledge data, target format requirements, and logic calculation code for performing numerical calculation, the logic calculation code matching the logic and calculation conditions in the target knowledge data.
  6. The method of claim 5, wherein the code generation model is obtained by performing multiple rounds of iterative training on an initial generation model using a training knowledge data set; a target loss function used in any one round of iterative training is determined based on a probability that a current generation model generates a first type of program code, a probability that the current generation model generates a second type of program code, a probability that the initial generation model generates the first type of program code, and a probability that the initial generation model generates the second type of program code; the current generation model is the initial generation model or a generation model obtained in a previous round of iterative training; the accuracy of numerical calculation performed by the first type of program code is greater than the accuracy of numerical calculation performed by the second type of program code; and the first type of program code and the second type of program code are generated based on the same knowledge data in the training knowledge data set.
  7. A method of training a code generation model, comprising: acquiring a training knowledge data set; and performing multiple rounds of iterative training on an initial generation model using the training knowledge data set to obtain the code generation model, wherein the code generation model is used for generating program codes corresponding to different knowledge data, and the program codes are used for performing numerical calculation based on the corresponding knowledge data; wherein a target loss function used in any one round of iterative training is determined based on a probability that a current generation model generates a first type of program code, a probability that the current generation model generates a second type of program code, a probability that the initial generation model generates the first type of program code, and a probability that the initial generation model generates the second type of program code; the current generation model is the initial generation model or a generation model obtained in a previous round of iterative training; the accuracy of numerical calculation performed by the first type of program code is greater than the accuracy of numerical calculation performed by the second type of program code; and the first type of program code and the second type of program code are generated based on the same knowledge data in the training knowledge data set.
  8. The method of claim 7, wherein, in any one round of iterative training, the method comprises: performing code generation on the same knowledge data multiple times using the initial generation model to obtain a plurality of first program codes, and performing code generation on the same knowledge data multiple times using the current generation model to obtain a plurality of second program codes; performing numerical calculation using the plurality of first program codes to determine the accuracy of the plurality of first program codes, and performing numerical calculation using the plurality of second program codes to determine the accuracy of the plurality of second program codes; determining the types of the plurality of first program codes based on the accuracy of the plurality of first program codes, and determining the types of the plurality of second program codes based on the accuracy of the plurality of second program codes; and generating the target loss function based on the probability that the current generation model generates the first type of program code, the probability that the current generation model generates the second type of program code, the probability that the initial generation model generates the first type of program code, and the probability that the initial generation model generates the second type of program code.
  9. A text processing method, comprising: in response to an input instruction acting on an operation interface, determining first text data corresponding to the input instruction, wherein the first text data comprises target parameters for numerical calculation; retrieving, from a database, target program code matched with the first text data, wherein the database stores a plurality of knowledge data and program codes corresponding to different knowledge data, and the program codes are used for performing numerical calculation based on the corresponding knowledge data; executing the target program code based on the target parameters contained in the first text data to obtain a numerical calculation result corresponding to the target parameters; generating second text data corresponding to the first text data based on the first text data, the numerical calculation result and target knowledge data corresponding to the target program code; and outputting the second text data.
  10. A text processing method, comprising: acquiring query text data, wherein the query text data comprises target parameters for numerical calculation; retrieving, from a database, target legal data and target program code matched with the query text data, wherein the database stores a plurality of legal data and program codes corresponding to different legal data, and the program codes are used for performing numerical calculation based on the corresponding legal data; executing the target program code based on the target parameters contained in the query text data to obtain a numerical calculation result corresponding to the target parameters; and generating reply text data based on the query text data, the numerical calculation result and the target legal data.
  11. A text processing method, comprising: acquiring first text data by calling a first interface, wherein the first interface comprises a first parameter, a parameter value of the first parameter comprises the first text data, and the first text data comprises target parameters for numerical calculation; retrieving, from a database, target knowledge data and target program code matched with the first text data, wherein the database stores a plurality of knowledge data and program codes corresponding to different knowledge data, and the program codes are used for performing numerical calculation based on the corresponding knowledge data; executing the target program code based on the target parameters contained in the first text data to obtain a numerical calculation result corresponding to the target parameters; generating second text data corresponding to the first text data based on the first text data, the numerical calculation result and the target knowledge data; and outputting the second text data by calling a second interface, wherein the second interface comprises a second parameter, and a parameter value of the second parameter comprises the second text data.
  12. An electronic device, comprising: a memory storing an executable program; and a processor for running the program, wherein the program, when run, performs the method of any one of claims 1 to 11.
  13. A computer readable storage medium, comprising a stored executable program, wherein the executable program, when run, controls a device in which the computer readable storage medium is located to perform the method of any one of claims 1 to 11.
  14. A computer program product, comprising a computer program which, when executed by a processor, implements the method of any one of claims 1 to 11.
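
Claims 6 to 8 name the four probabilities that the target loss function depends on, but the claim text does not give a formula. One well-known loss with exactly that dependence is the Direct Preference Optimization (DPO) objective. The Python sketch below uses it purely as an assumption, together with a hypothetical `accuracy` helper that scores candidate program codes against test cases. The names `accuracy` and `preference_loss`, the `beta` weight, and the example cases are illustrative and do not come from the application.

```python
import math
from typing import Callable, Dict, List, Tuple

Case = Tuple[Dict[str, float], float]  # (parameters, expected result)


def accuracy(program: Callable[..., float], cases: List[Case]) -> float:
    """Fraction of test cases for which a candidate program returns the expected value."""
    hits = 0
    for params, expected in cases:
        try:
            if math.isclose(program(**params), expected, rel_tol=1e-9):
                hits += 1
        except Exception:
            pass  # a crashing candidate counts as a miss
    return hits / len(cases)


def preference_loss(
    logp_cur_first: float,    # log-prob of first-type (more accurate) code, current model
    logp_cur_second: float,   # log-prob of second-type (less accurate) code, current model
    logp_init_first: float,   # log-prob of first-type code, initial model
    logp_init_second: float,  # log-prob of second-type code, initial model
    beta: float = 0.1,        # assumed scaling weight, not taken from the claims
) -> float:
    """DPO-style loss: -log(sigmoid(margin)), used here as an assumed stand-in."""
    margin = beta * (
        (logp_cur_first - logp_init_first) - (logp_cur_second - logp_init_second)
    )
    # Numerically stable form of log(1 + exp(-margin)).
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

Under this assumed form, the loss equals log 2 when the current model assigns the same probabilities as the initial model, and it falls as the current model shifts probability toward the more accurate code for the same knowledge data.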

Description

Text processing method, electronic device and computer readable storage medium

Technical Field

The present application relates to the fields of large model technology and data processing, and in particular to a text processing method, an electronic device, and a computer readable storage medium.

Background

In domain-specific numerical calculation scenarios, when a user enters a question, the processing system first retrieves the relevant domain knowledge from a knowledge base and then passes that knowledge, together with the question, to a large language model, which produces the answer. However, the language model often does not strictly follow the logic described in the knowledge document when answering. As a result, when questions about the same domain knowledge are phrased differently, the large language model generates different answers. That is, the generated content is strongly influenced by the user's input, which lowers the accuracy of numerical calculation in the domain.

In view of the above problems, no effective solution has been proposed at present.

Disclosure of Invention

The embodiments of the application provide a text processing method, an electronic device and a computer readable storage medium, which at least solve the technical problem of low numerical calculation accuracy in the related art.

According to one aspect of the embodiments of the application, a text processing method is provided, comprising: obtaining first text data; retrieving, from a database, target knowledge data and target program code matched with the first text data, wherein the database stores a plurality of knowledge data and program codes corresponding to different knowledge data, and the program codes are used for performing numerical calculation based on the corresponding knowledge data; executing the target program code based on the target parameters contained in the first text data to obtain a numerical calculation result corresponding to the target parameters; and generating second text data corresponding to the first text data based on the first text data, the numerical calculation result and the target knowledge data.

According to another aspect of the embodiments of the application, a method of training a code generation model is provided, comprising: obtaining a training knowledge data set; and performing multiple rounds of iterative training on an initial generation model using the training knowledge data set to obtain the code generation model, wherein the code generation model is used for generating program codes corresponding to different knowledge data, and the program codes are used for performing numerical calculation based on the corresponding knowledge data. The target loss function used in any one round of iterative training is determined based on the probability that the current generation model generates a first type of program code, the probability that the current generation model generates a second type of program code, the probability that the initial generation model generates the first type of program code, and the probability that the initial generation model generates the second type of program code. The accuracy of numerical calculation performed by the first type of program code is higher than that of the second type of program code, and the first type of program code and the second type of program code are generated based on the same knowledge data in the training knowledge data set.

According to another aspect of the embodiments of the application, a text processing method is provided, comprising: in response to an input instruction acting on an operation interface, determining first text data corresponding to the input instruction, wherein the first text data comprises target parameters for numerical calculation; retrieving, from a database, target program code matched with the first text data, wherein the database stores a plurality of knowledge data and program codes corresponding to different knowledge data, and the program codes are used for performing numerical calculation based on the corresponding knowledge data; executing the target program code based on the target parameters contained in the first text data to obtain a numerical calculation result corresponding to the target parameters; generating second text data corresponding to the first text data based on the first text data, the numerical calculation result and target knowledge data corresponding to the target program code; and outputting the second text data.

According to another aspect of the embodiments of the application, a text processing method is provided, comprising: obtaining query text data; retrieving, from a database, target legal data and target program code matched with the query text data, wherein the database stores a plurality of legal data and program code