CN-121979907-A - Text processing method, apparatus, device, readable storage medium and product

CN121979907A

Abstract

The present application relates to a text processing method, apparatus, computer device, computer-readable storage medium and computer program product. The method comprises: acquiring a text to be recognized input by a user; inputting the text to be recognized into a pre-trained language model, which outputs a query statement corresponding to the text under the constraint of metadata of a preset database, the language model determining the query statement based on a reward value generated by a deep reinforcement learning constrainer; and accessing the preset database based on the query statement to acquire a corresponding response result. The method improves both the quality and the efficiency of query statement generation.

Inventors

  • Chen Runsen

Assignees

  • Tianyi Cloud Technology Co., Ltd. (天翼云科技有限公司)

Dates

Publication Date
2026-05-05
Application Date
2025-12-22

Claims (10)

  1. A text processing method, the method comprising: acquiring a text to be recognized input by a user, wherein the text to be recognized is a natural language sentence; inputting the text to be recognized into a pre-trained language model, and outputting a query statement corresponding to the text to be recognized under the constraint of metadata of a preset database, wherein the language model determines the query statement based on a reward value generated by a deep reinforcement learning constrainer; and accessing the preset database based on the query statement and acquiring a corresponding response result.
  2. The method according to claim 1, wherein inputting the text to be recognized into the pre-trained language model and outputting the query statement corresponding to the text to be recognized under the constraint of metadata of the preset database comprises: acquiring the metadata of the preset database; inputting the text to be recognized into the language model, and generating a plurality of statement fragments step by step under the constraint of the metadata; calculating a reward value of the language model with the deep reinforcement learning constrainer during generation of the statement fragments; and ending generation of the statement fragments when the reward value reaches a preset reward value threshold, and obtaining the query statement based on the generated statement fragments.
  3. The method according to claim 2, wherein the reward value comprises at least one of a syntax correctness reward value, a schema matching reward value, and a logical rationality reward value.
  4. The method according to claim 2, further comprising, after calculating the reward value of the language model with the deep reinforcement learning constrainer during generation of the statement fragments: triggering retraining of the language model when the reward value does not reach the preset reward value threshold, so as to tune the parameters of the language model.
  5. The method according to claim 2, wherein training the language model comprises: inputting sample metadata and a plurality of sample texts into an initial language model to generate, step by step, a plurality of sample statement fragments corresponding to each sample text; ending generation of the sample statement fragments corresponding to each sample text when the reward value reaches an initial reward value threshold, and obtaining a plurality of sample query statements based on the generated sample statement fragments; and when the sample query statements meet a preset requirement, finishing training of the language model to obtain the trained language model, and taking the initial reward value threshold as the preset reward value threshold.
  6. The method according to claim 5, further comprising, after calculating the reward value of the language model with the deep reinforcement learning constrainer during generation of the sample statement fragments: when the sample query statements do not meet the preset requirement, continuing to train the language model with the sample texts and the sample metadata, and raising the initial reward value threshold by a preset amplitude; and taking the raised initial reward value threshold as the preset reward value threshold once the sample query statements output by the language model meet the preset requirement.
  7. A text processing apparatus, the apparatus comprising: an acquisition module for acquiring a text to be recognized input by a user, wherein the text to be recognized is a natural language sentence; a generation module for inputting the text to be recognized into a pre-trained language model and outputting a query statement corresponding to the text to be recognized under the constraint of metadata of a preset database, wherein the language model determines the query statement based on a reward value generated by a deep reinforcement learning constrainer; and a response module for accessing the preset database based on the query statement and acquiring a corresponding response result.
  8. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any one of claims 1 to 6 when executing the computer program.
  9. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the steps of the method of any one of claims 1 to 6.
  10. A computer program product comprising a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the method of any one of claims 1 to 6.
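The three reward components named in claim 3 can be sketched as follows. The patent does not define how each component is computed, so the regex checks, the equal weighting, and the toy SCHEMA dictionary below are illustrative assumptions, not the claimed constrainer:

```python
import re

# Hypothetical database metadata: table name -> column names.
SCHEMA = {"orders": {"id", "amount", "created_at"}}

def syntax_reward(sql):
    """1.0 if the statement parses as a minimal SELECT ... FROM ..., else 0.0."""
    return 1.0 if re.match(r"(?is)^\s*select\s+.+\s+from\s+\w+", sql) else 0.0

def schema_reward(sql):
    """Fraction of referenced table names that exist in the metadata."""
    tables = re.findall(r"(?i)\bfrom\s+(\w+)", sql)
    return sum(t in SCHEMA for t in tables) / len(tables) if tables else 0.0

def logic_reward(sql):
    """Crude logical-rationality check: penalise a dangling WHERE clause."""
    return 0.0 if re.search(r"(?i)\bwhere\s*$", sql.strip()) else 1.0

def total_reward(sql):
    # Equal weighting is an assumption; a learned constrainer could weight
    # the components differently.
    return (syntax_reward(sql) + schema_reward(sql) + logic_reward(sql)) / 3.0

print(total_reward("SELECT amount FROM orders"))  # -> 1.0
```

A statement referencing an unknown table (e.g. `SELECT amount FROM nosuch`) keeps full syntax and logic rewards but loses the schema component, so the composite reward drops below the threshold.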

Description

Text processing method, apparatus, device, readable storage medium and product

Technical Field

The present application relates to the field of artificial intelligence, and in particular to a text processing method, apparatus, device, readable storage medium and product.

Background

At present, approaches that use a large language model (Large Language Model, LLM) to recognize natural language sentences and output corresponding SQL query statements easily generate invalid SQL, which must be corrected manually and repeatedly, so natural language recognition is inefficient.

Disclosure of Invention

In view of the foregoing, it is desirable to provide a text processing method, apparatus, device, readable storage medium, and product that can improve the validity of generated query statements.

In a first aspect, the present application provides a text processing method, comprising: acquiring a text to be recognized input by a user, wherein the text to be recognized is a natural language sentence; inputting the text to be recognized into a pre-trained language model, and outputting a query statement corresponding to the text to be recognized under the constraint of metadata of a preset database, wherein the language model determines the query statement based on a reward value generated by a deep reinforcement learning constrainer; and accessing the preset database based on the query statement and acquiring a corresponding response result.
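The first-aspect method can be sketched end to end as follows. The `model_generate` stub, its hard-coded text-to-SQL mapping, and the example `users` table are illustrative assumptions standing in for the pre-trained language model and the preset database; a real system would decode the statement under the constrainer's reward signal:

```python
import sqlite3

def model_generate(text, metadata):
    """Stand-in for the pre-trained language model: maps a recognized text to a
    query statement, constrained to tables present in the database metadata."""
    if "how many users" in text.lower() and "users" in metadata:
        return "SELECT COUNT(*) FROM users"
    raise ValueError("unrecognised text")

def answer(text, conn):
    # Collect database metadata (table -> columns) to constrain generation.
    tables = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table'").fetchall()]
    metadata = {t: [c[1] for c in conn.execute(f"PRAGMA table_info({t})")]
                for t in tables}
    # Generate the query statement, then execute it for the response result.
    query = model_generate(text, metadata)
    return conn.execute(query).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "ann"), (2, "bob")])
print(answer("How many users are there?", conn))  # -> [(2,)]
```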
In one embodiment, inputting the text to be recognized into the pre-trained language model and outputting the query statement corresponding to the text to be recognized under the constraint of metadata of the preset database comprises: acquiring the metadata of the preset database; inputting the text to be recognized into the language model, and generating a plurality of statement fragments step by step under the constraint of the metadata; calculating a reward value of the language model with the deep reinforcement learning constrainer during generation of the statement fragments; and ending generation of the statement fragments when the reward value reaches a preset reward value threshold, and obtaining the query statement based on the generated statement fragments.

In one embodiment, the reward value comprises at least one of a syntax correctness reward value, a schema matching reward value, and a logical rationality reward value.

In one embodiment, after the deep reinforcement learning constrainer is used to calculate the reward value of the language model during generation of the statement fragments, the method further comprises: triggering retraining of the language model when the reward value does not reach the preset reward value threshold, so as to tune the parameters of the language model.
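The fragment-by-fragment generation loop with threshold stopping can be sketched as follows. The toy reward function, its weights, the 0.9 threshold, and the pre-supplied fragment list are illustrative assumptions; in the described method the constrainer scores fragments the model decodes itself:

```python
REWARD_THRESHOLD = 0.9  # stand-in for the preset reward value threshold

def constrainer_reward(fragments):
    """Toy constrainer: rewards a partial statement for containing SELECT and
    FROM, and for ending on a known table name."""
    text = " ".join(fragments).upper()
    score = 0.0
    score += 0.4 if "SELECT" in text else 0.0
    score += 0.4 if "FROM" in text else 0.0
    score += 0.2 if text.endswith("ORDERS") else 0.0
    return score

def generate(candidate_fragments):
    """Append fragments step by step; stop once the reward reaches the
    threshold. Returning None models the retraining trigger."""
    fragments = []
    for frag in candidate_fragments:  # stand-in for step-wise decoding
        fragments.append(frag)
        if constrainer_reward(fragments) >= REWARD_THRESHOLD:
            return " ".join(fragments)  # threshold reached: emit the query
    return None  # threshold never reached: trigger retraining / parameter tuning

print(generate(["SELECT", "amount", "FROM", "orders"]))  # -> SELECT amount FROM orders
```

With an incomplete fragment stream (e.g. `["SELECT", "amount"]`) the reward never reaches the threshold and `generate` returns `None`, corresponding to the retraining branch of the embodiment.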
In one embodiment, training the language model comprises: inputting sample metadata and a plurality of sample texts into an initial language model to generate, step by step, a plurality of sample statement fragments corresponding to each sample text; ending generation of the sample statement fragments corresponding to each sample text when the reward value reaches an initial reward value threshold, and obtaining a plurality of sample query statements based on the generated sample statement fragments; and when the sample query statements meet a preset requirement, finishing training of the language model to obtain the trained language model, and taking the initial reward value threshold as the preset reward value threshold.

In one embodiment, after the deep reinforcement learning constrainer is used to calculate the reward value of the language model during generation of the sample statement fragments, the method further comprises: when the sample query statements do not meet the preset requirement, continuing to train the language model with the sample texts and the sample metadata, and raising the initial reward value threshold by a preset amplitude; and taking the raised initial reward value threshold as the preset reward value threshold once the sample query statements output by the language model meet the preset requirement.
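The threshold-raising training schedule can be sketched as follows. The stub functions, the 0.5 initial threshold, the 0.05 amplitude, and the "ends with a semicolon" acceptance test are all illustrative assumptions; the patent leaves the requirement and the amplitude unspecified:

```python
def meets_requirement(query):
    """Hypothetical preset requirement, e.g. the query executes without error;
    here approximated by statement termination."""
    return query.endswith(";")

def generate_sample(text, threshold):
    """Stand-in for one training round: a higher threshold forces the model to
    keep decoding until the statement is complete (here: terminated)."""
    return text + ";" if threshold >= 0.6 else text

def train(samples, initial_threshold=0.5, amplitude=0.05, max_rounds=100):
    threshold = initial_threshold
    for _ in range(max_rounds):
        queries = [generate_sample(s, threshold) for s in samples]
        if all(meets_requirement(q) for q in queries):
            return threshold  # training done; this becomes the preset threshold
        threshold += amplitude  # raise by the preset amplitude, keep training
    return threshold

print(round(train(["SELECT 1"]), 2))  # -> 0.6
```

The returned value is the raised initial threshold, which the embodiment then uses as the preset reward value threshold at inference time.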
In a second aspect, the present application also provides a text processing apparatus, the apparatus comprising: an acquisition module for acquiring a text to be recognized input by a user, wherein the text to be recognized is a natural language sentence; a generation module for inputting the text to be recognized into a pre-trained language model and outputting a query statement corresponding to the text to be recognized under the constraint of metadata of a preset database, wherein the language model determines the query statement based on a reward value generated by a deep reinforcement learning constrainer; and a response module for accessing a preset