JP-7856234-B2 - Data generation method, apparatus, device, and medium
Inventors
- ゾーァヤーン レイ
- スーチイ バオ
- ホワ ウー
- ハイフオン ワーン
Assignees
- ベイジン バイドゥ ネットコム サイエンス テクノロジー カンパニー リミテッド
Dates
- Publication Date
- 20260511
- Application Date
- 20240619
- Priority Date
- 20230630
Claims (17)
- A data generation method performed by a computer, the method comprising: generating first response data based on first query data from a user; in response to receiving negative feedback from the user regarding the first response data, determining a first reconsideration result for the first response data based on the first response data and the negative feedback, wherein the first reconsideration result indicates the reason why the user's feedback regarding the first response data is negative feedback; and generating second response data for the first query data based on the first query data and the first reconsideration result, wherein determining the first reconsideration result for the first response data based on the first response data and the negative feedback comprises: inputting the first response data and the negative feedback into a reconsideration generation network to obtain the first reconsideration result output by the reconsideration generation network, wherein the reconsideration generation network is obtained by training using a sample corpus, and the sample corpus includes sample response data, sample feedback, and sample reconsideration results for the sample response data.
- The method according to claim 1, wherein generating the first response data based on the first query data from the user comprises: determining, based on the first query data, first input data for use in a deep learning model that generates response data based on input data; and inputting the first input data into the deep learning model to obtain the first response data, and wherein generating the second response data for the first query data based on the first query data and the first reconsideration result comprises: determining, based on the first query data and the first reconsideration result, second input data for use in the deep learning model; and inputting the second input data into the deep learning model to obtain the second response data.
- The method according to claim 2, wherein determining the second input data for use in the deep learning model based on the first query data and the first reconsideration result comprises: determining the second input data based on the first query data, the first reconsideration result, and task description information indicating that the second input data includes the first reconsideration result.
- The method according to any one of claims 1 to 3, wherein determining, in response to receiving negative feedback from the user regarding the first response data, the first reconsideration result for the first response data based on the first response data and the negative feedback comprises: in response to receiving first feedback from the user regarding the first response data and determining that the first feedback is negative feedback, determining the first reconsideration result for the first response data based on the first response data and the first feedback.
- The method according to any one of claims 1 to 3, further comprising: in response to determining that the similarity between second query data from the user and the first query data is greater than a preset threshold, generating third response data for the second query data based on the first query data, the second response data, and the second query data.
- The method according to claim 5, further comprising storing the first query data and the second response data in a memory bank, wherein generating the third response data for the second query data based on the first query data, the second response data, and the second query data, in response to determining that the similarity between the second query data from the user and the first query data is greater than the preset threshold, comprises: in response to determining that the similarity between the second query data from the user and the first query data in the memory bank is greater than the preset threshold, retrieving the second response data from the memory bank; and generating the third response data based on the first query data, the second response data, and the second query data.
- The method according to any one of claims 1 to 3, wherein the first reconsideration result further includes an optimization policy for the first response data.
- A data generation apparatus, comprising: a first generation unit configured to generate first response data based on first query data from a user; a confirmation unit configured to determine, in response to receiving negative feedback from the user regarding the first response data, a first reconsideration result for the first response data based on the first response data and the negative feedback, wherein the first reconsideration result indicates the diagnosed reason why the user's feedback regarding the first response data is negative feedback; and a second generation unit configured to generate second response data for the first query data based on the first query data and the first reconsideration result, wherein the confirmation unit is configured to obtain the first reconsideration result output by a reconsideration generation network by inputting the first response data and the negative feedback into the reconsideration generation network, wherein the reconsideration generation network is obtained by training using a sample corpus, and the sample corpus includes sample response data, sample feedback, and sample reconsideration results for the sample response data.
- The apparatus according to claim 8, wherein the first generation unit comprises: a first determination subunit configured to determine, based on the first query data, first input data for use in a deep learning model that generates response data based on input data; and a first input subunit configured to obtain the first response data by inputting the first input data into the deep learning model, and wherein the second generation unit comprises: a second determination subunit configured to determine, based on the first query data and the first reconsideration result, second input data for use in the deep learning model; and a second input subunit configured to obtain the second response data by inputting the second input data into the deep learning model.
- The apparatus according to claim 9, wherein the second input subunit is configured to determine the second input data based on the first query data, the first reconsideration result, and task description information indicating that the second input data includes the first reconsideration result.
- The apparatus according to any one of claims 8 to 10, wherein the confirmation unit is configured to determine, in response to receiving first feedback from the user regarding the first response data and determining that the first feedback is negative feedback, the first reconsideration result for the first response data based on the first response data and the first feedback.
- The apparatus according to any one of claims 8 to 10, further comprising a third generation unit configured to generate, in response to determining that the similarity between second query data from the user and the first query data is greater than a preset threshold, third response data for the second query data based on the first query data, the second response data, and the second query data.
- The apparatus according to claim 12, further comprising a storage unit configured to store the first query data and the second response data in a memory bank, wherein the third generation unit comprises: an acquisition subunit configured to acquire the second response data from the memory bank in response to determining that the similarity between the second query data from the user and the first query data in the memory bank is greater than the preset threshold; and a generation subunit configured to generate the third response data based on the first query data, the second response data, and the second query data.
- The apparatus according to any one of claims 8 to 10, wherein the first reconsideration result further includes an optimization policy for the first response data.
- An electronic device, comprising: at least one processor; and a memory communicatively connected to the at least one processor, wherein the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, cause the at least one processor to perform the method according to claim 1.
- A non-transitory computer-readable storage medium storing computer instructions, wherein the computer instructions are used to cause a computer to execute the method according to claim 1.
- A computer program, which, when executed by a processor, implements the method described in claim 1.
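
The memory bank behavior recited in the claims on reusing an earlier answer for a similar query can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the `MemoryBank` class, the `build_third_input` helper, the prompt layout, the 0.8 threshold, and the use of Python's `difflib.SequenceMatcher` as the similarity measure. The claims require only that a similarity be compared against a preset threshold and do not say how the similarity is computed.

```python
from difflib import SequenceMatcher
from typing import List, Optional, Tuple


class MemoryBank:
    """Hypothetical store of (first query, second response) pairs."""

    def __init__(self, threshold: float = 0.8) -> None:
        self.threshold = threshold  # the "preset threshold"; 0.8 is an arbitrary choice
        self.entries: List[Tuple[str, str]] = []

    def store(self, query: str, response: str) -> None:
        """Save a query together with the improved (second) response."""
        self.entries.append((query, response))

    def lookup(self, new_query: str) -> Optional[Tuple[str, str]]:
        """Return the most similar stored pair whose similarity exceeds the threshold."""
        best: Optional[Tuple[str, str]] = None
        best_score = self.threshold
        for query, response in self.entries:
            # Character-level similarity in [0, 1]; a stand-in for whatever
            # similarity measure an actual implementation would use.
            score = SequenceMatcher(None, query.lower(), new_query.lower()).ratio()
            if score > best_score:
                best, best_score = (query, response), score
        return best


def build_third_input(bank: MemoryBank, second_query: str) -> str:
    """Build model input for a new query, adding a retrieved pair when one is similar enough."""
    hit = bank.lookup(second_query)
    if hit is None:
        return second_query
    first_query, second_response = hit
    return (
        f"Earlier query: {first_query}\n"
        f"Previous answer: {second_response}\n"
        f"New query: {second_query}"
    )
```

With this sketch, storing a pair after a negative-feedback round and later calling `build_third_input` with a near-identical query yields an input that carries the earlier query and its improved answer, while an unrelated query passes through unchanged.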
Description
This disclosure relates to the field of artificial intelligence technology, particularly to the fields of natural language processing and deep learning, and specifically to data generation methods, apparatus, electronic devices, computer-readable storage media, and computer program products.

Artificial intelligence is the study of enabling computers to simulate certain human thought processes and intelligent behaviors (e.g., learning, reasoning, thinking, and planning), and it encompasses both hardware and software technologies. Hardware technologies in artificial intelligence generally include sensors, dedicated AI chips, cloud computing, distributed storage, and big data processing, while artificial intelligence software technologies primarily encompass several major areas such as computer vision, speech recognition, natural language processing, machine learning/deep learning, big data processing, and knowledge graph technologies.

Large generative language models can be applied to various natural language processing tasks. In particular, they can interact with users by generating natural language response text based on user queries.

The methods described in this section are not necessarily methods that have been previously conceived or adopted. Unless otherwise specified, none of the methods described in this section should be considered prior art simply because they are included in this section. Similarly, unless otherwise specified, none of the problems mentioned in this section should be considered to have been acknowledged in any prior art.

This disclosure provides data generation methods, apparatus, electronic devices, computer-readable storage media, and computer program products.
According to one aspect of this disclosure, a data generation method is provided, the method comprising: generating first response data based on first query data from a user; in response to receiving negative feedback from the user regarding the first response data, determining a first reconsideration result for the first response data based on the first response data and the negative feedback, wherein the first reconsideration result indicates the reason why the user's feedback on the first response data is negative; and generating second response data for the first query data based on the first query data and the first reconsideration result.

According to another aspect of this disclosure, a data generation apparatus is provided, comprising: a first generation unit configured to generate first response data based on first query data from a user; a confirmation unit configured to determine, in response to receiving negative feedback from the user regarding the first response data, a first reconsideration result for the first response data based on the first response data and the negative feedback, wherein the first reconsideration result indicates the diagnosed reason why the user's feedback on the first response data is negative feedback; and a second generation unit configured to generate second response data for the first query data based on the first query data and the first reconsideration result.

According to another aspect of this disclosure, an electronic device is provided, comprising at least one processor and a memory communicatively connected to the at least one processor, wherein the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, cause the at least one processor to perform the data generation method described above.
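
The method aspect described above (generate a response, reconsider it when the user gives negative feedback, then regenerate) can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the implementation of this disclosure: the function names, the callable types, the wording of `TASK_DESCRIPTION`, and the prompt layout are all hypothetical. The deep learning model and the reconsideration generation network are passed in as plain callables so that the control flow is visible without naming any particular model.

```python
from typing import Callable, Optional

# Hypothetical stand-ins; the disclosure does not prescribe a concrete API.
Model = Callable[[str], str]             # deep learning model: input data -> response data
Reconsider = Callable[[str, str], str]   # reconsideration network: (response, feedback) -> reason

# Assumed wording for the task description information.
TASK_DESCRIPTION = (
    "The input below contains a reconsideration result explaining why the "
    "previous answer was rejected. Write a new answer that avoids that problem."
)


def build_second_input(query: str, reconsideration: str) -> str:
    """Combine task description, reconsideration result, and the original query."""
    return f"{TASK_DESCRIPTION}\nReconsideration: {reconsideration}\nQuery: {query}"


def answer_with_reconsideration(
    query: str,
    model: Model,
    reconsider: Reconsider,
    get_feedback: Callable[[str], Optional[str]],
    is_negative: Callable[[str], bool],
) -> str:
    """Return the first response, or a second response if the feedback was negative."""
    first_response = model(query)
    feedback = get_feedback(first_response)
    # Only negative feedback triggers reconsideration.
    if feedback is None or not is_negative(feedback):
        return first_response
    # Diagnose why the user rejected the first response.
    reconsideration = reconsider(first_response, feedback)
    # Regenerate for the same query, now conditioned on the diagnosis.
    return model(build_second_input(query, reconsideration))
```

In this sketch the same `model` callable serves both generation passes, and only the input data changes between the first and second pass.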
According to another aspect of this disclosure, a non-transitory computer-readable storage medium storing computer instructions is provided, wherein the computer instructions are used to cause a computer to execute the data generation method described above.

According to another aspect of this disclosure, a computer program product including a computer program is provided, wherein the computer program, when executed by a processor, implements the data generation method described above.

According to one or more embodiments of this disclosure, the quality of the generated response data can be improved.

It should be understood that the content described in this section is not intended to identify essential or important features of the embodiments of this disclosure, nor is it intended to limit the scope of this disclosure. Other features of this disclosure will become readily apparent from the following specification.

The drawings illustrate embodiments and constitute part of the specification; together with the textual description of the specification, they serve to explain exemplary implementations of the embodiments. The illustrated embodiments are for illustrative purposes only and do not limit the scope of the claims. In all drawings, the same reference numerals refer to similar but not necessarily identical elements.

This is a schematic diagram showing an exemplary system in which various methods described herein can be carried out according to exemplary embo