CN-121980041-A - Label processing method, device, equipment, medium and product


Abstract

Embodiments of the invention provide a label processing method, device, equipment, medium and product, and relate to the technical field of intelligent multimedia processing. The method comprises: obtaining a description text of target content as a description text to be utilized; processing the description text to be utilized and a first prompt word by using a first large language model to obtain a label to be utilized of the target content, wherein the first prompt word instructs the first large language model to generate a label of the target content represented by the description text to be utilized; processing the label to be utilized, the description text to be utilized and a second prompt word by using a second large language model to obtain a verification result indicating whether the label to be utilized has errors, wherein the second prompt word instructs the second large language model to judge whether the label to be utilized can be used as a label of the content represented by the description text to be utilized; and, if the verification result indicates no errors, taking the label to be utilized as a final label of the target content. The method and the device can improve the accuracy of label generation.

Inventors

WEN XU
CHENG QIJIAN
LI DAWEI

Assignees

北京奇艺世纪科技有限公司

Dates

Publication Date: 20260505
Application Date: 20260123

Claims (15)

1. A label processing method, the method comprising: acquiring a description text of target content as a description text to be utilized; processing the description text to be utilized and a first prompt word by using a first large language model to obtain a label to be utilized of the target content, wherein the first prompt word is used for instructing the first large language model to generate a label of the target content represented by the description text to be utilized; processing the label to be utilized, the description text to be utilized and a second prompt word by using a second large language model to obtain a verification result indicating whether the label to be utilized has errors, wherein the second prompt word is used for instructing the second large language model to judge whether the label to be utilized can be used as a label of the content represented by the description text to be utilized, so as to obtain the verification result; and if the verification result indicates no errors, taking the label to be utilized as a final label of the target content.
2. The method of claim 1, wherein there are a plurality of first large language models; and the processing the description text to be utilized and the first prompt word by using the first large language model to obtain the label to be utilized of the target content comprises: processing the description text to be utilized and the first prompt word by using each first large language model to obtain a predicted label of the target content output by that first large language model; and obtaining the label to be utilized of the target content based on the predicted labels.
3. The method of claim 1 or 2, wherein the first prompt word comprises a plurality of preset labels; and the first prompt word is used for instructing the first large language model to determine, from among the plurality of preset labels, the label of the target content represented by the description text to be utilized.
4. The method of claim 1 or 2, wherein there are a plurality of second large language models; and the processing the label to be utilized, the description text to be utilized and the second prompt word by using the second large language model to obtain a verification result indicating whether the label to be utilized has errors comprises: processing the label to be utilized, the description text to be utilized and the second prompt word by using each second large language model to obtain a verification result, output by that second large language model, indicating whether the label to be utilized has errors; calculating the ratio of the verification results indicating no errors among all the verification results; and if the ratio is greater than a preset threshold, obtaining a verification result indicating that the label to be utilized is correct.
5. The method of claim 1 or 2, wherein the first prompt word comprises a positive example text and a negative example text, wherein the positive example text comprises a positive sample description text, a positive sample label and a reason why the positive sample label can be used as a label of the content represented by the positive sample description text; and the first prompt word is used for instructing the first large language model to generate, based on the positive example text and the negative example text, the label of the target content represented by the description text to be utilized, and to generate a reason for the label.
6. The method of claim 1, wherein the second prompt word comprises a plurality of preset labels; and the processing the label to be utilized, the description text to be utilized and the second prompt word by using the second large language model to obtain a verification result indicating whether the label to be utilized has errors comprises: performing the following steps by using the second large language model: detecting whether the label to be utilized exists among the plurality of preset labels; if the label to be utilized does not exist among them, determining that the label to be utilized is a label with a first type of error, and obtaining a verification result, wherein the first type of error indicates that the label to be utilized does not belong to the plurality of preset labels; and if the label to be utilized exists among them, judging whether the label to be utilized can be used as a label of the content represented by the description text to be utilized, and obtaining a verification result.
7. The method of claim 6, wherein the second prompt word further comprises a relationship description text describing parent-child relationships among the plurality of preset labels, a child label corresponding to a parent label representing a subcategory under the category represented by the parent label; and the judging, if the label to be utilized exists, whether the label to be utilized can be used as a label of the content represented by the description text to be utilized, and obtaining a verification result comprises: in a case where a plurality of labels to be utilized exist among the plurality of preset labels, for each such label to be utilized, judging, based on the parent-child relationships, whether a child label corresponding to that label to be utilized exists among all the labels to be utilized that exist in the plurality of preset labels; if a child label corresponding to the label to be utilized exists, judging whether the label to be utilized can be used as a label of the content represented by the description text to be utilized; and if the label to be utilized cannot be used as a label of the content represented by the description text to be utilized, determining the label to be utilized and its corresponding child label as labels with the first type of error, and obtaining a verification result.
8. The method of claim 6 or 7, wherein the steps performed by using the second large language model further comprise: in a case where there are a plurality of labels to be utilized, determining, for each label to be utilized that has no first type of error, the text proportion, in the description text to be utilized, of the content represented by that label to be utilized; and determining a label to be utilized whose text proportion is smaller than a preset proportion as a label with a second type of error, and obtaining a verification result, wherein the second type of error indicates that the label to be utilized does not match the target content.
9. The method of claim 8, wherein the method further comprises: calculating the proportion, among all the labels to be utilized, of the labels to be utilized having the first type of error and those having the second type of error, to obtain the accuracy of all the labels to be utilized; and if the accuracy does not reach a preset accuracy, outputting alarm information to prompt that another operation for assigning labels to the target content needs to be performed.
10. The method of claim 9, wherein the calculating the proportion, among all the labels to be utilized, of the labels to be utilized having the first type of error and those having the second type of error, to obtain the accuracy of all the labels to be utilized comprises: calculating a weighted sum of the number of labels to be utilized having the first type of error and the number of labels to be utilized having the second type of error, wherein the weight of the number of labels to be utilized having the first type of error is greater than the weight of the number of labels to be utilized having the second type of error; and calculating the ratio of the obtained weighted sum to the total number of all the labels to be utilized to obtain the accuracy of all the labels to be utilized.
11. The method of claim 1, wherein the second prompt word comprises a positive example text and a negative example text, wherein the positive example text comprises a positive sample description text, a positive sample label and a reason why the positive sample label can be used as a label of the content represented by the positive sample description text; and the second prompt word is used for instructing the second large language model to judge, based on the positive example text and the negative example text, whether the label to be utilized can be used as a label of the content represented by the description text to be utilized, so as to obtain a verification result, and to generate a reason for the verification result.
12. A label processing apparatus, the apparatus comprising: a first acquisition module, configured to acquire a description text of target content as a description text to be utilized; a first utilization module, configured to process the description text to be utilized and a first prompt word by using a first large language model to obtain a label to be utilized of the target content, wherein the first prompt word is used for instructing the first large language model to generate a label of the target content represented by the description text to be utilized; a second utilization module, configured to process the label to be utilized, the description text to be utilized and a second prompt word by using a second large language model to obtain a verification result indicating whether the label to be utilized has errors; and a determining module, configured to take the label to be utilized as a final label of the target content if the verification result indicates no errors.
13. Electronic equipment, comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with one another through the communication bus; the memory is configured to store a computer program; and the processor is configured to implement the method of any of claims 1-11 when executing the program stored in the memory.
14. A computer readable storage medium, characterized in that the computer readable storage medium has stored therein a computer program which, when executed by a processor, implements the method of any of claims 1-11.
15. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any of claims 1-11.
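
As a rough illustration of the accuracy check described in claims 9 and 10, the weighted computation can be sketched in Python as follows. The function names, the default weights (2.0 and 1.0) and the 0.8 threshold are hypothetical and do not come from the claims; the claims only require that first-type errors weigh more than second-type errors. Claim 10 calls the ratio of the weighted sum to the total number of labels the accuracy; this sketch assumes that the value compared against the preset accuracy is one minus that ratio.

```python
def weighted_error_ratio(n_first: int, n_second: int, total: int,
                         w_first: float = 2.0, w_second: float = 1.0) -> float:
    """Ratio of the weighted error count to the total number of candidate labels.

    n_first  : number of labels with a first-type error (not in the preset labels)
    n_second : number of labels with a second-type error (does not match the content)
    The weights are illustrative values, not taken from the patent.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if w_first <= w_second:
        # Claim 10 requires the first-type weight to exceed the second-type weight.
        raise ValueError("w_first must be greater than w_second")
    return (w_first * n_first + w_second * n_second) / total


def needs_alarm(n_first: int, n_second: int, total: int,
                min_accuracy: float = 0.8) -> bool:
    """Return True when alarm information should be output."""
    # Assumption: accuracy = 1 - weighted error ratio (not stated in the claims).
    accuracy = 1.0 - weighted_error_ratio(n_first, n_second, total)
    return accuracy < min_accuracy
```

With 10 candidate labels, one first-type error and two second-type errors, the weighted ratio is (2 × 1 + 1 × 2) / 10 = 0.4, so under the stated assumption the accuracy is 0.6 and `needs_alarm(1, 2, 10)` returns `True`.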

Description

Label processing method, device, equipment, medium and product

Technical Field

The invention relates to the technical field of intelligent multimedia processing, and in particular to a label processing method, device, equipment, medium and product.

Background

A label is a text that describes content such as news, blogs and videos. With labels, enterprises that provide content to users can conveniently classify and manage the content and help users find the content they need accurately and quickly.

In the prior art, a worker may manually classify each piece of content (e.g., news or video) according to the description text of the content to obtain a label of the content. However, the accuracy of manually generated labels is not high.

Disclosure of Invention

Embodiments of the invention aim to provide a label processing method, device, equipment, medium and product, so as to improve the accuracy of label generation. The specific technical solutions are as follows:

In a first aspect, an embodiment of the invention provides a label processing method, the method comprising: acquiring a description text of target content as a description text to be utilized; processing the description text to be utilized and a first prompt word by using a first large language model to obtain a label to be utilized of the target content, wherein the first prompt word is used for instructing the first large language model to generate a label of the target content represented by the description text to be utilized; processing the label to be utilized, the description text to be utilized and a second prompt word by using a second large language model to obtain a verification result indicating whether the label to be utilized has errors, wherein the second prompt word is used for instructing the second large language model to judge whether the label to be utilized can be used as a label of the content represented by the description text to be utilized, so as to obtain
the verification result; and if the verification result indicates no errors, taking the label to be utilized as a final label of the target content.

Optionally, there are a plurality of first large language models; and the processing the description text to be utilized and the first prompt word by using the first large language model to obtain the label to be utilized of the target content comprises: processing the description text to be utilized and the first prompt word by using each first large language model to obtain a predicted label of the target content output by that first large language model; and obtaining the label to be utilized of the target content based on the predicted labels.

Optionally, the first prompt word comprises a plurality of preset labels; and the first prompt word is used for instructing the first large language model to determine, from among the plurality of preset labels, the label of the target content represented by the description text to be utilized.

Optionally, there are a plurality of second large language models; and the processing the label to be utilized, the description text to be utilized and the second prompt word by using the second large language model to obtain a verification result indicating whether the label to be utilized has errors comprises: processing the label to be utilized, the description text to be utilized and the second prompt word by using each second large language model to obtain a verification result, output by that second large language model, indicating whether the label to be utilized has errors; calculating the ratio of the verification results indicating no errors among all the verification results; and if the ratio is greater than a preset threshold, obtaining a verification result indicating that the label to be utilized is correct.
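
The generate-then-verify flow described above, including the optional use of several generating and several verifying models, can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation disclosed here: a model is represented as any callable that maps a prompt string to a reply string, generators are assumed to reply with comma-separated labels, verifiers are assumed to reply with text starting with "yes" or "no", and the way predicted labels are merged (a simple vote count with a hypothetical `min_votes` parameter) is not specified by the text. All function and parameter names are hypothetical.

```python
from collections import Counter
from typing import Callable, List, Sequence

# Hypothetical stand-in for a large language model: prompt text in, reply text out.
Model = Callable[[str], str]


def generate_labels(description: str, generators: Sequence[Model],
                    first_prompt: str, min_votes: int = 1) -> List[str]:
    """Collect predicted labels from every first model and merge them by vote count."""
    votes: Counter = Counter()
    for model in generators:
        reply = model(f"{first_prompt}\n\nDescription: {description}")
        # Assumption: the reply is a comma-separated list of labels.
        for label in {part.strip() for part in reply.split(",") if part.strip()}:
            votes[label] += 1
    return sorted(label for label, n in votes.items() if n >= min_votes)


def verify_label(label: str, description: str, verifiers: Sequence[Model],
                 second_prompt: str, threshold: float = 0.5) -> bool:
    """Accept the label when the share of 'no error' verdicts exceeds the threshold."""
    if not verifiers:
        raise ValueError("at least one verifier is required")
    ok = 0
    for model in verifiers:
        reply = model(f"{second_prompt}\n\nDescription: {description}\nLabel: {label}")
        # Assumption: a reply starting with "yes" means the label has no errors.
        if reply.strip().lower().startswith("yes"):
            ok += 1
    return ok / len(verifiers) > threshold


def final_labels(description: str, generators: Sequence[Model],
                 verifiers: Sequence[Model], first_prompt: str,
                 second_prompt: str) -> List[str]:
    """Keep only the candidate labels that pass verification."""
    candidates = generate_labels(description, generators, first_prompt)
    return [label for label in candidates
            if verify_label(label, description, verifiers, second_prompt)]
```

In practice each callable would wrap a request to an actual model; stub functions that return fixed strings are enough to exercise the control flow.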
The first prompt word comprises a positive example text and a negative example text, wherein the positive example text comprises a positive sample description text, a positive sample label and a reason why the positive sample label can be used as a label of the content represented by the positive sample description text; and the first prompt word is used for instructing the first large language model to generate, based on the positive example text and the negative example text, the label of the target content represented by the description text to be utilized, and to generate a reason for the label.

Optionally, the second prompt word comprises a plurality of preset labels; and the processing the label to be utilized, the description text to be utilized and the second prompt word by using the second large language model to obtain a verification result indicating whether the label to be utilized has errors comprises: performing the following steps by using the second large language model: detecting whether the label to be utilized exists among the plurality of preset labels; if the label to be utilized does not exist among them, determining that the to-be-utilized