CN-121997049-A - Multi-mode file label automatic generation method


Abstract

The application discloses a method for automatically generating multi-modal file labels, relating to the technical field of label generation. The method first constructs a multi-level file label class set as a reference library for label matching, then receives an uploaded target multi-modal file, preprocesses the text extracted from the file, and constructs a file representation vector. It then assesses the computing power of the current device and generates labels by scenario: if computing power is limited, labels are extracted level by level to generate a multi-level candidate label set; if computing power is sufficient, the multi-level label class sets are merged and the candidate label set is extracted directly. The word segmentation result is then revised by calculating the aggregation degree of adjacent words in the initial segmentation result, and multi-level file labels are automatically extracted again. Finally, the comprehensive cosine similarity values of the generated candidate file labels are calculated and the optimal file label is output. The method makes effective use of resources and improves the accuracy of the generated file labels.

Inventors

XU SHENGWEI
ZHENG HUIMIN
HE JUNHUA
ZHU YONGHENG

Assignees

北京电子科技学院

Dates

Publication Date: 20260508
Application Date: 20260129

Claims (10)

1. A method for automatically generating a multi-modal file label, characterized by comprising the following steps: constructing a multi-level file label class set comprising primary labels, secondary labels and tertiary labels, and constructing a corresponding label representation vector for each level of label class; receiving and parsing a multi-modal file, extracting text content of the file and generating corresponding text data, performing word segmentation on the text data, screening candidate words, and constructing a file representation vector based on word frequency weights of the candidate words; extracting at least one label path consisting of a primary label, a secondary label and a tertiary label based on cosine similarity between the file representation vector and the label representation vectors of each level; and calculating the aggregation degree of adjacent words in the word segmentation result, updating the candidate word set, reconstructing the file representation vector based on the updated candidate word set, repeating the steps of extracting the label path and calculating the comprehensive similarity, and finally outputting the label path with the maximum comprehensive similarity as the final labeling result of the file.
2. The method of claim 1, wherein constructing a multi-level file label class set comprising primary labels, secondary labels and tertiary labels, and constructing a corresponding label representation vector for each level of label class, comprises: presetting a primary label class set, and configuring corresponding label description information for each primary label class; based on the primary label class set, logically expanding each primary label class to generate a corresponding secondary label class set; based on the secondary label class set, logically expanding each secondary label class to generate a corresponding tertiary label class set; and vectorizing the label description information corresponding to the primary, secondary and tertiary labels respectively, to generate the label representation vectors corresponding to the label classes of each level.
3. The method of claim 2, wherein receiving and parsing the multi-modal file, extracting text content of the file and generating corresponding text data, comprises: receiving a multi-modal file to be processed, and determining a corresponding parsing mode according to the file type; when the multi-modal file is a document file, directly parsing the file content to extract text information; when the multi-modal file is an image, audio or video file, extracting text information related to the file content by content analysis or metadata acquisition; and converting the extracted text information into a unified format to generate text data for subsequent processing.
4. The method of claim 3, wherein performing word segmentation on the text data, screening candidate words, and constructing a file representation vector based on word frequency weights of the candidate words, comprises: performing word segmentation on the text data to obtain a word segmentation result consisting of a plurality of words; tagging the words in the word segmentation result with part-of-speech information, and selecting candidate words according to a preset part-of-speech screening rule; counting the occurrence frequency of each candidate word in the corresponding text data, and calculating the word frequency weight of the candidate word based on the occurrence frequency; and sorting the candidate words by word frequency weight, taking the candidate words ranked within a preset top ranking as the feature words of the text data, and constructing the corresponding file representation vector, wherein each element of the vector represents the word frequency weight of one feature word.
5. The method of claim 4, wherein, when computing power is limited, extracting at least one label path consisting of a primary label, a secondary label and a tertiary label based on cosine similarity between the file representation vector and the label representation vectors of each level, comprises: calculating cosine similarity between the file representation vector and each primary label representation vector in the primary label class set, and selecting a candidate primary label according to a preset screening rule; based on the secondary label class set corresponding to the candidate primary label, calculating cosine similarity between the file representation vector and each secondary label representation vector, and selecting a candidate secondary label; based on the tertiary label class set corresponding to the candidate secondary label, calculating cosine similarity between the file representation vector and each tertiary label representation vector, and selecting a candidate tertiary label; and combining the candidate primary label, the candidate secondary label and the candidate tertiary label to form at least one complete label path.
6. The method of claim 4, wherein, when computing power is sufficient, extracting at least one label path consisting of a primary label, a secondary label and a tertiary label based on cosine similarity between the file representation vector and the label representation vectors of each level, comprises: merging the primary label class set, the secondary label class set and the tertiary label class set into a unified label class set; calculating cosine similarity between the file representation vector and each label representation vector in the unified label class set, and selecting a candidate label set according to a preset screening rule; and screening, from the candidate label set, a label path that comprises at least a primary label, a secondary label and a tertiary label.
7. The method according to claim 5 or 6, wherein calculating the aggregation degree of adjacent words in the word segmentation result, updating the candidate word set, reconstructing the file representation vector based on the updated candidate word set, repeating the steps of extracting the label path and calculating the comprehensive similarity, and finally outputting the label path with the maximum comprehensive similarity as the final labeling result of the file, comprises: calculating the aggregation degree of each pair of adjacent words in the word segmentation result; merging adjacent words whose aggregation degree is greater than a preset threshold into a phrase, and updating the candidate word set based on the phrase; recalculating the word frequency weights of the candidate words based on the updated candidate word set, and reconstructing the file representation vector; repeating the label path extraction and comprehensive similarity calculation steps based on the reconstructed file representation vector; and comparing the comprehensive similarities of the label paths obtained before and after the update, and outputting the label path with the larger comprehensive similarity as the final labeling result of the file.
8. A device for automatically generating a multi-modal file label, comprising: a first module, configured to construct a multi-level file label class set comprising primary labels, secondary labels and tertiary labels, and to construct a corresponding label representation vector for each level of label class; a second module, configured to receive and parse a multi-modal file, extract text content of the file and generate corresponding text data, perform word segmentation on the text data, screen candidate words, and construct a file representation vector based on word frequency weights of the candidate words; a third module, configured to extract at least one label path consisting of a primary label, a secondary label and a tertiary label based on cosine similarity between the file representation vector and the label representation vectors of each level; and a fourth module, configured to calculate the aggregation degree of adjacent words in the word segmentation result, update the candidate word set, reconstruct the file representation vector based on the updated candidate word set, repeat the steps of extracting the label path and calculating the comprehensive similarity, and finally output the label path with the maximum comprehensive similarity as the final labeling result of the file.
9. An electronic device comprising a processor and a memory, wherein the processor reads executable program code stored in the memory and runs a program corresponding to the executable program code, so as to implement the method according to any one of claims 1-7.
10. A non-transitory computer readable storage medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the method according to any one of claims 1-7.
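
The two extraction modes of claims 5 and 6 can be illustrated with a minimal Python sketch. It is not the applicant's implementation. The `Label` class, the sparse dictionary vectors, the toy label tree, and the function names are all hypothetical. The screening rule is assumed to be "keep the single best label", and the comprehensive similarity is assumed to be the mean of the per-level cosine similarities, since neither is defined in the text above. The merged mode is simplified to scoring every complete three-level path.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

@dataclass
class Label:
    """Hypothetical label node: a name, a sparse vector, and child labels."""
    name: str
    vector: Dict[str, float]
    children: List["Label"] = field(default_factory=list)

def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def stepwise_path(doc: Dict[str, float], roots: List[Label]) -> Tuple[List[str], float]:
    """Limited computing power: pick the best label per level and only
    descend into the chosen branch (assumed rule: top-1 per level)."""
    path, scores, level = [], [], roots
    while level:
        best = max(level, key=lambda lab: cosine(doc, lab.vector))
        path.append(best.name)
        scores.append(cosine(doc, best.vector))
        level = best.children
    return path, sum(scores) / len(scores)

def merged_path(doc: Dict[str, float], roots: List[Label]) -> Tuple[List[str], float]:
    """Sufficient computing power: score every label in the merged set and
    keep the complete path with the highest mean similarity (assumed)."""
    best_path: List[str] = []
    best_score = -1.0

    def walk(label: Label, names: List[str], scores: List[float]) -> None:
        nonlocal best_path, best_score
        names = names + [label.name]
        scores = scores + [cosine(doc, label.vector)]
        if not label.children:  # reached a tertiary (leaf) label
            mean = sum(scores) / len(scores)
            if mean > best_score:
                best_path, best_score = names, mean
        for child in label.children:
            walk(child, names, scores)

    for root in roots:
        walk(root, [], [])
    return best_path, best_score

# Toy three-level label tree and file representation vector (made-up data).
roots = [
    Label("Finance", {"budget": 1.0, "invoice": 1.0}, [
        Label("Accounting", {"invoice": 1.0, "ledger": 1.0}, [
            Label("Invoices", {"invoice": 1.0})])]),
    Label("Engineering", {"design": 1.0, "code": 1.0}, [
        Label("Software", {"code": 1.0, "test": 1.0}, [
            Label("Testing", {"test": 1.0})])]),
]
doc = {"invoice": 2.0, "ledger": 1.0}

print(stepwise_path(doc, roots))
print(merged_path(doc, roots))
```

The stepwise mode evaluates only the labels on the chosen branch, while the merged mode evaluates every label once, which is the trade-off between computing cost and search coverage that the two claims describe.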

Description

Multi-mode file label automatic generation method

Technical Field

The invention relates to the technical field of label generation, and in particular to a method for automatically generating multi-modal file labels.

Background

In the prior art, label generation for multi-modal files mostly relies on manual, customized operation guided by analysis requirements and experience. The whole process may involve cumbersome steps such as manually reading file content, manually extracting information, and screening matching labels. It requires heavy manual involvement, is inefficient, consumes substantial labor cost, and is difficult to adapt to batch label generation for multi-modal files. By calculating cosine similarity between the pre-constructed multi-level file label class set and the file representation vector, the present method completes the matching and generation of file labels automatically, effectively reducing labor input and improving label generation efficiency. In addition, the prior art does not adapt to differences in computing resources. If a single label generation method is run on devices with different computing power configurations, two defects are likely: tasks stall or cannot proceed at all when computing resources are insufficient, and resource utilization is low because of process redundancy when computing resources are abundant.

Disclosure of Invention

The main object of the invention is to provide a method for automatically generating multi-modal file labels. A second object of the invention is to provide a device for automatically generating multi-modal file labels. A third object of the invention is to provide an electronic device. A fourth object of the invention is to provide a non-transitory computer readable storage medium.
To achieve the above objects, an embodiment of a first aspect of the invention provides a method for automatically generating a multi-modal file label, comprising: constructing a multi-level file label class set comprising primary labels, secondary labels and tertiary labels, and constructing a corresponding label representation vector for each level of label class; receiving and parsing a multi-modal file, extracting text content of the file and generating corresponding text data, performing word segmentation on the text data, screening candidate words, and constructing a file representation vector based on word frequency weights of the candidate words; extracting at least one label path consisting of a primary label, a secondary label and a tertiary label based on cosine similarity between the file representation vector and the label representation vectors of each level; and calculating the aggregation degree of adjacent words in the word segmentation result, updating the candidate word set, reconstructing the file representation vector based on the updated candidate word set, repeating the steps of extracting the label path and calculating the comprehensive similarity, and finally outputting the label path with the maximum comprehensive similarity as the final labeling result of the file.
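
The vector construction and phrase-merging steps in the paragraph above can be sketched in Python as follows. This is an illustration under stated assumptions, not the method as filed. The text here does not give a formula for the aggregation degree, so pointwise mutual information (PMI) is swapped in as a stand-in. The word frequency weight is assumed to be relative frequency. The function names, the `threshold`, `min_count` and `top_k` parameters, and the sample words are hypothetical, and part-of-speech screening is omitted.

```python
import math
from collections import Counter
from typing import Dict, List

def tf_vector(words: List[str], top_k: int = 10) -> Dict[str, float]:
    """File representation vector: relative frequency of the top_k words."""
    counts = Counter(words)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.most_common(top_k)}

def merge_by_aggregation(words: List[str], threshold: float,
                         min_count: int = 2) -> List[str]:
    """Merge adjacent word pairs into phrases when their aggregation degree
    (assumed here to be PMI) exceeds the threshold. min_count is an assumed
    guard against merging pairs that were seen only once."""
    n = len(words)
    unigram = Counter(words)
    bigram = Counter(zip(words, words[1:]))
    merged: List[str] = []
    i = 0
    while i < n:
        if i + 1 < n:
            pair = (words[i], words[i + 1])
            p_pair = bigram[pair] / (n - 1)
            p_a = unigram[pair[0]] / n
            p_b = unigram[pair[1]] / n
            pmi = math.log(p_pair / (p_a * p_b))
            if pmi > threshold and bigram[pair] >= min_count:
                merged.append(pair[0] + " " + pair[1])
                i += 2  # both words are consumed by the phrase
                continue
        merged.append(words[i])
        i += 1
    return merged

# Made-up initial segmentation result.
words = ["machine", "learning", "model", "uses", "machine", "learning", "data"]
before = tf_vector(words)
updated = merge_by_aggregation(words, threshold=1.0)
after = tf_vector(updated)
print(updated)
print(after)
```

Label path extraction would then be run once on `before` and once on `after`, and the path with the larger comprehensive similarity would be kept, as the paragraph describes.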
Optionally, constructing a multi-level file label class set comprising primary labels, secondary labels and tertiary labels, and constructing a corresponding label representation vector for each level of label class, comprises: presetting a primary label class set, and configuring corresponding label description information for each primary label class; based on the primary label class set, logically expanding each primary label class to generate a corresponding secondary label class set; based on the secondary label class set, logically expanding each secondary label class to generate a corresponding tertiary label class set; and vectorizing the label description information corresponding to the primary, secondary and tertiary labels respectively, to generate the label representation vectors corresponding to the label classes of each level.
Optionally, receiving and parsing the multi-modal file, extracting text content of the file and generating corresponding text data, comprises: receiving a multi-modal file to be processed, and determining a corresponding parsing mode according to the file type; when the multi-modal file is a document file, directly parsing the file content to extract text information; when the multi-modal file is an image, audio or video file, extracting text information related to the file content by content analysis or metadata acquisition; and converting the extracted text information into a unified format to generate text data for subsequent processing.
Optionally, word segmentation processing is performed on the text data, candidate words are screened, and a