CN-121997073-A - Word screening method, device, electronic equipment and storage medium
Abstract
The embodiments of the application relate to artificial intelligence technology, and in particular to a word screening method, a word screening device, electronic equipment and a storage medium. The method comprises: obtaining a sentence to be screened; performing Chinese character recognition on the sentence to obtain a Chinese character sequence to be screened; splitting each Chinese character in the sequence according to a Chinese character radical mapping table to obtain a radical sequence to be screened; recombining at least 2 adjacent radicals in the radical sequence according to the Chinese character radical mapping table to obtain Chinese characters to be matched; and matching the Chinese characters to be matched against preset screening words to obtain a word screening result. The embodiments improve the effectiveness of screening specific words and the accuracy of identifying variant forms of those words.
Inventors
- LIN JUNXIONG
Assignees
- 中国工商银行股份有限公司
Dates
- Publication Date: 20260508
- Application Date: 20260213
Claims (10)
- 1. A word screening method, comprising: acquiring a sentence to be screened, and performing Chinese character recognition on the sentence to be screened to obtain a Chinese character sequence to be screened; splitting each Chinese character in the Chinese character sequence to be screened according to a Chinese character radical mapping table to obtain a radical sequence to be screened; recombining at least 2 adjacent radicals in the radical sequence to be screened according to the Chinese character radical mapping table to obtain Chinese characters to be matched; and matching the Chinese characters to be matched against preset screening words to obtain a word screening result.
- 2. The method of claim 1, wherein recombining at least 2 adjacent radicals in the radical sequence to be screened to obtain Chinese characters to be matched comprises: selecting, through a matching window, at least 2 adjacent radicals in the radical sequence to be screened as radicals to be matched; judging, according to the Chinese character radical mapping table, whether the radicals to be matched have a corresponding candidate Chinese character; and if so, taking the candidate Chinese character as a Chinese character to be matched.
- 3. The method of claim 2, further comprising, after judging according to the Chinese character radical mapping table whether the radicals to be matched have a corresponding candidate Chinese character: if not, judging whether the size of the matching window equals a window threshold; if so, sliding the matching window rightward by a sliding unit, restoring the size of the matching window to a window initial value, and returning to the step of selecting radicals to be matched; otherwise, increasing the size of the matching window and returning to the step of selecting radicals to be matched.
- 4. The method of claim 1, wherein splitting each Chinese character in the Chinese character sequence to be screened according to the Chinese character radical mapping table to obtain the radical sequence to be screened comprises: determining a radical similarity between each Chinese character to be split in the Chinese character sequence to be screened and each candidate mapped Chinese character in the Chinese character radical mapping table, according to the radical edit distance, the radical semantic coefficient and the radical position coefficient of that pair of characters; and determining a target mapped Chinese character from the candidate mapped Chinese characters whose radical similarity is greater than a preset threshold, so as to obtain the radical sequence to be screened according to the target mapped Chinese character.
- 5. The method of claim 1, wherein matching the Chinese characters to be matched against preset screening words to obtain a word screening result comprises: matching the Chinese characters to be matched against each preset screening word in both glyph form and semantics; if the matching succeeds, taking the successfully matched Chinese characters to be matched as the word screening result; otherwise, splitting the Chinese characters to be matched and returning to the step of recombining the radical sequence to be screened.
- 6. The method of claim 1, wherein the Chinese character radical mapping table comprises: a bidirectional mapping between Chinese characters and their corresponding radicals, the variants corresponding to each radical, and the position weights of the radicals within the Chinese characters.
- 7. A word screening apparatus, comprising: a Chinese character recognition module, configured to acquire a sentence to be screened and perform Chinese character recognition on the sentence to be screened to obtain a Chinese character sequence to be screened; a Chinese character splitting module, configured to split each Chinese character in the Chinese character sequence to be screened according to a Chinese character radical mapping table to obtain a radical sequence to be screened; a radical recombination module, configured to recombine at least 2 adjacent radicals in the radical sequence to be screened according to the Chinese character radical mapping table to obtain Chinese characters to be matched; and a word screening result determining module, configured to match the Chinese characters to be matched against preset screening words to obtain a word screening result.
- 8. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the word screening method of any one of claims 1-6 when executing the program.
- 9. A computer readable storage medium having stored thereon a computer program, which when executed by a processor implements the word screening method of any of claims 1-6.
- 10. A computer program product, characterized in that the computer program product comprises a computer program which, when executed by a processor, implements the word screening method according to any one of claims 1-6.
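The matching-window loop of claims 2 and 3 can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the table `COMPOSE`, the function name `recombine_radicals`, and the default window values (initial size 2, threshold 3, sliding unit 1) are all hypothetical. The claims also do not say where the window goes after a successful match; the sketch assumes it moves past the radicals it just consumed.

```python
# Hypothetical recombination side of a radical mapping table:
# a tuple of adjacent radicals maps to the character they form.
COMPOSE = {
    ("马", "扁"): "骗",
    ("贝", "者"): "赌",
    ("木", "木", "木"): "森",
}


def recombine_radicals(radicals, initial=2, threshold=3, step=1):
    """Return (start_index, character) pairs found by a growing, sliding window."""
    found = []
    start, size = 0, initial
    # Stop once fewer than `initial` radicals remain to the right of `start`.
    while start + initial <= len(radicals):
        window = tuple(radicals[start:start + size])
        candidate = COMPOSE.get(window)
        if candidate is not None:
            # Claim 2: a candidate exists, so it becomes a character to be matched.
            found.append((start, candidate))
            start += size  # assumption: skip the radicals just consumed
            size = initial
        elif size >= threshold or start + size >= len(radicals):
            # Claim 3: window is at its threshold (or cannot grow further),
            # so slide right by one sliding unit and reset the window size.
            start += step
            size = initial
        else:
            # Claim 3: otherwise enlarge the window and try again.
            size += 1
    return found


print(recombine_radicals(["人", "贝", "者"]))  # [(1, '赌')]
```

Growing the window before sliding it lets a three-radical character such as 森 be found even though its first two radicals alone have no entry in this toy table.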
Description
Word screening method, device, electronic equipment and storage medium
Technical Field
The embodiments of the application relate to artificial intelligence technology, and in particular to a word screening method, a word screening device, electronic equipment and a storage medium.
Background
In the financial field, effectively screening specific words in the language used on an enterprise platform and in the data submitted by users is important for maintaining the network order of the enterprise platform and ensuring the accuracy of data auditing results. In the prior art, the text to be screened is usually compared word by word with preset specific words. However, with the emergence of various text variants, word-by-word comparison can no longer screen specific words effectively.
Disclosure of Invention
The application provides a word screening method, a device, electronic equipment and a storage medium, which improve the effectiveness of screening specific words and the accuracy of identifying specific words that have been altered into text variants.
In a first aspect, an embodiment of the present application provides a word screening method, including:
acquiring a sentence to be screened, and performing Chinese character recognition on the sentence to be screened to obtain a Chinese character sequence to be screened;
splitting each Chinese character in the Chinese character sequence to be screened according to a Chinese character radical mapping table to obtain a radical sequence to be screened;
recombining at least 2 adjacent radicals in the radical sequence to be screened according to the Chinese character radical mapping table to obtain Chinese characters to be matched;
and matching the Chinese characters to be matched against preset screening words to obtain a word screening result.
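The four steps of the first aspect can be traced end to end in a short Python sketch. It is an illustration under stated assumptions rather than the method as filed: the tables `DECOMPOSE` and `COMPOSE`, the set `SCREENING_WORDS`, and the function names are hypothetical; Chinese character recognition is reduced to keeping code points in the CJK Unified Ideographs block (U+4E00 to U+9FFF); a character without a table entry is treated as its own single radical; and recombination only tries adjacent pairs.

```python
# Hypothetical bidirectional radical mapping table (illustrative entries only).
DECOMPOSE = {"骗": ("马", "扁"), "赌": ("贝", "者")}
COMPOSE = {parts: char for char, parts in DECOMPOSE.items()}

# Hypothetical preset screening words.
SCREENING_WORDS = {"骗", "赌"}


def is_chinese(ch):
    """Simplified recognition: CJK Unified Ideographs block only."""
    return "\u4e00" <= ch <= "\u9fff"


def screen(sentence):
    """Return the screening words found in `sentence`, including split variants."""
    # Step 1: keep only Chinese characters (drops spaces and punctuation).
    chars = [c for c in sentence if is_chinese(c)]
    # Step 2: split each character into radicals; characters without a
    # table entry are kept as a single radical (assumption).
    radicals = []
    for c in chars:
        radicals.extend(DECOMPOSE.get(c, (c,)))
    # Steps 3 and 4: recombine adjacent pairs and match against the preset words.
    hits = []
    for i in range(len(radicals) - 1):
        candidate = COMPOSE.get((radicals[i], radicals[i + 1]))
        if candidate in SCREENING_WORDS:
            hits.append(candidate)
    return hits


print(screen("小心马 扁!"))  # ['骗']
print(screen("不要赌"))      # ['赌']
```

Because every character is first reduced to radicals, the intact word and its split variant follow the same path: 赌 is split into 贝 and 者 and recombined, while 马 and 扁 written as separate characters are recombined into 骗.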
In a second aspect, an embodiment of the present application further provides a word screening apparatus, including:
a Chinese character recognition module, configured to acquire a sentence to be screened and perform Chinese character recognition on the sentence to be screened to obtain a Chinese character sequence to be screened;
a Chinese character splitting module, configured to split each Chinese character in the Chinese character sequence to be screened according to a Chinese character radical mapping table to obtain a radical sequence to be screened;
a radical recombination module, configured to recombine at least 2 adjacent radicals in the radical sequence to be screened according to the Chinese character radical mapping table to obtain Chinese characters to be matched;
and a word screening result determining module, configured to match the Chinese characters to be matched against preset screening words to obtain a word screening result.
In a third aspect, an embodiment of the present application further provides an electronic device, including: one or more processors; and a storage means for storing one or more programs, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement any of the word screening methods provided by the embodiments of the present application.
In a fourth aspect, embodiments of the present application also provide a storage medium comprising computer-executable instructions which, when executed by a computer processor, perform any of the word screening methods provided by the embodiments of the present application.
In a fifth aspect, embodiments of the present application also provide a computer program product comprising a computer program which, when executed by a processor, implements any of the word screening methods provided by the embodiments of the present application.
In the embodiments of the application, a sentence to be screened is obtained and Chinese character recognition is performed on it to obtain a Chinese character sequence to be screened, so that the characters in the sentence are accurately recognized. Each Chinese character in the sequence is split according to the Chinese character radical mapping table to obtain a radical sequence to be screened, at least 2 adjacent radicals in the radical sequence are recombined according to the Chinese character radical mapping table to obtain Chinese characters to be matched, and the Chinese characters to be matched are matched against preset screening words to obtain a word screening result. Splitting and recombining radicals makes it possible to recognize specific words that evade screening through structural changes, which improves the effectiveness of screening specific words and the accuracy of identifying variant forms of those words. Therefore, through the technical scheme of the application, the problem that the effective screening of the specific words cannot be realized by