CN-121996776-A - User authentication method and device, readable storage medium and program product


Abstract

The application discloses a user identity verification method and device, a readable storage medium, and a program product, and relates to the field of artificial intelligence. The method comprises: obtaining, based on the importance degree of each character included in a target text, a high-dimensional vector matrix of a first dimension corresponding to the target text and a low-dimensional vector matrix of a second dimension obtained by reducing the dimension of the high-dimensional vector matrix; performing similarity matching against a blacklist database based on the low-dimensional vector matrix and the high-dimensional vector matrix to retrieve blacklist texts exceeding a preset similarity threshold, wherein the blacklist database stores in advance the high-dimensional vector matrix of the first dimension and the low-dimensional vector matrix of the second dimension of each of a plurality of blacklist texts; and verifying the identity of a target user based on the blacklist texts exceeding the preset similarity threshold.

Inventors

WANG YU
LI YICHEN
YUAN BOWEN
LI JINLAN

Assignees

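The People's Insurance Company (Group) of China Limited (中国人民保险集团股份有限公司)
PICC Information Technology Co., Ltd. (人保信息科技有限公司)
PICC Life Insurance Company Limited (中国人民人寿保险股份有限公司)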

Dates

Publication Date: 2026-05-08
Application Date: 2025-12-30

Claims (10)

1. A user identity verification method, comprising: acquiring a target text carrying identity information of a target user; obtaining, based on the importance degree of each character included in the target text, a high-dimensional vector matrix of a first dimension corresponding to the target text and a low-dimensional vector matrix of a second dimension obtained by reducing the dimension of the high-dimensional vector matrix; performing similarity matching against a blacklist database based on the low-dimensional vector matrix and the high-dimensional vector matrix to retrieve blacklist texts exceeding a preset similarity threshold, wherein the blacklist database stores in advance, for each of a plurality of blacklist texts, the corresponding high-dimensional vector matrix of the first dimension and low-dimensional vector matrix of the second dimension; and verifying the identity of the target user based on the blacklist texts exceeding the preset similarity threshold.
2. The method of claim 1, wherein obtaining the high-dimensional vector matrix of the first dimension corresponding to the target text comprises: splitting the target text into a plurality of substrings of the same length, each substring comprising one or more characters; calculating the importance degree of each split substring in the target text and the plurality of blacklist texts; and obtaining the high-dimensional vector matrix of the first dimension corresponding to the target text based on the importance degrees and the total number of distinct substrings corresponding to the plurality of blacklist texts, wherein the first dimension corresponds to the total number of distinct substrings.
3. The method of claim 2, wherein splitting the target text into a plurality of substrings comprises: performing, through an N-gram model, a sliding-window operation of a preset length over the characters included in the target text, so as to split the target text into a plurality of substrings each comprising the preset length of characters.
4. The method of claim 2, wherein calculating the importance degree of each split substring in the target text and the plurality of blacklist texts comprises: calculating, through a TF-IDF model, TF-IDF weights of a target substring with respect to the target text and the plurality of blacklist texts; and determining the importance degree of the target substring in the target text and the plurality of blacklist texts according to the TF-IDF weights.
5. The method of claim 1, wherein obtaining the low-dimensional vector matrix of the second dimension by reducing the dimension of the high-dimensional vector matrix comprises: performing dimension reduction on the high-dimensional vector matrix through a singular value decomposition model to obtain the low-dimensional vector matrix of the second dimension.
6. The method of claim 1, wherein performing similarity matching against the blacklist database based on the low-dimensional vector matrix and the high-dimensional vector matrix to retrieve blacklist texts exceeding the preset similarity threshold comprises: performing similarity matching between the low-dimensional vector matrix corresponding to the target text and the low-dimensional vector matrices of the second dimension of the plurality of blacklist texts prestored in the blacklist database, so as to retrieve the top M most similar blacklist texts; performing similarity matching between the high-dimensional vector matrix corresponding to the target text and the prestored high-dimensional vector matrices of the first dimension of the top M most similar blacklist texts, so as to retrieve the top N most similar blacklist texts exceeding the preset similarity threshold, wherein N is smaller than M; and determining the top N most similar blacklist texts as the blacklist texts retrieved for the target text.
7. The method according to any one of claims 1 to 6, further comprising, before retrieving the blacklist texts exceeding the preset similarity threshold: obtaining, based on the importance degree of each character included in each blacklist text within the corresponding blacklist text, a high-dimensional vector matrix of the first dimension corresponding to each blacklist text and a low-dimensional vector matrix of the second dimension obtained by reducing the dimension of that high-dimensional vector matrix.
8. A user authentication device comprising a processor and a memory storing a program or instructions executable on the processor, which when executed by the processor, implement the steps of the method of any of claims 1-7.
9. A readable storage medium, characterized in that it has stored thereon a program or instructions which, when executed by a processor, implement the steps of the method according to any of claims 1-7.
10. A computer program product, characterized in that the computer program product comprises a non-transitory computer readable storage medium storing a computer program, the computer program being operable to cause a computer to perform the steps of the method according to any of claims 1-7.
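
The coarse-to-fine retrieval recited in claim 6 can be illustrated with a minimal sketch. It rests on several assumptions that the claims do not state: cosine similarity is used as the similarity measure, the low-dimensional and high-dimensional vectors of every blacklist text are already computed and held in memory, and "the top N exceeding the threshold" is read as "at most N entries whose high-dimensional similarity exceeds the threshold". The names `cosine` and `two_stage_match` and the toy vectors are hypothetical and chosen only for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def two_stage_match(query_low, query_high, blacklist, m, n, threshold):
    """Retrieve blacklist texts in two stages.

    blacklist is a list of (text, low_vec, high_vec) tuples computed in advance.
    Stage 1 ranks every entry by low-dimensional similarity and keeps the top m.
    Stage 2 re-scores those m entries with the high-dimensional vectors and
    keeps at most n entries whose similarity exceeds the threshold.
    """
    if not n < m:
        raise ValueError("n must be smaller than m")
    # Stage 1: cheap coarse ranking on the low-dimensional vectors.
    coarse = sorted(
        blacklist, key=lambda entry: cosine(query_low, entry[1]), reverse=True
    )[:m]
    # Stage 2: precise re-ranking of the m candidates on the high-dimensional vectors.
    scored = [(cosine(query_high, entry[2]), entry[0]) for entry in coarse]
    fine = sorted((s for s in scored if s[0] > threshold), reverse=True)
    return [(text, score) for score, text in fine[:n]]


# Toy data (illustrative only): 2-dimensional "low" and 4-dimensional "high" vectors.
blacklist = [
    ("alice", [1.0, 0.0], [1.0, 0.0, 0.0, 0.0]),
    ("alicia", [0.9, 0.1], [0.8, 0.6, 0.0, 0.0]),
    ("bob", [0.0, 1.0], [0.0, 0.0, 1.0, 0.0]),
]
hits = two_stage_match(
    [1.0, 0.0], [1.0, 0.0, 0.0, 0.0], blacklist, m=2, n=1, threshold=0.9
)
print(hits)
```

In this toy run the low-dimensional pass keeps "alice" and "alicia" as candidates, and the high-dimensional pass retains only "alice", because the high-dimensional similarity of "alicia" (0.8) falls below the threshold. Restricting the high-dimensional comparison to M candidates is what keeps the precise step inexpensive on a large blacklist.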

Description

User authentication method and device, readable storage medium and program product

Technical Field

The present application relates to the field of artificial intelligence technology, and in particular, to a user authentication method and apparatus, a readable storage medium, and a program product.

Background

In various user-facing service fields, users accessing a service system are authenticated in order to improve the security of that system. For example, short text matching technology is used for user identity verification: identity information provided by a user is matched against a blacklist database, so that potentially bad users are effectively identified, which helps the service system detect possible bad behavior and reduces risk.

Existing short text matching approaches include character-based similarity metrics, such as edit distance; token-based similarity metrics, such as atomic strings; and phonetic similarity metrics, among others. However, to fully cover the data ambiguity that arises in different situations, different rules must be set for each case. In addition, the similarity matching precision is low, and formulating the corresponding rules requires generating a large number of training samples, which places high demands on the sample data and increases the computational complexity.

Therefore, traditional user identity verification schemes suffer from problems such as complex computation and a high false alarm rate on large data sets.

Disclosure of Invention

The embodiments of the application aim to provide a user identity verification method and device, a readable storage medium, and a program product, so as to solve the problems of complex computation and a high false alarm rate in blacklist text matching in existing user identity verification.

To solve the above technical problems, the present specification is implemented as follows.

In a first aspect, a user identity verification method is provided, including: acquiring a target text carrying identity information of a target user; obtaining, based on the importance degree of each character included in the target text, a high-dimensional vector matrix of a first dimension corresponding to the target text and a low-dimensional vector matrix of a second dimension obtained by reducing the dimension of the high-dimensional vector matrix; performing similarity matching against a blacklist database based on the low-dimensional vector matrix and the high-dimensional vector matrix to retrieve blacklist texts exceeding a preset similarity threshold, wherein the blacklist database stores in advance, for each of a plurality of blacklist texts, the corresponding high-dimensional vector matrix of the first dimension and low-dimensional vector matrix of the second dimension; and verifying the identity of the target user based on the blacklist texts exceeding the preset similarity threshold.

Optionally, obtaining the high-dimensional vector matrix of the first dimension corresponding to the target text includes: splitting the target text into a plurality of substrings of the same length, each substring comprising one or more characters; calculating the importance degree of each split substring in the target text and the plurality of blacklist texts; and obtaining the high-dimensional vector matrix of the first dimension corresponding to the target text based on the importance degrees and the total number of distinct substrings corresponding to the plurality of blacklist texts, wherein the first dimension corresponds to the total number of distinct substrings.

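
The construction of the high-dimensional vector described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the application: the substrings are character bigrams (window length 2), the importance degree is a TF-IDF weight with a smoothed inverse document frequency computed over the blacklist texts only, and the helper names `char_ngrams`, `build_vocabulary`, `idf_weights` and `tfidf_vector`, as well as the sample strings, are hypothetical.

```python
import math
from collections import Counter


def char_ngrams(text, n=2):
    """Slide a window of length n over the characters of text."""
    if len(text) < n:
        return [text] if text else []
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def build_vocabulary(blacklist_texts, n=2):
    """Assign an index to every distinct substring found in the blacklist texts."""
    grams = sorted({g for t in blacklist_texts for g in char_ngrams(t, n)})
    return {g: i for i, g in enumerate(grams)}


def idf_weights(blacklist_texts, vocab, n=2):
    """Smoothed inverse document frequency of each vocabulary substring."""
    doc_count = len(blacklist_texts)
    doc_freq = Counter()
    for t in blacklist_texts:
        doc_freq.update(set(char_ngrams(t, n)))
    return {g: math.log((1 + doc_count) / (1 + doc_freq[g])) + 1.0 for g in vocab}


def tfidf_vector(text, vocab, idf, n=2):
    """Map text to a vector whose length equals the number of distinct substrings."""
    grams = char_ngrams(text, n)
    counts = Counter(g for g in grams if g in vocab)
    vec = [0.0] * len(vocab)
    for g, c in counts.items():
        # Term frequency within this text times the blacklist-wide IDF.
        vec[vocab[g]] = (c / len(grams)) * idf[g]
    return vec


# Illustrative data only.
blacklist = ["zhang wei 1990", "li na 1985", "wang fang 1978"]
vocab = build_vocabulary(blacklist)
idf = idf_weights(blacklist, vocab)
vec = tfidf_vector("zhang wei 1991", vocab, idf)
print(len(vec) == len(vocab))
```

Here the length of `vec` equals the number of distinct bigrams across the blacklist texts, which plays the role of the first dimension. Substrings of the target text that never occur in the blacklist (for example "91" in the sample) have no index in the vocabulary and are ignored in this sketch.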
Optionally, splitting the target text into a plurality of substrings includes: performing, through an N-gram model, a sliding-window operation of a preset length over the characters included in the target text, so as to split the target text into a plurality of substrings each comprising the preset length of characters.

Optionally, calculating the importance degree of each split substring in the target text and the plurality of blacklist texts includes: calculating, through a TF-IDF model, TF-IDF weights of a target substring with respect to the target text and the plurality of blacklist texts; and determining the importance degree of the target substring in the target text and the plurality of blacklist texts according to the TF-IDF weights.

Optionally, obtaining the low-dimensional vector matrix of the second dimension by reducing the dimension of the high-dimensional vector matrix includes: performing dimension reduction on the high-dimensional vector matrix through a singular value decomposition model to obtain the low-dimensional vector matrix of the second dimension.

Optionally, performing similarity matching against the blacklist database based on the low-dimensional vector matrix and the high-dimensional vector matrix to retrieve blacklist texts exceeding the preset similarity threshold includes: performing similarity matching between the low-dimensional vector matrix corresponding to the target text and the low-di