CN-117421488-B - User identity alignment method across social networks based on user space-time data
Abstract
The invention provides a method for aligning user identities across social networks based on user spatiotemporal data, comprising the following steps: S1, preprocessing user check-in data A and user check-in data B respectively to obtain a user trajectory T_A and a user trajectory T_B; and S2, inputting the user trajectory T_A and the user trajectory T_B together into a user identity alignment prediction model to obtain a user identity alignment result, wherein the user identity alignment prediction model comprises a vectorization module, a trajectory cross-attention module, a Bi-LSTM module, a local attention module and an output module. In summary, the method can obtain more accurate user identity alignment results.
Inventors
- GUAN JIHONG
- YANG YUE
- ZHANG YICHAO
- LI WENGEN
Assignees
- 同济大学
Dates
- Publication Date
- 20260505
- Application Date
- 20230920
Claims (7)
- 1. A method for aligning user identities across social networks based on user spatiotemporal data, used for obtaining a user identity alignment result for a user A and a user B according to user check-in data A of a user u_i on a platform A and user check-in data B of a user u_j on a platform B, characterized by comprising the following steps: step S1, preprocessing the user check-in data A and the user check-in data B respectively to obtain a user trajectory T_A and a user trajectory T_B; step S2, inputting the user trajectory T_A and the user trajectory T_B together into a user identity alignment prediction model to obtain the user identity alignment result; wherein the user identity alignment prediction model comprises a vectorization module, a trajectory cross-attention module, a Bi-LSTM module, a local attention module and an output module; the vectorization module is used for vectorizing the user trajectory T_A and the user trajectory T_B to obtain a check-in position vector X_i and a check-in position vector X_j; the trajectory cross-attention module is used for enhancing the check-in position vector X_i and the check-in position vector X_j according to a trajectory cross-attention mechanism to obtain an enhanced check-in position vector X'_i and an enhanced check-in position vector X'_j; the Bi-LSTM module comprises a bidirectional long short-term memory neural network that performs feature extraction on the enhanced check-in position vector X'_i and the enhanced check-in position vector X'_j respectively to obtain a trajectory hidden feature h_i and a trajectory hidden feature h_j; the local attention module is configured to obtain a final embedded representation vector s'_i according to the trajectory hidden feature h_j, the enhanced check-in position vector X'_i and the trajectory hidden feature h_i, and to obtain a final embedded representation vector s'_j according to the trajectory hidden feature h_i, the enhanced check-in position vector X'_j and the trajectory hidden feature h_j; and the output module is used for obtaining, according to the final embedded representation vector s'_i and the final embedded representation vector s'_j, the matching probability of the user trajectory T_A and the user trajectory T_B as the user identity alignment result.
- 2. The method for aligning user identities across social networks based on user spatiotemporal data according to claim 1, characterized in that: the user check-in data includes GPS information and corresponding timestamp information for each check-in position; and in step S1, the preprocessing sorts the timestamp information and the corresponding GPS information in the user check-in data in chronological order to obtain the user trajectory.
- 3. The method for aligning user identities across social networks based on user spatiotemporal data according to claim 1, characterized in that: the vectorization module obtains a corresponding embedded vector representation for each check-in position in the input user trajectory through a one-hot encoding algorithm, maps all the embedded vector representations into a low-dimensional vector space, and concatenates them to obtain the check-in position vector corresponding to the user trajectory.
- 4. The method for aligning user identities across social networks based on user spatiotemporal data according to claim 1, characterized in that: the enhanced check-in position vector X'_i is calculated as: X'_i = softmax(γ_ij) V_j + X_i, wherein V_j is the position feature obtained by mapping the check-in position vector X_j into the V feature space, softmax() is a nonlinear activation function, Q_i is the position feature obtained by mapping the check-in position vector X_i into the Q feature space, K_j is the position feature obtained by mapping the check-in position vector X_j into the K feature space, the product used is a multiplication of corresponding matrix elements, λ is a parameter balancing the spatial and temporal correlations, and distance() is the haversine distance formula; the spatial correlation is defined between the m-th check-in position of user u_i and the n-th check-in position of user u_j, and is computed from the GPS information of the m-th check-in position of user u_i in user trajectory T_A and the GPS information of the n-th check-in position of user u_j in user trajectory T_B, with a being the distance adjustment parameter; the temporal correlation is defined between the same pair of check-in positions, and is computed from the timestamp information of the m-th check-in position of user u_i in user trajectory T_A and the timestamp information of the n-th check-in position of user u_j in user trajectory T_B, with β being the time adjustment parameter; |·| denotes the L1 norm, and e is a constant; the enhanced check-in position vector X'_j is calculated as: X'_j = softmax(γ_ji) V_i + X_j, wherein V_i is the position feature obtained by mapping the check-in position vector X_i into the V feature space, Q_j is the position feature obtained by mapping the check-in position vector X_j into the Q feature space, and K_i is the position feature obtained by mapping the check-in position vector X_i into the K feature space.
- 5. The method for aligning user identities across social networks based on user spatiotemporal data according to claim 1, characterized in that: the final embedded representation vector s'_i is calculated as: s'_i = σ(W_T [s_i; h_i] + b_T), a_{m,j} = w^T tanh([W_l X'_{i,m}; W_g h_j]), wherein σ is an activation function, W_T is a training parameter, b_T is a bias term, l_i is the length of the user trajectory T_A, X'_{i,m} is the vector corresponding to the m-th check-in position of the user trajectory T_A in the enhanced check-in position vector X'_i, tanh() is an activation function, and W_l, W_g and w are training parameters; the final embedded representation vector s'_j is calculated as: s'_j = σ(W_T [s_j; h_j] + b_T), a_{m,i} = w^T tanh([W_l X'_{j,m}; W_g h_i]), wherein l_j is the length of the user trajectory T_B and X'_{j,m} is the vector corresponding to the m-th check-in position of the user trajectory T_B in the enhanced check-in position vector X'_j.
- 6. The method for aligning user identities across social networks based on user spatiotemporal data according to claim 1, characterized in that: in the calculation formula of the matching probability, W_1 and W_2 are training parameter matrices, and b_1 and b_2 are bias terms.
- 7. The method for aligning user identities across social networks based on user spatiotemporal data according to claim 1, characterized in that: the user identity alignment prediction model is trained based on a loss function Loss and a training dataset constructed from existing social network data, the training dataset comprising M groups of sample pairs, each group of sample pairs comprising samples and corresponding real labels; in the calculation formula of the loss function Loss, the predicted term is the matching probability obtained after inputting the i-th sample into the user identity alignment prediction model, and y_i is the real label corresponding to the i-th sample.
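The following is a minimal Python sketch of a few building blocks named in claims 2 to 4, offered as an illustration rather than as the patented implementation. It shows the chronological sorting of step S1, a one-hot code for a location index, the haversine distance, and the residual form X'_i = softmax(γ_ij) V_j + X_i. The function names, the (timestamp, latitude, longitude) tuple layout, the Earth radius constant and the row-wise softmax are all assumptions. The score matrix γ_ij is passed in by the caller, because its formula is not reproduced in the claim text above.

```python
import math
from typing import List, Tuple

# Assumed layout of one check-in: (timestamp_seconds, latitude_deg, longitude_deg).
CheckIn = Tuple[float, float, float]

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumed constant


def build_trajectory(check_ins: List[CheckIn]) -> List[CheckIn]:
    """Step S1 (claim 2): order check-ins chronologically by timestamp."""
    return sorted(check_ins, key=lambda c: c[0])


def one_hot(index: int, size: int) -> List[float]:
    """One-hot code of a location index (claim 3), before any learned projection."""
    vec = [0.0] * size
    vec[index] = 1.0
    return vec


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two GPS points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # Clamp to 1.0 to guard against rounding slightly above the asin domain.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over one row of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def residual_cross_attention(
    x_i: List[List[float]],    # l_i rows, one per check-in of user u_i
    v_j: List[List[float]],    # l_j rows, value features of user u_j
    gamma: List[List[float]],  # l_i x l_j score matrix, supplied by the caller
) -> List[List[float]]:
    """X'_i = softmax(gamma) V_j + X_i, with the softmax taken row by row (assumption)."""
    enhanced = []
    for m, row in enumerate(gamma):
        weights = softmax(row)
        attended = [
            sum(weights[n] * v_j[n][d] for n in range(len(v_j)))
            for d in range(len(x_i[m]))
        ]
        enhanced.append([a + x for a, x in zip(attended, x_i[m])])
    return enhanced
```

For example, `build_trajectory` can be applied to each platform's raw check-ins before vectorization, and `haversine_km` can supply the pairwise distances that a spatial correlation term would consume.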
Description
User identity alignment method across social networks based on user space-time data
Technical Field
The invention relates to the field of online social networks, and in particular to a method for aligning user identities across social networks based on user spatiotemporal data.
Background
With the spread of mobile devices, online social networks have become increasingly popular, and social platforms of many kinds have emerged one after another, becoming an important part of people's daily lives. Because social platforms differ in function, people often join several platforms at the same time to meet different needs, such as Douban, Zhihu and Sina Weibo. Users leave various kinds of personalized information on different social platforms, such as personal details, pictures and check-in positions. If the information from these platforms is integrated, user profiles can be depicted more comprehensively and personalized recommendations can be made. User identification across social networks has therefore received widespread attention in recent years.
User identity alignment across social networks matches user identities from different platforms and links the accounts that belong to the same natural person, so that a user's social behavior information becomes more complete and research on social network user behavior is advanced. For this problem, some researchers have worked with a single type of user-generated data, such as user attributes, user-generated content or network structure. Because users may leave erroneous attribute information at registration, social network structures are sparse, and user-generated text is short, it is difficult for researchers to extract enough user features for alignment.
Geospatial information from user check-ins is ideal matching data, and existing methods predict the user alignment result end to end by learning representations of check-in positions and movement trajectories. However, these methods still leave problems to be solved. On the one hand, because the social platforms differ, a user's check-ins at the same position deviate somewhat between platforms; traditional position representation methods suffer from a semantic gap and cannot learn the latent semantic relations among positions. On the other hand, users have different check-in preferences on different platforms, so their check-in data across platforms is highly imbalanced, which greatly interferes with accurately characterizing their behavior patterns. In summary, in the prior art there is considerable room for improvement in aligning user identities across social networks using user check-in data.
Disclosure of Invention
The present invention has been made to solve the above problems, and aims to provide a method for aligning user identities across social networks based on user spatiotemporal data.
The invention provides a method for aligning user identities across social networks based on user spatiotemporal data, used for obtaining a user identity alignment result for a user A and a user B according to user check-in data A of a user u_i on a platform A and user check-in data B of a user u_j on a platform B, characterized in that the method comprises the following steps: S1, preprocessing the user check-in data A and the user check-in data B respectively to obtain a user trajectory T_A and a user trajectory T_B; S2, inputting the user trajectory T_A and the user trajectory T_B together into a user identity alignment prediction model to obtain the user identity alignment result, wherein the user identity alignment prediction model comprises a vectorization module, a trajectory cross-attention module, a Bi-LSTM module, a local attention module and an output module; the vectorization module is used for vectorizing the user trajectory T_A and the user trajectory T_B respectively to obtain a check-in position vector X_i and a check-in position vector X_j; the trajectory cross-attention module is used for enhancing the check-in position vector X_i and the check-in position vector X_j according to a trajectory cross-attention mechanism to obtain an enhanced check-in position vector X'_i and an enhanced check-in position vector X'_j; the Bi-LSTM module comprises a bidirectional long short-term memory neural network and is used for extracting features from the enhanced check-in position vector X'_i and the enhanced check-in position vector X'_j respectively to obtain a trajectory hidden feature h_i and a trajectory hidden feature h_j; and the local attention module is used, according to the trajectory hidden feature h_j, the enhanced check-in position vector X'_i and the trajectory hidden feature h_i, to obtain a final embedded representation vector s'_i ac