CN-122022922-A - Advertisement click through rate prediction method and system

Abstract

The invention relates to the field of data processing, and in particular to a method and a system for predicting advertisement click-through rate. The method comprises: generating a sparse feature vector from original feature data; inputting the sparse feature vector into a shared embedding layer to obtain a feature embedding matrix; weighting the feature embedding matrix by explicit branch weights and implicit branch weights to obtain an explicit input vector and an implicit input vector, respectively; extracting an explicit interaction feature vector and an initial implicit interaction feature vector; mapping the explicit interaction feature vector to an alignment feature vector; calculating the projection component of the initial implicit interaction feature vector in the direction of the alignment feature vector; obtaining an orthogonalized implicit interaction feature vector; and generating a prediction result of the advertisement click-through rate. The method can effectively improve the accuracy and generalization capability of the advertisement click-through rate prediction.

Inventors

  • Yin Juxian
  • Cao Xiaoxiao
  • Zhu Huogeng

Assignees

  • 钛动科技股份有限公司

Dates

Publication Date
2026-05-12
Application Date
2026-02-06

Claims (10)

  1. An advertisement click-through rate prediction method, characterized by comprising: obtaining original feature data of an advertisement to be predicted, and generating a sparse feature vector from the original feature data; inputting the sparse feature vector into a shared embedding layer to obtain a dense feature vector sequence containing a plurality of feature fields, and converting the dense feature vector sequence into a feature embedding matrix E; inputting the feature embedding matrix E into a field-level gating network to calculate an explicit branch weight and an implicit branch weight, and weighting the feature embedding matrix E by the explicit branch weight and the implicit branch weight to obtain an explicit input vector and an implicit input vector, respectively; performing feature interaction processing on the explicit input vector using an explicit interaction network to extract an explicit interaction feature vector; performing a nonlinear feature transformation on the implicit input vector using an implicit interaction network to extract an initial implicit interaction feature vector; mapping the explicit interaction feature vector to an alignment feature vector using an alignment layer, and calculating the projection component of the initial implicit interaction feature vector in the direction of the alignment feature vector; subtracting the projection component from the initial implicit interaction feature vector to obtain an orthogonalized implicit interaction feature vector, wherein the orthogonalized implicit interaction feature vector remains geometrically orthogonal to the alignment feature vector; and fusing the explicit interaction feature vector with the orthogonalized implicit interaction feature vector to generate a prediction result of the advertisement click-through rate.
  2. The advertisement click-through rate prediction method according to claim 1, wherein calculating the explicit branch weight and the implicit branch weight comprises: for each field of the feature embedding matrix E, calculating a weight assigned to the explicit branch and a weight assigned to the implicit branch by means of a multi-layer perceptron and a Softmax activation function, wherein the corresponding calculation expression is defined over the following quantities: the embedding vector of the i-th field in the feature embedding matrix E; the weight with which the embedding vector of the i-th field is assigned to the explicit branch; the weight with which the embedding vector of the i-th field is assigned to the implicit branch; the Softmax activation function; and the multi-layer perceptron.
  3. The advertisement click-through rate prediction method according to claim 2, wherein the calculation expressions of the explicit input vector and of the implicit input vector are defined over the following quantities: element-wise multiplication; the vector of weights assigned to the explicit branch for the respective fields of the feature embedding matrix E; and the vector of weights assigned to the implicit branch for the respective fields of the feature embedding matrix E.
  4. The advertisement click-through rate prediction method according to claim 2, wherein the network parameters of the explicit interaction network, the implicit interaction network, the field-level gating network, and the alignment layer are obtained by training with a joint loss function L, and the joint loss function L comprises a gating loss term, wherein, in the calculation formulas of the gating loss term: B is the total number of samples; F is the number of feature fields; the gating probability is the probability with which a given field of a given sample is assigned to the k-th interaction network; the average allocation proportion is that of the k-th interaction network over the current batch; two hyperparameters weight the terms; one component is the loss corresponding to gating entropy regularization; and the other component is the loss corresponding to gating balance regularization.
  5. The advertisement click-through rate prediction method according to claim 4, wherein the joint loss function L further comprises a cosine co-directional penalty term for constraining the degree of similarity between the initial implicit interaction feature vector and the alignment feature vector, the calculation expression of the penalty term being based on a cosine similarity calculation.
  6. The advertisement click-through rate prediction method according to claim 5, wherein the joint loss function L further comprises an intra-batch decorrelation loss term for reducing the statistical correlation between the initial implicit interaction feature vector and the alignment feature vector, wherein the calculation formula of the decorrelation loss term is defined over the following operations: a covariance matrix calculation; batch normalization BN(·); the squared Frobenius norm; and a stop-gradient operation.
  7. The advertisement click-through rate prediction method according to claim 1, wherein the calculation expression of the explicit interaction feature vector is defined over the following quantities: F, the number of feature fields; the vector of the i-th field of the explicit input vector; and element-wise multiplication.
  8. The advertisement click-through rate prediction method according to claim 1, wherein the calculation expressions of the alignment feature vector and of the orthogonalized implicit interaction feature vector are defined over the following quantities: A, a trainable weight matrix of the alignment layer; the stop-gradient operation; the dot product; the squared 2-norm of a vector; and a smoothing term that prevents division by zero.
  9. The advertisement click-through rate prediction method according to any one of claims 1 to 8, wherein fusing the explicit interaction feature vector with the orthogonalized implicit interaction feature vector to generate the prediction result of the advertisement click-through rate comprises: adding the alignment feature vector and the orthogonalized implicit interaction feature vector element by element to obtain a fused feature vector; and inputting the fused feature vector into a multi-layer perceptron of an output layer, which outputs the prediction result of the advertisement click-through rate after a nonlinear transformation.
  10. An advertisement click-through rate prediction system, comprising a processor and a memory, the memory storing computer program instructions, wherein the computer program instructions, when executed by the processor, implement the advertisement click-through rate prediction method of any one of claims 1 to 9.
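
The gating, projection-removal, and fusion steps recited in claims 1, 2, 8, and 9 can be illustrated with a minimal sketch. It is not the patented implementation: the formula symbols did not survive in this text, so every name below (`gate_fields`, `gate_logits`, `orthogonalize`, `fuse`, `eps`) is an assumption chosen for illustration. The gating multi-layer perceptron is replaced by a hypothetical callable that returns two logits per field, the explicit and implicit interaction networks are not modeled at all, and the stop-gradient operation mentioned in claim 8 is omitted because it only matters during training. The projection is assumed to take the usual form: the dot product of the implicit vector with the alignment vector, divided by the squared 2-norm of the alignment vector plus a smoothing term, times the alignment vector.

```python
import math

def softmax(z):
    """Numerically stable softmax over a short list of logits."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [v / total for v in exps]

def gate_fields(E, gate_logits):
    """Split each field embedding between the explicit and implicit branches.

    E           : list of F field embeddings, each a list of d floats.
    gate_logits : hypothetical stand-in for the gating MLP; maps one field
                  embedding to two logits (explicit, implicit).
    """
    x_exp, x_imp = [], []
    for e in E:
        w_exp, w_imp = softmax(gate_logits(e))  # the two weights sum to 1
        x_exp.append([w_exp * v for v in e])
        x_imp.append([w_imp * v for v in e])
    return x_exp, x_imp

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def orthogonalize(h_imp, a, eps=1e-8):
    """Subtract from h_imp its projection onto the alignment vector a."""
    coef = dot(h_imp, a) / (dot(a, a) + eps)  # eps guards against a == 0
    return [h - coef * ai for h, ai in zip(h_imp, a)]

def fuse(a, h_perp):
    """Element-wise addition of the alignment and orthogonalized vectors."""
    return [x + y for x, y in zip(a, h_perp)]

# Toy usage with made-up numbers (not taken from the patent).
E = [[1.0, 2.0], [0.5, -1.0]]
x_exp, x_imp = gate_fields(E, lambda e: [sum(e), 0.0])

a = [1.0, 0.0]        # assumed output of the alignment layer
h_imp = [3.0, 4.0]    # assumed output of the implicit interaction network
h_perp = orthogonalize(h_imp, a)
fused = fuse(a, h_perp)
```

In this toy case the component of `h_imp` along `a` is removed, so `h_perp` is numerically orthogonal to `a`, and the fused vector would then be passed to the output-layer perceptron described in claim 9.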

Description

Advertisement click through rate prediction method and system

Technical Field

The present invention relates to the field of data processing, and more particularly to a method and a system for predicting advertisement click-through rate.

Background

Click-through rate (CTR) prediction is a core component of Internet advertisement delivery and recommendation systems; the accuracy of the prediction directly determines the advertisement ranking strategy, the billing cost, and the final user experience. In a production pipeline, a CTR prediction task must compute, in real time, the probability that a user clicks an advertisement from input features such as user attributes, advertisement content, and context.

Currently, prediction models based on a "dual-tower" architecture are widely used for CTR prediction. Such models typically contain two parallel processing branches: one captures low-order explicit feature interactions (e.g., an FM branch), and the other mines high-order implicit nonlinear patterns (e.g., a DNN branch). During prediction, the system feeds the features into both branches, computes the corresponding feature representations separately, and fuses them to generate the final click probability.

However, when an existing dual-tower model is used to predict advertisement click-through rate, significant technical defects remain in data processing and feature interaction calculation, and these limit further improvement of prediction accuracy:

First, in the input processing stage of the prediction task, the prior art generally feeds all original features indiscriminately into both the explicit and the implicit branch.
Because this full-input strategy is untargeted, both branches receive, from the earliest stage of the prediction computation, a large amount of redundant information that acts as noise for their own processing logic (for example, the explicit branch is forced to process high-dimensional dense features that are unsuitable for second-order crossing). This not only adds wasted computation; the redundant noise also interferes with the expression of useful features, so that the computed click-through rate deviates from the true value.

Second, in the prediction-signal generation stage, the implicit branch lacks any runtime complementarity constraint and therefore tends to recompute low-order interaction patterns that the explicit branch has already captured. The output signals of the two branches that enter the fusion step thus overlap heavily in information content. When the final prediction is generated, this homogeneity causes the system to overfit low-order patterns, so the high-order complementary information that the implicit branch should supply to correct prediction bias is not used effectively, and accuracy in complex scenarios drops.

Finally, in the result fusion stage, existing prediction processes typically just add or concatenate the outputs of the two branches. This fusion ignores the directional differences between feature vectors in geometric space: redundant components are not removed when the prediction is generated, the final predicted click-through rate carries insufficient information gain, and it is difficult to produce discriminative rankings in long-tail or sparse-data scenarios.

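
The overlap between branch outputs described above can be made concrete with a small sketch. The vectors are made-up toy values, not data from the patent, and the helper name `cosine` is chosen here for illustration only. It shows that when the implicit output points almost in the same direction as the explicit output, plain addition mostly duplicates the shared component.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)

# Toy branch outputs (assumed values): the implicit branch has mostly
# relearned the direction already captured by the explicit branch.
h_explicit = [1.0, 0.0]
h_implicit = [0.9, 0.1]

overlap = cosine(h_explicit, h_implicit)

# Naive fusion by addition: the first coordinate is nearly doubled, while
# the genuinely new information in the second coordinate stays small.
naive_sum = [a + b for a, b in zip(h_explicit, h_implicit)]
```

A similarity close to 1 indicates that the two branches contribute nearly the same direction, which is the redundancy that a complementarity constraint is meant to remove.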
In summary, when a dual-tower CTR model is used to predict advertisement click-through rate, two main technical problems arise: input feature allocation is not adapted to each branch, which introduces noise into the prediction; and the prediction-signal generation process lacks a complementarity constraint, which makes the branch outputs homogeneous. As a result, the final click-through rate prediction is insufficiently accurate and generalizes poorly.

Disclosure of Invention

To solve the technical problems of insufficient accuracy and poor generalization capability of the prediction result when a dual-tower CTR model is used to predict advertisement click-through rate, the invention provides the following aspects.

In a first aspect, the invention provides an advertisement click-through rate prediction method, comprising: obtaining original feature data of an advertisement to be predicted, and generating a sparse feature vector from the original feature data; inputting the sparse feature vector into a shared embedding layer to obtain a dense feature vector sequence containing a plurality of feature fields, and converting the dense feature vector sequence into a feature embedding matrix E; Input