CN-121997976-A - Distribution generalization method and server of dynamic graph neural network model
Abstract
A distribution generalization method and a server for a dynamic graph neural network model relate to the technical field of electric digital data processing and are used to improve the distribution generalization capability of the dynamic graph neural network model. In the method, a server obtains a time-series feature matrix from a dynamic graph snapshot sequence; performs a temporal average pooling operation on the time-series feature matrix at gradually increasing preset time steps to obtain multi-scale feature representations; converts the multi-scale feature representations into spectral-domain features through Fourier transformation; processes the spectral-domain features with a multi-layer perceptron of the preset scale to obtain a soft mask matrix; computes invariant features and variant features from the spectral-domain features and the soft mask matrix through a preset function group; takes the cluster centers of the variant features, obtained through cluster analysis, as environment labels; constructs an optimization objective function based on the invariant features, the environment labels, a preset auxiliary loss function and a preset task loss function; and updates the network parameters of the dynamic graph neural network model according to that function.
Inventors
- Li Changsheng
- Zuo Zhenmeng
- Li Boyang
- Yuan Ye
- Wang Guoren
- Li Tao
- Shi Hongshan
- Li Xuyang
- Bao Yi
Assignees
- Beijing Institute of Technology (北京理工大学)
- Aerospace Wanyuan Cloud Data Hebei Co., Ltd. (航天万源云数据河北有限公司)
- Tangshan Research Institute of Beijing Institute of Technology (北京理工大学唐山研究院)
Dates
- Publication Date: 2026-05-08
- Application Date: 2025-12-19
Claims (10)
- 1. A distribution generalization method for a dynamic graph neural network model, applied to a server, the method comprising: the server acquires a time-series feature matrix from a dynamic graph snapshot sequence; the server performs a temporal average pooling operation on the time-series feature matrix at gradually increasing preset time steps to obtain multi-scale feature representations at different time scales; the server converts the multi-scale feature representations into spectral-domain features at different time scales through Fourier transformation; the server processes the spectral-domain features with a multi-layer perceptron of the preset scale to obtain a soft mask matrix; the server computes invariant features and variant features at different time scales from the spectral-domain features and the soft mask matrix through a preset function group; the server takes the cluster centers of the variant features, obtained through cluster analysis, as environment labels at different time scales; the server constructs an optimization objective function based on the invariant features, the environment labels, a preset auxiliary loss function and a preset task loss function, wherein the preset auxiliary loss function is used, according to an environment label, to eliminate residual environmental influence in the invariant features at the same time scale as that environment label, and the preset task loss function measures the difference between the prediction made by the dynamic graph neural network model based on the invariant features and the real labels, a smaller value indicating a more accurate prediction; and the server updates the network parameters of the dynamic graph neural network model based on the optimization objective function.
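For concreteness, here is a minimal sketch of the front end of claim 1, the multi-scale temporal average pooling followed by a Fourier transform, assuming a node feature tensor of shape (nodes, time steps, dimensions) produced by a GNN backbone; the scale set and all names are illustrative, not the patent's.

```python
# A sketch of claim 1's pooling and spectral steps; shapes and the scale set
# (1, 2, 4) are assumptions, not fixed by the patent.
import torch
import torch.nn.functional as F

def multi_scale_spectral_features(X, scales=(1, 2, 4)):
    """Temporal average pooling at gradually increasing step sizes, then an
    FFT along the time axis to obtain spectral-domain features per scale."""
    spectral = []
    for s in scales:
        # non-overlapping temporal average pooling with window/stride s
        pooled = F.avg_pool1d(X.transpose(1, 2), kernel_size=s, stride=s)
        pooled = pooled.transpose(1, 2)                # back to (N, T//s, d)
        spectral.append(torch.fft.fft(pooled, dim=1))  # complex (N, T//s, d)
    return spectral

X = torch.randn(100, 16, 32)   # 100 nodes, 16 snapshots, 32-dim features
feats = multi_scale_spectral_features(X)
```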
- 2. The method according to claim 1, wherein the step in which the server computes the invariant features and the variant features at different time scales from the spectral-domain features and the soft mask matrix through a preset function group specifically comprises: the preset function group is $Z_{inv}^{(s)} = \mathcal{F}^{-1}\big(H^{(s)} \odot M^{(s)}\big)$ and $Z_{var}^{(s)} = \mathcal{F}^{-1}\big(H^{(s)} \odot (1 - M^{(s)})\big)$, where $Z_{inv}^{(s)}$ is the invariant feature at the $s$-th time scale; $Z_{var}^{(s)}$ is the variant feature at the $s$-th time scale; $\mathcal{F}^{-1}$ is the inverse Fourier transform; $H^{(s)}$ is the spectral-domain feature; $M^{(s)}$ is the soft mask matrix; and $\odot$ denotes the element-wise product.
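A minimal sketch of the function group in claim 2, assuming a complex spectral tensor H and a real soft mask M in [0, 1] per scale; the symbol names mirror the reconstruction above but are illustrative.

```python
# Decoupling per claim 2: the invariant part keeps the masked spectrum, the
# variant part keeps the complement; both return to the time domain via ifft.
import torch

def decouple(H, M):
    """H: complex spectral features (N, F, d); M: real soft mask in [0, 1]."""
    z_inv = torch.fft.ifft(H * M, dim=1).real          # invariant features
    z_var = torch.fft.ifft(H * (1.0 - M), dim=1).real  # variant features
    return z_inv, z_var

H = torch.fft.fft(torch.randn(100, 16, 32), dim=1)
M = torch.sigmoid(torch.randn(100, 16, 32))  # e.g. output of the scale-wise MLP
z_inv, z_var = decouple(H, M)
```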
- 3. The method according to claim 1, wherein the step in which the server constructs an optimization objective function based on the invariant features, the environment labels, a preset auxiliary loss function and a preset task loss function specifically comprises: the preset auxiliary loss function is $\mathcal{L}_{aux} = \frac{1}{NS}\sum_{s=1}^{S}\sum_{i=1}^{N}\big|\mathrm{sim}\big(z_{inv,i}^{(s)}, e^{(s)}\big)\big|$, where $\mathcal{L}_{aux}$ is the preset auxiliary loss function, used to make the invariant features and the environment labels maximally uncorrelated so as to eliminate residual environment-related information in the invariant features; $N$ is the total number of nodes in the dynamic graph; $S$ is the total number of time scales; $z_{inv,i}^{(s)}$ is the invariant feature of the $i$-th node at the $s$-th time scale; $e^{(s)}$ is the environment label at the $s$-th time scale; and $\mathrm{sim}(\cdot,\cdot)$ is the cosine similarity of $z_{inv,i}^{(s)}$ and $e^{(s)}$.
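A minimal sketch of the auxiliary loss in claim 3 under the reconstruction above, penalizing the absolute cosine similarity between each node's invariant feature and the scale's environment label; the exact averaging is an assumption.

```python
# Auxiliary loss per claim 3: drive invariant features and environment labels
# toward maximal irrelevance (near-zero cosine similarity).
import torch
import torch.nn.functional as F

def aux_loss(z_inv_per_scale, env_per_scale):
    """z_inv_per_scale: list of (N, d) tensors, one per time scale;
    env_per_scale: list of (d,) environment labels, one per time scale."""
    total, count = 0.0, 0
    for z, e in zip(z_inv_per_scale, env_per_scale):
        cos = F.cosine_similarity(z, e.unsqueeze(0).expand_as(z), dim=1)
        total = total + cos.abs().sum()   # |cos| near 0 means uncorrelated
        count += z.shape[0]
    return total / count                  # averaged over N nodes and S scales
```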
- 4. The method of claim 1, wherein, prior to the step in which the server constructs an optimization objective function based on the invariant features, the environment labels, a preset auxiliary loss function and a preset task loss function, the method further comprises: the server constructs a mutual information maximization loss function from the variant features and the environment labels; the mutual information maximization loss function is $\mathcal{L}_{MI} = -\frac{1}{NS}\sum_{s=1}^{S}\sum_{i=1}^{N} T\big(z_{var,i}^{(s)}, e_i^{(s)}\big)$, where $\mathcal{L}_{MI}$ is the mutual information maximization loss function, used to maximize the correlation between the variant features at different scales and the environment labels; $N$ is the total number of nodes in the dynamic graph; $S$ is the total number of time scales; $z_{var,i}^{(s)}$ is the variant feature of the $i$-th node at the $s$-th time scale; $e_i^{(s)}$ is the environment label of the $i$-th node at the $s$-th time scale; and $T(\cdot,\cdot)$ estimates the mutual information score between $z_{var,i}^{(s)}$ and $e_i^{(s)}$.
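The patent does not fix the mutual information estimator, so the sketch below uses a small bilinear critic as the score function T, a common MINE/InfoNCE-style choice; the critic architecture is an assumption.

```python
# Mutual information term per claim 4: minimizing the negated critic score
# maximizes the dependence of variant features on the environment labels.
import torch
import torch.nn as nn

class MICritic(nn.Module):
    def __init__(self, d):
        super().__init__()
        self.W = nn.Parameter(torch.randn(d, d) * 0.01)

    def forward(self, z_var, env):
        # bilinear score T(z, e) per node; higher means more dependent
        return (z_var @ self.W * env).sum(dim=1)

def mi_loss(critic, z_var_per_scale, env_per_scale):
    """Each list entry: (N, d) variant features and (N, d) per-node labels."""
    scores = [critic(z, e).mean() for z, e in zip(z_var_per_scale, env_per_scale)]
    return -torch.stack(scores).mean()
```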
- 5. The method of claim 4, wherein, after the step in which the server computes the invariant features and the variant features at different time scales from the spectral-domain features and the soft mask matrix through a preset function group, the method further comprises: the server estimates a contribution value for the invariant features at each time scale based on a preset lightweight neural network model; and the server takes the normalized contribution values as weights and performs a weighted summation of the invariant features over the time scales to obtain fused invariant features.
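A minimal sketch of the fusion step in claim 5, assuming the lightweight model is a two-layer scorer and that normalization is a softmax over scales; both choices are assumptions.

```python
# Fusion per claim 5: score each scale's invariant features, softmax-normalize
# the contribution values, and take the weighted sum over scales.
import torch
import torch.nn as nn

class ScaleFusion(nn.Module):
    def __init__(self, d):
        super().__init__()
        self.scorer = nn.Sequential(nn.Linear(d, d // 2), nn.ReLU(),
                                    nn.Linear(d // 2, 1))

    def forward(self, z_inv_per_scale):          # list of S tensors, each (N, d)
        Z = torch.stack(z_inv_per_scale, dim=1)  # (N, S, d)
        w = torch.softmax(self.scorer(Z), dim=1) # (N, S, 1) normalized weights
        return (w * Z).sum(dim=1)                # fused invariant features (N, d)
```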
- 6. The method according to claim 5, wherein the step in which the server constructs an optimization objective function based on the invariant features, the environment labels, a preset auxiliary loss function and a preset task loss function specifically comprises: the server performs a weighted summation of the preset auxiliary loss function, the mutual information maximization loss function and a preset multi-scale intervention risk loss function based on preset hyperparameters, and then sums the result with the preset task loss function to obtain the optimization objective function; the preset multi-scale intervention risk loss function is $\mathcal{L}_{risk} = \mathrm{Var}\big(\mathcal{L}_{do}\big)$ with $\mathcal{L}_{do} = \ell(\hat{y}, y)$ and $\hat{y} = f\big(\tilde{z}_{inv} \odot \mathrm{norm}(\delta_{do})\big)$, where $\mathcal{L}_{do}$ is the multi-scale intervention loss function, which supervises the dynamic graph neural network model to make its prediction depend only on the fused invariant features by simulating scenes of fluctuation generated by a specific variant feature; $\hat{y}$ is the prediction made by the invariant feature classifier in the dynamic graph neural network model based on the fused invariant features $\tilde{z}_{inv}$; $\odot$ denotes the element-wise product; $z_{var}^{do}$ is the specific variant feature sampled from the variant features; $\delta_{do}$ is the potential interference value of the specific variant feature $z_{var}^{do}$; $\mathrm{norm}(\cdot)$ normalizes the potential interference value; and $\mathcal{L}_{risk}$ is the preset multi-scale intervention risk loss function, which measures the stability of the multi-scale intervention loss function.
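One plausible reading of claim 6's risk term is sketched below: each simulated intervention perturbs the fused invariant features element-wise with a normalized sample of the variant features, and stability is taken as the variance of the task losses across interventions; the perturbation form, the softmax normalization, and the variance choice are all assumptions.

```python
# Intervention risk per claim 6 (one reading): low mean and low variance of
# the loss under simulated variant-feature interventions indicate that the
# classifier depends only on the fused invariant features.
import torch
import torch.nn.functional as F

def intervention_risk(classifier, z_fused, z_var_pool, labels, k=4):
    losses = []
    for _ in range(k):
        idx = torch.randint(0, z_var_pool.shape[0], (z_fused.shape[0],))
        noise = torch.softmax(z_var_pool[idx], dim=1)  # normalized interference
        logits = classifier(z_fused * noise)           # element-wise intervention
        losses.append(F.cross_entropy(logits, labels))
    losses = torch.stack(losses)
    return losses.mean() + losses.var()

def total_objective(task, aux, mi, risk, lambdas=(1.0, 1.0, 1.0)):
    # claim 6: weighted sum of the three auxiliary terms plus the task loss
    l1, l2, l3 = lambdas
    return task + l1 * aux + l2 * mi + l3 * risk
```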
- 7. The method according to claim 6, wherein the step in which the server updates the network parameters of the dynamic graph neural network model based on the optimization objective function comprises: the server uses gradient descent to iteratively update, based on the optimization objective function, the parameters of the graph neural network backbone, the multi-layer perceptron of the preset scale, the preset lightweight neural network model and the invariant feature classifier in the dynamic graph neural network model; the server computes the value of the current optimization objective function after each iteration; when the value stays within a preset minimum-value interval for a preset number of consecutive iterations, the server determines that the optimization objective function has converged; and after the optimization objective function has converged, the server stops updating the parameters and saves the latest parameters.
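A minimal sketch of the update loop in claim 7, assuming plain SGD and a tolerance-based convergence test; the patience and tolerance names are illustrative.

```python
# Training per claim 7: gradient descent over all trainable modules, stopping
# once the objective stays within a small interval for `patience` consecutive
# iterations, then keeping the latest parameters.
import torch

def train(modules, compute_objective, lr=1e-3, patience=10, tol=1e-4,
          max_iter=10_000):
    params = [p for m in modules for p in m.parameters()]
    opt = torch.optim.SGD(params, lr=lr)
    prev, stable = None, 0
    for _ in range(max_iter):
        opt.zero_grad()
        loss = compute_objective()   # evaluates the full optimization objective
        loss.backward()
        opt.step()
        val = loss.item()
        stable = stable + 1 if prev is not None and abs(val - prev) < tol else 0
        prev = val
        if stable >= patience:       # converged: stop updating parameters
            break
    return modules
```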
- 8. A server comprising one or more processors and a memory coupled to the one or more processors, wherein the memory stores computer program code comprising computer instructions which, when invoked by the one or more processors, cause the server to perform the method of any one of claims 1-7.
- 9. A computer program product comprising instructions which, when run on a server, cause the server to perform the method of any of claims 1-7.
- 10. A computer readable storage medium comprising instructions which, when run on a server, cause the server to perform the method of any of claims 1-7.
Description
Distribution generalization method and server of dynamic graph neural network model

Technical Field
The application relates to the technical field of electric digital data processing, and in particular to a distribution generalization method and a server for a dynamic graph neural network model.

Background
The dynamic graph neural network (DyGNN) is an important technology for processing time-series graph data. It aims to model the temporal evolution of the structure and node attributes of a dynamic graph and plays a key role in fields such as social network analysis and financial transaction monitoring. Related-art methods mainly follow two technical paths to learn the evolution rules of a dynamic graph. The first uses a graph neural network to capture the spatial structure information of each time slice and then models the temporal dependencies with a sequence model such as a recurrent neural network or a long short-term memory network. The second treats the changes of the graph as a continuous stream of events, modeled in the continuous time domain by a time-encoding function or a self-attention mechanism. These methods identify and predict the evolution rules of the dynamic graph structure by learning spatio-temporal feature patterns in the dynamic graph. However, the related art mainly rests on the assumption of independent and identical distribution, that is, it assumes that the training data and the test data are independently sampled from the same distribution. In practical applications, the test data often come from diverse sources and their distribution differs considerably from that of the training data set, so the distribution generalization capability of a dynamic graph neural network model built by the related art is low.

Disclosure of Invention
The application provides a distribution generalization method and a server for a dynamic graph neural network model, which are used to improve the distribution generalization capability of the dynamic graph neural network model.
In the method, the server obtains a time-series feature matrix from a dynamic graph snapshot sequence; the server performs a temporal average pooling operation on the time-series feature matrix at gradually increasing preset time steps to obtain multi-scale feature representations at different time scales; the server converts the multi-scale feature representations into spectral-domain features at different time scales through Fourier transformation; the server processes the spectral-domain features with a multi-layer perceptron of the preset scale to obtain a soft mask matrix; the server computes invariant features and variant features at different time scales from the spectral-domain features and the soft mask matrix through a preset function group; the server takes the cluster centers of the variant features, obtained through cluster analysis, as environment labels at different time scales; the server constructs an optimization objective function based on the invariant features, the environment labels, a preset auxiliary loss function and a preset task loss function, wherein the preset auxiliary loss function is used to eliminate residual environmental influence in the invariant features at the same time scale as the environment label, and the preset task loss function is used to measure the difference between the prediction made by the dynamic graph neural network model based on the invariant features and the real labels; and the server updates the network parameters of the dynamic graph neural network model based on the optimization objective function.

By adopting this technical scheme, the server models the time-series characteristics of the dynamic graph at different time scales, decouples the features in the frequency domain using the Fourier transform, and separates the invariant features that are independent of the environment from the variant features that are related to the environment. Further, potential environment labels are inferred in an unsupervised manner by clustering the variant features, and an auxiliary loss function is constructed with these environment labels to guide model learning, so that the residual environmental influence in the invariant features is reduced, the model can concentrate on learning the stable rules in the evolution of the dynamic graph, and the generalization capability of the dynamic graph neural network model is improved when it faces unknown scenes that differ from the training data distribution (that is, out-of-distribution scenes).

With reference to some embodiments of the first aspect, in some embodiments, the step of calculating, by the se
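The unsupervised environment-label step described above can be pictured with a short sketch, assuming k-means as the cluster analysis and a fixed cluster count; both the algorithm choice and the scikit-learn usage are assumptions, since the claims only require cluster analysis with cluster centers used as labels.

```python
# Environment labels via clustering: per time scale, cluster the variant
# features and assign each node its cluster center as the environment label.
import numpy as np
from sklearn.cluster import KMeans

def environment_labels(z_var, k=3):
    """z_var: (N, d) variant features at one time scale."""
    km = KMeans(n_clusters=k, n_init=10).fit(z_var)
    return km.cluster_centers_[km.labels_]   # (N, d) per-node environment label

labels = environment_labels(np.random.randn(100, 32))
```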