
CN-121981160-A - Gradient-driven sensitivity attribute-free graph learning fairness optimization method and device

CN 121981160 A

Abstract

The invention discloses a gradient-driven sensitivity attribute-free graph learning fairness optimization method and device, which solve the technical problem that existing graph learning fairness optimization methods leave the fairness of graph neural networks poor. The method acquires node data and preprocesses it to generate a data set for model training; constructs a graph neural network model and determines an initial model through random initialization; pre-trains the initial model in stages with the training data set to obtain a pre-trained model; classifies the training-set samples with the pre-trained model to output misclassified and correctly classified samples; fine-tunes the pre-trained model with the training data set and the two kinds of samples to output fine-tuned optimal parameters; and loads the fine-tuned optimal parameters into the graph neural network model to obtain the final model.

Inventors

  • CHEN LIANG
  • XIA SIMIN
  • ZHU YUCHANG
  • ZHENG ZIBIN

Assignees

  • Sun Yat-sen University (中山大学)

Dates

Publication Date
2026-05-05
Application Date
2026-01-23

Claims (10)

  1. A gradient-driven sensitivity attribute-free graph learning fairness optimization method, characterized by comprising the following steps: acquiring node data and preprocessing the node data to generate a data set for model training; constructing a graph neural network model and randomly initializing it with an initialization method to determine an initial graph neural network model; performing staged pre-training on the initial graph neural network model with the data set for model training to obtain a pre-trained graph neural network model; classifying a plurality of training samples in the training set of the data set for model training based on the pre-trained graph neural network model, and outputting a plurality of misclassified samples and a plurality of correctly classified samples; performing fine-tuning training on the pre-trained graph neural network model with the data set for model training, the plurality of misclassified samples and the plurality of correctly classified samples, and outputting fine-tuned optimal model parameters; and loading the fine-tuned optimal model parameters into the graph neural network model to obtain a final graph neural network model.
  2. The gradient-driven sensitivity attribute-free graph learning fairness optimization method of claim 1, wherein preprocessing the node data to generate a data set for model training comprises: cleaning and format-normalizing the node data to obtain normalized graph input data; and partitioning the normalized graph input data to obtain the data set for model training.
  3. The gradient-driven sensitivity attribute-free graph learning fairness optimization method of claim 1, wherein performing staged pre-training on the initial graph neural network model with the data set for model training to obtain a pre-trained graph neural network model comprises: taking training samples in the training set of the data set for model training as input to the initial graph neural network model to generate a model output result; calculating a loss value from the model output result and updating the model parameters of the initial graph neural network model by back-propagating the loss value until a preset number of training rounds is reached, so as to obtain an intermediate graph neural network model; forward-predicting the plurality of training samples in the training set and outputting a plurality of class prediction probabilities for each training sample; selecting the maximum class prediction probability among the plurality of class prediction probabilities of each training sample as the sample confidence of that training sample; sorting the sample confidences in descending order and selecting the training samples corresponding to a preset number of top-ranked sample confidences to form a high-confidence subset; training the intermediate graph neural network model with the high-confidence subset to obtain a target graph neural network model; loading the model parameters of the target graph neural network model and performing forward inference on the validation set of the data set for model training to obtain a validation loss; determining pre-training optimal parameters based on the validation loss; and loading the pre-training optimal parameters into the initial graph neural network model to obtain the pre-trained graph neural network model (an illustrative sketch of this staged pre-training follows the claims).
  4. The gradient-driven sensitivity attribute-free graph learning fairness optimization method of claim 1, wherein classifying a plurality of training samples in the training set of the data set for model training based on the pre-trained graph neural network model and outputting a plurality of misclassified samples and a plurality of correctly classified samples comprises: inputting each training sample in the training set into the pre-trained graph neural network model to perform forward inference and outputting a prediction result for each training sample; and comparing the prediction result of each training sample with its true label, identifying as misclassified samples those whose prediction results are inconsistent with the true labels and as correctly classified samples those whose prediction results are consistent with the true labels.
  5. The gradient-driven sensitivity attribute-free graph learning fairness optimization method of claim 1, wherein performing fine-tuning training on the pre-trained graph neural network model with the data set for model training, the plurality of misclassified samples and the plurality of correctly classified samples, and outputting fine-tuned optimal model parameters comprises: calculating the gradient norm of each misclassified sample and normalizing the gradient norms to obtain a normalized gradient value for each misclassified sample; setting a fixed weight for each misclassified sample and each correctly classified sample; amplifying the fixed weight of each misclassified sample based on its normalized gradient value to obtain an amplified weight for each misclassified sample; constructing a weighted loss function from the fixed weights of the correctly classified samples and the amplified weights of the misclassified samples; taking the pre-training optimal parameters of the pre-trained graph neural network model as initial parameters and performing multiple rounds of weighted fine-tuning training on the pre-trained graph neural network model with the weighted loss function, using the training samples in the training set of the data set for model training, to obtain a plurality of sets of fine-tuned model parameters; loading each set of fine-tuned model parameters in turn, performing forward inference on the validation set of the data set for model training, and calculating the validation loss corresponding to each set of fine-tuned model parameters; and selecting the fine-tuned model parameters corresponding to the minimum validation loss as the fine-tuned optimal parameters (an illustrative sketch of this gradient-weighted fine-tuning follows the claims).
  6. The gradient-driven sensitivity attribute-free graph learning fairness optimization method of claim 1, further comprising: inputting each test sample in the test set of the data set for model training into the final graph neural network model to perform forward inference and outputting a final prediction result for each test sample; calculating a task performance index and a fairness index of the final graph neural network model based on the final prediction result of each test sample; and generating a model evaluation report from the task performance index and the fairness index (an illustrative sketch of the fairness-index computation follows the claims).
  7. A gradient-driven sensitivity attribute-free graph learning fairness optimization device, comprising: an acquisition module for acquiring node data, preprocessing the node data and generating a data set for model training; a construction module for constructing a graph neural network model, randomly initializing it with an initialization method and determining an initial graph neural network model; a pre-training module for performing staged pre-training on the initial graph neural network model with the data set for model training to obtain a pre-trained graph neural network model; a classification module for classifying a plurality of training samples in the training set of the data set for model training based on the pre-trained graph neural network model and outputting a plurality of misclassified samples and a plurality of correctly classified samples; a fine-tuning training module for performing fine-tuning training on the pre-trained graph neural network model with the data set for model training, the plurality of misclassified samples and the plurality of correctly classified samples, and outputting fine-tuned optimal model parameters; and a final model output module for loading the fine-tuned optimal model parameters into the graph neural network model to obtain a final graph neural network model.
  8. An electronic device comprising a memory and a processor, wherein the memory stores a computer program that, when executed by the processor, causes the processor to perform the steps of the gradient-driven sensitivity attribute-free graph learning fairness optimization method of any one of claims 1 to 6.
  9. A computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed, implements the gradient-driven sensitivity attribute-free graph learning fairness optimization method of any one of claims 1 to 6.
  10. A computer program product comprising a computer program stored on a non-transitory computer-readable storage medium, the computer program comprising program instructions, wherein the program instructions, when executed by a computer, cause the computer to perform the steps of the gradient-driven sensitivity attribute-free graph learning fairness optimization method of any one of claims 1 to 6.
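
To make the staged pre-training of claim 3 concrete, the following is a minimal illustrative sketch in PyTorch rather than the patented implementation: the `model(features, edge_index)` call signature, the boolean split masks and all hyperparameters (`warmup_epochs`, `refine_epochs`, `top_k`) are assumptions made for the example.

```python
import copy
import torch
import torch.nn.functional as F

def staged_pretrain(model, features, edge_index, labels,
                    train_mask, val_mask, optimizer,
                    warmup_epochs=100, refine_epochs=50, top_k=500):
    """Staged pre-training sketch: full-set warm-up, then refinement on a
    high-confidence subset; the checkpoint with the lowest validation loss
    is kept as the pre-training optimal parameters."""
    # Stage 1: warm-up on the whole training set.
    for _ in range(warmup_epochs):
        model.train()
        optimizer.zero_grad()
        logits = model(features, edge_index)          # assumed GNN interface
        loss = F.cross_entropy(logits[train_mask], labels[train_mask])
        loss.backward()
        optimizer.step()

    # Select the high-confidence subset: maximum class probability per sample,
    # sorted in descending order, keeping the top_k training nodes.
    model.eval()
    with torch.no_grad():
        probs = F.softmax(model(features, edge_index), dim=-1)
    confidence = probs.max(dim=-1).values
    train_idx = train_mask.nonzero(as_tuple=True)[0]
    order = confidence[train_idx].argsort(descending=True)
    high_conf_idx = train_idx[order[:top_k]]

    # Stage 2: refine on the high-confidence subset, tracking validation loss.
    best_val, best_state = float("inf"), copy.deepcopy(model.state_dict())
    for _ in range(refine_epochs):
        model.train()
        optimizer.zero_grad()
        logits = model(features, edge_index)
        loss = F.cross_entropy(logits[high_conf_idx], labels[high_conf_idx])
        loss.backward()
        optimizer.step()

        model.eval()
        with torch.no_grad():
            val_loss = F.cross_entropy(model(features, edge_index)[val_mask],
                                       labels[val_mask]).item()
        if val_loss < best_val:
            best_val, best_state = val_loss, copy.deepcopy(model.state_dict())

    model.load_state_dict(best_state)   # pre-training optimal parameters
    return model
```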
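Similarly, the gradient-driven fine-tuning of claims 4 and 5 can be sketched as follows. The claims do not fix how the per-sample gradient norm is computed; this sketch uses the closed-form gradient of the cross-entropy loss with respect to each sample's logits as one plausible instantiation, and the weighting scheme, hyperparameters and model interface are illustrative assumptions.

```python
import copy
import torch
import torch.nn.functional as F

def gradient_weighted_finetune(model, features, edge_index, labels,
                               train_mask, val_mask, optimizer,
                               epochs=50, base_weight=1.0, amplify=2.0):
    """Fine-tuning sketch: misclassified training nodes get their fixed weight
    amplified in proportion to a normalized per-sample gradient norm; the
    checkpoint with the lowest validation loss is returned."""
    train_idx = train_mask.nonzero(as_tuple=True)[0]

    # Split training nodes into correctly and incorrectly classified ones
    # using the pre-trained model's predictions (claim 4).
    model.eval()
    with torch.no_grad():
        logits = model(features, edge_index)
        preds = logits.argmax(dim=-1)
    wrong = train_idx[preds[train_idx] != labels[train_idx]]

    # Per-sample gradient norm of the cross-entropy loss w.r.t. the logits:
    # for cross-entropy this is ||softmax(z) - one_hot(y)||, computable in
    # closed form (one plausible reading of the claimed gradient norm).
    with torch.no_grad():
        probs = F.softmax(logits[wrong], dim=-1)
        one_hot = F.one_hot(labels[wrong], probs.size(-1)).float()
        grad_norm = (probs - one_hot).norm(dim=-1)
        norm_grad = (grad_norm - grad_norm.min()) / (grad_norm.max() - grad_norm.min() + 1e-12)

    # Fixed weight for every training node, amplified for misclassified ones.
    weights = torch.full((labels.size(0),), base_weight, device=labels.device)
    weights[wrong] = base_weight * (1.0 + amplify * norm_grad)

    best_val, best_state = float("inf"), copy.deepcopy(model.state_dict())
    for _ in range(epochs):
        model.train()
        optimizer.zero_grad()
        out = model(features, edge_index)
        per_node = F.cross_entropy(out[train_idx], labels[train_idx], reduction="none")
        loss = (weights[train_idx] * per_node).mean()   # weighted loss function
        loss.backward()
        optimizer.step()

        model.eval()
        with torch.no_grad():
            val_loss = F.cross_entropy(model(features, edge_index)[val_mask],
                                       labels[val_mask]).item()
        if val_loss < best_val:
            best_val, best_state = val_loss, copy.deepcopy(model.state_dict())

    model.load_state_dict(best_state)   # fine-tuned optimal parameters
    return model
```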
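Claim 6 reports a task performance index and a fairness index without naming specific metrics. A common choice for node classification is accuracy together with the statistical parity and equal opportunity differences, as in the hypothetical sketch below; the binary `sensitive` group indicator is assumed to be available only at evaluation time.

```python
import torch

def evaluate(preds, labels, sensitive):
    """Evaluation sketch for claim 6: accuracy as the task performance index,
    statistical parity and equal opportunity differences as fairness indices."""
    acc = (preds == labels).float().mean().item()

    g0, g1 = sensitive == 0, sensitive == 1
    # Statistical parity difference: gap in positive prediction rates.
    sp = abs((preds[g0] == 1).float().mean() - (preds[g1] == 1).float().mean()).item()
    # Equal opportunity difference: gap in true positive rates.
    pos0, pos1 = g0 & (labels == 1), g1 & (labels == 1)
    eo = abs((preds[pos0] == 1).float().mean() - (preds[pos1] == 1).float().mean()).item()

    return {"accuracy": acc, "statistical_parity": sp, "equal_opportunity": eo}
```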

Description

Gradient-driven sensitivity attribute-free graph learning fairness optimization method and device

Technical Field

The invention relates to the technical field of artificial intelligence, and in particular to a gradient-driven sensitivity attribute-free graph learning fairness optimization method and device.

Background

Graph neural networks are deep learning models designed for graph-structured data. By performing message passing and neighbor feature aggregation on a graph, they can effectively learn representations of nodes, edges or whole graphs, and they are now widely applied to tasks such as node classification, link prediction and recommendation ranking. In real-world business systems, more and more key decisions (such as content recommendation, credit approval and risk early warning) are supported by graph neural network models, and their predictions directly affect the interests of different user groups. However, historical data inevitably contains various biases, such as systematic differences in connection patterns and behavioral characteristics among groups of different genders, regions and income levels; a graph neural network inherits or even amplifies these biases during training, so that its predictive performance differs markedly across groups. For example, the approval rate of the same loan product may be significantly lower for one group than for another, or, under identical conditions, users in one region may find it harder to obtain platform recommendations. To alleviate these problems, "graph fairness" research on graph neural networks has gradually emerged. Related work can be roughly divided into two technical routes: on the one hand, directly using the sensitive attribute labels of samples during training and, based on them, constructing loss constraints or representation learning modules that remove sensitive information, so as to reduce the dependence of the prediction results on the sensitive attributes; on the other hand, when the true sensitive attributes are difficult to obtain, first inferring the sensitive attributes in a latent space or constructing "pseudo sensitive attribute representations", and then performing fairness optimization such as adversarial training, counterfactual constraints or probability difference minimization. Existing graph learning fairness optimization methods generally predict sensitive attributes or construct pseudo sensitive attributes in a latent space by means of variational autoencoders, graph encoders and the like, and design fairness losses or adversarial constraints on that basis. The accuracy of the predicted or pseudo sensitive attributes is hard to guarantee, and the errors are continuously amplified in the subsequent fairness optimization, so the fairness of the graph neural network remains poor.

Disclosure of Invention

The invention provides a gradient-driven sensitivity attribute-free graph learning fairness optimization method and device, which solve the technical problem that the fairness of a graph neural network is poor under existing graph learning fairness optimization methods.
A first aspect of the invention provides a gradient-driven sensitivity attribute-free graph learning fairness optimization method comprising the following steps: acquiring node data and preprocessing the node data to generate a data set for model training; constructing a graph neural network model and randomly initializing it with an initialization method to determine an initial graph neural network model; performing staged pre-training on the initial graph neural network model with the data set for model training to obtain a pre-trained graph neural network model; classifying a plurality of training samples in the training set of the data set for model training based on the pre-trained graph neural network model, and outputting a plurality of misclassified samples and a plurality of correctly classified samples; performing fine-tuning training on the pre-trained graph neural network model with the data set for model training, the plurality of misclassified samples and the plurality of correctly classified samples, and outputting fine-tuned optimal model parameters; and loading the fine-tuned optimal model parameters into the graph neural network model to obtain a final graph neural network model.

Optionally, preprocessing the node data to generate a data set for model training includes: cleaning and format-normalizing the node data to obtain normalized graph input data; and partitioning the normalized graph input data to obtain the data set for model training.

Optionally, the step of performing staged pre-training on the initial graph neural network
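
Putting the steps described in this section together, a minimal end-to-end skeleton might look like the following. The two-layer `GCN`, the `data` object with its feature, label, mask and sensitive-attribute tensors, and the optimizer settings are assumptions made for illustration; `staged_pretrain`, `gradient_weighted_finetune` and `evaluate` refer to the sketches given after the claims.

```python
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv

class GCN(torch.nn.Module):
    """Two-layer GCN used only to make the skeleton concrete."""
    def __init__(self, in_dim, hidden_dim, out_dim):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden_dim)
        self.conv2 = GCNConv(hidden_dim, out_dim)

    def forward(self, x, edge_index):
        x = F.relu(self.conv1(x, edge_index))
        return self.conv2(x, edge_index)

def run_pipeline(data):
    # Random initialization of the graph neural network model.
    model = GCN(in_dim=data.features.size(-1), hidden_dim=64,
                out_dim=int(data.labels.max()) + 1)
    optimizer = torch.optim.Adam(model.parameters(), lr=0.01, weight_decay=5e-4)

    # Staged pre-training, then gradient-weighted fine-tuning
    # (see the sketches after the claims).
    model = staged_pretrain(model, data.features, data.edge_index, data.labels,
                            data.train_mask, data.val_mask, optimizer)
    model = gradient_weighted_finetune(model, data.features, data.edge_index,
                                       data.labels, data.train_mask,
                                       data.val_mask, optimizer)

    # Evaluation of the final model on the test set.
    model.eval()
    with torch.no_grad():
        preds = model(data.features, data.edge_index).argmax(dim=-1)
    return evaluate(preds[data.test_mask], data.labels[data.test_mask],
                    data.sensitive[data.test_mask])
```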