CN-121984691-A - Abnormal information message identification method, device, equipment, medium and program product
Abstract
The application provides a method, an apparatus, a device, a storage medium and a program product for identifying abnormal information messages, applicable to the technical fields of big data, information security and financial technology. The method comprises: fine-tuning message nodes of a pre-constructed knowledge graph to obtain embedded vectors, wherein the message nodes of the knowledge graph are constructed according to information and the user nodes of the knowledge graph are constructed according to user information; determining positive sample pairs and negative sample pairs of the message nodes according to the edges of the knowledge graph, wherein a positive sample pair is two nodes that have first-order edges to the same user node and a negative sample pair is two nodes that do not have first-order edges to the same user node; optimizing the embedded vectors of the message nodes according to the positive sample pairs and the negative sample pairs to obtain optimized embedded vectors; propagating labels to unlabeled message nodes according to the optimized embedded vectors and the labeled message nodes; and determining abnormal message nodes according to the labels.
Inventors
- REN JUNYU
- GAO YUYAO
- ZHU ZHENYI
- HUANG RONG
Assignees
- Industrial and Commercial Bank of China Limited (中国工商银行股份有限公司)
Dates
- Publication Date
- 20260505
- Application Date
- 20250724
Claims (12)
- 1. An abnormal information message identification method, comprising: fine-tuning message nodes of a pre-constructed knowledge graph to obtain embedded vectors, wherein the message nodes of the knowledge graph are constructed according to information and the user nodes of the knowledge graph are constructed according to user information; determining positive sample pairs and negative sample pairs of the message nodes in the knowledge graph according to the edges of the knowledge graph, wherein a positive sample pair is two nodes that have first-order edges to the same user node, and a negative sample pair is two nodes that do not have first-order edges to the same user node; optimizing the embedded vectors of the message nodes according to the positive sample pairs and the negative sample pairs to obtain optimized embedded vectors; propagating labels to unlabeled message nodes according to the optimized embedded vectors and the labeled message nodes; and determining abnormal message nodes according to the labels.
- 2. The abnormal information message identification method of claim 1, wherein the step of optimizing the embedded vectors of the message nodes according to the positive sample pairs and the negative sample pairs to obtain optimized embedded vectors is performed by a pre-trained graph neural network model, and pre-training the graph neural network model comprises: training a graph neural network using sample data formed from historical information and user information to obtain model parameters; and applying the model parameters to the graph neural network to obtain the graph neural network model.
- 3. The abnormal information message identification method of claim 1, wherein the step of optimizing the embedded vectors of the message nodes according to the positive sample pairs and the negative sample pairs to obtain optimized embedded vectors comprises: dividing the knowledge graph into subgraphs according to the positive sample pairs; and optimizing the embedded vectors of the message nodes in each subgraph according to the positive sample pairs and the negative sample pairs to obtain the optimized embedded vector of each message node in the subgraph.
- 4. The abnormal information message identification method of claim 3, wherein the step of optimizing the embedded vectors of the message nodes in each subgraph according to the positive sample pairs and the negative sample pairs to obtain the optimized embedded vector of each message node in the subgraph comprises: constructing a target optimization function according to the embedded vectors of the message nodes in the positive sample pairs and the negative sample pairs, wherein the objective of the target optimization function is to reduce the distance between the two nodes of each positive sample pair and to increase the distance between the two nodes of each negative sample pair; and taking the embedded vectors of the message nodes in the positive sample pairs and the negative sample pairs obtained by solving the target optimization function as the optimized embedded vectors of the message nodes.
- 5. The abnormal information message identification method of claim 3, wherein the step of propagating labels to unlabeled message nodes according to the optimized embedded vectors and the labeled message nodes comprises: classifying the labeled message nodes in each subgraph to obtain m label categories, wherein m is an integer greater than or equal to 1; calculating the center vector of the message nodes of each category; and propagating labels to the unlabeled message nodes according to the optimized embedded vectors of the unlabeled message nodes and the center vector of each category.
- 6. The abnormal information message identification method of claim 5, wherein the step of propagating labels to the unlabeled message nodes according to the optimized embedded vectors of the unlabeled message nodes and the center vector of each category comprises: calculating the distance between the optimized embedded vector of each unlabeled message node and the center vector of each category; and taking the label of the category whose center vector is at the smallest distance from the unlabeled message node as the label of that unlabeled message node.
- 7. The abnormal information message identification method of claim 1, wherein the edges of the knowledge graph represent friend and follow relationships between user nodes, publish and forward relationships between user nodes and message nodes, and reply and quote relationships between message nodes.
- 8. The abnormal information message identification method of claim 1, wherein, when the abnormal information message identification method is executed for the first time, pre-constructing the knowledge graph comprises establishing the knowledge graph according to the information; and when the abnormal information message identification method is not executed for the first time, pre-constructing the knowledge graph comprises updating the knowledge graph according to newly added information.
- 9. An abnormal information message identification apparatus, comprising: a fine-tuning module configured to fine-tune message nodes of a pre-constructed knowledge graph to obtain embedded vectors, wherein the message nodes of the knowledge graph are constructed according to information and the user nodes of the knowledge graph are constructed according to user information; a first determining module configured to determine positive sample pairs and negative sample pairs of the message nodes in the knowledge graph according to the edges of the knowledge graph, wherein a positive sample pair is two nodes that have first-order edges to the same user node, and a negative sample pair is two nodes that do not have first-order edges to the same user node; an optimizing module configured to optimize the embedded vectors of the message nodes according to the positive sample pairs and the negative sample pairs to obtain optimized embedded vectors; a propagation module configured to propagate labels to unlabeled message nodes according to the optimized embedded vectors and the labeled message nodes; and a second determining module configured to determine abnormal message nodes according to the labels.
- 10. An electronic device, comprising: one or more processors; and a memory for storing one or more computer programs, wherein the one or more processors execute the one or more computer programs to implement the steps of the method according to any one of claims 1 to 8.
- 11. A computer-readable storage medium storing a computer program or instructions which, when executed by a processor, implement the steps of the method according to any one of claims 1 to 8.
- 12. A computer program product comprising a computer program or instructions which, when executed by a processor, implement the steps of the method according to any one of claims 1 to 8.
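The label propagation of claims 5 and 6 (group labeled nodes by category, compute a center vector per category, assign each unlabeled node the label of the nearest center) can be illustrated with a minimal sketch. It is not the applicant's implementation: the function name `propagate_labels`, the use of the component-wise mean as the center vector, and the use of Euclidean distance are all assumptions, since the claims only refer to a "center vector" and a "distance". The sketch also treats the input as a single subgraph.

```python
import math
from collections import defaultdict


def propagate_labels(embeddings, labels):
    """Give each unlabeled node the label of its nearest category center.

    embeddings: dict mapping node id -> tuple of floats (optimized vectors)
    labels:     dict mapping node id -> label, for the labeled nodes only
    Hypothetical helper; mean centers and Euclidean distance are assumed.
    """
    # Group the embedded vectors of labeled nodes by label category.
    grouped = defaultdict(list)
    for node, label in labels.items():
        grouped[label].append(embeddings[node])

    # Center vector of a category: component-wise mean of its members.
    centers = {
        label: tuple(sum(column) / len(vectors) for column in zip(*vectors))
        for label, vectors in grouped.items()
    }

    # Each unlabeled node takes the label of the closest center.
    propagated = {}
    for node, vector in embeddings.items():
        if node in labels:
            continue
        propagated[node] = min(
            centers, key=lambda label: math.dist(vector, centers[label])
        )
    return propagated


# Toy data: three labeled message nodes and two unlabeled ones.
embeddings = {
    "m1": (0.0, 0.0),
    "m2": (0.2, 0.0),
    "m3": (5.0, 5.0),
    "m4": (4.5, 5.2),
    "m5": (0.1, 0.3),
}
labels = {"m1": "normal", "m2": "normal", "m3": "abnormal"}

result = propagate_labels(embeddings, labels)
abnormal_nodes = sorted(n for n, lab in result.items() if lab == "abnormal")
```

In this toy example `m4` lies close to the "abnormal" center and `m5` close to the "normal" center, so only `m4` ends up in `abnormal_nodes`. Under the claims the same procedure would be repeated for each subgraph.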
Description
Abnormal information message identification method, device, equipment, medium and program product
Technical Field
The application relates to the technical fields of big data, information security and financial technology, and in particular to an abnormal information message identification method, apparatus, device, medium and program product.
Background
With the rapid development of internet technology and social media, abnormal organizations spread harmful information on a large scale across network platforms, posing a serious threat to the environment in which users operate. Harmful information can also bypass keyword matching by means such as escaping and variant forms, which increases the hidden risk. In this scenario, the data available for analysis consists mainly of advertisements, messages, chat logs and the like. Existing approaches rely mainly on a single technique: either natural language processing to analyze the semantics of abnormal text, or graph-structure analysis to uncover potential abnormal organizations. Natural language processing captures semantic information but does not capture the underlying structure of abnormal organizations, while graph-structure analysis captures that structure but lacks the necessary natural language analysis and recognition. Neither approach on its own therefore recognizes abnormal information adequately.
Disclosure of Invention
In view of the foregoing, the present application provides an abnormal information message identification method, apparatus, device, medium and program product that recognize abnormal information thoroughly, are simple and efficient, and are highly automated.
According to a first aspect of the application, an abnormal information message identification method is provided. The method comprises: fine-tuning message nodes of a pre-constructed knowledge graph to obtain embedded vectors, wherein the message nodes of the knowledge graph are constructed according to information and the user nodes of the knowledge graph are constructed according to user information; determining positive sample pairs and negative sample pairs of the message nodes in the knowledge graph according to the edges of the knowledge graph, wherein a positive sample pair is two nodes that have first-order edges to the same user node and a negative sample pair is two nodes that do not have first-order edges to the same user node; optimizing the embedded vectors of the message nodes according to the positive sample pairs and the negative sample pairs to obtain optimized embedded vectors; propagating labels to unlabeled message nodes according to the optimized embedded vectors and the labeled message nodes; and determining abnormal message nodes according to the labels.
According to an embodiment of the application, the step of optimizing the embedded vectors of the message nodes according to the positive sample pairs and the negative sample pairs is performed by a pre-trained graph neural network model. Pre-training the graph neural network model comprises training a graph neural network using sample data formed from historical information and user information to obtain model parameters, and applying the model parameters to the graph neural network to obtain the graph neural network model.
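The pair-selection rule of the first aspect, and the kind of objective that such pairs feed, can be sketched as follows. This is only an illustration under stated assumptions: the names `build_sample_pairs`, `contrastive_loss` and `margin` are hypothetical, the margin-based contrastive form is one common choice and is not given in the application, and the sketch only evaluates the objective rather than training a graph neural network. It also enumerates every pair of message nodes, whereas a practical system would likely sample negative pairs.

```python
import math
from itertools import combinations


def build_sample_pairs(user_message_edges):
    """Split message-node pairs into positive and negative sample pairs.

    user_message_edges: iterable of (user_id, message_id) first-order edges.
    A pair is positive if both messages touch a common user node,
    otherwise negative. Hypothetical helper, not from the application.
    """
    # Map each message node to the user nodes it is directly linked to.
    users_of = {}
    for user, message in user_message_edges:
        users_of.setdefault(message, set()).add(user)

    positives, negatives = set(), set()
    for a, b in combinations(sorted(users_of), 2):
        if users_of[a] & users_of[b]:
            positives.add((a, b))
        else:
            negatives.add((a, b))
    return positives, negatives


def contrastive_loss(embeddings, positives, negatives, margin=1.0):
    """Assumed margin-based objective: pull positives, push negatives."""
    def dist(a, b):
        return math.dist(embeddings[a], embeddings[b])

    # Positive pairs are penalized by their squared distance.
    pull = sum(dist(a, b) ** 2 for a, b in positives)
    # Negative pairs are penalized only while closer than the margin.
    push = sum(max(0.0, margin - dist(a, b)) ** 2 for a, b in negatives)
    return pull + push


# Toy graph: user u1 is linked to m1 and m2, user u2 to m2 and m3.
edges = [("u1", "m1"), ("u1", "m2"), ("u2", "m3"), ("u2", "m2")]
positives, negatives = build_sample_pairs(edges)

embeddings = {"m1": (0.0, 0.0), "m2": (3.0, 4.0), "m3": (3.0, 4.0)}
loss = contrastive_loss(embeddings, positives, negatives)
```

Here `m1` and `m2` share user `u1`, and `m2` and `m3` share user `u2`, so both are positive pairs, while `m1` and `m3` have no common user node and form the only negative pair. Minimizing such a loss over the embedded vectors would move the two nodes of each positive pair together and keep the two nodes of each negative pair at least `margin` apart.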
According to an embodiment of the application, the step of optimizing the embedded vectors of the message nodes according to the positive sample pairs and the negative sample pairs to obtain optimized embedded vectors comprises dividing the knowledge graph into subgraphs according to the positive sample pairs, and optimizing the embedded vectors of the message nodes in each subgraph according to the positive sample pairs and the negative sample pairs to obtain the optimized embedded vector of each message node in the subgraph.
According to an embodiment of the application, the step of optimizing the embedded vectors of the message nodes in each subgraph according to the positive sample pairs and the negative sample pairs to obtain the optimized embedded vector of each message node in the subgraph comprises constructing a target optimization function according to the embedded vectors of the message nodes in the positive sample pairs and the negative sample pairs, wherein the objective of the target optimization function is to reduce the distance between the two nodes of each positive sample pair and to increase the distance between the two nodes of each negative sample pair, and taking the embedded vectors of the message nodes in the positive sample pairs and the negative sample pairs obtained by solving the target optimization function as the optimized embedded vectors of the message nodes.
According to the embodiment of the application, the