CN-122027192-A - Ethereum Ponzi scheme detection method based on control flow graph learning


Abstract

The invention discloses an Ethereum Ponzi scheme detection method based on control flow graph learning, and belongs to the technical field of blockchain smart contract security detection. The method comprises the following steps: step 1, labeling raw smart contract data, extracted from the Ethereum blockchain and subjected to data cleaning, with Ponzi scheme smart contract labels, and using static analysis to parse the basic blocks and control flow transfer instructions of the labeled data to construct a control flow graph of the smart contract; step 2, separately screening bytecode semantic features and control flow graph structural features based on the control flow graph of the smart contract; step 3, training a graph neural network (GNN) on the screened bytecode semantic features and control flow graph structural features, using multi-stage pruning; step 4, using the trained GNN to distinguish smart contracts by binary classification. Compared with the prior art, the method improves the accuracy of detecting and identifying Ponzi schemes in blockchain smart contracts.

Inventors

SHEN MENG
CHEN JINGREN
DU HANBIAO
LIU YANG
WANG LIWEN
ZHU LIEHUANG

Assignees

Beijing Institute of Technology (北京理工大学)

Dates

Publication Date: 20260512
Application Date: 20251221

Claims (9)

1. An Ethereum Ponzi scheme detection method based on control flow graph learning, characterized by comprising the following steps:
Step 1: labeling raw smart contract data, extracted from the Ethereum blockchain and subjected to data cleaning, with Ponzi scheme smart contract labels, and constructing a control flow graph of the smart contract from the basic blocks and control flow transfer instructions obtained by static analysis;
Step 2: separately screening bytecode semantic features and control flow graph structural features based on the control flow graph of the smart contract;
Step 2.1: screening the bytecode semantic features based on an L1-norm normalized frequency vector, association rule mining to identify sequential execution patterns, and a sliding window method;
Step 2.2: obtaining the number of linearly independent paths of the control flow graph from the number of nodes, the edge density and the longest path length, and labeling the nodes with a four-tuple feature;
Step 3: training a GNN graph neural network on the screened bytecode semantic features and control flow graph structural features, using multi-stage pruning;
Step 3.1: constructing a feature matrix of the control flow graph, inputting the feature matrix into the GNN, and mapping the node and edge features into a high-dimensional space through the linear transformation shown in formula (7), wherein $V$ is the node set of $G = (V, E)$, $E$ is the edge set, each node $v_i$ is associated with a feature vector, each edge $e_{ij}$ is associated with a feature vector, $W_v$ and $W_e$ are the weight matrices of the nodes and edges respectively, and the outputs of the transformation are the initial embedding vectors;
Step 3.2: aggregating neighbor information with $L$ graph convolution layers to obtain the node embedding update of the feature matrix of the control flow graph shown in formula (8), wherein the formula uses the neighbor set of node $v_i$ and the layer-$l$ attention weights, $W^{(l-1)}$ and $b^{(l-1)}$ are the weight matrix and bias term respectively, and $\sigma$ is the activation function;
Step 3.3: obtaining the attention weight vector of the feature matrix of the control flow graph shown in formula (9) by using a self-attention mechanism, wherein $\|$ denotes vector concatenation;
Step 3.4: setting a pruning threshold and pruning the control flow graph by multi-stage pruning;
Step 4: distinguishing smart contracts by binary classification with the trained GNN in the manner shown in formula (12):
$p = \sigma(W h_G + b)$ (12)
wherein $W$ and $b$ are the parameters of the classification layer, $\sigma$ is the Sigmoid function, and $p$ is used to distinguish normal contracts from Ponzi scheme contracts on Ethereum.
2. The Ethereum Ponzi scheme detection method based on control flow graph learning of claim 1, wherein step 1 is implemented as follows:
Step 1.1: calling the Etherscan API to extract raw smart contract data from the Ethereum blockchain;
Step 1.2: labeling the cleaned raw data with Ponzi scheme smart contract labels;
Step 1.3: constructing a control flow graph of the smart contract from the basic blocks and control flow transfer instructions obtained by parsing the labeled data.
3. The Ethereum Ponzi scheme detection method based on control flow graph learning of claim 2, wherein step 1.2 is implemented as follows:
Step 1.2.1: removing invalid, redundant or erroneous entries from the raw data by data cleaning;
Step 1.2.2: labeling the cleaned data with Ponzi scheme smart contract labels to form labeled data.
4. The Ethereum Ponzi scheme detection method based on control flow graph learning of claim 2, wherein step 1.3 is implemented as follows:
Step 1.3.1: parsing the labeled data with a static analysis technique to form bytecode consisting of basic blocks and control flow transfer instructions;
Step 1.3.2: constructing a control flow graph of the smart contract from the parsed bytecode.
5. The Ethereum Ponzi scheme detection method based on control flow graph learning of claim 4, wherein step 1.3.2 is implemented as follows:
Step 1.3.2.1: taking the basic blocks in the bytecode as the nodes of the control flow graph of the smart contract;
Step 1.3.2.2: taking the control flow transfer instructions in the bytecode as the edges of the control flow graph of the smart contract;
Step 1.3.2.3: constructing the control flow graph of the smart contract shown in formula (1) from the control flow transfer relations among the basic blocks:
$G = (V, E)$ (1)
wherein $V$ is the set of nodes representing the basic blocks in the contract, $E = \{(v_i, v_j)\}$ is the set of edges from $v_i$ to $v_j$, and $v_i$ and $v_j$ denote two basic block nodes in the control flow graph; if there is a control flow transfer instruction from $v_i$ to $v_j$, an edge from $v_i$ to $v_j$ is added to the control flow graph.
6. The Ethereum Ponzi scheme detection method based on control flow graph learning of claim 1, wherein step 2.1 is implemented as follows:
Step 2.1.1: obtaining the frequency distribution of the bytecode semantic features with the L1-norm normalized frequency vector shown in formula (2), wherein the formula uses the raw count of bytecode $k$ in contract $i$;
Step 2.1.2: using the association rule mining shown in formula (3) to identify sequential execution patterns and obtain the instruction sequence frequencies of the bytecode semantic features, wherein frequent itemsets $I$ are generated from the bytecode transaction database $D$ to derive high-confidence rules;
Step 2.1.3: capturing the temporal execution patterns of the bytecode semantic features with the sliding window method shown in formula (4), wherein the formula operates on the opcode sequence of contract $i$.
7. The Ethereum Ponzi scheme detection method based on control flow graph learning of claim 1, wherein step 2.2 is implemented as follows:
Step 2.2.1: quantifying the graph complexity of the control flow graph, which is constructed with basic blocks as nodes and potential execution paths as edges, along three dimensions: number of nodes, edge density and longest path length;
Step 2.2.2: obtaining the number of linearly independent paths of the control flow graph in the manner shown in formula (5), according to the quantitative relation among the number of nodes, the edge density and the longest path length:
$v(G) = E - N + 2P$ (5)
wherein $E$, $N$ and $P$ denote the numbers of edges, nodes and connected components respectively;
Step 2.2.3: labeling the security-sensitive nodes of the control flow graph structural features with the four-tuple feature according to formula (6), wherein $\deg^+(v)$ and $\deg^-(v)$ denote the in-degree and out-degree respectively, a further element is the normalized gas consumption, and $\varphi(v)$ is the function call frequency.
8. The Ethereum Ponzi scheme detection method based on control flow graph learning of claim 7, wherein step 2.2.1 is implemented as follows:
Step 2.2.1.1: using the number of nodes to represent code size, the edge density to represent branching intensity, and the longest path length to represent logical depth;
Step 2.2.1.2: constructing the graph complexity quantification relation for Ponzi scheme contract identification from the number of nodes, the edge density and the longest path length.
9. The Ethereum Ponzi scheme detection method based on control flow graph learning of claim 1, wherein step 3.4 is implemented as follows:
Step 3.4.1: removing redundant nodes and edges of the control flow graph by using graph-theoretic indexes in the manner shown in formula (10), wherein, when the redundant nodes and edges are removed, the elements of the weight matrix whose absolute values are smaller than a threshold $t_w$ are removed;
Step 3.4.2: removing the neurons whose activation values are smaller than the pruning threshold $t_a$ in the manner shown in formula (11).
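The control flow graph construction of claim 5 (basic blocks as nodes, control flow transfer instructions as edges) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the helper names `disassemble` and `build_cfg` are hypothetical, blocks are split at `JUMPDEST` and after block-ending opcodes, and only jump targets pushed immediately before a `JUMP` or `JUMPI` are resolved. Dynamically computed targets would need stack analysis, which is omitted.

```python
from typing import List, Set, Tuple

# Standard EVM opcode values for instructions that delimit basic blocks.
STOP, JUMP, JUMPI, JUMPDEST = 0x00, 0x56, 0x57, 0x5B
RETURN, REVERT, INVALID, SELFDESTRUCT = 0xF3, 0xFD, 0xFE, 0xFF
TERMINATORS = {STOP, JUMP, JUMPI, RETURN, REVERT, INVALID, SELFDESTRUCT}
NO_FALLTHROUGH = TERMINATORS - {JUMPI}  # JUMPI may continue to the next block


def disassemble(code: bytes) -> List[Tuple[int, int, int]]:
    """Return (offset, opcode, push_value) triples; push_value is -1 if not a PUSH."""
    out, pc = [], 0
    while pc < len(code):
        op = code[pc]
        if 0x60 <= op <= 0x7F:  # PUSH1..PUSH32 carry 1..32 immediate bytes
            n = op - 0x5F
            out.append((pc, op, int.from_bytes(code[pc + 1:pc + 1 + n], "big")))
            pc += 1 + n
        else:
            out.append((pc, op, -1))
            pc += 1
    return out


def build_cfg(code: bytes) -> Tuple[List[int], Set[Tuple[int, int]]]:
    """Build G = (V, E): V holds block start offsets, E holds (src, dst) pairs."""
    ins = disassemble(code)
    dests = {pc for pc, op, _ in ins if op == JUMPDEST}
    # A block starts at offset 0, at every JUMPDEST, and after every terminator.
    starts = set(dests)
    if ins:
        starts.add(0)
    for idx, (_, op, _) in enumerate(ins):
        if op in TERMINATORS and idx + 1 < len(ins):
            starts.add(ins[idx + 1][0])
    edges: Set[Tuple[int, int]] = set()
    block = 0
    for idx, (pc, op, _) in enumerate(ins):
        if pc in starts:
            block = pc
        # Static jump edge: the pattern PUSH <dest>; JUMP or JUMPI.
        if op in (JUMP, JUMPI) and idx > 0 and ins[idx - 1][2] in dests:
            edges.add((block, ins[idx - 1][2]))
        # Fall-through edge into the block that starts at the next instruction.
        if op not in NO_FALLTHROUGH and idx + 1 < len(ins) and ins[idx + 1][0] in starts:
            edges.add((block, ins[idx + 1][0]))
    return sorted(starts), edges
```

For the bytecode `6001600657005b00` (`PUSH1 1; PUSH1 6; JUMPI; STOP; JUMPDEST; STOP`), the sketch yields three blocks starting at offsets 0, 5 and 6, with one jump edge and one fall-through edge leaving the first block.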
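The three bytecode semantic features of claim 6 can be illustrated with a short sketch. Formulas (2) to (4) are not reproduced in this text, so the code assumes the standard textbook definitions: a count vector divided by its L1 norm, rule confidence as support(lhs and rhs) divided by support(lhs), and fixed-length windows over the opcode sequence. All function names are hypothetical.

```python
from collections import Counter
from typing import List, Sequence, Set


def l1_frequency(opcodes: Sequence[str], vocab: Sequence[str]) -> List[float]:
    """L1-normalized frequency vector: each opcode count divided by the total count."""
    counts = Counter(opcodes)
    total = sum(counts[k] for k in vocab)
    return [counts[k] / total if total else 0.0 for k in vocab]


def sliding_windows(opcodes: Sequence[str], n: int) -> Counter:
    """Count every window of n consecutive opcodes (an opcode n-gram)."""
    return Counter(tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1))


def rule_confidence(transactions: Sequence[Set[str]], lhs: Set[str], rhs: Set[str]) -> float:
    """Confidence of the rule lhs -> rhs over a set of opcode 'transactions'."""
    has_lhs = [t for t in transactions if lhs <= t]
    if not has_lhs:
        return 0.0
    return sum(1 for t in has_lhs if rhs <= t) / len(has_lhs)
```

A rule would then be kept only when its confidence exceeds a chosen cutoff; that cutoff is a tuning parameter that the text does not specify.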
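Formula (5) of claim 7, $v(G) = E - N + 2P$, is the cyclomatic complexity of the control flow graph. The sketch below computes it together with the edge density and the per-node in-degree and out-degree used by the four-tuple feature. It rests on assumptions: connected components are counted with edge direction ignored, edge density is taken as $E / (N(N-1))$ for a directed graph, and the gas and call-frequency elements of formula (6) are left out because their definitions are not reproduced here.

```python
from typing import Dict, Hashable, Iterable, Set, Tuple

Edge = Tuple[Hashable, Hashable]


def connected_components(nodes: Iterable[Hashable], edges: Set[Edge]) -> int:
    """Number of weakly connected components P (edge direction ignored)."""
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, count = set(), 0
    for start in adj:
        if start in seen:
            continue
        count += 1
        seen.add(start)
        stack = [start]
        while stack:  # depth-first traversal of one component
            cur = stack.pop()
            for nb in adj[cur]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
    return count


def cyclomatic_complexity(nodes: list, edges: Set[Edge]) -> int:
    """v(G) = E - N + 2P, the number of linearly independent paths."""
    return len(edges) - len(nodes) + 2 * connected_components(nodes, edges)


def edge_density(nodes: list, edges: Set[Edge]) -> float:
    """Fraction of possible directed edges that are present (assumed definition)."""
    n = len(nodes)
    return len(edges) / (n * (n - 1)) if n > 1 else 0.0


def degrees(nodes: list, edges: Set[Edge]) -> Dict[Hashable, Tuple[int, int]]:
    """Map each node to its (in-degree, out-degree) pair."""
    deg = {v: [0, 0] for v in nodes}
    for u, v in edges:
        deg[u][1] += 1
        deg[v][0] += 1
    return {v: (d[0], d[1]) for v, d in deg.items()}
```

A diamond-shaped graph with one branch and one merge has four nodes, four edges and one component, so $v(G) = 4 - 4 + 2 = 2$, matching its two independent paths.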
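The forward pass described in steps 3.1, 3.2 and 4 of claim 1 can be sketched in plain Python. Only formula (12), $p = \sigma(W h_G + b)$, is reproduced in this text, so the rest is assumed: node features are embedded by a linear map, one layer aggregates neighbor messages with uniform weights standing in for the learned attention weights, ReLU is used as the activation, edge features are ignored, and the graph embedding $h_G$ is taken as the mean of the node embeddings. All names are hypothetical.

```python
import math
from typing import Dict, List

Vector = List[float]
Matrix = List[List[float]]


def matvec(m: Matrix, x: Vector) -> Vector:
    """Matrix-vector product."""
    return [sum(a * b for a, b in zip(row, x)) for row in m]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def embed(x: Dict[int, Vector], w_v: Matrix) -> Dict[int, Vector]:
    """Step 3.1 (assumed form): initial node embedding as a linear map of the features."""
    return {i: matvec(w_v, xi) for i, xi in x.items()}


def gnn_layer(h: Dict[int, Vector], nbrs: Dict[int, List[int]],
              w: Matrix, b: Vector) -> Dict[int, Vector]:
    """Step 3.2 (assumed form): h_i' = ReLU(sum_j alpha_ij * W h_j + b)."""
    out = {}
    for i, js in nbrs.items():
        agg = [0.0] * len(b)
        for j in js:
            alpha = 1.0 / len(js)  # uniform stand-in for a learned attention weight
            agg = [a + alpha * m for a, m in zip(agg, matvec(w, h[j]))]
        out[i] = [max(0.0, a + bi) for a, bi in zip(agg, b)]
    return out


def classify(h: Dict[int, Vector], w: Vector, b: float) -> float:
    """Step 4, formula (12): p = sigmoid(W h_G + b), with h_G a mean readout (assumed)."""
    dim = len(w)
    h_g = [sum(vec[d] for vec in h.values()) / len(h) for d in range(dim)]
    return sigmoid(sum(wi * hi for wi, hi in zip(w, h_g)) + b)
```

A contract would be flagged as a Ponzi scheme when `classify` returns a value above a decision threshold such as 0.5. The threshold is a design choice that the claim does not fix.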
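The two threshold tests in claim 9 can be illustrated as follows. Formulas (10) and (11) are not reproduced in this text, so this is a sketch of ordinary magnitude pruning under assumptions: weights whose absolute value is below $t_w$ are set to zero, and a neuron is dropped when its mean absolute activation over a batch is below $t_a$. The graph-theoretic removal of redundant nodes and edges, and any retraining between pruning stages, are not shown. Function names are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def prune_weights(weights: Matrix, t_w: float) -> Matrix:
    """Zero out every weight whose absolute value is smaller than t_w."""
    return [[w if abs(w) >= t_w else 0.0 for w in row] for row in weights]


def kept_neurons(activations: Matrix, t_a: float) -> List[int]:
    """Indices of neurons to keep.

    activations[s][n] is the activation of neuron n on sample s. A neuron is
    kept when its mean absolute activation over the batch is at least t_a
    (an assumed way of reducing per-sample activations to one value).
    """
    if not activations:
        return []
    n_samples = len(activations)
    n_neurons = len(activations[0])
    means = [sum(abs(row[n]) for row in activations) / n_samples
             for n in range(n_neurons)]
    return [n for n, m in enumerate(means) if m >= t_a]
```

In a multi-stage setting these two functions would be applied repeatedly, typically with retraining in between, so that accuracy can recover after each stage.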

Description

Ethereum Ponzi scheme detection method based on control flow graph learning

Technical Field

The invention relates to an Ethereum Ponzi scheme detection method based on control flow graph learning, belongs to the technical field of blockchain smart contract security detection, and is applied to the detection of Ponzi schemes in blockchain smart contracts.

Background

With the rapid development of blockchain technology, Ethereum, as a blockchain platform supporting smart contracts, has given rise to a large number of decentralized applications (DApps). However, the security issues of smart contracts are becoming increasingly prominent, especially the emergence of Ponzi scheme smart contracts, which have caused enormous economic losses to investors. A Ponzi scheme smart contract typically attracts investors by promising high returns, while in fact paying the returns of earlier investors with the funds of new investors, forming a closed funding loop. When the number of new investors is insufficient to sustain the flow of funds, the scheme collapses and the vast majority of investors lose everything. Because smart contract code is public and cannot be tampered with, Ponzi scheme smart contracts can often evade review by traditional financial regulators and spread unchecked across blockchains.

Currently, security detection methods for smart contracts mainly include code auditing, rule matching and machine learning. However, these approaches have significant drawbacks in dealing with Ponzi scheme smart contracts.
Code auditing relies on expert experience and knowledge, is time-consuming and labor-intensive, and can hardly cover all possible vulnerabilities. Rule matching methods are limited by the completeness of the preset rules and struggle to cope with complex and changing fraud patterns. Machine learning methods can learn contract features automatically, but they often lack a deep understanding of control flow structure, which limits detection accuracy. Therefore, how to improve the accuracy of detecting and identifying Ethereum Ponzi schemes in blockchain smart contracts has become an urgent problem to be solved.

Disclosure of Invention

The invention addresses the technical problem of the accuracy of detecting and identifying Ethereum Ponzi schemes in blockchain smart contracts, and provides an Ethereum Ponzi scheme detection method based on control flow graph learning. The method constructs the control flow graph of a smart contract, extracts multidimensional features, and performs efficient and accurate detection in combination with a graph neural network model.

The working principle is as follows. The features of a smart contract are described comprehensively by combining the text processing techniques TF-IDF, N-gram and Word2Vec with opcode features and control flow graph features. These features not only reflect the semantic information of the contract but also reveal its execution flow and logical structure, providing rich data support for subsequent detection. In addition, a graph neural network model dedicated to control flow graphs is designed; through hierarchical graph processing and a multi-stage pruning strategy, training efficiency is optimized while detection accuracy is maintained.
The model can fully mine the structural information in the control flow graph and identify the control flow patterns specific to Ponzi scheme smart contracts.

The aim of the invention is achieved by the following technical scheme:

The invention discloses an Ethereum Ponzi scheme detection method based on control flow graph learning, which comprises the following steps:

Step 1: labeling raw smart contract data, extracted from the Ethereum blockchain and subjected to data cleaning, with Ponzi scheme smart contract labels, and constructing a control flow graph of the smart contract from the basic blocks and control flow transfer instructions obtained by static analysis;
Step 1.1: calling the Etherscan API to extract raw smart contract data from the Ethereum blockchain;
Step 1.2: labeling the cleaned raw data with Ponzi scheme smart contract labels;
Step 1.2.1: removing invalid, redundant or erroneous entries from the raw data by data cleaning;
Step 1.2.2: labeling the cleaned data with Ponzi scheme smart contract labels to form labeled data;
Step 1.3: constructing a control flow graph of the smart contract from the basic blocks and control flow transfer instructions obtained by parsing the labeled data;
Step 1.3.1: parsing the labeled data with a static analysis technique to form a byte code consist