
CN-121997328-A - Blockchain smart contract bytecode-level security detection method and system


Abstract

The invention relates to the technical field of blockchain security and discloses a blockchain smart contract bytecode-level security detection method and system. The method comprises: for the control-flow and data-flow dependencies arising during smart contract execution, constructing a contract execution dependency graph using a contract bytecode execution semantic model; mining a contract mixed-granularity execution trajectory encoding using a dual-channel contract semantic encoding mechanism; performing self-supervised reconstruction learning on non-vulnerable execution samples using a variational autoencoder (VAE)-enhanced execution trajectory reconstruction model to obtain latent semantic distribution features; and, according to the latent semantic distribution features, using a reconstruction error judgment mechanism to evaluate the reconstruction error of the smart contract bytecode and execution trajectory input in the detection stage, thereby judging whether the smart contract has potential security vulnerabilities. The invention improves the accuracy, robustness, and generalization performance of blockchain smart contract bytecode-level security detection.
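
The first step named in the abstract, building a contract execution dependency graph from control-flow and data-flow dependencies, can be illustrated with a minimal Python sketch. Everything here is an assumption for illustration and not the patent's execution semantic model: the `Instr` record, the pre-resolved `jump_to` index, the single linear stack simulation, and the use of plain control-flow successor edges as a stand-in for control dependency edges are all hypothetical simplifications.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple

Edge = Tuple[int, int]

# Opcodes after which execution does not fall through to the next instruction.
TERMINATORS = {"JUMP", "STOP", "RETURN", "REVERT"}


@dataclass(frozen=True)
class Instr:
    """Hypothetical simplified instruction record (not a real EVM decoder)."""
    op: str                        # mnemonic
    pops: int                      # number of stack items consumed
    pushes: int                    # number of stack items produced
    jump_to: Optional[int] = None  # assumed pre-resolved jump target index


def build_dependency_graph(program: List[Instr]) -> Tuple[Set[Edge], Set[Edge]]:
    """Return (control_edges, data_edges) over instruction indices."""
    control_edges: Set[Edge] = set()
    data_edges: Set[Edge] = set()
    stack: List[int] = []  # each entry is the index of the producing instruction
    for i, ins in enumerate(program):
        # Data dependency: every popped value links its producer to this consumer.
        for _ in range(ins.pops):
            data_edges.add((stack.pop(), i))
        stack.extend([i] * ins.pushes)
        # Control flow: fall-through successor, plus the jump target if any.
        if ins.op not in TERMINATORS and i + 1 < len(program):
            control_edges.add((i, i + 1))
        if ins.jump_to is not None:
            control_edges.add((i, ins.jump_to))
    return control_edges, data_edges


def to_adjacency(edges: Set[Edge], n: int, weight: float = 1.0) -> List[List[float]]:
    """Dense adjacency matrix with one uniform (assumed) dependency weight."""
    adj = [[0.0] * n for _ in range(n)]
    for src, dst in edges:
        adj[src][dst] = weight
    return adj


PROGRAM = [
    Instr("PUSH1", 0, 1),
    Instr("PUSH1", 0, 1),
    Instr("ADD", 2, 1),
    Instr("PUSH1", 0, 1),
    Instr("JUMPI", 2, 0, jump_to=6),
    Instr("STOP", 0, 0),
    Instr("JUMPDEST", 0, 0),
]

if __name__ == "__main__":
    control, data = build_dependency_graph(PROGRAM)
    print("control edges:", sorted(control))
    print("data edges:", sorted(data))
```

A real analysis would have to resolve jump targets from the bytecode, track the stack separately along each branch, and assign per-edge dependency strengths; the sketch only shows how one instruction-level node set can carry two distinct edge sets.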

Inventors

  • WANG RUI
  • XU SHANJIE
  • YU HAO
  • SU ZHOU
  • LIU YILIANG
  • ZHANG FANGZHE
  • ZHAO FUHUI
  • SUN MENGQIAN
  • LIU XIN
  • LIU DONGLAN
  • ZHANG HAO
  • CHANG YINGXIAN
  • MA LEI
  • YAO HONGLEI
  • JIN YUHUI
  • SU BING

Assignees

  • State Grid Shandong Electric Power Company Electric Power Research Institute (国网山东省电力公司电力科学研究院)

Dates

Publication Date
20260508
Application Date
20251204

Claims (16)

  1. A blockchain smart contract bytecode-level security detection method, comprising: for the control-flow and data-flow dependencies arising during smart contract execution, constructing a contract execution dependency graph using a preconfigured contract bytecode execution semantic model; taking the contract execution dependency graph as input, mining a contract mixed-granularity execution trajectory encoding using a dual-channel contract semantic encoding mechanism; based on the contract mixed-granularity execution trajectory encoding, performing self-supervised reconstruction learning on non-vulnerable execution samples using a variational autoencoder (VAE)-enhanced execution trajectory reconstruction model to obtain latent semantic distribution features; and, according to the latent semantic distribution features, using a reconstruction error judgment mechanism to evaluate the reconstruction error of the execution trajectory of the smart contract bytecode input in the detection stage, and judging whether the smart contract has potential security vulnerabilities.
  2. The blockchain smart contract bytecode-level security detection method of claim 1, wherein constructing the contract execution dependency graph using the preconfigured contract bytecode execution semantic model, for the control-flow and data-flow dependencies arising during smart contract execution, comprises: for the smart contract bytecode instruction sequence, creating a node for each instruction to form a node set; extracting runtime context attributes for the nodes in the node set to form feature vectors, and obtaining a node attribute matrix based on the feature vectors; analyzing the control flow and the data flow of the smart contract during execution, respectively, to obtain a control dependency edge set and a data dependency edge set; weighting the dependency strengths of the control dependency edge set and the data dependency edge set, respectively, to obtain a control dependency adjacency matrix and a data dependency adjacency matrix; and constructing the contract execution dependency graph based on the node attribute matrix, the control dependency edge set, the data dependency edge set, and the unified execution semantic adjacency matrix.
  3. The blockchain smart contract bytecode-level security detection method of claim 1, wherein mining the contract mixed-granularity execution trajectory encoding using the dual-channel contract semantic encoding mechanism comprises: extracting control-flow cross-function dependency features using a global-channel contract semantic analysis method based on the contract execution dependency graph; extracting data-flow state migration features using a local-channel contract semantic analysis method based on the contract execution dependency graph; based on the control-flow cross-function dependency features and the data-flow state migration features, extracting multi-scale dependency features, from coarse-grained semantics to fine-grained behaviors, using a mixed-granularity feature fuser; and performing hierarchical semantic alignment and dynamic aggregation on the smart contract based on the multi-scale dependency features to generate the contract mixed-granularity execution trajectory encoding.
  4. The blockchain smart contract bytecode-level security detection method of claim 3, wherein extracting the control-flow cross-function dependency features using the global-channel contract semantic analysis method based on the contract execution dependency graph comprises: for the contract execution dependency graph, extracting a control path set along the control-flow dependency edge set; constructing a global semantic input sequence for each path in the control path set, unifying paths of different lengths to a fixed length, and introducing a mask mechanism to obtain a normalized global semantic input sequence; modeling the long-range path dependencies of the normalized global semantic input sequence based on a multi-head self-attention mechanism, layer normalization, and a feedforward network, and outputting global path dependency features; performing layer-wise graph semantic propagation over the control adjacency relation, injecting graph-level global context features, and deeply integrating the weighted multi-path feature vectors with the global context features through residual fusion to obtain residual-fused graph-level aggregated control features; and transforming the aggregated features through a nonlinear mapping to obtain the control-flow cross-function dependency features of the global channel.
  5. The blockchain smart contract bytecode-level security detection method of claim 3, wherein extracting the data-flow state migration features using the local-channel contract semantic analysis method based on the contract execution dependency graph comprises: for the contract execution dependency graph, extracting a data-flow path set along the data-flow dependency edge set; constructing a local time-series input sequence for each path in the data-flow path set, unifying paths of different time scales to a fixed time scale, and introducing a mask mechanism to obtain a normalized local time-series input sequence; based on a gated recurrent unit (GRU), the Sigmoid activation function, and the Hadamard product, performing temporal modeling of data-flow state migration on the normalized local time-series input sequence, and outputting local data-flow state migration dependency features; enhancing the local data-flow state migration dependency features using a one-dimensional dilated convolution layer, stabilizing the feature distribution through layer normalization to obtain convolution-enhanced features, and integrating the convolution-enhanced features with the original temporal features through residual fusion to obtain path temporal-state residual fusion features; based on the path temporal-state residual fusion features, aggregating key data migration segments using a multi-head temporal attention mechanism to obtain key migration segment aggregation vectors; and transforming the multi-path robust aggregated features through a nonlinear mapping to obtain the data-flow state migration features of the local channel.
  6. The blockchain smart contract bytecode-level security detection method of claim 3, wherein extracting the multi-scale dependency features, from coarse-grained semantics to fine-grained behaviors, using the mixed-granularity feature fuser based on the control-flow cross-function dependency features and the data-flow state migration features, and performing hierarchical semantic alignment and dynamic aggregation on the smart contract based on the multi-scale dependency features to generate the contract mixed-granularity execution trajectory encoding, comprises: aligning the channels of the control-flow cross-function dependency features and the data-flow state migration features and projecting them into a common semantic space through linear mapping matrices and bias matrices, to obtain aligned control-flow cross-function dependency features and data-flow state migration features; applying nonlinear transformations to the aligned control-flow cross-function dependency features and data-flow state migration features based on k scale transformers to obtain a multi-scale feature set from coarse-grained semantics to fine-grained behaviors; for the multi-scale feature set, computing an importance weight for each scale through attention parameters and, based on the importance weights, modeling the semantic associations among features of different scales with a self-attention mechanism to obtain multi-scale attention aggregation features; generating dimension-wise gating coefficients for the multi-scale attention aggregation features through gating weight parameters and the Sigmoid activation function, and performing element-wise selection using the Hadamard product to enhance the complementarity of the control-flow features and the data-flow features, thereby obtaining a channel gating fusion result; and applying a residual addition of the gating fusion result and the aligned original features, stabilizing the feature distribution with layer normalization, introducing nonlinear expressiveness through the GELU activation function, and generating the contract mixed-granularity execution trajectory encoding.
  7. The blockchain smart contract bytecode-level security detection method of claim 1, wherein performing self-supervised reconstruction learning on the non-vulnerable execution samples using the VAE-enhanced execution trajectory reconstruction model, based on the contract mixed-granularity execution trajectory encoding, to obtain the latent semantic distribution features comprises: for the contract mixed-granularity execution trajectory encoding, performing parametric modeling with the encoder of the variational autoencoder, and outputting the mean vector and the variance vector of the posterior distribution; based on the mean vector and the variance vector, performing differentiable sampling using the reparameterization trick to obtain a latent variable representing the latent semantics of the contract execution trajectory; taking the latent variable as input, modeling the reconstruction distribution with the decoder of the variational autoencoder to generate a trajectory reconstruction vector with the same dimension as the original input; based on the trajectory reconstruction vector and the original contract mixed-granularity execution trajectory encoding, performing self-supervised optimization by maximizing the evidence lower bound, and iteratively updating the encoder parameters and the decoder parameters to obtain an optimized variational autoencoder; and inputting the non-vulnerable execution sample set into the optimized variational autoencoder for continued iterative training, fine-tuning the parameters while minimizing the latent-space distribution constraint and the reconstruction error, and learning in the latent space the latent semantic distribution features of normal smart contract execution trajectories.
  8. The blockchain smart contract bytecode-level security detection method of claim 1, wherein evaluating, according to the latent semantic distribution features and using the reconstruction error judgment mechanism, the reconstruction error of the execution trajectory of the smart contract bytecode input in the detection stage, and judging whether the smart contract has potential security vulnerabilities, comprises: processing the smart contract bytecode input in the detection stage through the contract bytecode execution semantic model and the dual-channel contract semantic encoding mechanism to generate the mixed-granularity execution trajectory encoding of the contract; inputting the mixed-granularity execution trajectory encoding into the VAE-enhanced execution trajectory reconstruction model to generate a trajectory reconstruction vector that conforms to the latent semantic distribution features of normal execution trajectories; computing the reconstruction error between the original mixed-granularity execution trajectory encoding of the smart contract bytecode and the generated trajectory reconstruction vector; and comparing the reconstruction error with a preset threshold, and judging that the smart contract has potential security vulnerabilities when the reconstruction error is greater than the preset threshold; wherein the preset threshold is set according to the mean and standard deviation of the reconstruction error distribution of the non-vulnerable samples in the training stage and a preset hyperparameter.
  9. A blockchain smart contract bytecode-level security detection system, comprising: a contract bytecode execution semantic module, configured to construct a contract execution dependency graph using a preconfigured contract bytecode execution semantic model, for the control-flow and data-flow dependencies arising during smart contract execution; a dual-channel contract semantic encoding module, configured to take the contract execution dependency graph as input and mine a contract mixed-granularity execution trajectory encoding using a dual-channel contract semantic encoding mechanism; a VAE-enhanced reconstruction module, configured to perform, based on the contract mixed-granularity execution trajectory encoding, self-supervised reconstruction learning on non-vulnerable execution samples using a variational autoencoder (VAE)-enhanced execution trajectory reconstruction model to obtain latent semantic distribution features; and a reconstruction error vulnerability judgment module, configured to evaluate, according to the latent semantic distribution features and using a reconstruction error judgment mechanism, the reconstruction error of the execution trajectory of the smart contract bytecode input in the detection stage, and to judge whether the smart contract has potential security vulnerabilities.
  10. The blockchain smart contract bytecode-level security detection system of claim 9, wherein the contract bytecode execution semantic module, when constructing the contract execution dependency graph using the preconfigured contract bytecode execution semantic model for the control-flow and data-flow dependencies arising during smart contract execution, is configured to: for the smart contract bytecode instruction sequence, create a node for each instruction to form a node set; extract runtime context attributes for the nodes in the node set to form feature vectors, and obtain a node attribute matrix based on the feature vectors; analyze the control flow and the data flow of the smart contract during execution, respectively, to obtain a control dependency edge set and a data dependency edge set; weight the dependency strengths of the control dependency edge set and the data dependency edge set, respectively, to obtain a control dependency adjacency matrix and a data dependency adjacency matrix; and construct the contract execution dependency graph based on the node attribute matrix, the control dependency edge set, the data dependency edge set, and the unified execution semantic adjacency matrix.
  11. The blockchain smart contract bytecode-level security detection system of claim 9, wherein the dual-channel contract semantic encoding module, when mining the contract mixed-granularity execution trajectory encoding using the dual-channel contract semantic encoding mechanism, is configured to: extract control-flow cross-function dependency features using a global-channel contract semantic analysis method based on the contract execution dependency graph; extract data-flow state migration features using a local-channel contract semantic analysis method based on the contract execution dependency graph; based on the control-flow cross-function dependency features and the data-flow state migration features, extract multi-scale dependency features, from coarse-grained semantics to fine-grained behaviors, using a mixed-granularity feature fuser; and perform hierarchical semantic alignment and dynamic aggregation on the smart contract based on the multi-scale dependency features to generate the contract mixed-granularity execution trajectory encoding.
  12. The blockchain smart contract bytecode-level security detection system of claim 11, wherein the dual-channel contract semantic encoding module, when extracting the control-flow cross-function dependency features using the global-channel contract semantic analysis method based on the contract execution dependency graph, is configured to: for the contract execution dependency graph, extract a control path set along the control-flow dependency edge set; construct a global semantic input sequence for each path in the control path set, unify paths of different lengths to a fixed length, and introduce a mask mechanism to obtain a normalized global semantic input sequence; model the long-range path dependencies of the normalized global semantic input sequence based on a multi-head self-attention mechanism, layer normalization, and a feedforward network, and output global path dependency features; perform layer-wise graph semantic propagation over the control adjacency relation, inject graph-level global context features, and deeply integrate the weighted multi-path feature vectors with the global context features through residual fusion to obtain residual-fused graph-level aggregated control features; and transform the aggregated features through a nonlinear mapping to obtain the control-flow cross-function dependency features of the global channel.
  13. The blockchain smart contract bytecode-level security detection system of claim 11, wherein the dual-channel contract semantic encoding module, when extracting the data-flow state migration features using the local-channel contract semantic analysis method based on the contract execution dependency graph, is configured to: for the contract execution dependency graph, extract a data-flow path set along the data-flow dependency edge set; construct a local time-series input sequence for each path in the data-flow path set, unify paths of different time scales to a fixed time scale, and introduce a mask mechanism to obtain a normalized local time-series input sequence; based on a gated recurrent unit (GRU), the Sigmoid activation function, and the Hadamard product, perform temporal modeling of data-flow state migration on the normalized local time-series input sequence, and output local data-flow state migration dependency features; enhance the local data-flow state migration dependency features using a one-dimensional dilated convolution layer, stabilize the feature distribution through layer normalization to obtain convolution-enhanced features, and integrate the convolution-enhanced features with the original temporal features through residual fusion to obtain path temporal-state residual fusion features; based on the path temporal-state residual fusion features, aggregate key data migration segments using a multi-head temporal attention mechanism to obtain key migration segment aggregation vectors; and transform the multi-path robust aggregated features through a nonlinear mapping to obtain the data-flow state migration features of the local channel.
  14. The blockchain smart contract bytecode-level security detection system of claim 11, wherein the dual-channel contract semantic encoding module, when extracting the multi-scale dependency features, from coarse-grained semantics to fine-grained behaviors, using the mixed-granularity feature fuser based on the control-flow cross-function dependency features and the data-flow state migration features, and performing hierarchical semantic alignment and dynamic aggregation on the smart contract based on the multi-scale dependency features to generate the contract mixed-granularity execution trajectory encoding, is configured to: align the channels of the control-flow cross-function dependency features and the data-flow state migration features and project them into a common semantic space through linear mapping matrices and bias matrices, to obtain aligned control-flow cross-function dependency features and data-flow state migration features; apply nonlinear transformations to the aligned control-flow cross-function dependency features and data-flow state migration features based on k scale transformers to obtain a multi-scale feature set from coarse-grained semantics to fine-grained behaviors; for the multi-scale feature set, compute an importance weight for each scale through attention parameters and, based on the importance weights, model the semantic associations among features of different scales with a self-attention mechanism to obtain multi-scale attention aggregation features; generate dimension-wise gating coefficients for the multi-scale attention aggregation features through gating weight parameters and the Sigmoid activation function, and perform element-wise selection using the Hadamard product to enhance the complementarity of the control-flow features and the data-flow features, thereby obtaining a channel gating fusion result; and apply a residual addition of the gating fusion result and the aligned original features, stabilize the feature distribution with layer normalization, introduce nonlinear expressiveness through the GELU activation function, and generate the contract mixed-granularity execution trajectory encoding.
  15. The blockchain smart contract bytecode-level security detection system of claim 9, wherein the VAE-enhanced reconstruction module, when performing self-supervised reconstruction learning on the non-vulnerable execution samples using the VAE-enhanced execution trajectory reconstruction model, based on the contract mixed-granularity execution trajectory encoding, to obtain the latent semantic distribution features, is configured to: for the contract mixed-granularity execution trajectory encoding, perform parametric modeling with the encoder of the variational autoencoder, and output the mean vector and the variance vector of the posterior distribution; based on the mean vector and the variance vector, perform differentiable sampling using the reparameterization trick to obtain a latent variable representing the latent semantics of the contract execution trajectory; taking the latent variable as input, model the reconstruction distribution with the decoder of the variational autoencoder to generate a trajectory reconstruction vector with the same dimension as the original input; based on the trajectory reconstruction vector and the original contract mixed-granularity execution trajectory encoding, perform self-supervised optimization by maximizing the evidence lower bound, and iteratively update the encoder parameters and the decoder parameters to obtain an optimized variational autoencoder; and input the non-vulnerable execution sample set into the optimized variational autoencoder for continued iterative training, fine-tune the parameters while minimizing the latent-space distribution constraint and the reconstruction error, and learn in the latent space the latent semantic distribution features of normal smart contract execution trajectories.
  16. The blockchain smart contract bytecode-level security detection system of claim 9, wherein the reconstruction error vulnerability judgment module, when evaluating, according to the latent semantic distribution features and using the reconstruction error judgment mechanism, the reconstruction error of the smart contract bytecode and execution trajectory input in the detection stage, and judging whether the smart contract has potential security vulnerabilities, is configured to: process the smart contract bytecode input in the detection stage through the contract bytecode execution semantic model and the dual-channel contract semantic encoding mechanism to generate the mixed-granularity execution trajectory encoding of the contract; input the mixed-granularity execution trajectory encoding into the VAE-enhanced execution trajectory reconstruction model to generate a trajectory reconstruction vector that conforms to the latent semantic distribution features of normal execution trajectories; compute the reconstruction error between the original mixed-granularity execution trajectory encoding of the smart contract bytecode and the generated trajectory reconstruction vector; and compare the reconstruction error with a preset threshold, and judge that the smart contract has potential security vulnerabilities when the reconstruction error is greater than the preset threshold; wherein the preset threshold is set according to the mean and standard deviation of the reconstruction error distribution of the non-vulnerable samples in the training stage and a preset hyperparameter.
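
The threshold rule in claims 8 and 16 can be sketched in a few lines of Python. The claims say only that the threshold depends on the mean and standard deviation of the benign reconstruction errors and on a preset hyperparameter; the specific form `mean + k * std`, the use of mean squared error as the reconstruction error, and the names `reconstruction_error`, `fit_threshold`, `is_vulnerable`, and `k` are assumptions made for illustration.

```python
import statistics
from typing import Sequence


def reconstruction_error(original: Sequence[float], reconstructed: Sequence[float]) -> float:
    """Mean squared error between an encoding and its reconstruction (assumed metric)."""
    if len(original) != len(reconstructed):
        raise ValueError("encoding and reconstruction must have the same dimension")
    return sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)


def fit_threshold(benign_errors: Sequence[float], k: float = 3.0) -> float:
    """Assumed rule: threshold = mean + k * std of training-stage benign errors."""
    mu = statistics.fmean(benign_errors)
    sigma = statistics.pstdev(benign_errors)  # population standard deviation
    return mu + k * sigma


def is_vulnerable(original: Sequence[float], reconstructed: Sequence[float], threshold: float) -> bool:
    """Flag the contract when its reconstruction error exceeds the threshold."""
    return reconstruction_error(original, reconstructed) > threshold


if __name__ == "__main__":
    # Hypothetical reconstruction errors collected from non-vulnerable samples.
    threshold = fit_threshold([0.1, 0.2, 0.3], k=2.0)
    print(is_vulnerable([1.0, 2.0], [1.0, 4.0], threshold))
    print(is_vulnerable([1.0, 2.0], [1.0, 2.1], threshold))
```

A larger `k` lowers the false-positive rate on benign contracts at the cost of missing vulnerabilities whose trajectories reconstruct only slightly worse than normal ones.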

Description

Blockchain smart contract bytecode-level security detection method and system

Technical Field

The invention relates to the technical field of blockchain security, and in particular to a blockchain smart contract bytecode-level security detection method and system.

Background

With the wide application of blockchain technology in fields such as power grid transactions and collaborative computing for the Internet of Things, the smart contract serves as the core execution carrier of a blockchain system and plays an important role in automated asset transfer and business logic processing. However, once deployed, a smart contract cannot be modified. If a security vulnerability exists in its bytecode, an attacker can exploit it in the execution stage to launch malicious operations such as reentrancy attacks, delegatecall abuse, and integer overflow, causing serious economic loss and systemic risk. Conventional vulnerability detection methods, represented by symbolic execution, static rule matching, and fuzzing, are limited to analysis at the contract source code level; when the source code is unavailable, they struggle to characterize the dynamic control dependencies and data interaction patterns of a smart contract during execution.
Existing bytecode-level contract security detection methods mainly rely on rule-based or supervised-learning analysis paradigms, and have the following shortcomings. (1) Existing methods generally perform supervised classification based on a graph neural network or a text encoding model, which requires a large number of labeled vulnerability samples to train the detection model. In a real blockchain environment, vulnerability samples are scarce and new vulnerability types are catalogued with a lag, so the model generalizes poorly to unknown or evolving vulnerabilities. (2) Existing methods focus on static control-flow or data-flow structure and have difficulty capturing the dynamic temporal dependencies and state migration characteristics of contract execution, whereas real vulnerabilities often manifest as anomalies in execution timing. (3) Execution pattern mining is limited: existing methods mostly analyze local control-flow subgraphs or isolated instruction sequences and lack the ability to jointly model long-range data dependencies and global execution semantics, so the model's multi-level semantic representation of complex contract execution trajectories is insufficient, which degrades the detection of variant attacks and latent vulnerabilities.

Therefore, how to provide a blockchain smart contract bytecode-level security detection scheme that improves the accuracy of bytecode-level vulnerability detection and the generalization to unknown vulnerabilities, reduces the dependence on labeled vulnerable contract bytecode, and strengthens the model's adaptive representation of contract execution semantics and its latent-space learning capability is a problem that urgently needs to be solved.
Disclosure of Invention

Embodiments of the invention provide a blockchain smart contract bytecode-level security detection method and system to solve the above technical problems in the prior art.

According to a first aspect of an embodiment of the present invention, there is provided a blockchain smart contract bytecode-level security detection method.

In one embodiment, the blockchain smart contract bytecode-level security detection method includes: for the control-flow and data-flow dependencies arising during smart contract execution, constructing a contract execution dependency graph using a preconfigured contract bytecode execution semantic model; taking the contract execution dependency graph as input, mining a contract mixed-granularity execution trajectory encoding using a dual-channel contract semantic encoding mechanism; based on the contract mixed-granularity execution trajectory encoding, performing self-supervised reconstruction learning on non-vulnerable execution samples using a variational autoencoder (VAE)-enhanced execution trajectory reconstruction model to obtain latent semantic distribution features; and, according to the latent semantic distribution features, using a reconstruction error judgment mechanism to evaluate the reconstruction error of the execution trajectory of the smart contract bytecode input in the detection stage, and judging whether the smart contract has potential security vulnerabilities.

In one embodiment, for the control-flow and data-flow dependencies arising during smart contract execution, constructing a contract execution dependency graph using a pre-configured contract bytecode executio