
CN-122027349-A - Industrial control network flow anomaly detection method based on deep learning


Abstract

The invention discloses a deep-learning-based method for detecting anomalies in industrial control network traffic. The method comprises: acquiring and preprocessing industrial control network traffic data; performing industrial control protocol parsing to obtain industrial control protocol semantic fields; constructing an event sequence; numerically encoding the event sequence; constructing a set of adjacent window pairs; obtaining window-level anomaly scores and time-step difference sequences; generating alarm triggers and constructing anomaly intervals; determining a set of contributing time steps; constructing an evidence event set; constructing an explanation-generation prompt; and obtaining a structured explanation result from a large model. By introducing TimeRCD anomaly modeling based on the relative context difference between adjacent windows, together with structured large-model explanation generation based on evidence events, the invention can identify unknown anomalies and output field-level evidence and anomaly-interval explanations in scenarios where strong periodicity, site differences, and process fluctuations coexist in the industrial control network, thereby significantly improving the accuracy of industrial control traffic anomaly detection.
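The windowing and alarm-merging steps summarized above can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names (`sliding_windows`, `adjacent_pairs`, `merge_alarms`), the half-open index convention, and the choice to drop a trailing partial window are assumptions made for illustration, and the TimeRCD scoring itself is not modeled (the scores are simply taken as given).

```python
from typing import List, Tuple

Window = Tuple[int, int]  # half-open [start, end) range of time steps (assumed convention)


def sliding_windows(n_steps: int, length: int, stride: int) -> List[Window]:
    """Return the index ranges of all full windows over n_steps time steps."""
    if length <= 0 or stride <= 0:
        raise ValueError("length and stride must be positive")
    if n_steps < length:
        return []
    # Number of windows follows from the preset window length and stride.
    count = (n_steps - length) // stride + 1
    return [(k * stride, k * stride + length) for k in range(count)]


def adjacent_pairs(windows: List[Window]) -> List[Tuple[Window, Window]]:
    """Pair each window with its immediate successor."""
    return list(zip(windows, windows[1:]))


def merge_alarms(scores: List[float], threshold: float) -> List[Tuple[int, int]]:
    """Merge runs of consecutive above-threshold scores into inclusive index intervals."""
    intervals: List[Tuple[int, int]] = []
    start = None
    for i, score in enumerate(scores):
        if score > threshold:
            if start is None:
                start = i  # a new run of alarms begins here
        elif start is not None:
            intervals.append((start, i - 1))  # the run ended at the previous index
            start = None
    if start is not None:
        intervals.append((start, len(scores) - 1))  # run extends to the last score
    return intervals
```

For example, ten time steps with a window length of 4 and a stride of 2 yield four windows and three adjacent pairs. Merging keeps one interval per run of consecutive alarms, which is what avoids the fragmented alarms described in the background.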

Inventors

  • ZHANG YANSONG
  • DONG BAOJUN
  • FENG XUAN

Assignees

  • 西安厚助电子有限责任公司

Dates

Publication Date
20260512
Application Date
20260403

Claims (9)

  1. A deep-learning-based method for detecting anomalies in industrial control network traffic, characterized by comprising the following steps: S1, acquiring industrial control network traffic data and preprocessing it to construct a session data set; S2, performing industrial control protocol parsing on each session in the session data set to obtain industrial control protocol semantic fields, and constructing an event sequence from the industrial control protocol semantic fields; S3, numerically encoding the event sequence to form a multivariate sequence matrix, dividing the multivariate sequence matrix into windows to construct a set of adjacent window pairs, and inputting the set of adjacent window pairs into a TimeRCD anomaly detection model to obtain window-level anomaly scores and time-step difference sequences; S4, generating alarm triggers based on the window-level anomaly scores and constructing anomaly intervals; S5, determining a set of contributing time steps from each anomaly interval, and constructing an evidence event set based on the set of contributing time steps; S6, constructing an explanation-generation prompt for each anomaly interval, inputting the explanation-generation prompt into the Qwen (Qianwen) large model to obtain a structured explanation result, and outputting the structured explanation result together with the window-level anomaly scores.
  2. The deep-learning-based industrial control network traffic anomaly detection method as claimed in claim 1, wherein acquiring the industrial control network traffic data and preprocessing it to construct the session data set is performed as follows: collecting an original message set of the industrial control network, and parsing each original message to obtain a five-tuple comprising a source IP address, a destination IP address, a source port, a destination port, and a transport layer protocol identifier; extracting a timestamp and a message length from each original message, and sorting the original message set in ascending timestamp order to form an ascending message sequence; performing de-duplication and industrial control protocol identification on the ascending message sequence to obtain a protocol identifier, and dividing the ascending message sequence into sessions based on the five-tuple and the protocol identifier to form the session data set, wherein each piece of session data in the session data set comprises a session identifier, a start time, an end time, and a session message sequence.
  3. The deep-learning-based industrial control network traffic anomaly detection method as claimed in claim 1, wherein performing industrial control protocol parsing on each session in the session data set to obtain the industrial control protocol semantic fields, and constructing the event sequence from those fields, is performed as follows: sorting the session message sequence of each piece of session data in the session data set in ascending timestamp order to obtain an ordered message sequence; extracting general network fields from each message in the ordered message sequence, wherein the general network fields comprise a timestamp, a message length, and a direction identifier; performing industrial control protocol parsing on each message based on the protocol identifier to obtain the industrial control protocol semantic fields, wherein the industrial control protocol semantic fields comprise a command field, an object field, and a status field; computing the time interval between adjacent messages in the ordered message sequence, forming a session event from the timestamp, message length, direction identifier, adjacent-message time interval, command field, object field, and status field of each message, and arranging all session events in chronological order to form the event sequence.
  4. The deep-learning-based industrial control network traffic anomaly detection method according to claim 1, wherein numerically encoding the event sequence to form the multivariate sequence matrix is performed as follows: dividing the fields of each session event in the event sequence into a continuous field set, a discrete field set, and a missing-flag field set; normalizing each continuous field in the continuous field set to obtain a normalized continuous field vector; mapping each discrete field in the discrete field set to a discrete index, and converting the discrete index into a discrete field embedding vector through a linear transformation; concatenating the normalized continuous field vector, the discrete field embedding vectors, and the missing-flag field set to obtain an event vector, and stacking the event vectors of the event sequence in chronological order to form the multivariate sequence matrix.
  5. The deep-learning-based industrial control network traffic anomaly detection method according to claim 1, wherein dividing the multivariate sequence matrix into windows to construct the set of adjacent window pairs, and inputting the set of adjacent window pairs into the TimeRCD anomaly detection model to obtain the window-level anomaly scores and time-step difference sequences, is performed as follows: computing the number of windows from a preset window length and stride, and sliding over the multivariate sequence matrix to cut out a window matrix for each window; constructing adjacent window pairs, and collecting all adjacent window pairs into the set of adjacent window pairs; inputting each adjacent window pair in the set into the TimeRCD anomaly detection model to obtain a window-level anomaly score and a time-step difference sequence.
  6. The deep-learning-based industrial control network traffic anomaly detection method according to claim 1, wherein generating the alarm triggers based on the window-level anomaly scores and constructing the anomaly intervals is performed as follows: comparing each window-level anomaly score with a preset anomaly score threshold, and generating an alarm trigger when the window-level anomaly score is greater than the anomaly score threshold; merging adjacent windows that generated alarm triggers to obtain anomaly intervals, each formed by windows that consecutively generated alarm triggers.
  7. The deep-learning-based industrial control network traffic anomaly detection method according to claim 1, wherein determining the set of contributing time steps from the anomaly interval, and constructing the evidence event set based on the set of contributing time steps, is performed as follows: acquiring the time-step difference sequences of all windows covered by each anomaly interval; setting a contribution count, selecting from each time-step difference sequence that number of time steps with the largest difference values as contributing time steps, and forming the set of contributing time steps; mapping each contributing time step to an index in the event sequence, extracting the session event at that index from the event sequence as an evidence event, and forming the evidence event set corresponding to the anomaly interval.
  8. The deep-learning-based industrial control network traffic anomaly detection method according to claim 1, wherein constructing the explanation-generation prompt for each anomaly interval is performed as follows: extracting the general network fields and the industrial control protocol semantic fields from each evidence event, and dividing the extracted fields into a numeric field set and a discrete field set; computing summary statistics over the evidence event set for each numeric field in the numeric field set to obtain a summary result consisting of a minimum, a maximum, a mean, and a standard deviation; counting value occurrences over the evidence event set for each discrete field in the discrete field set to obtain counts of the distinct values; combining the summary results and the counts in order to form a numeric-symbolic prompt; for the contributing time steps in the anomaly interval, obtaining the hidden representations of those time steps in the encoder of the TimeRCD model, and mapping the hidden representations to discrete codebook indexes by nearest-neighbor matching; arranging the discrete codebook indexes in chronological order to form a step-aligned prompt, and constructing the explanation-generation prompt from the numeric-symbolic prompt and the step-aligned prompt.
  9. The deep-learning-based industrial control network traffic anomaly detection method according to claim 1, wherein inputting the explanation-generation prompt into the Qwen (Qianwen) large model to obtain the structured explanation result, and outputting the structured explanation result and the window-level anomaly scores, is performed as follows: inputting the explanation-generation prompt into the Qwen (Qianwen) large model to obtain a generation result, and parsing the generation result into a structured field set, wherein the structured field set comprises an anomaly type field, a severity level field, an associated asset field, an evidence list field, and a confidence field; binding the time range of the anomaly interval to the structured field set to obtain the structured explanation result, and outputting the structured explanation result together with the window-level anomaly scores.
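The evidence-selection and summarization steps of claims 7 and 8 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the claimed implementation: it assumes one time step per session event, so that a window offset maps to an event index as `window_start + offset`; the field names `length` and `command` are hypothetical; and the population standard deviation is used because the claims do not say which variant is meant. The codebook-based step-aligned prompt is not modeled.

```python
import statistics
from collections import Counter
from typing import Any, Dict, List, Sequence


def top_k_steps(diff_seq: Sequence[float], k: int) -> List[int]:
    """Indices of the k largest difference values, returned in time order."""
    ranked = sorted(range(len(diff_seq)), key=lambda i: diff_seq[i], reverse=True)
    return sorted(ranked[:k])


def evidence_events(events: List[Dict[str, Any]], window_start: int,
                    diff_seq: Sequence[float], k: int) -> List[Dict[str, Any]]:
    """Map contributing time steps to event indices (assumed: one step per event)."""
    return [events[window_start + i] for i in top_k_steps(diff_seq, k)]


def summarize(evidence: List[Dict[str, Any]], numeric_fields: List[str],
              discrete_fields: List[str]) -> Dict[str, Any]:
    """Min/max/mean/std for numeric fields, value counts for discrete fields."""
    summary: Dict[str, Any] = {}
    for field in numeric_fields:
        values = [event[field] for event in evidence]
        summary[field] = {
            "min": min(values),
            "max": max(values),
            "mean": statistics.fmean(values),
            "std": statistics.pstdev(values),  # population std is an assumption
        }
    for field in discrete_fields:
        summary[field] = dict(Counter(event[field] for event in evidence))
    return summary


# Hypothetical events: message length plus a protocol command field.
events = [{"length": 60 + i, "command": "read" if i % 2 == 0 else "write"}
          for i in range(7)]
picked = evidence_events(events, window_start=2,
                         diff_seq=[0.1, 0.7, 0.2, 0.9, 0.05], k=2)
report = summarize(picked, ["length"], ["command"])
```

The resulting `report` dictionary could then be serialized into the numeric-symbolic part of a prompt; how the fields are ordered and worded in the actual prompt is not specified in the claims.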

Description

Industrial control network flow anomaly detection method based on deep learning

Technical Field

The invention relates to the technical field of network traffic detection, and in particular to a deep-learning-based method for detecting anomalies in industrial control network traffic.

Background

An industrial control system network typically carries the monitoring and control instructions of a production process, and its communication is characterized by strong real-time requirements, strong periodicity, and explicit protocol semantics. Existing industrial control network anomaly detection techniques fall mainly into three groups: intrusion detection based on rules and signature libraries, traffic deviation detection based on statistical thresholds, and deep-learning time-series anomaly detection based on reconstruction or prediction. Rules and signature libraries depend on manual maintenance and struggle to cover unknown attacks and variant behaviors. Statistical methods are sensitive to normal changes such as load fluctuation and process switching, and so tend to produce false alarms. Deep-learning reconstruction or prediction methods usually take the reconstruction error or prediction residual as the basis for judging anomalies; in industrial control scenarios they are easily affected by periodic polling, differences between sites, and changing combinations of protocol fields, which leads to objective mismatch and insufficient generalization.
In addition, the prior art is generally weak on alarm interpretability. Most methods output only an anomaly score or a simple label and cannot provide evidence tied to industrial control protocol semantic fields and key time slices, so operations and maintenance personnel have difficulty quickly locating the source and scope of an anomaly. Alarm handling therefore relies on manual review, which is inefficient and inconsistent. For anomaly intervals with unclear boundaries, existing methods also lack a stable mechanism for threshold comparison and merging of consecutive intervals; they tend to produce fragmented alarms and struggle to meet industrial-site requirements for audit traceability and closed-loop handling.

Disclosure of Invention

To solve the technical problems described in the background, the invention provides a deep-learning-based method for detecting anomalies in industrial control network traffic, comprising the following steps:

S1, acquiring industrial control network traffic data and preprocessing it to construct a session data set;
S2, performing industrial control protocol parsing on each session in the session data set to obtain industrial control protocol semantic fields, and constructing an event sequence from the industrial control protocol semantic fields;
S3, numerically encoding the event sequence to form a multivariate sequence matrix, dividing the multivariate sequence matrix into windows to construct a set of adjacent window pairs, and inputting the set of adjacent window pairs into a TimeRCD anomaly detection model to obtain window-level anomaly scores and time-step difference sequences;
S4, generating alarm triggers based on the window-level anomaly scores and constructing anomaly intervals;
S5, determining a set of contributing time steps from each anomaly interval, and constructing an evidence event set based on the set of contributing time steps;
S6, constructing an explanation-generation prompt for each anomaly interval, inputting the explanation-generation prompt into the Qwen (Qianwen) large model to obtain a structured explanation result, and outputting the structured explanation result together with the window-level anomaly scores.

Preferably, the industrial control network traffic data is acquired and preprocessed to construct the session data set as follows:

Collecting an original message set of the industrial control network, wherein the original message set is stored in pcap files and is expressed as $P = \{p_1, p_2, \dots, p_N\}$, where $p_i$ is the $i$-th original message and $N$ is the number of original messages;
parsing each original message to obtain a five-tuple, wherein the five-tuple comprises a source IP address, a destination IP address, a source port, a destination port, and a transport layer protocol identifier;
extracting a timestamp and a message length from each original message, and sorting the original message set in ascending timestamp order to form an ascending message sequence;
performing de-duplication on the ascending message sequence and performing industrial control protocol identification to obtain the protocol identifier Perfo