CN-122001638-A - Network intrusion detection feature selection method based on a multi-strategy improved salp swarm algorithm
Abstract
The invention discloses a network intrusion detection feature selection method based on a multi-strategy improved salp swarm algorithm. Raw network traffic data is first cleaned, normalized, and feature-encoded to construct an initial feature space, and each feature selection scheme is represented as a binary vector. A Tent chaotic map is then used to initialize the population, exploiting the ergodicity and uniformity of the chaotic sequence to increase the diversity of the initial solutions. During feature optimization, the leader position is updated with a Levy flight strategy whose combination of long and short steps strengthens global search and helps avoid local optima. A follower update strategy based on an adaptive inertia weight dynamically balances global exploration and local exploitation and accelerates convergence. A weighted fitness function with the classification error rate and the feature subset size as its two objectives evaluates candidate feature subsets, and the globally optimal feature subset is determined through elite retention and an iteration termination mechanism.
Inventors
- DING RAN
- PENG LU
- ZHANG XING
Assignees
- Nanjing Vocational Institute of Mechatronic Technology (南京机电职业技术学院)
Dates
- Publication Date: 2026-05-08
- Application Date: 2026-01-27
Claims (6)
- 1. A network intrusion detection feature selection method based on a multi-strategy improved salp swarm algorithm, comprising the following steps: Step 1, constructing a feature space: cleaning, normalizing, and feature-encoding raw network traffic data to construct an initial feature space, assigning a binary variable to each feature dimension, and using the vector formed by these binary variables as a position variable indicating whether each feature is selected into the feature subset; Step 2, initializing a population based on the Tent chaotic map: generating the initial population with the Tent chaotic map, so that the ergodicity and uniformity of the chaotic sequence spread the initial individuals over the feature space, increase the diversity of the initial solutions, and prevent the population from concentrating in a local region; Step 3, constructing a leader position update mechanism based on Levy flight: in the leader update stage of the salp swarm algorithm, replacing the conventional random perturbation with a Levy flight strategy, whose search mechanism combining long and short step lengths strengthens the global search capability of the algorithm in a complex feature space and helps avoid local optima; Step 4, designing a follower update strategy based on an adaptive inertia weight: dynamically adjusting the inertia weight according to differences in individual fitness, so as to improve the global exploration and local exploitation capability of the followers during the search; Step 5, constructing a fitness function and evaluating features: constructing a weighted fitness function with the classification error rate and the feature subset size as its two objectives, balancing classification performance against the degree of dimensionality reduction by adjusting the weight coefficients, and evaluating the current feature subset with a classifier; Step 6, determining the optimal feature subset: repeating population update, fitness calculation, and individual selection until the maximum number of iterations is reached or a convergence condition is met, and finally outputting the globally optimal feature subset.
- 2. The network intrusion detection feature selection method based on the multi-strategy improved salp swarm algorithm according to claim 1, wherein initializing the population based on the Tent chaotic map in step 2 is expressed as follows: because conventional random initialization tends to distribute the initial individuals unevenly and to trap the search in a local region early, the Tent chaotic map is introduced to generate an initial chaotic sequence; the unified expression of the Tent chaotic map is: [formula missing in source]; where $z_{i,j}^{t}$ is the chaotic value of the i-th individual in the j-th feature dimension at the t-th chaotic iteration, with values in (0, 1); $z_{i,j}^{t+1}$ is the chaotic value at iteration t+1 obtained from the Tent chaotic map; u is the chaotic control parameter, set to u = 0.7; after the D-dimensional Tent chaotic sequence $z_i = (z_{i,1}, z_{i,2}, \dots, z_{i,D})$ is obtained, it is mapped to the feature search space; with the search range of the j-th feature set to $[lb_j, ub_j]$, the initial position of the i-th individual in this dimension is defined as: $x_{i,j}^{0} = lb_j + z_{i,j}\,(ub_j - lb_j)$; where $lb_j$ is the search lower bound of the j-th feature dimension; $ub_j$ is the search upper bound of the j-th feature dimension; $x_{i,j}^{0}$ is the initial continuous position value of the i-th individual in the j-th feature dimension; $z_{i,j}$ is the initial chaotic value of the i-th individual in the j-th dimension; these steps complete the construction of the initial population based on the Tent chaotic map; the initialization strategy exploits the ergodicity and uniformity of the chaotic sequence and provides the search starting points for the subsequent Levy flight leader update mechanism and adaptive inertia weight follower update mechanism.
- 3. The network intrusion detection feature selection method based on the multi-strategy improved salp swarm algorithm according to claim 1, wherein the leader position update mechanism based on Levy flight constructed in step 3 is expressed as follows: after the initial population has been constructed with the Tent chaotic map, the leader position update of the salp swarm algorithm is improved by introducing a Levy flight strategy, which strengthens the global search capability of the algorithm in the high-dimensional network intrusion detection feature space and helps avoid local optima; let the individual with the best fitness in the population at the t-th iteration be the leader, with position vector $X_{\mathrm{leader}}^{t} = (x_{1}^{t}, x_{2}^{t}, \dots, x_{D}^{t})$; where $X_{\mathrm{leader}}^{t}$ is the leader position vector at the t-th iteration, corresponding to the individual with the best fitness in the current population; $x_{j}^{t}$ is the position of the leader in the j-th feature dimension; in the conventional salp swarm algorithm the leader position update depends on uniform random parameters and the search step length is fixed, so the search tends to lack directionality; a Levy flight mechanism is therefore introduced so that the step length follows a heavy-tailed distribution, combining short-range fine search with long-range jumps; in the specific implementation, the Levy step length is generated by the Mantegna method: $L = \dfrac{\mu}{|v|^{1/\beta}}$; where L is the Levy flight step length, which controls the distance the leader moves in the search space; $\beta$ is the stability parameter of the Levy flight; v is the denominator term used to generate the step; $\mu$ and v each follow a normal distribution: $\mu \sim N(0, \sigma_{\mu}^{2})$, $v \sim N(0, \sigma_{v}^{2})$, with $\sigma_{\mu} = \left[ \dfrac{\Gamma(1+\beta)\,\sin(\pi\beta/2)}{\Gamma\!\left(\frac{1+\beta}{2}\right)\beta\,2^{(\beta-1)/2}} \right]^{1/\beta}$ and $\sigma_{v} = 1$; where $\Gamma(\cdot)$ is the Gamma function, the extension of the factorial function to the real numbers; the Levy flight step length is then introduced into the leader position update formula; with the search bounds of the j-th feature dimension set to $[lb_j, ub_j]$, the leader updates its position at iteration t+1 as: [formula missing in source]; where $x_{j}^{t+1}$ is the position of the leader in the j-th dimension at iteration t+1; $c_{1}^{t}$ is the convergence control factor at the t-th iteration, used to adjust the search step length; L is the Levy flight step length; $x_{j}^{t}$ is the position of the leader in the j-th feature dimension at iteration t; with this update mechanism, the leader can quickly explore promising regions of the feature space with larger steps early in the search and refine the candidate optimal regions with smaller steps in later iterations, improving the global optimization capability and convergence accuracy of the algorithm.
- 4. The network intrusion detection feature selection method based on the multi-strategy improved salp swarm algorithm according to claim 1, wherein designing the follower update strategy based on the adaptive inertia weight in step 4 is expressed as follows: in the conventional salp swarm algorithm the follower update depends only on the positions of adjacent individuals and ignores differences in fitness, which easily causes premature convergence of the population; an adaptive inertia weight mechanism is therefore introduced, and the follower position update is improved using consistent variables and indices; at the t-th iteration, let the position of the i-th follower in the j-th feature dimension be $x_{i,j}^{t}$, the position of the preceding adjacent individual be $x_{i-1,j}^{t}$, and the corresponding fitness function values be $F(X_{i}^{t})$ and $F(X_{i-1}^{t})$ respectively; so that the follower balances global search and local exploitation during position updating, the search weight is adjusted dynamically according to individual fitness, and the position update strategy is defined as: [formula missing in source]; where $x_{i,j}^{t+1}$ is the position of the i-th follower in the j-th feature dimension at iteration t+1; $X_{i}^{t}$ is the position vector of the i-th salp at the t-th iteration, representing a candidate feature subset; $\omega^{t}$ is the adaptive inertia weight, which adjusts the search intensity at the current iteration stage; $F(X_{i}^{t})$ is the fitness function value of the feature subset corresponding to the i-th individual at the t-th iteration; $F(X_{i-1}^{t})$ is the fitness function value of the feature subset corresponding to the (i-1)-th individual at the t-th iteration; $F(\cdot)$ is the fitness function for evaluating feature subsets, defined in step 5; the adaptive inertia weight varies nonlinearly with the iteration number and is expressed as: [formula missing in source]; where $\omega^{t}$ is the adaptive inertia weight at the t-th iteration, which adjusts the search intensity at the current stage; $\omega_{\mathrm{init}}$ and $\omega_{\mathrm{end}}$ are the initial and final values of the inertia weight; t is the current iteration number; $T_{\max}$ is the maximum number of iterations; $\arcsin(\cdot)$ is the arcsine function; in the early iterations the inertia weight is large, so the followers retain strong exploration capability, widen the search range, and can escape local optima; as the iterations proceed the inertia weight gradually decreases and the search behavior of the followers concentrates in the promising region, which improves local exploitation and accelerates convergence.
- 5. The network intrusion detection feature selection method based on the multi-strategy improved salp swarm algorithm according to claim 1, wherein constructing the fitness function and evaluating features in step 5 is expressed as follows: Step 5.1, binary mapping of continuous position vectors: let the continuous position vector of the i-th individual in the feature space after the t-th iteration be $X_{i}^{t} = (x_{i,1}^{t}, x_{i,2}^{t}, \dots, x_{i,D}^{t})$; where $X_{i}^{t}$ is the continuous position vector of the i-th salp at the t-th iteration, representing the candidate solution currently being searched; because feature selection is a binary combinatorial optimization problem, the continuous positions are mapped to probabilities with a Sigmoid function: $\mathrm{Sig}(x_{i,j}^{t}) = \dfrac{1}{1 + e^{-x_{i,j}^{t}}}$; where $\mathrm{Sig}(x_{i,j}^{t})$ is the Sigmoid function value of $x_{i,j}^{t}$; binarization is then completed against a random threshold: $b_{i,j}^{t} = \begin{cases} 1, & \mathrm{rand} < \mathrm{Sig}(x_{i,j}^{t}) \\ 0, & \text{otherwise} \end{cases}$; where $b_{i,j}^{t}$ is the binary decision variable of the i-th individual in the j-th feature dimension at the t-th iteration; $b_{i,j}^{t} = 1$ indicates that the j-th feature is selected, and $b_{i,j}^{t} = 0$ indicates that the feature is discarded; the feature subset corresponding to the i-th individual can thus be expressed as: $S_{i}^{t} = \{\, j \mid b_{i,j}^{t} = 1 \,\}$; Step 5.2, construction of the fitness function: to account for both classification performance and feature compression rate during feature selection, a fitness function combining the classification error and the number of features is constructed; let the feature subset be $S_{i}^{t}$ and the data set formed from it be $\mathcal{D}_{S_{i}^{t}}$; the fitness function is defined as: $F_{i}^{t} = \alpha \cdot \mathrm{Err}(\mathcal{D}_{S_{i}^{t}}) + \beta \cdot \dfrac{|S_{i}^{t}|}{D}$; where $F_{i}^{t}$ is the fitness function value of the i-th individual at the t-th iteration; $\mathcal{D}_{S_{i}^{t}}$ is the dimension-reduced training data set formed by the feature subset $S_{i}^{t}$; $\mathrm{Err}(\mathcal{D}_{S_{i}^{t}})$ is the classification error rate obtained by training a classifier on that data set; D is the total number of original features; $\alpha$ is the weight coefficient of the classification error; $\beta$ is the weight coefficient of the number of features, satisfying $\alpha + \beta = 1$; the two coefficients balance classification accuracy against the feature dimensionality reduction requirement.
- 6. The network intrusion detection feature selection method based on the multi-strategy improved salp swarm algorithm according to claim 1, wherein determining the optimal feature subset in step 6 is expressed as follows: Step 6.1, individual evaluation and optimal solution update: calculating the fitness values of all individuals in the population, sorting them in ascending order of fitness, and selecting the individual with the best fitness as the optimal solution of the current iteration: $S_{\mathrm{best}}^{t} = \arg\min_{S_{i}^{t}} F(S_{i}^{t})$; where $S_{\mathrm{best}}^{t}$ is the feature subset with the best fitness at the t-th iteration; $\arg\min$ is the operator that returns the solution minimizing the fitness function value; the optimal solution of the current iteration is also compared with the historical global optimal solution, and the global optimal feature subset is updated if the new fitness is better; Step 6.2, iteration control and termination condition: let the current iteration number be t and the maximum iteration number be $T_{\max}$; after each iteration, whether to terminate the algorithm is judged according to the following condition: [formula missing in source]; where $F_{\mathrm{best}}^{t}$ is the global optimal fitness value obtained by the t-th iteration; if the termination condition is not satisfied, t = t + 1 and the next iteration begins; Step 6.3, elite retention and global optimal solution update: to prevent high-quality solutions from being lost during evolution, an elite retention strategy is introduced; let the individual with the best fitness at the t-th iteration be $X_{\mathrm{best}}^{t}$, with fitness value: $F(X_{\mathrm{best}}^{t}) = \min_{i} F(X_{i}^{t})$; it is compared with the historical global optimal individual $G_{\mathrm{best}}$ and its fitness value $F(G_{\mathrm{best}})$: $G_{\mathrm{best}} \leftarrow X_{\mathrm{best}}^{t}$ if $F(X_{\mathrm{best}}^{t}) < F(G_{\mathrm{best}})$, otherwise $G_{\mathrm{best}}$ is kept unchanged; this mechanism ensures that the algorithm always retains the best-performing feature subset throughout the evolution; Step 6.4, global optimal feature subset determination: after the algorithm meets the termination condition, the global optimal feature subset is output: $S^{*} = \arg\min_{t} F(S_{\mathrm{best}}^{t})$; this feature subset corresponds to the minimum fitness value obtained over all iterations and effectively reduces feature redundancy while preserving classification performance; through the iterative evolution control and global optimal solution screening of step 6, the improved salp swarm algorithm converges stably in the high-dimensional network traffic feature space and outputs the optimal feature subset with the strongest discrimination capability and the lowest redundancy, which serves as the feature input for training and deploying the subsequent intrusion detection model.
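The Tent chaotic initialization of claim 2 can be sketched as follows. The explicit map formula is not reproduced in the text above, so this sketch assumes the common skew Tent form, z/u for z < u and (1 - z)/(1 - u) otherwise, with u = 0.7 as stated in the claim. The function names, the random seed value, and the choice to advance one chaotic sequence per individual across the dimensions are illustrative assumptions, not the patent's definition.

```python
import random


def tent_map(z, u=0.7):
    """One step of the skew Tent map on (0, 1). ASSUMED form; the source formula is missing."""
    return z / u if z < u else (1.0 - z) / (1.0 - u)


def tent_init_population(n_individuals, lower, upper, u=0.7, seed=0):
    """Build an initial population with x_ij = lb_j + z_ij * (ub_j - lb_j)."""
    rng = random.Random(seed)  # hypothetical seeding, only for repeatability
    dim = len(lower)
    population = []
    for _ in range(n_individuals):
        z = rng.uniform(0.01, 0.99)  # random starting value of the chaotic sequence
        individual = []
        for j in range(dim):
            z = tent_map(z, u)
            # Keep z strictly inside (0, 1) so the sequence cannot stick at 0 or 1.
            z = min(max(z, 1e-12), 1.0 - 1e-12)
            individual.append(lower[j] + z * (upper[j] - lower[j]))
        population.append(individual)
    return population


if __name__ == "__main__":
    pop = tent_init_population(4, lower=[-2.0] * 5, upper=[2.0] * 5)
    for row in pop:
        print([round(x, 3) for x in row])
```

Every coordinate stays inside its search bounds by construction, which is the only property the later leader and follower updates rely on.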
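The Mantegna method named in claim 3 can be sketched with the Python standard library alone. The value beta = 1.5 is a commonly used default and an assumption here (the claim does not give a value), and the function names are hypothetical. The leader position update itself is not sketched because its formula is not recoverable from the text.

```python
import math
import random


def mantegna_sigma(beta=1.5):
    """Standard deviation of the numerator variable in Mantegna's method."""
    num = math.gamma(1.0 + beta) * math.sin(math.pi * beta / 2.0)
    den = math.gamma((1.0 + beta) / 2.0) * beta * 2.0 ** ((beta - 1.0) / 2.0)
    return (num / den) ** (1.0 / beta)


def levy_step(beta=1.5, rng=random):
    """Draw one heavy-tailed step L = mu / |v|^(1/beta)."""
    mu = rng.gauss(0.0, mantegna_sigma(beta))
    v = rng.gauss(0.0, 1.0)
    while v == 0.0:  # avoid division by zero in the (very unlikely) exact-zero case
        v = rng.gauss(0.0, 1.0)
    return mu / abs(v) ** (1.0 / beta)


if __name__ == "__main__":
    rng = random.Random(42)
    steps = [levy_step(rng=rng) for _ in range(10)]
    print([round(s, 3) for s in steps])
```

Most draws are small, with occasional large jumps; this mix of short and long steps is the behaviour the claim relies on to escape local optima.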
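Steps 5.1 and 5.2 of claim 5 (Sigmoid binarization and the weighted fitness) might look like the sketch below. The classifier is abstracted as a caller-supplied `error_rate_fn`, a hypothetical callback that returns the classification error for a list of selected feature indices. The weight alpha = 0.9 and the handling of an empty subset are assumptions; the claim only implies that the two weights sum to 1.

```python
import math
import random


def sigmoid(x):
    """Numerically stable Sigmoid, mapping a continuous position to (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def binarize(position, rng):
    """Select feature j when a uniform random threshold falls below Sig(x_j)."""
    return [1 if rng.random() < sigmoid(x) else 0 for x in position]


def fitness(mask, error_rate_fn, alpha=0.9):
    """Weighted sum of classification error and relative subset size (beta = 1 - alpha)."""
    selected = [j for j, bit in enumerate(mask) if bit == 1]
    if not selected:
        return 1.0  # ASSUMPTION: an empty subset is scored as the worst case
    beta = 1.0 - alpha
    return alpha * error_rate_fn(selected) + beta * len(selected) / len(mask)


if __name__ == "__main__":
    # Toy stand-in for a real classifier: features 0 and 2 are the informative ones.
    def toy_error(selected):
        return 0.1 if {0, 2} <= set(selected) else 0.5

    rng = random.Random(0)
    mask = binarize([3.0, -3.0, 2.5, -1.0], rng)
    print(mask, round(fitness(mask, toy_error), 4))
```

A lower value is better: with equal error, the smaller subset wins, which is how the second term drives dimensionality reduction.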
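The elite retention loop of claim 6 can be illustrated with a generic skeleton. Because the termination formula is missing from the text, this sketch stops only at a maximum iteration count. `update_fn` is a hypothetical placeholder for the leader and follower updates of steps 3 and 4, and all names are illustrative.

```python
import random


def run_with_elitism(population, fitness_fn, update_fn, max_iter):
    """Iterate population updates while always keeping the best solution seen so far."""
    best = list(min(population, key=fitness_fn))
    best_fit = fitness_fn(best)
    history = [best_fit]
    for t in range(max_iter):
        population = update_fn(population, best, t)
        candidate = min(population, key=fitness_fn)  # best of the current iteration
        candidate_fit = fitness_fn(candidate)
        if candidate_fit < best_fit:  # elite retention: replace only on improvement
            best, best_fit = list(candidate), candidate_fit
        history.append(best_fit)
    return best, best_fit, history


if __name__ == "__main__":
    rng = random.Random(3)

    def sphere(x):
        # Toy fitness standing in for the weighted feature selection fitness.
        return sum(v * v for v in x)

    def jitter(pop, best, t):
        # Toy update: perturb each individual around the current global best.
        return [[b + rng.gauss(0.0, 0.5) for b in best] for _ in pop]

    start = [[rng.uniform(-5.0, 5.0) for _ in range(3)] for _ in range(10)]
    best, best_fit, history = run_with_elitism(start, sphere, jitter, max_iter=30)
    print(round(history[0], 4), round(best_fit, 4))
```

The recorded global best fitness never increases from one iteration to the next, which is the guarantee the elite retention step is meant to provide.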
Description
Network intrusion detection feature selection method based on a multi-strategy improved salp swarm algorithm
Technical Field
The invention belongs to the technical field of network security and intelligent optimization, and in particular relates to a network intrusion detection feature selection method based on a multi-strategy improved salp swarm algorithm.
Background
With the rapid development of cloud computing, the Internet of Things, and big data technology, networks are growing in scale and structural complexity, and attack methods are becoming more diverse, more covert, and more frequent, which poses a serious challenge to network security protection. As an important component of the network security defense system, the network intrusion detection system carries the key task of monitoring and identifying abnormal behavior and malicious attacks in real time, and its detection performance directly affects the security and stability of the network.
Existing network intrusion detection methods fall mainly into methods based on feature matching and methods based on machine learning. By modeling and classifying network traffic features, machine-learning-based intrusion detection adapts well to unknown attacks and complex attack scenarios. However, as the dimensionality and scale of network traffic data keep growing, the raw data often contains a large number of redundant and irrelevant features. Feeding the raw data directly into a classification model significantly increases the computational cost and tends to increase model training time, detection delay, and the false alarm rate, which limits the real-time performance and accuracy of the intrusion detection system.
To reduce data dimensionality and improve detection performance, feature selection techniques are widely used in network intrusion detection. Screening out, from the original feature set, the key features that discriminate attack behavior most strongly reduces the consumption of computing resources while improving the detection accuracy and generalization capability of the classification model. In recent years, swarm intelligence optimization algorithms such as particle swarm optimization, the genetic algorithm, and the grey wolf optimizer have been widely applied to feature selection because they are simple and robust. However, in high-dimensional, complex feature spaces, existing swarm intelligence algorithms still suffer from low search efficiency, a tendency to fall into local optima, and slow convergence, and they struggle to meet the dual requirements of high accuracy and real-time performance in large-scale network traffic scenarios. The salp swarm algorithm, a relatively new swarm intelligence optimization algorithm, has a simple structure and few parameters, but in feature selection applications it still suffers from insufficient initial population diversity, premature convergence during the search, and limited global search capability. In high-dimensional network intrusion detection feature spaces in particular, the conventional salp swarm algorithm has difficulty balancing global exploration and local exploitation, which limits the stability and optimization accuracy of the screening results.
Therefore, a feature selection method is needed that effectively improves the global search capability and convergence performance of the salp swarm algorithm, so as to screen network intrusion detection features efficiently and to further improve the detection accuracy and real-time response capability of the intrusion detection system while reducing feature redundancy and computational complexity. The present invention has been made in view of the above background.
Disclosure of Invention
To solve the above problems, the present invention provides a network intrusion detection feature selection method based on a multi-strategy improved salp swarm algorithm, characterized in that it comprises the following steps: Step 1, constructing a feature space: cleaning, normalizing, and feature-encoding raw network traffic data to construct an initial feature space, assigning a binary variable to each feature dimension, and using the vector formed by these binary variables as a position variable indicating whether each feature is selected into the feature subset; Step 2, initializing a population based on the Tent chaotic map: generating the initial population with the Tent chaotic map, initializing a characteristic spa