CN-116992935-B - Neural network model training method and system based on improved random configuration algorithm
Abstract
The invention belongs to the technical field of neural network model training and discloses a neural network model training method and system based on an improved random configuration algorithm. Under the condition that the root mean square error of the neural network model output decreases, the method screens suitable neurons as candidate hidden layer nodes through the inequality constraint condition of the random configuration algorithm. It then selects, from the candidate hidden layer nodes, the K neurons that make the training error decrease fastest, selects from these K neurons the neuron least correlated with the previous L-1 hidden layer nodes as the optimal hidden layer node, obtains the output weights by least-squares calculation, and updates the structure of the random configuration network; whether network construction is complete is judged using the maximum allowable number of hidden layer nodes and the maximum allowable output error. The method reduces the overall computational load of the algorithm while guaranteeing prediction accuracy, and is suitable for application scenarios with high real-time requirements.
Inventors
- WANG DIANHUI
- DANG Gang
Assignees
- 东北大学 (Northeastern University)
- 中国矿业大学 (China University of Mining and Technology)
- 江苏锐策智能科技有限公司 (Jiangsu Ruice Intelligent Technology Co., Ltd.)
Dates
- Publication Date
- 2026-05-05
- Application Date
- 2023-08-07
Claims (6)
- 1. A neural network model training method based on an improved random configuration algorithm, comprising: under the condition that the root mean square error of the neural network model output decreases, screening suitable neurons as candidate hidden layer nodes through the inequality constraint condition of the random configuration algorithm; selecting, from the candidate hidden layer nodes, the K neurons that make the training error decrease fastest, and selecting from these K neurons the neuron least correlated with the previous L-1 hidden layer nodes as the optimal hidden layer node; and calculating the output weight by the least square method and updating the structure of the random configuration network; the neural network model training method based on the improved random configuration algorithm specifically comprises the following steps:
  step one, preprocessing the sample set data, setting the parameters of the random configuration network, and initializing the random configuration network, the preprocessing being applied to a large amount of sensor data collected from a smelting process and including denoising, standardization and normalization;
  step two, constructing candidate hidden layer nodes according to the improved weight and bias definition method, substituting the obtained candidate hidden layer nodes into the inequality constraint condition, and screening out the candidate hidden layer nodes satisfying the inequality constraint, the sensor data from the smelting process serving as input;
  step three, performing secondary optimization on the candidate hidden layer nodes satisfying the inequality constraint according to the semi-orthogonalization constraint to obtain the optimal candidate hidden layer node;
  step four, adding the weight and bias of the optimal candidate hidden layer node to the hidden layer of the neural network as a new hidden layer node, calculating the output weight of the neural network model from the ideal output by the least square method, and updating the neural network model;
  step five, judging whether the root mean square error of the current network model output is larger than the maximum expected output error tolerance: if so, turning to step six; if not, turning to step seven;
  step six, judging whether the number of hidden layer nodes is smaller than the maximum allowable number of hidden layer nodes: if smaller, returning to step two; if equal, turning to step seven;
  step seven, after training is finished, outputting a neural network model satisfying the constraint condition of the random configuration theory;
  wherein setting the parameters of the random configuration network and initializing the random configuration network comprises: setting the maximum allowable number of hidden layer nodes, the maximum expected output error tolerance, the maximum number of candidate hidden layer nodes, and the distribution of the input weights, the scale of the distribution being a positive number; and initializing the output error vector to the desired output and setting the model output error scaling factor r, with 0 < r < 1, the desired output being a matrix whose number of rows equals the number of samples and whose number of columns equals the dimension of the output samples;
  wherein constructing candidate hidden layer nodes according to the improved weight and bias definition method comprises: randomly selecting an input weight and a bias from the distribution of the input weights and substituting them into an activation function to obtain a candidate hidden layer node, the input samples forming a matrix whose number of rows equals the number of samples and whose number of columns equals the input sample dimension;
  and wherein the inequality constraint condition is the supervisory inequality of the random configuration framework, ⟨e_{L-1,q}, g_L⟩² / (g_L^T g_L) − (1 − r − μ_L)‖e_{L-1,q}‖² ≥ 0 for q = 1, …, m, where e_{L-1} is the current output error, g_L is the output of the candidate hidden layer node, and the non-negative real number sequence {μ_L} satisfies lim_{L→∞} μ_L = 0 and μ_L ≤ 1 − r.
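The stepwise procedure of claim 1 can be sketched in Python/NumPy. This is a minimal sketch under assumed names: `train_scn`, `N_max`, `eps`, `T_cand` and `r` are illustrative, `tanh` stands in for the unspecified activation function, the supervisory inequality follows the standard stochastic configuration network formulation, and the semi-orthogonalization step of claim 2 is simplified to picking the candidate with the best supervisory value:

```python
import numpy as np

def train_scn(X, T, N_max=50, eps=1e-2, T_cand=100, r=0.99, rng=None):
    """Incremental stochastic-configuration training loop (sketch).

    X: (N, d) input samples, T: (N, m) desired outputs. N_max, eps,
    T_cand and r are the maximum hidden nodes, output-error tolerance,
    candidates drawn per step and error scaling factor (assumed names).
    """
    rng = np.random.default_rng(rng)
    N, d = X.shape
    W, b = np.empty((d, 0)), np.empty(0)        # hidden input weights / biases
    beta = np.empty((0, T.shape[1]))            # output weights
    E = T.copy()                                # initial output error e_0 = T
    for L in range(1, N_max + 1):
        mu = (1 - r) / (L + 1)                  # decaying sequence mu_L
        best = None
        for _ in range(T_cand):
            w = rng.uniform(-1, 1, d)           # random input weight
            bb = rng.uniform(-1, 1)             # random bias
            g = np.tanh(X @ w + bb)             # candidate node output
            # supervisory inequality: every component must be non-negative
            xi = (E.T @ g) ** 2 / (g @ g) - (1 - r - mu) * (E ** 2).sum(axis=0)
            if xi.min() >= 0 and (best is None or xi.sum() > best[0]):
                best = (xi.sum(), w, bb)
        if best is None:
            break                               # no admissible candidate found
        _, w, bb = best
        W, b = np.column_stack([W, w]), np.append(b, bb)
        H = np.tanh(X @ W + b)                  # full hidden-layer output matrix
        beta, *_ = np.linalg.lstsq(H, T, rcond=None)  # least-squares output weights
        E = T - H @ beta                        # updated output error
        if np.sqrt(np.mean(E ** 2)) <= eps:     # stop once RMSE is small enough
            break
    return W, b, beta
```

With r close to 1 the admissibility condition is easy to satisfy early and tightens as the error shrinks; recomputing all output weights by least squares after each added node corresponds to step four of the claim.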
- 2. The neural network model training method based on the improved random configuration algorithm of claim 1, wherein performing secondary optimization on the candidate hidden layer nodes satisfying the inequality constraint according to the semi-orthogonalization constraint to obtain the optimal candidate hidden layer node comprises: defining a set of supervisory variables and screening out, from the candidate hidden layer nodes satisfying the inequality constraint, the K candidate nodes with the largest supervisory values; and calculating the unit cosine value between each of the K candidate nodes and the previous L-1 nodes in the hidden layer, and selecting from the K candidate nodes the node with the smallest cosine value as the optimal candidate hidden layer node.
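The secondary optimization of claim 2 can be sketched as follows, assuming candidate node outputs are held row-wise in a matrix and correlation is measured by the mean absolute cosine with the existing hidden node outputs (the function and variable names are illustrative, not the patent's notation):

```python
import numpy as np

def pick_least_correlated(candidates, xi, H_prev, K=5):
    """From the candidates with the K largest supervisory values xi,
    choose the one least correlated (smallest mean absolute cosine)
    with the L-1 existing hidden node outputs.

    candidates: (T, N) matrix, row t = candidate t's output on all N
    samples; xi: (T,) supervisory values; H_prev: (N, L-1) outputs of
    the current hidden nodes. Names are illustrative assumptions.
    """
    top_k = np.argsort(xi)[-K:]                 # indices of the K best candidates
    best_idx, best_score = None, np.inf
    for t in top_k:
        h = candidates[t]
        if H_prev.shape[1] == 0:                # empty hidden layer: no correlation
            score = 0.0
        else:
            cos = H_prev.T @ h / (np.linalg.norm(H_prev, axis=0)
                                  * np.linalg.norm(h) + 1e-12)
            score = np.abs(cos).mean()          # mean |cosine| with existing nodes
        if score < best_score:
            best_score, best_idx = score, t
    return best_idx
```

Restricting the cosine test to the K best candidates keeps the error decreasing quickly while the correlation criterion pushes new nodes toward directions not yet covered by the hidden layer.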
- 3. The neural network model training method based on the improved random configuration algorithm of claim 1, wherein calculating the output weight of the neural network model from the ideal output using the least square method and updating the neural network model comprises: obtaining the output weight of the neural network model by applying the least square method to the output value of each hidden layer node and the ideal output value, calculating the root mean square error of the neural network model output, and updating the model output error and the number of hidden layer nodes.
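The least-squares update of claim 3 can be sketched as follows (function name, argument names and shapes are illustrative assumptions):

```python
import numpy as np

def update_output_weights(H, T):
    """Compute output weights by least squares and the resulting RMSE.

    H: (N, L) hidden-layer output matrix, T: (N, m) ideal output.
    Returns the output weights, the updated error matrix and the root
    mean square error (names and shapes are illustrative assumptions).
    """
    beta, *_ = np.linalg.lstsq(H, T, rcond=None)  # least-squares output weights
    E = T - H @ beta                              # updated model output error
    rmse = float(np.sqrt(np.mean(E ** 2)))        # root mean square error
    return beta, E, rmse
```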
- 4. A neural network model training system based on an improved random configuration algorithm, for implementing the neural network model training method based on an improved random configuration algorithm as claimed in any one of claims 1 to 3, characterized in that the neural network model training system comprises: an initialization module for preprocessing the sample set data, setting the parameters of the random configuration network and initializing the random configuration network; a node screening module for constructing candidate hidden layer nodes according to the improved weight and bias definition method, substituting the obtained candidate hidden layer nodes into the inequality constraint condition, and screening out the candidate hidden layer nodes satisfying the inequality constraint; a node secondary optimization module for performing secondary optimization on the candidate hidden layer nodes satisfying the inequality constraint according to the semi-orthogonalization constraint to obtain the optimal candidate hidden layer node; a model updating module for adding the weight and bias of the optimal candidate hidden layer node to the hidden layer of the neural network as a new hidden layer node, calculating the output weight of the neural network model from the ideal output by the least square method, and updating the neural network model; and an output module for outputting the neural network model satisfying the constraint condition of the random configuration theory.
- 5. A computer device comprising a memory and a processor, the memory storing a computer program that, when executed by the processor, causes the processor to perform the steps of the neural network model training method based on the improved random configuration algorithm of any one of claims 1 to 3.
- 6. An information data processing terminal for implementing the neural network model training system based on the improved random configuration algorithm of claim 4.
Description
Neural network model training method and system based on improved random configuration algorithm
Technical Field
The invention belongs to the technical field of neural network model training, and particularly relates to a neural network model training method and system based on an improved random configuration algorithm.
Background
The random configuration network is an incremental neural network model. Unlike other randomized neural network methods, the random configuration network introduces a supervision mechanism into the construction process to assign the hidden layer parameters, thereby guaranteeing its universal approximation property, and it has the advantages of easy implementation, fast convergence and good generalization performance. It is widely applied in fields such as process parameter prediction, fault diagnosis and adaptive control. However, these fields all require online real-time measurement, and therefore place high demands on the prediction accuracy and running speed of the network model. The prior art has the following defects: (1) For a neural network model based on error back-propagation, it is difficult to set a suitable network structure; the network connection weights and biases must be adjusted by gradient descent during training, so training is computationally expensive and time-consuming, easily becomes trapped in local minima, is sensitive to parameter selection, and is inefficient to implement. (2) The random configuration network uses a supervision mechanism to configure the input weights and biases of the hidden layer nodes, and calculates the output weights by the least square method.
However, the configuration process may repeatedly traverse parameters, which increases the computational load of the model and slows error convergence, and this configuration mode may not yield good hidden layer output characteristics, thereby degrading the generalization performance of the model. From the above analysis, the prior art has these problems and defects: the existing neural network training process is computationally expensive, time-consuming, prone to local minima, sensitive to parameter selection, and inefficient to implement, and cannot obtain good hidden layer output characteristics, which harms the generalization performance of the model. The drawbacks of the prior art in industrial applications mainly include the following: 1. Real-time performance and efficiency: many industrial applications, such as fault diagnosis and adaptive control, place high demands on real-time performance and response speed; neural network models based on error back-propagation may fail to meet these requirements because of their heavy computation, long training procedures and tendency to fall into local minima. 2. Structure and parameter selection: traditional neural network models such as error back-propagation networks require manual or experimental determination of an appropriate network structure (number of layers, number of nodes per layer, etc.); moreover, parameter initialization and selection strongly affect the convergence speed and performance of the model, which makes deployment and maintenance of the network difficult in an industrial environment. 3. Local minima: the error back-propagation network easily becomes trapped in local minima during learning, meaning that even after long training the best model performance may not be achieved; in industrial applications this may cause degraded or unstable system performance. 4. Generalization performance: the random configuration network may repeatedly traverse parameters during configuration and fail to obtain good hidden layer output characteristics; in industrial applications, models must predict and respond well to unseen data or conditions, otherwise the system may fail or become inefficient. In view of these defects, the technical problems to be solved in the prior art are as follows: 1. Optimizing the training algorithm: to raise the training speed of the network and avoid trapping in local minima, more efficient training algorithms or combinations with other optimization techniques, such as momentum and regularization, must be studied. 2. Adaptive network structure: how to automatically or semi-automatically determine the network structure to accommodate different applications and data. 3. Enhancing generalization performance: improving the generalization capability of the network by introducing new network structures, activation functions or training strategies. 4. Meeting real-time requirements.