
CN-121981189-A - Spiking neural network training method based on optimization of membrane potential and output consistency


Abstract

The invention discloses a spiking neural network training method based on membrane potential and output consistency optimization, belonging to the fields of brain-inspired computing and machine learning. The method introduces a learnable membrane potential smoothing coefficient into the spiking neuron dynamics and adaptively fuses the pre-charge membrane potential distribution at a given time step with the smoothed membrane potential of the previous time step, reducing the membrane potential difference across time steps; simultaneously, it performs distillation learning between the spiking neural network's outputs at adjacent time steps to promote output consistency across time steps. Compared with the prior art, the method significantly improves the cross-time-step output consistency and overall performance of the spiking neural network, and markedly improves recognition performance on one-dimensional speech, two-dimensional image, and three-dimensional point cloud recognition tasks.

Inventors

  • Zuo Lin
  • Ding Yongqi
  • Deng Hanpu
  • Li Linze

Assignees

  • University of Electronic Science and Technology of China

Dates

Publication Date
2026-05-05
Application Date
2026-01-14

Claims (4)

  1. A spiking neural network training method based on optimization of membrane potential and output consistency, characterized by comprising the following steps (a PyTorch-style sketch of this pipeline is given after the claims):
     Step 1: input the training dataset $D$ for training the spiking neural network (SNN) model into the SNN model to extract data features, wherein the SNN model may comprise convolution/fully connected layers, batch normalization layers, and membrane potential smoothing spiking neuron layers.
     Step 2: the membrane potential smoothing spiking neuron layer continuously extracts latent temporal features over the $T$ time steps, accumulates the membrane potential, and generates a binary spike sequence.
     Step 3: the spike sequence generated by the membrane potential smoothing spiking neurons propagates forward iteratively through the convolution/fully connected layers, the normalization layers, and the membrane potential smoothing spiking neuron layers of the SNN model; for a classification task, the final fully connected layer generates an output value $O_t$ for each of the $T$ time steps, with $O_t \in \mathbb{R}^{N \times M}$, where $N$ and $M$ denote the batch size and the number of classes during model training; for a regression task, an output $O_t \in \mathbb{R}^{N \times 1}$ is generated.
     Step 4: if the task is classification, apply classification-specific distillation learning in the time dimension to the SNN outputs obtained in Step 3, while using the label data of the dataset $D$ to optimize and update the parameters of the SNN with a surrogate gradient function and gradient descent.
     Step 5: if the task is regression, apply regression-specific distillation learning in the time dimension to the SNN outputs obtained in Step 3, while using the label data of the dataset $D$ to optimize and update the parameters of the SNN with a surrogate gradient function and gradient descent.
     Step 6: repeat the above steps until the SNN model converges or a preset number of training rounds is reached, obtaining the optimized SNN model.
  2. The method according to claim 1, characterized in that the membrane potential smoothing spiking neuron in Step 2 operates as follows (a runnable sketch is given after the claims):
     Step 2-1: for the received input current $I_t \in \mathbb{R}^{C \times H \times W}$, where $C$ denotes the number of input current channels and $H$ and $W$ denote the height and width of the input current respectively, initialize the membrane potential of the membrane potential smoothing spiking neuron at the initial time step (index 0) to $V_0 = 0$, and initialize the smoothed membrane potential at time step 0 to $U_0 = 0$.
     Step 2-2: at time step 0, the neuron absorbs the input current of the corresponding time step obtained in Step 2-1 into the membrane potential for charging, giving the charged membrane potential $V_0 = U_0 + I_0$, where 0 is the time step index and $I_0$ is the input current at time step 0.
     Step 2-3: the charged membrane potential $V_0$ is compared with the firing threshold $V_{th}$ to generate spikes; a spike is emitted when the charged membrane potential exceeds the threshold and not otherwise, i.e. $S_0 = \Theta(V_0 - V_{th})$, with $\Theta$ the Heaviside step function.
     Step 2-4: where a spike was generated, the neuron resets the membrane potential by subtracting an amount equal to the threshold: $V_0 \leftarrow V_0 - V_{th} S_0$.
     Step 2-5: the neuron iterates to the next time step (index 1), and the membrane potential leaks from the previous time step: $H_1 = \tau V_0$, where $\tau$ is the membrane time constant controlling how the spiking neuron's membrane potential leaks over time.
     Step 2-6: the leaked membrane potential is fused with the smoothed membrane potential of the previous time step via the smoothing coefficient to obtain the smoothed membrane potential for the current time step: $U_1 = \alpha U_0 + (1 - \alpha) H_1$, where $\alpha$ is the membrane potential smoothing coefficient, a learnable parameter updated as the SNN model is trained.
     Step 2-7: charge from the smoothed membrane potential $U_1$ by absorbing the corresponding input current $I_1$, as in Step 2-2.
     Step 2-8: compare the charged membrane potential with the threshold to generate spikes and reset the membrane potential, as in Steps 2-3 and 2-4.
     Step 2-9: the neuron iterates Steps 2-5 to 2-8, generating spikes until the prescribed number of SNN run time steps $T$ is reached, finally producing the output spikes $S \in \mathbb{B}^{T \times C \times H \times W}$, where $\mathbb{B} = \{0, 1\}$ denotes the binary set.
  3. The method according to claim 1, characterized in that the time-dimension distillation learning in Step 4 comprises the following steps for a classification task (see the sketch after the claims):
     Step 4-1: for the outputs $O_1, \dots, O_T$ generated by the SNN over $T$ time steps, to compute the distillation loss between the outputs of time step $t$ and time step $t+1$, first compute the classification probabilities of the two time-step outputs as $p_t^m = \exp(O_t^m / \kappa) / \sum_{j=1}^{M} \exp(O_t^j / \kappa)$ and $p_{t+1}^m = \exp(O_{t+1}^m / \kappa) / \sum_{j=1}^{M} \exp(O_{t+1}^j / \kappa)$, where $p_t^m$ and $p_{t+1}^m$ are the classification probabilities of the $m$-th class for the two time-step outputs and $\kappa$ is the temperature coefficient softening the output distribution.
     Step 4-2: compute the distillation loss between the outputs corresponding to time step $t$ and time step $t+1$ using the KL divergence: $L_t = \sum_{m=1}^{M} p_{t+1}^m \log(p_{t+1}^m / p_t^m)$.
     Step 4-3: iterate Steps 4-1 and 4-2 over gradually increasing time steps $t = 1, \dots, T-1$, obtaining $T-1$ distillation losses in total.
     Step 4-4: among the $T-1$ distillation losses produced in Step 4-3, retain the loss with the maximum value, and discard each remaining distillation loss with a certain probability $p$ (the weight of a discarded loss is reset to 0, otherwise it is 1).
     Step 4-5: accumulate all distillation losses whose weight is 1 to obtain the final time-dimension distillation loss $L_{distill}$.
     Step 4-6: compute the final loss $L$ as the sum of the distillation loss and the cross-entropy loss: $L = L_{CE} + \lambda L_{distill}$, where $\lambda$ is a weight coefficient controlling the distillation loss and $L_{CE}$ is the cross-entropy loss computed from the average of the SNN outputs over the $T$ time steps and the label information.
     Step 4-7: back-propagate $L$ and optimize the weights and learnable parameters of the SNN using a surrogate gradient function and gradient descent.
  4. The method according to claim 1, characterized in that the time-dimension distillation learning in Step 5 comprises the following steps for a regression task (see the sketch after the claims):
     Step 5-1: for the outputs $O_1, \dots, O_T \in \mathbb{R}^{N \times 1}$ generated by the SNN over $T$ time steps, where the regression output dimension of the SNN is 1, compute the mean square error loss between the outputs of time step $t$ and time step $t+1$: $L_t = \| O_{t+1} - O_t \|_2^2$.
     Step 5-2: iterate Step 5-1 over gradually increasing time steps $t = 1, \dots, T-1$, obtaining $T-1$ distillation losses in total.
     Step 5-3: among the $T-1$ distillation losses produced in Step 5-2, retain the loss with the maximum value, and discard each remaining distillation loss with a certain probability $p$ (the weight of a discarded loss is reset to 0, otherwise it is 1).
     Step 5-4: accumulate all distillation losses whose weight is 1 to obtain the final time-dimension distillation loss $L_{distill}$.
     Step 5-5: compute the final loss $L$ as the sum of the distillation loss and the regression loss: $L = L_{MSE} + \lambda L_{distill}$, where $\lambda$ is a weight coefficient controlling the distillation loss and $L_{MSE}$ is the mean square error loss computed from the average of the SNN outputs over the $T$ time steps and the label information.
     Step 5-6: back-propagate $L$ and optimize the weights and learnable parameters of the SNN using a surrogate gradient function and gradient descent.
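
To make the claim-1 pipeline concrete, the following is a minimal PyTorch-style sketch of one training step: run the SNN for $T$ time steps, combine the task loss on the time-averaged output with the time-dimension distillation loss, and update by surrogate-gradient backpropagation. `model.reset()`, `base_loss_fn`, and `distill_loss_fn` are assumed interfaces for illustration, not part of the patent text.

```python
import torch

def train_step(model, x, y, T, base_loss_fn, distill_loss_fn, lam, optimizer):
    """One training step of the claim-1 pipeline (sketch). `model` is an
    SNN whose spiking layers apply a surrogate gradient in backward();
    `model.reset()` is an assumed API that clears membrane potentials."""
    model.reset()
    outputs = torch.stack([model(x) for _ in range(T)])  # Steps 2-3: T forward passes
    base = base_loss_fn(outputs.mean(dim=0), y)          # CE (class.) or MSE (regr.) on mean output
    loss = base + lam * distill_loss_fn(outputs)         # Steps 4/5: add time-dim distillation
    optimizer.zero_grad()
    loss.backward()                                      # surrogate-gradient backprop (Step 4-7 / 5-6)
    optimizer.step()
    return loss.item()
```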
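The membrane potential smoothing neuron of claim 2 can be sketched as below, under stated assumptions: a rectangular surrogate gradient (the patent only requires some surrogate gradient function), a sigmoid to keep the learnable smoothing coefficient $\alpha$ in $(0, 1)$, and the soft reset of Step 2-4. Symbol names mirror the reconstructed equations above and are editorial choices.

```python
import torch
import torch.nn as nn

class SurrogateSpike(torch.autograd.Function):
    """Heaviside firing with a rectangular surrogate gradient (a common
    choice; the patent does not fix a specific surrogate function)."""
    @staticmethod
    def forward(ctx, v):
        ctx.save_for_backward(v)
        return (v >= 0).float()
    @staticmethod
    def backward(ctx, grad_out):
        (v,) = ctx.saved_tensors
        return grad_out * (v.abs() < 0.5).float()

class SmoothedLIF(nn.Module):
    """Sketch of the claim-2 neuron: leak, EMA-smooth with a learnable
    coefficient alpha, charge, fire, soft reset."""
    def __init__(self, tau=0.5, v_th=1.0):
        super().__init__()
        self.tau, self.v_th = tau, v_th
        self.alpha = nn.Parameter(torch.tensor(0.5))  # learnable smoothing coefficient

    def forward(self, currents):            # currents: (T, N, C, H, W)
        v = torch.zeros_like(currents[0])   # membrane potential, Step 2-1
        u = torch.zeros_like(currents[0])   # smoothed membrane potential, Step 2-1
        spikes = []
        for t, i_t in enumerate(currents):
            h = self.tau * v if t > 0 else v          # Step 2-5: leak
            a = torch.sigmoid(self.alpha)             # keep coefficient in (0, 1)
            u = a * u + (1.0 - a) * h                 # Step 2-6: smoothing fusion
            v = u + i_t                               # Steps 2-2 / 2-7: charge
            s = SurrogateSpike.apply(v - self.v_th)   # Step 2-3: fire
            v = v - self.v_th * s                     # Step 2-4: soft reset
            spikes.append(s)
        return torch.stack(spikes)          # binary spike sequence, Step 2-9
```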
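A sketch of the claim-3 classification distillation loss follows. The temperature `kappa`, the KL direction (later time step as teacher), and the keep probability are editorial assumptions where the translation is ambiguous.

```python
import torch
import torch.nn.functional as F

def classification_time_distill(outputs, kappa=2.0, keep_prob=0.5):
    """Sketch of claim 3: temperature-softened KL between adjacent
    time-step outputs, always keeping the largest of the T-1 losses and
    dropping each remaining one with probability 1 - keep_prob.
    outputs: (T, N, M) logits."""
    T = outputs.shape[0]
    losses = []
    for t in range(T - 1):
        log_p_prev = F.log_softmax(outputs[t] / kappa, dim=-1)    # Step 4-1: earlier step
        p_next = F.softmax(outputs[t + 1] / kappa, dim=-1)        # Step 4-1: later step
        # Step 4-2: KL(next || prev) pulls the earlier output toward the later one
        losses.append(F.kl_div(log_p_prev, p_next, reduction="batchmean"))
    losses = torch.stack(losses)                                  # Step 4-3: T-1 losses
    weights = (torch.rand_like(losses) < keep_prob).float()       # Step 4-4: random drop
    weights[losses.argmax()] = 1.0                                # always keep the max loss
    return (losses * weights).sum()                               # Step 4-5: weighted sum
```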
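And the claim-4 regression counterpart, reusing the same max-keep / random-drop weighting; the patent text does not say whether gradients flow into the later time step's output, so no stop-gradient is applied in this sketch.

```python
import torch
import torch.nn.functional as F

def regression_time_distill(outputs, keep_prob=0.5):
    """Sketch of claim 4: MSE between adjacent time-step regression
    outputs (shape (T, N, 1)) with max-keep / random-drop weighting."""
    T = outputs.shape[0]
    losses = torch.stack([
        F.mse_loss(outputs[t], outputs[t + 1])                    # Step 5-1: adjacent-step MSE
        for t in range(T - 1)
    ])                                                            # Step 5-2: T-1 losses
    weights = (torch.rand_like(losses) < keep_prob).float()       # Step 5-3: random drop
    weights[losses.argmax()] = 1.0                                # always keep the max loss
    return (losses * weights).sum()                               # Step 5-4: weighted sum
```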

Description

Spiking neural network training method based on optimization of membrane potential and output consistency

Technical Field

The invention belongs to the fields of brain-inspired computing and machine learning, and in particular relates to a spiking neural network training method based on membrane potential and output consistency optimization.

Background

Artificial neural networks (ANN) have achieved remarkable results in tasks such as image recognition, video understanding, language question answering, and autonomous driving, and are widely deployed in real-world scenarios. However, training and inference for deep ANNs require substantial computational resources and energy, making them difficult to deploy on edge devices. With the advent of large language models in particular, the energy consumption problem faced by artificial intelligence algorithms has been further exacerbated. Spiking neural networks (SNN), known as the third generation of neural networks, emulate the information transmission mechanism of biological nervous systems with binary, discrete spikes; when deployed on neuromorphic chips, the heavy multiplication operations of ANNs can be replaced by sparse addition operations, greatly reducing computational power consumption. In addition, biologically inspired spiking neurons possess membrane potential dynamics that evolve over time, can extract latent temporal features from data, and have broad application prospects in time-series scenarios. However, excessive differences in membrane potential across time steps within SNN spiking neurons, and the resulting output differences, negatively impact the performance of multi-time-step SNNs, which makes them difficult to deploy at scale in real-world scenarios. To improve the practical usability of SNNs, efficient and high-performance SNN training methods that improve task performance need to be further explored.

Disclosure of the Invention

To improve the task performance of SNNs, the invention provides a spiking neural network training method based on membrane potential and output consistency: it introduces a learnable membrane potential smoothing coefficient into the spiking neuron dynamics and adaptively fuses the pre-charge membrane potential distribution at a given time step with the smoothed membrane potential of the previous time step, reducing the membrane potential difference across time steps.
The technical problem posed by the invention is solved as follows. A spiking neural network training method based on membrane potential and output consistency comprises the following steps:

Step 1: input the training dataset $D$ for training the SNN model into the SNN model, whose forward propagation extracts data features; the SNN model may comprise convolution/fully connected layers, batch normalization layers, membrane potential smoothing spiking neuron layers, and the like.

Step 2: the membrane potential smoothing spiking neuron layer continuously extracts latent temporal features over the $T$ time steps, accumulates the membrane potential, and generates a binary spike sequence.

Step 3: the spike sequence generated by the membrane potential smoothing spiking neurons propagates iteratively through the convolution/fully connected layers, the normalization layers, and the membrane potential smoothing spiking neuron layers of the SNN model; for a classification task, the final fully connected layer generates an output value $O_t$ for each of the $T$ time steps, with $O_t \in \mathbb{R}^{N \times M}$, where $N$ and $M$ denote the batch size and the number of classes during model training; for a regression task, an output $O_t \in \mathbb{R}^{N \times 1}$ is generated.

Step 4: if the task is classification, apply classification-specific distillation learning between adjacent time steps in the time dimension to the SNN outputs obtained in Step 3, while using the label data of the dataset $D$ to optimize and update the parameters of the SNN with a surrogate gradient function and gradient descent.

Step 5: if the task is regression, apply regression-specific distillation learning between adjacent time steps in the time dimension to the SNN outputs obtained in Step 3, while using the label data of the dataset $D$ to optimize and update the parameters of the SNN with a surrogate gradient function and gradient descent.

Step 6: repeat the above steps until the SNN model converges or a pre-specified number of training rounds is reached, obtaining the optimized SNN model.

Step 2 of the invention comprises the following steps. Step 2-1: for the received input current $I_t \in \mathbb{R}^{C \times H \times W}$, where $C$ denotes the number of input current channels and $H$ and $W$ denote the height and width of the input current respectively, the membrane potential of the membrane potential smoothing spiking neurons at the initial time step (index 0) is initialized to 0, and the smoothed membrane potential at time step 0 is likewise initialized to 0.