CN-122024313-A - Gesture recognition system based on an adaptive batch-channel normalization spiking neural network


Abstract

A gesture recognition system based on an adaptive batch-channel normalization spiking neural network comprises a data acquisition module, a preprocessing module and a gesture recognition module connected in sequence. The gesture recognition module has an input layer, a spiking neural network (SNN), a fully connected layer (FC) and an adaptive weighted nuclear norm layer (AWNN) connected in sequence. The SNN has a convolutional input layer connected to a batch-channel normalization network module, which has m network blocks connected end to end. The data acquisition module acquires raw gesture data with temporal attributes; the preprocessing module preprocesses the raw gesture data to obtain standard gesture data; and the gesture recognition module performs gesture recognition on the standard gesture data to obtain a gesture recognition result. The system suppresses redundant temporal noise globally and reduces power consumption.

Inventors

DUAN SHUKAI
ZHONG MEILING
WANG LIDAN

Assignees

Southwest University (西南大学)

Dates

Publication Date: 20260512
Application Date: 20260126

Claims (8)

1. A gesture recognition system based on an adaptive batch-channel normalization spiking neural network, characterized by comprising a data acquisition module, a preprocessing module and a gesture recognition module connected in sequence; the gesture recognition module has an input layer, a spiking neural network (SNN), a fully connected layer (FC) and an adaptive weighted nuclear norm layer (AWNN) connected in sequence, wherein the SNN has a convolutional input layer connected to a batch-channel normalization network module, and the batch-channel normalization network module has m network blocks connected end to end; the data acquisition module is used for acquiring raw gesture data with temporal attributes; the preprocessing module is used for preprocessing the raw gesture data to obtain standard gesture data; and the gesture recognition module is used for performing gesture recognition on the standard gesture data to obtain a gesture recognition result.
2. The gesture recognition system based on the adaptive batch-channel normalization spiking neural network according to claim 1, wherein the m network blocks have the same Block structure, each comprising a first convolution layer, a first batch-channel normalization layer, a first neuron layer, a second convolution layer, a second batch-channel normalization layer, a second neuron layer, a pooling layer and an addition unit connected in sequence, and a third batch-channel normalization layer is connected in parallel between the input of the first convolution layer and the input of the addition unit.
3. The gesture recognition system based on the adaptive batch-channel normalization spiking neural network according to claim 2, wherein the first neuron layer and the second neuron layer adopt an iterative LIF model whose iterative expression is: [formula]; wherein the symbols denote, in order, the membrane potential decay constant, the membrane potential at time t, and the presynaptic input at time t; [formula]; wherein the symbols denote, in order, the binary spike output of neuron j at time t, the weight, and the neuron index, b denotes the bias, and the remaining symbols denote the batch axis, the channel axis and the spatial axes; when the firing condition is met, the neuron fires a spike and resets its membrane potential to 0, i.e. [formula]; the iterative LIF model in the spatial and temporal domains is therefore: [formula]; [formula]; wherein the symbols denote, in order, the firing threshold, the membrane potential of the n-th layer neuron at time t, the binary spike output of the n-th layer neuron at time t, the presynaptic input of the n-th layer neuron at time t, and the activation function.
4. The gesture recognition system based on the adaptive batch-channel normalization spiking neural network according to claim 2, wherein the first batch-channel normalization layer and the second batch-channel normalization layer have the same structure, each comprising a batch-time joint dimension processing unit and a channel-space dimension processing unit arranged in parallel, the outputs of the batch-time joint dimension processing unit and the channel-space dimension processing unit being connected to the input of a common weighted fusion unit.
5. The gesture recognition system of claim 4, wherein the batch-time joint dimension processing unit first calculates the mean and the variance of the input signal of the current batch-channel normalization layer along the (N×T, H, W) axes, expressed as: [formula]; [formula]; and then normalizes the input signal using the mean and the variance, expressed as: [formula]; wherein the symbols denote, in order, the input signal of the current batch-channel normalization layer and the total number of time steps, i denotes the batch-time index, m denotes the total batch-time count, m ∈ N × T, and the remaining symbols denote a hyperparameter, a constant set to ensure numerical stability, and the normalized output of the batch-time joint dimension processing unit; the channel-space dimension processing unit first calculates the mean and the variance of the input signal of the current batch-channel normalization layer along the (C, H, W) axes, expressed as: [formula]; [formula]; and then normalizes the input signal using the mean and the variance, expressed as: [formula]; wherein the symbols denote, in order, the total number of channels, the channel index, and the normalized output of the channel-space dimension processing unit.
6. The gesture recognition system based on the adaptive batch-channel normalization spiking neural network according to claim 5, wherein the weighted fusion unit uses a learnable parameter to perform weighted fusion of the normalized outputs along the (N×T, H, W) and (C, H, W) axes, expressed as: [formula]; the final normalized output of the batch-channel normalization layer is therefore: [formula]; wherein the remaining symbols are learnable parameters.
7. The gesture recognition system based on the adaptive batch-channel normalization spiking neural network according to claim 1, wherein the loss of the gesture recognition module comprises a temporal effective training loss and an adaptive weighted nuclear norm loss, expressed as: [formula]; [formula]; [formula]; wherein the symbols denote, in order: the total number of time steps; the time step index; the target label; the cross-entropy loss; the output at time step t, whose shape is [N, D], where N is the batch size and D is the output feature dimension; the batch index; the total number of singular values; and the singular value index; the superscript T denotes matrix transposition; the further symbols denote the global adaptive weights, the matrix rank, and the time-feature matrix of the k-th sample; a hyperparameter λ is added to adjust the proportion of the regularization term, so that the total loss function of the gesture recognition module is: [formula].
8. The gesture recognition system based on the adaptive batch-channel normalization spiking neural network according to claim 7, wherein the gradient of the total loss of the gesture recognition module with respect to the SNN synaptic weight matrix is expressed as: [formula]; [formula]; [formula]; wherein the symbols denote, in order: the output of the presynaptic input of the output layer after softmax; the one-hot encoding of the target label; the input spike vector of the k-th sample at time step t; and the input spike at time step t; the SNN synaptic weight matrix satisfies [formula]; the gradient of the total loss with respect to the weight W is therefore: [formula].
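The neuron layer and the batch-channel normalization layer recited in claims 3 to 6 can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the claimed implementation: the formulas of the claims are not reproduced in this text, so the LIF update (decay, hard reset to 0, threshold comparison), the plain mean/variance normalization, the convex fusion weight `lam`, the affine parameters `gamma` and `beta`, the tensor layout [T, N, C, H, W], and all function and parameter names are hypothetical. The batch-time branch is read here as per-channel statistics pooled over the N×T, H and W axes, and the channel-space branch as statistics pooled over C, H and W inside each sample and time step.

```python
import numpy as np

def lif_step(u_prev, o_prev, x, tau=0.5, v_th=1.0):
    """One LIF iteration (assumed form): decay, hard reset after a spike, integrate, fire."""
    u = tau * u_prev * (1.0 - o_prev) + x   # neurons that fired at t-1 restart from 0
    o = (u > v_th).astype(x.dtype)          # binary spike output
    return u, o

def bcn_forward(x, lam=0.5, gamma=1.0, beta=0.0, eps=1e-5):
    """Fuse batch-time and channel-space normalization of x with shape [T, N, C, H, W]."""
    # Batch-time branch: one mean/variance per channel, pooled over T, N, H, W.
    mu_bt = x.mean(axis=(0, 1, 3, 4), keepdims=True)
    var_bt = x.var(axis=(0, 1, 3, 4), keepdims=True)
    x_bt = (x - mu_bt) / np.sqrt(var_bt + eps)
    # Channel-space branch: one mean/variance per (time step, sample), pooled over C, H, W.
    mu_cs = x.mean(axis=(2, 3, 4), keepdims=True)
    var_cs = x.var(axis=(2, 3, 4), keepdims=True)
    x_cs = (x - mu_cs) / np.sqrt(var_cs + eps)
    # Weighted fusion (assumed convex combination) followed by an affine map.
    fused = lam * x_bt + (1.0 - lam) * x_cs
    return gamma * fused + beta

rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=3.0, size=(4, 8, 16, 5, 5))  # [T, N, C, H, W]
y = bcn_forward(x)

# Drive a LIF layer with the normalized signal, one time step at a time.
u = np.zeros_like(y[0])
o = np.zeros_like(y[0])
spikes = []
for t in range(y.shape[0]):
    u, o = lif_step(u, o, y[t])
    spikes.append(o)
spikes = np.stack(spikes)          # same shape as x, entries are 0.0 or 1.0
firing_rate = spikes.mean()        # fraction of neuron-time slots that fired
```

In a trained network `lam`, `gamma` and `beta` would be learned rather than fixed; setting `lam=1.0` or `lam=0.0` in this sketch recovers the pure batch-time or pure channel-space branch.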

Description

Gesture recognition system based on an adaptive batch-channel normalization spiking neural network

Technical Field

The invention relates to the technical field of spiking neural networks, and in particular to a gesture recognition system based on an adaptive batch-channel normalization spiking neural network.

Background

During training of deep spiking neural networks (SNNs), the feature distribution received by the hidden layers shifts continuously across training iterations because of the randomness of parameter initialization and the variation of the input spike sequence over the time and sample dimensions, a phenomenon commonly referred to as internal covariate shift (ICS). In SNNs, ICS takes a more complex form: the distribution drift is reflected not only in changes of activation amplitude but also directly affects the temporal characteristics of the membrane potential dynamics and the spike firing rate. It manifests as imbalanced spike firing, time-scale deviation, and deep neurons that fall silent or fire excessively. This weakens the characteristic temporal behavior of the spiking neural network and reduces the stability and accuracy of SNN training. Normalization techniques are widely recognized as a key component for alleviating internal covariate shift in deep SNN training, stabilizing training and accelerating convergence. However, only a few normalization techniques have been proposed for SNNs.

Existing research shows that current SNN normalization methods fall into two categories. The first category normalizes independently along only the time dimension or only the channel dimension.
For example, batch normalization through time (BNTT) normalizes each discrete time step independently; it ignores the structural information between channels and is therefore poorly suited to sparse spiking mechanisms and small-batch training. Wu et al. proposed the neuron normalization technique NeuNorm, which normalizes only along the channel dimension at each spatial position; it effectively alleviates imbalanced spike firing in SNNs, but cannot capture feature changes across the time and sample dimensions and struggles with networks that have a complex temporal structure, significant sample differences or great depth. The second category performs joint normalization by aggregating data over the batch and time dimensions, for example threshold-dependent batch normalization (tdBN) and temporal effective batch normalization (TEBN). These methods capture information across samples and time steps and suppress, to some extent, the distribution drift caused by temporal accumulation, but they may ignore the structural information inside a single sample in deep networks or with few time steps.

In spiking neural networks, batch normalization mainly captures global spike distribution information and membrane potential changes across the sample and time dimensions, acting as a global steady-state adjustment mechanism, whereas channel normalization focuses on the relative intensity of the spike distribution among different channels within a single sample, so that it can enhance the key feature structure unique to that sample while maintaining spike sparsity. The two are functionally complementary, but the SNN field currently has no method that normalizes the channel, time and batch dimensions jointly in a coordinated fusion.
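The normalization schemes discussed above differ mainly in which axes the statistics are pooled over. The following NumPy sketch makes this concrete for a hypothetical activation tensor; the layout [T, N, C, H, W], the variable names, and the reading of each scheme as a particular choice of reduction axes are simplifying assumptions, and the learnable scales, thresholds and per-step weights used by the published methods are omitted.

```python
import numpy as np

T, N, C, H, W = 4, 8, 16, 5, 5
rng = np.random.default_rng(0)
x = rng.normal(size=(T, N, C, H, W))   # hypothetical pre-activation tensor

# Per-time-step batch statistics: one mean per (time step, channel).
mu_per_step = x.mean(axis=(1, 3, 4))
# Joint batch-time statistics: one mean per channel, pooled over all samples and time steps.
mu_batch_time = x.mean(axis=(0, 1, 3, 4))
# Channel-space statistics: one mean per (time step, sample), computed inside a single sample.
mu_channel_space = x.mean(axis=(2, 3, 4))

print(mu_per_step.shape)       # (4, 16)
print(mu_batch_time.shape)     # (16,)
print(mu_channel_space.shape)  # (4, 8)
```

The joint batch-time mean is simply the average of the per-time-step means, which is why it cannot see differences between samples, while the channel-space mean carries no information shared across the batch.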
Shortcomings of the prior art: spiking neural networks (SNNs) have great potential in neuromorphic computing owing to their event-driven, low-power characteristics. However, the discrete spike firing mechanism and the temporal accumulation dynamics of SNNs cause problems such as gradient instability, temporal covariate shift and feature redundancy during direct training. Prior studies mainly balance the neuron spike distribution by normalizing along a single batch dimension, and ignore the global structural correlation of SNNs in the time-feature dimension.

Disclosure of Invention

The gesture recognition system based on the adaptive batch-channel normalization spiking neural network provided by the invention suppresses redundant temporal noise globally and reduces power consumption.

To achieve the above purpose, the gesture recognition system based on the adaptive batch-channel normalization spiking neural network provided by the invention is characterized as follows: the gesture recognition system has a data acquisition module, a preprocessing module and a gesture recognition module connected in sequence; the gesture recognition module has an input layer, a spiking neural network (SNN), a fully connected layer (FC) and a se