US-12626109-B2 - Event-driven accelerator supporting inhibitory spiking neural network
Abstract
A spiking neural network acceleration method includes constructing an approximate computation model according to a spiking neuron model and a spiking coding mode. The approximate computation model exploits the fact that spiking frequency-domain coding ignores the temporal semantics of a spike sequence: it compresses the distribution of spiking signals over time steps, greatly reducing the spiking routing process and the subsequent neural computation process. The event-driven accelerator replans the computation process of the spiking neural network, and sets a deduplication queue and a bitmap to solve the spiking jitter problem, thereby efficiently supporting the inhibitory spiking neural network.
Inventors
- Rui Han
- Jun Chen
- Chao Wang
- Lei GONG
- Huarong XU
- Donglian QI
- Yunfeng YAN
- Chi Zhang
- Yiming ZHENG
- Wang LUO
- Ping QIAN
- Yong Zhang
- Zheren DAI
Assignees
- Electric Power Research Institute of State Grid Zhejiang Electric Power Co., Ltd
Dates
- Publication Date
- 20260512
- Application Date
- 20220801
- Priority Date
- 20220106
Claims (8)
- 1 . An event-driven accelerator supporting an inhibitory spiking neural network, wherein the event-driven accelerator performs a spiking neural network acceleration method, wherein the spiking neural network acceleration method comprises: constructing an approximate computation model to eliminate a computation dependency relationship of a spiking neural network at different time steps, wherein the constructing of the approximate computation model comprises: step 1: collecting all spiking signals in a coarse-grained time period into the same time step; step 2: computing a membrane voltage gain of a neuron in the coarse-grained time period according to the spiking signals in step 1; step 3: computing a spiking firing frequency through the membrane voltage gain in step 2; and step 4: obtaining a computation result of the spiking neural network by utilizing the spiking firing frequency in step 3; wherein in step 2, the membrane voltage gain of the neuron in the coarse-grained time period depends on a spiking stimulation intensity brought by presynaptic neurons in the coarse-grained time period, and a specific computation formula is as follows:

$$\sum_{t=1}^{T'} X_t^j = \sum_{t=1}^{T'} \frac{1}{\tau}\left(\sum_{i=0}^{N} W_{i,j} \cdot S_t^i + \mathrm{leak}\right) \approx \frac{1}{\tau}\left(\sum_{i=0}^{N} W_{i,j} \cdot \mathrm{freq}_i + \mathrm{leak}\right)$$

wherein $X_t^j$ denotes the membrane voltage gain of a neuron j at time t, $\tau$ denotes a time constant, $W_{i,j}$ denotes the connection weight between a neuron i and the neuron j, $S_t^i$ denotes whether the neuron i transmits a spiking signal at time t or not, leak denotes a leak item, and $\mathrm{freq}_i$ denotes the spiking firing frequency of the neuron i in the coarse-grained time period; in step 3, the spiking firing frequency of the neuron is approximately proportional to the membrane voltage gain in the coarse-grained time period, and the membrane voltage gain is divided by a spiking transmission threshold to obtain an approximate spiking firing frequency, and a specific computation formula is as follows:

$$\mathrm{freq}_j = \sum_{t=1}^{T'} S_t^j \approx \left(\sum_{t=1}^{T'} X_t^j\right) / V_{\mathrm{thrd}}$$

wherein $V_{\mathrm{thrd}}$ denotes the spiking firing threshold of a neuron; the event-driven accelerator comprises hardware circuitry including a spike-input module, a control module, a state module, a compute module, and a spike-output module; wherein the spike-input module comprises a queue implemented by hardware memory for storing spike-input packets, the spike-input packets being composed of a binary group of a neuron label and a spiking activation frequency, and the spike-input module receiving spike-input packets derived from image input data in an external memory and from output data of the spike-output module; the control module comprises a graph controller and a scheduler; wherein the graph controller comprises a router and on-chip storage that stores, for each neuron, a number of postsynaptic edges and an offset of the postsynaptic edges, the router accessing a Double Data Rate (DDR) synchronous dynamic random access memory that stores a topological structure of the spiking neural network in a Compressed Sparse Row (CSR) format and, according to neuron labels in the spike-input packets, reading the offset and the number from the on-chip storage and taking out from the Double Data Rate synchronous dynamic random access memory all postsynaptic edges including postsynaptic neurons and corresponding connection weights; the scheduler comprises hardware scheduling circuitry that receives the postsynaptic neurons and corresponding connection weights from the graph controller and employs a set-associative strategy in which each computation unit is assigned to update a specific set of postsynaptic neuron states, the scheduler routing relevant data of the postsynaptic neurons to a corresponding computation unit according to the set-associative strategy; the state module comprises a set of neuron state storage units, wherein each neuron state storage unit stores state information of a set of neurons, the state information comprising membrane
potential levels and spiking transmission thresholds; the compute module comprises a set of neural computation units, wherein each neural computation unit comprises a multiplier-adder and a comparator and is mapped in a one-to-one relationship to a corresponding one of the neuron state storage units so as to read and update the state information of the set of neurons stored in the corresponding neuron state storage unit; when a neural computation unit receives spike input of postsynaptic neurons from the scheduler, the neural computation unit uses the multiplier-adder to update membrane potential levels in the corresponding neuron state storage unit and uses the comparator to compare the updated membrane potential levels with the spiking transmission thresholds to determine neurons for which a spiking signal is to be transmitted; the spike-output module comprises a set of deduplication queues for writing input packets into different positions according to whether the neurons are output neurons or not, and for encapsulating the output neurons into spike-input packets and transmitting the spike-input packets to the spike-input module; wherein each deduplication queue comprises an output queue, a bitmap, and a computation submodule for computing a spiking frequency; wherein the output queue is configured to store the labels of all neurons whose membrane potential level exceeds a threshold at an intermediate state; the bitmap identifies whether a neuron is already present in the output queue, so as to prevent the same neuron from being repeatedly pushed into the output queue; and the computation submodule comprises a submodule cal_freq that, when all data in the spike-input module have been processed, reads final membrane voltage states of the neurons corresponding to the neuron labels stored in the output queue, determines, based on the final membrane voltage states, whether the neurons in the output queue transmit spiking signals, computes frequencies of the spiking signals, and encapsulates the neuron labels and the frequencies into spike-input packets supplied to the spike-input module.
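The approximate computation of steps 1 to 4 in claim 1 can be sketched in software. The following is a hedged illustration under assumed parameter names (`tau`, `leak`, `v_thrd`, `approx_layer_freq`), not the patented hardware implementation:

```python
# Software model of the approximate computation in claim 1, steps 1-4.
# All function and parameter names are illustrative assumptions.
import numpy as np

def approx_layer_freq(freq_in, W, tau, leak, v_thrd, T):
    """Propagate spiking frequencies through one layer.

    freq_in : (N,) spike counts of presynaptic neurons over the
              coarse-grained time period of T time steps (step 1).
    W       : (N, M) connection weights (may be negative / inhibitory).
    Returns approximate spike counts of the M postsynaptic neurons.
    """
    # Step 2: total membrane voltage gain over the period, using the
    # frequency approximation  sum_t X_t^j ~= (W^T freq + leak) / tau
    gain = (W.T @ freq_in + leak) / tau
    # Step 3: firing frequency ~= gain / threshold, clipped to [0, T]
    return np.clip(np.floor(gain / v_thrd), 0, T)

rng = np.random.default_rng(0)
W = rng.normal(0.0, 0.5, size=(4, 3))     # includes inhibitory weights
freq_in = np.array([3.0, 0.0, 2.0, 1.0])  # step 1: collapsed spike counts
print(approx_layer_freq(freq_in, W, tau=2.0, leak=0.1, v_thrd=1.0, T=8))
```

Because only per-period frequencies are propagated, the per-time-step routing and update loop disappears, which is the source of the claimed reduction in spiking routing and neural computation.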
- 2 . The event-driven accelerator according to claim 1 , wherein the spike-input packet is of a binary structure and comprises a neuron label and a spiking activation frequency; wherein the neuron label indicates a source of one of the spiking signals; and the spiking activation frequency reflects a number of times that a neuron is activated in the coarse-grained time period, thereby supporting the approximate computation model.
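The binary group of claim 2 can be modeled as a packed word. The 24-bit label / 8-bit frequency split below is an illustrative assumption; the patent does not specify field widths:

```python
# Hypothetical encoding of a spike-input packet (label, frequency).
# Field widths (24-bit label, 8-bit frequency) are assumptions.
def pack(label, freq):
    assert 0 <= label < (1 << 24) and 0 <= freq < (1 << 8)
    return (label << 8) | freq        # label in high bits, freq in low 8

def unpack(word):
    return word >> 8, word & 0xFF

word = pack(1234, 7)
print(unpack(word))                   # → (1234, 7)
```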
- 3 . The event-driven accelerator according to claim 2 , wherein a computation process of the accelerator supporting the inhibitory spiking neural network is divided into a first stage and a second stage, wherein the first stage is a process of performing spiking routing and updating the postsynaptic neurons according to the spike-input packet; and the second stage is a process in which the spike-output module computes the spiking firing frequency according to a final membrane potential level of the neurons in the output queue.
- 4 . The event-driven accelerator according to claim 1 , wherein the neural computation units and the neuron state storage units have a one-to-one mapping relationship, and when the neural computation units receive a spike input of the postsynaptic neurons, the neural computation units read and update the state information of the neurons stored in the corresponding neuron state storage units, and output a spike-output packet to a spike-output queue when an updated membrane potential level of a neuron exceeds a corresponding spiking transmission threshold.
- 5 . The event-driven accelerator according to claim 4 , wherein a computation process of the accelerator supporting the inhibitory spiking neural network is divided into a first stage and a second stage, wherein the first stage is a process of performing spiking routing and updating the postsynaptic neurons according to the spike-input packet; and the second stage is a process in which the spike-output module computes the spiking firing frequency according to a final membrane potential level of the neurons in the output queue.
- 6 . The event-driven accelerator according to claim 1 , wherein each deduplication queue utilizes the bitmap to identify whether a neuron is already present in the output queue, so as to prevent the same neuron from being repeatedly pushed into the output queue, and wherein a transmission time of the spiking signal is delayed and the spiking firing frequency is computed by the computation submodule cal_freq based on final membrane voltage states of the neurons.
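The deduplication queue of claims 1 and 6 (output queue, bitmap, and cal_freq submodule) can be sketched as follows. Class and method names are illustrative, not from the patent:

```python
# Sketch of a deduplication queue: a neuron label is enqueued at most once
# (bitmap check), and firing is deferred until final membrane states are
# known (the cal_freq stage).
from collections import deque

class DedupQueue:
    def __init__(self, num_neurons):
        self.queue = deque()
        self.bitmap = [False] * num_neurons   # one bit per neuron label

    def push(self, label):
        # Enqueue only if the neuron is not already pending.
        if not self.bitmap[label]:
            self.bitmap[label] = True
            self.queue.append(label)

    def cal_freq(self, final_voltage, v_thrd):
        """After all spike-input packets are processed, read final membrane
        voltages and emit (label, frequency) spike-input packets."""
        packets = []
        while self.queue:
            j = self.queue.popleft()
            self.bitmap[j] = False
            if final_voltage[j] >= v_thrd:    # fire only on the final state
                packets.append((j, int(final_voltage[j] // v_thrd)))
        return packets

q = DedupQueue(4)
for label in [2, 0, 2, 2, 0]:   # jittering neurons are enqueued once each
    q.push(label)
print(q.cal_freq({0: 0.4, 2: 2.3}, v_thrd=1.0))   # → [(2, 2)]
```

Neuron 0 crossed the threshold at an intermediate state but ends below it, so deferring the decision to cal_freq suppresses the spurious spike; neuron 2 is enqueued once despite three crossings.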
- 7 . The event-driven accelerator according to claim 6 , wherein a computation process of the accelerator supporting the inhibitory spiking neural network is divided into a first stage and a second stage, wherein the first stage is a process of performing spiking routing and updating the postsynaptic neurons according to the spike-input packet; and the second stage is a process in which the spike-output module computes the spiking firing frequency according to a final membrane potential level of the neurons in the output queue.
- 8 . The event-driven accelerator according to claim 1 , wherein a computation process of the accelerator supporting the inhibitory spiking neural network is divided into a first stage and a second stage, wherein the first stage is a process of performing spiking routing and updating the postsynaptic neurons according to the spike-input packet; and the second stage is a process in which the spike-output module computes the spiking firing frequency according to a final membrane potential level of the neurons in the output queue.
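The two-stage computation of claims 3, 5, 7, and 8 (route and update first, then decide firing from the final membrane state) is what suppresses spurious spikes under inhibitory weights. A minimal sketch (all names and values illustrative, not from the patent) contrasts immediate firing with the deferred second-stage decision:

```python
# Contrast: firing immediately on each threshold crossing vs. deciding
# once from the final membrane state, with a negative (inhibitory) update.
def naive_event_driven(deltas, v_thrd=1.0):
    v, spikes = 0.0, 0
    for d in deltas:            # fire on every intermediate crossing
        v += d
        if v >= v_thrd:
            spikes += 1
            v -= v_thrd
    return spikes

def deferred(deltas, v_thrd=1.0):
    v = sum(deltas)             # decide once, on the final membrane state
    return max(0, int(v // v_thrd))

updates = [0.7, 0.6, -0.9]      # excitatory inputs, then an inhibitory one
print(naive_event_driven(updates))  # 1 spike, fired at an intermediate state
print(deferred(updates))            # 0 spikes, final voltage below threshold
```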
Description
CROSS REFERENCES TO THE RELATED APPLICATIONS

This application is based upon and claims priority to Chinese Patent Application No. 202210010882.4 filed on Jan. 6, 2022, the entire contents of which are incorporated herein by reference.

TECHNICAL FIELD

The present invention relates to a spiking neural network acceleration method and an event-driven accelerator for a spiking neural network, and belongs to the technical field of spiking neural network acceleration.

BACKGROUND

Artificial neural networks are widely used in image recognition, target detection, natural language processing, and other fields. In recent years, however, artificial neural networks have developed toward deeper network levels and more complex topologies, and the resulting computation delay and power consumption severely limit their further development. The spiking neural network, known as the third-generation neural network, activates only a small portion of the network at run time and therefore naturally has low delay and low power consumption. It is thus a key technology for breaking through the development bottleneck of the traditional artificial neural network, with broad application prospects in real-time and embedded fields. A spiking neural network and an artificial neural network have similar network topologies; the difference mainly lies in the neuron model. FIG. 1 compares the artificial neuron model and the spiking neuron model, and the formulas in the figure are the computation processes of the two neurons, respectively. In the figure, X denotes input data, W denotes connection weights, y denotes output data, V denotes the neuron membrane voltage, and Vthrd denotes the neuron spiking firing threshold. Both neuron models accept multiple input data, and both require a weighted sum of the input data.
In contrast, the artificial neuron applies an activation operation to the weighted sum to obtain a final real-valued output, whereas the spiking neuron uses the weighted sum to update its membrane voltage and determines whether to output a spiking signal by comparing the membrane voltage with the firing threshold. Since a spiking neuron may fail to be activated, the spiking neural network is naturally sparse, which is likewise key to breaking through the development bottleneck of the artificial neural network. However, the spiking neural network has an additional time dimension, and the updating of neuron states carries computation dependencies across time steps. As a result, a neuron state may be updated many times over the entire time domain, with a computation amount even greater than that of an artificial neural network with the same topology. To ensure the accuracy of the computation result, traditional spiking neural network accelerators do not optimize the computation process from the aspect of the control method, which causes them to run inefficiently. Furthermore, spiking neural network hardware accelerators are divided into time-driven and event-driven accelerators according to their implementation. A time-driven accelerator scans all neuron states at the end of each time step and determines whether each neuron transmits a spiking signal; this approach is logically simple to implement but performs a large number of redundant computations. An event-driven accelerator transmits a spiking signal when a neuron membrane voltage exceeds the threshold, and can reduce the computation amount by fully exploiting the spike sparsity of the spiking neural network.
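The two neuron models compared in FIG. 1 can be sketched side by side. This is a minimal illustration assuming a ReLU activation and a leaky integrate-and-fire update with made-up parameter names (`v_thrd`, `leak`, `tau`):

```python
# An artificial neuron emits a real value; a spiking neuron integrates the
# weighted sum into a membrane voltage across time steps and emits binary
# spikes when the voltage crosses the threshold.
def artificial_neuron(x, w):
    return max(0.0, sum(xi * wi for xi, wi in zip(x, w)))  # ReLU activation

def spiking_neuron(spike_trains, w, v_thrd=1.0, leak=0.1, tau=2.0):
    v, out = 0.0, []
    for s_t in spike_trains:                # one binary vector per time step
        v += (sum(si * wi for si, wi in zip(s_t, w)) + leak) / tau
        if v >= v_thrd:                     # fire and subtract the threshold
            out.append(1)
            v -= v_thrd
        else:
            out.append(0)
    return out

print(spiking_neuron([[1, 0], [1, 1], [0, 1]], [0.8, 0.6]))  # → [0, 1, 0]
```

Note that the spiking neuron's state at step t depends on its state at step t-1, which is exactly the cross-time-step dependency the approximate computation model eliminates.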
However, because connection weights in an inhibitory spiking neural network can be negative, two error conditions may occur when a traditional event-driven accelerator runs an inhibitory spiking neural network: 1) the neuron membrane voltage floats near the threshold, so that the spiking neuron transmits spiking signals multiple times within the same time step; and 2) the neuron membrane voltage exceeds the threshold at some intermediate state but the final state is below the threshold, causing the neuron to erroneously transmit a spiking signal. These two cases are collectively called the spiking jitter problem. Spiking jitter is an error state in the computation process of the spiking neural network and may cause an erroneous final output. Therefore, existing event-driven accelerators cannot support the inhibitory spiking neural network.

SUMMARY

For the defects of the prior art, the first objective of the present invention is to provide a spiking neural network acceleration method for constructing an approximate computation model according t