CN-116992934-B - Efficient memristor neural network in-situ training system
Abstract
The invention discloses an efficient memristor neural network in-situ training system comprising a model database, an input module and a model editing module. For a given input, the memristor neural network processing model corresponding to the input data is extracted from the model database, an optimized memristor neural network in-situ training model is built and trained, and the trained model is stored in a memory for later use. The in-situ training model represents the network weights with a dual memristor array and simulates the non-ideal characteristics of memristors through VTEAM memristor neural networks. The effectiveness and robustness of the proposed in-situ training scheme are verified through simulation experiments. By combining the hardware with an optimized neural network training algorithm, high-accuracy, high-efficiency and easily operated in-situ training of memristor neural networks is realized.
Inventors
- WANG LIDAN
- SHEN SIYUAN
- GUO MINGJIAN
- DUAN SHUKAI
Assignees
- Southwest University (西南大学)
Dates
- Publication Date: 2026-05-05
- Application Date: 2023-08-01
Claims (7)
- 1. An efficient memristor neural network in-situ training system, characterized by comprising a model database, an input module and a model editing module, wherein VTEAM memristor neural networks and a series of memristor neural network processing models are stored in the model database; the input module is used for sending preprocessed input data to the model editing module; the model editing module is used for extracting the memristor neural network processing model corresponding to the input data from the model database, extracting the VTEAM memristor neural networks, building and training an optimized memristor neural network in-situ training model, and storing the model in a memory for later use; a memristor array module is arranged in the memristor neural network in-situ training model, the memristor array is a dual memristor array formed by connecting in parallel two n×n memristor crossbar arrays of identical structure, the two n×n crossbar arrays share a common front-end neuron set, and the output of the dual memristor array is connected to a back-end neuron set; an in-situ trainer is arranged in the dual memristor array and is provided with: means for encoding the preprocessed data to obtain an encoded voltage; means for performing neuromorphic computation on the encoded voltage and differencing the currents output by the memristor cells at corresponding coordinates of the dual memristor array; means for voltage-encoding the differenced current to obtain an encoded voltage and outputting it to the memristor cells of the next layer; means for storing the output current of the last layer of memristor cells; means for calculating an update value Δw; means for determining whether the update value Δw is greater than a threshold value; means for dynamically accumulating the update value Δw and, if it is less than or equal to the threshold, incorporating it into the update value Δw calculated for the next batch of training data; and means for, if the update value Δw is greater than the threshold, applying SET or RESET pulses to the corresponding memristor array to perform the weight update.
- 2. The system of claim 1, wherein the preprocessing in the input module is a process of performing analog-to-digital conversion on an input signal to obtain input data having characteristics of the input signal.
- 3. The efficient memristor neural network in-situ training system of claim 1, wherein before the preprocessed data is input into the memristor neural network in-situ training model, the memristor array module in the memristor neural network in-situ training model needs to be initialized, and the initialization flow is as follows: (1) set all memristors to a low-conductance state; (2) initialize the weights W_l: divide the weight range from -1 to 1 into intervals of width 2/N and number the intervals 1 to K; (3) determine the interval K_l in which the current weight lies and calculate the corresponding pulse number: when W_l < 0, determine the interval K_l of the current weight, obtain the corresponding pulse number |K_l - (N+1)/2| and apply the corresponding pulses to the negative array; when W_l > 0, determine the interval K_l of the current weight, obtain the corresponding pulse number K_l - (N+1)/2 and apply the corresponding potentiation pulses to the positive array; when W_l = 0, maintain the current state; wherein W_l denotes a weight of the l-th layer, N_{l-1} denotes the number of neurons in layer l-1, N denotes the number of initialized conductance states, K is defined similarly to N and is used only for ease of distinction, and K_l denotes the number of the weight interval to which the current weight belongs; after initialization, the weights of each layer are quantized, the quantized weights are encoded into corresponding numbers of pulses, and the pulses are applied to the memristor array.
- 4. The efficient memristor neural network in-situ training system of claim 1, wherein in the n×n memristor crossbar array all memristor array cells are set as VTEAM memristor neural networks, the VTEAM memristor neural networks are composed of VTEAM memristors, and the state variable w of each VTEAM memristor is changed by adding random noise; the voltage-current relationship of the VTEAM memristor and the change of its internal state variable are shown in formula (1) and formula (2): i(t) = G(w, v)·v(t) (1) and dw/dt = f(w, v(t)) (2), wherein w denotes the state variable inside the memristor, v(t) is the voltage applied across the memristor, i(t) denotes the current flowing through the memristor, G(w, v) denotes the conductance of the memristor, and t is time; the change of the state variable of the VTEAM memristor has a threshold voltage, as shown in formula (3): dw/dt = k_off·(v(t)/v_off − 1)^α_off·f_off(w) when v(t) > v_off > 0, dw/dt = 0 when v_on < v(t) < v_off, and dw/dt = k_on·(v(t)/v_on − 1)^α_on·f_on(w) when v(t) < v_on < 0 (3), wherein k_off, k_on, α_off and α_on are constants, v_off and v_on are the voltage thresholds, and f_off(w) and f_on(w) are window functions related to the memristor state variable w that control the change speed of the state variable w and limit its value; the relationship between the state variable w and the conductance G(w) is set as an exponential relationship, as shown in formula (4); the VTEAM memristor neural network changes the state variable w of the VTEAM memristor by adding random noise, as shown in formula (5): w′ = w + δ (5), wherein δ is a random noise term; by adding random noise to the state variable w, the VTEAM memristor is able to reproduce the random variation of the non-ideal characteristics of the memristor, wherein the non-ideal characteristics include device-to-device variations, cycle-to-cycle variations and pulse-to-pulse variations (an illustrative software sketch of this noisy VTEAM model is given after the claims).
- 5. The efficient memristor neural network in-situ training system of claim 1, wherein in the means for voltage-encoding the differenced current, the encoding rule of the voltage encoding is shown in formula (6): V = (f(x) − f_min)/(f_max − f_min)·(V_max − V_min) + V_min (6), wherein x is the differenced current, f(x) is the value of the differenced current after the activation function, f_max and f_min are respectively the maximum and minimum values after the activation function, the activation function in the whole network is ReLU, and V_max and V_min are respectively the maximum and minimum values of the voltage interval, namely 0.5 and 0; in the means for calculating the update value Δw, the error between the model output current value and the actual current value is calculated using a quadratic error function, as shown in formula (7): E = ½·(I^o − t)² (7), wherein E denotes the error value, I^o denotes the output current of the last layer of memristor cells, and t denotes the input real label; the partial derivative of formula (7) is propagated backwards, as shown in formula (8): ∂E/∂I^o = I^o − t (8); the gradient of each layer is then obtained from formula (7) through the chain rule, as shown in formula (9), wherein E_n denotes the error of the output current value of the n-th layer with respect to the actual current value, I^n denotes the output of the n-th layer, n = 1, 2, ..., o−1, and W^{n+1} denotes the weight of the (n+1)-th layer; the weight update value of each layer is calculated from the gradient value of each layer according to formulas (10) and (11), wherein W^n denotes the weight of the n-th layer and V^n denotes the input encoded voltage value of the n-th layer, n = 1, 2, ..., o−1; in the means for dynamically accumulating the update value Δw, the dynamically accumulated weight update value Δw is given by formula (12), wherein ΔW^n_p denotes the update value of the n-th layer weight in the p-th batch, η is the learning rate, λ denotes the regularization parameter of the network, and m^n_{p−1} denotes the momentum accumulated at the weight update of the (p−1)-th batch; m^n_{p−1} is calculated as shown in formula (13), wherein ΔW^n_{p−1} denotes the update value of the n-th layer weight in the (p−1)-th batch, and if the weight update amount of the (p−1)-th batch is smaller than the threshold, it is accumulated into the next weight update; in the means for determining whether the update value Δw is greater than a threshold value, the threshold is determined by a dynamic threshold function, which makes the network converge in the correct direction; the dynamic threshold function includes an SDT scheme and a CDT scheme, wherein in the SDT scheme the threshold increases with the number of iterations and its initial value is set to a non-zero value so as to limit frequent memristor weight updates, and in the CDT scheme the threshold decreases with the number of iterations (an illustrative sketch of the voltage encoding and the thresholded update is given after the claims).
- 6. The system of claim 1, wherein the memory stores the standby data so that the original data can be rapidly recovered through an inverse operation.
- 7. The efficient memristor neural network in-situ training system of claim 1, wherein each memristor neural network processing model in the series is an image memristor neural network processing model, a sine wave memristor neural network processing model, or a three-dimensional image memristor neural network processing model.
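
Claim 4 describes VTEAM memristor cells whose state variable w is perturbed by random noise to emulate non-ideal behaviour. The Python sketch below is an illustrative rendering of that idea, not the patented implementation: the values of k_on/k_off, alpha_on/alpha_off, the thresholds v_on/v_off, the Gaussian noise, the normalization of w and the exponential conductance mapping are all placeholder assumptions.

```python
import numpy as np

class NoisyVTEAM:
    """Illustrative VTEAM-style memristor cell with additive noise on the
    state variable w (claim 4). All numeric parameters are placeholder
    assumptions; the window functions are replaced by hard clipping of w."""

    def __init__(self, k_off=0.02, k_on=-0.02, alpha_off=3, alpha_on=3,
                 v_off=0.2, v_on=-0.2, g_min=1e-6, g_max=1e-4, noise_std=0.005):
        self.k_off, self.k_on = k_off, k_on
        self.alpha_off, self.alpha_on = alpha_off, alpha_on
        self.v_off, self.v_on = v_off, v_on
        self.g_min, self.g_max = g_min, g_max
        self.noise_std = noise_std
        self.w = 0.0                    # state variable, normalized to [0, 1]

    def _dw(self, v):
        # Threshold-type state equation (formula 3): no change between thresholds.
        if v > self.v_off:
            return self.k_off * (v / self.v_off - 1.0) ** self.alpha_off
        if v < self.v_on:
            return self.k_on * (v / self.v_on - 1.0) ** self.alpha_on
        return 0.0

    def apply_pulse(self, v, dt=1.0):
        # Update w, add random noise (formula 5), and clip to the valid window.
        self.w += self._dw(v) * dt
        self.w += np.random.normal(0.0, self.noise_std)   # non-ideal variation
        self.w = float(np.clip(self.w, 0.0, 1.0))

    def conductance(self):
        # Exponential w-to-conductance mapping (formula 4), assumed form.
        return self.g_min * (self.g_max / self.g_min) ** self.w

    def current(self, v):
        # Voltage-current relation (formula 1): i(t) = G(w) * v(t).
        return self.conductance() * v

cell = NoisyVTEAM()
for _ in range(10):
    cell.apply_pulse(0.5)               # SET-like pulses above v_off
print(cell.w, cell.conductance(), cell.current(0.1))
```

Because the noise is injected directly into w, repeated identical pulses yield slightly different conductances, which is how the sketch stands in for device-to-device, cycle-to-cycle and pulse-to-pulse variation.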
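Claims 1 and 5 describe voltage encoding of the differenced currents, the thresholded and dynamically accumulated weight update, and the SDT/CDT dynamic threshold schedules. The NumPy fragment below sketches those pieces under assumptions: the min-max mapping into [0, 0.5] V follows the symbols defined in claim 5, while the carry-over accumulation and the linear threshold schedules are simplified stand-ins for the patent's formulas (12)-(13); all numeric hyperparameters are placeholders.

```python
import numpy as np

V_MIN, V_MAX = 0.0, 0.5                 # voltage interval from claim 5

def encode_voltage(i_diff):
    """Min-max encode the ReLU-activated differential currents into
    [V_MIN, V_MAX] (formula 6 as reconstructed from claim 5's symbols)."""
    f = np.maximum(i_diff, 0.0)                      # ReLU activation
    f_min, f_max = f.min(), f.max()
    if f_max == f_min:                               # avoid division by zero
        return np.full_like(f, V_MIN)
    return (f - f_min) / (f_max - f_min) * (V_MAX - V_MIN) + V_MIN

def sdt_threshold(iteration, theta0=0.01, rate=1e-4):
    """SDT scheme: non-zero initial threshold that grows with the iteration
    count, limiting frequent weight updates (schedule shape assumed)."""
    return theta0 + rate * iteration

def cdt_threshold(iteration, theta0=0.05, rate=1e-4):
    """CDT scheme: threshold shrinks as training progresses (shape assumed)."""
    return max(0.0, theta0 - rate * iteration)

def thresholded_update(delta_w, carry, threshold):
    """Accumulate small updates; only program weights whose accumulated update
    exceeds the threshold (claim 1), the rest is carried into the update of the
    next batch (claim 5, formulas 12-13, simplified)."""
    total = delta_w + carry
    program_mask = np.abs(total) > threshold          # apply SET/RESET pulses here
    applied = np.where(program_mask, total, 0.0)
    new_carry = np.where(program_mask, 0.0, total)    # carried to the next batch
    return applied, new_carry

# Toy usage: one batch of updates for a 4x3 weight layer.
rng = np.random.default_rng(0)
delta_w = rng.normal(0.0, 0.02, size=(4, 3))          # gradient-based update values
carry = np.zeros_like(delta_w)
applied, carry = thresholded_update(delta_w, carry, sdt_threshold(iteration=10))
print("weights to program:\n", applied)
print("carried to next batch:\n", carry)
print("encoded voltages:", encode_voltage(rng.normal(size=5)))
```

The carry array plays the role of the momentum term m^n_{p−1}: updates that do not cross the threshold are not written to the hardware but are added to the update computed for the next batch.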
Description
Efficient memristor neural network in-situ training system

Technical Field
The invention belongs to the field of artificial intelligence and particularly relates to an in-situ training system based on a memristor neural network.

Background
In recent years, algorithms and applications based on artificial intelligence have made tremendous progress thanks to the rapid increase in computing power. However, with the memory wall and the increasingly prominent limits of Moore's law, computing platforms based on the conventional von Neumann architecture exhibit clear limitations. Because neuromorphic computing fuses memory and computation, it is considered a new approach to replace the traditional von Neumann structure. Hardware implementations of neural networks have therefore received a great deal of attention for their low power consumption and fast parallel computing capability. A neural network involves a large number of matrix multiplication operations; in a memristor-based neural network the multiply-and-accumulate operations of these matrices can be realized efficiently using Ohm's law and Kirchhoff's law, while the weights of the neural network can be stored directly in the memristor crossbar array. However, existing memristors typically exhibit a large number of non-ideal characteristics, such as device-to-device variations, cycle-to-cycle variations and pulse-to-pulse variations, which introduce uncertainty into the memristor neural network weight encoding and can lead to catastrophic failure of in-situ training. Therefore, an efficient memristor neural network in-situ training system is designed to address problems such as the difficulty of controlling memristor conductance updates during in-situ training; the system combines hardware with an optimized neural network training algorithm to realize efficient in-situ training of memristor neural networks.

Disclosure of the Invention
In order to solve the problems in the background art, the invention provides an efficient memristor neural network in-situ training system. The technical scheme is as follows: the efficient memristor neural network in-situ training system comprises a model database, an input module and a model editing module, wherein VTEAM memristor neural networks and a series of memristor neural network processing models are stored in the model database. The input module is used for sending the preprocessed input data to the model editing module. The model editing module is used for extracting the memristor neural network processing model corresponding to the input data from the model database, extracting the VTEAM memristor neural networks, building and training an optimized memristor neural network in-situ training model, and storing the model in a memory for later use. Preferably, the input module preprocesses the input signal, i.e. performs analog-to-digital conversion, to obtain a digital signal carrying the characteristics of the input signal. With this structure, the system can build a corresponding memristor neural network in-situ training model for the given input signal type; the model performs in-situ training accurately and efficiently, the preprocessed digital signal undergoes neuromorphic computation, and the data of the input signal are saved in the memory in the form of currents for later use.
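The Background notes that matrix multiply-accumulate operations can be realized with Ohm's law and Kirchhoff's law on a memristor crossbar, and the abstract and claim 1 describe representing signed weights with a dual array. The sketch below is a minimal software illustration of that idea; the conductance range, the array size and the linear weight-to-conductance mapping W ≈ G_pos − G_neg are assumptions chosen for clarity, not values from the patent.

```python
import numpy as np

# Minimal sketch of a dual-memristor-crossbar matrix-vector multiply.
# Each signed weight is the difference of two conductances, one per crossbar.
G_MIN, G_MAX = 1e-6, 1e-4          # assumed conductance range in siemens

def weights_to_conductances(W):
    """Map a signed weight matrix onto two non-negative conductance arrays."""
    scale = (G_MAX - G_MIN) / np.abs(W).max()
    G_pos = np.where(W > 0, G_MIN + W * scale, G_MIN)
    G_neg = np.where(W < 0, G_MIN - W * scale, G_MIN)
    return G_pos, G_neg

def crossbar_mvm(G_pos, G_neg, v_in):
    """Ohm's law gives each cell current G*v; Kirchhoff's law sums the column
    currents; the differential current of the two arrays encodes W^T @ v_in."""
    i_pos = G_pos.T @ v_in
    i_neg = G_neg.T @ v_in
    return i_pos - i_neg

W = np.random.randn(4, 3)           # 4 front-end neurons, 3 back-end neurons
v = np.random.rand(4) * 0.5         # encoded input voltages in [0, 0.5] V
G_pos, G_neg = weights_to_conductances(W)
print(crossbar_mvm(G_pos, G_neg, v))                     # differential currents
print(W.T @ v * (G_MAX - G_MIN) / np.abs(W).max())       # ideal result (scaled)
```

The two printed vectors agree up to the fixed conductance scale, which is the property the in-situ training scheme relies on when it differences the column currents of the positive and negative arrays.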
Before the input data is fed into the memristor neural network in-situ training model, the memristor array module in the model needs to be initialized. The initialization proceeds as follows: (1) set all memristors to a low-conductance state; (2) initialize the weights W_l: divide the weight range from -1 to 1 into intervals of width 2/N and number the intervals 1 to K; (3) determine the interval K_l in which the current weight lies and calculate the corresponding pulse number: when W_l < 0, determine the interval K_l of the current weight, obtain the corresponding pulse number |K_l - (N+1)/2| and apply the corresponding pulses to the negative array; when W_l > 0, determine the interval K_l of the current weight, obtain the corresponding pulse number K_l - (N+1)/2 and apply the corresponding potentiation pulses to the positive array; when W_l = 0, keep the current state. Here W_l denotes a weight of the l-th layer, N_{l-1} denotes the number of neurons in layer l-1, N denotes the number of initialized conductance states, K is defined similarly to N and is used only for ease of distinction, and K_l denotes the number of the weight interval to which the current weight belongs. After initialization, the weights of each layer are quantized, the quantized weights are encoded into corresponding numbers of pulses, and the pulses are applied to the memristor array.
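The initialization flow above maps each quantized weight interval to a pulse count |K_l - (N+1)/2| applied to the positive or negative array. The fragment below is a minimal sketch of that mapping; the interval numbering and the particular choice of N are assumptions consistent with the text, and pulse application is represented only by returning a signed count (positive counts for the positive array, negative counts for the negative array).

```python
import numpy as np

def init_pulse_counts(W, N=17):
    """Quantize weights in [-1, 1] into N intervals of width 2/N and return the
    signed pulse count for each weight: positive counts target the positive
    array, negative counts target the negative array, zero keeps the state."""
    edges = np.linspace(-1.0, 1.0, N + 1)             # interval boundaries
    K = np.clip(np.digitize(W, edges), 1, N)          # interval number K_l in 1..N
    pulses = K - (N + 1) / 2.0                        # K_l - (N+1)/2
    pulses = np.where(W == 0, 0.0, pulses)            # W_l = 0 -> keep current state
    return np.rint(pulses).astype(int)

W = np.array([-0.9, -0.1, 0.0, 0.1, 0.9])
print(init_pulse_counts(W))     # e.g. [-8 -1  0  1  8] for N = 17
```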