CN-122021742-A - Variable-precision Tanh activation function fitting method

CN122021742A

Abstract

The invention discloses a fitting method for a variable-precision Tanh activation function. Based on the characteristics of the Tanh function, the positive half-axis input is divided into a linear region (0, 1.156) and a saturation region (1.156, +∞). A target absolute precision ε0 is set to determine the demarcation point x0 between the linear function y = x and a trapezoidal function: y = x is used for fitting in the linear region, piecewise iteration in the trapezoidal region keeps the error within ε0, and an optimized constant temp_set is used for fitting in the saturation region. Negative half-axis fitting, or discarding of negative inputs, is realized by exploiting the odd symmetry of the Tanh function.
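For illustration only, the pipeline described in the abstract can be sketched in software. The chord (linear-interpolation) form of the trapezoidal segments, the sampled error check, and the saturation constant temp_set = 0.98 are assumptions made here; the patent determines temp_set experimentally and targets a hardware datapath, not floating-point code:

```python
import math

def demarcation_point(eps0, hi=1.156):
    """Bisection for x0 with x0 - tanh(x0) = eps0 (x - tanh(x) is
    monotonically increasing on the positive half-axis)."""
    lo = 0.0
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if mid - math.tanh(mid) < eps0:
            lo = mid
        else:
            hi = mid
    return lo

def build_segments(x0, eps0, x_end=1.156):
    """Split (x0, x_end) into n = 2**k parts, doubling k until every
    chord of tanh stays within eps0 of the true curve (sampled check)."""
    for k in range(12):
        n = 2 ** k
        xs = [x0 + (x_end - x0) * i / n for i in range(n + 1)]
        ok = True
        for a, b in zip(xs, xs[1:]):
            slope = (math.tanh(b) - math.tanh(a)) / (b - a)
            for t in range(1, 10):          # sample interior points of the chord
                x = a + (b - a) * t / 10.0
                if abs(math.tanh(a) + slope * (x - a) - math.tanh(x)) > eps0:
                    ok = False
        if ok:
            return xs
    raise ValueError("eps0 too small for this sketch")

def fit_tanh(x, eps0=0.03, temp_set=0.98):
    """Software sketch of the variable-precision piecewise tanh fit."""
    if x < 0:                                # odd symmetry: tanh(-x) = -tanh(x)
        return -fit_tanh(-x, eps0, temp_set)
    x0 = demarcation_point(eps0)
    if x <= x0:                              # linear region: y = x
        return x
    if x <= 1.156:                           # trapezoidal region: chords of tanh
        xs = build_segments(x0, eps0)
        for a, b in zip(xs, xs[1:]):
            if x <= b + 1e-12:
                slope = (math.tanh(b) - math.tanh(a)) / (b - a)
                return math.tanh(a) + slope * (x - a)
    return temp_set                          # saturation region: fixed constant
```

In this sketch the linear-region error is bounded by ε0 by construction of x0, and the trapezoidal region doubles its segment count until the sampled chord error also falls within ε0, mirroring the iterative adjustment described above.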

Inventors

  • SHA JIN
  • JIAO CHENGLONG

Assignees

  • Nanjing University

Dates

Publication Date
2026-05-12
Application Date
2025-11-06

Claims (5)

  1. A fitting method for a variable-precision Tanh activation function, used in neural-network terminal deployment scenarios to meet the inference-stage requirements of low power consumption, small area, and high-speed realization of the Tanh activation function, characterized by comprising the steps of: (1) dividing the positive half-axis input of the Tanh function into a linear region (0, 1.156) and a saturation region (1.156, +∞), wherein the linear region is the main functional region in which the Tanh function is approximately linear, and the saturation region is the region in which the function value is approximately 1; (2) setting a target absolute precision ε0, calculating the demarcation point x0 at which the error between the linear function y = x and the original Tanh function equals ε0, fitting directly with y = x in the linear region (0, x0), dividing the input of the trapezoidal region (x0, 1.156) into n = 2^k parts and iteratively adjusting the number of segments so that the fitting error of each segment does not exceed ε0, and fitting the saturation region with a fixed constant temp_set; (3) negative half-axis processing: using the odd symmetry of the Tanh function to obtain the negative half-axis output by symmetric transformation of the positive half-axis fitting result, or discarding negative inputs according to hardware resource requirements.
  2. The method for fitting a variable-precision Tanh activation function according to claim 1, wherein the boundary point 1.156 between the linear region and the saturation region corresponds to the input at which the Tanh function value equals 0.8197, the linear region is fitted by y = x, and the trapezoidal region is fitted by a piecewise trapezoidal function.
  3. The method for fitting a variable-precision Tanh activation function according to claim 1, wherein the piecewise trapezoidal function is generated by setting the required absolute error precision ε0, determining the step size of x on the abscissa, and iterating these steps until the absolute error does not exceed ε0.
  4. The method of claim 1, wherein the saturation-region fixed constant temp_set is determined by running test experiments on an AI model and traversing a plurality of candidate constant values to optimize the model output.
  5. The method for fitting a variable-precision Tanh activation function according to claim 1, wherein, when hardware resource requirements are strict, the negative-number processing can be omitted and only the positive half-axis of the original function fitted.
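Claim 4 leaves temp_set to experiment: candidate constants are swept and the one giving the best model output is kept. Lacking an AI model here, the following minimal stand-in sweep (an assumption, not the patented procedure) scores each candidate by its worst-case absolute error against tanh over a truncated saturation range [1.156, 4], which illustrates the traversal without reproducing the patent's experiment:

```python
import math

def sweep_temp_set(lo=1.156, hi=4.0, samples=400):
    """Traverse candidate constants (claim 4's sweep, with a stand-in
    objective): keep the constant minimizing the worst-case
    |tanh(x) - c| over a sampled saturation range [lo, hi]."""
    xs = [lo + (hi - lo) * i / samples for i in range(samples + 1)]
    best_c, best_err = None, float("inf")
    for step in range(800, 1001):            # candidates 0.800 .. 1.000
        c = step / 1000.0
        err = max(abs(math.tanh(x) - c) for x in xs)
        if err < best_err:
            best_c, best_err = c, err
    return best_c, best_err

c, err = sweep_temp_set()
print(f"temp_set ~ {c}, worst-case error ~ {err:.4f}")
```

Under this stand-in objective the winner sits near the midpoint of tanh(1.156) ≈ 0.8197 and tanh(4) ≈ 0.9993; the patent instead scores candidates by the accuracy of the deployed model, which can favor a different constant.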

Description

Variable-precision Tanh activation function fitting method

1. Technical field

The invention relates to the intersection of neural network technology and integrated circuit design, in particular to a fitting method and hardware realization of a variable-precision Tanh activation function. It is particularly suitable for neural network deployment scenarios with severe constraints on power consumption, area, and delay, such as Internet-of-Things terminals and portable equipment, and can be applied to the integrated circuit design of AI models that use the Tanh activation function for tasks such as image classification and handwritten digit recognition.

2. Background art

Neural network models have shown excellent feature extraction and classification capability in deep learning, and terminal deployment (such as mobile phones, wearable equipment, and Internet-of-Things sensors) has become an industry trend. However, terminal devices are limited by battery capacity, physical size, and other factors, which places a core demand of "low power consumption, small area, high speed" on the hardware implementation of neural networks. The activation function is a core unit of a neural network, responsible for mapping linearly transformed inputs into nonlinear outputs so as to enhance the expressive capacity of the model. The Tanh function, as a common activation function, has the advantages of a zero output mean (avoiding data distribution shift) and excellent nonlinear characteristics (its derivative range is (0, 1), alleviating gradient vanishing), but its analytic formula tanh(x) = (e^x − e^(−x)) / (e^x + e^(−x)) is computationally complex: a hardware implementation requires a large number of multipliers, adders, and storage modules, leading to high resource consumption and high operation latency. The existing Tanh activation function approximation methods have the following defects: 1. The Taylor expansion method is accurate near x = 0, but its absolute error exceeds 0.2 at x = 1.156 (the boundary point between the linear region and the saturation region), and a fifth-order Taylor expansion requires 4 multipliers and 2 adders, so hardware resource consumption is high. 2. The piecewise linear method fits with multiple linear segments, but each segment needs 1 multiplier and 1 adder, so the hardware cost rises sharply as the number of segments increases, making it difficult to adapt to terminal scenarios. 3. The step function method stores function values in a lookup table, but the relative error is excessive near x = 0 (the main functional region of Tanh); guaranteeing precision requires a large increase in the number of segments, wasting storage resources. 4. Existing methods cannot flexibly adjust the fitting precision according to terminal requirements, making it difficult to balance the trade-off between model accuracy and hardware performance. Therefore, a variable-precision, hardware-efficient fitting method for the Tanh activation function is needed to meet the comprehensive requirements of neural network terminal deployment.

3. Summary of the invention

(I) Object of the invention

Aiming at the defects of the prior art, the invention provides a fitting method and hardware realization of a variable-precision Tanh activation function, solving the problems of fixed precision, large hardware resource consumption, and low operation speed in existing methods, achieving adjustable precision, low resource consumption, and high speed, and adapting to the requirements of different terminal equipment.

(II) Technical scheme

1. Variable-precision Tanh activation function fitting method

1) Dividing the region: based on the shape of the Tanh function curve, the positive half-axis input is divided into a linear region (0, 1.156) and a saturation region (1.156, +∞). The linear region is the main functional region in which the Tanh function is approximately linear and is sensitive to changes in the input, while the saturation-region function value approaches 1 and can be approximated by a constant.

2) Setting the precision parameter: a target absolute precision ε0 is set (such as 0.03, 0.05, or 0.1); the smaller ε0 is, the higher the fitting precision and the higher the hardware resource consumption.

3) Calculating the demarcation point: the absolute error ε = |tanh(x) − x| between the linear function y = x and the original Tanh function is calculated; the x value at which ε = ε0 is the demarcation point x0 between the linear region and the trapezoidal region. For example, ε0 = 0.03 gives x0 = 0.4375, and ε0 = 0.1 gives x0 = 0.75.

4) Segment fitting: in the linear region (0, x0), y = x is fitted directly without additional hardware modules, the error not exceeding