CN-121997851-A - Low-power consumption design method for application specific integrated circuit
Abstract
The invention relates to the technical field of semiconductor integrated circuits and discloses a low-power design method for an application-specific integrated circuit (ASIC). In the method, an analog-domain linear operation unit at the front end processes the sensor's analog signal directly; after high-precision analog-to-digital conversion, a sparsity detection and structured encoding module generates a sparse tensor; a back-end sparse computing engine operates only on the non-zero elements; and a dynamic voltage and frequency scaling controller independently regulates the supply voltage and operating frequency of each module according to task load, sparsity, and temperature. Through this analog-digital cooperative architecture and end-to-end sparse processing, the invention reduces overall power consumption and improves the measured energy-efficiency ratio.
Inventors
- ZENG HONG
- ZENG LIUTING
- LUO YUAN
- LIU KAI
- WEI GUANZHEN
- DOU FENG
Assignees
- 广西中乾信息技术有限公司
Dates
- Publication Date
- 20260508
- Application Date
- 20260129
Claims (10)
- 1. A low-power design method for an application-specific integrated circuit, characterized by comprising the following steps: arranging an analog-domain linear operation unit in a front-end sensing module of the application-specific integrated circuit, the unit directly performing linear operations such as convolution, matrix multiplication, or filtering on the raw analog signal from a sensor; converting the output signal of the analog-domain linear operation unit into a digital signal through an analog-to-digital converter, and inputting the digital signal to a sparsity detection and structured encoding module; performing, by the sparsity detection and structured encoding module, non-zero element identification, position index generation, and compression encoding on the received digital signal to form a sparse tensor data structure; inputting the sparse tensor data structure to a back-end digital processing core configured with a sparse computing engine, which performs subsequent nonlinear activation, pooling, or classification operations only on the non-zero elements and their corresponding position indexes; and arranging a dynamic voltage and frequency scaling controller in the application-specific integrated circuit, the controller applying independent supply-voltage and operating-frequency adjustment instructions to the analog-domain linear operation unit, the analog-to-digital converter, and the sparse computing engine, respectively, according to the current task load intensity, the sparsity ratio, and real-time temperature feedback.
- 2. The low-power design method for an application-specific integrated circuit according to claim 1, wherein arranging the analog-domain linear operation unit in the front-end sensing module of the application-specific integrated circuit comprises: configuring the analog-domain linear operation unit to consist of a transconductance amplifier array, a programmable capacitor network, and a current-mirror replication circuit; the transconductance amplifier array receives a differential analog voltage signal from an image sensor or a radio-frequency front end and converts it into an output current proportional to the input voltage; the programmable capacitor network is placed at the output node of the transconductance amplifier and performs weighted summation of the input currents through a capacitance weight matrix, thereby completing the convolution-kernel operation in the analog domain; and the current-mirror replication circuit copies the weighted total current at a preset ratio and feeds it to the input of a subsequent cascaded analog processing unit or of the analog-to-digital converter.
- 3. The low-power design method for an application-specific integrated circuit according to claim 2, wherein converting the output signal of the analog-domain linear operation unit into a digital signal through the analog-to-digital converter comprises: connecting the current output of the analog-domain linear operation unit to the input of the analog-to-digital converter through a transimpedance amplifier, which converts the current signal into a voltage signal; the analog-to-digital converter adopts a successive-approximation-register architecture with a dynamically adjustable reference voltage range; and transmitting the converted digital samples to the sparsity detection and structured encoding module over a parallel bus.
- 4. The low-power design method for an application-specific integrated circuit according to claim 3, wherein the sparsity detection and structured encoding module performing non-zero element identification, position index generation, and compression encoding on the received digital signal comprises: comparing, by a threshold comparator array, the digital samples of each data channel point by point with a preset activation threshold, and outputting a binarized activation flag; for each activation flag that is true, generating, by an address encoder, a three-dimensional coordinate index according to the corresponding channel index and time-sequence frame number; and writing, through a compression buffer, all non-zero values and their three-dimensional coordinate indexes into contiguous storage blocks in row-major order to form a data structure in compressed sparse row format.
- 5. The low-power design method for an application-specific integrated circuit according to claim 4, wherein inputting the sparse tensor data structure to the back-end digital processing core comprises: parsing, by a sparse instruction decoder, the control header information in the compressed sparse row format data structure to generate read address sequences for a non-zero data cache and a position index cache; storing the non-zero values in the non-zero data cache; storing the three-dimensional coordinate indexes in the position index cache; and performing the subsequent nonlinear activation, pooling, or classification operations by a sparse operation unit array consisting of 16 parallel processing units, each processing unit including a multiply-add operator, a look-up table, and conditional jump logic.
- 6. The low-power design method for an application-specific integrated circuit according to claim 5, wherein the sparse operation unit array starts operating only when valid non-zero data and the corresponding position index are received, and otherwise enters a low-power sleep state, comprising: in the sleep state, gating off the clock of the processing unit and cutting off its power domain; the multiply-add operator supports 16-bit fixed-point operations; and the look-up table stores piecewise-linear approximations of the nonlinear activation functions, with a query delay of less than 2 nanoseconds.
- 7. The low-power design method for an application-specific integrated circuit according to claim 6, wherein the dynamic voltage and frequency scaling controller applying independent supply-voltage and operating-frequency adjustment instructions to each functional module according to the current task load intensity, the sparsity ratio, and real-time temperature feedback comprises: counting, by a task load monitor, the number of active processing units of the sparse computing engine per unit time, and outputting a load intensity level; calculating, by a sparsity calculator, the sparsity ratio of the current data stream in real time as the ratio of the total number of non-zero elements in the compression buffer to the theoretical maximum number of elements; reading, through a temperature sensor interface, the digital outputs of 3 on-chip temperature sensors distributed in the key hot zones of the chip; and looking up, by a multi-domain power management unit, the corresponding power supply parameters in a preset voltage-frequency lookup table according to the task load intensity level, the sparsity ratio interval, and the temperature threshold, and outputting independent supply voltages and synchronous clock signals to the analog-domain linear operation unit, the analog-to-digital converter, and the sparse computing engine, respectively, through an on-chip low-dropout regulator.
- 8. The low-power design method for an application-specific integrated circuit according to claim 7, wherein the weights of the programmable capacitor network of the analog-domain linear operation unit are configured by a nonvolatile memory cell array, comprising: the nonvolatile memory cell array adopts a ferroelectric capacitor structure, each memory cell being formed by a ferroelectric capacitor connected in series with a selection transistor; and weights are written by applying polarization-reversal voltage pulses, the weight information is retained after power-off, and no reloading is needed after power-on.
- 9. The low-power design method for an application-specific integrated circuit according to claim 8, wherein the back-end digital processing core of the application-specific integrated circuit further comprises a sparse data stream scheduler, which reorders the processing sequence of the non-zero elements according to the dimension information of the sparse tensor and the topology of the computation graph, so that, as far as possible, adjacent processing units access data blocks with strong spatial locality close together in time, thereby reducing the data movement energy consumption of the on-chip interconnection network.
- 10. The low-power design method for an application-specific integrated circuit according to claim 9, wherein the application-specific integrated circuit integrates an off-chip high-bandwidth memory interface on the package substrate, which, when an ultra-large-scale sparse tensor is processed, temporarily stores the portion of data exceeding the on-chip buffer capacity in external memory, with a direct memory access controller performing background data prefetching and write-back so that the processing pipeline does not stall while waiting for data.
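The encoding path of claims 4 to 6 can be illustrated with a small behavioral model. This is only a sketch under stated assumptions, not the claimed circuit: the names `SparseFrame`, `encode_frame`, and `relu_on_nonzeros` are hypothetical; the third coordinate is assumed to be the sample position within a channel (the claim names only the channel index and the frame number); the threshold comparison is assumed to act on the sample magnitude; the row-pointer array stands in for the unspecified control header; and ReLU stands in for the unspecified nonlinear activation.

```python
from dataclasses import dataclass
from typing import List, Tuple

Coord = Tuple[int, int, int]  # (frame, channel, position); position is an assumption


@dataclass
class SparseFrame:
    """CSR-style container for one frame of thresholded samples."""
    values: List[int]     # retained sample values, in row-major order
    coords: List[Coord]   # one coordinate index per retained value
    row_ptr: List[int]    # start offset of each channel row inside `values`


def encode_frame(frame_no: int, samples: List[List[int]], threshold: int) -> SparseFrame:
    """Keep samples whose magnitude exceeds `threshold` and record their positions."""
    values: List[int] = []
    coords: List[Coord] = []
    row_ptr = [0]
    for channel, row in enumerate(samples):
        for position, sample in enumerate(row):
            if abs(sample) > threshold:  # binarized activation flag is true
                values.append(sample)
                coords.append((frame_no, channel, position))
        row_ptr.append(len(values))      # close the row, as in compressed sparse row
    return SparseFrame(values, coords, row_ptr)


def relu_on_nonzeros(frame: SparseFrame) -> List[Tuple[Coord, int]]:
    """Apply the activation to stored elements only; dropped samples cost no work."""
    return [(coord, max(value, 0)) for coord, value in zip(frame.coords, frame.values)]


if __name__ == "__main__":
    adc_samples = [
        [0, 7, 0, 0],
        [0, 0, 0, 0],
        [1, 0, -9, 12],
    ]
    sparse = encode_frame(frame_no=3, samples=adc_samples, threshold=2)
    print(sparse.values)
    print(sparse.row_ptr)
    print(relu_on_nonzeros(sparse))
```

In this example only 3 of the 12 samples survive the threshold, so the back-end loop runs 3 times instead of 12; an all-zero channel contributes nothing but a repeated row pointer.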
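The table lookup described in claim 7 could be modeled roughly as follows. Everything numeric here is a placeholder: the claims refer to a preset voltage-frequency table, load levels, sparsity ratio intervals, and a temperature threshold without giving values, so the operating points, the cut points, the 10 % ratio boundary, the 85 °C limit, and the policy of stepping down or forcing the lowest level are all assumptions made for illustration. The function and table names are hypothetical as well.

```python
from typing import Dict, List, Tuple

NUM_PES = 16  # claim 5: the sparse operation unit array has 16 processing units

OperatingPoint = Tuple[float, int]  # (supply voltage in volts, clock in MHz)

# Placeholder operating points per module; the patent gives no table values.
VF_TABLE: Dict[str, Dict[str, OperatingPoint]] = {
    "low":  {"analog": (0.60, 50),  "adc": (0.70, 50),  "sparse_engine": (0.55, 100)},
    "mid":  {"analog": (0.70, 100), "adc": (0.80, 100), "sparse_engine": (0.70, 300)},
    "high": {"analog": (0.80, 200), "adc": (0.90, 200), "sparse_engine": (0.90, 600)},
}
LEVELS = ["low", "mid", "high"]


def sparsity_ratio(nonzero_count: int, max_elements: int) -> float:
    """Ratio from claim 7: non-zero elements over the theoretical maximum count."""
    return nonzero_count / max_elements


def load_level(active_pes: int) -> int:
    """Map the active processing-unit count to a level index (assumed cut points)."""
    if active_pes <= NUM_PES // 4:
        return 0
    if active_pes <= NUM_PES // 2:
        return 1
    return 2


def select_operating_points(
    active_pes: int,
    nonzero_count: int,
    max_elements: int,
    temps_c: List[float],
    temp_limit_c: float = 85.0,
) -> Dict[str, OperatingPoint]:
    """Pick one (voltage, frequency) pair per module from the lookup table."""
    level = load_level(active_pes)
    # Few non-zero elements means little back-end work: step down one level (assumed).
    if sparsity_ratio(nonzero_count, max_elements) < 0.10:
        level = max(level - 1, 0)
    # Hottest of the on-chip sensors at or above the limit: force the lowest level (assumed).
    if max(temps_c) >= temp_limit_c:
        level = 0
    return dict(VF_TABLE[LEVELS[level]])


if __name__ == "__main__":
    points = select_operating_points(12, 50, 1000, temps_c=[60.0, 62.0, 58.0])
    for module, (volts, mhz) in points.items():
        print(f"{module}: {volts:.2f} V at {mhz} MHz")
```

The point of the sketch is the structure rather than the numbers: three independent inputs are reduced to a table key, and each module receives its own voltage and clock setting instead of one chip-wide operating point.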
Description
Low-power consumption design method for application specific integrated circuit
Technical Field
The invention belongs to the technical field of semiconductor integrated circuits, and particularly relates to a low-power design method for an application-specific integrated circuit.
Background
As semiconductor integrated circuit technology evolves toward deep-submicron and ultra-deep-submicron nodes, application-specific integrated circuits (ASICs) are widely used in vision and signal processing scenarios such as image classification, target detection, and radar echo processing, thanks to their high computing efficiency and low latency. In battery-powered equipment such as mobile terminals and Internet of Things devices in particular, low power consumption has become a core constraint that determines battery life, heat dissipation cost, and operating stability, which places strict requirements on optimizing the ASIC energy-efficiency ratio. A traditional ASIC mostly adopts an all-digital processing architecture: an analog-to-digital converter (ADC) converts the raw analog signal of the sensor into a digital signal, and digital operation units then perform linear operations such as convolution and matrix multiplication. Conventional low-power optimization mostly relies on single-dimension dynamic voltage and frequency scaling (DVFS); some mixed-signal architectures attempt to introduce analog computing units, but they do not form a complete optimization system, and processing as a whole remains predominantly digital.
The traditional all-digital architecture has the following problems. Frequent analog-to-digital conversion and high-bit-width digital operations incur a large amount of dynamic power consumption. The inherent sparsity of the data is not exploited, so invalid computation on zero elements and the movement of zero-valued data waste resources. The limitations of existing optimization approaches are also prominent: single-dimension DVFS lacks differentiated regulation of the analog and digital domains and lacks multidimensional parameter feedback; the partial mixed-signal architectures do not build a complete sparse processing path, so data movement over the on-chip interconnection network accounts for a high share of energy consumption; configuring the weights of the analog units costs time and power; and insufficient on-chip cache easily stalls the pipeline when ultra-large-scale data is processed. As a result, it is difficult to jointly optimize computing precision, power consumption, and throughput, and the requirements of high-performance, low-power scenarios cannot be met.
Disclosure of Invention
The invention provides a low-power design method for an application-specific integrated circuit, which aims to solve the technical problems that a traditional all-digital vision or signal processing ASIC consumes too much power when executing front-end dense linear operations, and that the inherent sparsity of the data cannot be effectively exploited to reduce the energy consumption of subsequent processing.
The method constructs a mixed-signal processing architecture in which the analog domain and the digital domain work cooperatively, introduces an analog computing unit with a high energy-efficiency ratio in the front-end sensing and preliminary computing stage, and combines a dynamic sparse activation mechanism with a structured data stream scheduling strategy, thereby achieving fine-grained management and control of computing resources and power consumption. The invention provides a low-power design method for an application-specific integrated circuit, which comprises the following steps: an analog-domain linear operation unit is arranged in a front-end sensing module of the application-specific integrated circuit and directly performs linear operations such as convolution, matrix multiplication, or filtering on the raw analog signal from a sensor; the output signal of the analog-domain linear operation unit is converted into a digital signal through an analog-to-digital converter, and the digital signal is input to a sparsity detection and structured encoding module; the sparsity detection and structured encoding module performs non-zero element identification, position index generation, and compression encoding on the received digital signal to form a sparse tensor data structure; the sparse tensor data structure is input to a back-end digital processing core configured with a sparse computing engine, which performs subsequent nonlinear activation, pooling, or classification operations only on the non-zero elements and their corresponding position indexes; and a dynamic voltage and frequency scaling controller is arranged in the