CN-122021913-A - Analog edge inference system and manufacturing method thereof
Abstract
The invention discloses an analog edge inference system and a manufacturing method thereof. The system comprises at least one weight parameter board; a plurality of physically fixed operation units are distributed on the substrate of each weight parameter board, and the physical geometric characteristic or material characteristic of each operation unit corresponds to a preset analog weight value. At least one strip-type packaged chip is packaged on the edge or the surface of the weight parameter board; each strip-type packaged chip is longer than it is wide, is provided with a high-density analog interface array, and is physically pressure-bonded to the operation units of the weight parameter board. The system does not need to access an external memory frequently, which greatly reduces operating power consumption and latency and avoids the memory wall problem of traditional digital architectures. It achieves efficient cooperation between analog computation and digital state management, retaining the high energy efficiency of analog computation while providing the flexibility and precision of a digital system.
Inventors
- HUANG JIAMING
Assignees
- 晶测自动化(深圳)有限公司
Dates
- Publication Date: 2026-05-12
- Application Date: 2026-02-03
Claims (10)
- 1. An analog edge inference system, comprising: at least one weight parameter board, wherein a plurality of physically fixed operation units are distributed on a substrate of each weight parameter board, and the physical geometric characteristics or material characteristics of each operation unit correspond to a preset analog weight value; and at least one strip-type packaged chip packaged at the edge or on the surface of the weight parameter board, wherein each strip-type packaged chip is longer than it is wide, is provided with a high-density analog interface array, and is physically pressure-bonded to the operation units of the weight parameter board; wherein the strip-type packaged chip comprises a mixed-signal processing circuit configured to collect analog operation signals from the weight parameter board, process them in a digital state operation unit, convert the processed results into analog drive signals, and feed the analog drive signals back to the weight parameter board.
- 2. The analog edge inference system of claim 1, wherein the analog weight values on the weight parameter board are fixed by any of the following: laser trimming, wherein a plurality of thin-film resistor units are arranged on the surface of the weight parameter board and the weight of each unit is set by laser ablation; electron-beam or ion-beam ablation or deposition that forms resistor bodies of different shapes and thicknesses; ridge-and-groove structures of different thicknesses formed by nanoimprinting; or thin-film deposition and photolithography that form resistive films of different shapes, thicknesses or compositions; wherein the analog weight values have an analog resolution of at least 4 levels.
- 3. The analog edge inference system of claim 1, wherein the mixed-signal processing circuit of the strip-type packaged chip further comprises: a plurality of analog-to-digital converters for converting the current or voltage signals output by the weight parameter board into digital data, the analog-to-digital converters having linear or logarithmic sampling characteristics; a plurality of digital register and logic units for storing the hidden state of the neural network and executing a nonlinear activation function or the gating logic operations of a state space model; and a plurality of digital-to-analog converters or pulse width modulators for restoring the processed digital signals to analog signals and outputting them to the weight parameter board.
- 4. The analog edge inference system of claim 3, wherein the digital register and logic units comprise a nonlinear operation module comprising: a plurality of lookup table units for storing preset nonlinear function curves; and a plurality of hardware multipliers for performing element-wise gated multiplication; wherein the nonlinear operation module is located between the analog-to-digital converters and the digital-to-analog converters and is used to correct the linear accumulation results from the weight parameter board.
- 5. The analog edge inference system of claim 1, wherein the strip-type packaged chip includes at least one set of high-speed serial interfaces for: synchronizing data among a plurality of strip-type packaged chips; cascading a plurality of weight parameter boards in series, layer by layer, to form the hidden layers of a large language model; and exchanging data with external input/output devices.
- 6. The analog edge inference system of claim 1, wherein the strip-type packaged chip and the weight parameter board are physically pressure-bonded and electrically connected using a flip-chip-on-glass technique or an anisotropic conductive film.
- 7. The analog edge inference system of claim 1, wherein the computing architecture of the system is a state space model or a linear attention mechanism, the weight parameter board performs static matrix multiply-add operations, and the strip-type packaged chip performs the dynamic state update at each time step.
- 8. The analog edge inference system of claim 1, wherein the weight parameter board comprises a shielding structure comprising: grounded isolation walls distributed among the plurality of operation units and formed by imprinting deep trenches in the substrate and filling them with a conductive medium; a differential operation architecture, in which the logical weight of each operation unit is formed by a pair of physically symmetrical resistor units so as to cancel ambient electromagnetic interference; and an embedded metal shielding layer located below the operation units for isolating high-frequency digital noise from the strip-type packaged chip.
- 9. The analog edge inference system of claim 1, wherein the analog weight values on the weight parameter board are set by an active-feedback micro-trimming process comprising: energy beam processing, in which any one or a combination of laser beams, electron beams or ion beams changes the physical characteristics of the operation units on the weight parameter board; combined subtractive and additive processing, in which ablation by the energy beam increases the resistance value and material deposition by the energy beam decreases the resistance value, so that the analog weight is trimmed bidirectionally; and in-process monitoring, in which the strip-type packaged chip serves as the monitoring unit, applies an excitation signal to the operation unit, collects the feedback current, and forms real-time closed-loop control with the energy beam processing device.
- 10. A method of manufacturing the analog edge inference system of claim 1, comprising the steps of: pressure-bonding the strip-type packaged chip onto a weight parameter board substrate on which a resistive film has been preformed; starting the strip-type packaged chip in a calibration mode and sampling, in parallel or serially, the electrical properties of the synapse units on the weight parameter board; ablating or depositing material on the synapse units with a controlled energy beam while the strip-type packaged chip feeds back the resistance change in real time to form closed-loop control; and, after the weight parameters of the whole board have been written, switching the strip-type packaged chip to an inference mode to execute AI model operations.
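The closed-loop trimming recited in claims 9 and 10 can be illustrated with a minimal sketch. It assumes a toy resistor model in which ablation and deposition scale the resistance multiplicatively. The names `ResistorUnit`, `measure` and `trim`, the proportional gain, and the tolerance are hypothetical and are not specified in the claims.

```python
class ResistorUnit:
    """Toy model of one operation unit; real beam and film physics are not modelled."""

    def __init__(self, ohms):
        self.ohms = ohms

    def ablate(self, fraction):
        # Assumption: removing material raises resistance by the given fraction.
        self.ohms *= 1.0 + fraction

    def deposit(self, fraction):
        # Assumption: adding material lowers resistance by the same factor.
        self.ohms /= 1.0 + fraction


def measure(unit, v_excite=1.0):
    """Monitoring step: apply an excitation voltage, read the feedback current."""
    current = v_excite / unit.ohms
    return v_excite / current


def trim(unit, target_ohms, tol=0.005, gain=0.5, max_steps=200):
    """Trim bidirectionally until the measured value is within tol of the target.

    Returns the number of beam steps that were applied.
    """
    for step in range(max_steps):
        err = (target_ohms - measure(unit)) / target_ohms
        if abs(err) <= tol:
            return step
        if err > 0:
            unit.ablate(gain * err)      # resistance too low: subtractive pass
        else:
            unit.deposit(gain * -err)    # resistance too high: additive pass
    raise RuntimeError("trim did not converge within max_steps")


unit = ResistorUnit(1000.0)
steps = trim(unit, 1500.0)
print(f"trimmed to {unit.ohms:.1f} ohms in {steps} steps")
```

A gain below 1 makes each pass correct only part of the remaining error, so the loop approaches the target from one side without overshooting. A real process would choose the step size from the beam and film characteristics.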
Description
Analog edge inference system and manufacturing method thereof
Technical Field
The invention relates to the technical field of artificial intelligence computing, and in particular to an analog edge inference system and a manufacturing method thereof.
Background
Current artificial intelligence inference systems rely primarily on the von Neumann architecture, which suffers from a pronounced "memory wall" problem and a power consumption bottleneck when performing compute-intensive tasks such as running large language models. Analog computation has an advantage in energy efficiency, because matrix multiply-add operations can be realized directly through physical laws (such as Ohm's law and Kirchhoff's law), which greatly reduces energy consumption. However, existing analog computing systems still have the following shortcomings. First, analog signals are prone to interference and attenuation during transmission, which makes multi-layer and large-scale expansion difficult. Second, they lack an efficient mechanism for storing and managing dynamic state, and therefore struggle to support modern neural network structures, such as state space models, that need to update a hidden state frequently. There is therefore a need for an edge inference system that combines the energy efficiency of analog computation with the flexibility of digital state management, so that complex AI models can be executed efficiently at low power consumption.
Disclosure of Invention
The invention aims to provide an analog edge inference system and a manufacturing method thereof that solve the technical problems of signal attenuation, difficult expansion and difficult dynamic state storage faced by existing large-model computation, so that complex AI models can be executed efficiently at low power consumption.
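The Background's remark that matrix multiply-add follows directly from Ohm's law and Kirchhoff's law can be made concrete with a short sketch. It models an idealized resistive array with no noise, wire resistance or attenuation. The function name `crossbar_mac` and the row/column layout are assumptions made for illustration and are not taken from the patent.

```python
def crossbar_mac(conductances, voltages):
    """Ideal resistive array: column current I_j = sum_i G[i][j] * V_i.

    conductances: rows of conductance values in siemens (one row per input line)
    voltages:     input voltages in volts, one per row
    """
    n_cols = len(conductances[0])
    currents = [0.0] * n_cols
    for g_row, v in zip(conductances, voltages):
        for j, g in enumerate(g_row):
            # Ohm's law gives the current through one resistor (g * v);
            # Kirchhoff's current law sums the currents on the shared column.
            currents[j] += g * v
    return currents


G = [[1.0, 2.0],
     [3.0, 4.0]]
V = [1.0, 0.5]
print(crossbar_mac(G, V))  # [2.5, 4.0]
```

Each column current is one dot product, so the whole matrix-vector product appears at the outputs at once, without fetching weights from memory.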
To achieve the above object, the present invention provides an analog edge inference system comprising: at least one weight parameter board, wherein a plurality of physically fixed operation units are distributed on a substrate of each weight parameter board, and the physical geometric characteristics or material characteristics of each operation unit correspond to a preset analog weight value; and at least one strip-type packaged chip packaged at the edge or on the surface of the weight parameter board, wherein each strip-type packaged chip is longer than it is wide, is provided with a high-density analog interface array, and is physically pressure-bonded to the operation units of the weight parameter board. The strip-type packaged chip comprises a mixed-signal processing circuit, which collects analog operation signals from the weight parameter board, processes them in a digital state operation unit, converts the processed results into analog drive signals, and feeds the analog drive signals back to the weight parameter board.
In a preferred embodiment, the analog weight values on the weight parameter board are fixed by any of the following means: laser trimming, wherein a plurality of thin-film resistor units are arranged on the surface of the weight parameter board and the weight of each unit is set by laser ablation; electron-beam or ion-beam ablation or deposition that forms resistor bodies of different shapes and thicknesses; ridge-and-groove structures of different thicknesses formed by nanoimprinting; or thin-film deposition and photolithography that form resistive films of different shapes, thicknesses or compositions. The analog weight values have an analog resolution of at least 4 levels.
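The signal loop between the weight parameter board and the strip-type packaged chip described above can be sketched as a simple simulation. This is only an illustration under stated assumptions: the board is modelled as an ideal multiply-add, and the 8-bit uniform converter, the `tanh` activation, and the names `plate_mac`, `adc`, `chip_step` and `run` are hypothetical choices that the text does not specify.

```python
import math


def plate_mac(weights, drive):
    """Static multiply-add performed by the weight parameter board (ideal model)."""
    return [sum(w * d for w, d in zip(row, drive)) for row in weights]


def adc(values, bits=8, full_scale=4.0):
    """Hypothetical uniform converter: clamp to +/-full_scale, round to a signed code."""
    top = 2 ** (bits - 1) - 1
    sampled = []
    for v in values:
        clamped = max(-full_scale, min(full_scale, v))
        sampled.append(round(clamped / full_scale * top) / top * full_scale)
    return sampled


def chip_step(weights, hidden, x):
    """One time step on the chip: drive the board, sample it, update the state."""
    drive = hidden + x                      # drive lines carry [hidden state, input]
    sampled = adc(plate_mac(weights, drive))
    return [math.tanh(v) for v in sampled]  # assumed activation, held in registers


def run(weights, inputs, state_size):
    hidden = [0.0] * state_size             # hidden state lives on the chip
    for x in inputs:
        hidden = chip_step(weights, hidden, x)
    return hidden


W = [[0.5, 0.0, 1.0],
     [0.0, 0.5, -1.0]]
print(run(W, [[1.0], [0.0], [0.0]], state_size=2))
```

The weights never change during the run; only the hidden state held on the chip does, which mirrors the split between fixed analog weights and digital state management.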
In a preferred embodiment, the mixed-signal processing circuit of the strip-type packaged chip further includes: a plurality of analog-to-digital converters for converting the current or voltage signals output by the weight parameter board into digital data, the analog-to-digital converters having linear or logarithmic sampling characteristics; a plurality of digital register and logic units for storing the hidden state of the neural network and executing a nonlinear activation function or the gating logic operations of a state space model; and a plurality of digital-to-analog converters or pulse width modulators for restoring the processed digital signals to analog signals and outputting them to the weight parameter board.
In a preferred embodiment, the digital register and logic units comprise a nonlinear operation module, wherein the nonlinear operation module comprises: a plurality of lookup table units for storing preset nonlinear function curves; and a plurality of hardware multipliers for performing element-wise gated multiplication; the nonlinear operation module is located between the analog-to-digital converters and the digital-to-analog converters and is used for correcting the linear accum