CN-121985046-A - Dynamic multi-mode packaging method, device and chip

Abstract

The application relates to the technical field of artificial intelligence chip data transmission and provides a dynamic multi-mode packaging method, device and chip. A data sending end receives an input data stream, analyzes semantic features of multiple dimensions of the input data stream in real time, and, according to the multi-dimensional feature information obtained by the analysis, dynamically decides a target packing strategy from multiple predefined packing modes together with an effective data count. A data receiving end receives the compressed data packet, selects the corresponding unpacking mode according to the target packing strategy identifier, and restores the packed data block to the original data stream. Because the application senses the semantic features of the data in real time and dynamically selects the packing strategy best suited to them, it improves the utilization and transmission efficiency of the data transmission bandwidth.

Inventors

  • WANG DAN
  • JIA SHUYUN
  • ZHOU XINLAI
  • ZHANG FAN

Assignees

  • 翼华科技(北京)股份有限公司

Dates

Publication Date
2026-05-05
Application Date
2026-04-01

Claims (10)

  1. A dynamic multi-mode packaging method for use in an on-chip or inter-chip data transfer process, the method comprising: a data sending end receiving an input data stream, analyzing semantic features of multiple dimensions of the input data stream in real time, and dynamically deciding a target packing strategy from multiple predefined packing modes, together with an effective data count, according to the multi-dimensional feature information obtained by the analysis; the data sending end packing the effective data in the input data stream according to the target packing strategy and the effective data count to generate a packed data block, and adding a packing information header in front of the packed data block to obtain a compressed data packet, wherein the packing information header comprises an identifier of the target packing strategy and the effective data count; and a data receiving end receiving the compressed data packet, parsing the packing information header to obtain the target packing strategy identifier and the effective data count, selecting a corresponding unpacking mode according to the target packing strategy identifier, and restoring the packed data block to the original data stream according to the effective data count.
  2. The method of claim 1, wherein the semantic features of the multiple dimensions include at least a sparsity feature and a quantized bit width feature, wherein the sparsity feature is obtained by a consecutive zero value counter detecting the length of runs of consecutive zero values in the input data stream, and the quantized bit width feature is obtained by a high-order sign bit detector detecting the length of the high-order extension bits of the input data so as to determine the actual effective bit width of the data.
  3. The method of claim 2, wherein dynamically deciding the target packing strategy from among the multiple predefined packing modes comprises: determining the packing strategy corresponding to the current data block through preset decision logic or a lightweight decision tree according to the detected sparsity feature and the detected quantized bit width feature; wherein the multiple predefined packing modes include at least a sparse packing mode, a quantized packing mode, a hybrid packing mode and a straight-through packing mode, the hybrid packing mode being a combined packing mode that exploits both sparsity and quantization characteristics.
  4. The method of claim 3, wherein, when the target packing strategy is the sparse packing mode, packing the effective data in the input data stream comprises: filtering out zero-value data, capturing only non-zero data, generating for each non-zero datum its position coordinate in the original data stream, and packing in the form of {coordinate, non-zero data} tuples.
  5. The method of claim 3, wherein, when the target packing strategy is the quantized packing mode, packing the effective data in the input data stream comprises: splicing a plurality of low-precision data contiguously into a high-bit-width data packet according to the detected effective bit width, and packing the result.
  6. The method of claim 3, wherein, when the target packing strategy is the hybrid packing mode, packing the effective data in the input data stream comprises: filtering out zero-value data, splicing the captured non-zero low-precision data according to the effective bit width, and packing in the form of {coordinate offset, spliced data} tuples; wherein the coordinate offset is generated as follows: when a non-zero datum is detected, its sequence number within the non-zero data sequence is generated from the number of non-zero data captured so far, and the sequence number forms a tuple together with the spliced data, so that the receiving end can determine the exact position of each non-zero datum in the original data stream from the sequence number and the effective data count.
  7. The method of claim 1, wherein selecting a corresponding unpacking mode according to the target packing strategy identifier and restoring the packed data block to the original data stream according to the effective data count comprises: routing the packed data block to a corresponding unpacking unit according to the target packing strategy identifier; if the unpacking unit corresponds to the sparse packing mode, reading {coordinate, non-zero data} tuples according to the effective data count and writing the non-zero data back to their original positions according to the coordinates, with the gaps filled with zeros; if the unpacking unit corresponds to the quantized packing mode, splitting the spliced data into the original number of low-precision data according to the effective data count and outputting the low-precision data in sequence; if the unpacking unit corresponds to the hybrid packing mode, reading {coordinate offset, spliced data} tuples according to the effective data count, splitting the spliced data into low-precision data, and writing the low-precision data back to their original positions according to the coordinate offsets; and if the unpacking unit corresponds to the straight-through packing mode, expanding the compressed data to the original bit width according to the effective data count.
  8. The method of claim 1, further comprising: reordering the unpacked data to restore the order of the original data stream.
  9. A dynamic multi-mode packaging device for performing the method of any one of claims 1 to 8, the device comprising: a feature perception front end, arranged at the data sending end and configured to receive an input data stream, analyze semantic features of multiple dimensions of the input data stream in real time, and dynamically decide and output a target packing strategy and an effective data count according to the multi-dimensional feature information obtained by the analysis; a multi-mode packing engine, arranged at the data sending end and connected to the feature perception front end, and configured to pack the effective data in the input data stream according to the target packing strategy and the effective data count to generate a packed data block, and to add a packing information header containing the target packing strategy identifier and the effective data count in front of the packed data block to obtain a compressed data packet; and a configurable unpacker, arranged at the data receiving end and in communication with the multi-mode packing engine, and configured to receive the compressed data packet, parse the packing information header to obtain the target packing strategy identifier and the effective data count, select a corresponding unpacking mode according to the target packing strategy identifier, and restore the packed data block to the original data stream according to the effective data count.
  10. A chip comprising the dynamic multi-mode packaging device of claim 9, wherein the dynamic multi-mode packaging device implements an on-chip or inter-chip data transmission process.
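
By way of non-limiting illustration, the feature detection and decision steps of claims 2 and 3 and the splicing of claim 5 might be modeled in software as follows. This is only a sketch under stated assumptions: the mode identifiers, the function names, the 16-bit bus width, the zero-run threshold of 4 and the narrow-width threshold of 8 bits are hypothetical values chosen for the example and are not specified by the application, which describes hardware detectors rather than software.

```python
from enum import IntEnum


class Mode(IntEnum):
    # Hypothetical identifier values; the application does not assign numbers.
    STRAIGHT_THROUGH = 0
    SPARSE = 1
    QUANTIZED = 2
    HYBRID = 3


def longest_zero_run(block):
    """Software stand-in for the consecutive zero value counter."""
    best = run = 0
    for v in block:
        run = run + 1 if v == 0 else 0
        best = max(best, run)
    return best


def effective_bit_width(value, bus_width=16):
    """Smallest two's-complement width holding value (sign-extension detection)."""
    magnitude = value if value >= 0 else ~value
    return min(bus_width, magnitude.bit_length() + 1)


def decide(block, bus_width=16, zero_run_threshold=4, narrow_width=8):
    """Preset decision logic; both thresholds are assumptions for illustration."""
    sparse = longest_zero_run(block) >= zero_run_threshold
    width = max((effective_bit_width(v, bus_width) for v in block), default=1)
    narrow = width <= narrow_width
    if sparse and narrow:
        return Mode.HYBRID, width
    if sparse:
        return Mode.SPARSE, width
    if narrow:
        return Mode.QUANTIZED, width
    return Mode.STRAIGHT_THROUGH, width


def splice(values, width):
    """Quantized packing: concatenate the low `width` bits of each value."""
    mask = (1 << width) - 1
    word = 0
    for k, v in enumerate(values):
        word |= (v & mask) << (k * width)
    return word


def unsplice(word, width, count):
    """Split the spliced word back into `count` sign-extended values."""
    mask = (1 << width) - 1
    sign = 1 << (width - 1)
    out = []
    for k in range(count):
        raw = (word >> (k * width)) & mask
        out.append(raw - (1 << width) if raw & sign else raw)
    return out
```

In this sketch, a block of eight values that all fit in 4 signed bits is classified as quantized and spliced into a single 32-bit word, and `unsplice` recovers the original values from that word, the detected width and the effective data count.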

Description

Dynamic multi-mode packaging method, device and chip

Technical Field

The application relates to the technical field of chips, in particular to a dynamic multi-mode packaging method, a dynamic multi-mode packaging device and a chip.

Background

In AI chips, particularly accelerators for Large Language Model (LLM) inference, on-chip or inter-chip data transfer bandwidth is one of the core bottlenecks. Two types of data compression or packing schemes are mainly adopted in industry today, notably the packer based on data bit-width compression: an input strb signal indicates the positions of the effective data in each beat, and the effective data on the input data bus are squeezed toward the low-order bits of the output. In addition, academia has proposed a number of compression techniques oriented to AI acceleration, such as entropy-aware cache compression and labeling precision detection.
However, the prior art has an obvious "semantic blind spot": it cannot perceive the semantic features of the data. For example, when consecutive zeros occur in the data stream (sparse mode), the conventional scheme still transmits these zeros and wastes bandwidth; and when the actual effective bit width of the data is much smaller than the bus bit width (e.g., INT4 quantized data), the conventional scheme still transmits at full bit width, so bandwidth utilization is low. Moreover, existing schemes generally adopt a single fixed compression or packing strategy, cannot handle data that is both sparse and quantized, and adapt poorly to the dynamically changing data characteristics of LLM inference. Finally, the existing receiving end can only unpack data according to a preset fixed format and cannot know which packing strategy the sending end used, so the system cannot adaptively adjust the transmission mode according to real-time data characteristics.

Disclosure of Invention

The embodiments of the application aim to provide a dynamic multi-mode packaging method, a dynamic multi-mode packaging device and a chip that solve the technical problems described above.
In a first aspect, the present application provides a dynamic multi-mode packaging method applied to an on-chip or inter-chip data transmission process, the method comprising:
the data sending end receives an input data stream, analyzes semantic features of multiple dimensions of the input data stream in real time, and dynamically decides a target packing strategy from multiple predefined packing modes, together with an effective data count, according to the multi-dimensional feature information obtained by the analysis;
the data sending end packs the effective data in the input data stream according to the target packing strategy and the effective data count to generate a packed data block, and adds a packing information header in front of the packed data block to obtain a compressed data packet, wherein the packing information header comprises an identifier of the target packing strategy and the effective data count; and
the data receiving end receives the compressed data packet, parses the packing information header to obtain the target packing strategy identifier and the effective data count, selects a corresponding unpacking mode according to the target packing strategy identifier, and restores the packed data block to the original data stream according to the effective data count.
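
By way of non-limiting illustration, the header-driven round trip described above might be sketched as follows for the sparse and straight-through modes. The header layout (a 1-byte strategy identifier followed by a 2-byte effective data count), the tuple layout (a 16-bit coordinate and a 16-bit signed value), the identifier values and the fixed block length of 8 known to both ends are all assumptions made for this example; the application only states that the header carries the strategy identifier and the effective data count.

```python
import struct

# Assumed layouts, not taken from the application.
HEADER = struct.Struct("<BH")  # 1-byte strategy id, 2-byte effective data count
PAIR = struct.Struct("<Hh")    # uint16 coordinate, int16 non-zero value

MODE_STRAIGHT_THROUGH = 0      # hypothetical identifier values
MODE_SPARSE = 1
BLOCK_LEN = 8                  # assumed block length agreed on by both ends


def pack_block(block, mode):
    """Sending end: build header + packed data block for one block of int16 values."""
    if mode == MODE_SPARSE:
        # Keep only non-zero values, each tagged with its original position.
        pairs = [(i, v) for i, v in enumerate(block) if v != 0]
        body = b"".join(PAIR.pack(i, v) for i, v in pairs)
        count = len(pairs)
    else:
        # Straight-through: every value is sent at full width.
        body = struct.pack(f"<{len(block)}h", *block)
        count = len(block)
    return HEADER.pack(mode, count) + body


def unpack_block(packet, block_len=BLOCK_LEN):
    """Receiving end: parse the header, then route to the matching unpacking path."""
    mode, count = HEADER.unpack_from(packet, 0)
    body = packet[HEADER.size:]
    if mode == MODE_SPARSE:
        out = [0] * block_len  # gaps are filled with zeros
        for k in range(count):
            i, v = PAIR.unpack_from(body, k * PAIR.size)
            out[i] = v
        return out
    return list(struct.unpack(f"<{count}h", body))
```

With these assumed layouts, the block `[0, 0, 0, 5, 0, 0, -3, 0]` occupies 11 bytes in sparse mode and 19 bytes in straight-through mode, and `unpack_block` restores the same eight values from either packet.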
In an alternative embodiment, the semantic features of the multiple dimensions include at least a sparsity feature and a quantized bit width feature, wherein the sparsity feature is obtained by a consecutive zero value counter detecting the length of runs of consecutive zero values in the input data stream, and the quantized bit width feature is obtained by a high-order sign bit detector detecting the length of the high-order extension bits of the input data so as to determine the actual effective bit width of the data.
In an alternative embodiment, dynamically deciding the target packing strategy from the multiple predefined packing modes includes: determining the packing strategy corresponding to the current data block through preset decision logic or a lightweight decision tree according to the detected sparsity feature and the detected quantized bit width feature; wherein the multiple predefined packing modes include at least a sparse packing mode, a quantized packing mode, a hybrid packing mode and a straight-through packing mode, the hybrid packing mode being a combined packing mode that exploits both sparsity and quantization characteristics.
In an alternative embodiment, when the target packing strategy is the sparse packing mode, packing the effective data in the input data stream includes: zero-value data is filtered out, only non-zero data is captured, and its position coordinates in the original data stream are generated for each non-zero