CN-122021718-A - Event-driven asynchronous graph neural network FPGA accelerator for real-time edge vision

Abstract

The invention belongs to the technical field of artificial intelligence chips and reconfigurable computing, and in particular relates to an event-driven asynchronous graph neural network (GNN) FPGA accelerator for real-time edge vision. To address the low storage utilization, pronounced data access bottlenecks, limited parallelism and excessive computational redundancy of conventional GNN accelerators in event vision scenarios, the invention provides a novel architecture that combines efficient graph feature storage, hierarchical graph construction and redundancy-eliminating convolution computation. By introducing parallel read/write optimization, low-dependency graph structure generation and a reusable computation caching mechanism, the invention significantly reduces computation latency and improves overall throughput while keeping resource consumption under control, thereby achieving sub-microsecond real-time inference. The accelerator is suited to applications that require low-latency visual computing on edge devices, such as autonomous driving, intelligent surveillance and robot navigation.

Inventors

HA YAJUN
LIU TIANHANG
ZHANG SHEN

Assignees

ShanghaiTech University (上海科技大学)

Dates

Publication Date: 2026-05-12
Application Date: 2026-01-08

Claims (9)

1. An event-driven asynchronous graph neural network FPGA accelerator for real-time edge vision, characterized by comprising a graph construction module, a graph feature storage module and a graph convolution module, wherein: when an event ev = (x, y, t, p) output by an event camera arrives, the graph construction module searches historical events within a preset spatial window in parallel, generates a neighbor buffer by applying a temporal constraint so as to form a local subgraph, and stores the local subgraph in the graph feature storage module, where (x, y) are the spatial coordinates of the event node, t is the timestamp and p is the polarity; the graph convolution module reads neighbor node features or cached feature terms from the graph feature storage module in parallel according to the neighbor buffer, and performs message generation, aggregation and activation update to obtain an updated feature of the target node; and the graph convolution module writes the updated feature of the target node back to the graph feature storage module for reuse by subsequent events, while the updated feature of the target node is passed through a readout layer and a fully connected layer to output an inference result.
2. The event-driven asynchronous graph neural network FPGA accelerator for real-time edge vision according to claim 1, wherein the graph features generated by the graph construction module and the graph convolution module are stored in the graph feature storage module using a graph feature storage mechanism, the graph feature storage mechanism comprising: storing the graph features using a fine-grained spatial tiling mapping mechanism, in which, taking the spatial coordinates (x, y) of an event node as input, the two-dimensional space is divided into repeating spatial tiles of a fixed size, and the positions within the spatial tiles are mapped to different storage blocks of the graph feature storage module according to a periodic rule, ensuring that the candidate features within any neighborhood search window are distributed across a plurality of storage blocks; and adopting a depth addressing and compressed storage mechanism, in which, in the time dimension, only the most recent historical event and its corresponding feature are retained for each spatial position, and, in the spatial dimension, a sparsity-aware compressed storage scheme is adopted, the interior of each storage block being organized as a feature buffer written in event arrival order so that features are stored only for positions at which events have occurred, a feature being first written into the feature buffer and the depth index of its position in the feature buffer being obtained and stored in a depth index buffer.
3. The event-driven asynchronous graph neural network FPGA accelerator for real-time edge vision according to claim 2, wherein, when the graph construction module or the graph convolution module reads the historical event features within a local window on the basis of the fine-grained spatial tiling mapping mechanism, the storage block numbers and the in-block addresses are generated in parallel from the coordinates of each position in the local window, and the candidate features are read out of the plurality of storage blocks simultaneously in the same clock cycle.
4. The event-driven asynchronous graph neural network FPGA accelerator for real-time edge vision according to claim 3, wherein the local window is a rectangular neighborhood search window whose width and height are aligned with the arrangement of the storage blocks.
5. The event-driven asynchronous graph neural network FPGA accelerator for real-time edge vision according to claim 2, wherein the graph construction module adopts a low-dependency hierarchical graph construction mechanism to complete neighborhood retrieval and construction of the local subgraph when each event ev = (x, y, t, p) arrives, comprising: after a new event ev = (x, y, t, p) arrives, establishing a spatial neighborhood search window of fixed size centered on the spatial coordinates (x, y), the candidate events within the spatial neighborhood search window being distributed across different storage blocks by the fine-grained spatial tiling mapping mechanism, so that the graph construction module reads candidate event information from a plurality of storage blocks in parallel in a single cycle and writes the candidate event information into the neighbor buffer; and performing temporal selection and valid-event screening while the candidate event information is being written into the neighbor buffer, by computing for each candidate event the time difference dt = t - τ, judging against a time threshold whether the current candidate satisfies the temporal association condition, and combining the temporal constraint with event validity to filter out invalid events, thereby obtaining the valid neighbor nodes of the current event.
6. The event-driven asynchronous graph neural network FPGA accelerator for real-time edge vision according to claim 5, wherein the graph construction module adopts a hierarchical write strategy that divides the neighbor buffer into a plurality of write levels with pre-allocated slots, wherein high-priority slots support direct parallel writing, medium-priority slots achieve partially parallel writing by round-robin allocation, and low-priority slots guarantee write completeness by sequential fill-in.
7. The event-driven asynchronous graph neural network FPGA accelerator for real-time edge vision according to claim 1, wherein the graph convolution module obtains the updated feature of the target node using a redundancy-free parallel graph convolution mechanism, comprising: according to the valid neighbor node indices recorded in the neighbor buffer, the graph convolution module simultaneously reads the cached feature terms of a plurality of neighbor nodes using the multi-storage-block parallel read mechanism of the graph feature storage module, computes the corresponding geometric terms in parallel, and adds the two to generate multiple neighborhood messages; and a pipelined aggregation structure performs the aggregation operation on the multiple neighborhood messages, and bias addition, quantization and activation function processing are applied in sequence to the aggregation result to obtain the updated feature vector of the target event node.
8. The event-driven asynchronous graph neural network FPGA accelerator for real-time edge vision according to claim 7, wherein generation of the neighborhood message is split into a neighbor feature term and a geometric term, so that the neighborhood message is expressed as the sum of the two, wherein the neighbor feature term is determined only by the feature of the neighbor node and a learnable weight matrix of the feature term and is independent of the target node, and the geometric term is determined by the relative position features of the target node and the neighbor node together with a learnable weight matrix of the geometric term and changes with the target node.
9. The event-driven asynchronous graph neural network FPGA accelerator for real-time edge vision according to claim 7, wherein, after the graph convolution module has performed the layer-k convolution and obtained the updated feature vector of the target node, the feature term that the target node will require when it serves as a neighbor node in the next-layer convolution is pre-computed immediately at the tail of the pipeline and written to a feature term buffer; when the node is used as a neighbor node j in the convolution of a subsequent event, the graph convolution module reads the corresponding neighbor feature term directly from the feature term buffer as an input to neighborhood message generation, computes online only the geometric term that changes with the target node, and adds the neighbor feature term and the geometric term to obtain the neighborhood message.
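
The tiled storage mapping of claims 2 to 4 and the temporal screening of claim 5 can be illustrated with a minimal software sketch. The tile size, sensor resolution, the threshold name MAX_DT and the function names bank_and_addr, write_event and build_neighbors are assumptions made for illustration and do not appear in the claims. The hardware is claimed to read all storage blocks in the same clock cycle, whereas the loop below visits them one after another.

```python
from typing import Dict, List, Tuple

TILE_W, TILE_H = 3, 3            # assumed tile size, equal to the search window size
WIDTH, HEIGHT = 12, 9            # assumed sensor resolution
TILES_PER_ROW = WIDTH // TILE_W
NUM_BANKS = TILE_W * TILE_H
MAX_DT = 50_000                  # assumed time threshold, in timestamp units

# One dictionary per storage block: in-block address -> (timestamp, polarity).
banks: List[Dict[int, Tuple[int, int]]] = [dict() for _ in range(NUM_BANKS)]


def bank_and_addr(x: int, y: int) -> Tuple[int, int]:
    """Periodic mapping: position inside the tile picks the bank, tile index picks the address."""
    bank = (y % TILE_H) * TILE_W + (x % TILE_W)
    addr = (y // TILE_H) * TILES_PER_ROW + (x // TILE_W)
    return bank, addr


def write_event(x: int, y: int, t: int, p: int) -> None:
    """Keep only the most recent event for each spatial position."""
    bank, addr = bank_and_addr(x, y)
    banks[bank][addr] = (t, p)


def build_neighbors(x: int, y: int, t: int) -> List[Tuple[int, int, int]]:
    """Scan the window centered on (x, y) and keep candidates with 0 <= dt <= MAX_DT."""
    neighbors = []
    for dy in range(-(TILE_H // 2), TILE_H // 2 + 1):
        for dx in range(-(TILE_W // 2), TILE_W // 2 + 1):
            nx, ny = x + dx, y + dy
            if not (0 <= nx < WIDTH and 0 <= ny < HEIGHT):
                continue                      # window position falls off the sensor
            bank, addr = bank_and_addr(nx, ny)
            entry = banks[bank].get(addr)     # each window position hits a distinct bank
            if entry is None:
                continue                      # no historical event at this position
            tau = entry[0]
            dt = t - tau
            if 0 <= dt <= MAX_DT:
                neighbors.append((nx, ny, tau))
    return neighbors


if __name__ == "__main__":
    write_event(4, 4, 100, 1)       # too old for the query below
    write_event(5, 4, 60_000, 0)    # recent and inside the window
    write_event(9, 8, 60_500, 1)    # recent but outside the window
    print(build_neighbors(5, 5, 61_000))   # [(5, 4, 60000)]
```

Because the window is as wide and as tall as one tile, its positions cover every residue of x modulo TILE_W and of y modulo TILE_H exactly once, so no two of them share a storage block. This is the property that allows conflict-free parallel reads.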

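The message decomposition of claims 7 to 9 can likewise be sketched in software. The weight values, the choice of (dx, dy, dt) as relative position features, element-wise maximum as the aggregation operator, ReLU as the activation, the 1/16 quantization step and all identifiers below are assumptions for illustration; the claims name the steps but fix none of these details.

```python
from typing import Dict, List, Tuple

Vector = List[float]
Matrix = List[List[float]]

W_FEAT: Matrix = [[1.0, 0.0], [0.0, 2.0]]            # hypothetical feature-term weights
W_GEO: Matrix = [[0.5, 0.0, 0.0], [0.0, 0.5, 0.0]]   # hypothetical geometric-term weights
BIAS: Vector = [0.0, -1.0]                           # hypothetical bias

# Feature term buffer: position -> pre-computed W_FEAT * h for that node.
feature_term_cache: Dict[Tuple[int, int], Vector] = {}


def matvec(w: Matrix, v: List[float]) -> Vector:
    return [sum(wi * vi for wi, vi in zip(row, v)) for row in w]


def store_node(pos: Tuple[int, int], feature: Vector) -> None:
    """Pre-compute the target-independent feature term once, when the node is written back."""
    feature_term_cache[pos] = matvec(W_FEAT, feature)


def quantize(v: float) -> float:
    return round(v * 16) / 16     # assumed fixed-point step of 1/16


def convolve(target: Tuple[int, int], t: int,
             neighbors: List[Tuple[int, int, int]]) -> Vector:
    """Message = cached feature term + geometric term computed online; then aggregate."""
    acc = None
    for nx, ny, tau in neighbors:
        feat_term = feature_term_cache[(nx, ny)]         # reused, not recomputed
        rel = [nx - target[0], ny - target[1], t - tau]  # depends on the target node
        geo_term = matvec(W_GEO, rel)
        msg = [f + g for f, g in zip(feat_term, geo_term)]
        acc = msg if acc is None else [max(a, m) for a, m in zip(acc, msg)]
    if acc is None:
        acc = [0.0] * len(BIAS)                          # no valid neighbors
    # Bias addition, quantization and activation, in that order.
    return [max(0.0, quantize(a + b)) for a, b in zip(acc, BIAS)]


if __name__ == "__main__":
    store_node((5, 4), [1.0, 2.0])
    store_node((4, 5), [3.0, 0.5])
    out = convolve((5, 5), 10, [(5, 4, 6), (4, 5, 8)])
    print(out)                    # [2.5, 2.5]
    store_node((5, 5), out)       # write-back for reuse by later events
```

Each matrix product with the feature weights is performed once per node rather than once per edge. Only the small geometric product is repeated for every target, which is the redundancy the claims aim to remove.
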
Description

Event-driven asynchronous graph neural network FPGA accelerator for real-time edge vision

Technical Field

The invention relates to an event-driven asynchronous graph neural network FPGA accelerator for real-time edge vision, and belongs to the technical field of artificial intelligence chips and reconfigurable computing.

Background

With its microsecond temporal resolution and sparse data output, event-based vision shows significant advantages in real-time visual scenarios such as autonomous driving, robotics and high-speed monitoring. Event-driven asynchronous graph neural networks (GNNs) capture the fine-grained spatio-temporal dependencies between events through dynamic graph construction and message-passing convolution. They are an important approach to real-time, highly robust edge vision, and are better suited to asynchronous, sparse event stream processing than conventional frame-based deep learning models [1][2].

At the hardware level, the FPGA, as a reconfigurable computing platform, is widely used in GNN accelerator research owing to its strong parallelism, customizable data paths and low energy consumption. FPGA GNN acceleration schemes such as SpGCN [3], FP-GNN [4] and Dynasparse [5] improve inference performance by optimizing sparse computation patterns, pipeline structures and parallel memory access. However, they are generally based on static or batched graphs and cannot adapt to the continuously updated dynamic graph structures of event-driven scenarios.

Hardware studies targeting event-driven GNNs for event vision have also begun to emerge. For example, EFGCN [6] significantly reduces the graph size through three-dimensional pooling so that a deep network can be deployed on an FPGA, and EvGNN [7] implements event-level local subgraph construction and asynchronous convolution, markedly reducing end-to-end latency. These studies lay a foundation for event-driven GNN acceleration, but key bottlenecks remain: low memory access efficiency, insufficient parallelism, substantial redundancy, and difficulty in reducing latency to the sub-microsecond level.

References

[1] G. Gallego, T. Delbrück, G. Orchard, C. Bartolozzi, B. Taba, A. Censi, S. Leutenegger, A. J. Davison, J. Conradt, K. Daniilidis et al., "Event-based vision: A survey," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 44, no. 1, pp. 154–180, 2020.
[2] P. Lichtsteiner, C. Posch, and T. Delbruck, "A 128×128 120 dB 15 μs latency asynchronous temporal contrast vision sensor," IEEE Journal of Solid-State Circuits, vol. 43, no. 2, pp. 566–576, 2008.
[3] X. Xu, Q. Liu, W. Huang, W. Peng, and Y. Huang, "SpGCN: An FPGA-based graph convolutional network accelerator for sparse graphs," in 2024 IEEE 32nd Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM). IEEE, 2024, pp. 216–216.
[4] T. Tian, L. Zhao, X. Wang, Q. Wu, W. Yuan, and X. Jin, "FP-GNN: Adaptive FPGA accelerator for graph neural networks," Future Generation Computer Systems, vol. 136, pp. 294–310, 2022.
[5] B. Zhang and V. Prasanna, "Dynasparse: Accelerating GNN inference through dynamic sparsity exploitation," in 2023 IEEE International Parallel and Distributed Processing Symposium (IPDPS). IEEE, 2023, pp. 233–244.
[6] K. Jeziorek, P. Wzorek, K. Blachut, A. Pinna, and T. Kryjak, "Embedded graph convolutional networks for real-time event data processing on SoC FPGAs," arXiv preprint arXiv:2406.07318, 2024.
[7] Y. Yang, A. Kneip, and C. Frenkel, "EvGNN: An event-driven graph neural network accelerator for edge vision," IEEE Transactions on Circuits and Systems for Artificial Intelligence, vol. 2, no. 1, pp. 37–50, 2025.

Disclosure of Invention

The invention aims to provide an event-driven asynchronous GNN FPGA acceleration scheme that is optimized holistically across the storage structure, parallel graph construction and convolution redundancy elimination, and that supports sub-microsecond event processing.

To achieve this aim, the technical solution of the invention is an event-driven asynchronous graph neural network FPGA accelerator for real-time edge vision, characterized by comprising a graph construction module, a graph feature storage module and a graph convolution module: when an event ev = (x, y, t, p) output by an event camera arrives, the graph construction module searches historical events within a preset spatial window in parallel, generates a neighbor buffer by applying a temporal constraint so as to form a local subgraph, and stores the local subgraph in the graph feature storage module, where (x, y) are the spatial coordinates of the event node, t is the timestamp and p is the polarity; the graph convolution module reads neighbor node features or cached feature terms from the graph feature sto