
KR-20260063736-A - Method and Apparatus for Predicting On-Chip Interconnect Performance of Network on Chip

KR 20260063736 A

Abstract

The present disclosure provides a method and apparatus for predicting on-chip interconnect (OCI) performance that can quickly and accurately predict performance for various OCI structures on a per-application basis. The method comprises the steps of: obtaining traffic composed of multiple flows, each representing a path along which packets are transmitted between multiple nodes, based on network-on-chip (hereinafter, NoC) environment information in which an OCI scheme is configured according to the configuration of the multiple nodes included in a chip executing the application and the packet exchange scheme between the multiple nodes; performing neural network operations with a neural network model to estimate node congestion, based on the packet delay occurring at each node included in each of the multiple flows in the traffic, and a flow performance for each flow that reflects the node congestion; and predicting the performance of the configured OCI by aggregating the multiple flow performances obtained for the multiple flows.
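The three-stage pipeline described above (per-node congestion, per-flow performance, aggregation) can be sketched as follows. This is a minimal illustration only: the two stand-in functions below replace the patent's trained neural network models, and the M/M/1-style load/(1-load) congestion curve is an assumption chosen for the sketch, not taken from the disclosure.

```python
from dataclasses import dataclass

@dataclass
class Flow:
    """One flow: the node path a packet takes plus its packet injection rate (PIR)."""
    path: list          # node ids from source to destination
    pir: float          # flow PIR set for this flow

def node_congestion(node, flows):
    """Stand-in for the congestion neural network: estimate a node's queuing
    delay from the PIRs of the flows passing through it. The load/(1-load)
    curve is a placeholder, not the patent's model."""
    load = sum(f.pir for f in flows if node in f.path)
    load = min(load, 0.99)                 # keep the placeholder numerically stable
    return load / (1.0 - load)

def flow_performance(flow, flows):
    """Stand-in for the flow-performance neural network: combine hop count
    and the congestion of every node on the flow's path into a latency."""
    hops = len(flow.path) - 1
    congestion = sum(node_congestion(n, flows) for n in flow.path)
    return hops + congestion

def predict_oci_performance(flows):
    """Aggregate the per-flow performances (cumulative sum, as in claim 8)."""
    return sum(flow_performance(f, flows) for f in flows)

# Toy traffic on a hypothetical 2x2 mesh with nodes 0..3
traffic = [Flow(path=[0, 1, 3], pir=0.2), Flow(path=[2, 3], pir=0.1)]
print(predict_oci_performance(traffic))
```

In a real instance of the method, the two stand-in functions would be the trained neural network models, fed with the inputs named in the claims (flow PIR, hop count, and the node congestions along each path).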

Inventors

  • 정의영
  • 이상협
  • 김도현
  • 안성진
  • 엄성재

Assignees

  • 연세대학교 산학협력단 (Industry-Academic Cooperation Foundation, Yonsei University)

Dates

Publication Date
2026-05-07
Application Date
2024-10-31

Claims (20)

  1. A method for predicting on-chip interconnect performance, comprising the steps of: obtaining traffic composed of multiple flows, each representing a path for packet transmission between multiple nodes, based on network-on-chip (hereinafter, NoC) environment information in which an on-chip interconnect (hereinafter, OCI) scheme is configured according to a given application, the configuration of multiple nodes included in a chip executing the application, and a packet exchange scheme between the multiple nodes; performing neural network operations with a neural network model to estimate node congestion, based on a packet delay occurring at each node included in each of the multiple flows in the traffic, and a flow performance for each flow reflecting the node congestion; and predicting the performance of the configured OCI by aggregating the multiple flow performances obtained for the multiple flows.
  2. The method of claim 1, wherein the node congestion is estimated as a queuing delay representing the packet transmission delay of each node included in the multiple flows.
  3. The method of claim 2, wherein the node congestion is extracted by receiving a flow packet injection rate (PIR) set for each flow that contains the node whose congestion is to be calculated, among the multiple flows included in the traffic, and performing a neural network operation.
  4. The method of claim 1, wherein the flow performance is extracted by receiving the hop count and flow PIR set for each flow, together with the node congestion of each of the multiple nodes included in each flow, and performing a neural network operation.
  5. The method of claim 1, wherein the flow performance is obtained from the delay time and power consumption incurred while at least one packet generated at the source node of each flow is transmitted to its destination node.
  6. The method of claim 1, wherein the traffic is obtained by designating, based on information on the configuration of the multiple nodes included in the NoC environment information, multiple source nodes that each generate and transmit packets and multiple destination nodes that receive and process the packets generated by each source node when the application is executed, and generating the multiple flows by determining the nodes on the path along which a packet generated at each source node is transmitted to its corresponding destination node.
  7. The method of claim 6, wherein each flow is determined by a routing algorithm of a router included in each of the multiple nodes, a buffer size, and a PIR based on the rate at which packets are generated and transmitted from the source node.
  8. The method of claim 1, wherein the OCI performance is obtained as the cumulative sum of the flow performances estimated for the multiple flows constituting the traffic.
  9. The method of claim 1, wherein the neural network model is trained based on the difference between the OCI performance estimated from training traffic and the OCI performance obtained by inputting the training traffic into a simulator in which a simulation environment is established based on the NoC environment.
  10. The method of claim 9, wherein the training traffic is composed of multiple arbitrarily set flows based on the NoC environment.
  11. The method of any one of claims 1 to 10, performed by a processor of a device comprising a memory and the processor.
  12. An apparatus for predicting on-chip interconnect performance, comprising: a memory; and a processor that executes at least part of an operation according to a program stored in the memory, wherein the processor: obtains traffic composed of multiple flows, each representing a path for packet transmission between multiple nodes, based on network-on-chip (hereinafter, NoC) environment information in which an on-chip interconnect (hereinafter, OCI) scheme is configured according to a given application, the configuration of multiple nodes included in a chip executing the application, and a packet exchange scheme between the multiple nodes; performs neural network operations with a neural network model to estimate node congestion, based on a packet delay occurring at each node included in each of the multiple flows in the traffic, and a flow performance for each flow reflecting the node congestion; and predicts the performance of the configured OCI by aggregating the multiple flow performances obtained for the multiple flows.
  13. The apparatus of claim 12, wherein the processor estimates the node congestion as a queuing delay representing the packet transmission delay of each node included in the multiple flows.
  14. The apparatus of claim 13, wherein the processor extracts the node congestion by receiving a flow packet injection rate (PIR) set for each flow that contains the node whose congestion is to be calculated, among the multiple flows included in the traffic, and performing a neural network operation.
  15. The apparatus of claim 13, wherein the processor estimates the flow performance by receiving the hop count and flow PIR set for each flow, together with the node congestion of each of the multiple nodes included in each flow, and performing a neural network operation.
  16. The apparatus of claim 12, wherein the processor obtains the flow performance including the delay time and power consumption incurred while at least one packet generated at the source node of each flow is transmitted to its destination node.
  17. The apparatus of claim 12, wherein the processor designates, based on information on the configuration of the multiple nodes included in the NoC environment information, multiple source nodes that each generate and transmit packets and multiple destination nodes that receive and process the packets generated by each source node when the application is executed, and obtains the traffic by generating the multiple flows through determining the nodes on the path along which a packet generated at each source node is transmitted to its corresponding destination node.
  18. The apparatus of claim 17, wherein the processor determines each flow based on a routing algorithm of a router included in each of the multiple nodes, a buffer size, and a PIR according to the rate at which packets are generated and transmitted from the source node.
  19. The apparatus of claim 12, wherein the processor obtains the OCI performance as the cumulative sum of the flow performances estimated for the multiple flows constituting the traffic.
  20. The apparatus of claim 12, wherein, during training of the neural network model, the processor trains the neural network model based on the difference between the OCI performance estimated from training traffic composed of multiple arbitrarily set flows based on the NoC environment and the OCI performance obtained by inputting the training traffic into a simulator in which a simulation environment is established based on the NoC environment.
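Claims 9, 10, and 20 describe the training loop: random training flows are generated from the NoC environment, a simulator produces a reference OCI performance, and the model is updated from the difference between its estimate and the simulator's result. The following is a minimal sketch under stated assumptions: a tiny linear model and a stub simulator stand in for the patent's actual neural network and simulation environment, and the flow encoding (hop count, PIR) is chosen for illustration.

```python
import random

random.seed(0)

def make_training_traffic(n_flows, n_nodes):
    """Claim 10: training traffic is a set of arbitrarily generated flows.
    Each flow is encoded as (hop_count, pir); both values are random placeholders."""
    return [(random.randint(1, n_nodes - 1), random.uniform(0.05, 0.3))
            for _ in range(n_flows)]

def simulator_oci_performance(traffic):
    """Stub for the NoC simulator that yields the reference OCI performance
    (the ground truth the patent obtains from a simulation environment)."""
    return sum(hops * (1.0 + 3.0 * pir) for hops, pir in traffic)

def model_oci_performance(weights, traffic):
    """Tiny stand-in model: per-flow performance is linear in (hops, hops*pir),
    and the OCI performance is their cumulative sum (claim 8)."""
    w_h, w_hp = weights
    return sum(w_h * hops + w_hp * hops * pir for hops, pir in traffic)

def train(steps=5000, lr=5e-5):
    """Claims 9 and 20: minimize the squared difference between the model's
    estimate and the simulator's result over random training traffic."""
    weights = [0.0, 0.0]
    for _ in range(steps):
        traffic = make_training_traffic(n_flows=8, n_nodes=16)
        target = simulator_oci_performance(traffic)
        pred = model_oci_performance(weights, traffic)
        err = pred - target
        # gradient of (pred - target)**2 with respect to each weight
        g_h = sum(hops for hops, _ in traffic)
        g_hp = sum(hops * pir for hops, pir in traffic)
        weights[0] -= lr * 2 * err * g_h
        weights[1] -= lr * 2 * err * g_hp
    return weights

w = train()
```

Because the stub simulator is exactly linear in the two features, the weights converge toward the simulator's coefficients; with a real NoC simulator and neural network the same loop shape applies, but the model and its inputs would be those named in claims 3, 4, and 12.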

Description

Method and Apparatus for Predicting On-Chip Interconnect Performance of Network on Chip

The present disclosure relates to a method and apparatus for predicting network-on-chip performance, and more specifically to an artificial-neural-network-based method and apparatus for predicting network-on-chip performance for optimal interconnect design.

In a System on Chip (SoC), multiple circuit modules performing specific functions are integrated onto a single chip. As semiconductor manufacturing technology advances, the number of circuit modules integrated within a single chip is steadily increasing while line widths are shrinking. Consequently, conventional SoCs that transmit data between circuit modules over a shared bus have struggled to deliver the required performance due to bottlenecks and various delay factors. Moreover, even when only some of the circuit modules need to be modified, the entire chip must be redesigned because the data formats transmitted between modules must match; in other words, previously designed circuit blocks have been difficult to reuse.

To overcome these limitations, Network on Chip (NoC) technology is now applied, in which multiple circuit modules within an SoC operate as network nodes and communicate by exchanging packets. In NoC technology, each circuit module on the chip is equipped with a router, and the router determines, according to a routing algorithm, the next node to which a packet is forwarded. With NoC technology the circuit modules communicate by packet switching, which improves the scalability and energy efficiency of the chip compared to bus-based data transmission.
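To make the routing step concrete, the sketch below shows hop-by-hop path construction on a square mesh using XY (dimension-order) routing. XY routing is a common NoC routing algorithm used here purely as an illustration; the disclosure does not prescribe a specific routing algorithm, and the mesh layout and node numbering are assumptions of this example.

```python
def xy_route(src, dst, width):
    """Path from src to dst on a width x width mesh using XY routing:
    move along the X dimension until the column matches, then along Y.
    Nodes are numbered row-major, so node id = y * width + x."""
    path = [src]
    x, y = src % width, src // width
    dx, dy = dst % width, dst // width
    while x != dx:                      # correct the X coordinate first
        x += 1 if dx > x else -1
        path.append(y * width + x)
    while y != dy:                      # then correct the Y coordinate
        y += 1 if dy > y else -1
        path.append(y * width + x)
    return path

print(xy_route(0, 5, 4))   # node 0 -> node 5 on a 4x4 mesh: [0, 1, 5]
```

A flow in the sense of the claims would pair such a path with its flow PIR; the nodes on the path are exactly the nodes whose congestion the neural network model estimates.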
In addition, since circuit modules can receive data from or transmit data to other circuit modules regardless of data format, each module on the chip is easy to replace or change, which significantly improves module reusability. However, in chips using NoC technology the circuit modules transmit data by packet switching, so the On-Chip Interconnect (OCI) structure, along with the performance of each individual circuit module, becomes one of the key factors determining the overall performance of the SoC.

Because the OCI structure can be configured in highly diverse ways depending not only on the configuration of the circuit modules on the chip but also on the applications to be executed on it, it has largely been set based on expert experience. As a result, an OCI structure optimal for a given application was frequently not established, reducing chip efficiency. To address this, a method was proposed and used in which the performance of various OCI structures is simulated per application and a structure exhibiting superior performance is selected. However, as the number of circuit modules integrated on a chip has grown, so have the scale and complexity of OCI structures; even with a simulator, simulating each candidate OCI structure for each application is time-consuming and lowers prediction accuracy. Furthermore, the simulator must re-run simulations over multiple OCI structures whenever the application to be executed or the NoC environment changes.

Meanwhile, research on establishing OCI structures using artificial neural networks has recently been active. However, artificial neural networks require a large amount of training data, which must be acquired using a simulator.
Nevertheless, simulators make it difficult to obtain large amounts of training data, because each simulation yields performance data for only a single application on a single designated OCI structure. Furthermore, if the chip configuration, the OCI structure, or the application running on the chip changes, appropriate training data must be acquired again and the training repeated.

FIG. 1 illustrates NoC environments according to various OCI structures.
FIG. 2 illustrates the operating concept of an OCI performance prediction apparatus according to one embodiment.
FIG. 3 shows the configuration of an OCI performance prediction apparatus, classified by operation, according to one embodiment.
FIG. 4 is a diagram illustrating node congestion.
FIG. 5 shows an OCI performance prediction method according to one embodiment.
FIG. 6 illustrates a computing environment including a computing device according to one embodiment.

Hereinafter, specific embodiments according to embodiments of the present di