CN-122027554-A - Intra-network computing method and related device
Abstract
The embodiments of the application disclose an in-network computing method and a related device. The method comprises: judging, through a flit checking unit, whether a current flit entering the input port corresponding to the flit checking unit is a head flit; if the current flit is not a head flit, controlling the current flit according to task information carried by the current flit, and executing, by one computing unit of one in-network computing virtual channel of the input port, a computing task corresponding to the task information carried by the current flit to obtain a first computing result; and forwarding the first computing result to a next-hop router after authorization from the control unit is obtained. The application embeds computing units in the virtual channels of a traditional router, so that the virtual channels support in-network computing and a data packet can complete computing operations during transmission, which helps reduce communication delay and improve the overall performance of the system.
Inventors
- CHEN HUI
- QI CUIYU
- HAN LIXIA
- LIU WEIQIANG
Assignees
- Nanjing University of Aeronautics and Astronautics (南京航空航天大学)
Dates
- Publication Date: 2026-05-12
- Application Date: 2026-03-30
Claims (10)
- 1. An in-network computing method, applied to a virtual channel-based router, the router including at least one flit checking unit, at least one input port, and a control unit, each flit checking unit being connected to a corresponding input port and then to the control unit, the input port including at least one in-network computing virtual channel, each in-network computing virtual channel including at least one computing unit, the method comprising: judging, through the flit checking unit, whether the current flit entering the input port corresponding to the flit checking unit is a head flit; if the current flit is not a head flit, controlling the current flit according to task information carried by the current flit, and executing, by one computing unit of one of the in-network computing virtual channels of the input port, a computing task corresponding to the task information carried by the current flit to obtain a first computing result; and forwarding the first computing result to a next-hop router after authorization from the control unit is obtained.
- 2. The method according to claim 1, wherein the method further comprises: if the current flit is a head flit, judging the data packet type of the head flit and judging whether the current router is the destination router of the head flit; if the data packet type of the head flit is a common data transmission type or the current router is not the destination router of the head flit, forwarding the head flit to a next-hop router; and if the data packet type of the head flit is a task data calculation type and the current router is the destination router of the head flit, executing a post-processing operation.
- 3. The method of claim 2, wherein said judging the data packet type of the head flit comprises: judging the data packet type of the head flit according to a status identification bit of the head flit, wherein the status identification bits corresponding to the common data transmission type and the task data calculation type have different values.
- 4. The method of claim 2, wherein the in-network computing virtual channel further comprises a post-processing unit, and the executing a post-processing operation comprises: locking, through the control unit, a target in-network computing virtual channel so that the target in-network computing virtual channel stops participating in the forwarding of subsequent computing results, and transmitting route configuration information carried by the head flit to a target post-processing unit, wherein the target in-network computing virtual channel is the in-network computing virtual channel currently occupied by the head flit, and the target post-processing unit is the post-processing unit included in the target in-network computing virtual channel; waiting for other computing units in the target in-network computing virtual channel to execute corresponding computing tasks according to task information carried by the body flit and the tail flit to obtain a second computing result, and transmitting the second computing result to the target post-processing unit, wherein the head flit, the body flit and the tail flit together form a target data packet; processing the second computing result through the target post-processing unit to obtain a final computing result, and exchanging the source router address and the destination router address of the target data packet according to the route configuration information to obtain new route configuration information; packaging, by the target post-processing unit, a new data packet according to the final computing result and the new route configuration information, wherein the data packet type of the new data packet is the common data transmission type; and releasing, through the control unit, the occupation of the target in-network computing virtual channel, and forwarding the new data packet to a next-hop router.
- 5. The method of claim 4, wherein when the in-network computing method is used for neural network inference acceleration, the head flit further carries bias parameters and weight parameters corresponding to a current neural network layer, and the processing the second computing result through the target post-processing unit to obtain a final computing result comprises: calculating the product of each second computing result and its corresponding weight parameter to obtain a plurality of weighted products; accumulating the weighted products and adding the bias parameters to obtain an argument value; and calculating the final computing result according to the argument value and an activation function.
- 6. The method of claim 5, wherein the in-network computing virtual channel further comprises at least one buffer, each buffer corresponding to one of the computing units, the buffer being configured to store the input data, intermediate data, and computing results that the corresponding computing unit requires to perform the computing task.
- 7. The method of claim 5, wherein the router further comprises a crossbar, each of the output ports being connected to the crossbar, the crossbar further being connected to the control unit, the control unit being configured to assign the crossbar to a target input port such that flits or packets output by the target input port are forwarded to a next hop router through the crossbar.
- 8. An in-network computing device, for use in a virtual channel-based router, the router comprising at least one flit checking unit, at least one input port, and a control unit, each flit checking unit being connected to a corresponding input port and then to the control unit, the input port comprising at least one in-network computing virtual channel, each in-network computing virtual channel comprising at least one computing unit, the device comprising: a flit type judging unit, configured to judge, through the flit checking unit, whether the current flit entering the input port corresponding to the flit checking unit is a head flit; a task execution unit, configured to, if the current flit is not a head flit, control the current flit according to the task information carried by the current flit, and execute, by one of the computing units of one of the in-network computing virtual channels of the input port, a computing task corresponding to the task information carried by the current flit to obtain a first computing result; and a data forwarding unit, configured to forward the first computing result to a next-hop router after authorization from the control unit is obtained.
- 9. An electronic device, characterized by comprising a processor and a memory, wherein the processor is connected to the memory, the memory is configured to store a computer program, and the processor is configured to invoke the computer program to perform the method according to any of claims 1-7.
- 10. A computer readable storage medium, characterized in that the computer readable storage medium stores a computer program comprising program instructions which, when executed by a processor, perform the method of any of claims 1-7.
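As an illustration of the head-flit handling and post-processing recited in claims 2 to 5, the following is a minimal software sketch, not the claimed hardware. The `HeadFlit` fields, the `TYPE_COMMON` and `TYPE_COMPUTE` values, the dictionary used as the new data packet, the single scalar bias, and the choice of a sigmoid as the default activation function are all assumptions made for this example; the claims do not fix a flit layout or a particular activation function.

```python
import math
from dataclasses import dataclass
from typing import Callable, Sequence

# Assumed encoding of the status identification bit (hypothetical values).
TYPE_COMMON = 0   # common data transmission type
TYPE_COMPUTE = 1  # task data calculation type


@dataclass
class HeadFlit:
    """Hypothetical head-flit layout used only for this sketch."""
    src: int                  # source router address
    dst: int                  # destination router address
    type_bit: int             # status identification bit
    weights: Sequence[float]  # weight parameters of the current layer
    bias: float               # bias parameter of the current layer


def head_flit_action(head: HeadFlit, current_router: int) -> str:
    """Forward unless this is a compute packet that reached its destination."""
    if head.type_bit == TYPE_COMMON or head.dst != current_router:
        return "forward"
    return "post_process"


def sigmoid(x: float) -> float:
    """Assumed activation function; any other function could be passed in."""
    return 1.0 / (1.0 + math.exp(-x))


def post_process(
    head: HeadFlit,
    second_results: Sequence[float],
    activation: Callable[[float], float] = sigmoid,
) -> dict:
    """Weight, accumulate, add the bias, activate, then swap the addresses."""
    if len(second_results) != len(head.weights):
        raise ValueError("one weight parameter is needed per second result")
    # Weighted products, accumulated, plus the bias: the argument value.
    argument = sum(w * r for w, r in zip(head.weights, second_results)) + head.bias
    final_result = activation(argument)
    # The new packet travels back to the original source as common data.
    return {
        "src": head.dst,
        "dst": head.src,
        "type_bit": TYPE_COMMON,
        "payload": final_result,
    }
```

In this sketch the returned packet is marked as the common data transmission type, so downstream routers would simply forward it without triggering further post-processing.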
Description
Intra-network computing method and related device
Technical Field
The invention relates to the technical field at the intersection of integrated circuits and artificial intelligence, and in particular to an in-network computing method and a related device.
Background
With the wide application of artificial intelligence in complex tasks such as image recognition, voice processing, and natural language understanding, the model scale of deep neural networks (Deep Neural Network, DNN) keeps expanding, and their computational and data dependencies place higher demands on hardware acceleration platforms. To meet the deployment requirements of high throughput, low latency and low power consumption, Network-on-Chip (NoC) based neural network accelerators have become a mainstream research direction. Such an architecture enables massively parallel computing by interconnecting a large number of processing units (Processing Elements, PEs) over the NoC, effectively supporting DNN inference tasks.
However, in actual operation, especially with large models, frequent inter-layer data exchanges cause NoC traffic to surge, which easily leads to local or global network congestion. Data packets then queue in the router input buffers while waiting to be forwarded, causing significant communication delays. This delay not only directly lengthens the end-to-end inference time, but also leaves the back-end computing units idle due to "data starvation", severely degrading the overall energy efficiency of the system.
Disclosure of Invention
The embodiments of the application provide an in-network computing method and a related device, which embed computing units in the virtual channels of a traditional router so that the virtual channels support in-network computing and a data packet can complete computing operations during transmission, which helps reduce communication delay and improve the overall performance of the system.
An embodiment of the present application provides an in-network computing method, applied to a virtual channel-based router, where the router includes at least one flit checking unit, at least one input port and a control unit, each flit checking unit is connected to the corresponding input port and then to the control unit, the input port includes at least one in-network computing virtual channel, and each in-network computing virtual channel includes at least one computing unit, and the method includes: judging, through the flit checking unit, whether the current flit entering the input port corresponding to the flit checking unit is a head flit; if the current flit is not a head flit, controlling the current flit according to task information carried by the current flit, and executing, by one computing unit of one of the in-network computing virtual channels of the input port, a computing task corresponding to the task information carried by the current flit to obtain a first computing result; and forwarding the first computing result to a next-hop router after authorization from the control unit is obtained.
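The flow described above for a non-head flit can be sketched in software as follows. This is a simplified model under stated assumptions, not the router hardware: the `Flit` fields, the `TASKS` table, the `ControlUnit` grant logic and the `send_next_hop` callback are hypothetical names introduced only for illustration, and the control unit's authorization is modelled as a single busy flag.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Flit:
    """Hypothetical flit: a head marker plus task information and operands."""
    is_head: bool
    task: str = ""
    operands: Tuple[float, ...] = ()


# Assumed task table standing in for the computing unit's operations.
TASKS = {
    "add": lambda a, b: a + b,
    "mul": lambda a, b: a * b,
}


class ControlUnit:
    """Grants the output path to one requester at a time (assumed policy)."""

    def __init__(self) -> None:
        self.busy = False

    def request_grant(self) -> bool:
        if self.busy:
            return False
        self.busy = True
        return True

    def release(self) -> None:
        self.busy = False


def handle_flit(
    flit: Flit,
    control: ControlUnit,
    send_next_hop: Callable[[float], None],
) -> Tuple[Optional[float], bool]:
    """Return (first computing result, whether it was forwarded)."""
    # Flit check: head flits take a different path, not modelled here.
    if flit.is_head:
        return None, False
    # The computing unit executes the task named by the flit's task information.
    result = TASKS[flit.task](*flit.operands)
    # Forward only after the control unit grants authorization.
    if not control.request_grant():
        return result, False  # result stays buffered until a later grant
    try:
        send_next_hop(result)
    finally:
        control.release()
    return result, True
```

A caller would retry forwarding a buffered result once the control unit becomes free; that retry loop is omitted to keep the sketch short.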
Optionally, the method further comprises: if the current flit is a head flit, judging the data packet type of the head flit and judging whether the current router is the destination router of the head flit; if the data packet type of the head flit is a common data transmission type or the current router is not the destination router of the head flit, forwarding the head flit to a next-hop router; and if the data packet type of the head flit is a task data calculation type and the current router is the destination router of the head flit, executing a post-processing operation.
Optionally, the judging the data packet type of the head flit includes: judging the data packet type of the head flit according to a status identification bit of the head flit, wherein the status identification bits corresponding to the common data transmission type and the task data calculation type have different values.
Optionally, the in-network computing virtual channel further includes a post-processing unit, and the executing a post-processing operation includes: locking, through the control unit, a target in-network computing virtual channel so that the target in-network computing virtual channel stops participating in the forwarding of subsequent computing results, and transmitting route configuration information carried by the head flit to a target post-processing unit, wherein the target in-network computing virtual channel is the in-network computing virtual channel currently occupied by the head flit, and the target post-processing unit is the post-processing unit included in the target in-network computing virtual channel; waiting for other computing units in the target in-network computing virtual channel to execute corresponding computing tasks according to task information carried by the body flit an