CN-116455839-B - Cross bar switching network system based on shared cross node
Abstract
The invention relates to a transverse split Crossbar switching network system based on shared cross nodes. The system comprises, on each FPGA, a plurality of input queue management modules, a common cross node buffer module, a shared cross node buffer module, an RR column arbitration module, an inter-chip high-speed Aurora interface module, a WRR independent column arbitration module and a plurality of configuration interfaces. By using the shared cross node buffers, the invention avoids the loss of scheduling efficiency caused by the latency of inter-chip arbitration: in theory, as long as the shared cross node buffers are large enough, WRR independent column arbitration under the transverse split schedules as efficiently as in the non-split case. This improves the scheduling efficiency of the whole switching system and, in turn, the link transmission efficiency.
Inventors
- QIU ZHILIANG
- ZHANG SUHONG
- PAN WEITAO
- GAO YIMING
- LI XIAOWANG
- LI DAZHUANG
Assignees
- Xidian University (西安电子科技大学)
Dates
- Publication Date
- 20260505
- Application Date
- 20230412
Claims (9)
- 1. A Crossbar switching network system based on shared cross nodes, characterized by comprising a plurality of input queue management modules (10), a common cross node buffer module (20), a shared cross node buffer module (30), an RR column arbitration module (40), an inter-chip high-speed Aurora interface module (50), a WRR independent column arbitration module (60) and a plurality of configuration interfaces (70) which are arranged on each FPGA, wherein: the input queue management modules (10) are used for receiving data frames from physical ports and adding tag headers to the heads of the data frames according to their destination ports; the common cross node buffer module (20) is used for receiving the tag-header-added data frames and temporarily storing them into different common cross node buffers (201) according to the destination port; the shared cross node buffer module (30) is used for receiving and storing data frames that arrive from the other FPGA through the inter-chip high-speed Aurora interface module (50), and for participating, together with the data frames in the common cross node buffers (201) of the same column on this FPGA, in the arbitration of the WRR independent column arbitration module (60); the RR column arbitration module (40) is used for giving an arbitration result according to the transmission requests of the common cross node buffers (201) in the same column, and for moving a tag-header-added data frame on this FPGA from the corresponding common cross node buffer (201) to the transmitting part of the inter-chip high-speed Aurora interface module (50) according to the arbitration result, forming a data frame to be transmitted; the inter-chip high-speed Aurora interface module (50) is used for sending the data frame to be transmitted to the receiving part of the inter-chip high-speed Aurora interface module (50) of the other FPGA, for receiving the data frames output by the transmitting part of the inter-chip high-speed Aurora interface module (50) of the other FPGA, and for transmitting them to the shared cross node buffer module (30) of this FPGA; the WRR independent column arbitration module (60) is used for receiving request signals sent by the cross node buffers of the same column in the common cross node buffer module (20) and the shared cross node buffer module (30), performing weighted round-robin scheduling, and moving the data frames in the cross node buffer selected by the scheduling to the output port; the configuration interfaces (70) are used for configuring the weights required by the WRR independent column arbitration module (60) for scheduling and the maximum and minimum thresholds of the queues of the input queue management modules (10).
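The weighted round-robin column arbitration of claim 1 can be sketched in software, as a minimal model under illustrative assumptions: one arbiter per output column polls its crosspoint buffers (here two common buffers plus one shared buffer fed from the other FPGA) with configurable per-buffer weights. The class name, credit scheme and data layout are assumptions for illustration, not the patent's hardware design.

```python
# Minimal model of WRR column arbitration over crosspoint buffers
# (illustrative sketch; names and credit scheme are assumptions).
from collections import deque

class WrrColumnArbiter:
    """Weighted round-robin scheduler for one output column."""

    def __init__(self, weights):
        # weights[i] = max frames buffer i may send per round
        self.weights = list(weights)
        self.credits = list(weights)
        self.pos = 0

    def schedule(self, buffers):
        """Return (buffer_index, frame) for the next non-empty buffer
        with remaining credit, or None if all buffers are empty."""
        n = len(buffers)
        for _ in range(2 * n):  # at most one full credit-refill pass
            i = self.pos
            if buffers[i] and self.credits[i] > 0:
                self.credits[i] -= 1
                return i, buffers[i].popleft()
            # advance; refill all credits when a full round completes
            self.pos = (self.pos + 1) % n
            if self.pos == 0:
                self.credits = list(self.weights)
        return None

# One column: two common crosspoint buffers and one shared crosspoint buffer.
bufs = [deque(["a1", "a2"]), deque(["b1"]), deque(["s1", "s2"])]
arb = WrrColumnArbiter(weights=[2, 1, 1])
order = []
while (pick := arb.schedule(bufs)) is not None:
    order.append(pick[1])
# Buffer 0 (weight 2) drains two frames before the others get one each.
```

Because the shared crosspoint buffer is just one more weighted participant in the same column, frames arriving from the other FPGA compete locally with no inter-chip arbitration round trip, which is the point of the shared-node design.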
- 2. The Crossbar switching network system based on shared cross nodes according to claim 1, characterized in that the common cross node buffer module (20) comprises a number of common cross node buffers (201), the shared cross node buffer module (30) comprises a number of shared cross node buffers (301), the RR column arbitration module (40) comprises a number of RR column arbitration sub-modules (401), the inter-chip high-speed Aurora interface module (50) comprises a number of first inter-chip high-speed Aurora interfaces (501) and a number of second inter-chip high-speed Aurora interfaces (502), and the WRR independent column arbitration module (60) comprises a number of WRR independent column arbitration sub-modules (601), wherein: the common cross node buffers (201) are distributed in an array, and the common cross node buffers (201) of each row are connected with an input queue management module (10); each shared cross node buffer (301) is connected with the common cross node buffers (201) of the same column, each RR column arbitration sub-module (401) is connected with the common cross node buffers (201) of the same column, and the sum of the numbers of shared cross node buffers (301) and RR column arbitration sub-modules (401) equals the number of columns of common cross node buffers (201); each first inter-chip high-speed Aurora interface (501) is connected with an RR column arbitration sub-module (401) of this FPGA and with the receiving part of the inter-chip high-speed Aurora interface module (50) of the other FPGA, and each second inter-chip high-speed Aurora interface (502) is connected with a shared cross node buffer (301) of this FPGA and with the transmitting part of the inter-chip high-speed Aurora interface module (50) of the other FPGA; each WRR independent column arbitration sub-module (601) is connected to the common cross node buffers (201) of the same column and is located in the same column as the shared cross node buffer (301).
- 3. The Crossbar switching network system based on shared cross nodes according to claim 2, wherein the first inter-chip high-speed Aurora interface (501) comprises a first Aurora IP core, a first clock-crossing module and a LocalLink-to-AXI module, and the second inter-chip high-speed Aurora interface (502) comprises a second Aurora IP core, a second clock-crossing module and an AXI-to-LocalLink module, wherein: the LocalLink-to-AXI module is used for converting the data frame from the LocalLink data format used in the system into the AXI format; the first clock-crossing module is used for crossing the format-converted data frame from the system main clock domain to the user-side clock domain of the Aurora IP core to obtain a first clock-crossed data frame, and the first Aurora IP core is used for sending the first clock-crossed data frame to the receiving part of the inter-chip high-speed Aurora interface module (50) of the other FPGA; the second Aurora IP core is used for receiving the data frame output by the transmitting part of the inter-chip high-speed Aurora interface module (50) of the other FPGA to obtain a received data frame, the second clock-crossing module is used for crossing the received data frame from the user-side clock domain of the Aurora IP core to the system main clock domain to obtain a second clock-crossed data frame, and the AXI-to-LocalLink module is used for converting the format of the second clock-crossed data frame into the LocalLink data format used in the system.
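The framing conversion in claim 3 can be illustrated with a small software model. Assuming LocalLink marks each beat with start-of-frame and end-of-frame flags while the Aurora core's AXI4-Stream side only carries an end-of-frame marker (TLAST), the two directions are simple, lossless re-mappings; the beat layout below is an assumption for illustration, not the patent's interface definition.

```python
# Illustrative model of LocalLink <-> AXI-Stream framing conversion
# (beat layout is an assumption: LocalLink beat = (data, sof, eof),
#  AXI-Stream beat = (tdata, tlast)).

def locallink_to_axis(beats):
    """Convert LocalLink beats to AXI-Stream beats: EOF becomes TLAST,
    SOF is dropped because AXI-Stream frames are delimited by TLAST alone."""
    return [(data, eof) for data, sof, eof in beats]

def axis_to_locallink(beats):
    """Inverse conversion: a beat starts a new frame when it is the
    first beat or directly follows a beat carrying TLAST."""
    out = []
    start = True
    for tdata, tlast in beats:
        out.append((tdata, start, tlast))
        start = tlast  # the beat after TLAST begins a new frame
    return out

# A three-beat frame survives the round trip unchanged.
frame = [(0xAA, True, False), (0xBB, False, False), (0xCC, False, True)]
assert axis_to_locallink(locallink_to_axis(frame)) == frame
```

The round trip is exact because SOF is fully redundant with the previous TLAST, which is why the receive-side AXI-to-LocalLink module can regenerate LocalLink framing without extra state.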
- 4. The Crossbar switching network system of claim 3, wherein the first Aurora IP core and the second Aurora IP core each adopt a 4-lane bonding configuration, and the line rate can reach 40 Gbps.
- 5. The Crossbar switching network system according to claim 2, wherein the number of input queue management modules (10) is n/2, where n is the number of buses; the number of common cross node buffers (201) is n²/2, the number of shared cross node buffers (301) is n/2, the number of RR column arbitration sub-modules (401) is n/2, the number of first inter-chip high-speed Aurora interfaces (501) is n/2, the number of second inter-chip high-speed Aurora interfaces (502) is n/2, and the number of WRR independent column arbitration sub-modules (601) is n/2.
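The per-FPGA module counts in claim 5 follow directly from the transverse split: each chip hosts half the rows of the n×n crosspoint array, i.e. n/2 rows of n buffers. A checkable sketch (key names are illustrative, not from the patent):

```python
# Per-FPGA module counts from claim 5, as a function of the number of buses n.
def module_counts(n):
    assert n % 2 == 0, "n must be even for a two-FPGA transverse split"
    return {
        "input_queue_managers": n // 2,        # one per hosted row
        "common_crosspoint_buffers": n * n // 2,  # n/2 rows x n columns
        "shared_crosspoint_buffers": n // 2,
        "rr_column_arbiters": n // 2,
        "first_aurora_interfaces": n // 2,
        "second_aurora_interfaces": n // 2,
        "wrr_column_arbiters": n // 2,
    }

counts = module_counts(8)  # e.g. an 8-bus system split across two FPGAs
```

For n = 8 this gives 32 common crosspoint buffers and 4 of each other module per chip; the shared buffers (n/2) plus RR column arbiters (n/2) also sum to n, the total column count, matching claim 2.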
- 6. The Crossbar switching network system based on shared cross nodes according to claim 2, wherein the common cross node buffers (201) and the shared cross node buffers (301) each comprise a first storage area and a second storage area, wherein the first storage area is used for storing multicast or broadcast data and the second storage area is used for storing unicast data.
- 7. The Crossbar switching network system of claim 1, wherein the tag header comprises a unicast identifier, a queue number, a frame length, and a destination port number.
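The tag header of claim 7 can be illustrated as a packed record. The field widths and byte order below are assumptions for illustration only; the patent lists the fields but does not fix their sizes here.

```python
# Illustrative packing of the claim-7 tag header:
# unicast identifier, queue number, frame length, destination port number.
# Field widths (1B flag, 1B queue, 2B length, 1B port) are assumed.
import struct

_TAG_FMT = ">BBHB"  # big-endian: flags, queue, frame_len, dst_port

def pack_tag(unicast, queue, frame_len, dst_port):
    flags = 0x1 if unicast else 0x0
    return struct.pack(_TAG_FMT, flags, queue, frame_len, dst_port)

def unpack_tag(raw):
    flags, queue, frame_len, dst_port = struct.unpack(_TAG_FMT, raw)
    return bool(flags & 0x1), queue, frame_len, dst_port
```

The destination port number is what the common cross node buffer module uses to steer the frame to a column, and the unicast identifier selects between the two storage areas of claim 6.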
- 8. The Crossbar switching network system based on shared cross nodes according to claim 1, wherein the transmission path of the tag-header-added data frame is divided into an on-chip transmission path and a cross-chip transmission path according to the destination port, wherein: the on-chip transmission path sequentially comprises an input queue management module (10) of this FPGA, the common cross node buffer module (20) of this FPGA, the WRR independent column arbitration module (60) of this FPGA and the output port corresponding to the destination port; the cross-chip transmission path sequentially comprises an input queue management module (10) of this FPGA, the common cross node buffer module (20) of this FPGA, the RR column arbitration module (40) of this FPGA, the inter-chip high-speed Aurora interface module (50) of the other FPGA, the shared cross node buffer module (30) of the other FPGA, the WRR independent column arbitration module (60) of the other FPGA and the output port corresponding to the destination port.
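The path selection of claim 8 can be sketched as a routing decision on the destination port. The port-to-chip mapping (a contiguous block of ports per FPGA) and the stage names are assumptions for illustration; the patent only specifies which modules each path traverses.

```python
# Illustrative routing decision for claim 8: on-chip vs cross-chip path.
# Assumes ports [0, ports_per_chip) live on chip 0, the rest on chip 1.
def transmission_path(src_chip, dst_port, ports_per_chip):
    dst_chip = dst_port // ports_per_chip
    if dst_chip == src_chip:
        # On-chip path: no RR arbiter, no Aurora link, no shared buffer.
        return ["input_queue_mgr", "common_crosspoint_buf",
                "wrr_column_arbiter", f"output_port_{dst_port}"]
    # Cross-chip path: RR column arbitration, Aurora link, then the
    # shared crosspoint buffer and WRR arbiter on the other FPGA.
    return ["input_queue_mgr", "common_crosspoint_buf", "rr_column_arbiter",
            "aurora_link", "shared_crosspoint_buf(other FPGA)",
            "wrr_column_arbiter(other FPGA)", f"output_port_{dst_port}"]
```

Note that the cross-chip path adds three stages (RR arbitration, the Aurora link, and the shared crosspoint buffer), which is exactly the latency the shared buffer is sized to hide.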
- 9. The Crossbar switching network system of claim 8, wherein the scheduling period of one data frame on the on-chip transmission path comprises a scheduling interval and a data transmission delay, and the scheduling period of one data frame on the cross-chip transmission path likewise comprises a scheduling interval and a data transmission delay.
Description
Cross bar switching network system based on shared cross node

Technical Field

The invention belongs to the technical field of communication, and particularly relates to a transverse split Crossbar switching network system based on a shared cross node.

Background

Currently, the mainstream single-stage switching structures fall into three types: shared bus, shared cache and Crossbar. In the shared bus structure, all data from all input ports are transmitted on the bus in a time-division multiplexed fashion, so the bus rate must exceed the sum of all port rates to ensure that no congestion occurs. Because the switching capacity of the shared bus structure is limited by the bus rate and scales poorly, this architecture is not generally used. The shared cache structure realizes the receiving and transmitting of data frames by reading and writing the same cache area; compared with the shared bus structure it easily achieves line-rate processing, but the switching capacity of a single shared cache is limited by the write and read speed of the cache, and the structure likewise cannot be freely expanded. The Crossbar switching structure solves the capacity limits of the shared bus and shared cache structures: it realizes the on-off switching of multiple-input-to-multiple-output channels with a high-speed cross matrix circuit, and switching any input-to-output channel does not affect other connected channels, so strictly non-blocking operation can be achieved.
Crossbar switching networks are divided, according to queuing policy, into input queued (Input Queued, IQ), output queued (Output Queued, OQ), combined input and output queued (Combined Input and Output Queued, CIOQ), and combined input and crosspoint queued (Combined Input and Crosspoint Queued, CICQ). Combined input and crosspoint queuing effectively isolates the input end from the output end and makes it convenient to expand the switching capacity, so the CICQ structure is the switching network structure most widely adopted at present. The shared cache of the input queue management module and the Crossbar caches in the CICQ switch fabric consume a significant amount of memory resources. When a single FPGA cannot supply the storage resources required by the target switching capacity, using several FPGAs solves the shortage of storage resources well. If the number of physical ports the switch must carry exceeds what one FPGA can carry, completing the switch with two FPGAs together is a feasible solution. However, implementing a switching system with two FPGAs raises a natural problem: the interaction of data or signals between the two FPGAs incurs a significant delay. Taking the transverse split as an example, the logic from an input bus to the cross nodes of the same row is retained on one chip, so the interaction between the input queue management module and the same-row cross node caches is completed within the same FPGA; but when output arbitration is carried out over the cross node caches of the same column, the transmitted data and interaction signals must cross chips. How to alleviate the decrease in scheduling efficiency caused by transmitting data or signals across chips is the problem to be solved.
Disclosure of Invention

In order to solve the problems in the prior art, the invention provides a transverse split Crossbar switching network system based on a shared cross node. The technical problems to be solved by the invention are realized by the following technical scheme: an embodiment of the invention provides a Crossbar switching network system based on shared cross nodes, which comprises a plurality of input queue management modules, a common cross node buffer module, a shared cross node buffer module, an RR column arbitration module, an inter-chip high-speed Aurora interface module, a WRR independent column arbitration module and a plurality of configuration interfaces arranged on each FPGA, wherein: the input queue management modules are used for receiving data frames from the physical ports and adding tag headers to the heads of the data frames according to their destination ports; the common cross node buffer module is used for receiving the tag-header-added data frames and temporarily storing them into different common cross node buffers according to the destination port; the shared cross node buffer module is used for receiving and storing the data frames that arrive from the other FPGA through the inter-chip high-speed Aurora interface module,