CN-121984577-A - GPU optical interconnection dynamic reconstruction method and system


Abstract

The invention provides a method and a system for dynamic reconfiguration of GPU optical interconnection. The system comprises a plurality of GPU computing nodes, an optical circuit switching matrix and a controller, and predefines a first channel and a second channel, the first channel being the working channel and the second channel being the preparation channel. While a service flow is being transmitted on a topology established on the first channel, if dynamic reconfiguration of the optical interconnection is carried out, the controller controls the optical circuit switching matrix to establish a new topology on the second channel, and then controls the GPU computing nodes to switch the working channel from the first channel to the second channel, so that the service flow continues to be transmitted on the new topology established on the second channel. The network topology can thus be reconfigured dynamically and in real time without interrupting the GPU computing service, or with an interruption only at the nanosecond level, so that the network topology always remains optimally matched to the communication mode of the computing task.

Inventors

Request for anonymity
ZHANG RUITAO
Request for anonymity

Assignees

光子算数(南京)科技有限公司

Dates

Publication Date: 20260505
Application Date: 20260206

Claims (10)

1. A GPU optical interconnection dynamic reconfiguration method, characterized in that the method is applied to a GPU optical interconnection dynamic reconfiguration system, wherein the system comprises a plurality of GPU computing nodes, an optical circuit switching matrix and a controller, the system predefines a first channel and a second channel, the first channel is a working channel, and the second channel is a preparation channel, and the method comprises the following steps: when a service flow is transmitted on the topology established on the first channel, if optical interconnection dynamic reconfiguration is carried out, the controller controls the optical circuit switching matrix to establish a new topology on the second channel; and the controller controls the plurality of GPU computing nodes to switch the working channel from the first channel to the second channel, so that the service flow continues to be transmitted on the new topology established on the second channel.
2. The method according to claim 1, wherein the controller is connected to the optical circuit switching matrix and to each of the GPU computing nodes via respective control command links; each GPU computing node is connected to the optical circuit switching matrix through 2 optical fibers, 1 optical fiber being used for sending data and the other optical fiber being used for receiving data; and each GPU computing node is provided with a dual-channel intelligent light engine, which is used for maintaining two independent logical channels in a single optical fiber.
3. The method of claim 1, wherein the working channel is used to carry traffic, the preparation channel is used to transmit management signals with lower priority, and the new topology established on the preparation channel does not interfere with traffic flows of the working channel.
4. The method of claim 1, wherein after the step of the controller controlling the plurality of GPU computing nodes to switch the working channel from the first channel to the second channel, the method further comprises: releasing the first channel, and using the first channel as the preparation channel for the next round of optical interconnection dynamic reconfiguration.
5. The method according to any one of claims 1-4, further comprising: the controller identifies the communication mode of the current stage and predicts the communication mode of the next stage by monitoring the GPU communication traffic or by interacting with the task scheduler; the controller determines an optimal network topology based on the communication mode of the next stage; the controller records the topological configurations of the first channel, the working channel and the preparation channel; the controller performs global clock synchronization through a designated protocol and controls the optical circuit switching matrix and the plurality of GPU computing nodes; and the controller monitors the communication quality of the working channel and the preparation channel, and determines whether to switch the working channel based on the communication quality.
6. The method of claim 2, wherein the dual-channel intelligent light engine comprises a transmitting end and a receiving end; in the transmitting end, light generated by a laser source is split by a polarization beam splitter into a transverse electric (TE) mode optical signal and a transverse magnetic (TM) mode optical signal, the TE optical signal is modulated by a TE modulator, the TM optical signal is modulated by a TM modulator, and the two modulated optical signals are combined into 1 transmitting optical fiber by a polarization beam combiner; in the receiving end, the optical signal from the receiving optical fiber is separated by a polarization beam splitter into a TE optical signal and a TM optical signal, the TE optical signal is received by a TE detector, the TM optical signal is received by a TM detector, and one of the two received optical signals is selected based on a control instruction of the controller and output to a plurality of GPU computing nodes.
7. The method of claim 6, wherein the optical circuit switching matrix comprises an input processing unit, a double switching matrix, and an output processing unit; the input processing unit is used for separating an input optical signal into a TE optical signal and a TM optical signal through a polarization beam splitter; the double switching matrix is used for switching TE optical signals through a TE switching matrix and switching TM optical signals through a TM switching matrix; and the output processing unit is used for combining the switched TE optical signals and TM optical signals through a polarization beam combiner and outputting them.
8. The method of claim 2, wherein the dual-channel intelligent light engine comprises a transmitting end, a receiving end and 2 lasers, wherein the 2 lasers are respectively used for generating optical signals of 2 wavelengths; in the transmitting end, the optical signals of the 2 wavelengths are combined into 1 optical fiber through a wavelength division multiplexer; and in the receiving end, the optical signals of the 2 wavelengths are separated by a wavelength division demultiplexer.
9. The method of claim 8, wherein the optical circuit switching matrix is used for independent switching of wavelength channels by wavelength selective switching or arrayed waveguide gratings.
10. A GPU optical interconnect dynamic reconfiguration system, configured to perform the GPU optical interconnect dynamic reconfiguration method of any one of claims 1-9.
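
As an illustration only, claims 7 and 9 can be read as describing a switch with two independent switching planes (TE/TM, or two wavelengths), so that circuits on one plane can be rebuilt while the other keeps carrying traffic. The following minimal sketch models that idea in Python. The class name `DualPlaneSwitch`, the plane labels and the port numbering are hypothetical and do not come from the patent text; the model ignores all optical effects.

```python
from dataclasses import dataclass, field

# Hypothetical plane labels; they could equally stand for two wavelengths.
PLANES = ("TE", "TM")


@dataclass
class DualPlaneSwitch:
    """Toy model of an optical circuit switch with one crossbar per plane."""
    num_ports: int
    planes: dict = field(default_factory=lambda: {p: {} for p in PLANES})

    def configure(self, plane: str, mapping: dict) -> None:
        """Replace the circuit map of one plane; the other plane is untouched."""
        if plane not in self.planes:
            raise ValueError(f"unknown plane: {plane}")
        ports = range(self.num_ports)
        if any(i not in ports or o not in ports for i, o in mapping.items()):
            raise ValueError("port index out of range")
        if len(set(mapping.values())) != len(mapping):
            raise ValueError("two inputs mapped to the same output")
        self.planes[plane] = dict(mapping)

    def route(self, plane: str, in_port: int):
        """Return the output port for a signal on the given plane, or None."""
        return self.planes[plane].get(in_port)


ocs = DualPlaneSwitch(num_ports=4)
ocs.configure("TE", {0: 1, 1: 2, 2: 3, 3: 0})    # ring carried on the TE plane
before = [ocs.route("TE", p) for p in range(4)]
ocs.configure("TM", {0: 2, 2: 0, 1: 3, 3: 1})    # new pairing built on the TM plane
after = [ocs.route("TE", p) for p in range(4)]   # TE routes are unchanged
```

Keeping one circuit map per plane is what lets the second `configure` call leave the TE routes untouched, which mirrors the requirement in claim 3 that building the new topology does not interfere with the working channel.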

Description

GPU optical interconnection dynamic reconstruction method and system

Technical Field

The invention relates to the technical field of optical interconnection, and in particular to a dynamic reconfiguration method and a dynamic reconfiguration system for GPU optical interconnection.

Background

With the explosive growth in demand for computing power from workloads such as artificial intelligence training and scientific computing, large-scale computing clusters consisting of thousands of GPU (Graphics Processing Unit) accelerator cards have become mainstream. Most current clusters adopt a topology based on electrical switching, which has inherent defects:

1. Low resource utilization: network resources are over-provisioned to cope with the worst-case communication scenario and sit idle under most normal tasks, which makes the cost high.
2. Inflexible fault recovery: when a fixed physical link fails, it is difficult to dynamically bypass the failure point, or performance degrades, and the link cannot serve as a backup.
3. Limited scalability: at ultra-high bandwidth the energy consumption of electrical interconnection rises sharply and signal integrity becomes a major challenge, which limits scalability.
4. Optical interconnection technologies, such as silicon photonics, co-packaged optics (CPO) and linear pluggable optical modules, can provide high-bandwidth, low-latency and low-power data transmission, but they generally still operate statically and cannot fundamentally solve the core problem of dynamic topology adaptation. Topology reconfiguration takes milliseconds, whereas GPU computation requires continuous data exchange at the nanosecond scale, and the service interruption during reconfiguration cannot be recovered.
Disclosure of Invention

In view of the above, the present invention aims to provide a method and a system for dynamic reconfiguration of GPU optical interconnection, so that the network topology can be reconfigured dynamically and in real time without interrupting the GPU computing service, or with an interruption only at the nanosecond level, and the network topology always remains optimally matched to the communication mode of the computing task.

In a first aspect, an embodiment of the invention provides a dynamic reconfiguration method for GPU (Graphics Processing Unit) optical interconnection, applied to a dynamic reconfiguration system for GPU optical interconnection. The system comprises a plurality of GPU computing nodes, an optical circuit switching matrix and a controller, and predefines a first channel and a second channel, the first channel being the working channel and the second channel being the preparation channel. The method comprises: when a service flow is transmitted on a topology established on the first channel, if dynamic reconfiguration of the optical interconnection is carried out, the controller controls the optical circuit switching matrix to establish a new topology on the second channel; and the controller controls the plurality of GPU computing nodes to switch the working channel from the first channel to the second channel, so that the service flow continues to be transmitted on the new topology established on the second channel.
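
The sequence in the first aspect resembles what is commonly called a make-before-break handover: the new topology is established on the idle channel before any node leaves the old one. The sketch below walks through that ordering under simplifying assumptions. The names `Node`, `Controller`, `bring_up` and `reconfigure`, the channel labels `"ch1"`/`"ch2"` and the use of plain strings as topologies are all hypothetical, and the per-node loop stands in for the synchronized switch a real system would need. The release step corresponds to the optional embodiment of claim 4.

```python
class Node:
    """Toy GPU node whose light engine selects one of two logical channels."""

    def __init__(self, name):
        self.name = name
        self.active = "ch1"  # channel currently used for service traffic


class Controller:
    """Toy controller that tracks the working/preparation roles of two channels."""

    def __init__(self, nodes):
        self.nodes = nodes
        self.working, self.preparation = "ch1", "ch2"
        self.topology = {"ch1": None, "ch2": None}  # topology per channel
        self.log = []  # ordered record of the steps taken

    def bring_up(self, topology):
        """Install the initial topology on the working channel."""
        self.topology[self.working] = topology

    def reconfigure(self, new_topology):
        # Step 1: establish the new topology on the preparation channel
        # while traffic keeps flowing on the working channel.
        self.topology[self.preparation] = new_topology
        self.log.append(("establish", self.preparation))

        # Step 2: tell every node to move its traffic to the prepared channel.
        for node in self.nodes:
            node.active = self.preparation
        self.log.append(("switch", self.preparation))

        # Step 3 (optional, claim 4): swap roles and release the old channel
        # so it can serve as the preparation channel for the next round.
        self.working, self.preparation = self.preparation, self.working
        self.topology[self.preparation] = None
        self.log.append(("release", self.preparation))


nodes = [Node(f"gpu{i}") for i in range(4)]
ctl = Controller(nodes)
ctl.bring_up("ring")
ctl.reconfigure("all-to-all")
```

After the call, `ctl.working` is `"ch2"` and `"ch1"` is free again, so a further `reconfigure` call would build the next topology on `"ch1"`; the two channels simply alternate roles from round to round.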
In an alternative embodiment of the present application, the controller is connected to the optical circuit switching matrix and to each GPU computing node through respective control command links; each GPU computing node is connected to the optical circuit switching matrix through 2 optical fibers, 1 optical fiber being used for sending data and the other optical fiber being used for receiving data; and each GPU computing node is provided with a dual-channel intelligent light engine, which is used for maintaining two independent logical channels in a single optical fiber.

In an alternative embodiment of the present application, the working channel is used for carrying services, the preparation channel is used for transmitting management signals with lower priority, and the new topology established on the preparation channel does not interfere with the service flow of the working channel.

In an alternative embodiment of the present application, after the step of the controller controlling the plurality of GPU computing nodes to switch the working channel from the first channel to the second channel, the method further includes releasing the first channel and using the first channel as the preparation channel for the next round of optical interconnection dynamic reconfiguration.

In an alternative embodiment of the present application, the method further comprises the steps that the controller identifies the communication mode of the current stage by monitoring GPU communic