CN-116389543-B - QP sharing algorithm and hybrid connection pool optimization method and system for kernel mode RDMA

CN 116389543 B

Abstract

The invention provides a QP sharing algorithm and a hybrid connection pool optimization method and system for kernel-mode RDMA. RDMA is used in kernel mode. The RDMA QP sharing algorithm avoids the NIC overflow that sharing a QP among multiple user processes would otherwise cause, and the hybrid connection pool design reduces connection latency from the millisecond level to the microsecond level without affecting the original performance.

Inventors

  • WEI XINGDA
  • LU FANGMING
  • CHEN RONG
  • CHEN HAIBO

Assignees

  • Shanghai Jiao Tong University (上海交通大学)

Dates

Publication Date
20260512
Application Date
20230317

Claims (2)

  1. A QP sharing algorithm and hybrid connection pool optimization method for kernel-mode RDMA, characterized by comprising the following steps:
     Step S1: when a queue enqueue operation is executed, judging, according to the task list to be inserted into the work queue, whether a direct enqueue condition is met; if so, inserting the task list directly and then executing step S3; otherwise, executing step S2;
     Step S2: performing a queue dequeue operation, dequeuing all task batches at the head of the queue, and returning to step S1 to continue execution;
     Step S3: setting the type of the last task in the enqueued work queue list to signaled;
     Step S4: when a user-mode application starts, performing the handshake connection using a dynamic connection QP;
     Step S5: after the dynamic connection QP is connected, communicating using the dynamic connection QP;
     Step S6: starting a reliable connection QP handshake in the background and, when that handshake completes, migrating from the dynamic connection QP to the reliable connection QP in the hybrid connection pool;
     Step S7: after the migration is completed, communicating using the reliable connection QP;
     wherein the direct enqueue condition is met if the length of the task list is less than or equal to the available queue length of the work queue;
     when a user program requests connection establishment for the first time, the background starts the reliable connection QP connection; after the background task completes, the foreground is notified that the reliable connection QP has connected successfully; at that point, the ongoing dynamic connection QP requests are completed first, and after all tasks on the dynamic connection QP have been cleared, the migration from the dynamic connection QP to the reliable connection QP is carried out;
     the connection pool of dynamic connection QPs and reliable connection QPs is built in the kernel; a plurality of hosts are configured with the same kernel version, network card driver, and kernel module; the hosts are connected to one another through the InfiniBand protocol, and a global meta-information server is connected to each host through the InfiniBand protocol;
     after a host's kernel module is loaded into the kernel, the kernel module registers meta-information with the global meta-information server; the dynamic connection QP obtains the peer's handshake information through the global meta-information server, and data communication with the peer server is carried out directly according to that meta-information.
  2. A QP sharing algorithm and hybrid connection pool optimization system for kernel-mode RDMA, comprising:
     Module M1: when a queue enqueue operation is executed, judging, according to the task list to be inserted into the work queue, whether a direct enqueue condition is met; if so, inserting the task list directly and then executing module M3; otherwise, executing module M2;
     Module M2: performing a queue dequeue operation, dequeuing all task batches at the head of the queue, and returning to module M1 to continue execution;
     Module M3: setting the type of the last task in the enqueued work queue list to signaled;
     Module M4: when a user-mode application starts, performing the handshake connection using a dynamic connection QP;
     Module M5: after the dynamic connection QP is connected, communicating using the dynamic connection QP;
     Module M6: starting a reliable connection QP handshake in the background and, when that handshake completes, migrating from the dynamic connection QP to the reliable connection QP in the hybrid connection pool;
     Module M7: after the migration is completed, communicating using the reliable connection QP;
     wherein the direct enqueue condition is met if the length of the task list is less than or equal to the available queue length of the work queue;
     when a user program requests connection establishment for the first time, the background starts the reliable connection QP connection; after the background task completes, the foreground is notified that the reliable connection QP has connected successfully; at that point, the ongoing dynamic connection QP requests are completed first, and after all tasks on the dynamic connection QP have been cleared, the migration from the dynamic connection QP to the reliable connection QP is carried out;
     the connection pool of dynamic connection QPs and reliable connection QPs is built in the kernel; a plurality of hosts are configured with the same kernel version, network card driver, and kernel module; the hosts are connected to one another through the InfiniBand protocol, and a global meta-information server is connected to each host through the InfiniBand protocol;
     after a host's kernel module is loaded into the kernel, the kernel module registers meta-information with the global meta-information server; the dynamic connection QP obtains the peer's handshake information through the global meta-information server, and data communication with the peer server is carried out directly according to that meta-information.
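
The enqueue logic of steps S1 to S3 (modules M1 to M3) can be illustrated with a small user-space simulation. This is only a sketch under stated assumptions, not the kernel implementation described by the patent: the names `Task`, `SharedWorkQueue`, `available`, and `enqueue` are hypothetical, "dequeuing" is modeled as retiring one whole batch from the head of the queue per retry, and no real RDMA verbs are called.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    signaled: bool = False  # True means a completion is reported for this task


class SharedWorkQueue:
    """Toy model of one work queue shared by several user processes."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.batches = deque()  # in-flight batches, oldest at the head

    def available(self) -> int:
        used = sum(len(batch) for batch in self.batches)
        return self.capacity - used

    def _dequeue_head_batch(self) -> None:
        # Step S2 (assumed model): retire the batch at the head of the queue.
        # A real implementation would wait for that batch's signaled task.
        self.batches.popleft()

    def enqueue(self, tasks: list) -> None:
        if not tasks:
            return
        if len(tasks) > self.capacity:
            raise ValueError("task list can never fit in the work queue")
        # Step S1: the direct enqueue condition is len(tasks) <= free slots.
        while len(tasks) > self.available():
            self._dequeue_head_batch()
        batch = list(tasks)
        self.batches.append(batch)
        # Step S3: only the last task of the enqueued list is signaled.
        for task in batch:
            task.signaled = False
        batch[-1].signaled = True


wq = SharedWorkQueue(capacity=4)
wq.enqueue([Task("a1"), Task("a2"), Task("a3")])  # fits directly: 3 <= 4
wq.enqueue([Task("b1"), Task("b2")])  # 2 > 1 free slot, so batch "a" is retired first
print(wq.available())  # 2
```

Because a task list is inserted only when it fits in the remaining slots, a process that posts many tasks cannot overflow the shared queue. Signaling only the last task of each batch means a single completion is enough to account for the whole batch.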
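
The migration rule of steps S4 to S7 (modules M4 to M7) can likewise be sketched as a small state machine. The class and method names below (`HybridConnection`, `send`, `complete_dc_request`, `on_rc_handshake_done`) are hypothetical, and the background handshake is reduced to a callback. The sketch also assumes that requests issued while the dynamic connection QP is still draining keep using that QP, a detail the claims do not specify.

```python
class HybridConnection:
    """Toy model of one pooled connection that starts on DC and moves to RC."""

    def __init__(self):
        self.transport = "DC"   # Steps S4/S5: connect and communicate over the DC QP
        self.rc_ready = False   # set once the background RC handshake finishes
        self.inflight_dc = 0    # DC requests posted but not yet completed

    def send(self, payload: bytes) -> str:
        """Post one request and return the transport that carries it."""
        if self.transport == "DC":
            self.inflight_dc += 1
        return self.transport

    def complete_dc_request(self) -> None:
        """Called when one outstanding DC request finishes."""
        if self.inflight_dc == 0:
            raise RuntimeError("no DC request in flight")
        self.inflight_dc -= 1
        self._try_migrate()

    def on_rc_handshake_done(self) -> None:
        """Step S6: the background task reports that the RC QP is connected."""
        self.rc_ready = True
        self._try_migrate()

    def _try_migrate(self) -> None:
        # Migrate only when the RC QP is ready and the DC QP has drained.
        if self.rc_ready and self.inflight_dc == 0:
            self.transport = "RC"  # Step S7: later requests use the RC QP


conn = HybridConnection()
print(conn.send(b"req-1"))   # DC
conn.on_rc_handshake_done()  # RC is ready, but req-1 is still in flight
print(conn.transport)        # DC
conn.complete_dc_request()   # DC drained, so the connection migrates
print(conn.send(b"req-2"))   # RC
```

The application can start communicating as soon as the fast dynamic connection is available, while the slower reliable connection handshake is hidden in the background.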

Description

QP sharing algorithm and hybrid connection pool optimization method and system for kernel mode RDMA

Technical Field

The invention relates to the technical field of kernel-mode RDMA (remote direct memory access), and in particular to a QP (queue pair) sharing algorithm and hybrid connection pool optimization method and system for kernel-mode RDMA.

Background

RDMA technology is widely used in modern data center applications because of its high throughput and low latency. RDMA lets a host access the memory of a remote server directly. Compared with the traditional TCP/IP protocol stack, RDMA transfers data between applications and the network card without passing through the operating system kernel, and uses QPs (queue pairs) and MRs (memory regions) to exchange data between machines. Zero copy, kernel bypass, and similar techniques give RDMA its high throughput and low latency.

RDMA nevertheless suffers from limited scalability and slow handshake setup. First, the RDMA network card must maintain communication resources such as QPs and MRs; as the number of managed resources grows, the management load on the network card rises and its communication performance falls. Second, establishing a connection through the RDMA handshake takes milliseconds, whereas communication takes microseconds, so connection setup is as much as 15,000 times slower than communication. When an application needs to establish a connection quickly and then transfer data, the handshake becomes the bottleneck.

Patent document CN107147722A (application number: CN201710355317.0) discloses an IB-RTI method based on the RDMA communication mechanism: creating an RDMA protocol connection port; registering a server-side RDMA operation buffer; when a client requests a connection over the RDMA protocol, accepting the connection request and replying with a connection success indication; exchanging QP information; modifying the client-side QP state; and receiving data from, and sending data to, clients with established connections.

At present, the scalability and slow-handshake problems are addressed by sharing QPs across threads in user mode. A user-mode QP sharing scheme reduces the resource maintenance cost of a single machine's network card, but it does not implement a sound QP resource protection strategy. Without such a strategy, a QP can become unusable, which affects the other threads using that resource. Another approach is to share resources through kernel-mode RDMA, which achieves network card resource sharing across applications at some cost in performance. However, this approach implements no resource protection policy either, and it does not solve the slow handshake problem.

Disclosure of Invention

In view of the defects in the prior art, the invention aims to provide a QP sharing algorithm and hybrid connection pool optimization method and system for kernel-mode RDMA.
The QP sharing algorithm and hybrid connection pool optimization method for kernel-mode RDMA provided by the invention comprises the following steps:
Step S1: when a queue enqueue operation is executed, judging, according to the task list to be inserted into the work queue, whether a direct enqueue condition is met; if so, inserting the task list directly and then executing step S3; otherwise, executing step S2;
Step S2: performing a queue dequeue operation, dequeuing all task batches at the head of the queue, and returning to step S1 to continue execution;
Step S3: setting the type of the last task in the enqueued work queue list to signaled;
Step S4: when a user-mode application starts, performing the handshake connection using a dynamic connection QP;
Step S5: after the dynamic connection QP is connected, communicating using the dynamic connection QP;
Step S6: starting a reliable connection QP handshake in the background and, when that handshake completes, migrating from the dynamic connection QP to the reliable connection QP in the hybrid connection pool;
Step S7: after the migration is completed, communicating using the reliable connection QP.
Preferably, the direct enqueue condition is met if the length of the task list is less than or equal to the available queue length of the work queue.
Preferably, after the host kernel module is loaded into the kernel, the host kernel module registers meta-information with the global meta-information server, and the dynamic connection QP obtains the peer's handshake information through the global meta-information server and directly performs data comm