CN-121530904-B - AI card message communication optimization method based on heterogeneity-aware pipeline and dynamic routing
Abstract
The application discloses an AI card message communication optimization method based on a heterogeneity-aware pipeline and dynamic routing, relating to the field of communication technology. By collecting system state data, generating dynamic policy data and performing adaptive processing, the method effectively solves the problem of low communication efficiency in the prior art: it dynamically adapts to network changes and heterogeneous computing environments, improves communication efficiency, reduces delay and optimizes resource utilization.
Inventors
- WANG YANGYANG
Assignees
- 华鲲振宇智能科技国际有限公司
- 四川华鲲振宇智能科技有限责任公司
Dates
- Publication Date
- 2026-05-08
- Application Date
- 2026-01-15
Claims (9)
- 1. An AI card message communication optimization method based on a heterogeneity-aware pipeline and dynamic routing, characterized by comprising the following steps: collecting AI card cluster state information and current communication task information to generate system state data; performing heterogeneity-aware and dynamic decision processing on the system state data to generate dynamic routing policy data, heterogeneous task allocation policy data and pipeline processing policy data; performing adaptive pipeline processing and dynamic routing processing on the original communication message data to be communicated based on the dynamic routing policy data, the heterogeneous task allocation policy data and the pipeline processing policy data, so as to obtain optimized communication message data; and transmitting the optimized communication message data in the AI card cluster over a dynamically selected transmission path, so as to complete message communication among the AI cards; wherein the step of performing heterogeneity-aware and dynamic decision processing on the system state data to generate the dynamic routing policy data, the heterogeneous task allocation policy data and the pipeline processing policy data comprises: performing dynamic topology reconstruction processing based on the link bandwidth information and the network congestion information in the system state data to generate the dynamic routing policy data; performing heterogeneous load balancing processing based on the computing capability difference information in the system state data to generate the heterogeneous task allocation policy data; and performing adaptive policy selection processing based on the message size information in the system state data to generate the pipeline processing policy data.
- 2. The AI card message communication optimization method based on a heterogeneity-aware pipeline and dynamic routing of claim 1, wherein the step of collecting AI card cluster state information and current communication task information to generate system state data comprises: collecting, periodically or on an event trigger, the computing state information, interconnection link state information and network state information of each AI card to obtain original state data; performing state integration and real-time update processing on the original state data to generate the AI card cluster state information, wherein the AI card cluster state information comprises computing capability difference information, link bandwidth information and network congestion information; collecting the type of the collective communication operation currently being executed and the message size information to generate the current communication task information; and comprehensively processing the AI card cluster state information and the current communication task information to generate the system state data.
- 3. The AI card message communication optimization method based on a heterogeneity-aware pipeline and dynamic routing of claim 1, wherein the step of performing dynamic topology reconstruction processing based on the link bandwidth information and the network congestion information in the system state data to generate the dynamic routing policy data comprises: performing link quality evaluation processing based on the real-time bandwidth information, delay information and bit error rate information of each link in the system state data to obtain link quality evaluation data; performing weighted topology construction processing based on the link quality evaluation data and the current communication task information in the system state data to generate a weighted communication directed acyclic graph; and performing conflict detection and re-planning processing on the weighted communication directed acyclic graph based on the network congestion information in the system state data to generate a real-time optimal weighted directed acyclic graph, so as to form the dynamic routing policy data.
- 4. The AI card message communication optimization method based on a heterogeneity-aware pipeline and dynamic routing of claim 3, wherein the step of performing weighted topology construction processing based on the link quality evaluation data and the current communication task information in the system state data to generate the weighted communication directed acyclic graph comprises: obtaining task priority information in the current communication task information from the system state data to determine the priority level of the communication task; setting a corresponding link weight coefficient for each priority level; and performing weighted calculation processing based on the link quality evaluation data and the link weight coefficients to generate the weighted communication directed acyclic graph.
- 5. The AI card message communication optimization method based on a heterogeneity-aware pipeline and dynamic routing of claim 1, wherein the step of performing heterogeneous load balancing processing based on the computing capability difference information in the system state data to generate the heterogeneous task allocation policy data comprises: performing fast/slow card identification processing based on the computing state information of each AI card in the system state data to obtain fast card identification information and slow card identification information; performing non-uniform task allocation processing based on the fast card identification information and the slow card identification information to generate a data block allocation scheme; and performing task scheduling planning processing on the data block allocation scheme to generate the heterogeneous task allocation policy data, wherein the heterogeneous task allocation policy data comprises information for allocating data blocks of different sizes to AI cards of different computing capabilities.
- 6. The AI card message communication optimization method based on a heterogeneity-aware pipeline and dynamic routing of claim 1, wherein the step of performing adaptive policy selection processing based on the message size information in the system state data to generate the pipeline processing policy data comprises: obtaining the message size information from the system state data; comparing the message size information with a preset threshold and performing policy classification processing to obtain a policy classification result; and selecting, based on the policy classification result, a corresponding processing policy from a plurality of predefined processing policies to generate the pipeline processing policy data.
- 7. The AI card message communication optimization method based on a heterogeneity-aware pipeline and dynamic routing of claim 1, wherein the step of performing adaptive pipeline processing and dynamic routing processing on the original communication message data to be communicated based on the dynamic routing policy data, the heterogeneous task allocation policy data and the pipeline processing policy data to obtain the optimized communication message data comprises: performing dynamic assembly processing of configurable processing units based on the pipeline processing policy data to generate adaptive pipeline configuration data; performing serialization, compression and scheduling processing on the original communication message data to be communicated based on the adaptive pipeline configuration data to obtain pipelined message data; selecting a real-time optimal transmission path for the pipelined message data based on the dynamic routing policy data to generate path selection data; and performing data block allocation processing on the pipelined message data based on the heterogeneous task allocation policy data to obtain the optimized communication message data.
- 8. The AI card message communication optimization method based on a heterogeneity-aware pipeline and dynamic routing of claim 7, wherein the step of performing dynamic assembly processing of configurable processing units based on the pipeline processing policy data to generate the adaptive pipeline configuration data comprises: obtaining processing policy information from the pipeline processing policy data; selecting corresponding processing units from a plurality of configurable processing units based on the processing policy information; and dynamically assembling the selected processing units in a preset processing order to generate the adaptive pipeline configuration data.
- 9. The AI card message communication optimization method based on a heterogeneity-aware pipeline and dynamic routing of claim 7, wherein the step of performing data block allocation processing on the pipelined message data based on the heterogeneous task allocation policy data to obtain the optimized communication message data comprises: obtaining, from the heterogeneous task allocation policy data, the information on the data blocks of different sizes allocated to AI cards of different computing capabilities; performing non-uniform data block division processing on the pipelined message data based on that information to generate a non-uniform data block set; and distributing the data blocks of different sizes in the non-uniform data block set to the corresponding AI cards to obtain the optimized communication message data.
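The four phases of claim 1 (collect state, decide policies, optimize, transmit) can be sketched end to end. The patent discloses no concrete algorithms or data formats, so everything below is an illustrative assumption: the state snapshot, the 4 MiB pipeline threshold, and the stubbed transmission are all hypothetical.

```python
# Minimal sketch of the four-phase control loop of claim 1.
# All monitoring and transport are stubbed; names and values are illustrative.

def collect_state():
    # Phase 1: cluster state + current task (stub values for illustration)
    return {
        "throughput": {"card0": 300, "card1": 100},   # compute capability
        "message_size": 2 * 1024 * 1024,              # current task, bytes
    }

def decide(state):
    # Phase 2: derive the three policies from one state snapshot
    total = sum(state["throughput"].values())
    shares = {c: t / total for c, t in state["throughput"].items()}
    strategy = "throughput" if state["message_size"] >= 4 * 1024 * 1024 else "balanced"
    return {"routing": "best-path", "allocation": shares, "pipeline": strategy}

def optimize_and_send(message, policies):
    # Phases 3-4: apply allocation + pipeline policy, then transmit (stubbed)
    blocks = {c: int(len(message) * s) for c, s in policies["allocation"].items()}
    return policies["pipeline"], blocks

policies = decide(collect_state())
print(optimize_and_send(b"x" * 1000, policies))
# prints ('balanced', {'card0': 750, 'card1': 250})
```

The point of the sketch is the data flow: each downstream phase consumes only the policy data produced by the phase before it, matching the claim's layered structure.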
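Claims 3 and 4 combine per-link bandwidth, delay and bit error rate into link weights and then route over the weighted graph. The patent does not give the scoring formula, so the one below (delay plus an inverse-bandwidth term plus a scaled error-rate penalty, times a priority coefficient) is a hypothetical stand-in; the path search is ordinary Dijkstra, which the patent does not name.

```python
import heapq

def link_weight(bandwidth_gbps, delay_us, bit_error_rate, priority_coeff=1.0):
    """Combine per-link measurements into one routing weight (lower is better).
    The formula and the priority coefficient are illustrative assumptions."""
    return priority_coeff * (delay_us + 1000.0 / bandwidth_gbps + 1e9 * bit_error_rate)

def best_path(links, src, dst):
    """Dijkstra over a weighted digraph; `links` maps (u, v) -> weight."""
    adj = {}
    for (u, v), w in links.items():
        adj.setdefault(u, []).append((v, w))
    dist, prev, heap = {src: 0.0}, {}, [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    path, node = [dst], dst
    while node != src:
        node = prev[node]
        path.append(node)
    return path[::-1]

# A degraded direct link A->C loses to the two-hop detour through B.
links = {
    ("A", "C"): link_weight(10, 500, 0.0),   # slow, high-delay link
    ("A", "B"): link_weight(100, 50, 0.0),
    ("B", "C"): link_weight(100, 50, 0.0),
}
print(best_path(links, "A", "C"))  # prints ['A', 'B', 'C']
```

Re-running `link_weight` as measurements change and re-planning around congested edges is the dynamic topology reconstruction of claim 3 in miniature.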
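The non-uniform allocation of claims 5 and 9 sizes each card's data block by its computing capability, so fast cards receive larger blocks and slow cards smaller ones. The proportional-to-throughput rule below is one plausible scheme, not the patent's disclosed method; the throughput figures are arbitrary units.

```python
def allocate_blocks(message_size, card_throughputs):
    """Split a message of `message_size` bytes into per-card block sizes
    proportional to each card's measured throughput (hypothetical metric)."""
    total = sum(card_throughputs.values())
    blocks, assigned = {}, 0
    cards = list(card_throughputs)
    for card in cards[:-1]:
        size = message_size * card_throughputs[card] // total
        blocks[card] = size
        assigned += size
    blocks[cards[-1]] = message_size - assigned  # remainder to the last card
    return blocks

# One fast card (300 units) and two slower cards (100 units each)
print(allocate_blocks(1000, {"card0": 300, "card1": 100, "card2": 100}))
# prints {'card0': 600, 'card1': 200, 'card2': 200}
```

Giving the remainder to the last card keeps the block sizes summing exactly to the message size, so no bytes are dropped by integer division.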
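Claims 6 and 8 classify the message size against preset thresholds and then assemble a pipeline from configurable processing units. The thresholds, strategy names and unit lists below are all invented for illustration; the patent discloses neither concrete threshold values nor the set of predefined policies.

```python
# Hypothetical thresholds; the patent does not disclose concrete values.
SMALL_THRESHOLD = 64 * 1024        # 64 KiB
LARGE_THRESHOLD = 4 * 1024 * 1024  # 4 MiB

def select_strategy(message_size):
    """Policy classification by message size against preset thresholds."""
    if message_size < SMALL_THRESHOLD:
        return "latency"      # skip heavy processing for small messages
    if message_size < LARGE_THRESHOLD:
        return "balanced"
    return "throughput"       # aggressive chunking/compression for large ones

# Configurable processing units assembled in a preset order per strategy.
PIPELINES = {
    "latency":    ["serialize", "schedule"],
    "balanced":   ["serialize", "compress", "schedule"],
    "throughput": ["serialize", "chunk", "compress", "schedule"],
}

def assemble_pipeline(message_size):
    return PIPELINES[select_strategy(message_size)]

print(assemble_pipeline(16 * 1024))  # prints ['serialize', 'schedule']
```

This captures the asymmetry the Background section describes: small messages avoid processing overhead, while large messages get the extra chunking and compression stages that prevent congestion.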
Description
AI card message communication optimization method based on heterogeneity-aware pipeline and dynamic routing

Technical Field

The application relates to the technical field of communication, and in particular to an AI card message communication optimization method based on a heterogeneity-aware pipeline and dynamic routing.

Background

With the continued growth of artificial intelligence models, clusters of AI cards (acceleration computing units such as GPUs and NPUs) have become the core infrastructure supporting large-scale deep learning training. In a distributed training scenario, gradient synchronization and parameter exchange between AI cards must be achieved through efficient collective communication operations (e.g., AllReduce and AllGather) to ensure the collaborative completion of training tasks. However, existing inter-card communication mechanisms suffer from several technical bottlenecks. First, communication paths commonly follow a preset static topology (such as a ring or tree topology); such a structure cannot sense real-time fluctuations in network link quality (including bandwidth changes, delay jitter and abnormal bit error rates) and cannot adapt to dynamic congestion, so communication delay increases markedly and bandwidth utilization remains low. Second, the computing capabilities of the AI cards in a cluster are often heterogeneous: some cards are relatively fast ("fast cards") while others are relatively slow ("slow cards"). Traditional communication schemes adopt uniform data partitioning and task allocation, forcing fast cards to wait for slow cards to finish computing; this "short-board effect" severely limits overall communication efficiency. Third, the processing strategy of the communication pipeline is usually fixed and lacks the ability to adapt to dynamic characteristics such as message size and task priority; for example, small messages incur extra cost from excessive processing, while large messages cause congestion from insufficient processing, further aggravating communication performance bottlenecks. These problems make it difficult for the prior art to meet the urgent need of large-scale AI clusters for high-throughput, low-latency communication. The foregoing is provided merely to facilitate understanding of the technical solutions of the present application and is not an admission that it constitutes prior art.

Disclosure of Invention

The application mainly aims to provide an AI card message communication optimization method based on a heterogeneity-aware pipeline and dynamic routing, which aims to improve communication efficiency, reduce delay and optimize resource utilization. To achieve the above object, the present application provides an AI card message communication optimization method based on a heterogeneity-aware pipeline and dynamic routing, the method comprising: collecting AI card cluster state information and current communication task information to generate system state data; performing heterogeneity-aware and dynamic decision processing on the system state data to generate dynamic routing policy data, heterogeneous task allocation policy data and pipeline processing policy data; performing adaptive pipeline processing and dynamic routing processing on the original communication message data to be communicated based on the dynamic routing policy data, the heterogeneous task allocation policy data and the pipeline processing policy data to obtain optimized communication message data; and transmitting the optimized communication message data in the AI card cluster over a dynamically selected transmission path to complete message communication among the AI cards.

In one embodiment, the step of collecting AI card cluster state information and current communication task information to generate system state data comprises: collecting, periodically or on an event trigger, the computing state information, interconnection link state information and network state information of each AI card to obtain original state data; performing state integration and real-time update processing on the original state data to generate the AI card cluster state information, wherein the AI card cluster state information comprises computing capability difference information, link bandwidth information and network congestion information; collecting the type of the collective communication operation currently being executed and the message size information to generate the current communication task information; and comprehensively processing the AI card cluster state information and the current communication task information to gener