
CN-121597348-B - Task scheduling method and system for hierarchical topology domain weight perception


Abstract

The invention discloses a task scheduling method and system for hierarchical topology domain weight perception. In the method, a model is built from node performance and the network bandwidth differences among nodes, yielding a weighted hierarchical cluster node performance topology domain model that reflects both. During resource scheduling, a task designates a topology domain level; the scheduler traverses all nodes of each topology domain at that level, performs preselection and optimization, selects an optimal node, allocates the Pod to that node, and records the scheduling result. If a plurality of topology domains meet the requirement, the optimization scores of the node resources allocated to the Pods in each topology domain are summed, a binding request is sent according to the scheduling result of the topology domain with the highest score, and the kubelet is responsible for the specific binding actions. Because the invention takes node performance and the network bandwidth differences among nodes into account, it can dispatch the workload to the best-performing domain and improve the training efficiency of large models.
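
The flow summarized above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `Node` fields, the single GPU-count resource used for preselection, the precomputed `score`, the example data in `DOMAINS`, and the function names `schedule_in_domain` and `pick_domain` are all hypothetical and introduced here only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    free_gpus: int  # hypothetical: the only resource checked during preselection
    score: float    # hypothetical: precomputed optimization score of the node


def schedule_in_domain(nodes, pod_requests):
    """Try to place every Pod inside one topology domain.

    Returns (summed_score, placement), or None if some Pod cannot be placed.
    """
    free = {n.name: n.free_gpus for n in nodes}
    placement, total = [], 0.0
    for request in pod_requests:
        # Preselection: keep only nodes with enough free resources.
        candidates = [n for n in nodes if free[n.name] >= request]
        if not candidates:
            return None
        # Optimization: take the highest-scoring preselected node.
        best = max(candidates, key=lambda n: n.score)
        free[best.name] -= request
        placement.append(best.name)
        total += best.score
    return total, placement


def pick_domain(domains, pod_requests):
    """Return (domain_name, placement) for the feasible domain with the highest summed score."""
    results = {}
    for name, nodes in domains.items():
        outcome = schedule_in_domain(nodes, pod_requests)
        if outcome is not None:
            results[name] = outcome
    if not results:
        return None
    winner = max(results, key=lambda name: results[name][0])
    return winner, results[winner][1]


# Hypothetical example data: two topology domains at the task-designated level.
DOMAINS = {
    "rack-a": [Node("a1", 8, 3.0), Node("a2", 8, 2.0)],
    "rack-b": [Node("b1", 8, 5.0), Node("b2", 4, 4.0)],
}
```

With two Pods that each request 8 GPUs, only `rack-a` can host both, so it is chosen even though `rack-b` contains higher-scoring nodes. With two Pods that each request 4 GPUs, both domains are feasible and `rack-b` wins on summed score.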

Inventors

  • LIU NINGXIN
  • DU JIN
  • FAN KANG
  • Fei Zheyao
  • LIU FENG

Assignees

  • Zhejiang Lab (之江实验室)

Dates

Publication Date
2026-05-08
Application Date
2026-01-30

Claims (8)

  1. A task scheduling method for hierarchical topology domain weight perception, characterized by comprising the following steps: obtaining node network topology information and hardware performance labels; constructing an initial topology domain according to the switch hierarchy based on network topology positions; carrying out computing performance analysis on each node in each bottom-level topology domain of the initial topology domain, and constructing topology domains according to the performance labels; scoring each topology domain according to its node performance labels and converting the scores into weights, thereby obtaining a weighted hierarchical cluster node performance topology domain model, wherein, in the weighted hierarchical cluster node performance topology domain model, the hierarchical structure represents the network topology of the nodes, the weight value of a topology domain serves as a parameter for quantitatively evaluating the performance of the nodes in that topology domain, and the weight value is positively correlated with node performance; submitting a task to the Kubernetes cluster and designating a topology domain level for the task; traversing, respectively, all Pods created by the task in the Kubernetes cluster and the topology domains at the level designated by the task, and performing preselection and optimization on all nodes in each topology domain; in the resource preselection stage, finding the node resources that meet the resource requirements and the affinity/anti-affinity requirements and taking them as preselected nodes; in the resource optimization stage, performing optimization scoring according to the node weights of the preselected nodes and the hierarchical topology domains, selecting an optimal node, allocating the Pod to that node, and recording the scheduling result; when all Pods of the task find allocatable nodes in a topology domain, the task is successfully scheduled in that topology domain; if multiple topology domains all meet the task scheduling requirement, summing, for each qualifying topology domain, the optimization scores of the node resources allocated to the Pods, and sending a binding request to the corresponding nodes in the topology domain with the highest score according to its scheduling result, the kubelet component on each node being responsible for the specific binding actions.
  2. The task scheduling method for hierarchical topology domain weight perception of claim 1, wherein, among all nodes to which the Pods of the task are assigned, the topology domain level of the nearest common ancestor of any two nodes does not exceed the topology domain level specified by the task.
  3. The task scheduling method for hierarchical topology domain weight perception of claim 1, wherein the preselection adopts the Kubernetes default filtering algorithm, which comprises scheduling based on resource requests, label-based scheduling, and affinity/anti-affinity-based scheduling, and specifically comprises the following steps: A. filtering out node resources on which the ports required by the Pod are occupied; B. filtering out node resources that do not meet the resources required by the Pod, wherein the resources required by the Pod comprise CPU, memory, and GPU; C. selecting node resources that meet the Pod affinity/anti-affinity requirements; D. selecting node resources that satisfy the NodeSelector attribute and HostName of the Pod.
  4. The task scheduling method for hierarchical topology domain weight perception of claim 1, wherein the optimization scoring is performed according to the node weights of the preselected nodes and the hierarchical topology domains; the final score of a preselected node is the sum of the scores of all topology domains from its lowest-level topology domain up through every parent topology domain to the highest level; when node weights differ, the node resource with the highest score, namely the node resource with the best performance, is selected preferentially according to the scoring result; when node weights are identical, the final node scores are identical and scheduling follows a topology domain compact packing strategy, in which a node is selected at random when the first Pod is scheduled, and when the second Pod is scheduled, the node whose nearest common ancestor with the node hosting the already-scheduled Pod is at the lowest level is selected preferentially.
  5. A task scheduling system for hierarchical topology domain weight perception implementing the method of claim 1, comprising: a topology domain construction module, used for modeling node performance and the network bandwidth differences among nodes to obtain a weighted hierarchical cluster node performance topology domain model; a preselection module, used for preselecting nodes in all topology domains at the task-designated topology domain level, and finding the node resources that meet the resource requirements and the affinity/anti-affinity requirements to serve as preselected nodes; a node optimization module, used for performing optimization scoring according to the node weights of the preselected nodes and the hierarchical topology domains, selecting an optimal node, allocating the Pod to that node, and recording the scheduling result; and a topology domain optimization module, used for, when a plurality of topology domains meet the task scheduling requirement, summing the optimization scores of the node resources allocated to the Pods in each qualifying topology domain, and sending binding requests to the corresponding nodes in the topology domain with the highest score according to its scheduling result, the kubelet components on the nodes being responsible for the specific binding actions.
  6. A task scheduling device for hierarchical topology domain weight perception, comprising one or more processors configured to implement the task scheduling method for hierarchical topology domain weight perception of any one of claims 1-4.
  7. An electronic device comprising a memory and a processor, wherein the memory is coupled to the processor, the memory is configured to store program data, and the processor is configured to execute the program data to implement the task scheduling method for hierarchical topology domain weight perception of any one of claims 1-4.
  8. A computer-readable storage medium having stored thereon a computer program, wherein the program, when executed by a processor, implements the task scheduling method for hierarchical topology domain weight perception of any one of claims 1-4.
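
The scoring rule of claim 4 and the nearest-common-ancestor constraint of claim 2 can be illustrated with a small sketch. It is a sketch under assumptions rather than the claimed implementation: the three-level topology, the domain names, the weight values, the node-to-domain mapping, and the convention that the bottom level is numbered 1 with levels increasing upward are all hypothetical. It also assumes every bottom-level domain sits at the same depth.

```python
# Hypothetical topology: child domain -> parent domain (None marks the top level).
PARENT = {
    "leaf-1": "spine-1", "leaf-2": "spine-1", "leaf-3": "spine-2",
    "spine-1": "core", "spine-2": "core", "core": None,
}
# Hypothetical weights; a larger weight stands for better node performance.
WEIGHT = {"leaf-1": 3, "leaf-2": 1, "leaf-3": 2, "spine-1": 2, "spine-2": 1, "core": 1}
# Hypothetical mapping from node to its bottom-level topology domain.
NODE_DOMAIN = {"n1": "leaf-1", "n2": "leaf-1", "n3": "leaf-2", "n4": "leaf-3"}


def ancestor_chain(domain):
    """Domains from the given one up to the top level, lowest first."""
    chain = []
    while domain is not None:
        chain.append(domain)
        domain = PARENT[domain]
    return chain


def node_score(node):
    """Sum of domain weights from the node's lowest domain up to the top level."""
    return sum(WEIGHT[d] for d in ancestor_chain(NODE_DOMAIN[node]))


def lca_level(node_a, node_b):
    """Level of the nearest common ancestor domain (assumed: bottom level is 1)."""
    chain_b = set(ancestor_chain(NODE_DOMAIN[node_b]))
    for level, domain in enumerate(ancestor_chain(NODE_DOMAIN[node_a]), start=1):
        if domain in chain_b:
            return level
    raise ValueError("nodes share no common topology domain")


def within_level(nodes, max_level):
    """True if every pair of nodes meets at or below the task-designated level."""
    return all(lca_level(a, b) <= max_level for a in nodes for b in nodes)


def most_compact(placed_node, candidates):
    """Tie-break for equal scores: prefer the candidate with the lowest common ancestor."""
    return min(candidates, key=lambda n: lca_level(placed_node, n))
```

In this example `n1` scores 3 + 2 + 1 = 6, while `n3` and `n4` both score 4. Nodes `n1` and `n3` meet at level 2, so they satisfy a task that designates level 2, whereas `n1` and `n4` meet only at level 3 and do not.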

Description

Task scheduling method and system for hierarchical topology domain weight perception

Technical Field

The invention belongs to the technical field of cluster task scheduling, and relates to a task scheduling method and system for hierarchical topology domain weight perception.

Background

At present, in AI large-model training scenarios, node resources of the same specification can still differ in node performance or in inter-node network transmission performance because of factors such as network topology and hardware drivers, so that the overall training task is slowed down by the poorly performing nodes and training efficiency suffers noticeably. For example, distributed model training uses parallel modes such as model parallelism to split the whole model across a plurality of nodes, and a large amount of data is exchanged between nodes during training. The fewer switches there are between two nodes, the lower the communication delay and the higher the throughput; when the allocated nodes span switches, the network transmission performance between them is lower than under the same switch, which noticeably reduces training efficiency. Users therefore want to schedule the workload to the best-performing domain and reduce cross-switch communication as much as possible, so as to speed up data exchange and improve training efficiency.

Kubernetes, currently the most popular container orchestration platform, provides a variety of scheduling strategies, such as resource-request-based scheduling, label-based scheduling, and affinity/anti-affinity-based scheduling. Large-model training, however, requires a large number of nodes to work and communicate cooperatively, which places new scheduling requirements on the existing scheduling system. In the prior art, the scheduling mode is static: a Pod can be scheduled to a specific node group through a label selector, but the scheduling strategy cannot be dynamically adjusted according to node performance or the network bandwidth differences among nodes. A Pod is the smallest and most basic deployable and manageable computing unit in Kubernetes.

Disclosure of Invention

To address the above problems in the prior art, the invention provides a task scheduling method and system for hierarchical topology domain weight perception, which take node performance and the network bandwidth differences among nodes into account, can schedule the workload to the best-performing domain, and improve the training efficiency of large models.

To achieve this technical aim, the invention adopts the following technical scheme. The task scheduling method for hierarchical topology domain weight perception comprises the following steps:

modeling node performance and the network bandwidth differences among nodes to obtain a weighted hierarchical cluster node performance topology domain model;

submitting a task to the Kubernetes cluster and designating a topology domain level for the task;

traversing, respectively, all Pods created by the task in the Kubernetes cluster and the topology domains at the level designated by the task, and performing preselection and optimization on all nodes in each topology domain;

in the resource preselection stage, finding the node resources that meet the resource requirements and the affinity/anti-affinity requirements and taking them as preselected nodes;

in the resource optimization stage, performing optimization scoring according to the node weights of the preselected nodes and the hierarchical topology domains, selecting an optimal node, allocating the Pod to that node, and recording the scheduling result;

when all Pods of the task find allocatable nodes in a topology domain, the task is successfully scheduled in that topology domain;

if multiple topology domains all meet the task scheduling requirement, summing, for each qualifying topology domain, the optimization scores of the node resources allocated to the Pods, and sending a binding request to the corresponding nodes in the topology domain with the highest score according to its scheduling result, the kubelet component on each node being responsible for the specific binding actions.

The modeling comprises: obtaining node network topology information and hardware performance labels; constructing an initial topology domain according to the switch hierarchy based on network topology positions; carrying out computing performance analysis on each node in each bottom-level topology domain of the initial topology domain, and constructing topology domains according to the performance labels; and scoring according to the node performance labels in each topology domain and converting the scores into weights.

Further, in the weighted hierarchical cluster node performance topology domain mod