CN-122019082-A - Scheduling system, scheduling method of scheduling system, electronic device, and storage medium

Abstract

The application discloses a scheduling system, a scheduling method of the scheduling system, an electronic device, and a storage medium, and belongs to the technical field of cloud computing. The scheduling system comprises a scheduling control module, a resource management module, and a job scheduling module. The scheduling control module is configured to, in response to a received task to be scheduled, determine a target orchestration computing node from the orchestration computing nodes contained in a container orchestration cluster according to the computing power specification requirement of the task to be scheduled. The resource management module is configured to deploy a target container group on the target orchestration computing node and register the target container group as a scheduling computing node in a job scheduling cluster, wherein the amount of computing power resources allocated to the target container group matches the computing power specification requirement. The job scheduling module is configured to distribute the task to be scheduled to the target container group for execution. Because the amount of computing power resources of the scheduling computing node corresponding to the target container group is requested and partitioned dynamically on demand, rather than being the full system resources of an orchestration computing node, the problem of low computing power resource utilization in the job scheduling cluster is effectively solved.

Inventors

  • WANG FEI
  • JIANG JIELONG
  • DENG YUNLONG
  • ZHOU QIFENG
  • SHI KAIWEN
  • Wu Dianqiu

Assignees

  • 广电运通集团股份有限公司

Dates

Publication Date
2026-05-12
Application Date
2026-01-09

Claims (10)

  1. A scheduling system, comprising a container orchestration cluster and a job scheduling cluster, the job scheduling cluster being deployed and operating on the container orchestration cluster, the scheduling system further comprising: a scheduling control module configured to, in response to a received task to be scheduled, determine a target orchestration computing node from the orchestration computing nodes contained in the container orchestration cluster according to the computing power specification requirement of the task to be scheduled; a resource management module configured to deploy a target container group on the target orchestration computing node and register the target container group as a scheduling computing node in the job scheduling cluster, wherein the amount of computing power resources allocated to the target container group matches the computing power specification requirement; and a job scheduling module configured to distribute the task to be scheduled to the target container group for execution.
  2. The scheduling system of claim 1, wherein the resource management module is configured to partition a target virtual computing power slice meeting the computing power specification requirement from a physical computing power card on the target orchestration computing node using a container resource isolation mechanism or a device virtualization plug-in, and to bind the target virtual computing power slice to the target container group as a computing power resource, so that computing power resources are shared and multiplexed.
  3. The scheduling system of claim 1, wherein the scheduling control module is configured to: screen, from the orchestration computing nodes contained in the container orchestration cluster, candidate orchestration computing nodes that meet the computing power specification requirement of the task to be scheduled; access a node dynamic score queue and obtain the node scores of the candidate orchestration computing nodes from the node dynamic score queue, wherein each node score is calculated from a plurality of resource monitoring parameters of the corresponding candidate orchestration computing node; and determine the candidate orchestration computing node with the highest node score as the target orchestration computing node.
  4. The scheduling system of claim 3, wherein the scheduling control module is configured to maintain the node dynamic score queue by: periodically obtaining each resource monitoring parameter of each orchestration computing node from a monitoring component, wherein the resource monitoring parameters comprise at least one performance parameter and a remaining-resource parameter; determining, from each resource monitoring parameter of each orchestration computing node, a parameter score for that resource monitoring parameter; computing a weighted sum of the parameter scores using the parameter weights of the resource monitoring parameters to obtain the node score of each orchestration computing node; and storing the node scores of the orchestration computing nodes in the node dynamic score queue in descending order of node score.
  5. The scheduling system of claim 1, wherein the resource management module is configured to, when registering the target container group as a scheduling computing node in the job scheduling cluster, set the node name of the scheduling computing node corresponding to the target container group to the task name of the task to be scheduled; and the job scheduling module is configured to look up the target container group in the job scheduling cluster by the task name of the task to be scheduled and distribute the task to be scheduled to the target container group for execution.
  6. The scheduling system of claim 5, wherein the job scheduling module comprises a container process manager; when execution of the task to be scheduled is complete, the target container group sends a task execution completion semaphore to the container process manager, so that the container process manager exits the main process; when the main process exits, the target container group triggers automatic destruction and releases its own computing power resources; and when the resource management module detects the destruction semaphore of the target container group, the resource management module deregisters the scheduling computing node corresponding to the target container group from the job scheduling cluster.
  7. The scheduling system of any one of claims 1-6, wherein the resource management module is further configured to: obtain a custom resource through which a user submits the task to be scheduled to the scheduling control module; when the custom resource includes a task priority, insert the task to be scheduled into a task scheduling queue of the job scheduling cluster according to the task priority; and when the custom resource does not include a task priority, insert the task to be scheduled at the tail of the task scheduling queue; and wherein the scheduling control module is configured to read the current task to be scheduled from the head of the task scheduling queue.
  8. A scheduling method of a scheduling system, wherein the scheduling system comprises a container orchestration cluster and a job scheduling cluster, the job scheduling cluster being deployed and operating on the container orchestration cluster, the scheduling system further comprises a scheduling control module, a resource management module, and a job scheduling module, and the scheduling method comprises: in response to a received task to be scheduled, determining, by the scheduling control module, a target orchestration computing node from the orchestration computing nodes contained in the container orchestration cluster according to the computing power specification requirement of the task to be scheduled; deploying, by the resource management module, a target container group on the target orchestration computing node, and registering the target container group as a scheduling computing node in the job scheduling cluster, wherein the amount of computing power resources allocated to the target container group matches the computing power specification requirement; and distributing, by the job scheduling module, the task to be scheduled to the target container group for execution.
  9. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the scheduling method of the scheduling system of claim 8.
  10. A non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the scheduling method of the scheduling system of claim 8.
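
The node selection described in claims 3 and 4 can be illustrated with a minimal Python sketch. This is an illustration under assumptions, not the applicant's implementation: the parameter names (`cpu_idle`, `mem_free`, `gpu_free`), the weight values, the `Node` class, and the convention that parameter scores lie in [0, 1] are all hypothetical, since the claims only state that parameter scores are combined by a weighted sum and that the queue is ordered from high to low.

```python
from dataclasses import dataclass

# Hypothetical parameter weights; the claims only say each parameter has a weight.
WEIGHTS = {"cpu_idle": 0.3, "mem_free": 0.3, "gpu_free": 0.4}


@dataclass
class Node:
    name: str
    free_slices: int   # remaining computing power slices (assumed unit)
    param_scores: dict  # per-parameter scores, assumed normalized to [0, 1]


def node_score(node, weights=WEIGHTS):
    """Weighted sum of the node's parameter scores (claim 4)."""
    return sum(w * node.param_scores[name] for name, w in weights.items())


def build_score_queue(nodes):
    """Return (score, node) pairs ordered from highest to lowest score."""
    scored = [(node_score(n), n) for n in nodes]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


def pick_target(score_queue, required_slices):
    """Screen candidates by the requirement, then take the best score (claim 3)."""
    candidates = [(s, n) for s, n in score_queue if n.free_slices >= required_slices]
    if not candidates:
        return None
    return max(candidates, key=lambda pair: pair[0])[1]


nodes = [
    Node("node-a", 1, {"cpu_idle": 0.9, "mem_free": 0.8, "gpu_free": 0.1}),
    Node("node-b", 4, {"cpu_idle": 0.5, "mem_free": 0.5, "gpu_free": 0.9}),
    Node("node-c", 0, {"cpu_idle": 0.9, "mem_free": 0.9, "gpu_free": 0.9}),
]
queue = build_score_queue(nodes)
target = pick_target(queue, required_slices=1)
```

In this sketch `node-c` has the highest score but no free slices, so it is screened out before scores are compared. In a real deployment the queue would be rebuilt periodically from the monitoring component rather than once.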

Description

Scheduling system, scheduling method of scheduling system, electronic device, and storage medium

Technical Field

The application belongs to the technical field of cloud computing, and particularly relates to a scheduling system, a scheduling method of the scheduling system, an electronic device, and a storage medium.

Background

In job scheduling systems, some clusters, such as Slurm (Simple Linux Utility for Resource Management), by default do not allow one node to run multiple tasks (or jobs) at the same time, and the resources on a node cannot be shared by multiple tasks (that is, resources are neither overcommitted nor shared). As a result, a large amount of node resources, especially computing power resources, sits idle. This is most pronounced in light-load periods, when the computing power resources of the system are monopolized by a few tasks, so resource utilization is low.

Disclosure of Invention

The present application aims to solve at least one of the technical problems existing in the related art. To this end, the application provides a scheduling system, a scheduling method of the scheduling system, an electronic device, and a storage medium that improve the utilization of system resources such as computing power.
In a first aspect, the present application provides a scheduling system. The scheduling system comprises a container orchestration cluster and a job scheduling cluster, the job scheduling cluster being deployed and operating on the container orchestration cluster, and further comprises: a scheduling control module configured to, in response to a received task to be scheduled, determine a target orchestration computing node from the orchestration computing nodes contained in the container orchestration cluster according to the computing power specification requirement of the task to be scheduled; a resource management module configured to deploy a target container group on the target orchestration computing node and register the target container group as a scheduling computing node in the job scheduling cluster, wherein the amount of computing power resources allocated to the target container group matches the computing power specification requirement; and a job scheduling module configured to distribute the task to be scheduled to the target container group for execution.
The scheduling system comprises a container orchestration cluster and a job scheduling cluster, with the job scheduling cluster deployed and operating on the container orchestration cluster. When a task is scheduled, the scheduling control module, in response to the received task to be scheduled, determines a target orchestration computing node from the orchestration computing nodes contained in the container orchestration cluster according to the computing power specification requirement of the task. The resource management module then deploys a target container group on the target orchestration computing node and registers it as a scheduling computing node in the job scheduling cluster, with the amount of computing power resources allocated to the target container group matching the computing power specification requirement. Finally, the job scheduling module distributes the task to be scheduled to the target container group for execution. In other words, the created scheduling computing node is exclusive to the task to be scheduled, but the amount of computing power resources behind that scheduling computing node is requested and partitioned dynamically on demand, rather than being the full system resources of an orchestration computing node in the container orchestration cluster. This effectively solves the problem of low computing power resource utilization in the job scheduling cluster (for example, a native Slurm cluster).
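
The flow described above can be sketched as a small in-memory simulation in Python. This is only an illustration under assumptions: the class and function names (`OrchestrationNode`, `ContainerGroup`, `JobCluster`, `select_node`, `schedule`, `finish`) are hypothetical, computing power is modeled as an integer count of slices, and node selection is reduced to "most free slices" as a stand-in for the node score. No real container orchestration or Slurm API is used.

```python
from dataclasses import dataclass


@dataclass
class OrchestrationNode:
    """A physical node in the container orchestration cluster."""
    name: str
    total_slices: int
    used_slices: int = 0

    @property
    def free_slices(self):
        return self.total_slices - self.used_slices


@dataclass
class ContainerGroup:
    """A container group sized to exactly what one task requested."""
    task_name: str
    host: OrchestrationNode
    slices: int


class JobCluster:
    """Job scheduling cluster: scheduling computing nodes keyed by node name."""
    def __init__(self):
        self.nodes = {}


def select_node(orch_nodes, required):
    # Stand-in for the scheduling control module (assumption: most free slices wins).
    candidates = [n for n in orch_nodes if n.free_slices >= required]
    return max(candidates, key=lambda n: n.free_slices, default=None)


def schedule(task_name, required, orch_nodes, job_cluster):
    host = select_node(orch_nodes, required)
    if host is None:
        return None
    # Resource management: carve out only the requested amount, not the whole node.
    host.used_slices += required
    group = ContainerGroup(task_name, host, required)
    # Register the group as a scheduling computing node named after the task.
    job_cluster.nodes[task_name] = group
    return group


def finish(task_name, job_cluster):
    # Deregister the scheduling computing node and release its slices.
    group = job_cluster.nodes.pop(task_name)
    group.host.used_slices -= group.slices


host = OrchestrationNode("gpu-node-1", total_slices=8)
cluster = JobCluster()
group_a = schedule("task-a", 2, [host], cluster)
group_b = schedule("task-b", 3, [host], cluster)
```

Here two tasks each see an exclusive scheduling computing node in the job cluster, yet both container groups share one physical node, which is the utilization gain the paragraph describes.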
According to one embodiment of the application, the resource management module is configured to partition a target virtual computing power slice meeting the computing power specification requirement from a physical computing power card on the target orchestration computing node using a container resource isolation mechanism or a device virtualization plug-in, and to bind the target virtual computing power slice to the target container group as a computing power resource, so that computing power resources are shared and multiplexed. According to one embodiment of the application, the scheduling control module is configured to screen, from the orchestration computing nodes contained in the container orchestration cluster, candidate orchestration computing nodes that meet the computing power specification requirement of the task to be scheduled; and to access a node dynamic score queue and obtain the node scores of the candidate orchestration computing nodes from the node dynamic score queue, wherein the node scores are calculated