CN-122019183-A - Scheduling method and system based on multi-equipment cooperation

CN122019183A

Abstract

The application discloses a scheduling method and system based on multi-equipment cooperation, and relates to the technical field of cooperative scheduling. The method acquires the sensor data stream of a heterogeneous device cluster and quantizes the deviation between the current core temperature and the target safe temperature. An incremental PID control algorithm then converts this thermal deviation into a load intensity control signal matched to the current thermal tolerance. Next, a mapping from the continuous signal to discrete credentials generates a token pool that controls task admission, and atomic matching of tokens and tasks routes each task to a physical node that has a thermal safety margin. On this basis, task execution feedback and the thermal deviation features are combined to apply closed-loop adaptive correction and iterative optimization to the control parameters. This effectively avoids the forced frequency reduction and performance collapse that heat accumulation causes in nodes, and achieves adaptive compute matching and stable scheduling for heterogeneous clusters under sustained high load.

Inventors

  • Pan Yanghuan
  • Li Yang
  • Song Lingling

Assignees

  • 浙江数智引擎信息科技有限公司

Dates

Publication Date
2026-05-12
Application Date
2026-02-11

Claims (9)

  1. A scheduling method based on multi-device cooperation, characterized by comprising the following steps: S1, acquiring an original sensor data stream of a heterogeneous device cluster, wherein the original sensor data stream comprises a current core temperature and a target safe temperature; S2, performing deviation quantization on the current core temperature and the target safe temperature to obtain a thermal deviation vector containing a temperature error and its rate of change; S3, based on the current PID parameter set, performing incremental PID control signal calculation on the thermal deviation vector to obtain a load intensity control signal representing the proportion of load that each device can currently bear; S4, based on the maximum throughput benchmark of each device, dynamically mapping the continuous load intensity control signal to discrete tokens to obtain a task token pool containing the task admission credentials currently available for each device; S5, performing atomic matching verification between the task token pool and the pending task queue and, after the token of the corresponding device is successfully deducted, routing the task to a physical node for execution to obtain a scheduled task flow, and collecting execution feedback data generated while the task runs; and S6, performing adaptive parameter correction and closed-loop feedback based on the execution feedback data and the thermal deviation vector to obtain a PID parameter set for the next scheduling period.
  2. The scheduling method based on multi-device cooperation according to claim 1, wherein step S2 comprises: performing prior estimation correction and weighted smoothing on the current core temperature in the original sensor data stream to obtain a smoothed current temperature; calculating the algebraic difference between the target safe temperature and the smoothed current temperature to obtain the temperature error; and performing a first-order differential operation on the temperature error to obtain its rate of change, and packaging the temperature error and the rate of change into a structure to obtain the thermal deviation vector.
  3. The scheduling method based on multi-device cooperation according to claim 1, wherein step S3 comprises: comparing the absolute value of the temperature error in the thermal deviation vector against a preset integral separation threshold to obtain an integral separation coefficient; based on the current PID parameter set, performing incremental PID calculation on the thermal deviation vector to obtain the control increment for the current period; and adding the control increment to the output value of the previous period and clipping the result to an interval to obtain the load intensity control signal.
  4. The scheduling method based on multi-device cooperation according to claim 1, wherein step S4 comprises: determining the token generation rate for the current control period based on the load intensity control signal and the maximum throughput benchmark of the device; calculating a new increment from the token generation rate and the time interval, and adding the new increment to the residual state from the previous moment to obtain a theoretical token total that includes a non-integer part; and, based on the maximum bucket capacity, performing discretization and anti-overflow clipping on the theoretical token total to update the integer number of available credentials in the task token pool.
  5. The scheduling method based on multi-device cooperation according to claim 1, wherein step S5 comprises: traversing the task token pool to screen out candidate devices that hold valid credentials, and binding them to the head task of the pending task queue according to a load balancing strategy to determine candidate task mapping pairs; performing an atomic decrement on the token count of the device in each candidate task mapping pair, and distributing the task code to the physical node through a communication interface to obtain the scheduled task flow; and monitoring the lifecycle of the scheduled task flow to capture the actual start time and end time of each task and calculate its physical execution time, thereby aggregating execution feedback data that includes the completion status.
  6. The scheduling method based on multi-device cooperation according to claim 1, wherein step S6 comprises: calculating the zero-crossing rate and the mean absolute error of the thermal deviation vector over a historical time window to obtain a stability feature vector representing the dynamic characteristics of the system; performing fuzzy gain scheduling inference on the stability feature vector to obtain a parameter adjustment vector; and iteratively updating the current PID parameter set based on the parameter adjustment vector to output the PID parameter set for the next scheduling period.
  7. The scheduling method based on multi-device cooperation according to claim 6, wherein, in step S6, performing fuzzy gain scheduling inference on the stability feature vector to obtain the parameter adjustment vector comprises performing the inference according to the following formula: [formula and symbols missing from the source text], wherein the terms of the formula are, in order: the oscillation component of the stability feature vector; the precision component of the stability feature vector; the adjustment step-size coefficient for eliminating error; the adjustment step-size coefficient for suppressing oscillation; and the parameter adjustment vector.
  8. The scheduling method based on multi-device cooperation according to claim 6, wherein iteratively updating the current PID parameter set based on the parameter adjustment vector to output the PID parameter set for the next scheduling period comprises: combining the parameter adjustment vector with a historical momentum vector, weighted by a momentum decay factor, to obtain an updated historical momentum vector and a candidate parameter set containing preliminary update values; based on a previously measured estimate of the thermal time constant, performing equal-damping manifold projection correction on the candidate parameter set to obtain a corrected parameter set that satisfies the thermodynamic coupling constraint; and, taking the currently effective PID parameter set as the trust region center, applying dynamic trust region truncation and physical boundary limiting to the corrected parameter set, and scaling down the update when the Euclidean distance between the corrected parameter set and the current PID parameter set exceeds a preset trust radius, to obtain the PID parameter set for the next scheduling period.
  9. A scheduling system based on multi-device cooperation, comprising: a multi-source heterogeneous data acquisition module for acquiring an original sensor data stream of a heterogeneous device cluster, wherein the original sensor data stream comprises a current core temperature and a target safe temperature; a deviation quantization module for performing deviation quantization on the current core temperature and the target safe temperature to obtain a thermal deviation vector containing a temperature error and its rate of change; a PID control calculation module for performing incremental PID control signal calculation on the thermal deviation vector based on the current PID parameter set to obtain a load intensity control signal representing the proportion of load that each device can currently bear; a token dynamic mapping module for dynamically mapping the continuous load intensity control signal to discrete tokens based on the maximum throughput benchmark of each device to obtain a task token pool containing the task admission credentials currently available for each device; an atomic task scheduling module for performing atomic matching verification between the task token pool and the pending task queue and, after the token of the corresponding device is successfully deducted, routing the task to a physical node for execution to obtain a scheduled task flow, and collecting execution feedback data generated while the task runs; and an adaptive correction and feedback module for performing adaptive parameter correction and closed-loop feedback based on the execution feedback data and the thermal deviation vector to obtain a PID parameter set for the next scheduling period.
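
By way of non-limiting illustration, the deviation quantization of claim 2 and the incremental PID calculation of claim 3 might be sketched in Python as follows. The claims do not fix a smoothing filter, gains, or thresholds, so the exponential smoothing, the names `PidParams`, `PidState`, `thermal_deviation`, and `incremental_pid_step`, and all numeric defaults are assumptions made only for this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PidParams:
    kp: float
    ki: float
    kd: float


@dataclass
class PidState:
    e_prev: float = 0.0               # temperature error of the previous period
    e_prev2: float = 0.0              # temperature error two periods ago
    u_prev: float = 1.0               # previous load intensity signal, in [0, 1]
    smoothed: Optional[float] = None  # smoothed core temperature


def thermal_deviation(state, raw_temp, target_temp, dt, alpha=0.3):
    """Return (temperature error, rate of change) for one device."""
    first = state.smoothed is None
    # Assumption: exponential smoothing stands in for the unspecified
    # "prior estimation correction and weighted smoothing".
    if first:
        state.smoothed = raw_temp
    else:
        state.smoothed = alpha * raw_temp + (1 - alpha) * state.smoothed
    error = target_temp - state.smoothed  # negative when the device is too hot
    rate = 0.0 if first else (error - state.e_prev) / dt
    return error, rate


def incremental_pid_step(params, state, error, sep_threshold=15.0):
    """Return the clipped load intensity signal for the current period."""
    # Integral separation: drop the integral term while the error is large.
    beta = 0.0 if abs(error) > sep_threshold else 1.0
    delta_u = (
        params.kp * (error - state.e_prev)
        + beta * params.ki * error
        + params.kd * (error - 2 * state.e_prev + state.e_prev2)
    )
    # Add the increment to the previous output and clip to [0, 1].
    u = min(1.0, max(0.0, state.u_prev + delta_u))
    state.e_prev2, state.e_prev, state.u_prev = state.e_prev, error, u
    return u
```

In this sketch a device running 5 °C above its target sees its permitted load fraction fall each period until the error closes, while an error larger than the separation threshold is handled by the proportional and derivative terms alone.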
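
By way of non-limiting illustration, the token mapping of claim 4 and the atomic deduction of claim 5 might be sketched in Python as follows. The class `TokenBucket`, the function `dispatch`, the use of a per-device lock to make the deduction atomic, and the "most tokens first" load balancing rule are all assumptions of this sketch; the claims do not prescribe them.

```python
import math
import threading
from collections import deque
from dataclasses import dataclass, field


@dataclass
class TokenBucket:
    max_throughput: float  # tasks per second at full load (device benchmark)
    capacity: int          # maximum bucket capacity
    tokens: int = 0        # integer admission credentials available now
    residue: float = 0.0   # fractional token carried over to the next period
    _lock: threading.Lock = field(default_factory=threading.Lock, repr=False)

    def refill(self, load_signal, dt):
        """Map the continuous load signal to discrete tokens for one period."""
        rate = load_signal * self.max_throughput  # token generation rate
        with self._lock:
            total = self.tokens + self.residue + rate * dt  # may be fractional
            whole = math.floor(total)
            if whole >= self.capacity:
                # Anti-overflow clipping: a full bucket keeps no residue.
                self.tokens, self.residue = self.capacity, 0.0
            else:
                self.tokens, self.residue = whole, total - whole

    def try_take(self):
        """Atomically deduct one token; return False if none is left."""
        with self._lock:
            if self.tokens > 0:
                self.tokens -= 1
                return True
            return False


def dispatch(pool, queue):
    """Route head-of-queue tasks to devices that still hold tokens."""
    routed = []
    while queue:
        candidates = [name for name, b in pool.items() if b.tokens > 0]
        if not candidates:
            break  # no device has thermal headroom; tasks stay queued
        # Assumed load balancing rule: prefer the device with most tokens.
        device = max(candidates, key=lambda name: pool[name].tokens)
        if pool[device].try_take():
            routed.append((queue.popleft(), device))
    return routed
```

Carrying the fractional residue between periods means a low load signal still yields a token eventually instead of rounding to zero forever, and a task leaves the queue only after its token deduction has succeeded.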
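
By way of non-limiting illustration, the stability features of claim 6 and the momentum and trust region steps of claim 8 might be sketched in Python as follows. The formula of claim 7 and the equal-damping manifold projection are not reproduced, because the source text does not give them. The function names, the particular momentum blend, and the default decay factor are assumptions of this sketch.

```python
import math


def stability_features(errors):
    """Return (zero-crossing rate, mean absolute error) over a window."""
    n = len(errors)
    if n < 2:
        raise ValueError("need at least two samples in the window")
    # A sign change between consecutive errors counts as one crossing.
    crossings = sum(1 for a, b in zip(errors, errors[1:]) if a * b < 0)
    zcr = crossings / (n - 1)
    mae = sum(abs(e) for e in errors) / n
    return zcr, mae


def momentum_update(adjust, momentum, gamma=0.9):
    """Blend the new adjustment with the historical momentum vector.

    Assumption: an exponential moving average weighted by the decay
    factor gamma; the claim does not state the exact weighting.
    """
    return [gamma * m + (1 - gamma) * a for m, a in zip(momentum, adjust)]


def trust_region_clip(current, candidate, radius, lower, upper):
    """Limit the update to a ball around the current PID parameter set."""
    step = [c - p for c, p in zip(candidate, current)]
    dist = math.sqrt(sum(s * s for s in step))
    # Scale the step down only when it leaves the trust region.
    scale = radius / dist if dist > radius else 1.0
    moved = [p + scale * s for p, s in zip(current, step)]
    # Physical boundary limiting on each parameter.
    return [min(hi, max(lo, v)) for v, lo, hi in zip(moved, lower, upper)]
```

A high zero-crossing rate indicates that the temperature error is oscillating around the target, while a high mean absolute error with few crossings indicates a persistent offset; the trust region step keeps any single scheduling period from moving the PID parameters far from the values currently in effect.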

Description

Scheduling method and system based on multi-equipment cooperation

Technical Field

The application relates to the technical field of cooperative scheduling, and in particular to a scheduling method and system based on multi-equipment cooperation.

Background

With the rapid development of the Internet of Things, 5G communication, and edge computing, the volume of data generated at the network edge is growing explosively, and the computing model is gradually shifting from centralized processing in a central cloud to distributed collaborative processing at the edge. In complex application scenarios such as smart cities, the industrial internet, and intelligent transportation, compute-intensive tasks such as deep learning inference and real-time video analysis place strict demands on latency and throughput. To meet these demands, multiple computing nodes with different functions and performance levels are usually assembled into a heterogeneous device cluster. A scheduling scheme based on multi-device cooperation can integrate the fragmented computing resources in the cluster and achieve parallel processing and load balancing of tasks, thereby overcoming the physical limits of a single device in peak compute, storage capacity, and power consumption control. Efficient cooperative scheduling is key to improving the overall compute utilization of the edge cluster, and it is also a core technical support for guaranteeing the real-time response of critical tasks and extending device service life. However, although multi-device cooperative scheduling can in theory significantly improve overall cluster throughput, existing scheduling policies often fail to maintain the expected performance stability in real high-load scenarios.
In the prior art, conventional scheduling algorithms are mostly based on a device's static nominal computing power or its current instantaneous resource utilization (such as CPU occupancy and remaining memory). Their core assumption is that the processing capacity of a computing node is constant or linearly predictable. This assumption ignores the thermodynamic behavior of physical devices under long, high-intensity operation. In practice, especially for embedded edge devices with limited heat dissipation, sustained high-load computation inevitably causes a steep rise in core temperature, which in turn triggers a forced frequency-reduction protection mechanism at the hardware level (thermal throttling). Once a device enters the thermal protection state, it cannot maintain its estimated full-load computing power, and its processing capacity may even drop sharply because of the sudden frequency reduction. If the scheduling system then continues to distribute tasks according to the performance figures from before the frequency reduction, the resulting error in predicted computing power causes a large backlog of tasks on overheated nodes, leading to system-level performance collapse and scheduling failure. Therefore, there is a need for an optimized multi-device cooperative scheduling method and system.

Disclosure of Invention

The present application has been made to solve the above-mentioned technical problems.
According to an aspect of the present application, there is provided a scheduling method based on multi-device cooperation, including: S1, acquiring an original sensor data stream of a heterogeneous device cluster, wherein the original sensor data stream comprises a current core temperature and a target safe temperature; S2, performing deviation quantization on the current core temperature and the target safe temperature to obtain a thermal deviation vector containing a temperature error and its rate of change; S3, based on the current PID parameter set, performing incremental PID control signal calculation on the thermal deviation vector to obtain a load intensity control signal representing the proportion of load that each device can currently bear; S4, based on the maximum throughput benchmark of each device, dynamically mapping the continuous load intensity control signal to discrete tokens to obtain a task token pool containing the task admission credentials currently available for each device; S5, performing atomic matching verification between the task token pool and the pending task queue and, after the token of the corresponding device is successfully deducted, routing the task to a physical node for execution to obtain a scheduled task flow, and collecting execution feedback data generated while the task runs; and S6, performing adaptive parameter correction and closed-loop feedback based on the execution feedback data and the thermal deviation vector to obtain a PID parameter set for the next s