CN-122019107-A - Heterogeneous computing power integrated dynamic scheduling system

Abstract

The invention discloses a heterogeneous computing power integrated dynamic scheduling system in the field of computer technology. The system comprises a heterogeneous computing power monitoring decision component that analyzes computing power resource states based on topological homology theory to generate topological feature descriptors, optimization parameters and state conversion signals. The component comprises a topological performance space construction unit, a topological data storage unit, a topological feature analysis unit, a manifold learning unit, a topology maintenance optimization unit, a parameter adaptive adjustment unit, a topological state description unit, a state configuration mapping unit and a topological mutation detection unit.
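
The step from a performance point cloud to a topological feature descriptor can be illustrated with a minimal sketch of zero-dimensional persistent homology, which tracks how clusters of resources merge as a distance threshold grows. The patent text does not specify an algorithm; the function name `h0_persistence`, the Euclidean metric and the sample points below are assumptions made for illustration only.

```python
import math
from itertools import combinations


def h0_persistence(points):
    """Return the death scales of connected components (H0) in a
    Vietoris-Rips filtration of `points`, in increasing order.

    Every component is born at scale 0. One component never dies and
    is not listed, so the result has len(points) - 1 entries.
    """
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges sorted by length: the order in which the
    # filtration adds them.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    deaths = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # this edge merges two components
            parent[ri] = rj
            deaths.append(length)
    return deaths


# Hypothetical performance points: (utilization, normalized latency).
cloud = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0)]
deaths = h0_persistence(cloud)
print(deaths)
```

In this toy cloud two components die at scale 1.0 and one at 5.0. The long-lived component suggests two well-separated groups of resources, which is the kind of structure a zero-order descriptor would summarize.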

Inventors

  • Liang Yinye
  • LI SHOUYUE
  • YU HONGMIAO
  • GU XUE
  • HE SHUNYUN

Assignees

  • Qinghai Digital Economy Development Group Co., Ltd. (青海数字经济发展集团有限公司)

Dates

Publication Date
20260512
Application Date
20260210

Claims (10)

  1. A heterogeneous computing power integrated dynamic scheduling system, characterized by comprising:
     a heterogeneous computing power unified modeling component, configured to perform unified modeling of heterogeneous computing power resources and generate a heterogeneous computing power resource model;
     a heterogeneous computing power integrated scheduling component, connected to the heterogeneous computing power unified modeling component and configured to receive the resource model sent by the heterogeneous computing power unified modeling component and generate a scheduling strategy based on the resource model;
     a heterogeneous computing power integrated driving component, connected to the heterogeneous computing power integrated scheduling component and configured to receive the scheduling strategy sent by the heterogeneous computing power integrated scheduling component and execute resource invocation and task distribution;
     a heterogeneous computing power monitoring decision component, connected to the heterogeneous computing power integrated driving component and configured to collect execution data of the heterogeneous computing power integrated driving component, analyze computing power resource states based on topological homology theory, and generate topological feature descriptors, optimization parameters and state conversion signals; and
     a system operation and maintenance component, connected to the heterogeneous computing power monitoring decision component and configured to receive the monitoring data generated by the heterogeneous computing power monitoring decision component and provide system state visualization and maintenance functions.
  2. The heterogeneous computing power integrated dynamic scheduling system of claim 1, wherein the heterogeneous computing power monitoring decision component comprises:
     a topological performance space construction unit, configured to collect the computing power resource performance data sent by the heterogeneous computing power integrated driving component, construct a multidimensional performance index system for computing power resources, generate a performance point cloud, build a simplicial complex and extract topological features;
     a topological data storage unit, connected to the topological performance space construction unit and configured to store a performance point set, an edge set, a simplex set and persistent homology descriptors; and
     a topological feature analysis unit, connected to the topological data storage unit and configured to analyze the persistent homology descriptors and generate a topological feature description of the computing power resources.
  3. The heterogeneous computing power integrated dynamic scheduling system of claim 2, wherein the heterogeneous computing power monitoring decision component further comprises:
     a manifold learning unit, configured to construct a computing power resource performance manifold based on the data in the topological data storage unit, determine neighborhood relations, calculate local coordinates and generate a global coordinate alignment;
     a topology maintenance optimization unit, connected to the manifold learning unit and configured to perform weight optimization on the computing power resource performance manifold, calculate a topology loss function and a performance loss function, generate an optimization gradient, update a weight vector and verify the stability of the topological structure; and
     a parameter adaptive adjustment unit, connected to the topology maintenance optimization unit and configured to dynamically adjust the learning rate, the topology maintenance weight and the neighborhood range.
  4. The heterogeneous computing power integrated dynamic scheduling system of claim 3, wherein the heterogeneous computing power monitoring decision component further comprises:
     a topological state description unit, configured to construct a state descriptor system based on topological invariants and generate a zero-order descriptor, a first-order descriptor, a higher-order descriptor and a multi-scale integrated descriptor;
     a state configuration mapping unit, connected to the topological state description unit and configured to establish a mapping between topological states and system configurations, collect historical data, extract key features, learn a mapping model and generate configuration suggestions; and
     a topological mutation detection unit, connected to the topological state description unit and the state configuration mapping unit and configured to monitor changes in the persistence diagram, track topological invariants, calculate rates of change, detect structural mutations, evaluate the scope of impact and execute corresponding coping strategies.
  5. The heterogeneous computing power integrated dynamic scheduling system of claim 1, wherein the heterogeneous computing power unified modeling component comprises:
     a model library modeling unit, configured to receive the topological feature descriptors sent by the heterogeneous computing power monitoring decision component and update a computing power resource model library;
     a business demand analysis unit, configured to analyze the characteristics of computing services and determine their demand for computing power resources;
     a computing power resource pool modeling unit, configured to construct a computing power resource pool model based on the topological feature descriptors and generate resource configuration information; and
     a model training unit, configured to train and update the heterogeneous computing power unified model based on the optimization parameters sent by the heterogeneous computing power monitoring decision component.
  6. The heterogeneous computing power integrated dynamic scheduling system of claim 1, wherein the heterogeneous computing power integrated scheduling component comprises:
     a computing power scheduling unit, configured to receive the resource model sent by the heterogeneous computing power unified modeling component and perform an optimization evaluation based on an objective function to generate an optimal solution;
     a task analysis unit, configured to analyze the characteristics of computing tasks and determine each task's demand for computing power resources; and
     a resource matching unit, configured to optimally match tasks to resources according to the analysis result of the task analysis unit and the optimal solution of the computing power scheduling unit, and generate a scheduling strategy.
  7. The heterogeneous computing power integrated dynamic scheduling system of claim 1, wherein the heterogeneous computing power integrated driving component comprises:
     a computing power task scheduling unit, configured to receive the scheduling strategy sent by the heterogeneous computing power integrated scheduling component, manage a computing task queue and distribute computing tasks;
     a computing power resource driving unit, configured to invoke computing power resources based on the scheduling strategy, monitor resource states and dynamically adjust resource allocation; and
     a heterogeneous computing power adaptation scheduling unit, configured to adapt and switch between different types of computing power resources and ensure that the resources cooperate seamlessly.
  8. The heterogeneous computing power integrated dynamic scheduling system of claim 1, wherein the system operation and maintenance component comprises:
     a system maintenance unit, configured to maintain and upgrade all components of the system;
     an information extraction unit, configured to receive monitoring data from the heterogeneous computing power monitoring decision component and extract key system state information;
     a visual display unit, configured to display the system state information visually and support multi-dimensional interactive data display; and
     an exception handling unit, configured to detect system exceptions and execute fault isolation and recovery operations.
  9. The heterogeneous computing power integrated dynamic scheduling system of claim 1, wherein the heterogeneous computing power monitoring decision component, the heterogeneous computing power unified modeling component, the heterogeneous computing power integrated scheduling component and the heterogeneous computing power integrated driving component form a closed data loop comprising:
     a performance data flow from the heterogeneous computing power integrated driving component to the heterogeneous computing power monitoring decision component;
     a topological feature data flow from the heterogeneous computing power monitoring decision component to the heterogeneous computing power unified modeling component;
     a resource model data flow from the heterogeneous computing power unified modeling component to the heterogeneous computing power integrated scheduling component; and
     a scheduling strategy data flow from the heterogeneous computing power integrated scheduling component to the heterogeneous computing power integrated driving component.
  10. The heterogeneous computing power integrated dynamic scheduling system of claim 1, further comprising an exception handling and fault tolerance mechanism comprising:
     a data exception handling function, for handling missing data, filtering noisy data and detecting outliers;
     a component fault handling function, for monitoring the running state of each component, executing fault isolation and supporting degraded operation;
     a resource fault handling function, for monitoring the health of computing power resources, identifying faulty resources and dynamically adjusting task allocation; and
     a system recovery function, for saving the key state of the system and supporting fast recovery and progressive recovery.
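
Claim 6 describes matching tasks to resources by optimizing an objective function, without naming the objective or the solver. As a minimal sketch, assume the objective is the total estimated runtime and the instance is small enough for exhaustive search. The function name `best_assignment` and the cost figures are hypothetical.

```python
from itertools import permutations


def best_assignment(cost):
    """Exhaustively find the task -> resource assignment with minimum
    total cost, using each resource at most once.

    cost[t][r] is the assumed estimated cost of running task t on
    resource r. Requires len(cost) <= number of resources. Exhaustive
    search is only practical for very small instances.
    """
    n_tasks = len(cost)
    n_resources = len(cost[0])
    best_total, best_perm = float("inf"), None
    for perm in permutations(range(n_resources), n_tasks):
        total = sum(cost[t][r] for t, r in enumerate(perm))
        if total < best_total:
            best_total, best_perm = total, perm
    return best_total, list(best_perm)


# Hypothetical estimated runtimes in seconds.
# Rows are tasks; columns are resources: CPU, GPU, FPGA.
cost = [
    [9.0, 2.0, 4.0],  # training job
    [3.0, 6.0, 5.0],  # ETL job
    [7.0, 8.0, 1.0],  # signal-processing job
]
total, assignment = best_assignment(cost)
print(total, assignment)
```

Here the minimum total is 6.0 seconds, with the training job on the GPU, the ETL job on the CPU and the signal-processing job on the FPGA. A production scheduler would replace the exhaustive loop with a polynomial-time assignment solver or a heuristic.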
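
The data exception handling function of claim 10 (handling missing data, filtering noise, detecting outliers) can be sketched as follows. The specific techniques are assumptions, since the claim names none: forward-fill for gaps, a modified z-score based on the median absolute deviation for outliers, and a moving median for noise. The name `clean_series` and the readings are hypothetical.

```python
from statistics import median


def clean_series(samples, window=3, threshold=3.5):
    """Fill gaps, flag outliers and smooth a metric series.

    samples: list of floats, with None marking a missing reading.
             At least one reading must be present.
    Returns (smoothed, outlier_indices).
    """
    # 1. Missing data: carry the last valid reading forward
    #    (a leading gap takes the first valid reading).
    first_valid = next(x for x in samples if x is not None)
    filled, last = [], first_valid
    for x in samples:
        last = x if x is not None else last
        filled.append(last)

    # 2. Outliers: modified z-score using the median absolute deviation.
    med = median(filled)
    mad = median(abs(x - med) for x in filled)
    outliers = [
        i for i, x in enumerate(filled)
        if mad > 0 and 0.6745 * abs(x - med) / mad > threshold
    ]

    # 3. Noise: centered moving median over `window` readings
    #    (the window shrinks at the edges of the series).
    half = window // 2
    smoothed = [
        median(filled[max(0, i - half): i + half + 1])
        for i in range(len(filled))
    ]
    return smoothed, outliers


# Hypothetical GPU utilization readings (%), with one gap and one spike.
readings = [50.0, 52.0, None, 51.0, 98.0, 49.0, 50.0]
smoothed, outliers = clean_series(readings)
print(smoothed, outliers)
```

The spike at index 4 is flagged as an outlier and suppressed by the moving median, while the gap at index 2 is filled from the preceding reading.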

Description

Heterogeneous computing power integrated dynamic scheduling system

Technical Field

The invention relates to the field of computer technology, and in particular to a heterogeneous computing power integrated dynamic scheduling system for environments in which multiple heterogeneous computing resources work cooperatively.

Background

With the rapid development of technologies such as artificial intelligence, big data analysis and high-performance computing, heterogeneous computing environments are becoming increasingly common. In such environments, CPUs, GPUs, FPGAs, TPUs and other types of computing resources coexist, each with its own computing characteristics and performance advantages. Managing and scheduling these heterogeneous resources efficiently, so that they work together and each contributes its strengths, is an important challenge in computing today.

Existing heterogeneous computing resource management systems mainly schedule resources using heuristic rules or simple statistical models, and adapt poorly to complex, changing computing environments. They have four main defects. First, traditional methods focus on single-point performance indicators and struggle to capture the complex relational structure among resources. Second, they are sensitive to data noise and environmental fluctuation, which makes resource management unstable. Third, they have difficulty balancing local and global optimization, so resource allocation is frequently unbalanced. Finally, they adapt poorly to environmental change and cannot respond quickly to dynamic changes in computing load.

A system that can uniformly model, intelligently schedule and dynamically adapt heterogeneous computing power resources is therefore needed to improve overall system performance and resource utilization.
Disclosure of Invention

The invention aims to provide a heterogeneous computing power integrated dynamic scheduling system that introduces topological homology theory as its core innovation, achieves efficient monitoring, accurate evaluation and intelligent scheduling of heterogeneous computing power resources, and solves the problems of the prior art described above.

The invention provides a heterogeneous computing power integrated dynamic scheduling system, comprising:

a heterogeneous computing power unified modeling component, configured to perform unified modeling of heterogeneous computing power resources and generate a heterogeneous computing power resource model;

a heterogeneous computing power integrated scheduling component, connected to the heterogeneous computing power unified modeling component and configured to receive the resource model sent by the heterogeneous computing power unified modeling component and generate a scheduling strategy based on the resource model;

a heterogeneous computing power integrated driving component, connected to the heterogeneous computing power integrated scheduling component and configured to receive the scheduling strategy sent by the heterogeneous computing power integrated scheduling component and execute resource invocation and task distribution;

a heterogeneous computing power monitoring decision component, connected to the heterogeneous computing power integrated driving component and configured to collect execution data of the heterogeneous computing power integrated driving component, analyze computing power resource states based on topological homology theory, and generate topological feature descriptors, optimization parameters and state conversion signals; and

a system operation and maintenance component, connected to the heterogeneous computing power monitoring decision component and configured to receive the monitoring data generated by the heterogeneous computing power monitoring decision component and provide system state visualization and maintenance functions.

Preferably, the heterogeneous computing power monitoring decision component comprises:

a topological performance space construction unit, configured to collect the computing power resource performance data sent by the heterogeneous computing power integrated driving component, construct a multidimensional performance index system for computing power resources, generate a performance point cloud, build a simplicial complex and extract topological features;

a topological data storage unit, connected to the topological performance space construction unit and configured to store a performance point set, an edge set, a simplex set and persistent homology descriptors;

a topological feature analysis unit, connected to the topological data storage unit and configured to analyze the persistent homology descriptors and generate computing power resource topology characte