CN-122019088-A - Cloud computing-based distributed computing task cooperative processing method and system


Abstract

The invention relates to the technical field of task processing, and in particular to a cloud computing-based method and system for the cooperative processing of distributed computing tasks. The method comprises: obtaining a distributed computing task to be processed and task feature data of the distributed computing task; splitting the distributed computing task based on the task feature data to generate a plurality of subtasks corresponding to the distributed computing task; obtaining node feature data and historical load data of each computing node in a cloud computing platform; and allocating resource nodes to the subtasks based on the node feature data and the historical load data of each computing node, in combination with the task feature data of the subtasks, to obtain a resource node set corresponding to each subtask. By analyzing the features of the distributed computing task, splitting it, and combining the features and historical load data of each computing node, the invention allocates resource nodes to each subtask in a reasonable manner, which improves resource utilization and avoids idle and wasted resources.

Inventors

WANG BO

Assignees

宁波子骞信息科技有限公司

Dates

Publication Date: 20260512
Application Date: 20260122

Claims (10)

1. The cloud computing-based distributed computing task cooperative processing method is characterized by comprising the following steps of: acquiring task feature data of a distributed computing task to be processed; Splitting the distributed computing task based on the task feature data to generate a plurality of subtasks corresponding to the distributed computing task, wherein any subtask carries part or all of the task feature data; Acquiring node characteristic data and historical load data of each computing node in the cloud computing platform; Based on the node characteristic data and the historical load data of each computing node, combining the task characteristic data of the plurality of subtasks, distributing resource nodes for the plurality of subtasks, and obtaining a resource node set corresponding to each subtask; Analyzing the logic dependency relationship among the plurality of subtasks and constructing the communication link characteristics among different resource nodes based on the resource node set corresponding to each subtask, and generating a cooperative processing strategy among the plurality of subtasks, wherein the cooperative processing strategy is used for indicating the time sequence and the path of the data interaction of the subtasks on the different resource nodes; and estimating the execution efficiency data corresponding to the distributed computing task by adopting the cooperative processing strategy.
2. The cloud computing-based distributed computing task collaborative processing method according to claim 1, wherein the node characteristic data comprises a computing power level and a network bandwidth level; The method for distributing resource nodes for the subtasks based on the node characteristic data and the historical load data of the computing nodes and combining the task characteristic data of the subtasks to obtain a resource node set corresponding to each subtask comprises the following steps: carrying out standardized representation processing on the node characteristic data and the historical load data of each computing node to generate node standard characteristic information of each computing node; Searching computing nodes matched with the plurality of subtasks in the cloud computing platform based on task feature data of the plurality of subtasks; Acquiring matching degree scores between task feature data of any subtask and the node standard feature information of the searched computing node; Determining the computing node with the highest matching degree score as the main resource node of the subtask; and if the matching degree score is larger than or equal to a set allocation threshold value, adding the main resource node into the resource node set corresponding to the subtask.
3. The cloud computing-based distributed computing task collaborative processing method according to claim 2, wherein, based on node feature data and historical load data of each computing node, resource nodes are allocated to the plurality of subtasks in combination with task feature data of the plurality of subtasks to obtain a resource node set corresponding to each subtask, further comprising: If the matching degree score is smaller than the allocation threshold, selecting a standby resource node from a standby node pool of the cloud computing platform; Calculating a matching degree score between the task feature data of the subtasks and the node standard feature information of the standby resource nodes; and if the calculated matching degree score is greater than or equal to the allocation threshold value, adding the standby resource node into a resource node set corresponding to the subtask.
4. The cloud computing-based distributed computing task cooperative processing method according to claim 3, wherein analyzing the logical dependency relationship among the plurality of subtasks, constructing the communication link characteristics among different resource nodes based on the resource node set corresponding to each subtask, and generating the cooperative processing strategy among the plurality of subtasks comprises: analyzing the logical dependency relationship among the plurality of subtasks to obtain task dependency link data; calculating communication delay information between any two resource nodes and constructing the communication link characteristics based on the resource node set corresponding to each subtask; sequentially distributing operation instructions to each subtask according to the execution sequence determined by the task dependency link data; adjusting the data transmission quantity between adjacent subtasks based on the network bandwidth determined by the communication link characteristics; and combining the operation instructions and the adjusted data transmission quantity into the cooperative processing strategy.
5. The cloud computing-based distributed computing task cooperative processing method according to claim 4, wherein after the cooperative processing strategy is adopted, the method further comprises: acquiring current load data of each computing node in real time while the distributed computing task is executed using the cooperative processing strategy; if the current load data indicates that any resource node is in an overload state, screening, from the plurality of subtasks, target subtasks running on the overloaded resource node, searching the resource node set for idle resource nodes whose load is lower than a set load threshold, migrating the execution right of the target subtasks from the overloaded resource node to the idle resource node, and updating the cooperative processing strategy; and wherein after the execution efficiency data corresponding to the distributed computing task is estimated using the cooperative processing strategy, the method further comprises: if the execution efficiency data meets the set performance requirement, issuing the cooperative processing strategy to a dispatching center of the cloud computing platform; and controlling the dispatching center to allocate computing resources and start the distributed computing task according to the cooperative processing strategy.
6. The cloud computing-based distributed computing task cooperative processing method according to claim 5, wherein estimating the execution efficiency data corresponding to the distributed computing task using the cooperative processing strategy comprises: inputting the cooperative processing strategy into an efficiency estimation model, wherein the efficiency estimation model represents the mapping relation between the cooperative processing strategy and the task execution duration; calling the efficiency estimation model to output an estimated execution duration corresponding to the distributed computing task as the execution efficiency data; after the distributed computing task is executed, acquiring an actual execution duration; calculating a deviation value between the actual execution duration and the estimated execution duration; and if the deviation value is larger than a set error range, updating the parameters of the efficiency estimation model using the actual execution duration.
7. The cloud computing-based distributed computing task cooperative processing method according to claim 6, wherein the task feature data includes a task type, a data magnitude, and a processing timeliness requirement; and acquiring the distributed computing task to be processed and the task feature data of the distributed computing task comprises: receiving a calculation request sent by a user terminal, and extracting the distributed computing task to be processed from the calculation request; carrying out semantic analysis on the distributed computing task and identifying task attribute tags; and generating the task feature data based on the task attribute tags.
8. The cloud computing-based distributed computing task collaborative processing method according to claim 7, wherein before allocating resource nodes to the plurality of subtasks based on node feature data and historical load data of the computing nodes in combination with task feature data of the plurality of subtasks to obtain a set of resource nodes corresponding to each subtask, the method further comprises: Analyzing task type information of the distributed computing task according to the task characteristic information; splitting the distributed computing task into a plurality of task sub-modules according to the task type information and the node state information; calculating communication delay information between any two computing nodes in the computing node cluster; According to the module demand information of the task sub-module and the node load capacity information, matching a candidate transmission link in an initial transmission link set; Predicting transmission efficiency fluctuation range information of the candidate transmission link in a preset processing period according to the real-time load information of the candidate transmission link; Judging whether the transmission efficiency fluctuation range information meets preset efficiency requirement threshold information or not, and determining the candidate transmission link as a target transmission link when judging that the transmission efficiency fluctuation range information meets the preset efficiency requirement threshold information.
9. The cloud computing-based distributed computing task collaborative processing method according to claim 8, wherein splitting the distributed computing task into a plurality of task sub-modules according to the task type information and the node state information comprises: determining logic dependency relationship information of the distributed computing task according to the task type information; According to the node state information, analyzing computing power resource distribution information and network topology structure information of the computing node cluster; Identifying a critical path task and a non-critical path task in the distributed computing task according to the logic dependency relationship information; And combining the computing power resource distribution information and the network topology structure information, and disassembling the critical path task and the non-critical path task into a plurality of independent task sub-modules according to a preset splitting granularity.
10. A cloud computing-based distributed computing task cooperative processing system, applicable to the cloud computing-based distributed computing task cooperative processing method according to any one of claims 1 to 9, characterized by comprising: a task acquisition unit, used for acquiring a distributed computing task to be processed and task feature data of the distributed computing task; a task splitting unit, used for splitting the distributed computing task based on the task feature data to generate a plurality of subtasks corresponding to the distributed computing task, wherein any subtask carries part or all of the task feature data; a node acquisition unit, used for acquiring node feature data and historical load data of each computing node in the cloud computing platform; a task allocation unit, used for allocating resource nodes to the plurality of subtasks based on the node feature data and the historical load data of each computing node, in combination with the task feature data of the plurality of subtasks, to obtain a resource node set corresponding to each subtask; a cooperative processing unit, used for analyzing the logical dependency relationship among the plurality of subtasks, constructing the communication link characteristics among different resource nodes based on the resource node set corresponding to each subtask, and generating a cooperative processing strategy among the plurality of subtasks, wherein the cooperative processing strategy indicates the time sequence and the path of the data interaction of the subtasks on the different resource nodes; and an execution estimating unit, used for estimating the execution efficiency data corresponding to the distributed computing task using the cooperative processing strategy.
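The allocation logic of claims 2 and 3 (score each candidate node, take the highest-scoring node as the main resource node if it reaches the allocation threshold, otherwise fall back to the standby node pool) can be illustrated with a minimal Python sketch. The claims do not define how the matching degree score is computed, so the `match_score` formula, the 0..1 normalization of all fields, the default threshold of 0.8, and the names `Node`, `Subtask`, and `allocate` are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Node:
    name: str
    compute: float    # computing capability level, normalized to 0..1 (assumed)
    bandwidth: float  # network bandwidth level, normalized to 0..1 (assumed)
    load: float       # mean historical load, normalized to 0..1 (assumed)


@dataclass(frozen=True)
class Subtask:
    name: str
    compute_need: float    # required compute, 0..1 (assumed)
    bandwidth_need: float  # required bandwidth, 0..1 (assumed)


def match_score(task: Subtask, node: Node) -> float:
    """Hypothetical score: how well the node's spare capacity covers the demand."""
    spare = 1.0 - node.load
    compute_fit = min(1.0, node.compute * spare / max(task.compute_need, 1e-9))
    bandwidth_fit = min(1.0, node.bandwidth / max(task.bandwidth_need, 1e-9))
    return 0.5 * compute_fit + 0.5 * bandwidth_fit


def allocate(task: Subtask,
             nodes: Sequence[Node],
             standby_pool: Sequence[Node],
             threshold: float = 0.8) -> List[Node]:
    """Return the resource node set for one subtask."""
    # Claim 2: the highest-scoring node is the main resource node and is
    # added only if its score reaches the allocation threshold.
    if nodes:
        primary = max(nodes, key=lambda n: match_score(task, n))
        if match_score(task, primary) >= threshold:
            return [primary]
    # Claim 3: otherwise try standby nodes and add the first one that
    # reaches the same threshold (taking the first is an assumption).
    for candidate in standby_pool:
        if match_score(task, candidate) >= threshold:
            return [candidate]
    return []
```

In this sketch a heavily loaded node scores low even when its nominal capability is high, which is one simple way of letting historical load data influence the choice; a real system would substitute its own standardized feature representation and scoring.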

Description

Cloud computing-based distributed computing task cooperative processing method and system

Technical Field

The invention relates to the technical field of task processing, and in particular to a cloud computing-based distributed computing task cooperative processing method and a cloud computing-based distributed computing task cooperative processing system.

Background

Conventional methods often adopt a static resource allocation strategy that cannot be adjusted flexibly according to real-time load and task characteristics. As a result, some computing nodes may sit idle for long periods while others run inefficiently because they are overloaded. When faced with sudden load changes, conventional methods usually lack a dynamic adjustment mechanism and cannot migrate tasks or reallocate resources in time, which affects the stability and high availability of the system. In addition, conventional methods often give little detailed consideration to task splitting and scheduling, so dependencies between tasks may be handled improperly, data transmission delay increases, and overall execution efficiency falls. Conventional distributed computing systems also lack an effective real-time monitoring mechanism, so the load condition of each computing node cannot be obtained in time and fast decisions and optimizations cannot be made. Scheduling generally relies on preset rules without analysis of task characteristics and historical load, so it cannot adapt to the requirements of different scenarios and performs poorly under certain conditions.
Disclosure of Invention

In order to achieve the above purpose, the invention provides a cloud computing-based distributed computing task cooperative processing method, which comprises the following steps: acquiring task feature data of a distributed computing task to be processed; splitting the distributed computing task based on the task feature data to generate a plurality of subtasks corresponding to the distributed computing task, wherein any subtask carries part or all of the task feature data; acquiring node feature data and historical load data of each computing node in the cloud computing platform; allocating resource nodes to the plurality of subtasks based on the node feature data and the historical load data of each computing node, in combination with the task feature data of the plurality of subtasks, to obtain a resource node set corresponding to each subtask; analyzing the logical dependency relationship among the plurality of subtasks, constructing the communication link characteristics among different resource nodes based on the resource node set corresponding to each subtask, and generating a cooperative processing strategy among the plurality of subtasks, wherein the cooperative processing strategy indicates the time sequence and the path of the data interaction of the subtasks on the different resource nodes; and estimating the execution efficiency data corresponding to the distributed computing task using the cooperative processing strategy.
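The last two steps of the method (ordering subtasks by their logical dependencies, accounting for communication delay between the nodes that host them, and estimating an overall execution duration) can be illustrated with a minimal Python sketch. The function name `build_strategy`, the dictionary-based inputs, and the simplifying assumption that subtasks placed on the same node do not contend for it are hypothetical; the patent does not specify the estimation model.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def build_strategy(deps, placement, durations, latency):
    """Sketch of a cooperative processing strategy and its estimated duration.

    deps:      subtask -> set of prerequisite subtasks
    placement: subtask -> name of the resource node that runs it
    durations: subtask -> estimated run time in seconds
    latency:   (source node, destination node) -> transfer delay in seconds

    Returns (schedule, total), where schedule lists
    (subtask, node, start, finish) in dependency order.
    """
    order = list(TopologicalSorter(deps).static_order())
    finish = {}
    schedule = []
    for task in order:
        start = 0.0
        for dep in deps.get(task, ()):
            src, dst = placement[dep], placement[task]
            # Data crossing between nodes pays the link delay;
            # data staying on one node is assumed to be free.
            delay = 0.0 if src == dst else latency[(src, dst)]
            start = max(start, finish[dep] + delay)
        finish[task] = start + durations[task]
        schedule.append((task, placement[task], start, finish[task]))
    total = max(finish.values(), default=0.0)
    return schedule, total


# Usage with four hypothetical subtasks on two hypothetical nodes.
deps = {"split": set(), "map_a": {"split"},
        "map_b": {"split"}, "reduce": {"map_a", "map_b"}}
placement = {"split": "n1", "map_a": "n1", "map_b": "n2", "reduce": "n1"}
durations = {"split": 1.0, "map_a": 2.0, "map_b": 3.0, "reduce": 1.0}
latency = {("n1", "n2"): 0.5, ("n2", "n1"): 0.5}
schedule, total = build_strategy(deps, placement, durations, latency)
```

Here the schedule plays the role of the time sequence and path of data interaction, and `total` stands in for the execution efficiency data; both are only one possible reading of the terms used in the text.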
Preferably, the node characteristic data comprises a computing capacity grade and a network bandwidth grade, and the historical load data comprises CPU occupancy rate and memory occupancy rate in a historical time period; allocating resource nodes to the plurality of subtasks based on the node characteristic data and the historical load data of each computing node, in combination with the task feature data of the plurality of subtasks, to obtain a resource node set corresponding to each subtask comprises the following steps: carrying out standardized representation processing on the node characteristic data and the historical load data of each computing node to generate node standard characteristic information of each computing node; searching computing nodes matched with the plurality of subtasks in the cloud computing platform based on task feature data of the plurality of subtasks; acquiring matching degree scores between task feature data of any subtask and the node standard feature information of the searched computing node; determining the computing node with the highest matching degree score as the main resource node of the subtask; and if the matching degree score is larger than or equal to a set allocation threshold value, adding the main resource node into the resource node set corresponding to the subtask.
Preferably, allocating resource nodes to the plurality of subtasks based on the node feature data and the historical load data of each computing node, in combination with the task feature data of the plurality of subtasks, to obtain a resource node set corresponding to each subtask further comprises: if the matching degree score is smaller than the allocation threshold, selecting a standby resource node from a standby node pool of the cloud computing platform; Calcul