
CN-121996398-A - Task processing method and related equipment

Abstract

A task processing method is applied to an operation and maintenance platform that comprises an event queue and a task executor. The event queue receives at least one operation and maintenance task of an application, and the task executor adds a task event corresponding to the at least one operation and maintenance task to the event queue. The task executor takes a first task event out of the event queue, calls a first execution unit in an execution unit resource pool, executes a first subtask, issues the operation of the first subtask on a resource to the resource, and releases the first execution unit. The event queue then receives a first resource event sent by the resource, and the task executor determines the next task or the next subtask to be processed according to the state of the resource described by the first resource event. The method is event-centric: the states of operation and maintenance tasks, subtasks, and resources are all handled as events, so tasks are processed at a finer granularity. In addition, execution units are not blocked while an operation and maintenance task waits, which improves resource utilization.
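The flow in the abstract can be sketched as a small event loop. This is a minimal illustration and not the patented implementation: `Event`, `FakeResource`, and `run` are hypothetical names, a `ThreadPoolExecutor` stands in for the execution unit resource pool, the fake resource always reports success, and every task is assumed to have at least one subtask.

```python
import queue
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Event:
    kind: str        # "task", "subtask", or "resource"
    task_id: str
    step: int = 0    # index of the subtask within its task
    state: str = ""  # resource state; only set on resource events


class FakeResource:
    """Hypothetical resource that accepts an operation and reports its state as an event."""

    def __init__(self, events: queue.Queue):
        self.events = events
        self.log = []

    def issue(self, task_id: str, step: int, operation: str) -> None:
        # Returns at once; a real resource would report its state asynchronously.
        self.log.append(operation)
        self.events.put(Event("resource", task_id, step, state="success"))


def run(tasks: dict, max_workers: int = 2) -> list:
    """Process every task's subtasks in order and return the issued operations."""
    events: queue.Queue = queue.Queue()
    resource = FakeResource(events)
    for task_id in tasks:
        events.put(Event("task", task_id))

    remaining = len(tasks)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while remaining:
            event = events.get()
            if event.kind in ("task", "subtask"):
                operation = tasks[event.task_id][event.step]
                # The execution unit only issues the operation and is then released;
                # it does not wait for the resource to finish.
                pool.submit(resource.issue, event.task_id, event.step, operation).result()
            elif event.state == "success":
                next_step = event.step + 1
                if next_step < len(tasks[event.task_id]):
                    events.put(Event("subtask", event.task_id, next_step))
                else:
                    remaining -= 1
    return resource.log
```

Calling `run({"restart": ["stop", "start"]})` issues `stop`, waits for the resource event, and only then schedules `start`; between the two, no worker thread is held by the task.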

Inventors

  • LUO JINRONG
  • LI SHIYU
  • Jing Haixiang

Assignees

  • 华为云计算技术有限公司

Dates

Publication Date
2026-05-08
Application Date
2024-11-07

Claims (20)

  1. A task processing method, applied to an operation and maintenance platform, wherein the operation and maintenance platform is used to execute operation and maintenance tasks for an application, the operation and maintenance platform comprises an event queue and a task executor, the task executor comprises an execution unit resource pool, and the execution unit resource pool comprises at least one execution unit, the method comprising: the event queue receives at least one operation and maintenance task of the application, task events corresponding to the at least one operation and maintenance task are added to the event queue, and the task events corresponding to the at least one operation and maintenance task are represented by task information of the at least one operation and maintenance task; the task executor takes a first task event out of the event queue, calls a first execution unit in the execution unit resource pool, executes a first subtask, issues the operation of the first subtask on a resource to the resource, and releases the first execution unit, wherein the first subtask comprises a subtask obtained by decomposing a first operation and maintenance task corresponding to the first task event, and the resource comprises the application or a module and a component of the application; the event queue receives a first resource event sent by the resource, wherein the first resource event describes the state of the resource; and the task executor determines the next task or the next subtask to be processed according to the state of the resource, and executes the next task or the next subtask.
  2. The method according to claim 1, wherein the task executor determining the next task or the next subtask to be processed according to the state of the resource, and executing the next task or the next subtask, comprises: when the state of the resource is success, the task executor checks whether the first operation and maintenance task corresponding to the first task event comprises a next subtask to be processed; if yes, the task executor generates a subtask event and adds the subtask event to the event queue, wherein the subtask event indicates that a second subtask of the first operation and maintenance task is to be executed; and the task executor takes the subtask event out of the event queue, calls a second execution unit in the execution unit resource pool, executes the second subtask, issues the operation of the second subtask on the resource to the resource, and releases the second execution unit.
  3. The method according to claim 2, further comprising: the task executor checks, according to the task information of the first operation and maintenance task, whether a conflicting task has arrived; wherein the task executor generating a subtask event and adding the subtask event to the event queue comprises: when the task executor does not detect the arrival of a conflicting task, the task executor generates the subtask event and adds the subtask event to the event queue.
  4. The method according to claim 3, further comprising: when the task executor detects that a conflicting task has arrived, the task executor determines, according to the task information of the second subtask, a task operation for the first operation and maintenance task to which the second subtask belongs, wherein the task operation comprises stopping or rollback.
  5. The method according to claim 1, wherein the task executor determining the next task or the next subtask to be processed according to the state of the resource comprises: when the state of the resource is failure, the task executor obtains the retry count of the first operation and maintenance task and, when the retry count is less than the maximum retry count, generates a retry subtask event and adds the retry subtask event to the event queue, wherein the retry subtask event is used to retry the first subtask.
  6. The method according to any one of claims 1 to 5, further comprising: the task executor updates the state of the first operation and maintenance task according to the execution status of the first operation and maintenance task or of a subtask of the first operation and maintenance task, wherein the state of the first operation and maintenance task comprises created, discarded, running, stopped, success, or failure.
  7. The method according to any one of claims 1 to 6, wherein the task executor taking the first task event out of the event queue, calling the first execution unit in the execution unit resource pool, and executing the first subtask comprises: the task executor takes the first task event out of the event queue and checks whether the event queue includes a second task event, wherein the state of a second operation and maintenance task corresponding to the second task event is running, the task type of the second operation and maintenance task is the same as the task type of the first operation and maintenance task, and the task object of the second operation and maintenance task is the same as the task object of the first operation and maintenance task; and if not, the task executor calls the first execution unit in the execution unit resource pool to execute the first subtask.
  8. The method according to claim 7, further comprising: if yes, the task executor discards the first operation and maintenance task.
  9. The method according to any one of claims 1 to 8, wherein the operation and maintenance platform further comprises a task storage module for storing task information of the at least one operation and maintenance task; and the event queue receiving at least one operation and maintenance task of the application comprises: the event queue receives the at least one operation and maintenance task sent by the task storage module.
  10. The method according to claim 9, wherein the operation and maintenance tasks include user tasks or system tasks, the user tasks being triggered by a console or by an application programming interface.
  11. An operation and maintenance platform, configured to execute operation and maintenance tasks for an application, comprising an event queue and a task executor, wherein the task executor comprises an execution unit resource pool, and the execution unit resource pool comprises at least one execution unit; the event queue is configured to receive at least one operation and maintenance task of the application, wherein task events corresponding to the at least one operation and maintenance task are added to the event queue, and the task events corresponding to the at least one operation and maintenance task are represented by task information of the at least one operation and maintenance task; the task executor is configured to take a first task event out of the event queue, call a first execution unit in the execution unit resource pool, execute a first subtask, issue the operation of the first subtask on a resource to the resource, and release the first execution unit, wherein the first subtask comprises a subtask obtained by decomposing a first operation and maintenance task corresponding to the first task event, and the resource comprises the application or a module and a component of the application; the event queue is further configured to receive a first resource event sent by the resource, wherein the first resource event describes the state of the resource; and the task executor is further configured to determine the next task or the next subtask to be processed according to the state of the resource, and execute the next task or the next subtask.
  12. The operation and maintenance platform according to claim 11, wherein the task executor is specifically configured to: when the state of the resource is success, check whether the first operation and maintenance task corresponding to the first task event comprises a next subtask to be processed; if yes, generate a subtask event and add the subtask event to the event queue, wherein the subtask event indicates that a second subtask of the first operation and maintenance task is to be executed; and take the subtask event out of the event queue, call a second execution unit in the execution unit resource pool, execute the second subtask, issue the operation of the second subtask on the resource to the resource, and release the second execution unit.
  13. The operation and maintenance platform according to claim 12, wherein the task executor is further configured to: check, according to the task information of the first operation and maintenance task, whether a conflicting task has arrived; and the task executor is specifically configured to: if the arrival of a conflicting task is not detected, generate the subtask event and add the subtask event to the event queue.
  14. The operation and maintenance platform according to claim 13, wherein the task executor is further configured to: when the arrival of a conflicting task is detected, determine, according to the task information of the second subtask, a task operation for the first operation and maintenance task to which the second subtask belongs, wherein the task operation comprises stopping or rollback.
  15. The operation and maintenance platform according to claim 11, wherein the task executor is specifically configured to: when the state of the resource is failure, obtain the retry count of the first operation and maintenance task and, when the retry count is less than the maximum retry count, generate a retry subtask event and add the retry subtask event to the event queue, wherein the retry subtask event is used to retry the first subtask.
  16. The operation and maintenance platform according to any one of claims 11 to 15, wherein the task executor is further configured to: update the state of the first operation and maintenance task according to the execution status of the first operation and maintenance task or of a subtask of the first operation and maintenance task, wherein the state of the first operation and maintenance task comprises created, discarded, running, stopped, success, or failure.
  17. The operation and maintenance platform according to any one of claims 11 to 16, wherein the task executor is specifically configured to: take the first task event out of the event queue and check whether the event queue includes a second task event, wherein the state of a second operation and maintenance task corresponding to the second task event is running, the task type of the second operation and maintenance task is the same as the task type of the first operation and maintenance task, and the task object of the second operation and maintenance task is the same as the task object of the first operation and maintenance task; and if not, call the first execution unit in the execution unit resource pool to execute the first subtask.
  18. The operation and maintenance platform according to claim 17, wherein the task executor is further configured to: if yes, discard the first operation and maintenance task.
  19. The operation and maintenance platform according to any one of claims 11 to 18, further comprising a task storage module for storing task information of the at least one operation and maintenance task; wherein the event queue is specifically configured to: receive the at least one operation and maintenance task sent by the task storage module.
  20. The operation and maintenance platform according to claim 19, wherein the operation and maintenance tasks include user tasks or system tasks, the user tasks being triggered by a console or by an application programming interface.
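
The decision rules in claims 2 to 8 can be read as two small functions: an admission check for duplicate tasks, and a handler that maps a resource event to the next event. The sketch below is one possible reading, not the claimed implementation. All names are hypothetical, the duplicate check scans a plain list rather than the event queue, marking a task as failed once retries are exhausted is an assumption the claims do not state, and a conflicting task is handled by stopping although claim 4 also allows rollback.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class TaskState(Enum):
    CREATED = "created"
    DISCARDED = "discarded"
    RUNNING = "running"
    STOPPED = "stopped"
    SUCCESS = "success"
    FAILURE = "failure"


@dataclass
class Task:
    task_id: str
    task_type: str
    task_object: str
    subtasks: list
    state: TaskState = TaskState.CREATED
    retries: int = 0


def admit(task: Task, known: list) -> bool:
    """Claims 7 and 8: discard a task if a running task has the same type and object."""
    for other in known:
        if (other.state is TaskState.RUNNING
                and other.task_type == task.task_type
                and other.task_object == task.task_object):
            task.state = TaskState.DISCARDED
            return False
    task.state = TaskState.RUNNING
    return True


def on_resource_event(task: Task, step: int, resource_state: str,
                      max_retries: int = 3, conflict: bool = False) -> Optional[tuple]:
    """Return the next event as (kind, step), or None when the task ends."""
    if resource_state == "failure":
        if task.retries < max_retries:       # claim 5: retry the same subtask
            task.retries += 1
            return ("retry_subtask", step)
        task.state = TaskState.FAILURE       # assumption: not specified by the claims
        return None
    if step + 1 >= len(task.subtasks):       # claim 2: no further subtask
        task.state = TaskState.SUCCESS
        return None
    if conflict:                             # claims 3 and 4: stop chosen over rollback
        task.state = TaskState.STOPPED
        return None
    return ("subtask", step + 1)             # claim 2: schedule the next subtask
```

Keeping the handler free of I/O means each rule can be checked in isolation before it is wired to a real event queue.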

Description

Task processing method and related equipment

Technical Field

The present application relates to the field of operation and maintenance technologies, and in particular to a task processing method, an operation and maintenance platform, a computing device cluster, a computer-readable storage medium, and a computer program product.

Background

To ensure the long-term stable operation of an application service, operation and maintenance tasks can be executed through an operation and maintenance platform to operate and maintain the application. Operation and maintenance (O&M) essentially means operating and maintaining an application at each stage of its lifecycle so that it reaches an acceptable state in terms of cost, stability, and efficiency. Operation and maintenance tasks may include, but are not limited to, starting, stopping, restarting, or upgrading an application.

Some operation and maintenance tasks are complex; long-running tasks, for example, have relatively high complexity. The operation and maintenance platform can decompose a complex operation and maintenance task into a plurality of subtasks and then execute those subtasks. Taking an integrated service flow as an example, the operation and maintenance platform may perform a flow retry task when the integrated service flow encounters an error during operation. The flow retry task can be decomposed into the following subtasks:

  1. Stop the integrated service flow.
  2. Wait for the integrated service flow to stop successfully.
  3. Start the integrated service flow.
  4. Wait for the integrated service flow to start successfully.
  5. If the application has not started successfully, repeat the above steps 3 times, with an interval of 30 seconds between two adjacent attempts.
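
The five subtasks above can be written in a thread-per-task style, which makes it visible where the calling thread sits idle. This is a hedged sketch only: `RetryPolicy` and `flow_retry` are hypothetical names, the four callbacks stand in for real platform operations, and "3 times" is read here as three attempts in total, which is one possible interpretation of the text.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class RetryPolicy:
    max_attempts: int = 3            # assumption: three attempts in total
    interval_seconds: float = 30.0   # interval between two adjacent attempts


def flow_retry(
    stop: Callable[[], None],
    wait_stopped: Callable[[], bool],
    start: Callable[[], None],
    wait_started: Callable[[], bool],
    policy: RetryPolicy = RetryPolicy(),
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Run the flow retry task on the calling thread, blocking at every wait."""
    for attempt in range(policy.max_attempts):
        if attempt > 0:
            sleep(policy.interval_seconds)  # the thread idles between attempts
        stop()                              # subtask 1
        if not wait_stopped():              # subtask 2: blocks until stopped or given up
            continue
        start()                             # subtask 3
        if wait_started():                  # subtask 4: blocks until started or given up
            return True
    return False                            # subtask 5: attempts exhausted
```

In this style one thread is held for the whole task, including both waits and each 30-second interval, even though it does no work during those periods.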
However, each operation and maintenance task generally requires an independent thread to run. Even though complex operation and maintenance tasks are split into subtasks, resource utilization remains low while they execute.

Disclosure of Invention

The application provides a task processing method that is event-centric: the states of operation and maintenance tasks, subtasks, and resources are all handled as events, so that tasks are decomposed and processed at a finer granularity. When a task or subtask is executed, an execution unit such as a thread can be released as soon as the operation on the resource has been issued to the resource, without waiting for the operation to complete. Execution units are therefore not blocked while an operation and maintenance task waits, which improves resource utilization. The application also provides an operation and maintenance platform, a computing device cluster, a computer-readable storage medium, and a computer program product corresponding to the method.

In a first aspect, the present application provides a task processing method. The method is applied to an operation and maintenance platform, which is used to execute operation and maintenance tasks for an application. The operation and maintenance platform can be software, and the software can be independent operation and maintenance software. The operation and maintenance software may differ depending on the application being operated and maintained. For example, the operation and maintenance software can be an integration platform as a service, a microservice hosting platform, a function hosting platform, or an operation and maintenance platform for virtual machine and container management.
It should be noted that the present application may also be applied to a management and operation platform deployed in a private cloud or hybrid cloud of a data center. The operation and maintenance platform can manage and execute operation and maintenance tasks within the platform, and in particular optimizes the processing of complex operation and maintenance tasks, thereby improving resource utilization. In some examples, the operation and maintenance platform can also be hardware, such as a computing device cluster with operation and maintenance capabilities. When the computing device cluster runs, it executes the task processing method.

Specifically, the operation and maintenance platform comprises an event queue and a task executor, wherein the task executor comprises an execution unit resource pool, and the execution unit resource pool comprises at least one execution unit. The event queue receives at least one operation and maintenance task of the application, task events corresponding to the at least one operation and maintenance task are added to the event queue, and task events corresponding to the at least one operation and maintenance task are represented through task information of the at least one operation and maint