CN-121979687-A - Task processing method, task processing system, electronic device and storage medium

Abstract

The application discloses a task processing method, a task processing system, an electronic device and a storage medium, and relates to the fields of artificial intelligence and task processing. The method comprises: obtaining a task processing request; performing task scheduling on the task processing request based on resource state information of a plurality of processor clusters and attribute information of the task processing request to obtain a scheduling result, wherein the scheduling result is used for determining a target processor cluster for executing the task processing request; and distributing the task processing request to the target processor cluster according to the scheduling result to obtain a task processing result returned by the target processor cluster. The application addresses the technical problem in the related art that, when the quality of executing task processing requests is guaranteed, the utilization rate of computing resources in the processors is low, resulting in considerable resource waste.

Inventors

  • XU ZHAO
  • DING HUPING

Assignees

  • Alibaba Cloud Computing Co., Ltd. (阿里云计算有限公司)

Dates

Publication Date
20260505
Application Date
20260403

Claims (19)

  1. A task processing method, comprising: acquiring a task processing request; performing task scheduling on the task processing request based on resource state information of a plurality of processor clusters and attribute information of the task processing request to obtain a scheduling result, wherein the scheduling result is used for determining a target processor cluster for executing the task processing request, and the matching degree between the target processor cluster and the resource state information and the attribute information is greater than that between the other processor clusters in the plurality of processor clusters and the resource state information and the attribute information; and distributing the task processing request to the target processor cluster according to the scheduling result, so as to acquire a task processing result returned by the target processor cluster.
  2. The task processing method according to claim 1, wherein the attribute information includes a request type and a request priority, and the request type of the task processing request includes: an online synchronous task processing request; an online asynchronous task processing request; and an offline batch task processing request; wherein the request priority of the online synchronous task processing request is higher than the request priority of the online asynchronous task processing request, and the request priority of the online asynchronous task processing request is higher than the request priority of the offline batch task processing request.
  3. The task processing method according to claim 2, wherein task processing requests of different request types use the same model service.
  4. The task processing method according to claim 2, wherein the plurality of processor clusters includes: a guaranteed resource cluster, configured to preferentially receive and process the online synchronous task processing request, and to receive and process the online asynchronous task processing request and/or the offline batch task processing request when the guaranteed resource cluster has idle resource capacity; an asynchronous resource cluster, configured to process the online asynchronous task processing request after capacity expansion or capacity reduction; and an offline resource cluster, configured to process the offline batch task processing request after capacity expansion or capacity reduction.
  5. The task processing method according to claim 4, further comprising: estimating the peak request traffic of the online synchronous task processing request to obtain an estimation result; and, in response to the guaranteed resource cluster being a cluster with fixed resource capacity, determining reserved resource capacity in the guaranteed resource cluster based on the estimation result, wherein the reserved resource capacity is used for processing the online synchronous task processing request.
  6. The task processing method according to claim 5, wherein determining the reserved resource capacity in the guaranteed resource cluster based on the estimation result includes: determining the peak request quantity of the online synchronous task processing request based on the estimation result; and determining the reserved resource capacity by using the peak request quantity and the maximum throughput capacity of a unit processor in the guaranteed resource cluster.
  7. The task processing method according to claim 4, further comprising: acquiring current request traffic of the online asynchronous task processing request and historical backlog information of the online asynchronous task processing request; and expanding or reducing the capacity of the asynchronous resource cluster based on the current request traffic, the historical backlog information, the maximum throughput capacity of a unit processor in the asynchronous resource cluster, and the number of processors contained in the asynchronous resource cluster.
  8. The task processing method according to claim 7, wherein expanding the capacity of the asynchronous resource cluster based on the current request traffic, the historical backlog information, the maximum throughput capacity of the unit processor, and the number of processors comprises: calculating the product of the maximum throughput capacity of the unit processor and the number of processors to obtain a calculation result; and, in response to the current request traffic being greater than the calculation result and to determining, based on the historical backlog information, that backlogged online asynchronous task processing requests exist, expanding the capacity of the asynchronous resource cluster according to a preset capacity expansion calculation mode based on the current request traffic, the maximum throughput capacity of the unit processor, and the number of processors.
  9. The task processing method according to claim 7, wherein reducing the capacity of the asynchronous resource cluster based on the current request traffic, the historical backlog information, the maximum throughput capacity of the unit processor, and the number of processors comprises: calculating the product of the maximum throughput capacity of the unit processor and the number of processors to obtain a calculation result; and, in response to the current request traffic being smaller than the calculation result and to determining, based on the historical backlog information, that no backlogged online asynchronous task processing request exists, reducing the capacity of the asynchronous resource cluster according to a preset capacity reduction calculation mode based on the current request traffic, the maximum throughput capacity of the unit processor, and the number of processors.
  10. The task processing method according to claim 4, further comprising: acquiring the total request amount of the offline batch task processing requests; and determining the number of processors to be started in the offline resource cluster based on the total request amount, the maximum throughput capacity of a unit processor in the offline resource cluster, and the specified processing duration of the offline batch task processing requests.
  11. The task processing method according to claim 4, further comprising: in response to the sum of a first request rate, a second request rate and a third request rate being less than or equal to a fourth request rate, preferentially scheduling the online synchronous task processing request to the guaranteed resource cluster, and, in response to the idle resource capacity existing in the guaranteed resource cluster, scheduling the online asynchronous task processing request and/or the offline batch task processing request to the guaranteed resource cluster; wherein the first request rate is the request rate of the online synchronous task processing request, the second request rate is the request rate of the online asynchronous task processing request, the third request rate is the request rate of the offline batch task processing request, and the fourth request rate is the maximum request rate allowed by the guaranteed resource cluster.
  12. The task processing method according to claim 4, wherein a unit processor in the guaranteed resource cluster is provided with an online local request queue, an asynchronous local request queue, an offline local request queue and a request processing thread pool, the request processing thread pool comprises a plurality of threads, and the task processing method further comprises: controlling the plurality of threads to preferentially pull the online synchronous task processing request from the online local request queue for processing; and, in response to the remaining request quantity in the online local request queue meeting a preset condition, controlling at least part of the plurality of threads to pull the online asynchronous task processing request from the asynchronous local request queue for processing, or to pull the offline batch task processing request from the offline local request queue for processing.
  13. A task processing method, comprising: acquiring a translation service processing request; scheduling the translation service processing request based on resource state information of a plurality of graphics processing unit clusters and attribute information of the translation service processing request to obtain a scheduling result, wherein the scheduling result is used for determining a target graphics processing unit cluster for executing the translation service processing request; and distributing the graphics processing unit processing request to the target graphics processing unit cluster according to the scheduling result, so as to acquire a translation service processing result returned by the target graphics processing unit cluster.
  14. A task processing method, comprising: acquiring a task processing request in response to an input instruction acting on an operation interface; and displaying a task processing result on the operation interface in response to a processing instruction acting on the operation interface, wherein the task processing result is obtained by the task processing method according to any one of claims 1 to 12.
  15. A task processing system, comprising: a resource manager, configured to maintain resource state information of a plurality of processor clusters; a task scheduler, configured to acquire a task processing request, perform task scheduling on the task processing request based on the resource state information and attribute information of the task processing request to obtain a scheduling result, and distribute the task processing request to a target processor cluster according to the scheduling result; and the target processor cluster, configured to return a task processing result.
  16. The task processing system according to claim 15, wherein the request types of the task processing request include an online synchronous task processing request, an online asynchronous task processing request, and an offline batch task processing request, and wherein the plurality of processor clusters include: a guaranteed resource cluster, configured to preferentially receive and process the online synchronous task processing request, and to receive and process the online asynchronous task processing request and/or the offline batch task processing request when the guaranteed resource cluster has idle resource capacity; an asynchronous resource cluster, configured to process the online asynchronous task processing request after capacity expansion or capacity reduction; and an offline resource cluster, configured to process the offline batch task processing request after capacity expansion or capacity reduction; wherein the task processing system further comprises: a scaling controller, configured to control, by way of capacity expansion or capacity reduction, the asynchronous resource cluster to process the online asynchronous task processing request or the offline resource cluster to process the offline batch task processing request.
  17. An electronic device, comprising: a memory storing an executable program; and a processor for executing the program, wherein the program, when executed, performs the task processing method according to any one of claims 1 to 14.
  18. A computer-readable storage medium, wherein the computer-readable storage medium comprises a stored executable program, and the executable program, when run, controls a device in which the computer-readable storage medium is located to perform the task processing method according to any one of claims 1 to 14.
  19. A computer program product, comprising a computer program which, when executed by a processor, implements the task processing method according to any one of claims 1 to 14.
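
The capacity calculations recited in claims 6 and 8 to 10 can be illustrated with a minimal Python sketch. The claims name the inputs (peak request quantity, current request traffic, backlog, maximum throughput capacity of a unit processor, number of processors, total request amount, specified processing duration), but this excerpt does not state the formulas. The ceiling divisions below, the function names, and the units (requests per second, seconds) are therefore assumptions made for illustration, not the calculation modes of the application.

```python
import math


def reserved_processors(peak_rate: float, unit_throughput: float) -> int:
    """Claim 6 (assumed formula): processors reserved for the online synchronous peak."""
    return math.ceil(peak_rate / unit_throughput)


def async_target_processors(current_rate: float, has_backlog: bool,
                            unit_throughput: float, processor_count: int) -> int:
    """Claims 8 and 9 (assumed formula): target size of the asynchronous cluster."""
    # The "calculation result" of claims 8 and 9: throughput times processor count.
    capacity = unit_throughput * processor_count
    # Hypothetical scaling rule: just enough processors for the current traffic.
    needed = max(1, math.ceil(current_rate / unit_throughput))
    if current_rate > capacity and has_backlog:
        return needed  # capacity expansion
    if current_rate < capacity and not has_backlog:
        return needed  # capacity reduction
    return processor_count  # neither condition holds: keep the current size


def offline_processors(total_requests: float, unit_throughput: float,
                       deadline_seconds: float) -> int:
    """Claim 10 (assumed formula): processors needed to finish a batch in time."""
    return math.ceil(total_requests / (unit_throughput * deadline_seconds))
```

Under these assumptions, a synchronous peak of 1200 requests per second with 50 requests per second per processor reserves 24 processors, and a batch of 1,000,000 requests due within 3600 seconds needs 6 processors at the same unit throughput.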

Description

Task processing method, task processing system, electronic device and storage medium

Technical Field

The application relates to the fields of artificial intelligence and task processing, and in particular to a task processing method, a task processing system, an electronic device and a storage medium.

Background

Currently, the deployment of task processing model services commonly relies on the Kubernetes HPA (Horizontal Pod Autoscaler) mechanism or on a static resource allocation scheme. To guarantee service response quality (for example, low latency for online synchronous requests), such deployments often reserve resources for peak load or dedicate resources exclusively to a service. As a result, the processor cluster sits largely idle during traffic valleys, resource utilization is generally low, and considerable resources are wasted.

No effective solution to the above problem has been proposed so far.

Disclosure of Invention

The embodiments of the application provide a task processing method, a task processing system, an electronic device and a storage medium, which at least solve the technical problem in the related art that, when the quality of executing task processing requests is guaranteed, the utilization rate of computing resources in the processors is low, resulting in considerable resource waste.
According to one aspect of the embodiments of the application, a task processing method is provided, which comprises: obtaining a task processing request; performing task scheduling on the task processing request based on resource state information of a plurality of processor clusters and attribute information of the task processing request to obtain a scheduling result, wherein the scheduling result is used for determining a target processor cluster for executing the task processing request, and the matching degree between the target processor cluster and the resource state information and the attribute information is greater than that between the other processor clusters in the plurality of processor clusters and the resource state information and the attribute information; and distributing the task processing request to the target processor cluster according to the scheduling result, so as to obtain the task processing result returned by the target processor cluster.

According to another aspect of the embodiments of the application, a task processing method is provided, which comprises: obtaining a translation service processing request; scheduling the translation service processing request based on resource state information of a plurality of graphics processing unit clusters and attribute information of the translation service processing request to obtain a scheduling result, wherein the scheduling result is used for determining a target graphics processing unit cluster for executing the translation service processing request; and distributing the graphics processing unit processing request to the target graphics processing unit cluster according to the scheduling result, so as to obtain the translation service processing result returned by the target graphics processing unit cluster.
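
The selection of a target cluster by matching degree can be sketched as follows. This excerpt does not define how the matching degree is computed, so the score used here (the free capacity of a cluster that accepts the request type, and zero otherwise), the `Cluster` fields, and the request type labels are all hypothetical choices made for illustration, not the scoring of the application.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Cluster:
    name: str
    accepts: frozenset      # request types this cluster may process (assumed field)
    free_capacity: float    # spare requests per second (assumed resource state)


def matching_degree(cluster: Cluster, request_type: str) -> float:
    """Hypothetical score: free capacity if the cluster can take the request, else 0."""
    if request_type not in cluster.accepts or cluster.free_capacity <= 0:
        return 0.0
    return cluster.free_capacity


def schedule(request_type: str, clusters: List[Cluster]) -> Optional[str]:
    """Return the name of the cluster with the highest matching degree, if any."""
    best = max(clusters, key=lambda c: matching_degree(c, request_type))
    if matching_degree(best, request_type) <= 0:
        return None  # no cluster can currently take the request
    return best.name


ALL_TYPES = frozenset({"online_sync", "online_async", "offline_batch"})
clusters = [
    Cluster("guaranteed", ALL_TYPES, 40.0),
    Cluster("asynchronous", frozenset({"online_async"}), 10.0),
    Cluster("offline", frozenset({"offline_batch"}), 0.0),
]
print(schedule("online_async", clusters))
```

With this scoring, an asynchronous request is sent to the guaranteed cluster while that cluster has more spare capacity than the asynchronous cluster, which loosely mirrors the idle-capacity sharing described in the claims. A real scheduler would also weigh request priority.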
According to another aspect of the embodiments of the application, a task processing method is provided, which comprises: obtaining a task processing request in response to an input instruction acting on an operation interface; and displaying a task processing result on the operation interface in response to a processing instruction acting on the operation interface, wherein the task processing result is obtained according to any one of the task processing methods described above.

According to another aspect of the embodiments of the application, a task processing device is provided, which comprises a first acquisition module, a first scheduling module and a first distribution module. The first acquisition module is used for acquiring a task processing request. The first scheduling module is used for performing task scheduling on the task processing request based on resource state information of a plurality of processor clusters and attribute information of the task processing request to obtain a scheduling result, wherein the scheduling result is used for determining a target processor cluster for executing the task processing request, and the matching degree between the target processor cluster and the resource state information and the attribute information is greater than that between the other processor clusters in the plurality of processor clusters and the resource state information and the attribute information. The first distribution module is used for distributing the task processing request to the target processor cluster according to the scheduling result, so as to acquire the task processing result returned by the target processor cluster.

According to another aspect of the embodiments of the application, a task processing device is provided, which