CN-114787777-B - Task transfer method between heterogeneous processors
Abstract
A method, system, and apparatus determine that one or more tasks should be relocated from a first processor to a second processor by comparing a performance metric to an associated threshold or by using other indications. To relocate the one or more tasks from the first processor to the second processor, the first processor is stopped and state information from the first processor is copied to the second processor. The second processor uses the state information and then services incoming tasks in place of the first processor.
Inventors
- Alexander J. Branover
- Benjamin Tsien
- Elliot H. Mednick
Assignees
- Advanced Micro Devices, Inc. (超威半导体公司)
Dates
- Publication Date
- 2026-05-12
- Application Date
- 2020-11-12
- Priority Date
- 2019-12-10
Claims (17)
- 1. A method for relocating a computer-implemented task, the method comprising: executing the task on a microprocessor when the task can be serviced by the microprocessor; and, responsive to the microprocessor being unable to service the task: powering down the microprocessor; relocating the task from the microprocessor to a relatively low power processor and executing the task on the relatively low power processor; monitoring one or more metrics associated with execution of the task by the relatively low power processor; comparing at least one of the one or more metrics to a threshold; and, based on the comparison, selectively relocating the task from the relatively low power processor to a relatively high power processor and executing the task on the relatively high power processor.
- 2. The method of claim 1, wherein the at least one metric comprises a core utilization metric of the relatively low power processor.
- 3. The method of claim 2, wherein: the core utilization metric includes an indication of a duration of time that the relatively low power processor is operating at maximum speed; the threshold is an indication of a duration threshold; and the task is relocated to the relatively high power processor on the condition that the indication of the duration of time that the relatively low power processor is operating at maximum speed is greater than the duration threshold.
- 4. The method of claim 1, wherein the at least one metric comprises a memory utilization metric associated with the relatively low power processor.
- 5. The method of claim 4, wherein: the memory utilization metric includes an indication of a duration of time that the memory is operating in a maximum memory performance state; the threshold is an indication of a duration threshold; and the task is relocated to the relatively high power processor on the condition that the indication of the duration of time that the relatively low power processor is operating at maximum speed is greater than the duration threshold.
- 6. The method of claim 1, wherein the at least one of the one or more metrics comprises a Direct Memory Access (DMA) data rate.
- 7. A method for relocating a computer-implemented task, the method comprising: monitoring one or more metrics associated with execution of the task by a relatively high power processor; comparing at least one of the one or more metrics to a threshold; copying, based on the comparison, an architectural state of the relatively high power processor to a memory of a relatively low power processor, wherein the memory is dedicated to storing the architectural state and the architectural state includes one or more register settings and one or more flag settings of the relatively high power processor; and executing the task on the relatively low power processor using the architectural state copied to and stored in the memory of the relatively low power processor.
- 8. The method of claim 7, wherein: the at least one metric includes an indication of a duration of single-core use of the relatively high power processor; the threshold is an indication of a duration threshold; and the task is relocated to the relatively low power processor on the condition that the indication of the duration of single-core use of the relatively high power processor is less than the duration threshold.
- 9. The method of claim 7, wherein the at least one metric comprises a core utilization metric of the relatively high power processor.
- 10. The method of claim 9, wherein: the core utilization metric of the relatively high power processor includes an average utilization over a time interval; the threshold is an indication of a utilization threshold; and the task is relocated to the relatively low power processor on the condition that the average utilization over the time interval is below the utilization threshold.
- 11. The method of claim 9, wherein: the core utilization metric of the relatively high power processor includes an average idle state residency; the threshold is an indication of an idle state threshold; and the task is relocated to the relatively low power processor on the condition that the average idle state residency is greater than the idle state threshold.
- 12. The method of claim 7, wherein: the at least one metric includes a memory utilization metric associated with the relatively low power processor; the threshold is a memory utilization threshold; and the task is relocated to the relatively low power processor if the memory utilization metric is less than the memory utilization threshold.
- 13. A system for relocating a computer-implemented task, comprising: a microprocessor; a first processor; a second processor; and circuitry configured to: execute a task on the microprocessor when the task can be serviced by the microprocessor; and, responsive to the microprocessor being unable to service the task: power down the microprocessor; relocate the task from the microprocessor to the first processor and execute the task on the first processor; monitor one or more metrics associated with the first processor executing the task; compare at least one of the one or more metrics to a threshold; and selectively relocate the task from the first processor to the second processor and execute the task on the second processor based on the comparison, wherein the first processor is a relatively low power processor and the second processor is a relatively high power processor.
- 14. The system of claim 13, wherein the at least one metric comprises any one or a combination of a core utilization metric of the relatively low power processor, a memory utilization metric associated with the relatively low power processor, and/or a Direct Memory Access (DMA) data rate.
- 15. The system of claim 14, wherein: the core utilization metric includes an indication of a duration of time that the first processor is operating at maximum speed; the threshold is an indication of a duration threshold; and the task is relocated to the second processor on the condition that the indication of the duration of time that the first processor is operating at maximum speed is greater than the duration threshold.
- 16. The system of claim 14, wherein: the memory utilization metric includes an indication of a duration of time that the memory is operating in a maximum memory performance state; the threshold is an indication of a duration threshold; and the task is relocated to the second processor on the condition that the indication of the duration of time that the first processor is operating at maximum speed is greater than the duration threshold.
- 17. The system of claim 13, wherein the circuitry is further configured to set the relatively low power processor to a low power state.
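As an illustration of claims 7, 10, and 11, the following Python sketch models a relocation from the relatively high power processor to the relatively low power processor: a threshold check, followed by a copy of the architectural state into memory set aside on the low power side. It is a simplified software model, not the claimed circuitry. All names (ArchState, Processor, should_move_to_low_power, relocate_to_low_power) are hypothetical, and combining the two conditions with a logical OR is an assumption; the claims recite them as separate alternatives.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ArchState:
    """Architectural state as named in claim 7: register and flag settings."""
    registers: Dict[str, int] = field(default_factory=dict)
    flags: Dict[str, bool] = field(default_factory=dict)


@dataclass
class Processor:
    """Hypothetical processor model used only for this sketch."""
    name: str
    state: ArchState = field(default_factory=ArchState)
    # Stands in for the memory dedicated to holding a migrated state.
    state_memory: Optional[ArchState] = None


def should_move_to_low_power(avg_utilization: float,
                             idle_residency: float,
                             utilization_threshold: float,
                             idle_threshold: float) -> bool:
    """Return True when either condition holds (OR is an assumption):
    average utilization below its threshold (claim 10), or
    average idle state residency above its threshold (claim 11)."""
    return (avg_utilization < utilization_threshold
            or idle_residency > idle_threshold)


def relocate_to_low_power(high: Processor, low: Processor) -> None:
    """Copy the high power processor's state into the low power
    processor's dedicated memory, then resume from that copy."""
    low.state_memory = ArchState(dict(high.state.registers),
                                 dict(high.state.flags))
    low.state = low.state_memory


# Usage: a lightly loaded high power processor hands its task over.
high = Processor("high_power", ArchState({"pc": 0x1000, "sp": 0x8000},
                                         {"zero": True}))
low = Processor("low_power")
if should_move_to_low_power(0.10, 0.20, 0.25, 0.60):
    relocate_to_low_power(high, low)
```

The state is copied rather than shared, so later changes on the high power processor do not affect the copy from which the low power processor resumes.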
Description
Task transfer method between heterogeneous processors
Cross Reference to Related Applications
The present application claims the benefit of U.S. non-provisional patent application Ser. No. 16/709,404, filed on December 10, 2019, the contents of which are hereby incorporated by reference.
Background
Conventional computer systems rely on operating-system-level and other higher-level software decisions to move tasks between different processors within the system. These conventional solutions incur significant overhead in the form of performance inefficiency and additional power consumption. By moving tasks between different processors using finer-grained tracking and decision making, performance per unit of power consumed can be optimized.
Drawings
A more detailed understanding can be obtained from the following description, given by way of example in conjunction with the accompanying drawings, in which:
FIG. 1 is a block diagram of an exemplary apparatus in which one or more features of the present disclosure may be implemented;
FIG. 2 is a block diagram of the apparatus of FIG. 1, showing additional details;
FIG. 3 is a block diagram depicting an example of a system for efficiently servicing an input task;
FIG. 4 is a block diagram depicting another example of a system for efficiently servicing an input task;
FIG. 5 is a block diagram depicting another example of a system for efficiently servicing an input task;
FIG. 6 is a flow chart depicting an example method of relocating tasks from a first processor to a second processor;
FIG. 7 is a flow chart depicting another example method of relocating tasks from a first processor to a second processor; and
FIG. 8 is a flow chart depicting another example method of relocating one or more tasks from a first processor to a second processor.
Detailed Description
Performance per watt is optimized at run time, at fine granularity, by moving tasks between different processors in a timely manner, as described in further detail below. In one example, the first processor is a relatively low power, power-efficient processor, and the second processor is a relatively high power, less power-efficient processor. Additionally or alternatively, the relatively low power processor may be regarded as a less powerful processor and the relatively high power processor as a more powerful processor. In another example, the first processor and the second processor are heterogeneous, e.g., a Central Processing Unit (CPU) and a Graphics Processing Unit (GPU). By identifying applicable conditions and relocating tasks from less suitable processors to more suitable processors, performance per unit of power consumed is improved and overall processing performance is enhanced.
In one example, a method for relocating a computer-implemented task from a relatively low power processor to a relatively high power processor includes monitoring one or more metrics associated with execution of the task by the relatively low power processor. The method further includes comparing at least one of the one or more metrics to a threshold. The method also includes, based on the comparison, selectively relocating the task to the relatively high power processor and executing the task on the relatively high power processor.
In another example, the at least one metric includes a core utilization metric of the relatively low power processor. In another example, the core utilization metric includes an indication of a duration of time that the relatively low power processor is operating at maximum speed, and the threshold is an indication of a duration threshold.
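The duration-based check in this example can be sketched in Python as follows. The sketch accumulates the time the relatively low power processor has spent at maximum speed and reports when that time exceeds the duration threshold, which is the condition for relocating the task to the relatively high power processor. The class name MaxSpeedTracker, the millisecond units, and the choice to reset the accumulated time whenever the processor leaves maximum speed are assumptions made for illustration; the text does not say whether the duration must be continuous.

```python
from dataclasses import dataclass


@dataclass
class MaxSpeedTracker:
    """Hypothetical tracker for time spent at maximum speed."""
    duration_threshold_ms: float
    time_at_max_ms: float = 0.0

    def sample(self, at_max_speed: bool, elapsed_ms: float) -> bool:
        """Record one monitoring interval and return True when the task
        should be relocated to the relatively high power processor."""
        if at_max_speed:
            self.time_at_max_ms += elapsed_ms
        else:
            # Assumption: only continuous time at maximum speed counts.
            self.time_at_max_ms = 0.0
        return self.time_at_max_ms > self.duration_threshold_ms


# Usage: with a 30 ms threshold, relocation triggers only after the
# processor has stayed at maximum speed for more than 30 ms in a row.
tracker = MaxSpeedTracker(duration_threshold_ms=30.0)
decisions = [tracker.sample(busy, 20.0) for busy in (True, False, True, True)]
```

A memory-side variant could use the same structure, sampling whether the memory is in its maximum performance state instead of the core speed.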
The task is relocated to the relatively high power processor if the indication of the duration of time that the relatively low power processor is operating at maximum speed is greater than the duration threshold.
In another example, the at least one metric includes a memory utilization metric associated with the relatively low power processor. In another example, the memory utilization metric includes an indication of a duration of time that the memory is operating in a maximum memory performance state, and the threshold is an indication of a duration threshold. The task is relocated to the relatively high power processor if the indication of the duration of time that the relatively low power processor is operating at maximum speed is greater than the duration threshold. In another example, at least one of the one or more metrics includes a Direct Memory Access (DMA) data rate.
In another example, a method for relocating a computer-implemented task from a relatively high power processor to a relatively low power processor includes monitoring one or more metrics associated with execution of the task by the relatively high power processor. The method also includes comparing at least one of the one or more metrics to a threshold and selectively relocating the task to the relatively low power processor and executing the task on the relatively low power pr