EP-4735973-A1 - ENQUEUE COMMAND POWER MANAGEMENT AND SERVICE REQUEST PROCESSING LATENCY IMPROVEMENTS

EP 4735973 A1

Abstract

Methods and apparatus relating to power management and service request processing latency improvements for an enqueue command are described. In one embodiment, a storage device stores a request reference count, which is updated in response to receipt of a job request from a process at a device and in response to completion of a job corresponding to the received job request. The device enters into, remains in, or exits an active/idle power state in response to a value of the request reference count. In another embodiment, a job request with a job descriptor is issued to a target device. Device power management logic causes the target device to transition from an idle power state to an active power state in response to receipt of the job descriptor prior to the job descriptor being stored in a work queue of the target device. Other embodiments are also disclosed and claimed.

Inventors

  • WANG, JUNYUAN
  • LUKOSHKOV, MAKSIM
  • ZENG, XIN
  • LI, WEIGANG
  • MOUNISSAMY, Deepika

Assignees

  • Intel Corporation

Dates

Publication Date
20260506
Application Date
20230630

Claims (20)

  1. An apparatus comprising: a storage device to store a request reference count; and logic circuitry to update the request reference count in response to receipt of a job request from a process at a device and to update the request reference count in response to completion of a job corresponding to the received job request, wherein the device is to enter into, remain in, or exit a power state in response to a value of the request reference count.
  2. The apparatus of claim 1, wherein the power state comprises one of an idle power state and an active power state.
  3. The apparatus of any one of claims 1 to 2, wherein the job request is an enqueue request to store the job request in a shared work queue.
  4. The apparatus of any one of claims 1 to 3, wherein the shared work queue is to store a plurality of job requests from one or more user application processes.
  5. The apparatus of any one of claims 1 to 4, wherein the job request is to be transmitted by a user application process.
  6. The apparatus of any one of claims 1 to 5, wherein the logic circuitry is to increment the request reference count in response to the receipt of the job request and to decrement the request reference count in response to completion of the job corresponding to the received job request.
  7. The apparatus of any one of claims 1 to 6, wherein the completion of the job is to be indicated via invocation of an instruction or storage of a value in a register.
  8. The apparatus of any one of claims 1 to 7, wherein a processor capability flag is to indicate an existence of the request reference count.
  9. An apparatus comprising: a processor to issue a job request with a job descriptor to a target device; and the target device including device power management logic to cause the target device to transition from an idle power state to an active power state in response to receipt of the job descriptor, wherein the device power management logic is to cause the target device to transition from the idle power state prior to the job descriptor being stored in a work queue of the target device.
  10. The apparatus of claim 9, wherein the processor is to issue the job request via an enqueue command.
  11. The apparatus of any one of claims 9 to 10, wherein the processor is to issue the job request via an enqueue command.
  12. The apparatus of any one of claims 9 to 11, wherein the device power management logic is to communicate with the processor via an Advanced Configuration and Power Interface (ACPI).
  13. The apparatus of any one of claims 9 to 12, wherein the job request is an enqueue request to store the job request in a shared work queue.
  14. The apparatus of any one of claims 9 to 13, wherein the work queue is a shared work queue to store a plurality of job requests from one or more user application processes.
  15. The apparatus of any one of claims 9 to 14, wherein the job request is caused to be transmitted by a user application process.
  16. The apparatus of any one of claims 9 to 15, wherein a processor capability flag is to indicate whether the target device is capable of transitioning from the idle power state prior to the job descriptor being stored in the work queue of the target device.
  17. One or more non-transitory computer-readable media comprising one or more instructions that when executed on a processor configure the processor to perform one or more operations to cause: a storage device to store a request reference count; and logic circuitry to update the request reference count in response to receipt of a job request from a process at a device and to update the request reference count in response to completion of a job corresponding to the received job request, wherein the device is to enter into, remain in, or exit a power state in response to a value of the request reference count.
  18. The one or more non-transitory computer-readable media of claim 17, wherein the power state comprises one of an idle power state and an active power state.
  19. The one or more non-transitory computer-readable media of any one of claims 17 to 18, wherein the job request is an enqueue request to store the job request in a shared work queue.
  20. The one or more non-transitory computer-readable media of any one of claims 17 to 19, further comprising one or more instructions that when executed on the processor configure the processor to perform one or more operations to cause the shared work queue to store a plurality of job requests from one or more user application processes.
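The reference-count gating recited in claims 1, 2, and 6 can be sketched as a small simulation. This is an illustrative model only, not the claimed hardware; the class and method names are hypothetical:

```python
IDLE, ACTIVE = "idle", "active"

class Device:
    """Toy model of the claimed apparatus: a request reference count
    gates the device's power-state transitions (claims 1, 2, and 6)."""

    def __init__(self):
        self.ref_count = 0        # the stored "request reference count"
        self.power_state = IDLE

    def submit_job(self):
        # Claim 6: increment on receipt of a job request.
        self.ref_count += 1
        if self.power_state == IDLE:
            self.power_state = ACTIVE  # exit idle while work is pending

    def complete_job(self):
        # Claim 6: decrement on completion of the corresponding job.
        self.ref_count -= 1
        if self.ref_count == 0:
            self.power_state = IDLE    # safe to idle: no outstanding jobs
```

Because the count only reaches zero after every received job request has a matching completion, the device cannot enter the idle state prematurely while a job is still being processed.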

Description

ENQUEUE COMMAND POWER MANAGEMENT AND SERVICE REQUEST PROCESSING LATENCY IMPROVEMENTS

FIELD

The present disclosure generally relates to the field of processors. More particularly, some embodiments relate to power management and service request processing latency improvements for an enqueue command.

BACKGROUND

Some processors may support an enqueue command to facilitate cloud native architectures. Such enqueue commands may be used to submit an Input/Output ("I/O" or "IO") job to target Peripheral Component Interconnect Express (PCIe) endpoints. Since the target PCIe endpoint may utilize various components to process the IO jobs, power management may become complicated, e.g., resulting in the PCIe endpoint entering an idle state prematurely before a job is fully processed. Additionally, if a PCIe endpoint is in an idle state, additional enqueue command processing may cause additional latency before an enqueued job is processed. Hence, efficient implementation of enqueue commands may improve power management and/or service request processing latency.

BRIEF DESCRIPTION OF THE DRAWINGS

The detailed description is provided with reference to the accompanying figures. In the figures, the left-most digit(s) of a reference number identifies the figure in which the reference number first appears. The use of the same reference numbers in different figures indicates similar or identical items.

FIG. 1 illustrates a block diagram of a power management solution in a user queue mode, according to an embodiment. FIGs. 2A, 2B, 3A, and 3B illustrate flow diagrams of methods, according to some embodiments. FIG. 4 illustrates an example computing system. FIG. 5 illustrates a block diagram of an example processor and/or System on a Chip (SoC) that may have one or more cores and an integrated memory controller. FIG. 6A is a block diagram illustrating both an example in-order pipeline and an example register renaming, out-of-order issue/execution pipeline according to examples. FIG. 6B is a block diagram illustrating both an example in-order architecture core and an example register renaming, out-of-order issue/execution architecture core to be included in a processor according to examples. FIG. 7 illustrates examples of execution unit(s) circuitry.

DETAILED DESCRIPTION

In the following description, numerous specific details are set forth in order to provide a thorough understanding of various embodiments. However, various embodiments may be practiced without the specific details. In other instances, well-known methods, procedures, components, and circuits have not been described in detail so as not to obscure the particular embodiments. Further, various aspects of embodiments may be performed using various means, such as integrated semiconductor circuits ("hardware"), computer-readable instructions organized into one or more programs ("software"), or some combination of hardware and software. For the purposes of this disclosure, reference to "logic" shall mean either hardware (such as logic circuitry or, more generally, circuitry or a circuit), software, firmware, or some combination thereof.

As discussed above, some processors may support an enqueue command to facilitate cloud native architectures. Two such commands include ENQCMD and ENQCMDS, supported by some processors provided by Intel Corporation of Santa Clara, California. These enqueue command(s) may allow clients (e.g., both user space and kernel space applications) to submit an Input/Output ("I/O" or "IO") job to target Peripheral Component Interconnect Express (PCIe) endpoints by an abstracted job descriptor (e.g., 64 bytes). A target device capable of supporting such an enqueue command may use a Shared Work Queue (SWQ) to accept all job requests from multiple processes, and the received job requests in the SWQ can be dequeued and processed by the device's backend.
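The SWQ submission path, together with the early wake-up of claims 9 to 16, can be sketched as a simulation. This is an illustrative model under stated assumptions, not the ENQCMD instruction itself; a real enqueue command reports a full queue via a retry indication to the submitter, which the boolean return value stands in for here:

```python
from collections import deque

IDLE, ACTIVE = "idle", "active"

class TargetDevice:
    """Toy model of claims 9-16: a shared work queue accepts 64-byte job
    descriptors from any submitter, and device power management wakes the
    device on receipt of a descriptor, before the descriptor is stored
    in the work queue."""

    def __init__(self, depth=4):
        self.depth = depth   # number of SWQ slots (hypothetical)
        self.swq = deque()
        self.power_state = IDLE

    def enqueue(self, descriptor: bytes) -> bool:
        if len(descriptor) != 64:
            raise ValueError("job descriptor must be 64 bytes")
        # Claim 9: begin the idle -> active transition on receipt of the
        # descriptor, prior to it being stored in the work queue, so the
        # backend is already waking while the job is queued.
        self.power_state = ACTIVE
        if len(self.swq) >= self.depth:
            return False     # queue full: caller backs off and retries
        self.swq.append(descriptor)
        return True

    def dequeue(self):
        # The device backend drains the SWQ and processes each job.
        return self.swq.popleft() if self.swq else None
```

Starting the power-state transition on receipt, rather than after the descriptor lands in the queue, is what removes the wake-up latency from the job's critical path.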
The target PCIe endpoint usually reserves specific address windows (e.g., one for user space and another for kernel space) for enqueue command access. A device driver may map the address window to an application's memory space, and then the application can submit IO jobs via the enqueue command to a device. In turn, the device may reply to a host by writing the response to the requesting process's memory space directly. Since the target PCIe endpoint may utilize various components to process the IO jobs, power management may become complicated, e.g., resulting in the PCIe endpoint entering an idle state prematurely before a job is fully processed. Additionally, if a PCIe endpoint is in an idle state, additional enqueue command processing may cause additional latency before an enqueued job is processed. To this end, some embodiments provide techniques for improving power management and/or service request processing latency for an enqueue command. In one embodiment, a storage device stores a request reference count. Depending on the implementation, the storage device may be any type of memory capable of