US-12619455-B2 - Robust resource removal for virtual machines

Abstract

Systems and methods providing robust resource removal for virtual machines. In one implementation, a hypervisor may receive configuration data associated with a virtual machine (VM). The hypervisor may determine, based on the configuration data, a type of support by the VM of recovery from unexpected hardware resource removal. The hypervisor may identify, based on the type of support of recovery from unexpected hardware resource removal, a type of access of the VM to one or more hardware resources. The hypervisor may launch the VM according to the type of access to the one or more hardware resources.
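
The decision flow summarized in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the names `ConfigData`, `KNOWN_SUPPORTING_DRIVERS`, `determine_support`, `identify_access`, and `plan_launch`, as well as the driver names and versions, are hypothetical and chosen only for illustration. The mapping from support to direct access and from no support to virtual device access follows claims 2 and 3.

```python
from dataclasses import dataclass
from enum import Enum


class SupportType(Enum):
    NO_SUPPORT = "no support"
    SUPPORT = "support"


class AccessType(Enum):
    VIRTUAL_DEVICE = "virtual device access"
    DIRECT = "direct access"


@dataclass(frozen=True)
class ConfigData:
    """Hypothetical subset of the configuration data (driver name and version)."""
    driver_name: str
    driver_version: str


# Hypothetical lookup table: (driver, version) pairs assumed to recover
# from unexpected hardware resource removal. The entries are made up.
KNOWN_SUPPORTING_DRIVERS = {("example-net", "2.0"), ("example-blk", "1.4")}


def determine_support(config: ConfigData) -> SupportType:
    """Determine the type of support from the configuration data."""
    key = (config.driver_name, config.driver_version)
    if key in KNOWN_SUPPORTING_DRIVERS:
        return SupportType.SUPPORT
    return SupportType.NO_SUPPORT


def identify_access(support: SupportType) -> AccessType:
    """Map the type of support to a type of access (per claims 2 and 3)."""
    if support is SupportType.SUPPORT:
        return AccessType.DIRECT
    return AccessType.VIRTUAL_DEVICE


def plan_launch(config: ConfigData) -> AccessType:
    """Return the access type with which the VM would be launched."""
    return identify_access(determine_support(config))
```

For example, under these assumptions `plan_launch(ConfigData("example-net", "2.0"))` yields `AccessType.DIRECT`, while an unlisted driver falls back to `AccessType.VIRTUAL_DEVICE`.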

Inventors

  • Michael Tsirkin
  • Karen Lee Noel

Assignees

  • RED HAT, INC.

Dates

Publication Date
2026-05-05
Application Date
2022-01-04

Claims (20)

  1. A method comprising: launching a first virtual machine (VM) with access to memory allocated to a second VM; tasking the first VM with identifying configuration data in the memory allocated to the second VM that indicates a type of support by the second VM of recovery from unexpected hardware resource removal and transmitting the configuration data identified by the first VM in the memory allocated to the second VM to a hypervisor operating on a host operating system; receiving, by the hypervisor operating on the host operating system, the configuration data identified by the first VM in the memory allocated to the second VM, wherein the second VM is separate from the first VM; determining, based on the configuration data, the type of support by the second VM of recovery from unexpected hardware resource removal; identifying, based on the type of support of recovery from unexpected hardware resource removal, a type of access of the second VM to one or more hardware resources; and launching the second VM according to the type of access to the one or more hardware resources.
  2. The method of claim 1, wherein the type of support is no support, and wherein the type of access is virtual device access.
  3. The method of claim 1, wherein the type of support is support, and wherein the type of access is direct access.
  4. The method of claim 1, further comprising: determining, by the hypervisor, to deallocate one of the one or more hardware resources; responsive to determining that the type of support by the second VM is support, suspending the second VM; and responsive to receiving an acknowledgment that the second VM has been suspended, deallocating the one of the one or more hardware resources.
  5. The method of claim 1, further comprising: determining, by the hypervisor, to deallocate one of the one or more hardware resources assigned to the second VM, wherein the one of the one or more hardware resources is part of a physical device; and responsive to determining that the type of support by the second VM is support, removing the physical device from the second VM.
  6. The method of claim 1, wherein determining, based on the configuration data, the type of support by the second VM of recovery from unexpected hardware resource removal comprises: identifying, in a data structure, an entry that corresponds to the configuration data; and determining, based on the identified entry, whether the second VM supports recovery from removal of the one or more hardware resources.
  7. The method of claim 1, further comprising: storing, in hypervisor memory, an indicator associated with the second VM, wherein the indicator indicates the type of support by the second VM of recovery from unexpected hardware resource removal.
  8. The method of claim 1, wherein the configuration data comprises at least one of a version of a device driver installed on the second VM, a version of a guest operating system installed on the second VM, a list of drivers installed on the second VM, or a vendor of a driver installed on the second VM.
  9. A system comprising: a memory; and a processing device operatively coupled to the memory, the processing device to: launch a first virtual machine (VM) with access to memory allocated to a second VM; task the first VM with identifying configuration data in the memory allocated to the second VM that includes information related to a device driver installed on the second VM and transmitting the configuration data identified by the first VM in the memory allocated to the second VM to a hypervisor; receive, by the hypervisor, the configuration data identified by the first VM in the memory allocated to the second VM, wherein the second VM is separate from the first VM; identify, by the hypervisor, the device driver installed on the second VM based on the configuration data; determine whether at least one parameter of a plurality of parameters associated with the device driver matches a surprise removal support capability parameter, wherein the surprise removal support capability parameter indicates that the device driver supports surprise removal of one or more hardware resources supported by the device driver; and responsive to determining that the at least one parameter of the plurality of parameters associated with the device driver matches the surprise removal support capability parameter, launch the second VM with direct access to the one or more hardware resources supported by the device driver via mapping physical device memory of the one or more hardware resources to a virtual memory address range of the second VM.
  10. The system of claim 9, wherein the processing device is further to: responsive to determining that the at least one parameter of the plurality of parameters associated with the device driver does not match the surprise removal support capability parameter, launch the second VM; and provide the second VM access to the one or more hardware resources through a virtual device.
  11. The system of claim 9, wherein the processing device is further to: determine, by the hypervisor, to deallocate one of the one or more hardware resources assigned to the second VM; responsive to determining that the at least one parameter of the plurality of parameters associated with the device driver matches the surprise removal support capability parameter, suspend the second VM; and responsive to receiving an acknowledgment that the second VM has been suspended, deallocate the one of the one or more hardware resources.
  12. The system of claim 9, wherein the processing device is further to: responsive to determining that the at least one parameter of the plurality of parameters associated with the device driver does not match the surprise removal support capability parameter, determine that the second VM does not support recovery from unexpected removal of hardware resources; responsive to determining that the at least one parameter of the plurality of parameters associated with the device driver matches the surprise removal support capability parameter, determine that the second VM supports recovery from unexpected removal of the hardware resources; and store, in hypervisor memory, an indicator associated with the second VM, wherein the indicator indicates whether the second VM supports recovery from unexpected removal of the hardware resources.
  13. A non-transitory computer-readable media storing instructions that, when executed, cause a processing device to: launch a first virtual machine (VM) with access to memory allocated to a second VM; task the first VM with identifying configuration data in the memory allocated to the second VM that indicates a type of support by the second VM of recovery from unexpected hardware resource removal and transmitting the configuration data identified by the first VM in the memory allocated to the second VM to a hypervisor operating on a host operating system; receive, by the hypervisor operating on the host operating system, the configuration data identified by the first VM in the memory allocated to the second VM, wherein the second VM is separate from the first VM; determine, based on the configuration data, a type of support by the second VM of recovery from unexpected hardware resource removal; identify, based on the type of support of recovery from unexpected hardware resource removal, a type of access of the second VM to one or more hardware resources; and launch the second VM according to the type of access to the one or more hardware resources.
  14. The non-transitory computer-readable media of claim 13, wherein the type of support is no support, and wherein the type of access is virtual device access.
  15. The non-transitory computer-readable media of claim 13, wherein the type of support is support, and the type of access is direct access.
  16. The non-transitory computer-readable media of claim 13, further comprising: determining, by the hypervisor, to deallocate one of the one or more hardware resources assigned to the second VM, wherein the one of the one or more hardware resources is part of a physical device; and responsive to determining that the type of support by the second VM is support, removing the physical device from the second VM.
  17. The non-transitory computer-readable media of claim 13, wherein determining, based on the configuration data, the type of support by the second VM of recovery from unexpected hardware resource removal comprises: identifying, in a data structure, an entry that corresponds to the configuration data; and determining, based on the identified entry, whether the second VM supports recovery from removal of the one or more hardware resources.
  18. The non-transitory computer-readable media of claim 13, wherein the configuration data comprises at least one of a version of a device driver installed on the second VM, a version of a guest operating system installed on the second VM, a list of drivers installed on the second VM, or a vendor of a driver installed on the second VM.
  19. The non-transitory computer-readable media of claim 13, further comprising: storing, in hypervisor memory, an indicator associated with the second VM, wherein the indicator indicates the type of support by the second VM of recovery from unexpected hardware resource removal.
  20. The non-transitory computer-readable media of claim 13, further comprising: determining, by the hypervisor, to deallocate one of the one or more hardware resources; responsive to determining that the type of support by the second VM is support, suspending the second VM; and responsive to receiving an acknowledgment that the second VM has been suspended, deallocating the one of the one or more hardware resources.
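
The deallocation sequence recited in claims 4, 11, and 20 (suspend the VM, wait for the suspension acknowledgment, then deallocate) together with the stored indicator of claims 7, 12, and 19 can be sketched as a toy model. This is an illustration under stated assumptions, not the claimed implementation: the classes `VM` and `Hypervisor`, their methods, and the resource names are hypothetical. The claims do not say what happens on this path when the VM lacks support, so the sketch simply declines to deallocate in that case.

```python
from dataclasses import dataclass, field


@dataclass
class VM:
    """Toy stand-in for a guest VM (hypothetical model)."""
    name: str
    resources: set = field(default_factory=set)
    suspended: bool = False

    def suspend(self) -> bool:
        """Suspend the VM and return an acknowledgment.

        In this toy model suspension always succeeds.
        """
        self.suspended = True
        return True


@dataclass
class Hypervisor:
    """Toy stand-in for the hypervisor (hypothetical model)."""
    # Per-VM indicator of surprise-removal support, kept in hypervisor memory.
    support_indicator: dict = field(default_factory=dict)

    def record_support(self, vm: VM, supported: bool) -> None:
        """Store the indicator associated with the VM."""
        self.support_indicator[vm.name] = supported

    def deallocate(self, vm: VM, resource: str) -> bool:
        """Deallocate a resource using the suspend-then-remove sequence.

        Returns True if the resource was deallocated, False otherwise.
        """
        if not self.support_indicator.get(vm.name, False):
            # Assumption: without recorded support, this path is not taken.
            return False
        acknowledged = vm.suspend()
        if not acknowledged:
            return False
        # Only after the suspension acknowledgment is the resource removed.
        vm.resources.discard(resource)
        return True
```

In this model, a VM recorded as supporting surprise removal is suspended and loses the resource without any removal notification being exchanged with the guest OS, while a VM recorded as not supporting it keeps its resources.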

Description

TECHNICAL FIELD

The present disclosure is generally related to virtualization systems, and more particularly, to robust resource removal for virtual machines.

BACKGROUND

A virtual machine (VM) is a portion of software that, when executed on appropriate hardware, creates an environment allowing the virtualization of an actual physical computer system (e.g., a server, a mainframe computer, etc.). The actual physical computer system is typically referred to as a “host machine,” and the operating system (OS) of the host machine is typically referred to as the “host operating system.” Typically, software on the host machine known as a “hypervisor” (or a “virtual machine monitor”) manages the execution of one or more virtual machines or “guests,” providing a variety of functions such as virtualizing and allocating resources, context switching among virtual machines, etc. The operating system (OS) of the virtual machine is typically referred to as the “guest operating system.”

In a virtualized environment, physical devices, such as network devices or video cards, can be made available to guests by the hypervisor by a process known as device assignment. The hypervisor can create a virtual device within the guest that is associated with the physical device so that any access of the virtual device can be forwarded to the physical device by the hypervisor with little or no modification. Removal of a device from an assigned guest OS without warning (e.g., by simply unplugging it without using a device manager or removal utility) is referred to as “surprise removal.”

BRIEF DESCRIPTION OF THE DRAWINGS

The present disclosure is illustrated by way of example, and not by way of limitation, and can be more fully understood with reference to the following detailed description when considered in connection with the figures in which:

FIG. 1 depicts a block diagram of an example computer system architecture operating in accordance with one or more aspects of the present disclosure.

FIG. 2 depicts a block diagram illustrating a computer system operating in accordance with one or more aspects of the present disclosure.

FIG. 3 is a flow diagram of an example method of providing robust resource removal for VMs, in accordance with one or more aspects of the present disclosure.

FIG. 4 is a flow diagram of an example method for determining the surprise removal capability of a VM using an exposed parameter, in accordance with one or more aspects of the present disclosure.

FIG. 5 depicts a block diagram of an example computer system, in accordance with one or more aspects of the present disclosure.

FIG. 6 depicts a block diagram of an illustrative computer system operating in accordance with one or more aspects of the present disclosure.

DETAILED DESCRIPTION

Implementations of the disclosure are directed to providing robust resource removal for virtual machines (VMs).

In a virtualized environment, removal of a virtual device from a host computer system can sometimes occur for reasons of reliability (due to guest OS or host OS instability) or resource overcommit (e.g., when a host OS is short on resources consumed by the guest OS) to free resources for use by different virtual machines or hosts. Typically, removal of a virtual device (such as a Peripheral Component Interconnect (PCI) device) from a VM involves sending a removal notification from the hypervisor to the guest OS of the VM and receiving an explicit acknowledgment from the guest OS that indicates that the guest OS has entered a state in which it is safe to remove the device. Waiting for the acknowledgment avoids guest OS errors that would result from removing the device before the guest OS (also referred to herein as “guest”) has flushed any associated cache, and thus avoids losing data. For example, the removal of a disk could result in the loss of critical data, or the removal of a network interface controller could result in the loss of networking communication packet information.
However, this removal process may be time consuming, particularly if the guest is slow or is not operating properly. Some VMs are capable of supporting the surprise removal of physical devices. That is, some VMs include systems that enable the VM to recover from a surprise removal of hardware resources of an assigned physical device. For VMs that support surprise removal, the removal process of having the hypervisor send a removal notification to the VM's guest OS, and waiting for explicit approval from the VM's guest OS, can be redundant and unnecessarily time consuming.

Aspects of the present disclosure address the above-noted and other deficiencies by implementing robust resource removal for VMs that support surprise removal. The hypervisor, prior to launching a VM, can determine whether the VM supports surprise removal. In embodiments, the hypervisor can launch a special helper VM prior to launching the VM. The special helper VM can have access to the memory of the VM, and can identif