KVM SUPPORT STATEMENTS FOR SLES 11 SP2

Overview
--------

This document provides information about kvm supportability for use by the
customer support team, quality engineering, end users, and other interested
parties.

Kvm stands for Kernel-based Virtual Machine. Kvm consists of two main
components:

* A set of kernel modules (kvm.ko, kvm-intel.ko, and kvm-amd.ko) that
  provides the core virtualization infrastructure and processor-specific
  drivers.

* A userspace program (qemu-kvm) that provides emulation for virtual devices
  and control to manage virtual machines.

The term kvm more properly refers to the kernel-level virtualization
functionality, but is also commonly used to reference the userspace
component.

KVM Host Status
---------------

The qemu-kvm version currently included in SLES 11 SP2 is 0.15.1. In
addition to the qemu-kvm program, the kvm package provides a monitoring
utility, firmware components, key-mapping files, scripts, and Windows
drivers. These components, along with the kvm kernel modules, are the focus
of this support document.

Interoperability with other virtualization tools has been tested and is an
essential part of SUSE's support stance. These tools include: virt-manager,
vm-install, qemu-img, virt-viewer and the libvirt daemon and shell.

KVM Host Configuration
----------------------

Kvm supports a number of different architectures, but we support only
x86_64 hosts. Kvm is designed around hardware virtualization features
included in both AMD (AMD-V) and Intel (VT-x) cpus, as well as other
virtualization features such as IOMMU and SR-IOV.

The following websites identify processors which support hardware
virtualization:

  http://wiki.xensource.com/xenwiki/HVM_Compatible_Processors
  http://en.wikipedia.org/wiki/X86_virtualization

The kvm kernel modules will not load if the basic hardware virtualization
features are not present or are not enabled in the BIOS.
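As a quick host-side check of the hardware requirement described above, the
following Python sketch looks for the Intel VT-x ("vmx") and AMD-V ("svm")
cpu flags in /proc/cpuinfo. The function name hardware_virt_flags is
hypothetical and chosen only for illustration. Note that a flag being present
shows only that the cpu has the capability; it does not by itself show that
the feature is enabled in the BIOS.

```python
def hardware_virt_flags(cpuinfo_text):
    """Return the hardware virtualization flags found in cpuinfo text.

    "vmx" indicates Intel VT-x and "svm" indicates AMD-V.
    (Hypothetical helper, not part of any SUSE tool.)
    """
    found = set()
    for line in cpuinfo_text.splitlines():
        # Each processor entry has a line of the form "flags : fpu vme ..."
        if line.startswith("flags"):
            _, _, value = line.partition(":")
            found.update(flag for flag in value.split()
                         if flag in ("vmx", "svm"))
    return found


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as cpuinfo:
            flags = hardware_virt_flags(cpuinfo.read())
    except OSError:
        flags = set()
    if flags:
        print("Hardware virtualization flags found: " + ", ".join(sorted(flags)))
    else:
        print("No vmx or svm flag found; the kvm modules are unlikely to load.")
```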
Qemu-kvm can operate without the kvm kernel modules loaded, but we do not
support this mode of operation.

Kvm allows for both memory and disk space overcommit. However, it is up to
the user to understand the implications of doing so, as hard errors
resulting from actually exceeding available resources will result in guest
failures. CPU overcommit is also allowed but can result in severe
performance degradation.

Guest Virtual Machine Details
-----------------------------

The table below lists the guest operating systems tested and their support
status.

All guest OSs listed include both 32 and 64 bit x86 versions. For a
supportable configuration, the same minimum memory requirements as for a
physical installation are assumed.

Most guests require some additional support for accurate time keeping. Where
available, kvm-clock is to be used. NTP or similar network based time
keeping protocols are also highly recommended (in host as well as guest) to
help maintain stable time. When using kvm-clock, running NTP inside the
guest is not recommended.

Be aware that guest nics which don't have an explicit mac address specified
(on the qemu-kvm command line) will be assigned a default mac address,
resulting in networking problems if more than one such instance is visible
on the same network segment.

Guest images created under SLES 11 SP1 are supported under SLES 11 SP2, but
not vice-versa.
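To illustrate the mac address advice above, here is a minimal Python sketch
that picks a random address and builds the matching qemu-kvm NIC option. The
helper names random_guest_mac and nic_option are hypothetical. The 52:54:00
prefix is an assumption based on the prefix conventionally used for QEMU/KVM
guests, and a random choice does not by itself guarantee uniqueness, so a
real deployment would still need to track the addresses already in use.

```python
import random

# Prefix conventionally used for QEMU/KVM guest NICs (assumption; any
# locally administered prefix agreed on for the site would also work).
QEMU_MAC_PREFIX = (0x52, 0x54, 0x00)


def random_guest_mac(rng=random):
    """Return a mac address string with the QEMU prefix and a random tail."""
    tail = tuple(rng.randint(0x00, 0xFF) for _ in range(3))
    return ":".join("%02x" % byte for byte in QEMU_MAC_PREFIX + tail)


def nic_option(mac, model="virtio"):
    """Build the qemu-kvm arguments for one guest NIC with an explicit mac."""
    return ["-net", "nic,model=%s,macaddr=%s" % (model, mac)]


if __name__ == "__main__":
    # Print the option pair that would be appended to a qemu-kvm invocation.
    print(" ".join(nic_option(random_guest_mac())))
```

The resulting pair of arguments would be added to the qemu-kvm command line
once per guest NIC, using a different address for each guest on the same
network segment.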

+-------------------------------------------------------------------------------------------------+
Guest OS        Virt Type  PV Drivers Available     Support Status    Notes
+-------------------------------------------------------------------------------------------------+
SLES11 SP2      FV         kvm-clock,               Fully Supported
SLES11 SP1                 virtio-net,
                           virtio-blk,
                           virtio-balloon

SLES10 SP4      FV         kvm-clock,               Fully Supported
                           virtio-net,
                           virtio-blk,
                           virtio-balloon

SLES9 SP4       FV                                  Fully Supported   32 bit kernel: specify
                                                                      clock=pmtmr on linux
                                                                      boot line
                                                                      64 bit kernel: specify
                                                                      ignore_lost_ticks on
                                                                      linux boot line

SLED11 SP2      FV         kvm-clock,               Tech Preview
SLED11 SP1                 virtio-net,
                           virtio-blk,
                           virtio-balloon

RHEL 4.x        FV         Yes, see RedHat          Best Effort       See footnote
RHEL 5.x                   website for details
RHEL 6.x

Win 2003 SP2+   FV         virtio-net,              Fully Supported   Host must have
Win 2008 SP2+              virtio-blk,              with SVVP         constant_tsc cpu
Win 2008 R2+               virtio-balloon           certification     feature
Win 2012                   (VMDP drivers preferred,
                           win-virtio-drivers.iso
                           drivers are deprecated)

Win XP SP3+     FV         virtio-net,              Best Effort
Win 2003 SP2+              virtio-blk,
Win Vista SP2+             virtio-balloon
Win 7 SP1+                 (VMDP drivers preferred,
Win 8                      win-virtio-drivers.iso
                           drivers are deprecated)
+-------------------------------------------------------------------------------------------------+

Footnote:
http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html/Virtualization_Host_Configuration_and_Guest_Installation_Guide/chap-Virtualization_Host_Configuration_and_Guest_Installation_Guide-KVM_guest_timing_management.html

Supported Limits
----------------

The following limits have been tested, and are supported:

Host RAM and CPU              Same with kvm modules loaded as without.
                              Refer to the SLES Release Notes for specifics
Guest RAM size                512 GB
Virtual CPUs per guest        64
NICs per guest                8
Block devices per guest       4 emulated, 20 para-virtual (virtio-blk)
Maximum number of guests      Limit is defined as the total number of vcpus
                              in all guests being no greater than 8 times
                              the number of cpu cores in the host

General KVM Features
--------------------

- vm-install interoperability
    Define and Install guest via vm-install
    This includes specifying RAM, disk type and location, video type,
    keyboard mapping, NIC type, binding and mac address, and boot method
    Restrictions: Raw disk, qcow2, or qed format only.
                  Realtek, e1000 or virtio NICs only
                  Sound cards are not supported

- virt-manager interoperability
    Manage guests via virt-manager
    This includes autostart, start, stop, restart, pause, unpause, save,
    restore, clone, migrate, special key sequence insertion, guest console
    viewers, performance monitoring, cpu pinning, and static modification
    of cpu, RAM, boot method, disk, nic, mouse, display, video and host PCI
    assignments
    Restrictions: No sound devices, qxl, vmvga (vmware) or xen video
                  devices added
                  pcnet, ne2k_pci and eepro100 virtual NICs are not
                  supported
                  emulated scsi disk is not supported
                  Raw, qed and qcow2 are the only supported storage formats
                  Spice graphics is not supported

- virsh interoperability
    Manage guests via virsh
    virsh is an in-tree libvirt tool that exposes all of the libvirt API.
    Most virsh subcommands are supported, including creation, modification,
    and destruction of guests and all lifecycle operations. Any virsh
    subcommands which translate to unsupported qemu-kvm command-line or
    monitor syntax would also be unsupported. Guest XML descriptions used
    by virsh can be created manually, using vm-install/virt-manager, or
    with external tools/scripts.

- qemu-kvm command-line
    Manage guests via direct invocation of qemu-kvm.
    It's generally preferred to use virt-manager, but for greater
    flexibility, qemu-kvm may be directly invoked.
    Restrictions: See restrictions in Appendix A

- kvm_stat
    Debugging/monitoring functionality

- Live and static migration
    Migration of guests between hosts
    Restrictions: Source and target machines should have the same cpu
                  features which are exposed to the guest by the cpu model
                  specified (-cpu xxx). Guest storage is accessible from
                  both machines (shared), guest timekeeping is properly
                  controlled, compatible guest "definition" on source and
                  target, no physical devices passed through to the guest.
                  Migration from SP2 to SP1 guests is not supported.
                  Migration support from SP1 to SP2 guests is being
                  evaluated.

- Kernel Samepage Merging (KSM)
    The KVM Host includes KSM support, and KVM is optimized to use KSM if
    available. It allows for automatic sharing of memory pages between
    guests, freeing some host memory.

- Transparent Huge Pages (THP)
    The KVM host now includes THP support, and KVM is optimized to use THP
    if available, both via madvise and opportunistic methods.

- PCI Passthrough
    An AMD IOMMU or Intel VT-d is required for this feature. The respective
    feature needs to be enabled in the BIOS. For VT-d, passing the kernel
    parameter "intel_iommu=on" is mandatory. Additionally, some host
    hardware will require the use of the kvm kernel module parameter
    allow_unsafe_assigned_interrupts=1 (with the attendant security
    issues). Many PCIe cards from major vendors should be supportable.
    Refer to systems level certifications for specific details, or contact
    the vendor in question for support statements.

- USB Host Device Passthrough
    A physical USB device may be passed through from the host to a guest.
    This is supported.

- Memory ballooning
    Dynamically changing the amount of memory allocated to a guest
    Restrictions: This requires a balloon driver operating in the guest.

Other KVM Features
------------------

- Hotplug cpu
    Dynamically changing the number of vcpus assigned to the guest
    This is not supported.

- Hotplug devices
    Dynamically adding or removing emulated or passthrough physical devices
    in the guest.
    This is now supported.

- User for kvm
    What users may invoke qemu-kvm or the management tools (which in turn
    invoke qemu-kvm)
    The user must be root when using SUSE included management tools.
    Otherwise, the user must be in the kvm group. Libvirt now has the
    option to invoke qemu-kvm using a non-privileged user named qemu,
    though the default is still to use the root user.

- Host Suspend/Hibernate
    Suspending/hibernating host with KVM installed or with guests running
    Suspending or hibernating the host with guests running is not
    supported. Merely having kvm installed, however, is supported.

- Power Management
    Changing power states in the host while guests are running
    A properly functioning constant_tsc machine is required.

- NUMA
    Support for KVM on NUMA machines
    NUMA machines are supported. Using numactl to pin qemu-kvm processes to
    specific nodes is recommended.

- Kvm module parameters
    Specifying parameters for the kvm kernel modules
    This is not supported unless done under the direction of SUSE support
    personnel.

- Qemu only mode
    Qemu-kvm can be used in non-kvm mode, where the guest cpu instructions
    are emulated instead of being executed directly by the processor. This
    mode is enabled by using the -no-kvm parameter.
    This mode is not supported, but may be useful for problem resolution.

- VirtFS (filesystem passthrough)
    Portions of the host filesystem can now be passed through to the guest
    via VirtFS.
    This is supported.

- Guest Agent (/usr/bin/qemu-ga)
    The guest agent is a new feature which permits specific actions to be
    taken within a linux guest, as controlled from the host.
    This is not yet supported.

- vhost-net kernel module support
    The vhost-net kernel module allows for a more efficient network
    transport to the guest. It is automatically used by libvirt if loaded,
    or when using the qemu-kvm command line, by adding ",vhost=on" to the
    networking option.
    This is supported.

- AHCI guest storage interface
    The AHCI interface for SATA storage has been recently added. It permits
    much higher block i/o performance than the ide interface, and is
    particularly useful in recent Windows OS versions.
    It is not yet supported.

- qcow2 and qed storage formats
    qcow2 is now supported, and a new qed format which in some cases
    performs better than qcow2 is also supported.

- Online Disk Resizing
    Online disk resizing is being added to qemu-kvm, but is not yet
    supported.

Deprecation, Supersession and Dropped Features
----------------------------------------------

The use of ",boot=on" for virtio disks is no longer needed since the bios
used supports the virtio block interface directly. In fact, its usage may
cause problems, and is now considered deprecated.

The "-tdf" command line option is now considered deprecated.

The unsupported "-no-kvm-pit" command line option is now considered
deprecated.

The unsupported "-no-kvm-pit-reinjection" command line option is now
considered deprecated.

The "-pcidevice" qemu-kvm command line option is no longer recognized. Use
"-device pci-assign ..." instead.

The "-osk" qemu-kvm command line option is no longer recognized.

The "-M mac" qemu-kvm command line option is no longer recognized.

The unsupported "-enable-nesting" command line option is no longer
recognized.

Performance Limits
------------------

Our effort to provide kvm virtualization to our customers is to allow
workloads designed for physical installations to be virtualized and thus
inherit the benefits of modern virtualization techniques. Some trade-offs
present with virtualizing workloads are a slight to moderate performance
impact and the need to stage the workload to verify its behavior in a
virtualized environment (esoteric software and rare, but possible, corner
cases can behave differently).

Although every reasonable effort is made to provide a broad virtualization
solution to meet disparate needs, there will be cases where the workload
itself is unsuited for kvm virtualization. In these cases creating an L3
incident would be inappropriate.

We therefore propose the following expectations for guest performance, to
be used as a guideline in determining if a reported performance issue
should be investigated as a bug (values given are rough approximations at
this point; more validation is required):

Category           Fully Virtualized    Paravirtualized   Host-Passthrough

CPU, MMU           7% (QEMU emulation   not applicable    97% (Hardware Virt.
                   (unsupported))                         + EPT/NPT)
                                                          85% (Hardware Virt.
                                                          + Shadow Pagetables)

Network I/O        60%                  75%               95%
(1Gb LAN)          e1000 emulated NIC   Virtio NIC

Disk I/O           40%                  85%               95%
                   IDE emulation        Virtio block

Graphics           50%                  not applicable
(non-accelerated)  VGA or Cirrus

Time accuracy      95% - 105%           100%              not applicable
  (worst case, using kvm-clock recommended settings and before ntp assist),
  where 100% = accurate timekeeping, 150% = time runs fast by 50%, etc.

Percentage values are a comparison of performance achieved with the same
workload under non-virtualized conditions. SUSE does not guarantee
performance numbers.

Paravirtualization Support
--------------------------

To improve the performance of the guest OS, paravirtualized drivers are
provided when available. Their use is recommended, but is not generally
required for support.

SUSE has developed virtio based drivers for Windows, which are available in
the Virtual Machine Driver Pack (VMDP). These drivers are preferred over
the drivers provided in the win-virtio-drivers.iso image file (the latter
is deprecated and will probably not be provided in future releases).

One of the more difficult aspects of virtualization is correct timekeeping,
and we are still evaluating proposed guidelines for the best configuration
to achieve that goal.
As mentioned previously, if a pv timekeeping option is available (e.g.
kvm-clock), it should be used.

The memory ballooning driver is provided to help manage memory resources
among competing demands. Management tools, for example, can take advantage
of that feature.

Appendix A
----------

These storage formats are supported: raw, qcow2, qed. This storage may be
located in files within the host filesystem, logical volumes, host physical
disks, or network based storage.

The following qemu-kvm command line options are supported:

  -h
  -help
  -version
  -M [pc|pc-0.12|pc-0.14|pc-0.15]
  -cpu [?|qemu64]
  -smp ...
  -fda/-fdb ...
  -hda/-hdb/-hdc/-hdd ...
  -cdrom ...
  -drive ... (if specified if=[ide|floppy|virtio] and format=[raw|qcow2|qed] and snapshot=off)
  -global ...
  -boot ...
  -m ...
  -k ...
  -audio-help
  -usb
  -usbdevice [tablet|mouse]
  -device [isa-serial|isa-parallel|isa-fdc|ide-drive|ide-hd|ide-cd|pci-assign|VGA|cirrus-vga|rtl8139|virtio-net-pci|virtio-blk-pci|virtio-balloon-pci|virtio-9p-pci|usb-hub|usb-ehci|usb-tablet|usb-storage|usb-mouse|usb-kbd|virtserialport|virtconsole|virtio-serial-pci|sga|rtl8139|i82559er]
  -fsdev ...
  -virtfs ...
  -name ...
  -uuid ...
  -display ...
  -nographic
  -no-frame
  -alt-grab
  -ctrl-grab
  -no-quit
  -sdl
  -vga [std|cirrus|none]
  -full-screen
  -vnc ...
  -no-acpi
  -no-hpet
  -balloon ...
  -smbios ...
  -net [nic|user|tap|none] ... (for model= only rtl8139 and virtio are supported)
  -netdev ...
  -chardev ...
  -kernel ...
  -append ...
  -initrd ...
  -serial ...
  -parallel ...
  -monitor ...
  -mon ...
  -debugcon ...
  -pidfile ...
  -S
  -gdb ...
  -s
  -d ...
  -enable-kvm
  -no-reboot
  -no-shutdown
  -loadvm ...
  -daemonize
  -clock
  -rtc ...
  -watchdog ...
  -watchdog-action ...
  -echr ...
  -incoming ...
  -nodefaults
  -runas ...
  -readconfig ...
  -writeconfig ...
  -nodefconfig
  -tdf
  -mem-path ...
  -mem-prealloc

  (NOTE: -pcidevice is no longer an option. Use -device pci-assign instead)

The following qemu-kvm monitor commands are supported:

  help
  ?
  info ...
  loadvm ...
  logfile ...
  logitem ...
  q
  block_resize ...
  eject ...
  drive_del ...
  change device ...
  stop
  [c|cont]
  gdbserver ...
  x ...
  xp ...
  [p|print] ...
  sendkey ...
  system_reset
  system_powerdown
  device_add ...
  device_del ...
  cpu ...
  mouse_move ...
  mouse_button ...
  mouse_set ...
  memsave ...
  pmemsave ...
  migrate ...
  migrate_set_speed ...
  migrate_set_downtime ...
  drive_add ...
  balloon target ...
  watchdog_action ...
  mce ...

Conversely, the following qemu-kvm command line options are not supported:

  -M [pc-0.13|pc-0.11|pc-0.10|isapc]
  -cpu [phenom|core2duo|qemu32|kvm64|coreduo|486|pentium|pentium2|pentium3|athlon|n270]
  -numa ...
  -set ...
  -drive ... (if specified if=[scsi|mtd|pflash], snapshot=on, or format=[anything besides raw, qcow2, or qed])
  -mtdblock ...
  -sd ...
  -pflash ...
  -snapshot
  -soundhw ...
  -usbdevice [disk|host|serial|braille|net]
  -device driver [ivshmem|smbus-eeprom|scsi-disk|scsi-cd|scsi-hd|scsi-generic|usb-wacom-tablet|usb-braille|usb-serial|usb-net|usb-bt-dongle|ioh3240|x3130-upstream|xio3130-downstream|ich9-usb-uhci1|ich9-usb-uhci2|ich9-usb-uhci3|vt82c686b-usb-uhci|piix3-usb-uhci|piix4-usb-uhci|sysbus-ohci|pci-ohci|ich9-usb-ehci1|SUNW|sysbus-fdc|isa-applesmc|usb-ccid|ccid-card-passthrough|i6300esb|ne2k_pci|i82801|i825*|pcnet|ne2k_isa|ich9_ahci|lsi53c895a|isa-vga|vmware-vga|sb16|AC97|gus|cs4231a|intel-hda|hda-duplex|hda-output|ib700|isa-debugcon|testdev]
  -curses
  -spice
  -portrait
  -rotate
  -vga [vmware|qxl|xenfb]
  -g ...
  -win2k-hack
  -no-fd-bootchk
  -acpitable ...
  -net socket ...
  -net dump ...
  -bt ...
  -qmp ...
  -singlestep
  -L ...
  -bios ...
  -option-rom ...
  -icount ...
  -virtioconsole ...
  -show-cursor
  -tb-size ...
  -chroot ...
  -prom-env ...
  -semihosting
  -old-param
  -no-kvm
  -no-kvm-irqchip
  -no-kvm-pit
  -no-kvm-pit-reinjection
  -nvram ...
  -kvm-shadow-memory ...

  (NOTE: -osk is no longer an option)
  (NOTE: -M mac is no longer an option)

The following qemu-kvm monitor commands are not supported:

  commit ...
  screendump ...
  savevm ...
  delvm ...
  singlestep ...
  i ...
  o ...
  sum ...
  usb_add ...
  wavcapture ...
  stopcapture ...
  boot_set
  nmi ...
  migrate_cancel
  client_migrate_info ...
  snapshot_blkdev ...
  pci_add ...
  pci_del ...
  pcie_aer_inject_error ...
  host_net_add ...
  host_net_remove ...
  netdev_add
  netdev_del ...
  hostfwd_add ...
  hostfwd_remove ...
  set_link ...
  acl_show ...
  acl_policy ...
  acl_add ...
  acl_remove ...
  acl_reset ...
  close_fd ...
  block_passwd ...
  cpu_set ...
  set_password ...
  expire_password ...
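
As a rough illustration of how the lists in this appendix might be applied,
the following Python sketch flags options in a qemu-kvm argument list that
this document marks as unsupported, deprecated, or no longer recognized. The
names OPTION_STATUS and check_command_line are hypothetical, and the table
holds only a small subset of the options listed above. The sketch compares
option names only; it does not inspect option values such as "-vga qxl" or
"-drive ...,if=scsi".

```python
# Status of a few options, copied from this document (subset only).
OPTION_STATUS = {
    "-no-kvm": "not supported",
    "-snapshot": "not supported",
    "-soundhw": "not supported",
    "-spice": "not supported",
    "-curses": "not supported",
    "-numa": "not supported",
    "-qmp": "not supported",
    "-tdf": "deprecated",
    "-no-kvm-pit": "deprecated",
    "-no-kvm-pit-reinjection": "deprecated",
    "-pcidevice": "no longer recognized",
    "-osk": "no longer recognized",
    "-enable-nesting": "no longer recognized",
}


def check_command_line(argv):
    """Return (option, status) pairs for every flagged option in argv."""
    return [(arg, OPTION_STATUS[arg]) for arg in argv if arg in OPTION_STATUS]


if __name__ == "__main__":
    example = ["qemu-kvm", "-m", "1024", "-no-kvm", "-pcidevice", "host=00:19.0"]
    for option, status in check_command_line(example):
        print("%s: %s" % (option, status))
```

An empty result means only that none of the options in the sketch's table
were found; the full lists in this appendix remain the reference.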