KVM SUPPORT STATEMENTS FOR SLES 11 SP3

Overview
--------

This document provides information about KVM supportability for use by the customer support team, quality engineering, end user, and other interested parties.

KVM stands for Kernel-based Virtual Machine. KVM consists of two main components:

* A set of kernel modules (kvm.ko, kvm-intel.ko, and kvm-amd.ko) that provides the core virtualization infrastructure and processor specific drivers.
* A userspace program (qemu-kvm) that provides emulation for virtual devices and control to manage virtual machines

The term KVM more properly refers to the kernel level virtualization functionality, but is also commonly used to reference the userspace component.

KVM Host Status
---------------

The qemu-kvm version currently included in SLES 11 SP3 is 1.4.2. In addition to the qemu-kvm program the kvm package provides a monitoring utility, firmware components, helper utilities, key-mapping files and scripts. These components, along with the KVM kernel modules and other related kernel modules are the focus of this support document.

Interoperability with other virtualization tools has been tested and is an essential part of SUSE's support stance. These tools include: virt-manager, vm-install, qemu-img, virt-viewer and the libvirt daemon and shell.

KVM Host Configuration
----------------------

KVM supports a number of different architectures, but SUSE supports only 64 bit x86 hosts. Additionally, KVM on the s390 architecture is now provided as a technology preview.

KVM is designed around hardware virtualization features included in both AMD (AMD-V) and Intel (VT-x) x86 cpus, as well as other virtualization features such as IOMMU and SR-IOV. The required KVM kernel modules will not load if the hardware virtualization feature is not present or is not enabled in the BIOS. Qemu-kvm can function without these kernel modules loaded, but this mode is not supported.

KVM allows for both memory and disk space overcommit.
It is up to the user to understand the implications of doing so, however, as hard errors resulting from actually exceeding available resources will result in guest failures. CPU overcommit is also allowed but can result in severe performance degradation.

Guest Support Details
---------------------

The table at the end of this section lists the x86 guest operating systems tested and their support status.

All guest OSs listed include both 32 and 64 bit x86 versions. For a supportable configuration, the same minimum memory requirements as for a physical installation are assumed.

Most guests require some additional support for accurate time keeping. Where available, kvm-clock is to be used. NTP or similar network based time keeping protocols are also highly recommended (in host as well as guest) to help maintain stable time. When kvm-clock is in use, however, running NTP inside the guest is not recommended.

Be aware that guest NICs which don't have an explicit MAC address specified (on the qemu-kvm command line) will be assigned a default MAC address, resulting in networking problems if more than one such instance is visible on the same network segment.

Guest images created under previous supported releases are supported under this release, but not vice-versa.
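
As an illustration of the MAC address caveat above, the following is a minimal Python sketch that generates a distinct address for each guest NIC and formats it as a "-net nic" option. The helper names random_guest_mac and nic_option are hypothetical and not part of any SUSE tool. The 52:54:00 prefix is assumed here because it is the locally administered prefix conventionally used for QEMU/KVM guests; any locally administered unicast prefix would serve the same purpose.

```python
import random

# Assumption: 52:54:00 is the prefix conventionally used for QEMU/KVM guest NICs.
QEMU_PREFIX = (0x52, 0x54, 0x00)


def random_guest_mac(rng=None):
    """Return a MAC address with the QEMU/KVM prefix and three random trailing bytes."""
    rng = rng or random.Random()
    tail = [rng.randint(0x00, 0xFF) for _ in range(3)]
    return ":".join(f"{byte:02x}" for byte in (*QEMU_PREFIX, *tail))


def nic_option(mac, model="virtio"):
    """Format a qemu-kvm '-net nic' option that carries an explicit MAC address."""
    return f"-net nic,model={model},macaddr={mac}"


if __name__ == "__main__":
    # Print one option string per guest; a set guards against accidental duplicates.
    seen = set()
    while len(seen) < 3:
        seen.add(random_guest_mac())
    for mac in sorted(seen):
        print(nic_option(mac))
```

Random generation only makes collisions unlikely. Where many guests share a network segment, recording the assigned addresses (libvirt does this in the guest XML) is the more robust choice.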
+-------------------------------------------------------------------------------------------------+
Guest OS        Virt Type  PV Drivers Available      Support Status      Notes

SLES12 SP1      FV         kvm-clock,                Fully Supported
SLES12                     virtio-net,
SLES11 SP4                 virtio-blk,
SLES11 SP3                 virtio-balloon,
SLES11 SP2                 virtio-console,
SLES11 SP1                 virtio-rng

SLES10 SP4      FV         kvm-clock,                Fully Supported
                           virtio-net,
                           virtio-blk,
                           virtio-balloon,
                           virtio-console

SLES9 SP4       FV                                   Fully Supported     32 bit kernel: specify clock=pmtmr on linux boot line
                                                                         64 bit kernel: specify ignore_lost_ticks on linux boot line

SLED12 SP1      FV         kvm-clock,                Technology Preview
SLED12                     virtio-net,
SLED11 SP4                 virtio-blk,
SLED11 SP3                 virtio-balloon,
SLED11 SP2                 virtio-console,
SLED11 SP1                 virtio-rng

RHEL 4.x        FV         Yes, see RedHat           Best Effort         See footnote
RHEL 5.x                   website for details
RHEL 6.x
RHEL 7.x

Win 2008 SP2+   FV         virtio-net,               Fully Supported     Host must have constant_tsc cpu feature
Win 2008 R2+               virtio-blk,               with SVVP
Win 2012                   virtio-balloon            certification
Win 2012 R2                (VMDP drivers preferred,
                           win-virtio-drivers.iso
                           drivers are deprecated)

Win Vista SP2+  FV         virtio-net,               Best Effort
Win 7 SP1+                 virtio-blk,
Win 8                      virtio-balloon
Win 10                     (VMDP drivers preferred,                      Use "-cpu core2duo,hv_relaxed"
                           win-virtio-drivers.iso                        Additional Hyper-V enablements may be needed
                           drivers are deprecated)

Footnote: http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html/Virtualization_Host_Configuration_and_Guest_Installation_Guide/chap-Virtualization_Host_Configuration_and_Guest_Installation_Guide-KVM_guest_timing_management.html

Supported Limits
----------------

The following limits have been tested, and are supported:

Host RAM and CPU            Same with KVM modules loaded as without.
                            Refer to the SLES Release Notes for specifics
Guest RAM size              2 TB
Virtual CPUs per guest      160
NICs per guest              8
Block devices per guest     4 emulated, 20 para-virtual (virtio-blk) or 100 (virtio-scsi)
Maximum number of guests    Limit is defined as the total number of vcpus in all guests
                            being no greater than 8 times the number of cpu cores in the host

Major KVM Features and Compatibilities
--------------------------------------

The following features are supported unless otherwise specified.

- vm-install interoperability
  Define and Install guest via vm-install, which is a libvirt based utility
  This includes specifying RAM, disk type and location, video type, keyboard mapping, passthrough devices, NIC type, binding and mac address, and boot method
  Restrictions: Raw disk, qcow2, or qed format only.
                Realtek, e1000 or virtio NICs only
                Sound cards are not supported

- virt-manager interoperability
  Manage guests via virt-manager, which is a libvirt based utility
  This includes autostart, start, stop, restart, pause, unpause, save, restore, clone, migrate, special key sequence insertion, guest console viewers, performance monitoring, cpu pinning, and static modification of cpu, RAM, boot method, disk, nic, mouse, display, video and host PCI or USB assignments
  Restrictions: No sound devices, qxl, vmvga (vmware) or xen video devices added
                pcnet, ne2k_pci and eepro100 virtual NICs are not supported
                emulated scsi disk is not supported
                Raw, qed and qcow2 are the only supported storage formats
                Spice graphics is not supported

- virsh interoperability
  Manage guests via virsh, which is a libvirt based utility
  virsh is an in-tree libvirt tool that exposes all of the libvirt API. Most virsh subcommands are supported, including creation, modification, and destruction of guests and all lifecycle operations. Any virsh subcommands which translate to unsupported qemu-kvm command-line or monitor syntax would also be unsupported.
  Guest XML descriptions used by virsh can be created manually, using vm-install/virt-manager, or with external tools/scripts

- qemu-kvm command-line and monitor interface
  Manage guests via direct invocation of qemu-kvm. It's generally preferred to manage guests using libvirt, but for greater flexibility, qemu-kvm may be used directly
  Restrictions: See Appendix A for which options are supported

- s390-virtio-ccw
  On the s390 platform, a virtio-ccw transport and associated machine definition is provided. This is the basis for KVM on s390, which is provided as a technology preview

- Live and static migration
  Migration of guests between hosts
  Restrictions: Source and target host machines should have the same cpu features which are exposed to the guest by the cpu model specified.
                Currently the only cpu model supported for migration is -cpu qemu64 (default) with no additional cpu features specified
                Guest storage is accessible from both machines (shared)
                Guest timekeeping is properly controlled
                A compatible guest "definition" is used on the source and target hosts
                No physical devices are passed through to the guest
                The use of the AHCI interface is currently not supported with migration
                Virtfs is incompatible with migration
                The -mem-path command-line option is incompatible with migration
                Both source and target hosts must be at the same support pack level

- Kernel Samepage Merging (KSM)
  The KVM Host includes KSM support, and KVM is optimized to use KSM if available. It allows for automatic sharing of memory pages between guests, freeing some host memory

- Transparent Huge Pages (THP)
  The KVM host now includes THP support, and KVM is optimized to use THP if available, both via madvise and opportunistic methods

- PCI Device Passthrough
  A physical PCI device may be passed through from the host to a guest
  An AMD IOMMU or Intel VT-d is required for this feature. The respective feature needs to be enabled in the BIOS.
  For VT-d, passing the kernel parameter "intel_iommu=on" is mandatory. Additionally, some host hardware will require the use of the KVM kernel module parameter: allow_unsafe_assigned_interrupts=1 (with the attendant security issues).
  Many PCIe cards from major vendors should be supportable. Refer to systems level certifications for specific details, or contact the vendor in question for support statements

- USB Device Passthrough
  A physical USB device may be passed through from the host to a guest. Although this is supported, be aware that there are a number of shoddy devices on the market, and if such a device does not adhere to the USB specification, it may fail to operate correctly in a virtualized environment.

- Memory ballooning
  Dynamically changing the amount of memory allocated to a guest
  Memory ballooning drivers available for Windows and Linux guests help manage memory resources among competing demands

- Hotplug devices
  Dynamically adding or removing emulated or passthrough physical devices in the guest

- KVM Security
  A kvm group is created by the KVM package which permits a non-root user to access the KVM control device file (/dev/kvm). Where possible, guests should not be run as root. Steps have been taken to enable this for libvirt as well.
  A setuid bridge helper has been added so that a bridged network interface can be set up without needing root privileges

- Seccomp2 based sandboxing
  The guest can be run in a sandboxed environment where only predetermined system calls are permitted for added protection against malicious behavior

- Host Suspend/Hibernate
  Suspending or Hibernating the host with guests running is not supported

- Power Management
  Changing power states in the host while guests are running is not supported
  The cpu feature constant_tsc is required

- KVM on NUMA machines
  Using numactl to pin qemu-kvm processes to numa nodes is recommended

- kvm_stat
  kvm_stat is useful for problem resolution or simply monitoring KVM - guest interactions

- Hotplug cpu
  Dynamically changing the number of vcpus assigned to the guest is not supported

- APIC Virtualization
  Hardware APIC Virtualization allows the processor to directly inject interrupts into the guest to achieve better performance

- TCG mode
  Qemu-kvm can be used in a non-KVM mode called TCG (Tiny Code Generator), where the guest cpu instructions are emulated instead of being executed directly by the processor. This mode is enabled with the "-no-kvm" command line option. Although this mode is not supported, it may be useful for fault isolation in some cases. It will likely result in quite a performance loss

- VirtFS (filesystem passthrough)
  Directories in the host filesystem can be shared between the host and guest or guests using virtfs. A virtfs proxy helper is provided to enable virtfs usage when KVM is used as non-root user

- Vhost-net kernel module support
  The vhost-net kernel module allows for a more efficient network transport to the guest. It is automatically used by libvirt if loaded, or when using the qemu-kvm commandline, by adding ",vhost=on" to the networking option

- AHCI guest storage interface
  The AHCI interface for SATA storage has been recently added.
  It permits much higher block i/o performance than the ide interface, and is particularly useful in recent Windows OS versions

- Qcow2 and qed storage formats
  Unlike previous service packs, qcow2 and qed storage formats can now be used with live migration

- Trim and Online Disk Resizing
  Trim and online disk resizing support depends on the storage stack used

- Virtio SCSI
  Virtio SCSI allows for passing through host SCSI block or generic SCSI devices to the guest, and provides additional storage options using a virtio SCSI interface within the guest

- Macvtap / vhost-net zero-copy transmits
  Zero-copy packet transmits from the guest are now possible using vhost-net and macvtap changes that have been added to the SP3 kernel

- Disk caching modes
  The default caching mode for disk images is now writeback due to improvements in image format handling and guest storage device fixes. The virtio-blk backend now automatically chooses writeback or writethrough depending on the guest virtio driver support and settings
  Note that only cache=none should be used when aio=native is also used
  As indicated by the name, cache=unsafe is not guaranteed to protect your data in the event of a crash, but is useful because of its performance characteristics if your disk data is somehow transient in nature or easily re-creatable

- Guest Agent (/usr/bin/qemu-ga)
  The guest agent permits specific actions to be taken within a linux guest, as controlled from the host. This is not yet supported

- KVM module parameters
  Specifying parameters for the KVM kernel modules
  This is not supported unless done under the direction of SUSE support personnel

- Virtio-data-block - data-plane
  An experimental block IO backend is available using the ",x-data-plane=on" parameter to "-device virtio-blk-pci". This interface allows higher IO rates. This is not yet supported, and is provided as a technology preview

- Q35 Machine
  A more modern machine type based on the Intel q35 chipset is available.
  This is not yet supported

- Nested Virtualization
  When the svm or vmx cpu feature is passed through to the guest, nested virtualization is possible. This is not yet supported

- Spice
  Spice interoperability is not enabled in KVM

- Glusterfs
  Glusterfs interoperability is not enabled in KVM

- ISCSI
  ISCSI integration is not enabled in KVM. It is however possible for guests to access iscsi targets available to the host via the blockio interfaces

- RBD (Rados Block Devices)
  RBD integration is now enabled in KVM for the x86_64 architecture only, and is supported there as a technology preview.

Deprecated, Superseded and Dropped Features
-------------------------------------------

The use of ",boot=on" for virtio disks is no longer needed since the bios used supports the virtio block interface directly. In fact, its usage may cause problems, and is now considered deprecated.
The use of "?" as a parameter to "-cpu", "-soundhw", "-device", "-M", "-machine", "-d", and "-clock" is now considered deprecated. Use "help" instead.
The "-tdf" command line option is now considered deprecated.
The unsupported "-no-kvm-pit" command line option is now considered deprecated.
The unsupported "-no-kvm-pit-reinjection" command line option is now considered deprecated.
The "-pcidevice" qemu-kvm command line option is no longer recognized. Use "-device pci-assign" instead.
The unsupported "-device testdev" command line option is no longer recognized. Use "-device pc-testdev" instead.
The unsupported "-kvm-shadow-memory" command line option is no longer recognized. Its function is now accessible via the ",kvm_shadow_mem=" parameter to the "-machine" command line option.
The unsupported "-no-kvm-irqchip" command line option is now considered deprecated. Its function is now accessible via the ",kernel_irqchip=" parameter to the "-machine" command line option.
The unsupported "-osk" qemu-kvm command line option is no longer recognized.
The unsupported "-M mac" qemu-kvm command line option is no longer recognized.
The unsupported "-enable-nesting" command line option is no longer recognized.
The unsupported "-old-param" command line option is no longer recognized.
The unsupported "-semihosting" command line option is no longer recognized.
The unsupported "-nvram" command line option is no longer recognized.
The unsupported "cpu_set" monitor command is no longer recognized.
The deprecated windows drivers (win-virtio-drivers.iso) are no longer provided. The Virtual Machine Driver Pack is the supported way to get virtio drivers for Windows guests.

Performance Limits
------------------

Our goal in providing KVM virtualization to our customers is to allow workloads designed for physical installations to be virtualized and thus inherit the benefits of modern virtualization techniques. Some trade-offs present with virtualizing workloads are a slight to moderate performance impact and the need to stage the workload to verify its behavior in a virtualized environment (esoteric software and rare, but possible corner cases can behave differently.) Although every reasonable effort is made to provide a broad virtualization solution to meet disparate needs, there will be cases where the workload itself is unsuited for KVM virtualization. In these cases creating an L3 incident would be inappropriate.

We therefore propose the following performance expectations for guest performance to be used as a guideline in determining if a reported performance issue should be investigated as a bug (values given are rough approximations at this point - more validation is required):

Category            Fully Virtualized     Paravirtualized   Host-Passthrough

CPU, MMU            7% (QEMU emulation    not applicable    97% (Hardware Virt. + EPT/NPT)
                    (unsupported))                          85% (Hardware Virt.
                                                            + Shadow Pagetables)

Network I/O         60%                   75%               95%
(1Gb LAN)           e1000 emulated NIC    Virtio NIC

Disk I/O            40%                   85%               95%
                    IDE emulation         Virtio block

Graphics            50%                   not applicable
(non-accelerated)   VGA or Cirrus

Time accuracy       95% - 105%            100%              not applicable
(worst case, using                        kvm-clock
recommended settings
and before ntp assist)

For time accuracy, 100% = accurate timekeeping, 150% = time runs fast by 50%, etc.

Percentage values are a comparison of performance achieved with the same workload under non-virtualized conditions. SUSE does not guarantee performance numbers.

Paravirtualization Support
--------------------------

To improve the performance of the guest OS, paravirtualized drivers are provided when available. It is recommended that they be used, but they are not generally required for support.

SUSE has developed virtio based drivers for Windows, which are available in the Virtual Machine Driver Pack (VMDP).

As mentioned previously, if a pv timekeeping option is available (eg: kvm-clock), it should be used.

Appendix A
----------

The qemu-kvm command line is as follows:

  qemu-kvm [options] [disk_image]

Where 'options' are taken from the options listed below, and 'disk_image' is the file system reference to the hard disk image for IDE hard disk 0. This image, as well as those used with -drive, may be in the raw (no format), qcow2 or qed storage formats, and may be located in files within the host filesystem, logical volumes, host physical disks, or network based storage.

Note that as a general rule, as new command line options are added which serve to replace an older option or interface, you are strongly encouraged to adapt your usage to the new option. The new option is being introduced to provide better functionality and usability going forward. In some cases existing problems or even bugs in older interfaces cannot be fixed due to functional expectations, but are resolved in the newer interface or option.
This advice includes moving to the most recent machine type (eg pc-i440fx-1.4 instead of pc-0.15) if possible.

The following qemu-kvm command line options are supported:

  -h -help -version
  -M [help|?|none|pc|pc-0.12|pc-0.14|pc-0.15|pc-i440fx-1.4]
  -machine [help|?|none|pc|pc-0.12|pc-0.14|pc-0.15|pc-i440fx-1.4]
  -cpu ... -smp ... -fda/-fdb ... -hda/-hdb/-hdc/-hdd ... -cdrom ...
  -drive ... (if specified if=[ide|floppy|virtio] and format=[raw|qcow2|qed] and snapshot=off only)
  -global ... -boot ... -m ... -k ... -audio-help -usb
  -usbdevice [disk|host|serial|braille|net|tablet|mouse]
  -device [isa-serial|isa-parallel|isa-fdc|ide-drive|ide-hd|ide-cd|pci-assign|kvm-pci-assign|VGA|cirrus-vga|rtl8139|virtio-net-pci|virtio-blk-pci|virtio-balloon-pci|virtio-9p-pci|usb-hub|usb-ehci|usb-tablet|usb-storage|usb-mouse|usb-kbd|virtserialport|virtconsole|virtio-serial-pci|virtio-serial|sga|i82559er|e1000|virtio-scsi-pci|scsi-cd|scsi-hd|scsi-generic|scsi-disk|scsi-block|pc-sysfw|pci-serial|pci-serial-2x|pci-serial-4x|ich9-ahci|piix-usb-uhci|usb-host|usb-serial|usb-wacom-tablet|usb_braille|usb-net|pci-ohci|piix4-usb-uhci|virtio-rng-pci]
  -fsdev ... -virtfs ... -name ... -uuid ... -display ... -nographic -no-frame -alt-grab -ctrl-grab
  -no-quit -sdl -vga [std|cirrus|none] -full-screen -vnc ... -no-acpi -no-hpet -balloon ... -smbios ...
  -net [nic|user|tap|bridge|none] ... (for model= only rtl8139, e1000 and virtio are supported)
  -netdev ... -chardev ... -kernel ... -append ... -initrd ... -serial ... -parallel ... -monitor ...
  -qmp ... -mon ... -debugcon ... -pidfile ... -S -gdb ... -s -d ... -enable-kvm -no-reboot
  -no-shutdown -loadvm ... -daemonize -clock -rtc ... -watchdog ... -watchdog-action ... -echr ...
  -incoming ... -nodefaults -sandbox ... -runas ... -readconfig ... -writeconfig ... -nodefconfig
  -no-user-config -tdf -mem-path ... -mem-prealloc -object ...

The following qemu-kvm monitor commands are supported:

  help ? info ... savevm ... loadvm ... delvm ... logfile ... logitem ...
  q block_resize ... eject ... drive_del ... change device ... stop [c|cont] gdbserver ... x ... xp ...
  [p|print] ... sendkey ... system_reset system_powerdown device_add ... device_del ... boot_set ...
  cpu ... mce ... mouse_move ... mouse_button ... mouse_set ... memsave ... nmi ... pmemsave ...
  pci_add ... pci_del ... migrate ... migrate_set_speed ... migrate_set_downtime ... drive_add ...
  balloon target ... usb_add ... usb_del ... watchdog_action ... dump_guest_memory ...
  migrate_set_cache_size ... migrate_set_capability ... system_wakeup

Conversely, the following qemu-kvm command line options are not supported:

  -M [q35|pc-q35-1.4|pc-1.3|pc-1.2|pc-1.1|pc-1.0|pc-0.13|pc-0.11|pc-0.10|isapc]
  -machine [q35|pc-q35-1.4|pc-1.3|pc-1.2|pc-1.1|pc-1.0|pc-0.13|pc-0.11|pc-0.10|isapc]
  -numa ... -add-fd ... -set ...
  -drive ,if=[scsi|mtd|pflash], snapshot=on, format=[anything besides raw,qcow2,or qed]
  -mtdblock ... -sd ... -pflash ... -snapshot -soundhw ...
  -device [ipoctal232|sysbus-ohci|i82562|ccid-card-passthru|smbus-eeprom|nec-usb-xhci|hda-duplex|hda-output|cfi.pflash01|ivshmem|usb-bot|lsi53c895a|ich9-usb-uhci2|ich9-usb-uhci6|q35-pcihost|ich9-usb-uhci5|ich9-usb-uhci3|i6300esb|isa-debug-exit|ne2k_pci|vfio-pci|usb-uas|ich9-usb-uhci4|ioh3420|isa-ide|esp|usb-ccid|ich9-usb-ehci2|pcnet|ich9-intel-hda|dc390|ich9-usb-ehci1|sysbus-ahci|hda-micro|pci-bridge|x3130-upstream|isa-cirrus-vga|ich9-usb-uhci1|pc-testdev|ne2k_isa|isa-vga|cs4231a|sysbus-fdc|gus|vmware-svga|i82801b11-bridge|i82557a|i82557c|i82557b|i82801|AC97|am53c974|intel-hda|i82558a|i82558b|usb-audio|i82550|isa-debugcon|ib700|sb16|megasas|i82551|xio3130-downstream|vt82c686b-usb-uhci|tpci200|i82559a|i82559b|i82559c|xlnx,ps7-usb|SUNW,fdtwo|isa-applesmc|exynos4210-ehci-usb|mch|usb-bt-dongle]
  -curses -spice -portrait -rotate -vga [vmware|qxl|xenfb] -g ... -win2k-hack -no-fd-bootchk
  -acpitable ... -net [socket|dump] ... -net dump ... -iscsi ... -bt ... -dtb -singlestep -L ...
  -bios ... -option-rom ... -icount ...
  -virtioconsole ... -show-cursor -tb-size ... -chroot ... -prom-env ... -trace ... -qtest ...
  -qtest-log ... -no-kvm -no-kvm-irqchip -no-kvm-pit -no-kvm-pit-reinjection

The following qemu-kvm monitor commands are not supported:

  block_job_cancel ... block_job_complete ... block_job_pause ... block_job_resume ...
  block_job_set_speed ... drive_mirror ... nbd_server_add ... nbd_server_start ... nbd_server_stop ...
  ringbuf_read ... ringbuf_write ... trace_event ... commit ... screendump ... singlestep ... i ... o ...
  sum ... wavcapture ... stopcapture ... migrate_cancel client_migrate_info ... snapshot_blkdev ...
  pcie_aer_inject_error ... host_net_add ... host_net_remove ... netdev_add netdev_del ...
  hostfwd_add ... hostfwd_remove ... set_link ... acl_show ... acl_policy ... acl_add ... acl_remove ...
  acl_reset ... close_fd ... block_passwd ... set_password ... expire_password ...

In addition to the above listed human monitor commands, a JSON based monitor interface called QMP (Qemu Monitor Protocol) is provided, which allows for a more programmatic and control oriented interaction with the monitor. See /usr/share/doc/packages/kvm/qmp-commands.txt for details on executing QMP commands.
Below is the list of QMP commands:

  quit eject change screendump stop cont system_wakeup system_reset system_powerdown device_add
  device_del send-key cpu memsave pmemsave inject-nmi ringbuf-write ringbuf-read
  xen-save-devices-state xen-set-global-dirty-flag migrate migrate_cancel migrate-set-cache-size
  query-migrate-cache-size migrate_set_speed migrate_set_downtime client_migrate_info netdev_add
  netdev_del block_resize transaction block-snapshot-sync drive-mirror balloon set_link getfd closefd
  add-fd remove-fd query-fdsets block_passwd block_set_io_throttle set_password expire_password
  add_client qmp_capabilities human-monitor-command query-version query-commands query-events
  query-chardev query-block query-blockstats query-cpus query-pci query-kvm query-status query-mice
  query-vnc query-spice query-name query-uuid query-migrate migrate-set-capabilities query-balloon
  chardev-add chardev-remove
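
To make the QMP interface described above more concrete, here is a minimal Python sketch of one exchange: read the server greeting, negotiate with qmp_capabilities, then issue a single command such as query-status. It assumes the guest was started with a QMP socket, for example "-qmp unix:/tmp/qmp.sock,server,nowait"; the path /tmp/qmp.sock and the helper names qmp_command and qmp_run are hypothetical. It also assumes the server sends one JSON message per line. Consult qmp-commands.txt for the authoritative command arguments.

```python
import json
import socket


def qmp_command(name, arguments=None):
    """Encode one QMP command as a newline-terminated JSON message."""
    message = {"execute": name}
    if arguments:
        message["arguments"] = arguments
    return (json.dumps(message) + "\r\n").encode("utf-8")


def qmp_run(sock, name, arguments=None):
    """Negotiate capabilities on a connected socket, then run one command."""
    reader = sock.makefile("r", encoding="utf-8")

    def next_reply():
        # Asynchronous events may arrive at any time; skip them and
        # return the next greeting, "return" or "error" message.
        while True:
            line = reader.readline()
            if not line:
                raise ConnectionError("QMP connection closed")
            message = json.loads(line)
            if "event" not in message:
                return message

    greeting = next_reply()
    if "QMP" not in greeting:
        raise RuntimeError("unexpected QMP greeting: %r" % (greeting,))
    sock.sendall(qmp_command("qmp_capabilities"))
    next_reply()  # reply to the capabilities negotiation
    sock.sendall(qmp_command(name, arguments))
    return next_reply()


if __name__ == "__main__":
    # Hypothetical socket path; adjust to match the -qmp option in use.
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
        client.connect("/tmp/qmp.sock")
        print(qmp_run(client, "query-status"))
```

The sketch opens one connection per command for simplicity. A longer-lived client would keep the socket open after negotiation and handle events instead of discarding them.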