KVM on Arch Linux
GitLab: kyaulabs/autoarch: Arch Linux installation automation. |
Introduction
This is a tutorial for setting up and using KVM on Arch Linux utilizing QEMU as the back-end and libvirt as the front-end. Additional notes have been added for creating system images.
UPDATE (2019): This document was tested and cleaned up using a Dell R620 located in-house at KYAU Labs as the test machine.
Installation
Before getting started it is a good idea to make sure VT-x or AMD-V is enabled in BIOS.
# grep -E --color 'vmx|svm' /proc/cpuinfo
If hardware virtualization is not enabled, reboot the machine and enter the BIOS to enable it. |
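The same check can be wrapped in a small function that names the extension it found. This is only a sketch: virt_extension is a hypothetical helper, not part of any package, and it simply looks for the vmx or svm flag in a cpuinfo-style file.

```shell
# Report which hardware virtualization extension the CPU advertises.
# The optional first argument is a cpuinfo-style file (defaults to /proc/cpuinfo).
virt_extension() {
    local cpuinfo="${1:-/proc/cpuinfo}"
    if grep -qw 'vmx' "$cpuinfo"; then
        echo "VT-x"
    elif grep -qw 'svm' "$cpuinfo"; then
        echo "AMD-V"
    else
        echo "none"
        return 1
    fi
}

virt_extension || echo "Enable VT-x or AMD-V in the BIOS and check again." >&2
```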
Once hardware virtualization has been verified install all the packages required.
# pikaur -S bridge-utils dmidecode libguestfs libvirt \
    openbsd-netcat openssl-1.0 ovmf qemu-headless \
    qemu-headless-arch-extra virt-install
Configuration
After all of the packages have been installed libvirt/QEMU need to be configured.
User/Group Management
Create a user for KVM.
# sudo useradd -g kvm -s /usr/bin/nologin kvm |
Then modify the libvirt QEMU config (/etc/libvirt/qemu.conf) to reflect this.
...
user = "kvm"
group = "kvm"
...
Fix permission on /dev/kvm
# sudo groupmod -g 78 kvm |
# sudo usermod -u 78 kvm |
systemd as of 234 assigns dynamic IDs to groups, but KVM expects 78 |
User Access
If non-root user access to libvirtd is desired, add the libvirt group to polkit access.
/* Allow users in the libvirt group to manage the libvirt daemon without authentication */
polkit.addRule(function(action, subject) {
    if (action.id == "org.libvirt.unix.manage" &&
        subject.isInGroup("libvirt")) {
            return polkit.Result.YES;
    }
});
If HAL was followed to secure the system after installation and you would like to use libvirt as a non-root user, the hidepid security feature from the /proc line in /etc/fstab will need to be removed. This will require a reboot. |
Add the users who need libvirt access to the kvm and libvirt groups.
# sudo gpasswd -a username kvm |
# sudo gpasswd -a username libvirt |
To make life easier, set a couple of shell variables for virsh. Without them, virsh defaults to qemu:///session when running as a non-root user.
# setenv VIRSH_DEFAULT_CONNECT_URI qemu:///system |
# setenv LIBVIRT_DEFAULT_URI qemu:///system |
These can be added to /etc/bash.bashrc, /etc/fish/config.fish or /etc/zsh/zshenv depending on which shell is being used. Note that setenv is csh-style syntax; in bash or zsh use export NAME=value instead.
Hugepages
Enabling hugepages can improve the performance of virtual machines. Before adding the entry to /etc/fstab, check the group ID of the kvm group (it should be 78).
# grep kvm /etc/group |
hugetlbfs /dev/hugepages hugetlbfs mode=1770,gid=78 0 0 |
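To avoid hard-coding the GID, the fstab entry can be generated from the group file. A minimal sketch, assuming the group is named kvm as above; hugetlbfs_fstab_line is a hypothetical helper and only prints the line, it does not edit /etc/fstab.

```shell
# Print the hugetlbfs fstab entry using the kvm group's actual GID.
# The optional first argument is a group file (defaults to /etc/group).
hugetlbfs_fstab_line() {
    local group_file="${1:-/etc/group}"
    local gid
    gid="$(awk -F: '$1 == "kvm" { print $3; exit }' "$group_file")"
    if [ -z "$gid" ]; then
        echo "kvm group not found in $group_file" >&2
        return 1
    fi
    echo "hugetlbfs /dev/hugepages hugetlbfs mode=1770,gid=${gid} 0 0"
}

hugetlbfs_fstab_line || true
```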
Rather than rebooting, remount the filesystem.
# sudo umount /dev/hugepages
# sudo mount /dev/hugepages
This can then be verified.
# sudo mount | grep huge
# ls -FalG /dev/ | grep huge
Now set the number of hugepages to use. This takes a bit of math: hugepages are 2 MiB each by default on x86_64, so take the amount of RAM to dedicate to VMs in megabytes and divide it by two.
On my setup I will dedicate 40GB out of the 48GB of system RAM to VMs. This means (40 * 1024) / 2, or 20480 hugepages.
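The arithmetic can be scripted so it is easy to redo for a different amount of RAM. A small sketch; hugepages_for_gib is a hypothetical helper, and the 2 MiB default page size is an assumption that can be confirmed with grep Hugepagesize /proc/meminfo.

```shell
# Number of hugepages needed to back a given amount of VM memory (in GiB).
# The optional second argument is the hugepage size in MiB (2 on a default x86_64 setup).
hugepages_for_gib() {
    local gib="$1"
    local page_mib="${2:-2}"
    echo $(( gib * 1024 / page_mib ))
}

hugepages_for_gib 40    # prints 20480
```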
Set the number of hugepages.
# echo 20480 | sudo tee /proc/sys/vm/nr_hugepages |
Also set this permanently by adding a file to /etc/sysctl.d.
vm.nr_hugepages = 20480 |
Again verify the changes.
# grep HugePages_Total /proc/meminfo |
Edit the libvirt QEMU config and turn hugepages on.
... hugetlbfs_mount = "/dev/hugepages" ... |
Kernel Modules
A few additional kernel modules will help to assist KVM.
Nested virtualization can be enabled by loading the kvm_intel module with the nested=1 option. To mount directories directly from the host inside of a VM, the 9pnet_virtio module will need to be loaded. Additionally virtio-net and virtio-pci are loaded to add para-virtualized devices.
# sudo modprobe -r kvm_intel |
# sudo modprobe kvm_intel nested=1 |
# sudo modprobe 9pnet_virtio virtio_net virtio_pci |
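Also load the modules on boot. The options line belongs in a file under /etc/modprobe.d/, while the module names go one per line in a file under /etc/modules-load.d/.
options kvm_intel nested=1
9pnet_virtio
virtio_net
virtio_pci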
If 9pnet is going to be used, change the global QEMU config to turn off dynamic file ownership.
... dynamic_ownership = 0 ... |
Nested virtualization can be verified; the parameter reads Y (or 1) when it is enabled.
# cat /sys/module/kvm_intel/parameters/nested
If the machine has an AMD processor use kvm_amd instead for nested virtualization. |
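A check that covers both Intel and AMD hosts can read the module parameter from sysfs. This is a sketch only; nested_enabled is a hypothetical helper, and it treats either Y or 1 as enabled.

```shell
# Succeed when any of the given kvm module parameter files reports
# nested virtualization as enabled (Y or 1).
nested_enabled() {
    local param
    for param in "$@"; do
        [ -r "$param" ] || continue
        case "$(cat "$param")" in
            Y|1) return 0 ;;
        esac
    done
    return 1
}

if nested_enabled /sys/module/kvm_intel/parameters/nested \
                  /sys/module/kvm_amd/parameters/nested; then
    echo "nested virtualization: enabled"
else
    echo "nested virtualization: disabled"
fi
```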
UEFI & PCI-E Passthrough
The Open Virtual Machine Firmware (OVMF) is a project that enables UEFI support for virtual machines, and enabling the IOMMU allows PCI pass-through, among other things. Together these significantly extend the choice of guest operating systems and open up some other options.
GRUB
Enable IOMMU on boot by adding an option to the kernel line in GRUB.
... GRUB_CMDLINE_LINUX_DEFAULT="intel_iommu=on" ... |
Re-generate the GRUB config.
# sudo grub-mkconfig -o /boot/grub/grub.cfg |
rEFInd
Enable IOMMU on boot by adding an option to the kernel options line in the rEFInd configuration.
... options "root=/dev/mapper/skye-root rw add_efi_memmap nomodeset intel_iommu=on zswap.enabled=1 zswap.compressor=lz4 \ zswap.max_pool_percent=20 zswap.zpool=z3fold initrd=\intel-ucode.img" ... |
Reboot the machine and then verify IOMMU is enabled.
# sudo dmesg | grep -e DMAR -e IOMMU |
If it was enabled properly, there should be a line similar to [ 0.000000] DMAR: IOMMU enabled.
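For PCI pass-through it also helps to see how devices are grouped, since devices in the same IOMMU group have to be passed through together. The following is a rough sketch that walks /sys/kernel/iommu_groups; list_iommu_groups is a hypothetical helper and prints PCI addresses only.

```shell
# List every IOMMU group and the PCI addresses of the devices in it.
# The optional first argument overrides the sysfs directory.
list_iommu_groups() {
    local root="${1:-/sys/kernel/iommu_groups}"
    local group dev
    for group in "$root"/*; do
        [ -d "$group/devices" ] || continue
        for dev in "$group"/devices/*; do
            [ -e "$dev" ] || continue
            echo "group ${group##*/}: ${dev##*/}"
        done
    done
}

list_iommu_groups
```

If nothing is printed, the IOMMU is most likely not enabled. Each address can be fed to lspci -s to see which device it is.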
OVMF
Add the OVMF firmware to the libvirt QEMU config.
... nvram = [ "/usr/share/ovmf/x64/OVMF_CODE.fd:/usr/share/ovmf/x64/OVMF_VARS.fd" ] ... |
SPICE TLS
Enabling SPICE over TLS allows SPICE to be exposed externally.
Edit the libvirt QEMU config to enable SPICE over TLS.
...
spice_listen = "0.0.0.0"
spice_tls = 1
spice_tls_x509_cert_dir = "/etc/pki/libvirt-spice"
...
Then use the following script to generate the required certificates.
#!/bin/bash

SERVER_KEY=server-key.pem

# creating a key for our ca
if [ ! -e ca-key.pem ]; then
    openssl genrsa -des3 -out ca-key.pem 2048
fi
# creating a ca
if [ ! -e ca-cert.pem ]; then
    openssl req -new -x509 -days 1095 -key ca-key.pem -out ca-cert.pem -utf8 -subj "/C=WA/L=Seattle/O=KYAU Labs/CN=KVM"
fi
# create server key
if [ ! -e "$SERVER_KEY" ]; then
    openssl genrsa -out "$SERVER_KEY" 2048
fi
# create a certificate signing request (csr)
if [ ! -e server-key.csr ]; then
    openssl req -new -key "$SERVER_KEY" -out server-key.csr -utf8 -subj "/C=WA/L=Seattle/O=KYAU Labs/CN=myhostname.example.com"
fi
# signing our server certificate with this ca
if [ ! -e server-cert.pem ]; then
    openssl x509 -req -days 1095 -in server-key.csr -CA ca-cert.pem -CAkey ca-key.pem -set_serial 01 -out server-cert.pem
fi
# now create a key that doesn't require a passphrase
openssl rsa -in "$SERVER_KEY" -out "$SERVER_KEY.insecure"
mv "$SERVER_KEY" "$SERVER_KEY.secure"
mv "$SERVER_KEY.insecure" "$SERVER_KEY"
# show the results (no other effect)
openssl rsa -noout -text -in "$SERVER_KEY"
openssl rsa -noout -text -in ca-key.pem
openssl req -noout -text -in server-key.csr
openssl x509 -noout -text -in server-cert.pem
openssl x509 -noout -text -in ca-cert.pem
If setting up multiple KVM host machines, use the same CA files when generating the other machine certificates. |
Create the directory for the certificates.
# sudo mkdir -p /etc/pki/libvirt-spice |
Change permissions on the directory.
# sudo chmod -R a+rx /etc/pki |
Move the generated files to the new directory.
# sudo mv ca-* server-* /etc/pki/libvirt-spice |
Correct permissions on the files.
# sudo chmod 660 /etc/pki/libvirt-spice/* |
# sudo chown kvm:kvm /etc/pki/libvirt-spice/* |
Services
Once the bridge is up and running (see the Networking section), enable and start the libvirtd service.
# sudo systemctl enable libvirtd |
# sudo systemctl start libvirtd |
Verify that libvirt is running.
# virsh --connect qemu:///system |
Welcome to virsh, the virtualization interactive terminal.

Type:  'help' for help with commands
       'quit' to quit

virsh #
If you end up at the virsh prompt, simply type quit to exit back to the shell.
Networking
The server being used for testing has quad gigabit network cards in it. For this type of setup one NIC will be used for management of the host OS, while the other three will be bonded together using 802.3ad (combines all NICs for optimal throughput).
NIC Bonding
Pull up a list of all network cards in the machine.
# ip -c=auto l |
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether d4:be:d9:b2:95:43 brd ff:ff:ff:ff:ff:ff
3: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether d4:be:d9:b2:95:45 brd ff:ff:ff:ff:ff:ff
4: eth2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether d4:be:d9:b2:95:47 brd ff:ff:ff:ff:ff:ff
5: eth3: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether d4:be:d9:b2:95:49 brd ff:ff:ff:ff:ff:ff
Create a management .network file, replace M.M.M.M with the management IP address and G.G.G.G with the gateway IP.
[Match]
Name=eth0

[Network]
DHCP=no
NTP=pool.ntp.org
DNS=1.1.1.1
LinkLocalAddressing=no

[Address]
Address=M.M.M.M/24
Label=management

[Route]
Gateway=G.G.G.G
WARNING: If IPv6 is being used, remove the LinkLocalAddressing=no line from the file; when the line is absent the setting defaults to ipv6.
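Filling in the placeholders by hand is error-prone when setting up several hosts, so the file can be rendered from a template instead. A minimal sketch: render_management_network is a hypothetical helper, the 192.0.2.x addresses are documentation examples, and the output still has to be saved to a .network file under /etc/systemd/network/.

```shell
# Render the management .network file for a given address and gateway.
# Usage: render_management_network ADDRESS/PREFIX GATEWAY [INTERFACE]
render_management_network() {
    local address="$1" gateway="$2" iface="${3:-eth0}"
    cat <<EOF
[Match]
Name=${iface}

[Network]
DHCP=no
NTP=pool.ntp.org
DNS=1.1.1.1
LinkLocalAddressing=no

[Address]
Address=${address}
Label=management

[Route]
Gateway=${gateway}
EOF
}

# 192.0.2.0/24 is a documentation range; substitute real values.
render_management_network 192.0.2.10/24 192.0.2.1
```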
Create the bond interface with systemd-networkd.
[NetDev]
Name=bond0
Description=KVM vSwitch
Kind=bond

[Bond]
Mode=802.3ad
TransmitHashPolicy=layer3+4
MIIMonitorSec=1s
LACPTransmitRate=fast
Use the last three network cards to create a bond0.network file.
[Match]
Name=eth1
Name=eth2
Name=eth3

[Network]
Bond=bond0
Finally create the .network file attaching it to the bridge that is created in the next section.
[Match]
Name=bond0

[Network]
Bridge=kvm0
Network Bridge
Setting up a network bridge for KVM is simple with systemd. Create the bridge interface with systemd-networkd.
[NetDev]
Name=kvm0
Kind=bridge
Create the .network file for the bridge, replace X.X.X.X with the IP address desired for the KVM vSwitch, G.G.G.G with the gateway IP and modify the DNS if Cloudflare is not desired.
[Match]
Name=kvm0

[Network]
DHCP=no
NTP=pool.ntp.org
DNS=1.1.1.1
IPForward=yes
LinkLocalAddressing=no

[Address]
Address=X.X.X.X/24
Label=vswitch

[Route]
Gateway=G.G.G.G
And finally restart networkd.
# sudo systemctl restart systemd-networkd |
The bridge should now be up and running, this should be verified.
# ip -c=auto a |
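Besides the ip output, the kernel exposes the bond state in /proc/net/bonding/bond0, which lists each enslaved interface. A small sketch for pulling out the member NICs; bond_slaves is a hypothetical helper and assumes the bond is named bond0 as above.

```shell
# Print the interfaces enslaved to a bond, read from the kernel's
# bonding status file (defaults to /proc/net/bonding/bond0).
bond_slaves() {
    local status="${1:-/proc/net/bonding/bond0}"
    sed -n 's/^Slave Interface: //p' "$status"
}

if [ -r /proc/net/bonding/bond0 ]; then
    bond_slaves
fi
```

With the setup above this should list eth1, eth2 and eth3.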
Before adding the bridge to libvirt, check the current networking settings.
# virsh net-list --all |
 Name      State      Autostart   Persistent
----------------------------------------------
 default   inactive   no          yes
Create a libvirt configuration for the bridge, saved here as /etc/libvirt/bridge.xml.
<network>
  <name>kvm0</name>
  <forward mode="bridge"/>
  <bridge name="kvm0"/>
</network>
Enable the bridge in libvirt.
# virsh net-define --file /etc/libvirt/bridge.xml |
Set the bridge to auto-start.
# virsh net-autostart kvm0 |
Start the bridge.
# virsh net-start kvm0 |
With the bridge now online, the default NAT network can be removed if it will not be used.
# virsh net-destroy default |
# virsh net-undefine default |
This can then be verified.
# virsh net-list --all |
 Name   State    Autostart   Persistent
----------------------------------------------
 kvm0   active   yes         yes
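For scripted setups the same verification can be done without reading the table by eye. This is a sketch that parses the net-list output from stdin; network_is_active is a hypothetical helper and relies on the column layout shown above.

```shell
# Succeed if the named network is listed as active in
# `virsh net-list --all` output supplied on stdin.
network_is_active() {
    awk -v name="$1" '$1 == name && $2 == "active" { found = 1 } END { exit !found }'
}

if command -v virsh >/dev/null 2>&1; then
    if virsh net-list --all | network_is_active kvm0; then
        echo "kvm0 is active"
    else
        echo "kvm0 is not active"
    fi
fi
```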
Firewall
libvirt cannot interface directly with nftables, only with iptables, so firewalld can be used as a gateway between the two. Before firewalld can be started, nftables has to be disabled if it is currently in use.
# sudo systemctl disable nftables |
# sudo systemctl stop nftables |
Install firewalld and dnsmasq.
# pikaur -S dnsmasq firewalld |
Start and enable the service.
# sudo systemctl enable firewalld |
# sudo systemctl start firewalld |
Verify the firewall started properly, it should return running.
# sudo firewall-cmd --state |
Add both interfaces to the public zone.
# sudo firewall-cmd --permanent --zone=public --add-interface=eth0
# sudo firewall-cmd --permanent --zone=public --add-interface=kvm0
Reboot the machine to verify the changes stick upon reboot.
# sudo systemctl reboot |
The SSH service is added by default to the firewall, allowing one to log back in after reboot. |
Look up the default zone config to verify the interfaces were added.
# sudo firewall-cmd --list-all |
public
  target: default
  icmp-block-inversion: no
  interfaces: eth0 kvm0
  sources:
  services: dhcpv6-client ssh
  ports:
  protocols:
  masquerade: no
  forward-ports:
  source-ports:
  icmp-blocks:
  rich rules:
If any other services need to be added, do so now (a couple of examples are listed below; the port range is meant for SPICE/VNC consoles, which use TCP).
# sudo firewall-cmd --zone=public --permanent --add-service=https
# sudo firewall-cmd --zone=public --permanent --add-port=5900-5950/tcp
With the firewall now set up, libvirtd should start fully without any warnings about the firewall.
# sudo systemctl status libvirtd |
● libvirtd.service - Virtualization daemon
   Loaded: loaded (/usr/lib/systemd/system/libvirtd.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2019-02-26 14:27:12 PST; 16min ago
     Docs: man:libvirtd(8)
           https://libvirt.org
 Main PID: 693 (libvirtd)
    Tasks: 17 (limit: 32768)
   Memory: 61.3M
   CGroup: /system.slice/libvirtd.service
           └─693 /usr/bin/libvirtd

Feb 26 14:27:12 skye.wa.kyaulabs.com systemd[1]: Started Virtualization daemon.
Storage Pools
A storage pool in libvirt is merely a storage location designated to store virtual machine images or virtual disks. The most common storage types include netfs, disk, dir, fs, iscsi, logical, and gluster.
List all of the currently available storage pools.
# virsh pool-list --all |
 Name   State   Autostart
---------------------------
ISO Images
Begin by adding a pool for iso-images. If you wish to use something other than an existing mount-point you will have to change the type, options include the ones listed above (more can be found in the man page). A couple of examples are as follows:
<pool type='dir'>
  <name>iso-images</name>
  <target>
    <path>/pool/iso</path>
    <permissions>
      <mode>0770</mode>
      <owner>78</owner>
      <group>78</group>
    </permissions>
  </target>
</pool>

<pool type='fs'>
  <name>iso-images</name>
  <source>
    <device path="/dev/vgroup/lvol" />
  </source>
  <target>
    <path>/pool/iso</path>
    <permissions>
      <mode>0770</mode>
      <owner>78</owner>
      <group>78</group>
    </permissions>
  </target>
</pool>

<pool type='netfs'>
  <name>iso-images</name>
  <source>
    <host name="nfs.example.com" />
    <dir path="/nfs-path-to/images" />
    <format type='nfs'/>
  </source>
  <target>
    <path>/pool/iso</path>
    <permissions>
      <mode>0770</mode>
      <owner>78</owner>
      <group>78</group>
    </permissions>
  </target>
</pool>
After creating the pool XML file, define the pool in libvirt.
# virsh pool-define iso-images.vol |
Before you begin using the pool it must also be built, it is also a good idea to set it to auto-start.
# virsh pool-build iso-images |
# virsh pool-start iso-images |
# virsh pool-autostart iso-images |
The iso-images pool should now be properly set up, feel free to import some images into the directory.
# sudo cp archlinux-2019.02.01-x86_64.iso /pool/iso |
Permissions and ownership will need to be set correctly.
# sudo chown kvm:kvm /pool/iso/archlinux-2019.02.01-x86_64.iso |
# sudo chmod 660 /pool/iso/archlinux-2019.02.01-x86_64.iso |
After copying over images and correcting permissions refresh the pool.
# virsh pool-refresh iso-images |
The image should now show up in the list of volumes for that pool.
# virsh vol-list iso-images --details |
 Name                              Path                                        Type   Capacity     Allocation
---------------------------------------------------------------------------------------------------------------
 archlinux-2019.02.01-x86_64.iso   /pool/iso/archlinux-2019.02.01-x86_64.iso   file   600.00 MiB   600.00 MiB
Check on the status of all pools that have been added.
# virsh pool-list --all --details |
 Name         State     Autostart   Persistent   Capacity     Allocation   Available
---------------------------------------------------------------------------------------
 iso-images   running   yes         yes          108.75 GiB   4.77 GiB     103.97 GiB
LVM
If LVM is going to be used for the VM storage pool, that can be set up now.
If the volume group has already been created manually, omit the <source> section from the XML and skip the build step, as that step is only used to create the LVM volume group.
Begin by creating a storage pool file.
<pool type='logical'>
  <name>vdisk</name>
  <source>
    <device path="/dev/sdX3" />
    <device path="/dev/sdX4" />
  </source>
  <target>
    <path>/dev/vdisk</path>
    <permissions>
      <mode>0770</mode>
      <owner>78</owner>
      <group>78</group>
    </permissions>
  </target>
</pool>
After creating the pool XML file, define the pool in libvirt, build it and set it to auto-start.
# virsh pool-define vdisk.vol |
# virsh pool-build vdisk |
# virsh pool-start vdisk |
# virsh pool-autostart vdisk |
Grant ownership of the LVM volumes to the kvm user with a udev rule in order to properly mount them using libvirt.
ENV{DM_VG_NAME}=="vdisk", ENV{DM_LV_NAME}=="*", OWNER="kvm"
Continuing as is will allow libvirtd to automatically manage the LVM volume on its own.
LVM Thin Volumes
Before going down this road, there are a couple of things to consider.
WARNING: If thin provisioning is enabled, LVM automation via libvirtd will be broken. |
In a standard LVM logical volume, all of the blocks are allocated when the volume is created, but blocks in a thin provisioned LV are allocated as they are written. Because of this, a thin provisioned logical volume is given a virtual size, which can be much larger than the physically available storage.
WARNING: Over-provisioning is NEVER recommended, whether it is CPU, RAM or HDD space. |
With the warnings out of the way, if thin provisioning is desired begin by creating a thin pool.
# sudo lvcreate -l +100%FREE -T vdisk/thin |
A volume group named vdisk was prepared in the previous steps via virsh pool-build. If that step was skipped, either go back and redo it or prepare the volume group yourself. Doing it this way has the added benefit of breaking only most of the LVM functionality of libvirt.
VM Creation
With libvirt setup completed, time to create the first VM. Before we can begin with the VM installation a logical volume needs to be created for the VM.
If you chose the default setup using regular LVM, feel free to use virsh.
# virsh vol-create-as vdisk vmname 32GiB |
If you went the route of thin volumes, create the logical volume manually.
# sudo lvcreate -V 32G -T vdisk/thin -n vmname |
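Since thin volumes only have a virtual size, it is worth checking the running total against the pool before handing out more. A rough sketch under stated assumptions: is_overprovisioned is a hypothetical helper, and the sizes are whole GiB values entered by hand (the real numbers can be read from sudo lvs vdisk).

```shell
# Succeed when the summed virtual sizes exceed the thin pool size.
# Usage: is_overprovisioned POOL_GIB VOLUME_GIB...
is_overprovisioned() {
    local pool="$1" total=0 size
    shift
    for size in "$@"; do
        total=$(( total + size ))
    done
    echo "allocated ${total} GiB of ${pool} GiB"
    [ "$total" -gt "$pool" ]
}

# Four 32 GiB guests on a 100 GiB thin pool.
if is_overprovisioned 100 32 32 32 32; then
    echo "warning: thin pool is over-provisioned"
fi
```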
Take the time now to install virt-manager on a client machine running X11. Connection to the KVM machine over SSH using virt-manager should now be possible.
Adding a hosts entry is only required for console access over SSH when both machines are on the same local network. |
Add a hosts entry for the KVM machine on the client machine if required.
X.X.X.X vmhost |
Start the VM installation.
# virt-install \
    --virt-type=kvm --hvm \
    --name vmname \
    --cpu host-model-only --vcpus=2 --memory 2048,hugepages \
    --network=bridge=kvm0,model=virtio \
    --graphics spice,port=4901,tlsport=5901,listen=0.0.0.0,password=moo \
    --cdrom=/pool/iso/archlinux-2019.02.01-x86_64.iso \
    --disk path=/dev/vdisk/vmname,bus=virtio \
    --console pty,target_type=serial --wait -1 --boot uefi
If this fails with a Permission denied error having to do with an nvram file, change the permissions accordingly and then re-run virt-install.
# sudo chown kvm:kvm /var/lib/libvirt/qemu/nvram/vmname_VARS.fd |
# sudo chmod 660 /var/lib/libvirt/qemu/nvram/vmname_VARS.fd |
Normally one would use --location instead of --cdrom so that --extra-args could be used to enable the console. However, Arch Linux being a hybrid iso cannot do this. |
If all went well it should print out the following:
Starting install...
Domain installation still in progress. Waiting for installation to complete.
At this point return to virt-manager on the client machine and connect to the remote libvirt instance. Then select the new virtual machine and choose Open. The remote virtual machine installation should now be on screen.
If you were lucky enough to catch the installation at the boot menu, press the <TAB> key to add the console to the kernel line before booting.
...archiso.img console=ttyS0 |
Follow through with the installation via the console.
After installation make sure to re-add the kernel parameter console=ttyS0, then reboot the VM.
If the console resolution is too large, it can be shrunk with the kernel parameters nomodeset vga=276 to set it to 800x600.
Once the console kernel parameter(s) have been added, verify this is working.
# virsh console vmname |
Once connected press <ENTER> to get to the login prompt.
Additional Notes
These notes are here from my own install.
Import from qcow2 backup.
# sudo qemu-img convert -f qcow2 -O raw backup.qcow2 /dev/vdisk/vmname |
How to re-sparsify a thin volume if restored from backup qcow2.
# sudo virt-sparsify --in-place /dev/vdisk/vmname |
Import machine that was previously exported.
# virsh define --file newxml-bind.xml |
Make a VM autostart on boot.
# virsh autostart vmname |