KVM on Arch Linux

UNDER CONSTRUCTION: The document is currently being modified!

Introduction

This is a tutorial on how to automate the setup of VMs using KVM on Arch Linux. This tutorial utilizes QEMU as a back-end for KVM using libvirt. System base images will be generated using Packer. And finally, Vagrant and vagrant-libvirt will be utilized for KVM test environments.

Note: This tutorial is meant as a supplement to the OVH installation tutorials.

For demonstration in this tutorial I am following this use case:

The environment consists of a database server, a DNS server, a web server and one or more test servers (which may or may not be clones of the three main servers). Additional servers should be available on demand for any use case. All machine images should be built in-house so that image security can be maintained.

Installation

Before getting started there are a few packages that will be needed to set all of this up.

# pacaur -S bridge-utils dnsmasq ebtables libguestfs libvirt openbsd-netcat openssl-1.0 \
ovmf packer-io qemu-headless qemu-headless-arch-extra vagrant

vagrant-libvirt

The libvirt plugin installation for vagrant requires some cleanup first.

# sudo mv /opt/vagrant/embedded/lib/libcurl.so{,.backup}
# sudo mv /opt/vagrant/embedded/lib/libcurl.so.4{,.backup}
# sudo mv /opt/vagrant/embedded/lib/libcurl.so.4.4.0{,.backup}
# sudo mv /opt/vagrant/embedded/lib/pkgconfig/libcurl.pc{,.backup}

Then build the plugin.

# vagrant plugin install vagrant-libvirt

Hugepages

Enabling hugepages can improve the performance of virtual machines. First check the group ID of the kvm group, then add an entry to the fstab that uses this ID for the gid option.

# grep kvm /etc/group
# sudoedit /etc/fstab


filename: /etc/fstab
hugetlbfs /dev/hugepages hugetlbfs mode=1770,gid=999 0 0
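The gid value in the fstab entry has to match the kvm group on the host. The following is a minimal sketch of extracting the ID from an /etc/group style line and building the entry from it. The helper names gid_from_group_line and hugetlbfs_fstab_line and the sample line are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the numeric group ID (third colon-separated
# field) from a single /etc/group style line.
gid_from_group_line() {
    printf '%s\n' "$1" | cut -d: -f3
}

# Hypothetical helper: print an fstab entry for hugetlbfs using the given gid.
hugetlbfs_fstab_line() {
    printf 'hugetlbfs /dev/hugepages hugetlbfs mode=1770,gid=%s 0 0\n' "$1"
}

# Sample line for illustration; on a real host use: getent group kvm
sample='kvm:x:999:'
hugetlbfs_fstab_line "$(gid_from_group_line "$sample")"
```

The printed line can be compared against the entry in /etc/fstab before remounting.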

Instead of rebooting, simply remount the filesystem.

# sudo umount /dev/hugepages
# sudo mount /dev/hugepages

This can then be verified.

# sudo mount | grep huge
# ls -FalG /dev/ | grep huge

Now set the number of hugepages to use. This takes a bit of math: with the default hugepage size of 2MB on x86_64, take the amount of system RAM to dedicate to VMs in megabytes and divide it by two.

Note: On my setup I will dedicate 12GB out of the 16GB of system RAM to VMs. This means (12 * 1024) / 2, or 6144 hugepages.
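The same arithmetic can be sketched as a small shell function. The name hugepages_for_gb is hypothetical, and the 2MB default is an assumption that should be checked against the Hugepagesize line in /proc/meminfo on the host.

```shell
#!/usr/bin/env bash
# Hypothetical helper: number of hugepages needed to back a given amount of
# RAM. $1 = RAM in GB, $2 = hugepage size in MB (assumed to be 2 if omitted).
hugepages_for_gb() {
    local ram_gb=$1
    local page_mb=${2:-2}
    echo $(( ram_gb * 1024 / page_mb ))
}

# 12GB of RAM with 2MB pages, as in the note above.
hugepages_for_gb 12
```

For 12GB this prints 6144, the value used in the following steps.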

Set the number of hugepages.

# echo 6144 | sudo tee /proc/sys/vm/nr_hugepages

Also set this permanently by adding a file to /etc/sysctl.d.

filename: /etc/sysctl.d/40-hugepages.conf
vm.nr_hugepages = 6144

Again verify the changes.

# grep HugePages_Total /proc/meminfo

KVM Group

Create a user for KVM.

# sudo useradd -g kvm -s /usr/bin/nologin kvm

Then modify the libvirt QEMU config to reflect this.

filename: /etc/libvirt/qemu.conf
user = "kvm"
group = "kvm"

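Fix the permissions on /dev/kvm by setting the kvm group ID to 78.

# sudo groupmod -g 78 kvm
Note: systemd as of version 234 assigns dynamic IDs to groups, but KVM expects 78. If the group ID changes here, update the gid option of the hugetlbfs entry in /etc/fstab to match.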

Add the current user to the kvm group.

# sudo gpasswd -a kyau kvm

OVMF & IOMMU

The Open Virtual Machine Firmware (OVMF) is a project to enable UEFI support for virtual machines, while enabling IOMMU allows PCI pass-through, among other things. This significantly extends the choice of guest operating systems and also provides some other options.

Enable IOMMU on boot by adding an option to the kernel line in GRUB.

filename: /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="intel_iommu=on"

Re-generate the GRUB config.

# sudo grub-mkconfig -o /boot/grub/grub.cfg

Reboot the machine and then verify IOMMU is enabled.

# sudo dmesg | grep -e DMAR -e IOMMU

If it was enabled properly, there should be a line similar to [ 0.000000] DMAR: IOMMU enabled.
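As a rough sketch, the check can be scripted by looking for that message in the kernel log. The function name iommu_enabled is made up for this example and the sample log line mirrors the one quoted above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read kernel log text on stdin and succeed (exit 0)
# if it contains the "IOMMU enabled" message.
iommu_enabled() {
    grep -q 'IOMMU enabled'
}

# Sample log line for illustration; on a real host use:
#   sudo dmesg | iommu_enabled && echo "IOMMU is on"
if printf '%s\n' '[    0.000000] DMAR: IOMMU enabled' | iommu_enabled; then
    echo "IOMMU is on"
else
    echo "IOMMU not detected"
fi
```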

Add the OVMF firmware to libvirt.

filename: /etc/libvirt/qemu.conf
nvram = [
"/usr/share/ovmf/ovmf_code_x64.bin:/usr/share/ovmf/ovmf_vars_x64.bin"
]

LVM

During the installation of the KVM host machine a data volume group was created for VMs. Before carving out disk space for virtual machines, create the volume(s) that will exist outside of the virtual machines. These will be used for databases, web root directories and any other data that needs to persist between VM creation and destruction.

# sudo lvcreate -L 256G data --name http
Note: I am only using a single LVM volume and then creating directories inside of it for each machine.

Create a directory for the volume.

# sudo mkdir /http

Format the new volume with ext4.

# sudo mkfs.ext4 -O metadata_csum,64bit /dev/data/http
# sudo mount /dev/data/http /http

Set proper permissions and change the http user's home directory.

# sudo chown http:http /http
# sudo usermod -m -d /http http

Add the volume to fstab so that it mounts upon boot.

filename: /etc/fstab
/dev/mapper/data-http /http ext4 rw,relatime,stripe=256,data=ordered,journal_checksum 0 0

Volumes will now need to be created for each virtual machine. An LVM thin pool can be utilized for this.

LVM Thin Provisioning

Thin provisioning creates another virtual layer on top of your volume group, in which logical thin volumes can be created. Thin volumes, unlike normal thick volumes, do not reserve the disk space for the volume on creation but instead do so upon write; to the operating system they are still reported as full size volumes. This means that when utilizing LVM directly for KVM it will perform similarly to a "dynamic disk" meaning it will only use what disk space it needs regardless of how big the virtual hard drive actually is. This can also be paired with LVM cloning (snapshots) to create some interesting setups, like running 1TB of VMs on a 128GB disk for example.

WARNING: The one disadvantage is that, without proper disk monitoring and management, this can lead to over-provisioning (if the thin pool overflows, volumes will be dropped).
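One way to keep an eye on this is a small check that flags volumes whose Data% crosses a threshold. This is only a sketch: the function name thin_usage_warnings, the 80 percent limit and the sample figures are assumptions, and the input is expected in the form printed by lvs with the lv_name and data_percent columns.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read "name data_percent" pairs on stdin and print a
# warning for every volume at or above the threshold given as $1.
# Lines without a Data% value (thick volumes) are skipped.
thin_usage_warnings() {
    awk -v limit="$1" 'NF >= 2 && $2 + 0 >= limit + 0 { printf "WARNING: %s is %s%% full\n", $1, $2 }'
}

# Sample input for illustration; on a real host use:
#   sudo lvs --noheadings -o lv_name,data_percent data | thin_usage_warnings 80
printf '%s\n' 'http' 'qemu 85.20' 'dns 7.17' | thin_usage_warnings 80
```

Run from a timer or cron job, a check like this could give notice before the pool actually fills up.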

Use the rest of the data volume group for the thin pool.

# sudo lvcreate -l +100%FREE data --thinpool qemu

Running lvdisplay verifies that a thin pool was created.

# sudo lvdisplay data/qemu


  LV Size                <1.50 TiB
  Allocated pool data    0.00%

Finally lvs should show the volume with the t and tz attributes as well as a data percentage.

# sudo lvs


  LV   VG      Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  http data    -wi-ao---- 256.00g
  qemu data    twi-a-tz--  <1.50t             0.00   0.43
  root neutron -wi-ao----  63.93g

Adding volumes to the thin pool is very similar to adding normal volumes. Add one for the first VM.

# sudo lvcreate -V 20G --thin -n dns data/qemu

These volumes can be shrunk or extended at any point. For example, to grow the volume by 15GB later on (not needed right now):

# sudo lvextend -L +15G data/dns

Or even removed entirely once they are no longer needed (do not run this now, as the dns volume is used below).

# sudo lvremove data/dns

Verify the new base volume was added correctly to the thin pool.

# sudo lvs

The volume should be listed in pool qemu, show a Data% of 0.00 and have the attributes V and tz.

  LV   VG   Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  dns  data Vwi-a-tz-- 20.00g qemu        0.00

Packer

Packer is a tool for automating the creation of virtual machine images; in this instance it will be used to automate the creation of Vagrant boxes. I have already taken the time to create a Packer template for Arch Linux based on my installation tutorials, but I encourage you to use this only as a basis and delve deeper to create your own templates. I could very easily have downloaded someone else's templates, but then I would lack understanding.

GitHub: kyau/packer-kvm-templates

The Packer templates are in JSON format and contain all of the information needed to create the virtual machine image. Descriptions of all the template sections and values, including default values, can be found in the Packer docs. For Arch Linux, the template file archlinux-x86_64-base-vagrant.json will be used to generate an Arch Linux qcow2 virtual machine image.

# git clone https://github.com/kyau/packer-kvm-templates
# cd packer-kvm-templates/archlinux-x86_64-base

To explain the template a bit: the builders section specifies that the result is a qcow2 image built with QEMU/KVM. Several settings are imported from user variables set in the preceding section: the ISO URL and checksum, the country setting, the disk space for the VM's primary hard drive, the amount of RAM and the number of vCores to dedicate to the VM, whether or not the VM is headless, and the login and password for the primary SSH user. These are all placed as user variables in a section at the top so that quick edits are possible. The template also specifies that the VM should use virtio for the disk and network interfaces. Lastly come the built-in web server in Packer and the boot commands: http_directory specifies which directory will be the root of the built-in web server (this makes it possible to host files for the VM to access during installation), and boot_command is an array of commands that are executed upon boot in order to kick-start the installer. Finally, qemuargs should be rather apparent, as they are the arguments passed to QEMU.

# cd ..

Next is the provisioners section, which executes three separate scripts after the machine has booted. These scripts are passed the required user variables, set at the top of the file, as shell variables. The install.sh script installs Arch Linux, hardening.sh applies hardening to the Arch Linux installation, and cleanup.sh performs general cleanup after the installation is complete.

While the README.md does have all of this information for the packer templates, it will also be detailed here.

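For added security, generate new moduli for your VMs (or copy them from /etc/ssh/moduli). On OpenSSH 8.2 and newer, the -G and -T options used below were replaced by -M generate and -M screen.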

# ssh-keygen -G moduli.all -b 4096
# ssh-keygen -T moduli.safe -f moduli.all
# mv moduli.safe moduli && rm moduli.all

Enter the directory for the Arch Linux template and sym-link the moduli.

# cd archlinux-x86_64-base/default
# ln -s ../../moduli . && cd ..

Build the base virtual machine image.

# ./build archlinux-x86_64-base-vagrant.json
Note: This runs PACKER_LOG=1 PACKER_LOG_PATH="packer.log" packer-io build archlinux-x86_64-base-vagrant.json, which logs to packer.log in the current directory.

Once finished, there should be a qcow2 vagrant-libvirt image for Arch Linux in the box directory.

Add this image to Vagrant.

# vagrant box add box/archlinux-x86_64-base-vagrant-libvirt.box --name archlinux-x86_64-base

Vagrant-libvirt

Vagrant can be used to build and manage test machines. The vagrant-libvirt plugin adds a Libvirt provider to Vagrant, allowing Vagrant to control and provision machines via the Libvirt toolkit.

To bring up the first machine, first create a new directory for it.

# cd
# mkdir testmachine
# cd testmachine

Initialize the machine with Vagrant.

# vagrant init archlinux-x86_64-base

Then bring up the machine.

# vagrant up

Then SSH into the machine directly.

# vagrant ssh

Libvirt

While Vagrant is a great tool for working with test and development environments, the more permanent VMs on the system are better served by using libvirt directly, which gives closer control over KVM. This will allow the VMs to run directly off of LVM volumes.

For this a separate Packer template was created, one with all of the Vagrant-specific parts removed. To build one of these, simply use the other JSON file in the Arch Linux template directory.

# ./build archlinux-x86_64-base.json

This can then be output directly to the LVM thin volume.

# sudo qemu-img convert -f qcow2 -O 'raw' 'qcow2/archlinux-x86_64-base.qcow2' '/dev/data/dns'

Because a thick image was copied onto a thin volume, the volume will now report all of its disk space as used.

# sudo lvs


  LV   VG   Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  dns  data Vwi-a-tz-- 20.00g qemu        100.00

The disk merely needs to be sparsified.

# sudo virt-sparsify --in-place /dev/data/dns

The volume should now report its actual usage.

# sudo lvs


  LV   VG   Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  dns  data Vwi-a-tz-- 20.00g qemu        7.17

Virt-manager

Virt-manager can now be installed on the local machine (the one viewing this tutorial, not the KVM host machine). It will be used to connect to libvirt remotely via SSH.

# pacaur -S virt-manager

Then on the KVM host machine enable and start libvirtd.

# sudo systemctl enable libvirtd
# sudo systemctl start libvirtd

Then enable access to libvirtd to everyone in the kvm group.

filename: /etc/polkit-1/rules.d/50-libvirt.rules
/* Allow users in kvm group to manage the libvirt daemon without authentication */
polkit.addRule(function(action, subject) {
    if (action.id == "org.libvirt.unix.manage" &&
        subject.isInGroup("kvm")) {
        return polkit.Result.YES;
    }
});

You should now be able to connect remotely to QEMU/KVM with virt-manager over SSH. This will be useful later on.
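For reference, a remote connection over SSH is addressed with a libvirt URI of the form qemu+ssh://USER@HOST/system. The sketch below builds such a URI; the function name libvirt_ssh_uri and the host name kvmhost.example.com are placeholders, not values from this setup.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build a libvirt remote URI for the system QEMU/KVM
# instance over SSH. $1 = SSH user, $2 = KVM host name or address.
libvirt_ssh_uri() {
    printf 'qemu+ssh://%s@%s/system\n' "$1" "$2"
}

# Placeholder user and host for illustration.
libvirt_ssh_uri kyau kvmhost.example.com

# On the local machine the URI could then be passed to virsh or virt-manager:
#   virsh -c "$(libvirt_ssh_uri kyau kvmhost.example.com)" list --all
#   virt-manager -c "$(libvirt_ssh_uri kyau kvmhost.example.com)"
```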

Network Bridge

Setting up a network bridge for KVM is simple with systemd-networkd. Replace X.X.X.X with the host machine's IP address and Y.Y.Y with its first three octets; update the Gateway and DNS if not using OVH.

filename: /etc/systemd/network/kvm0.netdev
[NetDev]
Name=kvm0
Kind=bridge
 
filename: /etc/systemd/network/kvm0.network
[Match]
Name=kvm0

[Network]
DNS=213.186.33.99
Address=X.X.X.X/24
Gateway=Y.Y.Y.254
IPForward=yes
 
filename: /etc/systemd/network/eth0.network
[Match]
Name=eth0

[Network]
Bridge=kvm0
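The Y.Y.Y.254 gateway used above follows the pattern of taking the host's IPv4 address and replacing the last octet with 254. A minimal sketch of that substitution, assuming this convention applies to your server (check the provider's control panel if unsure); the function name ovh_gateway and the sample address are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: derive the gateway from a host IPv4 address by
# replacing the last octet with 254 (assumed convention, see lead-in).
ovh_gateway() {
    local ip=$1
    echo "${ip%.*}.254"
}

# Documentation-range sample address for illustration.
ovh_gateway 203.0.113.17
```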

And finally restart networkd.

# sudo systemctl restart systemd-networkd

The bridge should now be up and running. Verify this.

# ip a

Once the bridge is up and running QEMU can be directed to use it. Create a directory in /etc/ for QEMU and then make a bridge.conf.

# sudo mkdir /etc/qemu
# sudoedit /etc/qemu/bridge.conf
 
filename: /etc/qemu/bridge.conf
allow kvm0

Then set cap_net_admin on the binary helper.

# sudo setcap cap_net_admin=ep /usr/lib/qemu/qemu-bridge-helper

NAT

To get NAT working inside of each VM, IP forwarding will need to be enabled.

filename: /etc/sysctl.d/99-kvm.conf
net.ipv4.ip_forward = 1

Rules will also need to be appended to nftables.

filename: /etc/nftables.conf
table inet filter {

    chain forward {
        type filter hook forward priority 0;
        oifname kvm0 accept
        iifname kvm0 ct state related,established accept
        iifname kvm0 drop
    }

}

Rebooting at this point to make sure all these networking settings were set correctly would be a wise idea.

# sudo systemctl reboot

Network Test

Using the OVH/SyS Manager, set up two failover IP addresses on the same virtual MAC and use that virtual MAC for the following setup.

The network on the VM can now be tested. Use the following script to launch your virtual machine. Be sure to input the proper MAC address so it matches the one that OVH has assigned the two IP addresses.

filename: qemu-run.sh
#!/usr/bin/env bash

VCORES="1"
RAM="1024"
THIN_VOLUME="data/dns"
MAC_ADDRESS="12:34:56:78:90:ab"
sudo qemu-system-x86_64 --enable-kvm -machine q35,accel=kvm -device intel-iommu \
    -m "${RAM}" -smp cpus=1,maxcpus=16,cores="${VCORES}" -cpu host,kvm=off \
    -drive file="/dev/${THIN_VOLUME}",cache=none,if=virtio,format=raw -net bridge,br=kvm0 \
    -net nic,model=virtio,macaddr="${MAC_ADDRESS}" -vga qxl \
    -spice port=5900,addr=127.0.0.1,disable-ticketing
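Since a mistyped MAC address means the failover IPs will not reach the VM, a quick format check before launching can help. This is a sketch only; the function name is_valid_mac is hypothetical and it only checks the format (six colon-separated hex pairs), not whether the address matches the one assigned by OVH.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed (exit 0) if $1 looks like a MAC address made
# of six colon-separated pairs of hexadecimal digits.
is_valid_mac() {
    [[ $1 =~ ^([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$ ]]
}

# Placeholder address taken from the script above.
MAC_ADDRESS="12:34:56:78:90:ab"
if is_valid_mac "$MAC_ADDRESS"; then
    echo "MAC address format looks fine"
else
    echo "MAC address is malformed" >&2
fi
```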

Once launched, you should be able to connect to the VM using a SPICE client such as Vinagre. Click Connect in Vinagre, set Host to localhost, then make sure "Use host ... as a SSH tunnel" is checked with your KVM host server name filled in. Connect and enter your SSH key password.

The KVM virtual machine should now be visible through Vinagre.

Log in as root. If this was built using packer-kvm-templates, the default password is password.

Edit the network interface configuration for systemd. This first VM is going to be acting as my DNS server, therefore it will be assigned two IP addresses.

filename: /etc/systemd/network/eth0.network
[Match]
Name=eth0

[Network]
Address=FAILOVER_IP_1/32
Address=FAILOVER_IP_2/32
Peer=HOST_GATEWAY/32

[Gateway]
Gateway=HOST_GATEWAY
Destination=0.0.0.0/0

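This is exactly how OVH says it should be set up. However, this was not enough, as the VM still did not have a default route (systemd-networkd documents routes under a [Route] section rather than [Gateway], which may be why).

To fix the routing, create a service that runs on boot.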

filename: /usr/lib/systemd/system/kvmnet.service
[Unit]
Description=Start KVM Network
After=systemd-udevd.service network-pre.target systemd-sysusers.service systemd-sysctl.service
Before=network.target multi-user.target shutdown.target
Conflicts=shutdown.target
Wants=network.target

[Service]
ExecStart=/usr/local/bin/kvmnet

[Install]
WantedBy=multi-user.target

And the script that does the routing.

filename: /usr/local/bin/kvmnet
#!/bin/bash
ip route add Y.Y.Y.254 dev eth0
ip route add default via Y.Y.Y.254 dev eth0

Don't forget to make it executable and to enable the service so that it runs on boot.

# sudo chmod +rx /usr/local/bin/kvmnet
# sudo systemctl enable kvmnet.service

Reboot the VM and then verify it has internet access.

# sudo reboot
# ping archlinux.org

Finally, verify that the VM can be reached over SSH from the outside via BOTH IP addresses.

References