KVM on Arch Linux

From Wiki³
Revision as of 00:19, 14 August 2017 by Kyau
UNDER CONSTRUCTION: The document is currently being modified!

Introduction

This is a tutorial on how to automate the setup of VMs using KVM on Arch Linux. It uses QEMU with KVM acceleration, managed through libvirt. System base images are generated using Packer, and Vagrant with vagrant-libvirt is used for KVM test environments.

The tutorial environment consists of a database server, a DNS server, a web server and one or more test servers (which may or may not be clones of the three main servers). Additional servers should be available on demand for any use case. All machine images should be built in-house so that image security can be maintained.

Installation

Before getting started there are a few packages that will be needed to set all of this up.

# pacaur -S bridge-utils libguestfs libvirt openbsd-netcat openssl-1.0 ovmf \
packer-io qemu-headless qemu-headless-arch-extra vagrant

Swap

"The idea is you assign each guest a large amount of memory (more than you can actually give out) because they're generally not using it. Then you do the math to ensure you have enough swap space that the guests can actually swap out to disk in the worst-case-scenario where they all actually do use all that memory." [1] As user ndt points out, the swap space on the host machine should follow the formula Total Host Swap = Sum of Guest Memory Assigned + Recommended OS Swap Size.
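
The formula above can be sketched as a small shell function. This is only an illustration: the function name total_host_swap_mib and the example sizes are assumptions, not values from this setup.

```shell
# Sketch: Total Host Swap = Sum of Guest Memory Assigned + Recommended OS Swap Size.
# All values are in MiB. The function name and the sample numbers are hypothetical.
total_host_swap_mib() {
    os_swap_mib=$1          # recommended swap for the host OS itself
    shift
    total=$os_swap_mib
    for guest_mib in "$@"; do
        total=$((total + guest_mib))   # add the memory assigned to each guest
    done
    echo "$total"
}

# Example: 4 GiB of OS swap plus three guests assigned 4 GiB, 4 GiB and 2 GiB.
total_host_swap_mib 4096 4096 4096 2048
```

Comparing the result against the combined size of the swap partitions shows whether more space needs to be cut from LVM.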

During the installation of the host machine two 8GB partitions were left on each drive for this reason. If more space is needed, it can always be cut from LVM.

Set up the two swap partitions as Linux swap space.

# sudo mkswap /dev/sda2
# sudo mkswap /dev/sdb2

Activate the two swap partitions.

# sudo swapon /dev/sda2
# sudo swapon /dev/sdb2

The swap partitions will also need to be activated on boot. Look up their UUIDs and add an entry for each to the fstab.

# sudo lsblk -no UUID /dev/sda2
# sudo lsblk -no UUID /dev/sdb2
 
filename: /etc/fstab
UUID=XXX-XXXX… none swap defaults 0 0
UUID=XXX-XXXX… none swap defaults 0 0

Hugepages

Enabling hugepages can improve the performance of virtual machines. First check the group ID of the kvm group, then add an entry to the fstab that uses that ID for gid.

# grep kvm /etc/group
# sudoedit /etc/fstab


filename: /etc/fstab
hugetlbfs /dev/hugepages hugetlbfs mode=1770,gid=999 0 0
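
To avoid copying the group ID by hand, the fstab line can be generated from the group entry. The following is a minimal sketch; both function names are made up for illustration, and the sample entry stands in for the output of getent group kvm on a real host.

```shell
# Sketch: derive the gid= value for the hugetlbfs fstab line from a group entry.
# gid_from_group_entry and hugetlbfs_fstab_line are hypothetical helper names.
gid_from_group_entry() {
    # Field 3 of "name:password:GID:members" is the numeric group ID.
    printf '%s\n' "$1" | cut -d: -f3
}

hugetlbfs_fstab_line() {
    printf 'hugetlbfs /dev/hugepages hugetlbfs mode=1770,gid=%s 0 0\n' "$1"
}

# On a real host the entry would come from: getent group kvm
entry="kvm:x:999:"
hugetlbfs_fstab_line "$(gid_from_group_entry "$entry")"
```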

Rather than rebooting, simply remount.

# sudo umount /dev/hugepages
# sudo mount /dev/hugepages

This can then be verified.

# sudo mount | grep huge
# ls -FalG /dev/ | grep huge

Now set the number of hugepages to use. This takes a bit of math: assuming the usual 2 MB hugepage size on x86_64, take the amount of system RAM to dedicate to VMs in megabytes and divide it by two.

On my setup I will dedicate 12GB out of the 16GB of system RAM to VMs. This means (12 * 1024) / 2, or 6144 hugepages.
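
The same arithmetic as a small shell function, in case the amount of RAM changes later. The function name is hypothetical, and the 2 MB default is an assumption that can be checked against the Hugepagesize line in /proc/meminfo.

```shell
# Sketch: number of hugepages needed for a given amount of RAM.
# hugepages_for_gib is a hypothetical name; the page size defaults to 2 MiB.
hugepages_for_gib() {
    ram_gib=$1
    page_mib=${2:-2}
    echo $(( ram_gib * 1024 / page_mib ))
}

# 12 GiB dedicated to VMs with 2 MiB pages.
hugepages_for_gib 12
```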

Set the number of hugepages.

# echo 6144 | sudo tee /proc/sys/vm/nr_hugepages

Also set this permanently by adding a file to /etc/sysctl.d.

filename: /etc/sysctl.d/40-hugepages.conf
vm.nr_hugepages = 6144

Again verify the changes.

# grep HugePages_Total /proc/meminfo

Edit the libvirt QEMU config and turn hugepages on.

filename: /etc/libvirt/qemu.conf

hugetlbfs_mount = "/dev/hugepages"

KVM Group

Create a user for KVM.

# sudo useradd -g kvm -s /usr/bin/nologin kvm

Then modify the libvirt QEMU config to reflect this.

filename: /etc/libvirt/qemu.conf
user = "kvm"
group = "kvm"

Fix the permissions on /dev/kvm by setting the kvm group's ID to 78.

# sudo groupmod -g 78 kvm
As of version 234, systemd assigns dynamic IDs to groups, but KVM expects 78.

Add the current user to the kvm group.

# sudo gpasswd -a kyau kvm

Kernel Modules

In order to mount directories from the host inside of a virtual machine, the 9pnet_virtio kernel module will need to be loaded.

# sudo modprobe 9pnet_virtio

Also load the module on boot.

filename: /etc/modules-load.d/virtio-9pnet.conf
9pnet_virtio

OVMF & IOMMU

The Open Virtual Machine Firmware (OVMF) is a project to enable UEFI support for virtual machines, while enabling IOMMU makes PCI pass-through possible, among other things. Together these significantly extend the choice of guest operating systems and open up some other options.

Enable IOMMU on boot by adding an option to the kernel line in GRUB.

filename: /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="intel_iommu=on"

Re-generate the GRUB config.

# sudo grub-mkconfig -o /boot/grub/grub.cfg

Reboot the machine and then verify IOMMU is enabled.

# sudo dmesg | grep -e DMAR -e IOMMU

If it was enabled properly, there should be a line similar to [ 0.000000] DMAR: IOMMU enabled.
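
For scripted checks, the same test can be wrapped in a function that reads the kernel log from standard input. This is a sketch only: iommu_enabled is a made-up name, and it matches the Intel DMAR message quoted above, so other platforms would need a different pattern.

```shell
# Sketch: succeed if the kernel log on stdin reports that the IOMMU is enabled.
# iommu_enabled is a hypothetical helper; it looks for the Intel DMAR message only.
iommu_enabled() {
    grep -q 'DMAR: IOMMU enabled'
}

# On a real host: sudo dmesg | iommu_enabled && echo "IOMMU is on"
printf '[    0.000000] DMAR: IOMMU enabled\n' | iommu_enabled && echo "IOMMU is on"
```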

Add the OVMF firmware to libvirt.

filename: /etc/libvirt/qemu.conf
nvram = [
"/usr/share/ovmf/ovmf_code_x64.bin:/usr/share/ovmf/ovmf_vars_x64.bin"
]

LVM

During the installation of the KVM host machine a data volume group was created for VMs. Before carving out disk space for virtual machines, create the volume(s) that will exist outside of the virtual machines. These will be used for databases, web root directories and any other data that needs to persist between VM creation and destruction.

# sudo lvcreate -L 256G data --name http
I am only using a single LVM volume and then creating directories inside of it for each machine.

Create a directory for the volume.

# sudo mkdir /http

Format the new volume with ext4.

# sudo mkfs.ext4 -O metadata_csum,64bit /dev/data/http
# sudo mount /dev/data/http /http

Set the proper permissions and change the http user's home directory.

# sudo chown http:http /http
# sudo usermod -m -d /http http

Add the volume to fstab so that it mounts upon boot.

filename: /etc/fstab
/dev/mapper/data-http /http ext4 rw,relatime,stripe=256,data=ordered,journal_checksum 0 0

Volumes will now need to be created for each virtual machine. For this an LVM thin pool can be utilized.

LVM Thin Provisioning

Thin provisioning creates another virtual layer on top of your volume group, in which logical thin volumes can be created. Thin volumes, unlike normal thick volumes, do not reserve the disk space for the volume on creation but instead do so upon write; to the operating system they are still reported as full size volumes. This means that when utilizing LVM directly for KVM it will perform similarly to a "dynamic disk" meaning it will only use what disk space it needs regardless of how big the virtual hard drive actually is. This can also be paired with LVM cloning (snapshots) to create some interesting setups, like running 1TB of VMs on a 128GB disk for example.

WARNING: The one disadvantage is that, without proper disk monitoring and management, this can lead to over-provisioning (if the pool overflows, volumes will be dropped).
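
A minimal monitoring sketch for the over-provisioning risk: it filters lvs output for volumes whose Data% is at or above a threshold. The function name thin_usage_over and the 80 percent threshold are assumptions; the sample input stands in for real lvs output.

```shell
# Sketch: print "name data%" for every volume at or above a usage threshold.
# Expects lines of "lv_name data_percent" on stdin; volumes without a Data% are skipped.
# thin_usage_over is a hypothetical helper name.
thin_usage_over() {
    threshold=$1
    awk -v limit="$threshold" 'NF >= 2 && $2 + 0 >= limit { print $1 " " $2 }'
}

# On a real host:
#   sudo lvs --noheadings -o lv_name,data_percent data | thin_usage_over 80
printf '  http \n  qemu 85.20\n  dns 7.17\n' | thin_usage_over 80
```

Run from a timer or cron job, a check like this gives some warning before the pool fills up.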

Use the rest of the data volume group for the thin pool.

# sudo lvcreate -l +100%FREE data --thinpool qemu

Running lvdisplay verifies that a thin pool was created.

# sudo lvdisplay data/qemu


__LV Size <1.50 TiB
Allocated pool data 0.00%

Finally lvs should show the volume with the t and tz attributes as well as a data percentage.

# sudo lvs


__LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
http data -wi-ao---- 256.00g
qemu data twi-a-tz-- <1.50t 0.00 0.43
root neutron -wi-ao---- 63.93g

Adding volumes to the thin pool is very similar to adding normal volumes. Add one for the first VM.

# sudo lvcreate -V 20G --thin -n dns data/qemu

These volumes can be shrunk or extended at any point.

# sudo lvextend -L +15G data/dns

Or even removed entirely.

# sudo lvremove data/dns

Verify the new base volume was added correctly to the thin pool.

# sudo lvs

The volume should be listed in pool qemu, show a Data% of 0.00 and have the attributes V and tz.

__LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
dns data Vwi-a-tz-- 20.00g qemu 0.00

Packer

Packer is a tool for automating the creation of virtual machines; in this instance it will be used to automate the creation of Vagrant boxes. I have already taken the time to create a Packer template for Arch Linux based on my installation tutorials, but I encourage you to use it only as a basis and delve deeper to create your own templates. I could very easily have downloaded someone else's templates, but then I would lack understanding.

GitHub: kyau/packer-kvm-templates

Vagrant-Libvirt

The libvirt plugin installation for vagrant requires some cleanup first.

# sudo mv /opt/vagrant/embedded/lib/libcurl.so{,.backup}
# sudo mv /opt/vagrant/embedded/lib/libcurl.so.4{,.backup}
# sudo mv /opt/vagrant/embedded/lib/libcurl.so.4.4.0{,.backup}
# sudo mv /opt/vagrant/embedded/lib/pkgconfig/libcurl.pc{,.backup}

Then build the plugin.

# vagrant plugin install vagrant-libvirt

Templates

The Packer templates are in JSON format and contain all of the information needed to create the virtual machine image. Descriptions of all the template sections and values, including default values, can be found in the Packer docs. For Arch Linux, the template file archlinux-x86_64-base-vagrant.json will be used to generate an Arch Linux qcow2 virtual machine image.

# git clone https://github.com/kyau/packer-kvm-templates

To explain the template a bit: the builders section specifies a qcow2 image built with QEMU/KVM. Several settings are imported from user variables defined in the section above it: the ISO URL and checksum, the country setting, the disk space for the VM's primary hard drive, the amount of RAM and the number of vCores to dedicate to the VM, whether or not the VM is headless, and the login and password for the primary SSH user. These are all set as user variables and placed in a section at the top of the file so that quick edits are possible. The template also specifies that the VM should use virtio for the disk and network interfaces.

The remaining settings cover Packer's built-in web server and the boot commands. The http_directory setting specifies which directory will be the root of the built-in web server (this makes it possible to host files for the VM to access during installation). The boot_command is an array of commands that are executed upon boot in order to kick-start the installer. Finally, the qemuargs should be fairly self-explanatory, as they are the arguments passed to QEMU.
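
To make the description more concrete, here is a rough skeleton of such a template, written out by a shell heredoc. It is NOT the contents of kyau/packer-kvm-templates: every value is a placeholder, the variable names are assumptions, and the set of required keys may differ between Packer versions.

```shell
# Sketch only: a skeleton of a QEMU builder template with placeholder values.
# This is not the template from the repository; names and values are illustrative.
template=$(mktemp)
cat > "$template" <<'EOF'
{
  "variables": {
    "iso_url": "ISO_URL_PLACEHOLDER",
    "iso_checksum": "CHECKSUM_PLACEHOLDER",
    "disk_size": "20480",
    "memory": "1024",
    "cpus": "1",
    "headless": "true",
    "ssh_username": "USER_PLACEHOLDER",
    "ssh_password": "PASSWORD_PLACEHOLDER"
  },
  "builders": [{
    "type": "qemu",
    "format": "qcow2",
    "accelerator": "kvm",
    "iso_url": "{{user `iso_url`}}",
    "iso_checksum": "{{user `iso_checksum`}}",
    "disk_size": "{{user `disk_size`}}",
    "disk_interface": "virtio",
    "net_device": "virtio-net",
    "headless": "{{user `headless`}}",
    "http_directory": "HTTP_DIR_PLACEHOLDER",
    "ssh_username": "{{user `ssh_username`}}",
    "ssh_password": "{{user `ssh_password`}}",
    "boot_command": ["<enter>"],
    "qemuargs": [["-m", "{{user `memory`}}"], ["-smp", "{{user `cpus`}}"]]
  }]
}
EOF
echo "wrote $template"
```

The real template additionally has a provisioners section and a much longer boot_command; the skeleton only shows how the user variables feed into the builder.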

# cd packer-kvm-templates

The provisioners section executes three separate scripts after the machine has booted. These scripts are also passed the required user variables, set at the top of the file, as shell variables. The install.sh script installs Arch Linux, hardening.sh applies hardening to the Arch Linux installation and cleanup.sh handles general cleanup after the installation is complete. While the README.md has all of this information for the Packer templates, it is also detailed here.


For added security, generate new moduli for your VMs (or copy them from /etc/ssh/moduli).

# ssh-keygen -G moduli.all -b 4096
# ssh-keygen -T moduli.safe -f moduli.all
# mv moduli.safe moduli && rm moduli.all

Enter the directory for the Arch Linux template and sym-link the moduli.

# cd archlinux-x86_64-base/default
# ln -s ../../moduli . && cd ..

Build the base virtual machine image.

# ./build archlinux-x86_64-base-vagrant.json
This runs PACKER_LOG=1 PACKER_LOG_PATH="packer.log" packer-io build archlinux-x86_64-base-vagrant.json, which logs to packer.log in the current directory.

Once finished, there should be a qcow2 vagrant-libvirt image for Arch Linux in the box directory.

Add this image to Vagrant.

# vagrant box add box/archlinux-x86_64-base-vagrant-libvirt.box --name archlinux-x86_64-base

Vagrant-Libvirt

Vagrant can be used to build and manage test machines. The vagrant-libvirt plugin adds a Libvirt provider to Vagrant, allowing Vagrant to control and provision machines via the Libvirt toolkit.

To bring up the first machine, first create a new directory for it.

# cd
# mkdir testmachine
# cd testmachine

Initialize the machine with Vagrant.

# vagrant init archlinux-x86_64-base

Then bring up the machine.

# vagrant up

Then SSH into the machine directly.

# vagrant ssh

QEMU

While Vagrant is a great tool for working with test and development environments, the more permanent VMs on the system are better served by utilizing QEMU directly, which allows them to run directly off of LVM thin volumes. Currently vagrant-libvirt cannot do this, because its own snapshotting interferes with it; thankfully LVM has snapshotting of its own.

For this a separate Packer template was created, one with all of the Vagrant-specific parts removed. To build it, simply use the other JSON file in the Arch Linux template directory.

# ./build archlinux-x86_64-base.json

This can then be output directly to the LVM thin volume.

# sudo qemu-img convert -f qcow2 -O raw qcow2/archlinux-x86_64-base.qcow2 /dev/data/dns

Because a thick image was copied onto a thin volume, the volume will now report all of its disk space as in use.

# sudo lvs


__LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
dns data Vwi-a-tz-- 20.00g qemu 100.00

The disk merely needs to be sparsified.

# sudo virt-sparsify --in-place /dev/data/dns

The volume should now report its actual usage.

# sudo lvs


__LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
dns data Vwi-a-tz-- 20.00g qemu 7.17

Network Bridge

Setting up a network bridge for KVM is simple with systemd. Replace X.X.X.X with the host machine's IP address and update the Gateway and DNS if not using OVH.

filename: /etc/systemd/network/kvm0.netdev
[NetDev]
Name=kvm0
Kind=bridge
 
filename: /etc/systemd/network/kvm0.network
[Match]
Name=kvm0

[Network]
DNS=213.186.33.99
Address=X.X.X.X/24
Gateway=Y.Y.Y.254
IPForward=yes
 
filename: /etc/systemd/network/eth0.network
[Match]
Name=eth0

[Network]
Bridge=kvm0

And finally restart networkd.

# sudo systemctl restart systemd-networkd

The bridge should now be up and running; verify this.

# ip a

Once the bridge is up and running QEMU can be directed to use it. Create a directory in /etc/ for QEMU and then make a bridge.conf.

# sudo mkdir /etc/qemu
# sudoedit /etc/qemu/bridge.conf
 
filename: /etc/qemu/bridge.conf
allow kvm0

Then set cap_net_admin on the binary helper.

# sudo setcap cap_net_admin=ep /usr/lib/qemu/qemu-bridge-helper
WARNING: I had major issues using the bridge as a regular user; I actually had to remove the setuid bit to get it working: sudo chmod u-s /usr/lib/qemu/qemu-bridge-help

NAT

To get NAT working inside of each VM, IP forwarding will need to be enabled.

filename: /etc/sysctl.d/99-kvm.conf
net.ipv4.ip_forward = 1

Rules will also need to be appended to nftables.

filename: /etc/nftables.conf
table inet filter {

chain forward {
type filter hook forward priority 0;
oifname kvm0 accept
iifname kvm0 ct state related,established accept
iifname kvm0 drop
}

}

Rebooting at this point to make sure all these networking settings were set correctly would be a wise idea.

# sudo systemctl reboot

Network Test

The network on the VM should now be fully tested. For this, a connection can be made using the SPICE protocol. On a local client machine, install vinagre.

# pacaur -S vinagre

Using the OVH/SyS Manager, set up two failover IP addresses on the same virtual MAC. The following arguments will launch the virtual machine. Be sure to input the proper virtual MAC so that it matches the one that OVH assigned.

# /usr/bin/qemu-system-x86_64 --enable-kvm -machine q35,accel=kvm -device intel-iommu \
-m 512 -smp 1 -cpu host -drive file=/dev/data/dns,cache=none,if=virtio,format=raw \
-net bridge,br=kvm0 -net nic,model=virtio,macaddr=00:00:00:00:00:00 -vga qxl \
-spice port=5900,addr=127.0.0.1,disable-ticketing \
-monitor unix:/tmp/monitor-dns.sock,server,nowait
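
Since the same arguments are needed again for each additional VM, they can be generated from a few parameters. The sketch below only prints the command line and does not start anything; qemu_args is a hypothetical helper, and the memory, CPU count and SPICE port are simply copied from the command above.

```shell
# Sketch: print the QEMU command line for a VM backed by /dev/data/<name>.
# qemu_args is a hypothetical helper; it prints the command and runs nothing.
# Arguments: name, MAC address, memory in MiB (default 512), CPU model (default host).
qemu_args() {
    name=$1
    mac=$2
    mem_mib=${3:-512}
    cpu=${4:-host}
    printf '%s ' \
        /usr/bin/qemu-system-x86_64 --enable-kvm -machine q35,accel=kvm -device intel-iommu \
        -m "$mem_mib" -smp 1 -cpu "$cpu" \
        -drive "file=/dev/data/$name,cache=none,if=virtio,format=raw" \
        -net bridge,br=kvm0 -net "nic,model=virtio,macaddr=$mac" -vga qxl \
        -spice port=5900,addr=127.0.0.1,disable-ticketing \
        -monitor "unix:/tmp/monitor-$name.sock,server,nowait"
    printf '\n'
}

# Replace the MAC with the virtual MAC that OVH assigned.
qemu_args dns 00:00:00:00:00:00
```

When several VMs run at once, each needs its own SPICE port, so the port would have to become a parameter as well.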

Once launched, you should be able to connect to the VM using a SPICE client such as Vinagre. Click Connect in Vinagre, set Host to localhost, then check "Use host ... as a SSH tunnel" and fill in the name of your KVM host server. Connect and enter your SSH key password.

The KVM virtual machine should now be visible through Vinagre.

Log in as root; if this was built using packer-kvm-templates, the default password is password.

Edit the network interface configuration for systemd. This first VM is going to be acting as my DNS server, therefore it will be assigned two IP addresses.

filename: /etc/systemd/network/eth0.network
[Match]
Name=eth0

[Network]
Address=FAILOVER_IP_1/32
Address=FAILOVER_IP_2/32
DNS=213.186.33.99
Peer=HOST_GATEWAY/32

[Route]
Gateway=HOST_GATEWAY
Destination=0.0.0.0/0

This is exactly how OVH says it should be set up; however, this was not enough, as the VM still did not have a default route.

TODO: Fix this section, this is an ugly hack

To fix the routing, create a service that runs on boot.

filename: /usr/lib/systemd/system/kvmnet.service
[Unit]
Description=Start KVM Network
After=network.target
Before=multi-user.target shutdown.target
Conflicts=shutdown.target
Wants=network.target

[Service]
ExecStart=/usr/local/bin/kvmnet

[Install]
WantedBy=multi-user.target

And the script that does the routing.

filename: /usr/local/bin/kvmnet
#!/bin/bash
ip route add Y.Y.Y.254 dev eth0
ip route add default via Y.Y.Y.254 dev eth0

Don't forget to make it executable.

# sudo chmod +rx /usr/local/bin/kvmnet

Enable the service.

# sudo systemctl enable kvmnet

Reboot the VM and then verify it has internet access.

# sudo reboot
# ping archlinux.org

Finally, verify that the VM can be reached over SSH from the outside via BOTH IP addresses.

Libvirt

To launch the virtual machines on boot there are two options. The first is to import the virtual machines into libvirt with virsh. The second is to set up a systemd service. Given that management will be loads easier with virt-manager, I will opt for the first option.

On the KVM host machine enable and start libvirtd.

# sudo systemctl enable libvirtd
# sudo systemctl start libvirtd

Then grant access to libvirtd to everyone in the kvm group.

filename: /etc/polkit-1/rules.d/50-libvirt.rules
/* Allow users in kvm group to manage the libvirt daemon without authentication */
polkit.addRule(function(action, subject) {
if (action.id == "org.libvirt.unix.manage" &&
subject.isInGroup("kvm")) {
return polkit.Result.YES;
}
});

Virsh

Virsh is the command line interface for libvirt. It can be used to import the QEMU arguments into an XML format that libvirt will understand.

Save the QEMU arguments used before to a temporary file.

# echo "/usr/bin/qemu-system-x86_64 --enable-kvm -machine q35,accel=kvm -device intel-iommu \
-m 512 -smp 1 -cpu Broadwell -drive file=/dev/data/dns,cache=none,if=virtio,format=raw \
-net bridge,br=kvm0 -net nic,model=virtio,macaddr=00:00:00:00:00:00 -vga qxl \
-spice port=5900,addr=127.0.0.1,disable-ticketing \
-monitor unix:/tmp/monitor-dns.sock,server,nowait" > kvm.args
The CPU is temporarily changed to Broadwell because virsh cannot recognize host.

Convert this to XML format.

# virsh domxml-from-native qemu-argv kvm.args > dns.xml

Then open up the XML file in an editor and change the name, cpu and graphics block.

filename: dns.xml

<name>DNS</name>

<cpu mode='host-passthrough' />

<graphics type='spice' port='5900' autoport='no' listen='127.0.0.1'>
<listen type='address' address='127.0.0.1' />
</graphics>

The last two qemu:commandline arguments can also be removed as they were setting up the SPICE server which is done through the graphics block.

The XML should now describe the same machine that was launched earlier with the QEMU binary.

Import the XML into libvirt.

# sudo virsh define dns.xml

The VM can now be launched.

# sudo virsh start DNS

SSH and SPICE over SSH should both now work and the machine should be running. Use the following to start the machine on boot.

# sudo virsh autostart DNS

A reboot of the host machine at this point should yield the virtual machine DNS starting up automatically.

Virt-Manager

Virt-manager can be used to manage the virtual machines remotely.

Install virt-manager on the local machine (the one viewing this tutorial, not the KVM host machine); it can then connect to libvirt remotely via SSH.

# pacaur -S virt-manager

Connect remotely to QEMU/KVM with virt-manager over SSH and the virtual machine should be shown as running.

Additional Notes

These notes are here from my own install.

# cd ~/packer-kvm-templates/archlinux-x86_64-base
# ./build archlinux-x86_64-base.json
 
# sudo lvcreate -V 20G --thin -n bind data/qemu
# sudo lvcreate -V 20G --thin -n sql data/qemu
# sudo lvcreate -V 20G --thin -n nginx data/qemu
 
# sudo qemu-img convert -f qcow2 -O raw qcow2/archlinux-x86_64-base.qcow2 /dev/data/bind
# sudo virt-sparsify --in-place /dev/data/bind
 
# vim virshxml
# ./virshxml
# sudo virsh define ~/newxml-bind.xml

Then repeat this for sql and nginx.

Don't forget about the notes virshxml gives for replacing the networkd service.
# sudo virsh start bind
# sudo virsh start sql
# sudo virsh start nginx
 
# sudo virsh autostart bind
# sudo virsh autostart sql
# sudo virsh autostart nginx
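
The per-VM steps above are identical apart from the name, so they lend themselves to a loop. The sketch below only prints the commands for review (a dry run); provision_cmds is a made-up name, and the virshxml step is left out because it needs manual edits.

```shell
# Sketch: print the volume-provisioning commands for each VM name given.
# provision_cmds is a hypothetical helper; it prints commands and executes nothing.
provision_cmds() {
    for vm in "$@"; do
        echo "sudo lvcreate -V 20G --thin -n $vm data/qemu"
        echo "sudo qemu-img convert -f qcow2 -O raw qcow2/archlinux-x86_64-base.qcow2 /dev/data/$vm"
        echo "sudo virt-sparsify --in-place /dev/data/$vm"
    done
}

provision_cmds bind sql nginx
```

Once the printed commands look right, the output can be piped to sh from inside the template directory.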

DNS

DNSSEC

Adding DNSSEC to BIND is always a good idea. First add the following lines to the options block inside of the BIND config.

filename: /etc/named.conf
dnssec-enable yes;
dnssec-validation yes;
dnssec-lookaside auto;

Install haveged for key generation inside of VMs.

# pacaur -S haveged
# haveged -w 1024

Gain root privileges.

# sudo -i
# cd /var/named

Create zone signing keys for all domains.

# dnssec-keygen -a ECDSAP384SHA384 -n ZONE kyau.net
# dnssec-keygen -a ECDSAP384SHA384 -n ZONE kyau.org

Create key signing keys for all domains.

# dnssec-keygen -f KSK -a ECDSAP384SHA384 -n ZONE kyau.net
# dnssec-keygen -f KSK -a ECDSAP384SHA384 -n ZONE kyau.org

Run the following for each domain to include the keys in the zone files.

# for key in `ls Kkyau.net*.key`; do echo "\$INCLUDE $key" >> kyau.net.zone; done
# for key in `ls Kkyau.org*.key`; do echo "\$INCLUDE $key" >> kyau.org.zone; done

Run a check on each zone.

# named-checkzone kyau.net /var/named/kyau.net.zone
# named-checkzone kyau.org /var/named/kyau.org.zone

Sign each zone with dnssec-signzone.

# dnssec-signzone -A -3 $(head -c 1000 /dev/random | sha1sum | cut -b 1-16) -N INCREMENT -o kyau.net -t kyau.net.zone
# dnssec-signzone -A -3 $(head -c 1000 /dev/random | sha1sum | cut -b 1-16) -N INCREMENT -o kyau.org -t kyau.org.zone
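
The salt passed to -3 is just 16 hexadecimal characters, and the signing command is the same for every zone, so both can be wrapped in small helpers. This is a sketch with hypothetical function names; it reads from /dev/urandom rather than /dev/random so that it never blocks, and it only prints the commands.

```shell
# Sketch: generate an NSEC3 salt and print the signing command for each zone.
# nsec3_salt and sign_cmd are hypothetical helpers; nothing is signed here.
nsec3_salt() {
    # 16 hex characters taken from a SHA-1 digest of random bytes.
    # /dev/urandom is an assumption made here to avoid blocking on /dev/random.
    head -c 1000 /dev/urandom | sha1sum | cut -b 1-16
}

sign_cmd() {
    zone=$1
    echo "dnssec-signzone -A -3 $(nsec3_salt) -N INCREMENT -o $zone -t $zone.zone"
}

for zone in kyau.net kyau.org; do
    sign_cmd "$zone"
done
```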

To update a zone at any point, edit the zone, check the zone and then re-sign it as root (sudo -i).

# cd /var/named
# dnssec-signzone -A -3 $(head -c 1000 /dev/random | sha1sum | cut -b 1-16) -N INCREMENT -o kyau.net -t kyau.net.zone
# systemctl restart named
WARNING: DO NOT increment the zone serial by hand; -N INCREMENT does this automatically!

Modify the bind config to read from the signed zone files.

filename: /etc/named.conf
zone "kyau.net" IN {
type master;
file "kyau.net.zone.signed";
allow-update { none; };
notify no;
};

zone "kyau.org" IN {
type master;
file "kyau.org.zone.signed";
allow-update { none; };
notify no;
};

Make sure all is in order.

# named-checkconf /etc/named.conf

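Next, visit the registrar for each domain and add the DS records for its zone; dnssec-signzone writes them to a dsset- file when it signs the zone.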

SQL

Create directories on the host machine for the nginx and sql servers.

# sudo mkdir -p /www/sql /www/nginx

Make sure it has the right permissions.

# sudo chown -R kvm:kvm /www

References

  1. Unix & Linux Stack Exchange. KVM and Swap space
  2. DigitalOcean. How To Setup DNSSEC on an Authoritative BIND DNS Server. https://www.digitalocean.com/community/tutorials/how-to-setup-dnssec-on-an-authoritative-bind-dns-server--2