Libvirt HOWTO

Launching clusters via libvirt is especially useful for operator development.

One-time setup

It's expected that you will create and destroy clusters often in the course of development. These steps only need to be run once.

Before you begin, install the build dependencies.

Enable KVM

Make sure you have KVM enabled by checking for the device:

$ ls -l /dev/kvm 
crw-rw-rw-+ 1 root kvm 10, 232 Oct 31 09:22 /dev/kvm

If it is missing, try some of the ideas here.
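A minimal sketch of the check above as a reusable function. The function name `check_kvm` and its two optional path arguments are hypothetical; they exist only so the check can be pointed at other files. It assumes a Linux host where CPU flags are listed in /proc/cpuinfo.

```shell
#!/usr/bin/env bash
# Report whether KVM looks usable on this host.
# Both arguments are optional and default to the real system paths.
check_kvm() {
  local kvm_dev=${1:-/dev/kvm} cpuinfo=${2:-/proc/cpuinfo}
  if [ -e "$kvm_dev" ]; then
    echo "ok: $kvm_dev exists"
    return 0
  fi
  # vmx is the Intel VT-x CPU flag, svm is the AMD-V CPU flag.
  if grep -Eqw 'vmx|svm' "$cpuinfo" 2>/dev/null; then
    echo "missing: $kvm_dev (CPU advertises virtualization; check that the kvm modules are loaded)"
  else
    echo "missing: $kvm_dev (no vmx/svm CPU flag; virtualization may be disabled in firmware)"
  fi
  return 1
}
```

Calling `check_kvm` with no arguments inspects the real device and prints a hint about the likely cause when it is absent.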

Install and Enable Libvirt

On Fedora, CentOS/RHEL:

sudo yum install libvirt libvirt-devel libvirt-daemon-kvm qemu-kvm

Then start libvirtd:

sudo systemctl enable --now libvirtd

Pick names

In this example, we'll set the base domain to tt.testing and the cluster name to test1.

Clone the project

git clone https://github.com/openshift/installer.git
cd installer

Make sure you have permissions for qemu:///system

You may want to grant yourself permissions to use libvirt as a non-root user. You could allow all users in the wheel group by doing the following:

sudo tee -a /etc/polkit-1/rules.d/80-libvirt.rules <<EOF
polkit.addRule(function(action, subject) {
  if (action.id == "org.libvirt.unix.manage" && subject.local && subject.active && subject.isInGroup("wheel")) {
      return polkit.Result.YES;
  }
});
EOF

Enable IP Forwarding

Libvirt creates a bridged connection to the host machine, but the network bridge only works when IP forwarding is enabled. The following command shows whether forwarding is enabled:

sysctl net.ipv4.ip_forward

If the command output is:

net.ipv4.ip_forward = 0

then forwarding is disabled; proceed with the rest of this section. If IP forwarding is already enabled, skip the rest of this section.

To enable IP forwarding for the current boot:

sudo sysctl net.ipv4.ip_forward=1

or to persist the setting across reboots (recommended):

echo "net.ipv4.ip_forward = 1" | sudo tee /etc/sysctl.d/99-ipforward.conf
sudo sysctl -p /etc/sysctl.d/99-ipforward.conf
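The check-then-enable flow in this section can be sketched as a small script that only reports what to do and changes nothing. The function name `forwarding_enabled` and its optional file argument are hypothetical; the sketch assumes the kernel exposes the setting at /proc/sys/net/ipv4/ip_forward, which mirrors the sysctl key used above.

```shell
#!/usr/bin/env bash
# Return 0 if IPv4 forwarding is enabled, 1 otherwise.
# The optional argument defaults to the kernel's procfs entry.
forwarding_enabled() {
  local f=${1:-/proc/sys/net/ipv4/ip_forward}
  [ "$(cat "$f" 2>/dev/null)" = "1" ]
}

if forwarding_enabled; then
  echo "IP forwarding already enabled; nothing to do"
else
  echo "IP forwarding disabled; to persist it across reboots run:"
  echo '  echo "net.ipv4.ip_forward = 1" | sudo tee /etc/sysctl.d/99-ipforward.conf'
  echo '  sudo sysctl -p /etc/sysctl.d/99-ipforward.conf'
fi
```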

Configure libvirt to accept TCP connections

The Kubernetes cluster-api components drive deployment of worker machines. The libvirt cluster-api provider will run inside the local cluster, and will need to connect back to the libvirt instance on the host machine to deploy workers.

In order for this to work, you'll need to enable TCP connections for libvirt.

Configure libvirtd.conf

To do this, first modify your /etc/libvirt/libvirtd.conf and set the following:

listen_tls = 0
listen_tcp = 1
auth_tcp = "none"
tcp_port = "16509"

Note that authentication is not currently supported, but should be soon.

Configure the service runner to pass --listen to libvirtd

In addition to the config, you'll have to pass an additional command-line argument to libvirtd. On Fedora, modify /etc/sysconfig/libvirtd and set:

LIBVIRTD_ARGS="--listen"

On Debian based distros, modify /etc/default/libvirtd and set:

libvirtd_opts="--listen"

Next, restart libvirt: sudo systemctl restart libvirtd
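After the restart it can be worth confirming that libvirtd is actually listening on the configured TCP port. The following is a sketch, not part of the installer: the helper name `listening_on` is hypothetical, and it assumes the `ss` tool from iproute2 is installed and prints the local address in its fourth column.

```shell
#!/usr/bin/env bash
# Read `ss -ltn` output on stdin; succeed if a LISTEN socket uses the given port.
listening_on() {
  awk -v port="$1" '$1 == "LISTEN" && $4 ~ (":" port "$") { found = 1 } END { exit !found }'
}

if command -v ss >/dev/null 2>&1 && ss -ltn | listening_on 16509; then
  echo "libvirtd appears to be listening on TCP port 16509"
else
  echo "nothing is listening on TCP port 16509; recheck libvirtd.conf and the --listen argument"
fi
```

An end-to-end check would be to connect with virsh using a qemu+tcp:// URI that points at the host address, which also exercises the auth_tcp setting.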

Firewall

Finally, if you have a firewall, you may have to allow connections to the libvirt daemon from the IP range used by your cluster nodes.

The following examples use the default cluster IP range of 192.168.126.0/24 (which is currently not configurable) and a libvirt default subnet of 192.168.124.0/24, which might be different in your configuration. If you're uncertain about the libvirt default subnet you should be able to see its address using the command ip -4 a show dev virbr0 or by inspecting virsh --connect qemu:///system net-dumpxml default. Ensure the cluster IP range does not overlap your virbr0 IP address.
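The overlap check mentioned above can be sketched in shell. The helper names `ip_to_int` and `ip_in_cidr` are hypothetical, and the script assumes IPv4 only and a bridge named virbr0; it does nothing if that bridge or the `ip` tool is absent.

```shell
#!/usr/bin/env bash
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  local a b c d
  IFS=. read -r a b c d <<EOF
$1
EOF
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Return 0 if address $1 falls inside CIDR range $2, 1 otherwise.
ip_in_cidr() {
  local net=${2%/*} bits=${2#*/} mask
  mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  [ $(( $(ip_to_int "$1") & mask )) -eq $(( $(ip_to_int "$net") & mask )) ]
}

CLUSTER_CIDR=192.168.126.0/24   # default cluster range from the text above

# Only inspect virbr0 when the bridge is actually present.
if ip link show virbr0 >/dev/null 2>&1; then
  virbr0_ip=$(ip -4 -o addr show dev virbr0 | awk '{print $4}' | cut -d/ -f1 | head -n1)
  if [ -n "$virbr0_ip" ] && ip_in_cidr "$virbr0_ip" "$CLUSTER_CIDR"; then
    echo "overlap: virbr0 ($virbr0_ip) is inside $CLUSTER_CIDR"
  else
    echo "ok: virbr0 address (${virbr0_ip:-none}) is outside $CLUSTER_CIDR"
  fi
fi
```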

iptables

sudo iptables -I INPUT -p tcp -s 192.168.126.0/24 -d 192.168.124.1 --dport 16509 -j ACCEPT -m comment --comment "Allow insecure libvirt clients"

Firewalld

If you use firewalld, first find the name of the active zone, then add the source range and port to that zone so that your cluster nodes can reach the libvirt daemon. For example:

$ sudo firewall-cmd --get-active-zones
FedoraWorkstation
  interfaces: enp0s25 tun0

With the name of the active zone, include the source and port to allow connections from the IP range used by your cluster nodes.

sudo firewall-cmd --zone=FedoraWorkstation --add-source=192.168.126.0/24
sudo firewall-cmd --zone=FedoraWorkstation --add-port=16509/tcp

Verify the source and port by listing them for the zone:

sudo firewall-cmd --zone=FedoraWorkstation --list-ports
sudo firewall-cmd --zone=FedoraWorkstation --list-sources

NOTE: When the firewall rules are no longer needed, sudo firewall-cmd --reload will remove the changes made as they were not permanently added. For persistence, add --permanent to the firewall-cmd commands and run them a second time.

Configure default libvirt storage pool

Check to see if a default storage pool has been defined in Libvirt by running virsh --connect qemu:///system pool-list. If it does not exist, create it:

sudo virsh pool-define /dev/stdin <<EOF
<pool type='dir'>
  <name>default</name>
  <target>
    <path>/var/lib/libvirt/images</path>
  </target>
</pool>
EOF

sudo virsh pool-start default
sudo virsh pool-autostart default
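To make the "check first, create only if missing" step explicit, here is a minimal sketch. The function name `pool_exists` is hypothetical; it relies on `virsh pool-info` exiting non-zero when the named pool is not defined, and it only reports the result rather than defining anything.

```shell
#!/usr/bin/env bash
# Succeed if the named storage pool is defined on the system libvirt instance.
pool_exists() {
  virsh --connect qemu:///system pool-info "$1" >/dev/null 2>&1
}

if pool_exists default; then
  echo "storage pool 'default' is already defined"
else
  echo "storage pool 'default' not found; define, start, and autostart it as shown above"
fi
```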

Set up NetworkManager DNS overlay

This step allows the installer and users to resolve cluster-internal hostnames from your host.

  1. Edit /etc/NetworkManager/NetworkManager.conf and set dns=dnsmasq in section [main]

  2. Tell dnsmasq to use your cluster. The syntax is server=/<baseDomain>/<firstIP>.

    For this example:

    echo server=/tt.testing/192.168.126.1 | sudo tee /etc/NetworkManager/dnsmasq.d/openshift.conf

  3. Reload NetworkManager to pick up the DNS configuration change: sudo systemctl reload NetworkManager
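For other base domains, the dnsmasq line from step 2 can be generated from variables. This is a sketch: the variable names BASE_DOMAIN and FIRST_IP and the function `dnsmasq_server_line` are chosen here for illustration, and the defaults are the example values used in this document.

```shell
#!/usr/bin/env bash
# Defaults taken from the example names and the default cluster IP range.
BASE_DOMAIN=${BASE_DOMAIN:-tt.testing}
FIRST_IP=${FIRST_IP:-192.168.126.1}

# Print the dnsmasq directive that forwards lookups for one domain to one server.
dnsmasq_server_line() {
  printf 'server=/%s/%s\n' "$1" "$2"
}

# Pipe this output to: sudo tee /etc/NetworkManager/dnsmasq.d/openshift.conf
dnsmasq_server_line "$BASE_DOMAIN" "$FIRST_IP"
```

With the defaults this prints server=/tt.testing/192.168.126.1, which matches the line written in step 2.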

Build and run the installer

With libvirt configured, you can proceed with the usual quick-start. Set TAGS=libvirt to add support for libvirt; this is not enabled by default because libvirt is development only.

TAGS=libvirt hack/build.sh

Cleanup

To remove resources associated with your cluster, run:

openshift-install destroy cluster

You can also use virsh-cleanup.sh, but note that it will currently destroy all libvirt resources.

Firewall

With the cluster removed, you no longer need to allow libvirt nodes to reach your libvirtd. Reload firewalld to remove your temporary changes as follows:

sudo firewall-cmd --reload

Exploring your cluster

Some things you can do:

SSH access

The bootstrap node, e.g. test1-bootstrap.tt.testing, runs the bootstrap process. You can watch it:

ssh "core@${CLUSTER_NAME}-bootstrap.${BASE_DOMAIN}"
sudo journalctl -f -u bootkube -u openshift

You'll have to wait for etcd to reach quorum before this makes any progress.

Using the domain names above will only work if you set up the DNS overlay or have otherwise configured your system to resolve cluster domain names. Alternatively, if you didn't set up DNS on the host, you can use:

virsh -c "${LIBVIRT_URI}" domifaddr "${CLUSTER_NAME}-master-0"  # prints the master's IP; use it as MASTER_IP
ssh "core@${MASTER_IP}"

Here LIBVIRT_URI is the libvirt connection URI which you passed to the installer.
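The master IP can also be captured into a variable instead of copied by hand. This is a sketch: the helper name `first_ipv4` is hypothetical, and it assumes the usual `virsh domifaddr` table layout, where the third column is the protocol and the fourth is the address with its prefix length.

```shell
#!/usr/bin/env bash
# Read `virsh domifaddr` output on stdin and print the first IPv4 address,
# with the trailing /prefix removed.
first_ipv4() {
  awk '$3 == "ipv4" { sub(/\/.*/, "", $4); print $4; exit }'
}

# Usage against a running cluster:
#   MASTER_IP=$(virsh -c "${LIBVIRT_URI}" domifaddr "${CLUSTER_NAME}-master-0" | first_ipv4)
#   ssh "core@${MASTER_IP}"
```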

Inspect the cluster with kubectl

You'll need a kubectl binary on your path and the kubeconfig that the installer generated when it created your cluster.

export KUBECONFIG="${DIR}/auth/kubeconfig"
kubectl get --all-namespaces pods

Alternatively, you can run kubectl from the bootstrap or master nodes. Use scp or similar to transfer your local ${DIR}/auth/kubeconfig, then SSH in and run:

export KUBECONFIG=/where/you/put/your/kubeconfig
kubectl get --all-namespaces pods

FAQ

Libvirt vs. AWS

  1. There isn't a load balancer on libvirt.

Troubleshooting

If following the above steps hasn't quite worked, please review this section for well-known issues.

Install throws an Unable to resolve address 'localhost' error

If you're seeing an error similar to

Error: Error refreshing state: 1 error(s) occurred:

* provider.libvirt: virError(Code=38, Domain=7, Message='Unable to resolve address 'localhost' service '-1': Servname not supported for ai_socktype')


FATA[0019] failed to run Terraform: exit status 1

it is likely that your install configuration contains three forward slashes after the protocol (e.g. qemu+tcp:///...), when it should only contain two.
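A quick way to catch this mistake before running the installer is to test the URI string itself. The sketch below uses a hypothetical function name, `check_libvirt_uri`, and only looks at the qemu+tcp form discussed here; a third slash directly after the protocol leaves the host part of the URI empty.

```shell
#!/usr/bin/env bash
# Classify a libvirt connection URI: succeed only for qemu+tcp URIs that name a host.
check_libvirt_uri() {
  case "$1" in
    qemu+tcp:///*) echo "bad: no host between '//' and the path in $1"; return 1 ;;
    qemu+tcp://?*) echo "ok: $1"; return 0 ;;
    *)             echo "unrecognized: $1"; return 1 ;;
  esac
}
```

For example, `check_libvirt_uri "qemu+tcp://192.168.124.1/system"` succeeds, while the same URI with three slashes and no host is rejected. The address here is the example host address used in the iptables rule above and may differ on your machine.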

SELinux might prevent access to image files

Configuring the storage pool to store images in a path incompatible with the SELinux policies (e.g. your home directory) might lead to the following errors:

Error: Error applying plan:

1 error(s) occurred:

* libvirt_domain.etcd: 1 error(s) occurred:

* libvirt_domain.etcd: Error creating libvirt domain: virError(Code=1, Domain=10, Message='internal error: process exited while connecting to monitor: 2018-07-30T22:52:54.865806Z qemu-kvm: -fw_cfg name=opt/com.coreos/config,file=/home/user/VirtualMachines/etcd.ign: can't load /home/user/VirtualMachines/etcd.ign')

As described here, you can work around this by disabling SELinux or by storing the images in a location known to work, e.g. the default pool.

Random domain creation errors due to libvirt race condition

Depending on your libvirt version you might encounter a race condition leading to an error similar to:

* libvirt_domain.master.0: Error creating libvirt domain: virError(Code=43, Domain=19, Message='Network not found: no network with matching name 'test1'')

This is also being tracked on the libvirt-terraform-provider but is likely not fixable on the client side, which is why you should upgrade libvirt to >=4.5 or a patched version, depending on your environment.
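To see whether your installation meets the 4.5 threshold, you can compare version strings. This is a sketch: the function name `version_ge` is hypothetical, it relies on GNU `sort -V`, and it checks the version reported by the virsh client, which is assumed here to match the libvirt daemon installed on the same host.

```shell
#!/usr/bin/env bash
# Return 0 if version $1 is greater than or equal to version $2.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

if command -v virsh >/dev/null 2>&1; then
  installed=$(virsh --version)
  if version_ge "$installed" 4.5; then
    echo "libvirt $installed meets the 4.5 minimum"
  else
    echo "libvirt $installed is older than 4.5; consider upgrading or using a patched build"
  fi
fi
```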

MacOS support currently broken

Error with firewall initialization on Arch Linux

If you're on Arch Linux and get an error similar to

libvirt: “Failed to initialize a valid firewall backend”

or

error: Failed to start network default
error: internal error: Failed to initialize a valid firewall backend

please check out this thread on superuser.

GitHub Issue Tracker

You might find other reports of your problem in the Issues tab for this repository, where we ask you to add any additional information you have. If your issue has not been reported yet, please report it.