---
SPDX-FileCopyrightText: © 2026 Menacit AB <[email protected]>
SPDX-License-Identifier: CC-BY-SA-4.0
title: "Virtualisation course: OS-level virtualisation technology"
author: Joel Rangsmo <[email protected]>
footer: © Course authors (CC BY-SA 4.0)
description: Overview of features and technology which enables OS-level virtualisation on Linux
keywords:
color: "#ffffff"
class:
style: |
  section.center {
    text-align: center;
  }
---
Unlike in some other operating systems, there is no such thing as an "OS-level VM" in the Linux kernel.
Gluing together features like chroot, namespaces and cgroups creates the illusion of one.
This functionality has other neat use-cases besides virtualisation.
No worries if you don't grok all of this!
- chroot: introduced in UNIX during the 70s
- "Change file system root" for a process and its children
- Not designed as a security feature
$ sudo debootstrap buster my_debian_root http://deb.debian.org/debian/
I: Retrieving InRelease
I: Resolving dependencies of required packages...
I: Retrieving libacl1 2.2.53-4
[...]
I: Configuring libc-bin...
I: Base system installed successfully.
$ ls my_debian_root/
bin boot dev etc home lib lib32 lib64 [...]
$ cat /etc/os-release | grep -F PRETTY_NAME
PRETTY_NAME="Ubuntu 22.04.1 LTS"
$ sudo chroot my_debian_root /bin/bash
root@node-1:/# cat /etc/os-release | grep -F PRETTY_NAME
PRETTY_NAME="Debian GNU/Linux 10 (buster)"
root@node-1:/# apt-get install fortune
root@node-1:/# fortune
All generalizations are false, including this one.
-- Mark Twain
root@node-1:/# dmesg | wc -l
1002
root@node-1:/# ps -xa | grep password
/bin/pacemaker loop --password G0d!
root@node-1:/# tcpdump -i eth0
tcpdump: listening on eth0
[...]
160 packets captured
root@node-1:/# reboot
Namespaces: "Functionality to partition a group of processes' view of the system".
Host can see through all, members can't ("one-way mirror").
We'll focus on "process", "network" and "user".
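Namespace membership can be inspected through the symlinks under `/proc/<pid>/ns`. A minimal sketch, assuming a Linux `/proc`; the helper names `ns_id` and `same_ns` and the `PROC_ROOT` override are hypothetical and only there for illustration:

```shell
# Hypothetical helpers, not part of any standard tool.
# PROC_ROOT defaults to /proc; the override only exists to ease testing.

# Print the namespace identifier of a process, e.g. "pid:[4026531836]".
# $1 = PID, $2 = namespace type (pid, net, user, mnt, ...)
ns_id() {
    readlink "${PROC_ROOT:-/proc}/$1/ns/$2"
}

# Succeed if two processes are members of the same namespace.
# $1 = first PID, $2 = second PID, $3 = namespace type
# Returns 2 if either namespace link could not be read.
same_ns() {
    a=$(ns_id "$1" "$3") || return 2
    b=$(ns_id "$2" "$3") || return 2
    [ "$a" = "$b" ]
}
```

For example, `same_ns $$ 1 net && echo shared` compares the current shell with PID 1. Reading the entries of processes owned by other users usually requires root.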
Process (PID) namespaces: members get their own view of the process tree.
$ ps -e | head -n 4
PID TTY TIME CMD
1 ? 00:00:01 systemd
2 ? 00:00:00 kthreadd
3 ? 00:00:00 rcu_gp
$ sudo chroot my_debian_root ps -e | head -n 4
PID TTY TIME CMD
1 ? 00:00:01 systemd
2 ? 00:00:00 kthreadd
3 ? 00:00:00 rcu_gp
$ sudo unshare --fork --pid -- chroot my_debian_root /bin/bash
root@node-1:/# ps -e
PID TTY TIME CMD
1 ? 00:00:00 bash
2 ? 00:00:00 ps
Network namespaces: a separate "network stack" for member processes.
Assign virtual/physical NICs to each namespace.
Example use-cases unrelated to virtualisation:
- Configure per-application firewall rules
- Force cherry-picked services through a VPN
- Handle overlapping network segments
Root in a chroot is still root on the host.
Lots of things, such as package managers, expect root privileges but don't really need them.
User namespaces give members their own set of user and group IDs.
Processes believe that they run as root, but not from the host kernel's perspective.
(not used by all virtualisation tools, scary!)
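The translation between IDs inside and outside a user namespace is exposed in `/proc/<pid>/uid_map`, one range per line: first ID inside, first ID outside, range length. The sketch below resolves an inside UID to the outside one; the function name `map_uid` is hypothetical, and the "outside" value is relative to the namespace of whoever reads the file:

```shell
# Hypothetical helper: translate a UID as seen inside a user namespace
# to the UID seen from outside, based on a uid_map file.
# $1 = UID inside the namespace
# $2 = path to a uid_map file (default: the calling process' own map)
# Exits non-zero if the UID is not covered by any mapped range.
map_uid() {
    awk -v id="$1" '
        # Fields: <first inside ID> <first outside ID> <range length>
        id >= $1 && id < $1 + $3 { print $2 + (id - $1); found = 1; exit }
        END { if (!found) exit 1 }
    ' "${2:-/proc/self/uid_map}"
}
```

With a map line such as `0 100000 65536`, `map_uid 0 FILE` resolves "root" inside the namespace to an unprivileged UID on the host.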
cgroups: members' usage of system resources (CPU, memory, disk I/O, etc.) can be limited.
Used together with CRIU for live migration.
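Which cgroup a process belongs to, and the limits applied to it, can be read from plain files. A rough sketch, assuming cgroup v2 mounted at `/sys/fs/cgroup` with the memory controller enabled; the function names and the `CGROUP_ROOT` override are hypothetical:

```shell
# Hypothetical helpers for inspecting cgroup v2 membership and limits.

# Print the cgroup v2 path of a process from its /proc/<pid>/cgroup file.
# cgroup v2 entries look like "0::/user.slice/...".
# $1 = path to a cgroup file (default: the calling process' own)
cgroup2_path() {
    sed -n 's/^0:://p' "${1:-/proc/self/cgroup}"
}

# Print the memory limit set directly on that cgroup, in bytes.
# The literal value "max" means no limit is set at this level.
# CGROUP_ROOT defaults to /sys/fs/cgroup; the override eases testing.
memory_limit() {
    cat "${CGROUP_ROOT:-/sys/fs/cgroup}$(cgroup2_path "$1")/memory.max"
}
```

Note that the root cgroup has no `memory.max` file, and that a limit may also be inherited from a parent cgroup, which this sketch does not walk.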
seccomp: limit which system calls can be used, and how.
Some syscalls allow breaking out of isolation.
Minimise the attack surface of the shared kernel.
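Whether a process runs under a seccomp filter is visible in the `Seccomp:` field of `/proc/<pid>/status` (0 = disabled, 1 = strict, 2 = filter). A small sketch that decodes it; the function name `seccomp_mode` is hypothetical:

```shell
# Hypothetical helper: print the seccomp mode of a process in words.
# $1 = path to a status file (default: the calling process' own)
seccomp_mode() {
    awk '$1 == "Seccomp:" {
        if ($2 == 0) print "disabled"
        else if ($2 == 1) print "strict"
        else if ($2 == 2) print "filter"
        else print "unknown"
    }' "${1:-/proc/self/status}"
}
```

Running `seccomp_mode` on the host and again inside a container is a quick way to check whether the container tool applies a filter at all.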
Capabilities: originally developed to make root less omnipotent.
Caps like "NET_BIND_SERVICE" and "SYS_CHROOT" can be given to non-root users.
Not very fine-grained, and some are known to be unsafe.
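The capabilities a process currently holds are exposed as a hexadecimal bitmask in the `CapEff:` field of `/proc/<pid>/status`. A minimal sketch for testing a single bit; the function name `has_cap` is hypothetical, and the bit numbers come from the kernel's capability list (10 for NET_BIND_SERVICE, 18 for SYS_CHROOT):

```shell
# Hypothetical helper: succeed if a capability bit is set in the
# effective capability set of a process.
# $1 = capability bit number (e.g. 10 = NET_BIND_SERVICE, 18 = SYS_CHROOT)
# $2 = path to a status file (default: the calling process' own)
# Returns 2 if no CapEff field could be found.
has_cap() {
    hex=$(awk '$1 == "CapEff:" { print $2 }' "${2:-/proc/self/status}")
    [ -n "$hex" ] || return 2
    # Shift the wanted bit into the lowest position and mask it out.
    [ $(( (0x$hex >> $1) & 1 )) -eq 1 ]
}
```

For example, `has_cap 18 && echo "may call chroot"` checks the current shell. Comparing the result as root on the host with the result inside a container shows how many capabilities the container tool has dropped.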
These and other features make up the beautiful mess we call OS-level virtualisation on Linux!