Linux Containers for Dummies
LXC (LinuX Containers) is an operating system–level virtualization method for running multiple isolated Linux systems (containers) on a single host.
The Linux kernel provides cgroups (control groups) to limit and account for resources (CPU, memory, block I/O, network, etc.) without starting any virtual machines. The kernel also provides namespaces, which isolate an application’s view of the operating environment, including process trees, network, user IDs and mounted file systems.
LXC combines cgroups and namespace support to provide an isolated environment for applications.
Isolation with namespaces
· pid
· mnt
· net
· uts
· ipc
· user
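The namespaces listed above are visible per process under /proc. The following is a minimal sketch, assuming a kernel that exposes /proc/<pid>/ns; the function name list_namespaces is hypothetical and not part of LXC.

```shell
#!/bin/sh
# list_namespaces [PID]
# Prints "name id" for each namespace of PID (default: the current shell).
list_namespaces() {
    pid=${1:-$$}
    for link in /proc/"$pid"/ns/*; do
        [ -L "$link" ] || continue   # skip if the glob matched nothing
        # Each entry is a symbolic link whose target names the namespace.
        printf '%s %s\n' "${link##*/}" "$(readlink "$link")"
    done
}
```

Running it once on the host and once against a process inside a container should show different identifiers for every namespace the container does not share with the host.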
Isolation with cgroups
· memory
· cpu
· blkio
· devices
1. Kernel for LXC
Pick any Linux kernel at version 3.3 or later and ensure the following options are set, either through ‘make menuconfig’ or directly in the .config file before compiling the kernel.
filename: linux-3.3/.config
CONFIG_USER_SCHED=n
CONFIG_CGROUPS=y
CONFIG_CGROUP_SCHED=y
CONFIG_CGROUP_DEBUG=y
CONFIG_CGROUP_NS=y
CONFIG_CGROUP_FREEZER=y
CONFIG_CGROUP_DEVICE=y
CONFIG_CPUSETS=y
CONFIG_PROC_PID_CPUSET=y
CONFIG_CGROUP_CPUACCT=y
CONFIG_RESOURCE_COUNTERS=y
CONFIG_NAMESPACES=y
CONFIG_UTS_NS=y
CONFIG_IPC_NS=y
CONFIG_USER_NS=y
CONFIG_PID_NS=y
CONFIG_NET_NS=y
CONFIG_VLAN_8021Q=y
CONFIG_VLAN_8021Q_GVRP=n
CONFIG_MACVLAN=y
CONFIG_VETH=y
CONFIG_DEVPTS_MULTIPLE_INSTANCES=y
CONFIG_SECURITY_FILE_CAPABILITIES=y
CONFIG_FAIR_GROUP_SCHED=y
CONFIG_RT_GROUP_SCHED=n
CONFIG_BLK_CGROUP=n
CONFIG_MACVTAP=n
CONFIG_CFS_BANDWIDTH=y
CONFIG_NETPRIO_CGROUP=y
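Checking a long option list by eye is error-prone, so a small script can do it. This is a sketch only: the function name check_lxc_config is hypothetical, and it treats an option as present only when the file contains NAME=y, matching the way the options are listed above.

```shell
#!/bin/sh
# check_lxc_config FILE OPTION...
# Prints "ok NAME" when FILE contains NAME=y and "missing NAME" otherwise.
# Returns non-zero if at least one option is missing.
check_lxc_config() {
    config_file=$1
    shift
    missing=0
    for opt in "$@"; do
        if grep -q "^${opt}=y\$" "$config_file"; then
            echo "ok $opt"
        else
            echo "missing $opt"
            missing=1
        fi
    done
    return "$missing"
}

# Example with a few of the options listed above:
# check_lxc_config linux-3.3/.config CONFIG_CGROUPS CONFIG_NAMESPACES CONFIG_NET_NS CONFIG_VETH
```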
2. Compile lxc sources
Download the LXC sources from https://linuxcontainers.org/downloads/ and pick any version (lxc-x.x.x.tar.gz).
[srinivasc@agni]# tar xvf lxc-1.0.4.tar.gz
[srinivasc@agni]# cd lxc-1.0.4
[srinivasc@agni]# export CC=mipsel-linux-uclibc-gcc; ./configure --host=mipsel
[srinivasc@agni]# make
[srinivasc@agni]# ls * | grep lxc-*
lxc-create lxc-start lxc-execute lxc-console lxc-stop lxc-freeze lxc-ps lxc-monitor lxc-unfreeze lxc-info ..
Copy these binaries to the /usr/bin, /sbin or /bin directory of the root file system on the target that runs the kernel compiled above.
3. LXC and networking
Bridge set-up and linux host interface (eth0) configuration on target machine (MIPS)
bridge-start
brctl show
brctl addbr host0
ifconfig eth0 192.168.10.2 up
route add default gw 10.140.47.254 eth0
ifconfig host0 192.168.10.1 up
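The bridge gives containers something to attach to. The following sketch prints container-side network settings for it, assuming the LXC 1.0-style lxc.network.* keys and a veth pair. The function name write_veth_config and the address 192.168.10.3/24 are hypothetical; pick an address that suits the bridge's subnet.

```shell
#!/bin/sh
# write_veth_config BRIDGE ADDRESS
# Prints LXC 1.0-style settings that attach a container's veth
# interface to BRIDGE and give it ADDRESS (an assumed, example value).
write_veth_config() {
    cat <<EOF
lxc.network.type = veth
lxc.network.flags = up
lxc.network.link = $1
lxc.network.name = eth0
lxc.network.ipv4 = $2
EOF
}

# Example: append the settings to a container configuration file.
# write_veth_config host0 192.168.10.3/24 >> config
```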
4. Running Application Containers
We can use the lxc-execute command to create an application container (name: guest) in which we can run a command that is effectively isolated from the rest of the system.
For example, the following command creates an application container named guest that runs sleep for 300 seconds.
[srinivasc@agni]# lxc-execute -n guest -- sleep 300
While the container is active, we can monitor it by running commands such as lxc-ls --active and lxc-info -n guest from another window.
[srinivasc@agni]# lxc-ls --active
guest
[srinivasc@agni]# lxc-info -n guest
state: RUNNING
pid: 4021
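For scripting, the "key: value" lines printed by lxc-info can be picked apart with awk. This is a sketch under the assumption that the output keeps that shape; the function name lxc_info_field is hypothetical, and it ignores case because the key spelling can differ between LXC versions.

```shell
#!/bin/sh
# lxc_info_field KEY
# Reads "key: value" lines on stdin and prints the value for KEY.
# Key matching ignores case; only the first match is printed.
lxc_info_field() {
    awk -F: -v key="$1" '
        tolower($1) == tolower(key) {
            sub(/^[^:]*:[ \t]*/, "")   # drop "key:" and leading blanks
            print
            exit
        }'
}

# Example: print the PID of the container's init process.
# lxc-info -n guest | lxc_info_field pid
```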
If we need to customize an application container, we can use a configuration file.
For example, we might want to change the container’s network configuration or the system directories that it mounts.
The following example shows settings from a sample configuration file where the rootfs is mostly not shared except for mount entries to ensure that lxc-init and certain library and binary directory paths are available.
lxc.utsname = guest
lxc.tty = 1
lxc.pts = 1
lxc.rootfs = /tmp/guest/rootfs
lxc.mount.entry=/lib /tmp/guest/rootfs/lib none ro,bind 0 0
lxc.mount.entry=/usr/libexec /tmp/guest/rootfs/usr/lib none ro,bind 0 0
lxc.mount.entry=/lib64 /tmp/guest/rootfs/lib64 none ro,bind 0 0
lxc.mount.entry=/usr/lib64 /tmp/guest/rootfs/usr/lib64 none ro,bind 0 0
lxc.mount.entry=/bin /tmp/guest/rootfs/bin none ro,bind 0 0
lxc.mount.entry=/usr/bin /tmp/guest/rootfs/usr/bin none ro,bind 0 0
lxc.cgroup.cpuset.cpus=1
The mount entry for /usr/libexec is required so that the container can access /usr/libexec/lxc/lxc-init on the host system.
The example configuration file mounts both /bin and /usr/bin. In practice, we should limit the host system directories that an application container mounts to only those directories that the container needs to run the application.
Note
To avoid potential conflict with system containers, do not use the /container directory for application containers.
We must also create the required directories under the rootfs directory:
[srinivasc@agni]# TMPDIR=/tmp/guest/rootfs
[srinivasc@agni]# mkdir -p $TMPDIR/lib $TMPDIR/usr/lib $TMPDIR/lib32 $TMPDIR/usr/lib32 \
$TMPDIR/bin $TMPDIR/usr/bin $TMPDIR/dev/pts $TMPDIR/dev/shm $TMPDIR/proc
In this example, the directories include /dev/pts, /dev/shm, and /proc in addition to the mount point entries defined in the configuration file.
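Every bind mount needs its target directory to exist, so one way to keep the directory list in step with the configuration file is to derive it from the lxc.mount.entry lines. The following is a minimal sketch; the function name make_mount_points is hypothetical, and it assumes absolute mount targets as in the example configuration.

```shell
#!/bin/sh
# make_mount_points CONFIG
# For every "lxc.mount.entry = SRC DST ..." line in CONFIG, create DST.
# Relative targets are reported and skipped.
make_mount_points() {
    sed -n 's/^lxc\.mount\.entry[[:space:]]*=[[:space:]]*//p' "$1" |
    while read -r src dst rest; do
        case $dst in
            /*) mkdir -p "$dst" && echo "created $dst" ;;
            *)  echo "skipped relative target: $dst" >&2 ;;
        esac
    done
}

# Example:
# make_mount_points config
```

Directories that are not mount targets, such as dev/pts, dev/shm and proc, still have to be created separately.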
We can then use the -f option to specify the configuration file (config) to lxc-execute:
[srinivasc@agni]# lxc-execute -n guest -f config -- ps -ef
UID PID PPID C STIME TTY TIME CMD
0 1 0 0 07:36 ? 00:00:00 /usr/lib/lxc/lxc-init -- ps -ef
0 2 1 0 08:46 ? 00:00:00 ps -ef
This example shows that the ps command runs as a child of lxc-init.
As with system containers, we can set cgroup entries in the configuration file and use the lxc-cgroup command to control the system resources to which an application container has access.
Note
lxc-execute is intended to run application containers that share the host’s root file system, and not to run system containers that we create using lxc-create. Use lxc-start to run system containers.
5. Sample Configuration & logs
Filename: /etc/fstab
cgroup /cgroup cgroup defaults 0 0
Filename: /etc/lxc/guest/config
# Default pivot location
lxc.pivotdir = lxc_putold
# Default mount entries
lxc.mount.entry = proc proc proc nodev,noexec,nosuid 0 0
lxc.mount.entry = sysfs sys sysfs defaults 0 0
lxc.mount.entry = /sys/fs/fuse/connections sys/fs/fuse/connections none bind,optional 0 0
# Default console settings
lxc.tty = 4
lxc.pts = 1024
# Default capabilities
lxc.cap.drop = sys_module mac_admin mac_override sys_time
# When using LXC with apparmor, the container will be confined by default.
# If you wish for it to instead run unconfined, copy the following line
# (uncommented) to the container's configuration file.
#lxc.aa_profile = unconfined
# To support container nesting on an Ubuntu host while retaining most of
# apparmor's added security, use the following two lines instead.
#lxc.aa_profile = lxc-container-default-with-nesting
#lxc.hook.mount = /usr/share/lxc/hooks/mountcgroups
# If you wish to allow mounting block filesystems, then use the following
# line instead, and make sure to grant access to the block device and/or loop
# devices below in lxc.cgroup.devices.allow.
#lxc.aa_profile = lxc-container-default-with-mounting
# Default cgroup limits
lxc.cgroup.devices.deny = a
## Allow any mknod (but not using the node)
lxc.cgroup.devices.allow = c *:* m
lxc.cgroup.devices.allow = b *:* m
## /dev/null and zero
lxc.cgroup.devices.allow = c 1:3 rwm
lxc.cgroup.devices.allow = c 1:5 rwm
## consoles
lxc.cgroup.devices.allow = c 5:0 rwm
lxc.cgroup.devices.allow = c 5:1 rwm
## /dev/{,u}random
lxc.cgroup.devices.allow = c 1:8 rwm
lxc.cgroup.devices.allow = c 1:9 rwm
## /dev/pts/*
lxc.cgroup.devices.allow = c 5:2 rwm
lxc.cgroup.devices.allow = c 136:* rwm
## rtc
lxc.cgroup.devices.allow = c 254:0 rm
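Each lxc.cgroup.devices.allow entry names a device by type (c or b), major:minor number and access flags. When another device has to be whitelisted, the numbers can be read from the device node itself. The following is a sketch only: the function name device_allow_line is hypothetical, and it assumes a stat that supports -c with the %t and %T formats (GNU coreutils and BusyBox both do).

```shell
#!/bin/sh
# device_allow_line NODE [ACCESS]
# Prints an lxc.cgroup.devices.allow line for the device node NODE.
# ACCESS defaults to rwm (read, write, mknod).
device_allow_line() {
    node=$1
    access=${2:-rwm}
    if [ -c "$node" ]; then
        kind=c
    elif [ -b "$node" ]; then
        kind=b
    else
        echo "$node is not a device node" >&2
        return 1
    fi
    # stat reports the device major and minor numbers in hexadecimal.
    major=$(stat -L -c '%t' "$node")
    minor=$(stat -L -c '%T' "$node")
    printf 'lxc.cgroup.devices.allow = %s %d:%d %s\n' \
        "$kind" "0x$major" "0x$minor" "$access"
}

# Example:
# device_allow_line /dev/null
```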