LXC configuration and troubleshooting

Most LXC users will run into the fairly common and mysterious 'cgroup error' when starting containers. This is a catchall error and doesn't really say anything. The LXC logs are also frequently unhelpful.

However, once you become familiar with the various moving parts in LXC, it's fairly easy to identify what could be wrong. The first thing to do is to launch the container in the foreground, without the -d flag, so you can see the container boot in the terminal and get more information on the error.
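
For instance, for a container named 'deb64' (the name used in the examples later in this guide), you can start it in the foreground, and optionally raise the log level and write a log file for more detail.

lxc-start -n deb64
lxc-start -n deb64 -l DEBUG -o /tmp/deb64.log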

Let's take a quick overview of the typical LXC installation and where the configuration files are.

/var/lib/lxc - this is where LXC containers are located.
/var/lib/lxc/containername/config - this is the location of the individual container configuration file
/var/lib/lxc/containername/fstab - this is the fstab file of individual containers used to mount host folders in containers

/etc/lxc/lxc.conf - you can define an alternative lxc directory with the lxc.lxcpath variable in this file
/etc/lxc/default.conf - this typically defines the default lxc.network type and settings used when creating containers with lxc-create
/etc/lxc/dnsmasq.conf - this is mainly used to configure Dnsmasq to assign static IPs to containers
/etc/default/lxc - this defines whether the default lxc bridge is used and is mainly used by the lxc and lxc-net init scripts
/etc/init.d/lxc - this is used to autostart lxc containers as per settings in the individual containers
/etc/init.d/lxc-net - this starts the default lxcbr0 network bridge and sets up container networking and internet access for containers.

/etc/fstab - this is where the cgroup filesystem is mounted (not applicable if your distribution uses cgroups-lite or cgmanager like Ubuntu)
/etc/default/grub - this is where cgroups memory support is enabled by a grub flag to the kernel

/usr/local/share/lxc/templates - this is where the container OS templates are stored 
/usr/local/share/lxc/config - this is where the various default container configurations are derived from

As you can see there are a lot of ways for things to go wrong. But in most scenarios with a default installation things are robust and work smoothly.

Let's take a look at some common reasons for cgroup errors and their solutions.

Cgroups in fstab is not set

Without cgroups mounted LXC containers will not be able to start. In Ubuntu cgroups mounting is managed by cgmanager or cgroups-lite. In Debian it's managed by mounting cgroups in the /etc/fstab file. Please ensure this line exists in your system fstab.

cgroup /sys/fs/cgroup cgroup defaults 0 0

Then check that cgroups are actually mounted by looking at /sys/fs/cgroup. If it's not mounted, mount it with the command below.

mount /sys/fs/cgroup
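
You can quickly confirm the cgroup filesystem is mounted with

mount | grep cgroup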

Cgroups memory is not set
You will usually get a cgroup error if you are using a container that has cgroup memory limits set, or are trying to use cgroups to limit container memory usage. To use the cgroups memory controller you need to enable it in the kernel via a GRUB command line setting.

So head to /etc/default/grub in Debian/Ubuntu, or your GRUB config in other distributions, and add or append this.

GRUB_CMDLINE_LINUX_DEFAULT="cgroup_enable=memory swapaccount=1"
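
On Debian/Ubuntu, regenerate the grub configuration and reboot for the flag to take effect. You can then confirm the memory controller is enabled by checking the memory line in /proc/cgroups.

update-grub
reboot
cat /proc/cgroups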

LXC networking is not up
This can be a frequent cause of errors and containers failing to start. The default LXC bridge lxcbr0 may not be up, preventing containers from starting their networking and getting IPs. First check if the bridge is up

brctl show

This will show you the bridges available. If lxcbr0 is missing you need to check your lxc-net service is up and working properly.

service --status-all

If lxc-net is not running, start the lxc-net service

service lxc-net start

This should start the lxc-net service and the lxcbr0 bridge.
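
To confirm the network is fully up, check that the bridge has the expected address (10.0.3.1 with the default settings) and that a dnsmasq instance is running for it.

ip addr show lxcbr0
ps aux | grep dnsmasq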

Dnsmasq related errors
Another common reason for networking to fail is if Dnsmasq is not able to bind to the lxcbr0 interface. This happens if the user already has Dnsmasq installed but configured to bind to all interfaces, which is its default configuration. With this configuration the lxc-net script will not be able to bring up the lxcbr0 interface as Dnsmasq will fail to bind to it and DHCP will not work.

Dnsmasq is a great little program used extensively in most distributions and by a large number of programs including Libvirt, OpenStack etc, so it's more likely than not that you have Dnsmasq installed. A lot of programs also use dnsmasq-base to run their own instance rather than the system daemon, hence the availability of both the dnsmasq-base and dnsmasq(full) packages in most Linux distributions.

What programs like Libvirt do, if Dnsmasq(full) is already installed, is drop an exception in /etc/dnsmasq.d/ telling it not to bind to the Libvirt 'virbr0' interface, avoiding the error. The Flockport LXC installer package does the same for the lxcbr0 interface.

nano /etc/dnsmasq.d/lxc

bind-interfaces
except-interface=lxcbr0
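
With the exception file in place, restart Dnsmasq first and then lxc-net so the lxcbr0 bridge and its dnsmasq instance can come up cleanly.

service dnsmasq restart
service lxc-net restart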

This is basically a minefield, as having Dnsmasq(full) configured to bind to all interfaces by default is not a good idea; should one of those interfaces have a public IP, you are essentially running an open DNS resolver that is open to abuse in DDoS attacks.

If you don't have Dnsmasq(full) installed then this error will not happen. If you are using Dnsmasq(full) then it's a good idea to configure it to bind to a specific interface; even a dummy interface name like 'abc' will do to prevent it from binding to all interfaces.

Look for the 'interface=' line in your /etc/dnsmasq.conf file, set it to a placeholder interface, and uncomment bind-interfaces.

interface=abc
bind-interfaces

LXC itself is not complex, but some of the pieces it depends on in Linux are, and most programs face this. This kind of issue is difficult to detect and track down without days of work.

Another Dnsmasq error that could crop up is if there is no dnsmasq user. The Dnsmasq(full) package creates a dnsmasq user by default. The dnsmasq-base package often does not, depending on the distribution, version etc, so when using LXC it's a good idea to check and create a dnsmasq user if none exists.

grep dnsmasq /etc/passwd
adduser --system --home /var/lib/misc --gecos "dnsmasq" --no-create-home --disabled-password --quiet dnsmasq

The Flockport Debian LXC package checks these issues and resolves them during install so users should not face these issues.

Container configuration file errors
This is how a typical LXC container configuration file looks. It's located in the /var/lib/lxc/containername folder. In this case the container name is 'deb64'.

lxc.mount = /var/lib/lxc/deb64/fstab
lxc.mount.entry = proc proc proc nodev,noexec,nosuid 0 0
lxc.mount.entry = sysfs sys sysfs defaults 0 0
lxc.mount.entry = /sys/fs/fuse/connections sys/fs/fuse/connections none bind,optional 0 0
lxc.tty = 4
lxc.pts = 1024
lxc.cgroup.devices.deny = a
lxc.cgroup.devices.allow = c *:* m
lxc.cgroup.devices.allow = b *:* m
lxc.cgroup.devices.allow = c 1:3 rwm
lxc.cgroup.devices.allow = c 1:5 rwm
lxc.cgroup.devices.allow = c 5:0 rwm
lxc.cgroup.devices.allow = c 5:1 rwm
lxc.cgroup.devices.allow = c 1:8 rwm
lxc.cgroup.devices.allow = c 1:9 rwm
lxc.cgroup.devices.allow = c 5:2 rwm
lxc.cgroup.devices.allow = c 136:* rwm
lxc.cgroup.devices.allow = c 254:0 rm
lxc.cgroup.devices.allow = c 10:229 rwm
lxc.cgroup.devices.allow = c 10:200 rwm
lxc.cgroup.devices.allow = c 1:7 rwm
lxc.cgroup.devices.allow = c 10:228 rwm
lxc.cgroup.devices.allow = c 10:232 rwm
lxc.cgroup.cpuset.cpus = 1
lxc.cgroup.memory.limit_in_bytes = 1G
lxc.cgroup.memory.memsw.limit_in_bytes = 1G
lxc.utsname = deb64
lxc.network.type = veth
lxc.network.flags = up
lxc.network.link = lxcbr0
lxc.network.name = eth0
lxc.network.hwaddr = 00:16:3e:0a:83:c5
lxc.network.mtu = 1500
lxc.cap.drop = sys_module
lxc.cap.drop = mac_admin
lxc.cap.drop = mac_override
lxc.cap.drop = sys_time
lxc.rootfs = /var/lib/lxc/deb64/rootfs

Pay attention to the 'lxc.mount' and 'lxc.rootfs' locations and the network settings defined, to ensure they match your current container location and network.
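
If you suspect the cgroup limits in the config are the problem, you can comment them out and retry, or query and change a value on a running container with lxc-cgroup. For example, with the 'deb64' container above, the first command reads the memory limit and the second sets it.

lxc-cgroup -n deb64 memory.limit_in_bytes
lxc-cgroup -n deb64 memory.limit_in_bytes 512M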

Also, sometimes container configurations may be shortened and have an 'lxc.include' line pointing to the default configuration file for the specific container OS, like below.

# Template used to create this container: /usr/local/share/lxc/templates/lxc-download
# Parameters passed to the template: -d ubuntu -r precise -a i386
# For additional config options, please look at lxc.conf(5)

# Distribution configuration
lxc.include = /usr/local/share/lxc/config/ubuntu.common.conf
lxc.arch = x86

# Container specific configuration
lxc.rootfs = /var/lib/lxc/ubuntu32/rootfs
lxc.utsname = ub32

# Network configuration
lxc.network.type = veth
lxc.network.flags = up
lxc.network.link = lxcbr0
lxc.network.name = eth0
lxc.network.hwaddr = 00:16:3e:3c:f5:c2
lxc.network.mtu = 1500

Ensure the lxc.include config points to the right location for the template.
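
A quick sanity check is to confirm the included file actually exists at that path on your system, for instance for the example above.

ls /usr/local/share/lxc/config/ubuntu.common.conf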

Tip: For container portability it's more useful to have the full container configuration in the config file, which Flockport does.

lxc-start errors

lxc-start: Executing '/sbin/init' with no configuration file may crash the host

This is a basic LXC error and means the container name provided with the lxc-start command does not exist. This is usually the result of a typo, or the container not being in the LXC folder /var/lib/lxc. Recheck the container name and check that the container is in the LXC folder with the lxc-ls -f command.
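
For example, list the containers LXC can see along with their state, then start the container in the foreground using its exact name (substituting your container name for 'deb64').

lxc-ls -f
lxc-start -n deb64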

Looking at the lxc-net script
The lxc-net script is responsible for bringing up the network for containers. Typically it sets up the default LXC bridge lxcbr0 and some iptables rules so containers have access to the internet. You don't have to use the lxc-net enabled network; you can use your own bridge by simply editing the individual container config file with your bridge name.

Let's look at some of the options available in the script. The lxc-net script is located at /etc/init.d/ or /etc/init/ in Ubuntu. Open it up in a text editor for reference.

# set up the lxc network
USE_LXC_BRIDGE="false"
LXC_BRIDGE="lxcbr0"
LXC_ADDR="10.0.3.1"
LXC_NETMASK="255.255.255.0"
LXC_NETWORK="10.0.3.0/24"
LXC_DHCP_RANGE="10.0.3.2,10.0.3.254"
LXC_DHCP_MAX="253"
LXC_DHCP_CONFILE="/etc/lxc/dnsmasq.conf"
varrun="/var/run/lxc"
LXC_DOMAIN="lxc"

The options for the LXC bridge name, subnet, network and DHCP range are set here. These are configurable and can be changed from the defaults when required.

The LXC_DHCP_CONFILE variable refers to the dnsmasq.conf in /etc/lxc that sets up static IPs for containers. Here you can associate containers with specific IPs if required.
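
For instance, to pin containers to specific IPs you can add dhcp-host entries to /etc/lxc/dnsmasq.conf. A quick sketch, assuming containers named 'web1' and 'db1' on the default 10.0.3.0/24 network.

dhcp-host=web1,10.0.3.100
dhcp-host=db1,10.0.3.101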

The other interesting setting is LXC_DOMAIN. Here you can assign a domain to containers so they can be found by their domain names.
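
For host lookups of containername.lxc to work, queries for the lxc domain need to be forwarded to the bridge's dnsmasq instance. A sketch, assuming the defaults above and that the host's own Dnsmasq is used as its resolver, would be to add this to the host's /etc/dnsmasq.conf.

server=/lxc/10.0.3.1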

Now let's look at the part of the script that sets up the bridge, network and iptables rules. The rest of the script restarts or stops the network by deleting these settings.

brctl addbr ${LXC_BRIDGE} || { echo "Missing bridge support in kernel"; exit 0; }
    echo 1 > /proc/sys/net/ipv4/ip_forward
    mkdir -p ${varrun}
    ifconfig ${LXC_BRIDGE} ${LXC_ADDR} netmask ${LXC_NETMASK} up
    iptables -I INPUT -i ${LXC_BRIDGE} -p udp --dport 67 -j ACCEPT
    iptables -I INPUT -i ${LXC_BRIDGE} -p tcp --dport 67 -j ACCEPT
    iptables -I INPUT -i ${LXC_BRIDGE} -p udp --dport 53 -j ACCEPT
    iptables -I INPUT -i ${LXC_BRIDGE} -p tcp --dport 53 -j ACCEPT
    iptables -I FORWARD -i ${LXC_BRIDGE} -j ACCEPT
    iptables -I FORWARD -o ${LXC_BRIDGE} -j ACCEPT
    iptables -t nat -A POSTROUTING -s ${LXC_NETWORK} ! -d ${LXC_NETWORK} -j MASQUERADE
    iptables -t mangle -A POSTROUTING -o ${LXC_BRIDGE} -p udp -m udp --dport 68 -j CHECKSUM --checksum-fill

    LXC_DOMAIN_ARG=""
    if [ -n "$LXC_DOMAIN" ]; then
        LXC_DOMAIN_ARG="-s $LXC_DOMAIN"
    fi
    dnsmasq $LXC_DOMAIN_ARG -u dnsmasq --strict-order --bind-interfaces --pid-file=${varrun}/dnsmasq.pid --conf-file=${LXC_DHCP_CONFILE} --listen-address ${LXC_ADDR} --dhcp-range ${LXC_DHCP_RANGE} --dhcp-lease-max=${LXC_DHCP_MAX} --dhcp-no-override --except-interface=lo --interface=${LXC_BRIDGE} --dhcp-leasefile=/var/lib/misc/dnsmasq.${LXC_BRIDGE}.leases --dhcp-authoritative || cleanup
    touch ${varrun}/network_up
}

This is basically creating the lxcbr0 bridge, setting up networking and routing, DHCP and NAT masquerading so containers have access to the internet, and starting a dnsmasq instance to manage DHCP for the lxcbr0 bridge.
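
You can verify what the script has done at any point by checking that IP forwarding is on and that the NAT masquerading rule is in place.

cat /proc/sys/net/ipv4/ip_forward
iptables -t nat -L POSTROUTING -n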

This document is a work in progress and we will keep adding to it.

The LXC developers are experimenting with a new format of container configuration which links to the main common container OS configuration file. In our opinion this has the potential to limit container portability if the templates are not in the exact same locations across distributions, unless lxc-start or the LXC installer detects the template locations automatically rather than spitting out a generic cgroup error.

The LXC devs are focused on Ubuntu; a lot of packages like cgmanager work well only on Ubuntu, as do unprivileged containers, which depend on features that are not yet widely available and work seamlessly only on the latest versions of Ubuntu. This impacts cross-platform compatibility and in our opinion has seriously impeded LXC adoption, giving rise to confusion about LXC containers and to niche alternatives that offer a fraction of the functionality that LXC does.
