Load balancing and failover of LXC containers using LVS
To quickly recap: in that method the Nginx load balancer (lb) containers are configured for failover with keepalived. The lb instances serve and load balance applications from several containers. It's pretty simple to set up, high performance and efficient.
In Part II we are going to use LVS, specifically LVS-DR, for load balancing in containers. Linux Virtual Server (LVS) is part of the Linux HA project; it is fairly mature and widely used for load balancing.
LVS is part of the Linux kernel. A lot of guides are outdated and may refer to building kernel modules or recompiling the kernel, but LVS and all required modules ship with most kernels and just need to be loaded.
With LVS, load balancing happens at layer 4, as opposed to layer 7 with Nginx/Haproxy. LVS has a number of modes and schedulers for load balancing. The main modes are LVS-NAT and LVS-DR, and schedulers include round robin, weighted least connections, etc.
LVS-DR
In many ways LVS is more efficient and transparent; however, there are tradeoffs, as Nginx or Haproxy gives you more fine-grained control at the application and HTTP level. LVS-DR is especially interesting because the load balancer is not a bottleneck as it would be with Nginx, Haproxy or even LVS-NAT.
In LVS-DR the load balancer forwards requests to the relevant app servers and those servers respond directly to the client. This is called direct routing. There is no return path through the load balancer, which could otherwise become a bottleneck. For streaming, CDNs and downloads this makes a lot of sense.
For this article we are going to focus on LVS-DR (direct routing), and we are going to use LVS through Keepalived. Keepalived is the Swiss Army knife of our toolkit here, as it is a single program that provides failover with VRRP and load balancing with LVS.
Now is the perfect time to plan your failover and load balancing network. For LVS-DR all instances need to be on the same subnet. For this example we are going to use containers operating in the default lxcbr0 network 10.0.3.0/24, but in the real world your network would include containers across several hosts, with floating IPs operating on the same network.
We are not going to turn this into a networking tutorial, but there are many ways to accomplish this; our LXC networking guides in the News and guides section provide an outline.
Configuring LXC containers to use LVS-DR
For this guide on using LVS-DR in LXC containers let's start by picking a network of 4 containers: 2 will be the load balancer instances and 2 will be the application containers that need to be load balanced. You can have any number of application containers.
The lb containers will load balance the application containers, which in turn will respond directly to the clients. You can even have the load balancers double as application containers, but that increases configuration and complexity, so we are avoiding it here. Let's define a few terms; the example addresses used in this guide are summarized after the definitions.
Host - LXC hosts
Containers - LXC containers
LB containers - Load balancer containers that are going to failover
Application containers - Containers to be load balanced
Virtual IP - This is the floating IP
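For this example the following addresses are used; they match the configuration later in the guide.
lb1, lb2 - the two load balancer containers running keepalived
10.0.3.10, 10.0.3.20 - the application containers to be load balanced
10.0.3.250 - the virtual/floating IP that clients connect to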
First install keepalived in your 2 failover containers.
apt-get install keepalived
The keepalived configuration file is in /etc/keepalived/keepalived.conf
LVS is a kernel module and will NOT be available for use in containers until you load the LVS module and the corresponding load balancing scheduler modules on the hosts first. For this guide we are using the LVS rr scheduler, which is simple round robin. LVS supports a number of load balancing schedulers, so load the appropriate module.
modprobe ip_vs
modprobe ip_vs_rr
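To confirm the modules loaded, and optionally to load them automatically at boot, something like the following should work on the host (the modules-load.d path and file name assume a systemd-based distribution):

lsmod | grep ip_vs

printf "ip_vs\nip_vs_rr\n" > /etc/modules-load.d/ip_vs.conf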
This is how the keepalived config file should look for LVS-DR on the lb1 and lb2 containers. For the lb2 backup container only, change the state to 'BACKUP' and the priority to a lower value than the 100 used for the master. We used 50.
! Configuration File for keepalived

vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass pass
    }
    virtual_ipaddress {
        10.0.3.250
    }
}

virtual_server 10.0.3.250 80 {
    delay_loop 6
    lb_algo rr
    lb_kind DR
    persistence_timeout 50
    protocol TCP

    real_server 10.0.3.10 80 {
        weight 10
        TCP_CHECK {
            connect_timeout 3
            connect_port 80
        }
    }

    real_server 10.0.3.20 80 {
        weight 10
        TCP_CHECK {
            connect_timeout 3
            connect_port 80
        }
    }
}
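For reference, the vrrp_instance section on the lb2 backup container differs only in the state and priority; the rest of the file is identical:

vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    virtual_router_id 51
    priority 50
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass pass
    }
    virtual_ipaddress {
        10.0.3.250
    }
}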
Let's understand the config. We have selected 10.0.3.250 as the failover IP, as you can see in the virtual_ipaddress section. If you only want to use Keepalived for failover, this is all you need.
But in this case we are also going to use Keepalived for load balancing, so we add a virtual_server section. Notice the virtual_server IP is the failover IP defined earlier; this is needed for LVS-DR. lb_algo defines the load balancing scheduler to be used and lb_kind defines the load balancing mode, i.e. LVS-NAT, LVS-DR or LVS-TUN. We are using LVS-DR, hence lb_kind is set to DR. The app server IPs that are going to be load balanced are 10.0.3.10 and 10.0.3.20, listed as real_server entries.
The failover IP is the IP to which client requests are made. Let's look at the chain of events to understand this better.
- The client requests a resource on 10.0.3.250. Both load balancers are configured with this IP in the keepalived virtual_ipaddress section. The master load balancer holds it, and it moves to the backup in case the master lb goes down.
- The master lb receives the request, looks at its config and passes it to 10.0.3.10 or 10.0.3.20 depending on the load balancing scheduler chosen. We are using round robin (rr) as defined in lb_algo.
- If the master lb is down, keepalived will move the virtual IP 10.0.3.250 to the backup lb, which will pass the client on to 10.0.3.10 or 10.0.3.20. You can see this at work by stopping the keepalived service in one of the lb containers.
- Now the application server receives a request with destination IP 10.0.3.250, looks at its own IP, which is either 10.0.3.10 or 10.0.3.20, and rejects it.
- This obviously won't work. So for LVS-DR to work we need to make a small configuration change on the app servers: either add the virtual IP to the 'lo' interface of the app servers or use an iptables rule so the app servers accept requests for 10.0.3.250.
Add the virtual IP to the 'lo' interface of app server containers
ip addr add 10.0.3.250/32 dev lo
or
use an iptables rule to ensure they accept requests for 10.0.3.250, like the one below
iptables -t nat -A PREROUTING -p tcp -d 10.0.3.250 --dport 80 -j REDIRECT
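If you take the 'lo' approach, note that on many setups the app containers can end up answering ARP requests for the virtual IP themselves, which confuses clients and the load balancer. The usual LVS-DR remedy is to suppress ARP for addresses held on lo with the following sysctls on each app container (shown here as a commonly used setting; adjust to your environment and make it persistent in sysctl.conf if needed):

sysctl -w net.ipv4.conf.all.arp_ignore=1
sysctl -w net.ipv4.conf.all.arp_announce=2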
- With that in place it should work seamlessly. Let's walk through it again: the app server receives a request for 10.0.3.250, sees it can respond to it thanks to the changes we made, and responds to the client directly, hence LVS-DR and direct routing. With LVS-NAT the app server would pass the response back through the load balancer, which in turn would pass it on to the client. So in DR mode the load balancer does not have much to do apart from directing requests.
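A quick way to check everything is wired up (assuming a web server is listening on port 80 in the app containers) is to inspect the LVS table on the active lb container with ipvsadm and then hit the virtual IP from a client on the same network; stopping keepalived on the master should move the VIP and traffic to the backup:

apt-get install ipvsadm
ipvsadm -Ln

curl http://10.0.3.250/

service keepalived stop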
You can add more app server containers to the mix by adding them to the keepalived virtual_server section, and the load balancer container will direct the load accordingly. We are using basic round robin (rr) for this guide, but you can use any of the available LVS schedulers; wlc (weighted least connections) is recommended. Keepalived has a number of configuration options for failover and load balancing.
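For example, to add a third app container, say a hypothetical 10.0.3.30, append another real_server block inside virtual_server on both lb containers and reload keepalived (if you switch lb_algo to wlc, also load the ip_vs_wlc module on the hosts):

real_server 10.0.3.30 80 {
    weight 10
    TCP_CHECK {
        connect_timeout 3
        connect_port 80
    }
}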
LXC capabilities
The idea behind this article is to show that LXC containers, like VMs, can be used with LVS and other complex configurations transparently. To actually use these capabilities you would need to set up your network accordingly.
Please note that most VPS and cloud providers do not support floating IPs, and those that do usually have their own load balancing systems in place, such as Amazon ELB.
Another thing to keep in mind is that Keepalived uses multicast for VRRP by default, which again is not supported by most cloud and VPS providers. You can explore the use of unicast with recent versions of Keepalived.
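As a rough sketch, recent Keepalived versions let you switch VRRP to unicast by listing the peer addresses inside the existing vrrp_instance block; the lb container IPs below are assumptions for illustration only:

# added inside vrrp_instance VI_1 on lb1
unicast_src_ip 10.0.3.11      # this lb container's own IP (assumed for illustration)
unicast_peer {
    10.0.3.12                 # the other lb container's IP (assumed for illustration)
}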
In LVS-DR the lb containers can be in the host network and the app containers in the DMZ network. To make this scenario work, the lb containers need to have 2 network interfaces, so in the container config you define eth0 and eth1: eth0 in the host network and eth1 in the DMZ network, so the lb containers can reach the app server containers.
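A minimal sketch of such a container config, using the legacy lxc.network.* keys (the bridge names br-host and br-dmz are assumptions for illustration; newer LXC releases use the lxc.net.N.* form):

# eth0 - host network
lxc.network.type = veth
lxc.network.link = br-host
lxc.network.name = eth0
lxc.network.flags = up

# eth1 - DMZ network, same L2 segment as the app containers (required for LVS-DR)
lxc.network.type = veth
lxc.network.link = br-dmz
lxc.network.name = eth1
lxc.network.flags = up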
The use cases for failover at the container or VM level are limited; usually you would fail over the hardware servers hosting the VMs and containers, with keepalived managing floating IPs mapped to containers or VMs on the hardware nodes.