Statelessness and the Swiss engineer

The computing community was taken by surprise a couple of years ago by an engineer at CERN reminiscing fondly about pets and cattle in a scientific presentation.

It took a while for folks to realise that the engineer was exercised not by fluffy pets or the oversupply of wholesome dairy animals in Switzerland, but by servers and virtual machines.

The idea is that pets are servers that cannot be easily replaced and need to be nurtured, while cattle are virtual machines (VMs) that are easily replaced.

This has little meaning for those of us fortunate enough not to grapple with statefulness and statelessness or scale-out architectures, but we are going to force it down your throat anyway.

Please note we are at 10,000 feet here and are going to make broad generalizations, simply because we are covering a large area of computing. Please inform the debate rather than condemn it. Let's keep it happy and pleasant.

Let's take a key problem of state and scaling. Most computer users never have to deal with this, but it becomes an issue in 'scale-out' scenarios.

Scale-up and scale-out

Let us introduce ourselves to the concepts of 'scale-up' and 'scale-out'. Let's say you have a machine with a 2GHz processor and 2GB of RAM that is running a mail server (not the best example, but let's go with that). Your mail server is overloaded, users are unhappy and you need to upgrade.

You can do this in two ways. The first is to buy a faster processor and more RAM, say a 4GHz processor and 4GB of RAM. This is called scaling up. You can easily see the problem here: the next time you have to scale you will need an 8GHz processor and even more RAM, which may not be available or practical.

You also have a single point of failure: any issue with your mail server and mail is down for all users. The alternative is to scale out. Instead of upgrading your processor and RAM, you get a new server to share the load with the old one. So you now have 2 machines instead of one, and if load increases you can add more. And if one mail server were to fail you still have one up. Easy, right? No!

How are you going to distribute the load and ensure the inboxes are in sync? With scaling out you have to deal with the problem of state and keep your data layer in sync. Two or more mail servers have to be in sync with each other or the user will see an inconsistent inbox.

Let's say a user logs in in the morning, is directed to Server 1 and writes a few mails. She logs in an hour later, is directed to Server 2 and finds her mails missing because the two servers are out of sync. You are toast!
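To make the missing-mail scenario concrete, here is a minimal Python sketch. Everything in it is hypothetical (the `MailServer` class, the user "alice", the round-robin routing); it only assumes that each server keeps its inboxes in its own local memory and that logins alternate between the two servers.

```python
from itertools import cycle


class MailServer:
    """Hypothetical mail server that keeps every inbox in its own local memory."""

    def __init__(self, name):
        self.name = name
        self.inboxes = {}  # user -> list of messages, private to this server

    def store(self, user, message):
        self.inboxes.setdefault(user, []).append(message)

    def read(self, user):
        return list(self.inboxes.get(user, []))


servers = [MailServer("server1"), MailServer("server2")]
next_server = cycle(servers)  # naive round-robin: each login lands on the next server

# Morning login: the user is sent to the first server and writes two mails.
morning = next(next_server)
morning.store("alice", "Quarterly report")
morning.store("alice", "Lunch on Friday?")

# An hour later the next login lands on the other server.
later = next(next_server)

morning_view = morning.read("alice")
later_view = later.read("alice")
print(morning.name, morning_view)  # both mails are here
print(later.name, later_view)      # empty list: the second server never saw them
```

Nothing is broken on either server. Each one is behaving correctly on its own, and the inconsistency comes purely from state that lives in only one place.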

In the case of the mail server, one solution would be to replicate the inboxes so that both servers hold a copy of the user data and present a single view of it, and downtime on one does not affect the other. Another would be to have both servers use shared storage.
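A rough Python sketch of the replication idea follows. The `ReplicatedMailServer` class and its `peers` list are made up for illustration, and the sketch assumes the simplest possible scheme: every write is copied synchronously to every peer. Real replication also has to cope with network failures, ordering and conflicts, none of which is handled here.

```python
class ReplicatedMailServer:
    """Hypothetical mail server that copies every write to its peers."""

    def __init__(self, name):
        self.name = name
        self.inboxes = {}  # local copy of every inbox
        self.peers = []    # other servers that must receive each write

    def _apply(self, user, message):
        # Record the message in this server's local copy only.
        self.inboxes.setdefault(user, []).append(message)

    def store(self, user, message):
        # Write locally, then push the same write to every peer (synchronous, no retries).
        self._apply(user, message)
        for peer in self.peers:
            peer._apply(user, message)

    def read(self, user):
        return list(self.inboxes.get(user, []))


server1 = ReplicatedMailServer("server1")
server2 = ReplicatedMailServer("server2")
server1.peers = [server2]
server2.peers = [server1]

# Writes arrive at different servers, but each is replicated to the other.
server1.store("alice", "Quarterly report")
server2.store("alice", "Lunch on Friday?")

view1 = server1.read("alice")
view2 = server2.read("alice")
print(view1 == view2, view1)
```

Whichever server the user lands on, the inbox looks the same, and if one server goes down the other still holds a full copy. The price is that every write now costs work on every server.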

Typically one scales by decoupling, or breaking down, the application (usually composed of the app, database, app server, web server and caching layer) into a loosely coupled architecture, so the individual pieces can each run on different instances (i.e. servers, virtual machines, cloud instances or containers). This needs a data management layer along with session and state management, and you then begin to need things like load balancers to scale.
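As a toy illustration of such a loosely coupled layout, the Python sketch below puts a round-robin load balancer in front of three app instances that keep no state of their own and read sessions from a separate store. All names (`SessionStore`, `AppInstance`, `LoadBalancer`, the session id) are invented for this example, and the in-memory dictionary merely stands in for whatever external session or data layer a real deployment would use.

```python
from itertools import cycle


class SessionStore:
    """Stand-in for an external session/state service shared by all instances."""

    def __init__(self):
        self._data = {}

    def get(self, session_id):
        # Return the session, creating an empty one on first use.
        return self._data.setdefault(session_id, {"visits": 0})


class AppInstance:
    """Stateless app instance: everything it knows about a user lives in the store."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def handle(self, session_id):
        session = self.store.get(session_id)
        session["visits"] += 1
        return f"{self.name}: visit {session['visits']}"


class LoadBalancer:
    """Naive round-robin balancer; real ones also track health and load."""

    def __init__(self, instances):
        self._instances = cycle(instances)

    def route(self, session_id):
        return next(self._instances).handle(session_id)


store = SessionStore()
balancer = LoadBalancer([AppInstance(f"app{i}", store) for i in range(1, 4)])

responses = [balancer.route("alice-session") for _ in range(4)]
for line in responses:
    print(line)
```

Each request is served by a different instance, yet the visit count keeps climbing, because the count lives in the shared store and not in any one instance. That is what makes the instances interchangeable.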

We are not going to go into too much detail here, but this excellent presentation on scaling by Chris Munns at Amazon should help end users understand the underlying architectures.

This is a typical problem of state and scale that companies like Facebook, Twitter, Gmail, YouTube and enterprises face every day, and it exercises developers and ops teams. Devops teams would ideally like to simply automate the provisioning of new instances as demand grows and not think too much, but that is impeded by the reality of managing state and data layers as one scales.

What does this have to do with containers?

Well, not much. A container is still a container and will not magically make you scalable. Docker has an approach that encourages building loosely coupled architectures and thinking of 'containers as apps'. Those who have read our LXC vs Docker article will have noticed that Docker does a lot of things differently, which may seem counterintuitive, and the lay user might be wondering why.

This is the reason behind the layers, the restricted container OS templates for single-app containers, the storing of data outside containers and other decisions Docker has taken. The focus is more on the app than the container, whose existence is defined by the requirements of the app. This may in many ways make it useful for PaaS vendors offering app instances, but not without tradeoffs.

Can you get a normal LXC container to behave like Docker? Since Docker is built on LXC's underlying support for overlay file systems like aufs and overlayfs, and for bind mounts to share data between host and containers, you obviously can. But Docker has already gone down that road and you may as well use Docker. The bigger question is what you are trying to achieve.

The idea of building with layers and easily updating, let's say, 5 or 1000 container instances in one go, as a lay reader would imagine it, is a myth that dissipates quickly when you actually try to do it. First you have to realise that in Docker these are layers that make up a single app, not layers of multiple apps in a single container, so you are not going to simply change one layer to update your stack.

The idea of an immutable infrastructure sounds good but is more difficult in practice, since exploits that need to be addressed as soon as possible are discovered every day. Using a read-only base image that is immutable but needs to be rebuilt every time there is a security alert is counterproductive. For instance, a security update to Bash means updating the underlying OS layer for all your containers, and a security update to PHP means changing the PHP layer. None of it is a one-click operation.

The separation of app data into a folder outside the container, on the host or in another container, uses LXC's awesome bind mount feature, which is available in a far simpler way in LXC itself.

That leaves us with loosely composed single-app containers. Well, nothing stops users from deploying normal LXC containers as single apps. For use cases beyond very niche PaaS-type scenarios, where one does not need to abstract containers away into some kind of 'frozen app', normal LXC containers without the constraints placed by Docker may prove simpler and more useful.

The problem of scaling out has little to do with app instances and everything to do with the state data associated with the app instances. Simply having 50 app instances does not even begin to solve the problems of scaling out.
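One way to read that claim is sketched below in Python. The numbers (50 instances, 1000 requests) and the `DataStore` class are invented for illustration, and the sketch assumes every request results in exactly one write to a single shared store. Under those assumptions, adding instances spreads the request handling thinly, but the data layer still absorbs every single write.

```python
from collections import Counter
from itertools import cycle


class DataStore:
    """Hypothetical single shared data layer that every app instance writes to."""

    def __init__(self):
        self.writes = 0
        self.rows = {}

    def write(self, key, value):
        self.writes += 1
        self.rows[key] = value


store = DataStore()
instance_names = [f"app{i}" for i in range(50)]
handled = Counter()  # how many requests each app instance served

# Spread 1000 requests round-robin over the 50 instances.
for request_id, name in zip(range(1000), cycle(instance_names)):
    handled[name] += 1
    store.write(request_id, f"handled by {name}")

busiest_instance = max(handled.values())
print("requests on the busiest app instance:", busiest_instance)
print("writes on the shared data store:", store.writes)
```

The app tier got 50 times wider while the state layer carried exactly the same load as before, which is why replication, sharding or shared storage for the data ends up being the real work.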

For users container technology opens new possibilities across a range of use cases, and with it questions on how best to use and deploy containers.

This article is a 10,000 feet attempt to introduce end users to the concepts of scale-up and scale-out. It's not meant to be a complete overview or a technical paper, and should not be read as such. All efforts have been made to ensure the article is accurate within those constraints. Please let us know in the comments section, or contact us to correct any inaccuracies to help promote informed discussion on Linux containers.
