Let’s imagine that you liked a bicycle at a sports shop and ordered one to be shipped to your home. How would you like it if the shop shipped everything separately? The frame is in one box, while the wheels are in another. The gears and chain are a tangle, while the handlebar is just a stick with two rubber ends. The seat is a disembodied piece of cushion. Would you be happy with the sports shop and ever buy from them again? No, you’d love it if they shipped it completely assembled. You want them to deliver it to you so that you are ready to go. The only fixing you would want to do is some tuning of the brakes and some adjustment of the seat height.
So why should software be any different? Why is it that when we install something, we have to take care of installing the dependencies, hoping that they don’t clash with those of other applications? One app requires version 4 of the .NET runtime, but a new one you are about to install requires version 5. An app on Linux requires the ImageMagick library, so now you have to go find it and install it first. To install macOS applications with dependencies, you need to learn about the entire ‘brew’ system.
What if someone simply shipped an entire working app with all its dependencies as part of a ‘box’? Just like your bicycle, where the sports shop takes care of attaching the wheels, handlebar, gears, chain and seat to the frame. That is the concept of containerised apps, these days made popular by Docker.
The traditional model for installing and running apps
In the days of regular apps, you had a computer with an operating system installed. Using this as the base, you’d install all the apps that you wanted on top of the operating system. This would mean the main software itself, along with all the supporting software required to run that main software. The supporting software is often referred to as a library, and each library may be used by multiple applications on the machine. E.g. on a Windows machine, multiple applications like Adobe Photoshop and Microsoft Paint would use the same graphics libraries.
The problem of mundane repetition
There are some problems with the traditional approach. To describe the first problem, let’s take our bicycle example again. We saw how the sports shop left us in a difficult situation. It was bad that we had to learn how to assemble the bicycle on our own. We needed to know how to use tools such as a wrench, a screwdriver and a hammer to attach all the parts to one another, so that we could get a functional bicycle in the end.
What could be worse? Your friend buys a bicycle and he too gets a box with disconnected parts. Now the two of you need to work and get this bicycle assembled and out the door. Hopefully, you wrote down the steps when you learned the first time. Otherwise you’ll run into the same problems that you encountered then and you’d have to work through them again. Talk about deja vu.
All software admins face this problem from time to time. They need to install the same software on multiple machines. To their dismay, they must go through the same steps of installation on every machine, beginning from the first dependency up to the point where everything is working as desired.
The problem of incompatibility
What if your friend has a different brand of bicycle and it has nuts and bolts different from the ones on your bicycle? What if this one takes a different wrench set to work with? Well, you have no choice but to buy this different wrench set. Because you and your friend have different bicycles and hence different types of parts, the two of you now have two different wrench sets between you. And if yet another friend gets another brand of bicycle with its own different parts, then you’d need yet another wrench set.
This often happens on machines as you install more software. One application depends on Java 8, while another depends on Java 9. So you’d usually have both installed. One needs Node.js 8, while another needs Node.js 12, so you bloat your computer’s precious space with more dependencies. But what if two versions of the same dependency refuse to co-exist on the same machine?
This used to be a common problem from the late 90s up to the mid-2000s, but fortunately those days are largely over. Different versions of the same software library usually co-exist by installing themselves in different folders named after their version numbers, so an application can pick the correct version of the dependency. Newer versions are often made backward compatible: they keep supporting the older functionality for a few years, marked as ‘deprecated’, before the library authors decide to retire it totally.
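For readers who like to see ideas in code, here is a minimal Python sketch of the version-named folder idea. It is only an illustration under stated assumptions: the folder layout, the library name `node` and the helper names `install` and `resolve` are made up for this example and do not describe how any real installer works.

```python
import tempfile
from pathlib import Path


def install(root: Path, library: str, version: str) -> Path:
    """Pretend to install a library into a folder named after its version."""
    target = root / library / version
    target.mkdir(parents=True, exist_ok=True)
    return target


def resolve(root: Path, library: str, wanted_major: int) -> Path:
    """Pick the newest installed version whose major number matches."""
    candidates = [
        folder
        for folder in (root / library).iterdir()
        if folder.is_dir() and int(folder.name.split(".")[0]) == wanted_major
    ]
    if not candidates:
        raise LookupError(f"{library} {wanted_major}.x is not installed")
    # Compare versions as numbers, so that 8.10.0 counts as newer than 8.9.0.
    return max(
        candidates,
        key=lambda folder: tuple(int(part) for part in folder.name.split(".")),
    )


def demo() -> tuple[str, str]:
    """Install three versions side by side and let two apps pick their own."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for version in ("8.9.0", "8.10.0", "12.2.0"):
            install(root, "node", version)
        old_app_choice = resolve(root, "node", 8)
        new_app_choice = resolve(root, "node", 12)
        return old_app_choice.name, new_app_choice.name


if __name__ == "__main__":
    print(demo())
```

Both apps get the version they asked for, but notice that the machine still has to carry every version that any app needs. That is the bloat described above.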
The containerised approach
But what if you could install software in true one-touch style? What if a piece of software comes bundled with everything in place to run? Just like a sports shop that delivers you a completely assembled bicycle?
This is the approach taken by a new generation of apps called containers. Let’s see an example. In the non-container days, if you wanted to get your own WordPress-powered blog running, you’d have needed to install PHP, MySQL, a web server such as Apache or Nginx, and the WordPress PHP code on your web server machine yourself. You’d also need to install the connectors that allow Apache to use PHP and PHP to talk to MySQL. You’d need to set up your database and then manually key in the settings of that database during WordPress setup. If everything went alright, then you’d have your blog fully working. It was a process with its own long checklist and was often fraught with gotchas…
… until someone thought of putting everything together inside a container and shipping it all together, so that people could simply install the container on their machines and everything would just work. With minor personal tweaks, of course. How does that work?
What makes a container?
The science of a container is too complicated for me to explain in a tech blog that is meant for a layman, but let me put it together in a few short points.
a. A physical machine (i.e. a desktop, laptop or a server machine) does not contain any application software directly. It contains just an operating system, the processor, memory and storage. No word processor, no graphic design package, no spreadsheet, nothing at all. In the containerisation approach, a physical machine is just like a piece of land where things are built. The machine exists because it provides the processing power, memory and storage.
b. All the useful software exists in the form of containers, which behave like mini virtual machines inside the physical machine, almost like individual computers inside our computer. Just as a pregnant woman sustains a living form inside her, the physical machine simply supports the many containers, each of which has a life of its own.
c. Containers can talk to each other, but for the most part each container is made to be self-sufficient for its own single purpose.
d. Containers carry their own scaled-down set of operating system files, but they share the core (the kernel) of the operating system that runs on the main machine. These files only exist so that application software can be installed on them and run as processes inside the container.
e. Everything that is needed for an application to run is installed inside one container. This includes the main software and all of its dependencies.
f. A running container can be captured in a snapshot. This snapshot is a file that can be transferred and shared just like regular files. From this snapshot, a new container that is an exact copy of the original can be made, which means that the new container has the main software and all of its dependencies installed. This is what makes building a container a one-time process. Every other running instance on other machines is a copy of the original one, made from the snapshot.
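The points above can be sketched as a toy model in Python. This is not Docker’s API or how Docker stores anything internally. The class names `Image` and `Container`, the `start` helper and the package names are all invented for illustration, under the assumption that a snapshot is just a frozen list of what is installed.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Image:
    """A read-only snapshot: the app plus everything it needs."""
    name: str
    packages: frozenset[str]


@dataclass
class Container:
    """A running copy of an image with its own private, changeable state."""
    image: Image
    packages: set[str] = field(default_factory=set)

    def install(self, package: str) -> None:
        """Add software inside this container only."""
        self.packages.add(package)

    def snapshot(self, name: str) -> Image:
        """Freeze the container's current state into a new image."""
        return Image(name=name, packages=frozenset(self.packages))


def start(image: Image) -> Container:
    """Create a fresh container whose starting state is a copy of the image."""
    return Container(image=image, packages=set(image.packages))


def demo() -> tuple[Image, Container, Container]:
    # Do the tedious setup exactly once, inside one container...
    base = Image("base", frozenset({"os-files"}))
    builder = start(base)
    for package in ("php", "mysql", "wordpress"):
        builder.install(package)
    # ...then snapshot it and start as many identical copies as needed.
    blog_image = builder.snapshot("blog")
    first, second = start(blog_image), start(blog_image)
    # A change made in one running copy does not leak into the other.
    first.install("extra-plugin")
    return blog_image, first, second


if __name__ == "__main__":
    image, first, second = demo()
    print(sorted(image.packages))
    print(sorted(first.packages))
    print(sorted(second.packages))
```

The snapshot is frozen on purpose: every container started from it begins in the same known state, no matter what earlier containers did to their own copies.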
Docker
There have been many container-based solutions in the past. The most popular ones were fully self-contained Java applications that used to run on Java-enabled platforms such as Tomcat, Spring Boot, etc. These Java containers could simply be copied from one machine to another and executed as long as the target machine also had a Java-based container platform such as Tomcat. There were also other solutions like unikernels, LXD containers, etc. But none of them achieved the same popularity that Docker has achieved.
What is Docker? Docker is a container application platform that now powers a large share of the applications that run on servers around the world. A machine running Docker containers can be a Windows, Mac or Linux machine. The main reason Docker became so popular is the same reason iOS and Android achieved widespread adoption. Any guesses for the reason?
Docker Hub. The Docker equivalent of the Apple App Store and Google Play Store. Docker Hub is a huge repository of Docker snapshots published by people around the world. When people started making applications to run on Docker, they started taking snapshots and publishing them on Docker Hub, so others could start using those snapshots without having to start from scratch.
As a result, you have the world’s biggest repository of Docker-based applications today. There is a snapshot (called an image in Docker-speak) for almost every type of application you can think of, from a simple web server all the way to financial number-crunching apps and scientific DNA sequencing apps. As long as you have installed the Docker platform on your physical machine, you can pick any Docker image, turn it into a container and run with it, just like your fully assembled bicycle.
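To give a feel for how little work “pick an image and run it” involves, here is a small Python sketch that assembles the Docker commands for the official `nginx` web server image. Treat it as a sketch rather than a recipe: the container name `web` and the port 8080 are arbitrary choices made for this example, and the `launch` helper only does anything if Docker is installed on your machine.

```python
import shutil
import subprocess


def run_command(image: str, name: str, host_port: int, container_port: int) -> list[str]:
    """Build the `docker run` command that turns an image into a running container."""
    return [
        "docker", "run",
        "--detach",                                     # keep running in the background
        "--name", name,                                 # a friendly name for the container
        "--publish", f"{host_port}:{container_port}",   # map a host port to the container
        image,
    ]


def launch(image: str, name: str, host_port: int, container_port: int) -> bool:
    """Download the image and start a container, if Docker is available."""
    if shutil.which("docker") is None:
        return False  # Docker is not installed, so there is nothing to do
    subprocess.run(["docker", "pull", image], check=True)
    subprocess.run(run_command(image, name, host_port, container_port), check=True)
    return True


if __name__ == "__main__":
    # Only prints the command; call launch(...) yourself to actually run it.
    print(" ".join(run_command("nginx", "web", 8080, 80)))
```

Two commands, one to download and one to run, replace the whole checklist of installing a web server and its dependencies by hand.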
To learn more about containerisation and Docker, please visit the official Docker website.
Soon, the statement, “There is a container for that!” will be as ubiquitous as “There is an app for that!”
Conclusion
The world of containerised apps makes it easy to package a software application and its dependencies into an easy-to-run container. Container platforms also make it easy for you to take a snapshot of your fully working container and replicate the same software on as many machines as you like. You can even share the snapshot with the rest of the world. Very soon one of the software world’s most popular slogans will be, “There is a container for that!”