Containers


Tobias P.L. Wennberg

Tags: IT, blog

The term container in computer systems encompasses many different technologies. Maybe you think of docker, kubernetes and cloud images; maybe flatpak, snap and appimage. Containerisation generally means a "self-contained program": a program that bundles all its dependencies, leaving minimal requirements on the host. In the hosting world, containers are synonymous with docker, the project that pioneered the OCI container. Docker was not the first container technology. Five years before docker, linuxcontainers.org released the LXC container. Long before that, virtual machines had already been popularised. While VMs are generally distinguished from containers, the difference is not obvious.

The Open Container Initiative was established in 2015 by docker et al. and governs the OCI specifications that container runtimes adhere to. The OCI container may be called a process container: it generally spawns its primary process directly, seldom opting to run an init system, and even more rarely a heavy init system such as systemd-init. This makes the container type decisively distinct from a virtual machine. The OCI container really is just a process, bundled with its dependencies, run through a runtime such as runc inside restricted Linux namespaces (to give the container its own view of the network, hostname and process tree), cgroups (to restrict the available hardware resources) and an OverlayFS (to give the container its own filesystem). These technologies are the backbone that allows the container to seem host-like, with its own filesystem and IP address on its own subnet. You can observe how process-like a container is by running pstree on your host and looking for the process run in the docker container.
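
To make that concrete, here is a toy sketch in Go (the language docker and runc are written in) of the namespace mechanism: the program re-executes itself inside new UTS, PID and mount namespaces, so the command it finally runs sees itself as PID 1 with its own hostname. This is an illustration under my own simplifications, not how runc is actually structured; the binary name and the hostname "container" are made up, the real thing adds cgroups, an OverlayFS and much more, and it needs root on Linux.

    // contain.go: the namespace half of a container runtime, reduced to a toy.
    // Run as root on Linux: ./contain run /bin/sh
    package main

    import (
        "fmt"
        "os"
        "os/exec"
        "syscall"
    )

    func main() {
        if len(os.Args) < 3 {
            fmt.Println("usage: contain run <cmd> [args...]")
            os.Exit(1)
        }
        switch os.Args[1] {
        case "run": // step 1: re-exec ourselves inside fresh namespaces
            cmd := exec.Command("/proc/self/exe", append([]string{"child"}, os.Args[2:]...)...)
            cmd.SysProcAttr = &syscall.SysProcAttr{
                // new hostname, PID numbering and mount table for the child
                Cloneflags: syscall.CLONE_NEWUTS | syscall.CLONE_NEWPID | syscall.CLONE_NEWNS,
            }
            cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr
            if err := cmd.Run(); err != nil {
                fmt.Fprintln(os.Stderr, err)
                os.Exit(1)
            }
        case "child": // step 2: we are now PID 1 in the new namespaces
            syscall.Sethostname([]byte("container"))
            // replace this process with the requested command, keeping PID 1
            if err := syscall.Exec(os.Args[2], os.Args[2:], os.Environ()); err != nil {
                fmt.Fprintln(os.Stderr, err)
                os.Exit(1)
            }
        }
    }

Running ./contain run /bin/sh and typing echo $$ in the shell prints 1, and hostname prints container, while the host's hostname is untouched.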

OCI containers being so close to a bare process has performance advantages and keeps the container structure simple, but it makes the container clearly distinct from virtual machines. I generally find docker containers annoying to administrate and set up for long-term deployments, and I often find them to be the wrong tool for the job. Prebuilt container images are generally great for getting a minimal setup running. But doing your own configuration often requires you to build your own image, thereby removing the convenience of using docker in the first place, leading me to choose something more VM-like, or something closer to the host. In my self-hosting, I use OCI containers when the official project of the software I want to install provides an image, such as jellyfin, and the image is precisely what I need. Be wary of prebuilt images: since many do not receive updates, they can cause you to run old, vulnerable software.

I find docker-compose to be a great tool for many types of software testing, especially networking software or anything with multiple processes. This is mainly due to the test setup being defined in code instead of tmux buffers. The environment being resource-constrained is a huge bonus in some cases, removing the risk of resource exhaustion. I have found this particularly useful while doing prolonged testing, such as fuzz-testing and other resource-intensive testing, where I can leave the test running in the background while preserving resources for the task at hand; or while starting a java process. Docker is often used for automated testing. It is, among other things, the backbone of gitlab's CI/CD pipelines. This is probably because of the ease of bundling dependencies, removing the reliance on a central repository server that has the wrong version of a specific python package and is missing that critical dependency. A particularly cool use of docker is Ansible molecule, where you can test host orchestration much quicker than if you had to spin up virtual machines.
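
Docker's testing role is easy to demonstrate with the testcontainers-go library, which starts throwaway containers from inside ordinary Go tests. This is a sketch under my own assumptions (the image tag is arbitrary, and the API has shifted a little between library versions):

    // redis_test.go: spin up a disposable redis container for one test.
    package example

    import (
        "context"
        "testing"

        "github.com/testcontainers/testcontainers-go"
        "github.com/testcontainers/testcontainers-go/wait"
    )

    func TestWithRedis(t *testing.T) {
        ctx := context.Background()
        redisC, err := testcontainers.GenericContainer(ctx, testcontainers.GenericContainerRequest{
            ContainerRequest: testcontainers.ContainerRequest{
                Image:        "redis:7-alpine", // assumed image tag
                ExposedPorts: []string{"6379/tcp"},
                WaitingFor:   wait.ForListeningPort("6379/tcp"),
            },
            Started: true,
        })
        if err != nil {
            t.Fatal(err)
        }
        // remove the container when the test finishes
        defer redisC.Terminate(ctx)

        endpoint, err := redisC.Endpoint(ctx, "")
        if err != nil {
            t.Fatal(err)
        }
        t.Log("redis listening on", endpoint) // point the client under test here
    }

The container is created when the test starts and removed when it ends, so the test suite carries its own redis dependency instead of relying on whatever happens to be installed on the machine.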

OCI containers are the de facto tool for "cloud images", and are thereby an essential part of cloud computing. Due to tools such as kubernetes and other cloud-native technologies, OCI is the technology used for "infinite scaling". OCI is used in these technologies due to its minimal overhead, while leaving all of the dependency burden with the customer.

Virtual machines are the tool if you want a clearly distinct machine, but without the hassle and cost of physical hardware. The biggest difference between a container and a VM is that a VM runs its own kernel, while a container piggybacks on the host kernel; run uname -r inside a docker container and it prints the host's kernel version. As with any rule, there is an exception, this time being the kata container runtime, which runs each container in a lightweight VM via KVM. The biggest advantage of a container over a VM is performance: since a container shares the host kernel and starts the primary process directly, the container image is generally significantly smaller than the VM, it requires less RAM and storage, and the networking may be faster. The startup time of a container is orders of magnitude lower than that of a VM. The container can be more dynamic and only needs to allocate the resources it actually uses. While a hypervisor can offer similar dynamic allocation (memory ballooning, for instance), it requires guest OS support and is generally less efficient.

Advantages of VMs are, among other things: predictable performance, since the resources are reserved; higher average stability, since each VM has its own kernel; more flexibility, in the sense that you can choose your kernel; and client isolation that is generally more secure. Since the hardware has built-in support for separating virtual machines from the physical one, VM-escape attacks are rare and often hypervisor-specific. Container-escape attacks are more common. That may in part be due to the container technology being younger. I run two VMs in my homelab, an opnsense firewall and an openbsd firewall. I use VMs because you can't run a non-linux operating system in a container on a linux host. Even if I could, I would probably choose VMs in these cases anyway due to the stability and reliability of the VM solution. I don't believe there is a real performance cost, since I would have guaranteed memory availability anyway, and the interfaces are passthrough.

LXC containers are a different container technology, developed by linuxcontainers.org. They are distinct from OCI containers, aiming to be more VM-like while removing the overhead of running a VM. The project has been ongoing since 2008, five years before docker. An LXC container boots a full userspace, init system and all, but shares the host kernel, providing a near-VM experience. This is the type of container I make the most use of in my homelab. I want as low resource usage as possible, and especially dynamic resource allocation, due to my limited hardware; thereby I avoid VMs. I do not like OCI containers due to the restrictions they impose, namely the lack of an init system. I don't just want jellyfin, I want rsyslogd too, so that I can send the logs to openobserve. LXC containers make that use case trivial, since it works the exact same way as with a VM, while keeping the resource usage small and the resource limits configurable in a convenient manner, as in the sketch below.
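
As an example of that convenience, here is a sketch using the go-lxc bindings to create an alpine container and cap its memory with a single cgroup key. The container name, release and limit are invented for the example, and exact function names may vary between go-lxc versions; the same can be done from the shell with lxc-create and one line in the container's config.

    // lxc_demo.go: create and start an LXC container with a memory cap.
    // A sketch using the go-lxc bindings; requires LXC installed and root.
    package main

    import (
        "log"

        lxc "gopkg.in/lxc/go-lxc.v2"
    )

    func main() {
        // "media" is a hypothetical container name.
        c, err := lxc.NewContainer("media", lxc.DefaultConfigPath())
        if err != nil {
            log.Fatal(err)
        }

        // Fetch a minimal alpine rootfs from the image server.
        opts := lxc.TemplateOptions{
            Template: "download",
            Distro:   "alpine",
            Release:  "3.19", // assumed release
            Arch:     "amd64",
        }
        if err := c.Create(opts); err != nil {
            log.Fatal(err)
        }

        // Cap memory via a cgroup v2 key; one line to change the limit later.
        if err := c.SetConfigItem("lxc.cgroup2.memory.max", "512M"); err != nil {
            log.Fatal(err)
        }

        if err := c.Start(); err != nil {
            log.Fatal(err)
        }
        log.Println("container state:", c.State())
    }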

A quick and dirty test shows that a newly created alpine LXC container allocates 12.3MiB of RAM, while a newly created alpine OCI container (started with podman) allocates 2MiB. While that is a six-fold increase for the LXC container, it is still very little memory either way, since I will only use one container for any one purpose. When you are designing to scale and will be dynamically creating and destroying containers, the difference adds up.
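
If you want to reproduce this kind of measurement on a cgroup v2 host, one way (not necessarily the one I used) is to read the container's memory.current file. The two paths below are assumptions that depend on runtime and distribution, and the <id> placeholder must be filled in with the real container id.

    // memusage.go: print current memory usage for a container's cgroup (cgroup v2).
    package main

    import (
        "fmt"
        "os"
        "strings"
    )

    func main() {
        paths := []string{
            // typical LXC payload cgroup for a container named "media" (assumed layout)
            "/sys/fs/cgroup/lxc.payload.media/memory.current",
            // typical podman/systemd scope (assumed layout; <id> is a placeholder)
            "/sys/fs/cgroup/machine.slice/libpod-<id>.scope/memory.current",
        }
        for _, p := range paths {
            b, err := os.ReadFile(p)
            if err != nil {
                fmt.Println(p, "->", err)
                continue
            }
            fmt.Println(p, "->", strings.TrimSpace(string(b)), "bytes")
        }
    }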