The state of virtual machines in 2020


Unless you have been living under a rock, you have probably heard of, or even used, virtual machines. In a very brief and loose definition, a virtual machine is an emulator for an operating system. Setting the complexity aside, it is analogous to playing a Game Boy Pokemon title on your smartphone or PC. Pokemon was originally meant to run on a Game Boy console, but since that console is obsolete, you can use an emulator to recreate the Game Boy's environment on a PC or smartphone. The game cannot tell whether it is running on the console or on the emulator. The same relationship holds between a virtual machine and an operating system. More on this later.

Background on VMs and Why They are Needed

In the mid-80s, operating systems with GUIs started to appear, and since then layer upon layer of increasingly complex code has been written for applications and operating systems alike. Fast-forward to today: there are countless software ecosystems in the market, and three major operating systems to consider.

The image above shows the relative popularity of the major operating systems and their versions from 2004 to the present.

Note: the image is a Google Trends screenshot, so it reflects search popularity for each term rather than actual usage data for every OS.

What does OS market share have to do with virtual machines, you ask?

Suppose you are the creator of Photoshop. You want your product to be used by as many people as possible, which means it should work well on all major operating systems and on their older versions that are still supported (where the market share is large enough).

Going with our stripped-down definition of a virtual machine, we can emulate macOS, Ubuntu and Windows 7 all on our Windows 10 machine, which is a far better alternative than maintaining a separate physical machine for each OS. The application can be run and checked for bugs all from one place. A VM can also be used to try out another operating system before switching, or, if you are like me, purely out of curiosity.

Virtual Machines

A Virtual Machine (VM) is essentially an emulation of a real computer: it runs an OS and executes programs just like a physical machine. You can also think of it as a program that acts as a virtual computer, with virtual hardware, running on an existing system. VMs run on top of a physical machine using a Virtual Machine Monitor (VMM), more commonly known as a hypervisor. A computer that runs a hypervisor with one or more virtual machines is called a host machine, or a bare-metal server, and each virtual machine is called a guest machine.

Hypervisor

The hypervisor is the piece of software or firmware upon which VMs run. The host machine provides the VMs with resources such as RAM and CPU, and the virtual hardware devices provided by the hypervisor map to real hardware on the host machine. The available resources are divided among the active VMs, and they can also be distributed according to each VM's needs, for example when running heavy applications.
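
To make the idea of handing out resources concrete, here is a rough sketch using VirtualBox's command-line tool (the VM name, OS type and sizes below are placeholders, not a recommendation):

    # Create and register a new guest machine (hypothetical name "demo-vm")
    VBoxManage createvm --name "demo-vm" --ostype Ubuntu_64 --register

    # Hand it a slice of the host's resources: 2 GB of RAM and 2 CPU cores
    VBoxManage modifyvm "demo-vm" --memory 2048 --cpus 2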

The VM running on top of the hypervisor is known as a guest machine. The guest machine consists of the application and everything the application requires to run, such as drivers, system binaries, etc. For security reasons a VM cannot directly use the hardware of the host machine, so it has to carry an entire virtualized hardware stack of its own, which essentially means it carries a whole OS. VMs create an environment in which the applications running inside have no clue about the VM or the host system.

A hosted hypervisor is one that runs as an application on top of the host's operating system rather than directly on the hardware. The benefit of a hosted hypervisor is that the underlying hardware matters less: the host's operating system is responsible for the hardware drivers instead of the hypervisor itself, so the setup is considered to have better "hardware compatibility." On the other hand, this additional layer between the hardware and the hypervisor creates more resource overhead, which lowers the performance of the VM.

As you can see, a VM demands a full-fledged OS on top of an existing one just to run applications, which is rather heavyweight and often unnecessary. Now that you know the basic components, a question might pop into your mind: why do we need a hypervisor at all?

The hypervisor acts as the intermediary between the guest and the host machine. It provides the VM with a platform that manages the needs of the guest and runs its OS, and it allows the host to share its resources with all the guests connected to it.

To make this clearer, let's go back to the Pokemon example.

As the Pokemon example suggests, every emulator can be seen as a kind of virtual machine. The emulator is the VM software, and you can consider the Pokemon game the OS. The game runs inside the emulator without knowing whether it is on a console or an emulator. The hypervisor-like layer in the VM/emulator makes sure the OS/game is provided with all the resources it needs to function properly. The best example of this is the virtual buttons the emulator provides, which are analogous to the real hardware buttons on the console and which you can use to control the game.

The Downfall

When VMs were first introduced, they were a very promising piece of software: they let you run another operating system as a program on your existing machine. The problem was that they were slow and bulky.

They hogged a lot of the host's resources, which forced the host to compromise on the programs already running on it. Users made peace with that because there was no better alternative at the time. Everyone was fine working with their VMs, and then one day Docker came along and changed the game for good. If you are a programmer or a tech geek, you have most probably heard of Docker. It is hard not to have, given all the attention it has been getting lately from developers, sysadmins and even the big leagues: Google, Amazon and VMware have all built services to support Docker.

What is Docker?

Docker is a tool designed to make it easier to create, deploy and run applications with the help of containers. Containers allow a developer to pack up an application with all the parts it needs, such as binaries and dependencies, into a single package that can be shipped in one go. Doing so ensures that the application will run on any machine running a Linux kernel, regardless of how that machine's settings differ from the machine on which the code was originally written.

In a way Docker behaves like a virtual machine, but rather than creating a whole virtual OS, Docker allows applications to run on the same Linux kernel as the host system. An application only has to carry the pieces that are not already present on the host machine, instead of a whole OS as in the case of a virtual machine. This gives Docker a significant speed and performance boost while also reducing application size. Another big plus is that Docker is open source, meaning anyone can contribute to it and extend its functionality to meet their own needs if a feature is not supplied out of the box.
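
You can see the size difference for yourself. Pulling a minimal Linux userland such as Alpine (just one example image) downloads only a few megabytes, nothing like the multi-gigabyte disk image a full VM would need:

    # Pull a minimal image and check its size
    docker pull alpine:latest
    docker images alpine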

Regardless of whether or not you have an immediate use-case in mind for Docker, I still think it’s important to understand some of the fundamental concepts around what a container is and how it compares to a Virtual Machine.

Before we get into how Docker works, you should first have an idea of the components that make it up. Let's start with containers.

Containers

Containers are meant to isolate an application and its dependencies into a self-contained unit that can run anywhere, while removing the need for dedicated hardware and allowing more effective use of the host's resources. The goals of containers align with those of virtual machines, even though the two take different approaches to the same problem.

Docker Engine

The Docker Engine is the runtime and tooling that runs and manages containers and images. It is very lightweight and runs natively on Linux systems.

It is made up of:

  • The Docker daemon
  • The Docker client
  • A REST API for interacting with the daemon
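
That REST API is not just an implementation detail; the client talks to the daemon through it, and you can poke it yourself. Assuming a default Linux install where the daemon listens on the /var/run/docker.sock socket:

    # Ask the daemon for its running containers, bypassing the docker CLI entirely
    curl --unix-socket /var/run/docker.sock http://localhost/containers/json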

Docker Client

The client is the part that you, the end user, interact with. Call it the UI, if you will.

Docker Daemon

When you, as an end user, send a command to Docker through the client, the daemon is what actually receives and executes it. These commands may cover anything from building and running images to distributing them. The daemon runs on the host machine, and as a user you never interact with it directly.

Dockerfile

The file that contains the instructions for creating a Docker image is called a Dockerfile. Once the Dockerfile is set up, the docker build command is used to create the image. The Dockerfile starts from a base image, which creates the initial layer of the image; subsequent instructions in the file create more layers on top of each other. More on this later.
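
As a minimal sketch, a Dockerfile for a tiny Python script might look like the following (the file name, image tag and dependency are made up for illustration):

    # Base image: creates the initial layer
    FROM python:3.11-slim
    # Each subsequent instruction adds another layer on top
    COPY app.py /app/app.py
    RUN pip install requests
    # What runs when a container is started from the image
    CMD ["python", "/app/app.py"]

With the file in place, building and tagging the image is a single command run from the same directory:

    # Build an image from the Dockerfile in the current directory and tag it
    docker build -t my-app:latest .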

Docker Image

Docker images are read-only templates that you build from the set of instructions given in a Dockerfile. In simple terms, an image is a mold or blueprint for containers, whose shape or wireframe cannot be changed (read: immutable). The image defines how the containers produced from it will look and function. It bundles together the Dockerfile, the dependencies and all the code your application needs to run.
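
Because the image is only a blueprint, you can stamp out as many containers from it as you like; reusing the hypothetical my-app:latest image from the Dockerfile sketch above:

    # Two independent containers created from the same immutable image
    docker run -d --name copy-one my-app:latest
    docker run -d --name copy-two my-app:latest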

As mentioned earlier, the image has a layered structure created by the Dockerfile instructions: the base image forms the bottom layer, and each subsequent instruction creates another layer on top.

Docker's powerful structure and speed come from this layered file system, called the Union File System. Each instruction in the Dockerfile creates a new layer in the image or replaces a layer beneath it.

For example, Python is a popular base image, and additional instructions can add NumPy or Pandas as intermediate layers on top of it.
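
If you are curious what those layers actually look like, Docker will list them for you, assuming an image such as the my-app:latest example above has been built:

    # Show the layers of an image and the instruction that created each one
    docker history my-app:latest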

Union File Systems

A union file system is a stackable file system that helps Docker build image files. It can stack files and directories from different file systems (branches) on top of each other. Branches with the same path are merged into a single directory, which avoids the need to create copies of layers that already exist.

(This is just the basic gist of union file systems; there is much more to them than explained above. A small hands-on sketch of the merging idea follows the list below.) There are two main benefits to the layered system:

  • It is duplication-free, as explained above.
  • Changes are easy- As all the layers are kept separate, modifying one of them is straightforward, because the change applies only to that layer and not to the full container.
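
Docker's default storage driver on modern Linux (overlay2) builds on the kernel's OverlayFS, and you can reproduce the merging idea by hand; the directories below are placeholders and the command needs root:

    # Merge a read-only lower branch with a writable upper branch into one view
    mkdir -p /tmp/lower /tmp/upper /tmp/work /tmp/merged
    mount -t overlay overlay \
      -o lowerdir=/tmp/lower,upperdir=/tmp/upper,workdir=/tmp/work \
      /tmp/merged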

Volumes

Volumes are the data part of a container. They are initialized when a container is created and live in a part of the container separate from the union file system. The benefit of this independence is that changes made to the container through the union file system leave the data volumes untouched. If something in a volume needs to change, Docker provides a way to interact with and modify the volume directly. As a bonus, volumes can be shared and reused among multiple containers, since they contain raw data.
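
In practice, creating a volume and attaching it to containers takes only a couple of commands (the volume, container and image names here are just examples):

    # Create a named volume managed by Docker
    docker volume create app-data

    # Mount it into a container; anything written to /data outlives the container
    docker run -d --name web -v app-data:/data nginx:latest

    # Attach the same volume to a second container to share the data
    docker run --rm -v app-data:/data alpine ls /data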

Now that we have the technical aspects covered, let's talk about Docker technology in general, mainly about containers. If you dig a little, you will find that containers are not a new concept at all. Older container technologies include BSD jails and Solaris Zones, and Google has had its own Linux container tech for years. So now a question pops into your head: why the sudden hype around Docker?

And the answer to that question is:

  • The biggest reason is that everyone loves the Docker Whale 😜😉
  • On a more serious note, with Docker we get Docker Hub. Docker Hub is sort of a Play Store where you can find thousands of Docker images created by the community, readily available to use. And as we saw above, these images can often be used with little to no modification.
  • Speed- Containers are very lightweight and hence fast. Because containers run directly on the host's kernel, they need very few resources to get up and running compared to a VM.
  • Modular- As you saw, we can easily create containers and move them around between machines without breaking functionality. We can segregate multiple functionalities into multiple containers and run them in isolation from each other; for example, we can run a Node server in one container and an instance of Apache in another (sketched right after this list). The container architecture also makes it easy to scale an application effectively without much hassle.
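
To make that Node-plus-Apache example concrete, the commands might look roughly like this (my-node-api is a hypothetical image you built yourself; the ports are arbitrary):

    # An Apache server in one container...
    docker run -d --name web -p 8080:80 httpd:2.4

    # ...and a Node API in another, isolated from the first
    docker run -d --name api -p 3000:3000 my-node-api:latest

    # Scaling the API is just starting more containers from the same image
    docker run -d --name api-2 -p 3001:3000 my-node-api:latest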

Are Virtual Machines dead?

The short answer is no. Docker brings some nice tricks to the table, but it is not yet a full-blown replacement for the VM. While Docker is certainly gaining a lot of steam, it won't be replacing VMs any time soon. Docker and containers will continue to grow, but there are certain scenarios where using a VM is still better suited.

For example, Docker is more suitable when you want to run many instances of a single application, while VMs are generally preferred when you have to run multiple applications across multiple servers.

As discussed above, containers let you break your application into functional, discrete parts, but dividing the application into smaller entities also creates more and more moving parts to manage at once, which for a huge application can get out of hand very quickly.

While there are a lot of perks to using Docker, one particular issue always pops up: security. The very infrastructure that makes Docker so compelling is also the cause for concern. As you already know, Docker containers use the same kernel as the host OS, so the barrier between the host and the other containers becomes thinner. Containers can make system calls to the host kernel, which gives attackers a larger surface area to strike. VMs do not face this issue, because they reach the kernel through the hypervisor (via hypercalls), which provides a layer of abstraction between the VM and the host kernel and therefore a more secure solution than Docker. Developers are likely to pick a VM when security is a huge concern (which it is most of the time), because the very infrastructure that made VMs slow and less popular also provides a higher degree of security.
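
You can see the shared-kernel point directly: a container reports the host's kernel, because there is no guest kernel in between (alpine here is just a convenient small image):

    # The kernel version printed inside the container is the host's kernel
    uname -r
    docker run --rm alpine uname -r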

Of course, an argument arises here that VMs have had a lot of time to evolve while Docker is still in its infancy, and that concerns like security are certain to improve as Docker and containers get more exposure. Maybe in the future Docker will be the only thing we need, but for now the debate lies with the first-hand users who work with these tools every day and have a decision to make. On a more general note, at the present moment Docker and VMs need to coexist to cater to the different requirements of the industry.

Conclusion

I hope this piece left you aware of, and equipped with knowledge about, both Docker and VM technologies. Take this article as a stepping stone if you are interested in learning more about them or even using one of them in a project.
