Kata Containers — Kubernetes, Firecracker VMM with Kata

Kata Containers on Kubernetes and Kata Firecracker VMM support

Gokul Chandra
7 min read · Apr 16, 2019


Enjoy the speed of containers while keeping the security of virtual machines? What does this mean, and how does this approach differ from the existing container architecture? Traditional containers use Linux control groups (cgroups) for managing and allocating resources and namespaces for container isolation. The kernel is shared between the host operating system and the guest containers, leaving the other container workloads in a cluster vulnerable if one container is compromised. This issue is one of the big drivers behind Kata Containers, which originated from Intel’s Clear Containers project started in 2015.

In Kata, each container has its own lightweight virtual machine and mini-kernel, providing container isolation via hardware virtualization facilitated by Kata’s six components: Agent, Runtime, Proxy, Shim, Kernel and packaging of QEMU. It is designed to be architecture agnostic, run on multiple hypervisors and be compatible with the OCI specification for Docker containers and CRI for Kubernetes.

The recent addition of the CRI (Container Runtime Interface) to Kubernetes means Kata Containers can be controlled by any OCI (Open Container Initiative) compatible CRI implementation, CRI-O (a lightweight alternative to using Docker as the runtime for Kubernetes) being the main one. Kata Containers can now receive container annotations that tell it when and how to run pod VMs or container workloads within those pods. In Kubernetes clusters with CRI-O and kata-runtime as the default container runtime, the launch of a pod results in the creation of a VM. Then, when a container is added to that pod, it is launched as a container inside the pod’s VM.

Pod-VM’s Architecture — CRI-O

Kata Containers on Kubernetes

The key difference between the Kata approach and other container engines is that Kata uses hardware-backed isolation as the boundary for each container or collection of containers in a Kubernetes pod. As of today, Kata can be installed with CRI-O or containerd and cannot be used with dockershim. Because the architecture relies on QEMU (hardware virtualization), Kata will only work on hosts that are themselves virtual machines if nested-virtualization support is enabled on them.

kata-runtime: the Open Container Initiative (OCI) compliant runtime that is called by an upper-level orchestrator (such as Docker or a Kubernetes CRI shim). With Kata installed, we can start pods that make use of Kata Containers.

Kata-Runtime Installation
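Once installed, the runtime can be sanity-checked from the host shell. A minimal sketch, assuming the kata-runtime binary is already on the PATH:

# Verify the host can run Kata Containers (CPU virtualization extensions, required kernel modules, etc.)
sudo kata-runtime kata-check

# Print the environment Kata will use: hypervisor, guest kernel, rootfs image, agent, proxy and shim paths
sudo kata-runtime kata-env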

CRI-O: CRI-O is an implementation of the Kubernetes CRI (Container Runtime Interface) to enable using OCI (Open Container Initiative) compatible runtimes. It is a lightweight alternative to using Docker as the runtime for Kubernetes.

CRI-O

As seen above, Docker is replaced with CRI-O, and all client operations can be performed using crictl.

kata-proxy and kata-shim: the proxy binary runs on the host, one instance per pod, and multiplexes/demultiplexes the command and I/O streams exchanged between the host-side components and the agent inside the VM. The shim binary runs on the host to handle stdio and signal forwarding for container processes.

Kata-Proxy and Kata-Shim

Along with the above binaries, artifacts associated with the VM itself, such as the guest root filesystem image (containers.img) and the guest kernel (vmlinuz), are included.

Enabling Kubelet to use CRI-O as the runtime:

Extra arguments can be provided in the Kubelet configuration to use CRI-O as the CRI, replacing the conventional Docker runtime.

CRIO Configuration
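As a rough sketch of what such a configuration can look like, the kubelet can be pointed at the CRI-O socket through a systemd drop-in. The drop-in file name and the default socket path below are assumptions, not taken from the screenshot:

# Create an illustrative kubelet drop-in pointing the kubelet at CRI-O
sudo mkdir -p /etc/systemd/system/kubelet.service.d
cat <<'EOF' | sudo tee /etc/systemd/system/kubelet.service.d/0-crio.conf
[Service]
Environment="KUBELET_EXTRA_ARGS=--container-runtime=remote --container-runtime-endpoint=unix:///var/run/crio/crio.sock --runtime-request-timeout=5m"
EOF

# Reload systemd and restart the kubelet so the new arguments take effect
sudo systemctl daemon-reload
sudo systemctl restart kubelet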

Initializing Kubernetes with CRI-O:

CRI-O as CRI
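A minimal sketch of the bootstrap command, assuming the default CRI-O socket path and a pod CIDR that matches the CNI plugin chosen later:

# Point kubeadm at the CRI-O socket instead of the default Docker one
sudo kubeadm init --cri-socket=/var/run/crio/crio.sock --pod-network-cidr=10.244.0.0/16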

kubeadm with the above configuration can be used to bootstrap a Kubernetes cluster with CRI-O, and crictl replaces the Docker CLI.

Using Crictl in place of Docker-CLI
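A few illustrative crictl equivalents of everyday Docker commands (CONTAINER_ID is a placeholder):

# List pod sandboxes and containers (roughly what "docker ps" showed)
sudo crictl pods
sudo crictl ps -a

# List pulled images
sudo crictl images

# Inspect a container and fetch its logs
sudo crictl inspect CONTAINER_ID
sudo crictl logs CONTAINER_ID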

A CNI plugin can be installed the same way users do with kubeadm and Docker.
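For example, a CNI such as Flannel can be applied with kubectl exactly as in a Docker-based cluster (the manifest URL below is illustrative and may have moved since publication):

# Install the Flannel CNI plugin; the pod CIDR must match the one passed to kubeadm init
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml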

Kubernetes Cluster

Running Kata Containers as Kubernetes Pods:

The CRI-O project supports the ability to provide a secondary runtime to handle untrusted workloads using Kubernetes annotations. In CRI-O this is called the untrusted runtime. This means that, in an environment with workloads of various levels of trust, CRI-O allows your Kubernetes cluster to be composed of a mix of runc and kata-runtime based pods. Note that if this untrusted runtime is not provided in the CRI-O configuration, all workloads use the trusted runtime, which defaults to runc.
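A quick sketch of the relevant CRI-O settings, assuming runc and kata-runtime are installed at the usual paths:

# Make sure the [crio.runtime] section of /etc/crio/crio.conf contains keys along these lines:
#
#   runtime = "/usr/bin/runc"                              trusted workloads keep using runc
#   runtime_untrusted_workload = "/usr/bin/kata-runtime"   untrusted pods are handed to Kata
#   default_workload_trust = "trusted"
#
# then restart CRI-O so the change takes effect
sudo systemctl restart crio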

By default, all the Kubernetes control-plane components are created on runc, as kubeadm does not set the untrusted annotation when creating these containers. As seen below, the 18 Kubernetes control-plane components default to runc because the annotation selecting the untrusted runtime was not set in their manifests.

RunC

No control-plane container/pod is identified as a Kata workload, so none of them appear in the “kata-runtime list” output.

A sample container can be created with the io.kubernetes.cri-o.TrustedSandbox annotation set to “false” in the manifest, as shown below.

Sample Manifest
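The post shows the manifest only as a screenshot; a minimal sketch of such a pod spec, using the CRI-O TrustedSandbox annotation (the container image here is an assumption), could look like this:

# Pods annotated as untrusted sandboxes are scheduled onto the untrusted runtime (kata-runtime)
cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: test-kata
  annotations:
    io.kubernetes.cri-o.TrustedSandbox: "false"
spec:
  containers:
  - name: test-kata
    image: nginx
EOF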

The above manifest creates the test-kata VM-containers on Kubernetes using CRI-O and the Kata container runtime. These pods do not run on runc, as the annotation marking them as untrusted workloads was set.

Crictl inspecting Kata VM Containers

kata-runtime list output:

Kata-Runtime

As Kata uses QEMU to create lightweight VMs, the pods created with the manifest above appear in the list of QEMU processes, as seen below.

QEMU Supporting Kata to emulate Virtual Machines
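This can be cross-checked from the host shell with a couple of commands:

# List the Kata sandboxes known to the runtime
sudo kata-runtime list

# Each running Kata pod appears as a qemu process on the host
ps -ef | grep qemu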

As seen above, Kubernetes can leverage CRI-O and kata-runtime to create pod-VMs, which Kubernetes accepts and operates as ordinary pods while providing all the properties and features that come with a virtual machine.

Firecracker VMM + Kata Containers

Firecracker from AWS is a virtual machine monitor (VMM) used to create and manage microVMs, with a minimalistic and simple design that reduces memory overhead. Firecracker is designed specifically for running transient and short-lived processes such as functions and serverless workloads, which require fast start-up and high density with minimal resource utilization. Firecracker was written in Rust as a way to enhance the backend implementation of AWS Lambda and AWS Fargate.

Kata Containers vs Firecracker:

Kata executes containers within QEMU-based virtual machines. Firecracker is a cloud-native alternative to QEMU that is purpose-built for running containers safely and efficiently, and nothing more. Firecracker is being positioned as a next-generation option for Kata, more focused on modern workloads. Firecracker also allows container runtimes like containerd to manage containers as microVMs, which lets Docker and container orchestration frameworks like Kubernetes use Firecracker. However, the initial integration with Kubernetes is limited to external APIs.

Creating a MicroVM using Firecracker

The Firecracker VMM exposes a fully API-driven operational model, which ties in easily with serverless and Lambda-style architectures. Once the Firecracker API starts serving requests, users can supply the guest kernel, guest filesystem and machine configuration, and start and stop the microVM, all through API calls.

Below is a simple example showing how a VM can be created:
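Before any API call, the Firecracker binary has to be started with a UNIX API socket. A minimal sketch (the socket path is an assumption):

# Remove any stale socket and start the Firecracker API server
rm -f /tmp/firecracker.socket
./firecracker --api-sock /tmp/firecracker.socket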

Providing kernel_image_path and file-system:

File System and Kernel Configuration
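A sketch of the corresponding API calls, assuming an uncompressed guest kernel and an ext4 root filesystem image have already been downloaded to the paths shown:

# Set the guest kernel and its boot arguments
curl --unix-socket /tmp/firecracker.socket -i \
  -X PUT 'http://localhost/boot-source' \
  -H 'Content-Type: application/json' \
  -d '{"kernel_image_path": "./hello-vmlinux.bin", "boot_args": "console=ttyS0 reboot=k panic=1 pci=off"}'

# Attach the root filesystem as the root block device
curl --unix-socket /tmp/firecracker.socket -i \
  -X PUT 'http://localhost/drives/rootfs' \
  -H 'Content-Type: application/json' \
  -d '{"drive_id": "rootfs", "path_on_host": "./hello-rootfs.ext4", "is_root_device": true, "is_read_only": false}'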

Configuring resources and starting the VM:

Resource Configuration
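A sketch of the remaining calls, sizing the microVM and then booting it:

# Give the microVM one vCPU and 128 MiB of memory
curl --unix-socket /tmp/firecracker.socket -i \
  -X PUT 'http://localhost/machine-config' \
  -H 'Content-Type: application/json' \
  -d '{"vcpu_count": 1, "mem_size_mib": 128, "ht_enabled": false}'

# Boot the microVM
curl --unix-socket /tmp/firecracker.socket -i \
  -X PUT 'http://localhost/actions' \
  -H 'Content-Type: application/json' \
  -d '{"action_type": "InstanceStart"}'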

With this, a microVM is created on the host, and the whole architecture is implemented in Rust.

MicroVM Creation

The Firecracker process exposes a REST API via a UNIX socket, which is used to manage the lifecycle of a microVM. Users can only access the guests through a UART/serial console, because they do not even run SSH. Apart from the serial console, these microVMs may be connected to a virtual NIC, a block device and a one-button keyboard.

Running Kata containers utilizing Firecracker VMM/Hypervisor

The 1.5.0-rc2 release of Kata Containers introduces support for the Firecracker hypervisor. This is an initial release and still evolving. Docker should be configured to use kata-runtime, and vhost_vsock is leveraged to enable communication between the host and the virtual machine.

{
  "runtimes": {
    "kata": {
      "path": "/opt/kata/bin/kata-runtime"
    }
  },
  "storage-driver": "devicemapper"
}
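After updating the daemon configuration, the vsock module needs to be loaded and Docker restarted; a quick sketch:

# Load the vsock transport used by Kata to talk to the agent inside the Firecracker VM
sudo modprobe vhost_vsock

# Restart the Docker daemon so the new runtime and storage driver take effect
sudo systemctl daemon-reload
sudo systemctl restart docker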

With this Docker configuration, users can now run Kata containers utilizing Firecracker. The runtime flag can be passed to the Docker CLI to run containers, as shown below:

docker run --runtime=kata -itd --name=kata-test alpine sh

In this case QEMU is not used as the hypervisor; Kata uses the Firecracker hypervisor to create the VMs.

Firecracker as Hypervisor to host Virtual Machines
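A quick way to confirm this from the host, mirroring the earlier QEMU check:

# The Kata sandbox now shows up as a firecracker process rather than a qemu one
ps -ef | grep firecracker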

With this support, users will be able to use Firecracker with Kubernetes to create microVMs as the project evolves.

Kata works well in environments where users need the efficiency of a container stack with a higher level of security than running containers side by side on a single kernel. Multiple scenarios, such as network functions virtualization (NFV), edge computing, development and testing, and containers as a service, can make great use of Kata. NFV on Kubernetes can benefit significantly, as Kata can bring to containers all the features a VM can have, which means NFV requirements (such as multiple interfaces, DPDK and SR-IOV) can be made easier in the container space using this approach. In addition, Kata’s small footprint and high level of security make it well suited to edge deployments where resources are limited.
