vSphere Integrated Containers 0.4 – Inspecting VCH and ContainerVM

Last week, I had an interesting conversation with my friend Michael about vSphere Integrated Containers (VIC) in its current version 0.4. We discussed some of the key concepts and how they relate to other container implementations out there. I decided to summarize the key observations in a little more detail here, as I expect this information to be interesting for operations teams once they start running VIC.
Please note: this is based on the currently available Open Source VIC project in version 0.4 running on vSphere 6.0 in my homelab.
For simplicity, I decided to go with a “standalone ESXi” installation of my Virtual Container Host (VCH) in this example.

First, I created a new container host called “VCH001” on my ESXi host from my PhotonOS based worker VM:
root@photonbox [ /workspace/vic ]# ./vic-machine-linux create --bridge-network 'VCH Bridge' --external-network 'VM Network' --image-datastore mydatastore --target 'root@esxi01.think-v.com' --name VCH001

The output of this command shows the necessary details:
INFO[2016-08-06T19:51:56Z] Please enter ESX or vCenter password:
INFO[2016-08-06T19:52:00Z] ### Installing VCH ####
INFO[2016-08-06T19:52:00Z] Generating certificate/key pair - private key in ./VCH001-key.pem
INFO[2016-08-06T19:52:02Z] Validating supplied configuration
INFO[2016-08-06T19:52:05Z] Firewall status: DISABLED on /ha-datacenter/host/esxi01.think-v.com/esxi01.think-v.com
INFO[2016-08-06T19:52:05Z] Firewall configuration OK on hosts:
INFO[2016-08-06T19:52:05Z] /ha-datacenter/host/esxi01.think-v.com/esxi01.think-v.com
INFO[2016-08-06T19:52:05Z] License check OK
INFO[2016-08-06T19:52:05Z] DRS check SKIPPED - target is standalone host
INFO[2016-08-06T19:52:07Z] Creating Resource Pool VCH001
INFO[2016-08-06T19:52:07Z] Creating appliance on target
INFO[2016-08-06T19:52:07Z] Network role client is sharing NIC with external
INFO[2016-08-06T19:52:07Z] Network role management is sharing NIC with external
INFO[2016-08-06T19:52:09Z] Uploading images for container
INFO[2016-08-06T19:52:09Z] bootstrap.iso
INFO[2016-08-06T19:52:09Z] appliance.iso
INFO[2016-08-06T19:52:22Z] Waiting for IP information
INFO[2016-08-06T19:52:42Z] Waiting for major appliance components to launch
INFO[2016-08-06T19:52:44Z] Initialization of appliance successful
INFO[2016-08-06T19:52:44Z]
INFO[2016-08-06T19:52:44Z] Log server:
INFO[2016-08-06T19:52:44Z] https://VCH_IP:2378
INFO[2016-08-06T19:52:44Z]
INFO[2016-08-06T19:52:44Z] DOCKER_HOST=VCH_IP:2376
INFO[2016-08-06T19:52:44Z]
INFO[2016-08-06T19:52:44Z] Connect to docker:
INFO[2016-08-06T19:52:44Z] docker -H VCH_IP:2376 --tls info
INFO[2016-08-06T19:52:44Z] Installer completed successfully
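As a quick sanity check (my own suggestion, not part of the installer output), you can point the Docker client at the endpoint reported above – the exact docker info output depends on the VIC build:

# Quick sanity check against the freshly deployed VCH (VCH_IP as reported by the installer)
docker -H VCH_IP:2376 --tls info
docker -H VCH_IP:2376 --tls ps -a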

More details about the inner workings can be found in the VIC 0.4 blog posts by Cormac that are also listed in the link section below. In this post, I’d like to focus on state information and how it is handled in VIC 0.4.

First of all, it is important to understand how VCHs in VIC differ from other (in this case Linux-based) container solutions. While each container in an N:1 model (containers:Linux kernel) has its own private namespaces, the underlying shared kernel provides the container control plane that looks into containers and performs process-related actions (start, stop, …). Runtime environment and control plane are directly coupled.

In VIC, the runtime/execution environment of the container is a so-called containerVM (based on Photon OS) which is decoupled from its “control plane”, the Virtual Container Host itself. This creates a new layer of abstraction where both communication flows and state information need to be captured and made available.

To establish a secure communication path between these two components, VIC also introduces the concept of a Tether to connect into the actual containerVM. This concept is part of the Port Layer Abstractions that allow VIC to be extensible. More details are described on the VIC Container Abstractions documentation page.

Let me share a summary of what the VCH and containerVMs actually look like on the infrastructure – and where state information is actually stored. First, let’s look at the VMX file of the VCH. As expected, there are two vNICs attached:

ethernet0.virtualDev = "vmxnet3"
ethernet0.networkName = "VM Network"
ethernet0.pciSlotNumber = "192"
ethernet0.uptCompatibility = "TRUE"
ethernet0.present = "TRUE"
ethernet1.virtualDev = "vmxnet3"
ethernet1.networkName = "VCH Bridge"
ethernet1.pciSlotNumber = "224"
ethernet1.uptCompatibility = "TRUE"
ethernet1.present = "TRUE"

Here, we also find the boot ISO (appliance.iso) that was transferred during the deployment of the VCH:

ide0:0.deviceType = "cdrom-image"
ide0:0.fileName = "appliance.iso"
ide0:0.present = "TRUE"

The general approach for storing state information is described in the Configuration persistence mechanism overview documentation. According to this, VIC actually makes use of the vSphere extraConfig and guestinfo mechanisms to store relevant information. But where do extraConfig and guestinfo actually reside? In a normal vSphere VM, this information is stored in the VMX file of the VM (and remember, a container in VIC actually is a VM – the containerVM).
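If you want to look at this data from the outside without opening the VMX file by hand, one option – a quick sketch, assuming you have govmomi/govc configured against the ESXi host – is to dump a VM’s extraConfig, which includes the guestinfo keys:

# Dump a VM's extraConfig (including guestinfo keys) via govc; <vm-name> is a placeholder
govc vm.info -e <vm-name>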

Starting a simple “hello-world” container should trigger the whole workflow, including the creation of a new VM. But let’s go through it step by step:

root@photonbox [ /workspace/vic ]# docker -H VCH_IP:2376 --tls run -it hello-world
Unable to find image 'hello-world:latest' locally
latest: Pulling from library/hello-world
a3ed95caeb02: Pull complete
c04b14da8d14: Pull complete
Digest: sha256:548e9719abe62684ac7f01eea38cb5b0cf467cfe67c58b83fe87ba96674a4cdd
Status: Downloaded newer image for library/hello-world:latest

Looking at the recently executed containers from my worker VM, we can see the following reference:

root@photonbox [ /workspace/vic ]# docker -H VCH_IP:2376 --tls ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
2cf7f483bf6e hello-world "/hello" Less than a second ago Stopped jolly_panini

So our container ran as ID 2cf7f483bf6e. What does that containerVM actually look like on our standalone ESXi host, and – even more interestingly – where does the information about the container (from docker ps -a) come from?

First of all, there is a newly created VM named 2cf7f483bf6e7f32daa53f51ca388d5fb153f78d3a74d313318099086638ad58 – just as expected. Looking at its VMX file, we find a lot of the session information that we already saw in docker ps -a:

guestinfo./common/name = "jolly_panini"
guestinfo./sessions|2cf7f483bf6e7f32daa53f51ca388d5fb153f78d3a74d313318099086638ad58/common/name = "jolly_panini"
guestinfo./sessions|2cf7f483bf6e7f32daa53f51ca388d5fb153f78d3a74d313318099086638ad58/cmd/Path = "/hello"
guestinfo./repo = "hello-world"
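If you have SSH access to the ESXi host, you can also pull these keys straight out of the containerVM’s VMX file – the datastore and VM names below are placeholders, adjust them to your environment:

# On the ESXi host: show all guestinfo keys of the containerVM
grep guestinfo /vmfs/volumes/<datastore>/<containerVM-name>/<containerVM-name>.vmx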

The containerVM mounts the “bootstrap.iso” from VCH001’s VM folder (this ISO was also deployed via the vic-machine installer):

ide0:0.deviceType = "cdrom-image"
ide0:0.fileName = "/vmfs/volumes/d37f7a1b-0ab13c48/VCH001/bootstrap.iso"
ide0:0.present = "TRUE"

The containerVM also has a serial connection to the VCH (explanation):

serial0.allowGuestConnectionControl = "FALSE"
serial0.fileType = "network"
serial0.fileName = "tcp://VCH_IP:8080"
serial0.network.endPoint = "client"
serial0.yieldOnMsrRead = "TRUE"
serial0.present = "TRUE"
serial0.hardwareFlowControl = "TRUE"

The containerVM’s network adapter is connected to the “VCH Bridge” portgroup and therefore only talks to the VCH. This is where the container traffic flows; management and control plane traffic goes via serial0.

ethernet0.virtualDev = "vmxnet3"
ethernet0.networkName = "VCH Bridge"
ethernet0.pciSlotNumber = "192"
ethernet0.uptCompatibility = "TRUE"
ethernet0.present = "TRUE"
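On the ESXi host you can confirm that this portgroup exists and which vSwitch it belongs to – a standard esxcli call, assuming a standard vSwitch as in my lab:

# On the ESXi host: list standard vSwitch portgroups and filter for the VCH bridge
esxcli network vswitch standard portgroup list | grep "VCH Bridge"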

The containerVM also has its own hard disk (an attached VMDK):

scsi0.virtualDev = "pvscsi"
scsi0.present = "TRUE"
scsi0:0.deviceType = "scsi-hardDisk"
scsi0:0.fileName = "2cf7f483bf6e7f32daa53f51ca388d5fb153f78d3a74d313318099086638ad58.vmdk"
scsi0:0.present = "TRUE"
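The on-disk layout mirrors this: the containerVM should get its own folder (named after the container ID) on the datastore, holding the VMX and the VMDK shown above. A quick way to check – again with a placeholder datastore path:

# On the ESXi host: list the containerVM's folder on the datastore
ls -lh /vmfs/volumes/<datastore>/2cf7f483bf6e7f32daa53f51ca388d5fb153f78d3a74d313318099086638ad58/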

To delete the VCH and the containerVMs, vic-machine-linux is called with the “delete” option:

root@photonbox [ /workspace/vic ]# ./vic-machine-linux delete --target esxi01.think-v.com --user root --name VCH001
INFO[0000] Please enter ESX or vCenter password:
INFO[2016-08-06T20:50:24Z] ### Removing VCH ####
INFO[2016-08-06T20:50:28Z] Removing VMs
INFO[2016-08-06T20:50:33Z] Removing images
INFO[2016-08-06T20:50:34Z] Removing volumes
INFO[2016-08-06T20:50:36Z] Removing appliance VM network devices
INFO[2016-08-06T20:50:38Z] Bridge network was not created during VCH deployment, leaving it there
INFO[2016-08-06T20:50:40Z] Removing Resource Pool VCH001
INFO[2016-08-06T20:50:40Z] Completed successfully


In summary, all container state information is kept close to the containerVM, stored in its VMX file. VCH and containerVM use the ISO files that are transferred during the vic-machine install process. VIC also introduces a new level of abstraction between control plane and execution environment that allows VIC to be extensible for future use cases.


Additional links around VIC 0.4 – most of them by Cormac:

My VMworld 2016 session proposals (NFV, Containers/Docker, …)

VMworld session voting time has come and I just wanted to share my session proposals – your votes would be appreciated:

On the multi-dimensional evolution of platforms and applications

I’d like to touch on a topic that I am currently seeing across several of my areas of interest. In general, it relates a lot to the overall topic of “First, Second and Third Platform”, but I’d like to focus more on the individual implications for multiple domains. Over the course of the last few months, I have been involved in several discussions around different platforms and applications as well as their individual evolution and maturity. My personal observation is that the two don’t necessarily evolve synchronously. Therefore, it is important not only to identify the phase that you are currently in, but also to understand the operational implications of the “generation disconnect” between app and platform.


Evolution of Platforms

As mentioned above, I’d like to relate my observations to the three platform generations below. I’d like to point out that these three generations are subdivided into several different technologies and can look and feel different in specific use cases or fields of application. The common understanding of the generations is:

[Image: platform evolution]


In addition to these phases, I see different “implementations” of the respective platform generation. Take client-server as one example: this can be a physical-server-only model, but it also stretches to server virtualization and potentially even to “VM-oriented” hosting or cloud services. My friend Massimo also wrote a nice piece on this.


Evolution of Applications

One of my key observations is that there is no simple 1:1 connection between applications and platforms. With the rise of 2nd generation platforms, not all applications from the 1st platform were dropped and immediately made available for the next-generation platform. It is rather an evolution for applications that are still business-relevant and therefore worth optimizing for the next-generation platform. And here comes the important observation: I believe there are (at least) three phases in an application evolution cycle that happens for each platform generation – or potentially even for each concrete implementation of a platform generation. I’ll call these phases “Unchanged”, “Optimized” and “Purpose-built” for now:

[Image: application evolution]

But how does that fit into the overall platform picture? I’ll try to merge the previous two pictures into one, which also shows a potential application evolution path across platform generations. As you can see, there can be a slight overlap between the “purpose-built” phase of the previous platform and the “unchanged” phase of the next-generation platform.

[Image: evolution of platforms and applications]

But let’s move on to two concrete examples where I see this applying.


Example 1: Network Functions Virtualization

I’ll start with Network Functions Virtualization (NFV). NFV is a Telco industry movement that is supposed to provide hardware independence, new ways of agility, reduced time to market for carrier applications & services, cost reduction and much more – basically, it is about delivering the promises of Cloud Computing (and the Third Platform) for the Telco industry (read more about it here). The famous architectural overview as described by ETSI can be seen below:

[Image: ETSI NFV architectural overview]

NFV differentiates between several functional components such as the actual platform (NFVI = Network Functions Virtualization Infrastructure), the application (VNF = Virtual Network Function), the application manager (VNF Manager) and the orchestration engine.

So how could this look in reality? Let’s assume the VNF Manager detects a certain usage pattern of its VNF and that the VNF is reaching its maximum scale for the currently deployed number of VNF instances. The VNF Manager then talks to the Orchestrator, which could trigger e.g. a scale-out of the VNF by deploying additional worker instances on the underlying infrastructure/platform resources. The worker instances of the VNF could then automatically be included in the load distribution and instantly be integrated into the necessary backend services where applicable. All of that happens via open APIs and standardized interfaces – it looks and feels a lot like a typical implementation example of a “third platform” including a “purpose-built” app.

Now for a quick reality check. ETSI’s initial NFV whitepaper is from October 2012. It basically describes the destination that the industry is aiming for. And while there might be some examples where VNFs, NFVI and Orchestration are already working hand in hand, there is still a lot of work to do. Some of the “NFV applications” (or VNFs) might have just been P2V’ed (1:1 physical-to-virtual conversion) onto a virtualization platform and basically have the same configuration and the same identity, and are kept as close to their physical origins as possible. This allows a VNF provider/vendor to keep existing support procedures and organizations while offering their customers an “NFV 1.0 product” that provides some early benefits of NFV (hardware independence, faster time to market, …). But it also means that you transfer some configurations that made perfect sense in the physical world over to the virtual world – where they make questionable sense at best. In this case, I’d actually talk about a move from a “purpose-built” app on the first platform to an “unchanged” app on the second platform.

One example: a physical server in a telco application had 30 x 300 GB hard disks (roughly 9 TB raw), 2 x 4-core CPUs and 128 GB RAM. It never used more than 1 TB of storage, and average utilization was below 4 CPUs and 32 GB RAM. The “unchanged” version of this app would be a 1:1 conversion with all the (unnecessary) resource overhead provided in a virtual machine. The “optimized” version would be a right-sized application (so only 1 TB of storage, 4 CPUs and 32 GB RAM) that also leverages simple configuration files for installation as well as crash-consistent and persistent data management to allow backup & restore as a VM. But a “purpose-built” version of that app would leverage the underlying NFVI APIs and allow scale-out deployment based on actual demand, as well as optimizations such as encryption at every layer of the application to enable global deployment models even in the face of lawful-interception requirements, etc.


Example 2: Microservices, Containers & Docker

My next example is microservices and their close friends, containers. They promise a new generation of application architecture and are drivers of the “3rd platform”. One of this movement’s famous poster children is Docker. Docker is a great (new) way to package and distribute applications in “containers” that contain complete applications or just pieces of a larger application architecture. Newly developed applications usually follow a scale-out design, and some might be written with something like the “12 factor app” manifesto in mind (or the 15 factors according to Pivotal). Coming back to the pictures above: a 12 factor app could be considered “purpose-built” for the “third platform”.

But how many applications have been built this way? There are many great examples of microservices-oriented applications by the “cloud-native” companies such as Google, Amazon, Facebook and the likes. Adrian Cockcroft also gives inspirational talks about these topics around the globe. But I actually expect many applications to stay mainly unchanged, as they are optimized for their current platform. At the same time, some of them might become available as (Docker) containers as part of their next release. But again – if you look into the details, you’ll find the same application in a different wrapper. RAR is now ZIP (for my German readers: “Aus Raider wird nun Twix…”). But will these potential “single-container applications” run well on a Cloud-Native/third platform architecture? They might not! To put it in a picture:

[Image: containerizing legacy applications]

So in this case, it is important to understand these application limitations and expectations towards the platform (what about data persistence, security, platform resilience, networking, …) to make sure the application runs smoothly in production. Coming back to Massimo’s blog post – you can run your old Windows NT4 on a public cloud, but does it make sense?

Summary

Just like the continuous evolution of platforms that expose new characteristics and capabilities, there is also an ongoing evolution of applications. It is important to understand the key aspects of an application’s architecture and its deployment model before making a platform decision. The word “VNF” does not necessarily imply alignment with NFV, and the word “Docker” does not automatically describe a Cloud-Native or microservices-oriented application.


Edits:

18.05.2016: added picture (containerizing legacy applications)

Install vSphere Integrated Containers v0.1 via VMware Photon OS TP2

I just wanted to share a few simple steps on how to set up vSphere Integrated Containers (VIC) v0.1 (released on April 4, 2016) on VMware Photon OS. You can follow most steps easily via copy and paste, but please be aware that you should not run this in your production environment! I worked with a local installation running in VMware Fusion. The vSphere Integrated Containers code on GitHub is in a preview state, so please read the project status directly on GitHub. The environment also needs to provide a few prerequisites (e.g. DHCP). You can find the binaries for the install on Bintray, available as a pre-packaged .tar.gz file (direct download link).

First we need to install Photon OS TP2 (a fresh install from the ISO) and give it 1 CPU and 2GB of memory. I used the “full” install type while going through the installer. You can find the Photon OS TP2 ISO files on Bintray – I used the full ISO (direct download link). Cormac just told me that you can also use the “smaller” Photon OS or the OVA and add git, wget, tar and go to the OS as prerequisites for the VIC install.

Why 2GB RAM? I ran into some issues (like “go build github.com/vmware/govmomi/vim25/types: /usr/lib/golang/pkg/tool/linux_amd64/6g: signal: killed”) while installing govmomi with less than 2GB. Adding more memory fixed it – you should be able to go back to less than 2GB after the install.

After the install, we log in via the DCUI (e.g. via Fusion) and enable remote SSH access. You can either edit the sshd_config file with the editor of your choice or follow the commands below.

# Enable password authentication and root login in /etc/ssh/sshd_config
sed -i 's/#PasswordAuthentication/PasswordAuthentication/g' /etc/ssh/sshd_config
sed -i 's/PermitRootLogin no/PermitRootLogin yes/g' /etc/ssh/sshd_config
systemctl restart sshd

Once you have noted the IP address of your system, you can connect to it via SSH. As a first action, I updated tdnf (the Photon OS package manager) and the Photon OS system.

# Upgrade tdnf and Photon packages
tdnf upgrade tdnf -y
tdnf upgrade -y
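If you went with the “smaller” Photon OS image or the OVA instead of the full install, this would also be the point to add the prerequisites Cormac mentioned – package names may vary slightly between Photon builds:

# Optional: install the prerequisites when using the minimal Photon OS image
# (if "go" is not found, the package may be named "golang")
tdnf install -y git wget tar go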

Then we set up a working directory for VIC; I called it “VCHmaster”.

# Create working directory /home/VCHmaster
mkdir /home/VCHmaster
cd /home/VCHmaster

Next up is the download and extraction of the VIC artifacts from Bintray (placed in /home/VCHmaster):

# Download files from https://bintray.com/vmware/vic/Download
wget https://bintray.com/artifact/download/vmware/vic/vic_0.1.0.tar.gz
tar -xzf vic_0.1.0.tar.gz
chmod +x vic/install.sh

Before installing govmomi, you need to set some environment variables (GOPATH, plus adding its bin directory to your PATH):

# Set Go variables
mkdir /home/VCHmaster/govmw
export GOPATH=/home/VCHmaster/govmw
PATH=$PATH:/home/VCHmaster/govmw/bin

We can then install the vSphere API Go library:

# install Go library for the VMware vSphere API
go get github.com/vmware/govmomi/govc
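To double-check that the govc binary ended up in your PATH and can reach the ESXi host, something like the following should work (GOVC_URL and GOVC_INSECURE are standard govc environment variables; adjust credentials and IP to your environment):

# Verify govc works and can talk to the ESXi host (self-signed certs need GOVC_INSECURE)
export GOVC_URL='root:password@ESXi_IP_ADDRESS'
export GOVC_INSECURE=1
govc about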

At this point, we are all set to install the first Virtual Container Host (VCH). The command is pretty straightforward. All you need is a deployment endpoint (in my case my homelab ESXi host) with a datastore.

# install your first VCH
cd vic
./install.sh -g -t 'user:password@IP_ADDRESS' -i ESXi_datastore VCH-name

When we run the install.sh script, this is the output that we see:

root [ /home/VCHmaster ]# ./install.sh -g -t 'root:PASSWORD@ESXi_IP_ADDRESS' -i ESXi_datastore VCH-name
# Generating certificate/key pair - private key in VCH-name-key.pem
# Logging into the target
# Uploading ISOs
[05-04-16 12:00:38] Uploading... OK
[05-04-16 12:00:41] Uploading... OK
# Creating vSwitch
# Creating Portgroup
# Creating the Virtual Container Host appliance
# Adding network interfaces
# Setting component configuration
# Configuring TLS server
# Powering on the Virtual Container Host
# Setting network identities
# Waiting for IP information
#
# SSH to appliance (default=root:password)
# root@VCH_APPLIANCE_IP_ADDRESS
#
# Log server:
# https://VCH_APPLIANCE_IP_ADDRESS:2378
#
# Connect to docker:
docker -H VCH_APPLIANCE_IP_ADDRESS:2376 --tls --tlscert='VCH-name-cert.pem' --tlskey='VCH-name-key.pem'


DOCKER_OPTS="--tls --tlscert='VCH-name-cert.pem' --tlskey='VCH-name-key.pem'"
DOCKER_HOST=APPLIANCE_IP_ADDRESS:2376

To make the variables persistent across sessions, you’ll have to add them e.g. to your bash profile. I’ll keep it simple (edit /root/.bash_profile):

# Begin ~/.bash_profile
# Written for Beyond Linux From Scratch
# by James Robertson <jameswrobertson@earthlink.net>
# updated by Bruce Dubbs <bdubbs@linuxfromscratch.org>


# Personal environment variables and startup programs.
export GOPATH=/home/VCHmaster/govmw
export PATH=$PATH:/home/VCHmaster/govmw/bin

At this point in time, there is not much more to explore. You can run a few docker commands already, but again – this is a v0.1 with limited functionality.
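For example, with the certificate and key files from the installer in your working directory, a minimal smoke test could look like this – which commands actually succeed depends on the v0.1 feature set:

# Minimal smoke test against the new VCH (v0.1 only implements a subset of the Docker API)
docker -H VCH_APPLIANCE_IP_ADDRESS:2376 --tls --tlscert='VCH-name-cert.pem' --tlskey='VCH-name-key.pem' info
docker -H VCH_APPLIANCE_IP_ADDRESS:2376 --tls --tlscert='VCH-name-cert.pem' --tlskey='VCH-name-key.pem' ps -a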

(Edits: added the Bintray download links for VIC and Photon OS. Thanks Manuel & Cormac for your feedback!)

Cloud-Native Applications – Link Collection

[Image: VMware Cloud-Native Applications]

I started to collect the most comprehensive and important links around (VMware) Cloud-Native Applications on a dedicated page on this blog. I’ll keep it updated over time. If you feel something is missing, just ping me on Twitter and I’ll add the link/material.

You can find the page at:

Cloud-Native Applications