My VMworld 2016 session proposals (NFV, Containers/Docker, …)

VMworld session voting time has come and I just wanted to share my session proposals – your votes would be appreciated:

On the multi-dimensional evolution of platforms and applications

I’d like to touch on a topic that I am seeing in several of my areas of interest right now. It is closely related to the overall topic of „First, Second and Third Platform“, but I’d like to focus more on the individual implications for multiple domains. Over the last few months, I have been involved in several discussions around different platforms and applications as well as their individual evolution and maturity. My personal observation is that the two don’t necessarily evolve in sync. Therefore, it is important not only to identify the phase that you are currently in but also to understand the operational implications of the „generation disconnect“ between app and platform.

 

Evolution of Platforms

As mentioned above, I’d like to relate my observations to the three platform generations below. I’d like to point out that these three generations are subdivided into several different technologies and can look and feel different in specific use cases or fields of application. The common understanding of the generations is:

[Figure: platform evolution]

 

In addition to these phases, I see different „implementations“ of the respective platform generation. Take Client-Server as one example: this can be a physical-server-only model, but it also stretches to server virtualization and potentially even to „VM-oriented“ hosting or cloud services. My friend Massimo also wrote a nice piece on this.

 

Evolution of Applications

One of my key observations is that there is no simple 1:1 connection between applications and platforms. With the rise of second-generation platforms, not all applications from the first platform were dropped, and not all of them were immediately available for the next-generation platform. Instead, applications that are still business-relevant evolve, because it makes sense to optimize them for the next-generation platform. And here comes the important observation: I believe there are (at least) three phases in an application evolution cycle that happens for each platform generation – or potentially even for each concrete implementation of the platform generation. I’ll call these phases „Unchanged“, „Optimized“ and „Purpose-built“ for now:

[Figure: application evolution]

But how does that fit into the overall platform picture? I’ll try to merge the previous two pictures into one. It also shows a potential application evolution path across platform generations. As you can see, there can be a slight overlap between the „purpose-built“ phase of the previous platform generation and the „unchanged“ phase of the next one.

[Figure: combined platform and application evolution]
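The combined picture can also be read as a small state model. The following Python sketch is only an illustration of one possible evolution path; the names `Platform`, `Phase` and `next_step` are made up for this example, and real applications may skip phases or never move at all.

```python
from enum import IntEnum
from typing import Tuple


class Platform(IntEnum):
    FIRST = 1
    SECOND = 2
    THIRD = 3


class Phase(IntEnum):
    UNCHANGED = 1
    OPTIMIZED = 2
    PURPOSE_BUILT = 3


def next_step(platform: Platform, phase: Phase) -> Tuple[Platform, Phase]:
    """Return the next stage on one possible (hypothetical) evolution path."""
    if phase < Phase.PURPOSE_BUILT:
        # Within a platform generation: unchanged -> optimized -> purpose-built.
        return platform, Phase(phase + 1)
    if platform < Platform.THIRD:
        # A purpose-built app of one generation starts out "unchanged" on the next.
        return Platform(platform + 1), Phase.UNCHANGED
    # Already purpose-built for the third platform: nothing further in this model.
    return platform, phase


step = (Platform.FIRST, Phase.UNCHANGED)
for _ in range(4):
    print(step[0].name, step[1].name)
    step = next_step(*step)
```

In this model, an app that is purpose-built for the first platform lands as „unchanged“ on the second platform, which is the overlap mentioned above.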

But let’s move on to two concrete examples that I consider applicable.

 

Example 1: Network Functions Virtualization

I’ll start with Network Functions Virtualization (NFV). NFV is a Telco industry movement that is supposed to provide hardware independence, new ways of agility, reduced time to market for carrier applications & services, cost reduction and much more – basically, it’s about the delivery of promises of Cloud Computing (and Third Platform) for the Telco industry (read more about it here). The famous architectural overview as described by ETSI can be seen below:

[Figure: ETSI NFV architectural overview]

NFV differentiates between several functional components such as the actual platform (NFVI = Network Functions Virtualization Infrastructure), the application (VNF = Virtual Network Function), the application manager (VNF manager) and e.g. the orchestration engine.

So how could this look in reality? Let’s assume the VNF manager detects a certain usage pattern of its VNF, and that VNF is reaching its potential maximum scale for the currently deployed number of VNF instances. The VNF manager then talks to the Orchestrator, which could then trigger, for example, a scale-out of the VNF by deploying additional worker instances on the underlying infrastructure/platform resources. The new worker instances could then automatically be included in the load distribution and be integrated instantly into the necessary backend services where applicable. All of that happens via open APIs and standardized interfaces – it looks and feels a lot like a typical implementation example for a „third platform“ including the „purpose-built“ app.
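To make the flow a bit more tangible, here is a minimal Python sketch of such a scale-out decision. It is not an ETSI-defined interface: the classes `VnfManager` and `Orchestrator`, the method `request_scale_out`, the capacity of 1000 sessions per instance and the 80% threshold are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Orchestrator:
    """Hypothetical orchestrator that places worker instances on the NFVI."""
    deployed: dict = field(default_factory=dict)  # VNF name -> instance count

    def request_scale_out(self, vnf_name: str, extra_instances: int) -> int:
        # A real orchestrator would call infrastructure APIs here; this sketch
        # only tracks the resulting instance count.
        self.deployed[vnf_name] = self.deployed.get(vnf_name, 0) + extra_instances
        return self.deployed[vnf_name]


@dataclass
class VnfManager:
    """Hypothetical VNF manager watching the load of a single VNF."""
    vnf_name: str
    orchestrator: Orchestrator
    capacity_per_instance: int = 1000   # assumed sessions one worker can handle
    scale_out_threshold: float = 0.8    # assumed utilization that triggers scale-out

    def check_load(self, active_sessions: int) -> int:
        """Return the instance count after evaluating the current load."""
        instances = self.orchestrator.deployed.get(self.vnf_name, 0)
        max_sessions = instances * self.capacity_per_instance
        if instances == 0 or active_sessions > max_sessions * self.scale_out_threshold:
            # Ask the orchestrator for one additional worker instance.
            instances = self.orchestrator.request_scale_out(self.vnf_name, 1)
        return instances


orchestrator = Orchestrator(deployed={"example-vnf": 2})
manager = VnfManager("example-vnf", orchestrator)
print(manager.check_load(active_sessions=1200))  # below the threshold: no change
print(manager.check_load(active_sessions=1700))  # above the threshold: one more worker
```

The point of the sketch is the division of labour: the VNF manager only knows about its application’s load, while the orchestrator owns the deployment on the platform.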

Now on to a quick reality check. ETSI’s initial NFV whitepaper is from October 2012. It basically describes the destination that the industry is aiming for. And while there might be some examples where VNFs, NFVI and Orchestration are already working hand in hand, there is still a lot of work to do. Some of the „NFV applications“ (or VNFs) might have just been „P2V“’ed (1:1 physical-to-virtual conversion) onto a virtualization platform: they basically have the same configuration and the same identity, and are kept as close to their physical origins as possible. This allows a VNF provider/vendor to keep existing support procedures and organizations while offering its customers an „NFV 1.0 product“ that provides some early benefits of NFV (hardware independence, faster time to market, …). But it also implies that you transfer some of the configurations that made perfect sense in the physical world over to the virtual world, where they make only questionable sense. In this case, I’d actually talk about a move from a „purpose-built“ app on the first platform to an „unchanged“ app on the second platform.

One example: a physical server in a telco application had 30 × 300GB hard disks, 2 × 4-core CPUs and 128GB RAM. It never used more than 1TB of storage, and average utilization stayed below 4 CPUs and 32GB RAM. The „unchanged“ version of this app would be a 1:1 conversion with all the (unnecessary) resource overhead provided in a virtual machine. The „optimized“ version of this app is a right-sized application (so only 1TB storage, 4 CPUs and 32GB RAM) that also leverages easy configuration files for installation as well as crash-consistent and persistent data management to allow backup & restore as a VM. But a „purpose-built“ version of that app would leverage the underlying NFVI APIs and would allow scale-out deployment options based on actual demand, as well as optimizations such as encryption at every layer of the application to enable global deployment models even where lawful interception is relevant.
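The resource overhead of the „unchanged“ conversion can be worked out directly from these numbers. The short Python sketch below does the arithmetic; it assumes that 1TB is counted as 1000GB and that „4 CPUs“ refers to 4 of the 8 physical cores, and the helper `unused_share` is a made-up name.

```python
# Figures from the example: 30 x 300GB disks, 2 x 4-core CPUs, 128GB RAM.
physical = {"storage_gb": 30 * 300, "cpu_cores": 2 * 4, "ram_gb": 128}
# Observed upper bounds of actual usage (assumption: 1TB counted as 1000GB).
observed = {"storage_gb": 1000, "cpu_cores": 4, "ram_gb": 32}


def unused_share(provisioned: float, used: float) -> float:
    """Fraction of a provisioned resource that the application does not use."""
    return (provisioned - used) / provisioned


for resource, provisioned in physical.items():
    share = unused_share(provisioned, observed[resource])
    print(f"{resource}: {provisioned} provisioned, {observed[resource]} used, "
          f"{share:.0%} overhead in an unchanged 1:1 conversion")
```

Under these assumptions, roughly 89% of the storage, 50% of the CPU cores and 75% of the RAM would be carried over into the virtual machine without ever being used.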

 

Example 2: Microservices, Containers & Docker

My next example is microservices and their close friends, containers. They promise a new generation of application architecture and are drivers for the „3rd platform“ architecture. One of this movement’s famous poster children is Docker. Docker is a great (new) way to package and distribute applications in „containers“ that hold whole applications or just pieces of a larger application architecture. Newly developed applications usually follow a scale-out design, and some might be written with something like the „12 factor app“ manifesto in mind (or the 15 factors according to Pivotal). Coming back to the pictures above: a 12 factor app could be considered „purpose-built“ for the „third platform“.

But how many applications have been built for this? There are many great examples of microservices-oriented applications from „cloud-native“ companies such as Google, Amazon, Facebook and the like. Adrian Cockcroft also gives inspirational talks about these topics around the globe. But I actually expect many applications to stay mainly unchanged, as they are optimized for their current platform. At the same time, some of them might become available as (Docker) containers as part of their next release. But again – if you look into the details, you’ll find the same application in a different wrapper. RAR is now ZIP (for my German readers: „Aus Raider wird nun Twix…“, roughly „Raider is now called Twix…“). But will these potentially „single-container-applications“ run well on a Cloud-Native/third platform architecture? They might not! To put it in a picture:

So in this case, it is actually important to understand these application limitations and expectations towards the platform (what about data persistence, security, platform resilience, networking, …) to make sure it runs smoothly in production. Coming back to Massimo’s blogpost – you can run your old Windows NT4 on a Public Cloud, but does it make sense?

Summary

Just like the continuous evolution of platforms that expose new characteristics and capabilities, there is also an ongoing evolution of applications. It is important to understand the key aspects of the application architecture and its deployment model before making a platform decision. The word „VNF“ does not necessarily imply alignment with NFV, and the word „Docker“ does not automatically describe a Cloud-Native or microservices-oriented application.

 

Edits:

18.05.2016: added picture (containerizing legacy applications)

Performance tuning of Telco and NFV workloads

Today, VMware released a new technical whitepaper for performance tuning of Telco and NFV workloads in vSphere. You can download the paper here: https://www.vmware.com/resources/techresources/10479

 

From the document description:

The vSphere ESXi hypervisor provides a high-performance and competitive platform that effectively runs many Tier 1 application workloads in virtual machines. By default, ESXi has been heavily tuned for driving high I/O throughput efficiently by utilizing fewer CPU cycles and conserving power, as required by a wide range of workloads.

However, Telco and NFV application workloads are different from the typical Tier I enterprise application workloads, in that they tend to be any combination of latency sensitive, jitter sensitive, or demanding high packet rate throughputs or aggregate bandwidth, and therefore need to be tuned for best performance on vSphere ESXi.

This white paper summarizes the findings and recommends best practices to tune the different layers of an application’s environment for Telco and NFV workloads.

MWC15: VMware vCloud for NFV with Integrated OpenStack

During Mobile World Congress 2015, VMware announced VMware vCloud for NFV with Integrated OpenStack (Link1 / Link2) – a new offering for Telcos to support their journey and success with Network Functions Virtualization (NFV). Core details from the Press Release:

 

  • VMware vCloud for NFV Helps CSPs Achieve Sustainable Cost Reductions, Improve Time To Market
  • VMware Offers CSPs a Fast, Simple Path to OpenStack Adoption
  • Multi-Vendor vCloud NFV Platform Supports 40+ Virtual Network Functions from 30+ Vendors

 

The offering is tailored to the needs of Telcos that run and manage a scalable, horizontal NFV Infrastructure (NFVI). It will consist of VMware’s proven Software-Defined Datacenter components (vSphere, vRealize Operations, Virtual SAN, NSX, vCloud Director and the vCloud API) and will also add VMware Integrated OpenStack (VIO).

[Figure: vCloud NFV]

To find out more about VMware’s announcements during Mobile World Congress, check out:

To find out more about NFV with VMware, check out the microsite http://www.vmware.com/go/NFV

The VMware vCloud for NFV is the only production proven, multi-vendor NFV cloud platform and supports over 40 Virtual Network Functions (VNFs) from over 30 ecosystem partners.

With the vCloud for NFV Platform, communication service providers (CSPs) can leverage VMware’s industry-leading cloud infrastructure for faster time to market of new and differentiated services while driving sustainable cost reductions through a cloud operations model.

The VMware vCloud for NFV features VMware vSphere®, the industry-defining compute virtualization solution for the cloud; VMware NSX™, the only network virtualization platform that delivers the entire networking and security model from L2-L7 in software; VMware Virtual SAN, software-defined storage that reduces storage CapEx and OpEx; and VMware vCloud Director, a management tool for Telco cloud architectures.

By moving Network Functions Virtualization into production today, VMware customers are accelerating their transformation into next-generation cloud providers, building the operational expertise needed to succeed in the cloud era ahead of their competition.

This transformation is possible as a result of the deep multi-tenancy/multi-vendor capabilities of the VMware platform combined with highly developed operations support services that deliver FCAPS for the cloud and the open application programming interfaces (APIs) for integration northbound to applications and service orchestration platforms.

Software-Defined Telco – NFV in Production with VMware

Last night, my blog post „Software-Defined Telco – NFV in Production with VMware“ went live on the VMware Office of the CTO blog. Following my VMworld 2014 presentation about the operational considerations in Telco Cloud / Network Functions Virtualization environments and other recent blog posts on this site, this article provides a comprehensive overview of the architectural and technical aspects of NFV with VMware:

It is a very exciting time for the Telco industry right now! In this blog post, I will share some updates and observations on VMware’s current involvement in NFV. Telco providers around the globe are working with VMware on both proof-of-concepts as well as production deployments…

Operational Considerations for Network Functions Virtualization – Part 1

Just a few weeks back, I had the pleasure of presenting at VMworld 2014 in San Francisco. My session „OPT2029 – Considerations for Operational Efficiency in Telco Cloud Deployments“ covered various aspects of Network Functions Virtualization Infrastructure and its impact on existing operational models.

Why is NFV important after all? Well, let’s take a look at ETSI’s summary of the key benefits for Network Operators and their customers:

  • Reduced operator CAPEX and OPEX through reduced equipment costs and reduced power consumption
  • Reduced time-to-market to deploy new network services
  • Improved return on investment from new services
  • Greater flexibility to scale up, scale down or evolve services
  • Openness to the virtual appliance market and pure software entrants
  • Opportunities to trial and deploy new innovative services at lower risk

Now, I’d like to start with a quick overview picture around Network Functions Virtualization that I created from ETSI’s NFV overview documents:

[Figure: ETSI NFV overview]

As you can see, NFV is split up into various parts:

  • NFVI or Network Functions Virtualization Infrastructure
  • VNF or Virtual Network Function(s)
  • NFV M&O or Management and Orchestration
  • Service, VNF and Infrastructure Description
  • OSS / BSS

In this post, I’d like to focus on considerations around introducing NFVI:

[Figure: ETSI NFVI]

One very important thing is that you will only see most of the NFVI benefits come to life if you concentrate on as few NFV Infrastructures as possible. Each additional NFVI means not only fragmentation of resources but also operational complexity, as each „silo“ will have specifics that need to be operated separately. Even though the underlying resources will most likely be consumed via API, the actual infrastructure still requires operational procedures. So for the following parts, I will focus on a shared NFVI environment, not fragmented NFVIs:

[Figure: Fragmented NFVI]
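The resource fragmentation argument can be illustrated with a few lines of Python. All numbers and names below (three silos with 100 vCPUs each, the helpers `fits_fragmented` and `fits_shared`) are hypothetical, and the sketch deliberately ignores host boundaries, affinity rules and other real-world placement constraints.

```python
# Hypothetical numbers: three separate NFVI silos, each with its own capacity.
silos = {
    "silo-a": {"capacity_vcpus": 100, "used_vcpus": 60},
    "silo-b": {"capacity_vcpus": 100, "used_vcpus": 95},
    "silo-c": {"capacity_vcpus": 100, "used_vcpus": 30},
}


def free_vcpus(silo: dict) -> int:
    """Capacity of a single silo that is not yet in use."""
    return silo["capacity_vcpus"] - silo["used_vcpus"]


def fits_fragmented(request_vcpus: int) -> bool:
    """With fragmented NFVIs, a request must fit entirely into one silo."""
    return any(free_vcpus(s) >= request_vcpus for s in silos.values())


def fits_shared(request_vcpus: int) -> bool:
    """In a shared NFVI, all free capacity forms one pool."""
    return sum(free_vcpus(s) for s in silos.values()) >= request_vcpus


request = 80  # vCPUs needed by a new VNF deployment (made-up value)
print(fits_fragmented(request), fits_shared(request))
```

With these made-up numbers, no single silo has 80 free vCPUs, although the three silos together have 115 free, so the request only fits once the capacity is shared.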

As with most IT-related infrastructures, terminology and methodology from ITIL come in very handy to describe and differentiate the necessary processes for Service Strategy, Service Design, Service Transition and of course Service Operation. My experience with Service Operation in a virtual environment is very similar to VMware’s Cloud Operations methodology and publications around Operations Transformation. Some of the ITIL functions need to be much more closely aligned than in traditional operations, e.g. Demand, Capacity, Performance, Incident, Problem and Configuration Management.

Also, in a completely shared environment, there will be a logical separation between „Tenants“ and „Provider“, or in our case VNF and NFVI. But this implies a new central function for taking care of the NFVI holistically: an NFVI Center of Excellence (COE). This NFVI COE will be covered in the next part of this series.

Network Functions Virtualization

Virtualization and Cloud Computing (IaaS) have been around for quite some time now. Many industries have introduced a „Virtualization-first“ or even a „Cloud-first“ policy for new applications in their datacenters. IT departments and their customers have seen significant benefits over the past five or even ten years.

At the same time, there are areas where hardware-centric deployments are still dominant. But even these industries are seeing major changes. One great example is Telco providers worldwide.

As a result, the European Telecommunications Standards Institute (ETSI) has established an Industry Specification Group which is focussing on a very interesting topic called Network Functions Virtualization (NFV).

Here is what ETSI is saying about NFV:

Telecoms networks contain an increasing variety of proprietary hardware appliances. To launch a new network service often requires yet another appliance and finding the space and power to accommodate these boxes is becoming increasingly difficult, in addition to the complexity of integrating and deploying these appliances in a network. Moreover, hardware-based appliances rapidly reach end of life: hardware lifecycles are becoming shorter as innovation accelerates, reducing the return on investment of deploying new services and constraining innovation in an increasingly network-centric world.

Network Functions Virtualisation (NFV) aims to address these problems by evolving standard IT virtualization technology to consolidate many network equipment types onto industry standard high volume servers, switches and storage. It involves implementing network functions in software that can run on a range of industry standard server hardware, and that can be moved to, or instantiated in, various locations in the network as required, without the need to install new equipment.

So it’s all about time to market, agility and cost savings through standardization and reduction of operational complexity. It’s about bringing the benefits of virtualization and Infrastructure-/Platform-as-a-Service to Telco environments. ETSI has also published several NFV Use Cases that can be found in GS NFV 001.

For now, I’d like to share a few links and resources. I’ll post more about this topic in the near future.

– VMware CEO Pat Gelsinger at Mobile World Congress 2014 (Video).

– Blogpost by Ben Fathi (VMware CTO, follow him on Twitter) about NFV – Transforming the Operational Model of the Network.

– VMware’s Principal Engineer Bruce Davie (follow him on Twitter) on NFV and Network Virtualization in 2014.

– VMware’s Solution Exchange that has a special area focussed on Network Functions Virtualization.

– There is a great whitepaper by Lightreading available.

– Blog: „VMware guiding telecom industry on journey towards network function virtualization and software-defined networking“.