Scott Drummonds on Virtualization

vSphere 4.1: Performance Improvements


Last week I took my first vacation in a year and a half.  I had not missed a single day of work in 18 months.  So last week, while I was gallivanting through Spain and running terrified, screaming, and covered in sangria through the streets of Pamplona, VMware made its biggest announcement in over a year: the launch of vSphere 4.1.  My old team put out what looks to be a wonderful “What’s New in Performance” paper, so I want to take a few minutes to add my thoughts on some of the great work VMware has done.

Calling attention to a subset of the performance features in this launch, I will augment the published documentation with my own comments.

Wide VM NUMA Support

A “wide VM” is defined by VMware as a virtual machine whose memory is too large for a single NUMA node.  In this case, some of the memory must be placed on a remote node, which has a relatively higher memory latency.  ESX 4.0 would place as much memory as possible on a single node, then arbitrarily spill the rest over to other nodes.  ESX 4.1 now recognizes memory locality of reference and places frequently accessed memory on the local node, potentially eliminating remote memory access penalties.  Expect big gains with wide virtual machines running Java or very active databases.
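To make the difference concrete, here is a toy model of the two placement policies. It is only a sketch under my own assumptions (a fixed per-page access count and a single local node with room for a fixed number of pages), not a description of how the ESX scheduler is actually implemented.

```python
def remote_access_fraction(access_counts, local_pages, locality_aware):
    """Fraction of all memory accesses that land on a remote node."""
    order = list(range(len(access_counts)))
    if locality_aware:
        # Hottest pages first, so they claim the local node.
        order.sort(key=lambda page: access_counts[page], reverse=True)
    # Without locality awareness, pages simply fill the local node in order.
    local = set(order[:local_pages])
    remote_hits = sum(c for page, c in enumerate(access_counts) if page not in local)
    return remote_hits / sum(access_counts)


# Eight pages, room for four locally; the hot pages happen to sit at the end.
counts = [1, 1, 1, 1, 10, 20, 30, 40]
arbitrary = remote_access_fraction(counts, 4, locality_aware=False)
aware = remote_access_fraction(counts, 4, locality_aware=True)
print(f"arbitrary spill: {arbitrary:.1%} remote, locality aware: {aware:.1%} remote")
```

In this made-up workload almost every access is remote when the spill is arbitrary, and almost none are when the hot pages are kept local. Real workloads will sit somewhere in between.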

Memory Compression

When I wrote on Steve Herrod’s preview of memory compression at PEX, I am sure you knew this feature’s release was imminent.  VMware’s documentation is sufficient on the gains provided by this feature, so I will not repeat those gains here.  The key thing to remember about memory compression is that it greatly reduces the need for swap.  For years VMware administrators have feared the spectre of memory swapping and have left memory woefully underutilized, even in consolidated environments.  With memory compression in place, you should more confidently push active memory closer to 100%.
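The idea of compressing a page instead of swapping it can be sketched in a few lines. This is an illustration only: the 4 KB page size, the "must shrink to half or less" cutoff, and the use of zlib are my assumptions for the example, not a statement of what ESX does internally.

```python
import random
import zlib

PAGE_SIZE = 4096
# Assumed cutoff: keep a page compressed only if it shrinks to half or less.
COMPRESS_THRESHOLD = PAGE_SIZE // 2


def reclaim_action(page: bytes) -> str:
    """Decide whether a page headed for reclamation is compressed or swapped."""
    compressed = zlib.compress(page)
    return "compress" if len(compressed) <= COMPRESS_THRESHOLD else "swap"


zero_page = bytes(PAGE_SIZE)  # highly compressible
rng = random.Random(0)
noisy_page = bytes(rng.getrandbits(8) for _ in range(PAGE_SIZE))  # incompressible

print(reclaim_action(zero_page), reclaim_action(noisy_page))
```

A page that compresses well stays in memory in a smaller form, and getting it back costs a decompression rather than a disk read. Only the pages that refuse to shrink still go to swap.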

Storage IO Control (SIOC)

Before the new VMware documentation, you had my article on SIOC and a delightful video previewing the feature to whet your appetites.  Now that the feature is out, I want to repeat the moral of the story: SIOC will save your high priority applications in the event of storage contention.  But if your storage performance stinks before SIOC, it will continue to stink with SIOC.  SIOC just buys time for your mission critical applications so you can correct that storage problem.
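A little arithmetic shows why prioritization is not a cure for slow storage. The sketch below divides whatever throughput the array can deliver in proportion to shares. The VM names, share values, and IOPS figures are hypothetical, and a simple proportional split is my simplification rather than the actual SIOC mechanism.

```python
def split_by_shares(shares, total_iops):
    """Divide the array's available IOPS among VMs in proportion to shares."""
    total_shares = sum(shares.values())
    return {vm: total_iops * s / total_shares for vm, s in shares.items()}


# Hypothetical VMs and share values.
shares = {"database": 2000, "batch": 500, "test": 500}

healthy = split_by_shares(shares, 9000)  # a healthy array under contention
slow = split_by_shares(shares, 900)      # an undersized array under contention

print(healthy)
print(slow)
```

The database gets two thirds of the throughput in both cases. On the slow array, two thirds of very little is still very little, which is exactly the point: shares reorder the queue, they do not add spindles.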

Faster vMotion

(I’ll take a break here and point out the change of spelling from VMotion to vMotion.  This innocuous change will surely be missed by the large numbers of people who misspell VMWare [sic].  In truth, the case of vMotion is not particularly critical, but those of you who are grammatical pedants like me should take note.)

Many customers, already happy with vMotion, will scratch their heads as to what is left to be improved in this feature.  But a large number of you have tried evacuating 100 virtual machines from a host.  With only two virtual machines migrating at a time, this evacuation would have taken tens of minutes.  VMware was not limiting the vMotion concurrency for no good reason; they wanted to guarantee 100% correctness.  Careful evaluation, experimentation, and critical code improvements allowed the vMotion engineering team to greatly improve the efficiency of a migration in vSphere 4.1.  The result is that virtual machines use the vMotion network more efficiently, which means VMware can qualify and support more virtual machines being concurrently migrated.
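The evacuation math is easy to sketch. The 30 seconds per migration and the concurrency of eight below are assumptions I picked for illustration, and the model optimistically assumes a migration takes no longer when more of them share the network.

```python
import math


def evacuation_minutes(vm_count, concurrent, seconds_per_migration):
    """Time to empty a host when migrations run in fixed-size batches."""
    batches = math.ceil(vm_count / concurrent)
    return batches * seconds_per_migration / 60


# Assumed: 30 seconds per migration regardless of concurrency.
two_at_a_time = evacuation_minutes(100, 2, 30)
eight_at_a_time = evacuation_minutes(100, 8, 30)

print(two_at_a_time, eight_at_a_time)
```

Under these assumptions, fifty batches at two per batch take 25 minutes, while thirteen batches at eight per batch take six and a half.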

Part of this efficiency change included a decrease in the virtual machine switchover time, during which the application is unresponsive.  In every production environment I have seen, this switchover time was quite small, resulting in no application downtime.  But as processor performance and memory access time improved, and with vMotion efficiency remaining flat, eventually pages would be touched faster than vMotion could migrate them. This would result in vMotion failures.
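The race between dirtying pages and copying them can be modeled with the textbook iterative pre-copy scheme. To be clear, this is a generic model of live migration, not VMware's implementation, and the memory size, copy rate, dirty rate, and 64 MB switchover threshold are all numbers I made up for the example.

```python
def precopy(memory_mb, copy_mb_s, dirty_mb_s, switchover_mb=64, max_rounds=30):
    """Return (rounds, final_mb) if pre-copy converges, else None."""
    remaining = memory_mb
    for rounds in range(1, max_rounds + 1):
        seconds = remaining / copy_mb_s    # time to copy what is outstanding
        remaining = dirty_mb_s * seconds   # memory dirtied while that copy ran
        if remaining <= switchover_mb:
            # Small enough to pause the VM and send the rest in one go.
            return rounds, remaining
    return None


converges = precopy(8192, copy_mb_s=1000, dirty_mb_s=200)
fails = precopy(8192, copy_mb_s=1000, dirty_mb_s=1200)

print(converges, fails)
```

When the copy rate comfortably exceeds the dirty rate, the outstanding memory shrinks geometrically and the final pause only has to move a few megabytes. When pages are dirtied faster than they can be copied, the outstanding memory grows every round and the migration never reaches the switchover threshold.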

The new vMotion efficiency improvements have dropped application switchover times to minuscule levels, guaranteeing zero application downtime for many years to come.

Network IO Control

Missing from the recently published performance document is an overview on Network IO Control (NetIOC).  In truth, I may be responsible for its lack of inclusion.  Apologies.  But luckily performance engineering released a best practices document on this wonderful new feature.

NetIOC is the network version of SIOC and may be even more important than SIOC in 10 Gb network environments and infrastructure using converged network adapters.  Let us be honest: the best practice of giving dedicated network hardware to each vSphere network traffic stream is so 2007.  It’s time to consolidate networks and put everything on fewer 10 Gb adapters.  But this is going to create occasional network contention that would benefit from the same resource prioritization that CPU and memory shares have provided for years.

NetIOC will help prioritize your network streams in such an environment. Converge your networks and investigate NetIOC.
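Here is a rough sketch of what share-based prioritization on a converged 10 Gb link could look like. The traffic type names, share values, and demands are hypothetical, and the work-conserving split below (unused bandwidth is handed to traffic that still wants more) is my own simplified model, not VMware's scheduler.

```python
def netioc_split(shares, demand_gbps, link_gbps=10.0):
    """Share-weighted split; bandwidth a stream does not need goes to the others."""
    alloc = {name: 0.0 for name in shares}
    active = {name for name in shares if demand_gbps.get(name, 0) > 0}
    spare = link_gbps
    while active and spare > 1e-9:
        total = sum(shares[name] for name in active)
        satisfied = set()
        used = 0.0
        for name in active:
            grant = spare * shares[name] / total
            take = min(grant, demand_gbps[name] - alloc[name])
            alloc[name] += take
            used += take
            if alloc[name] >= demand_gbps[name] - 1e-9:
                satisfied.add(name)
        spare -= used
        if not satisfied:
            break  # everyone left is capped by shares, so the link is full
        active -= satisfied
    return alloc


# Hypothetical traffic types and share values.
shares = {"vm": 50, "vmotion": 25, "nfs": 25}
contended = netioc_split(shares, {"vm": 8, "vmotion": 8, "nfs": 1})
quiet = netioc_split(shares, {"vm": 2, "vmotion": 3, "nfs": 1})

print(contended)
print(quiet)
```

When the link is oversubscribed, virtual machine traffic ends up with twice the bandwidth of vMotion, and the bandwidth NFS does not need is redistributed rather than wasted. When there is no contention, every stream simply gets what it asks for.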

4 Responses

Great summary of the new features Scott. I’m also happy to hear you executed a well-deserved vacation.


  • Great post, thanks Scott!


  • Great Summary as always Scott, loving the new SOIC and NOIC features, finally complete control of all resources.

  • […] vSphere 4.1: Performance Improvements « Pivot Point […]