Every couple of months I receive a request to explain why performance counters inside a virtual machine cannot be trusted. While it would be unfairly cynical to say that in-guest counters are never right, accurate capacity management and troubleshooting should rely on the counters provided by vSphere, in either vCenter or esxtop. The explanation is too short to merit a white paper, but I hope this blog article will serve as the authoritative comment on the subject.
Usually this issue arises with a new VMware customer or with an established customer that has added new staff to the virtualization team. In both cases the administrators are familiar with their existing tools and need a good reason to retool their thinking and environment around a new measurement system.
I was discussing the response to these concerns with my friend and colleague Kaushik Banerjee, the head of VMware’s outbound engineering group. Kaushik and I spend a lot of time thinking about communicating technical details to our customers and in this case we chose different approaches to answer the question. Both responses are complementary, so choose the one that suits your needs.
Kaushik’s Approach: The Killer Examples
Kaushik suggested that if we show cases where the guest OS’s counters are obviously wrong, a naturally suspicious VI admin will never trust the guest counters again. To that end, I offer the following screen shots to make our point.
This screen shot shows two counters available in Perfmon inside a Windows guest with the vmStatsProvider installed (available by default since vSphere). The darker, red line is the CPU utilization as reported by the guest. The lighter, greenish line is the CPU utilization of the virtual machine from the host’s perspective. This is the real CPU utilization, surfaced inside the guest by vmStatsProvider. Notice how the host is always reporting higher utilization than the guest. This is due to one of the reasons why guest counters cannot be trusted: they are unaware of hypervisor overheads.
This second screen shot shows a different case where the host utilization is lower than that reported by the guest. Again, the dark red line represents the guest OS’s report of CPU utilization and the lighter line shows the real CPU utilization as reported by ESX.
The reason the host shows lower utilization than the guest is that the guest is unaware it is only getting a fraction of the host’s CPU, time-sliced by ESX’s scheduler. In this case the virtual machine was contending for CPU with other active virtual machines, but the same principle would apply had a CPU limit been set.
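This effect is easy to illustrate with a little arithmetic. The toy model below is my own sketch, not VMware code: it assumes a CPU-starved guest sees itself as busy during every slice it is granted, while the host accounts for the physical CPU actually consumed.

```python
# Toy model (not VMware code): why a time-sliced guest over-reports
# CPU utilization relative to the host's view of the same VM.

def guest_vs_host_utilization(cycles_wanted, share_granted):
    """cycles_wanted: fraction of a physical CPU the workload needs.
    share_granted: fraction of the physical CPU ESX actually grants."""
    # The host reports how much of the physical CPU the VM really consumed.
    host_view = min(cycles_wanted, share_granted)
    # The guest measures busy time against the time it believes it owns the
    # CPU; when starved, it is busy for every slice it receives.
    guest_view = min(cycles_wanted / share_granted, 1.0)
    return guest_view, host_view

# A workload that needs 80% of a core on a host granting only a 50% slice:
guest, host = guest_vs_host_utilization(0.80, 0.50)
print(f"guest reports {guest:.0%}, host reports {host:.0%}")
# guest reports 100%, host reports 50%
```

The same arithmetic applies whether the slice is shrunk by contention or by an explicit CPU limit.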
Scott’s Approach: A Detailed Explanation
My approach to convincing VI admins to avoid guest tools is based on the bottomless thirst for information that is common among technophiles. If I can explain the underlying system of resource scheduling, then our admins will be able to understand the guest counter issue and perhaps solve other problems with their newfound knowledge.
There are four reasons why guest counters cannot be trusted:
- The guest is unaware of virtualization overheads. As the first screen shot above showed, the hypervisor increases the CPU load as it virtualizes the hardware for the guest operating system. That additional CPU work is not seen by guest tools.
- The guest is unaware that it is only seeing the portion of CPU time that ESX’s scheduler allows it to see. Because of contention or resource restrictions, virtual machines only get a slice of the CPU’s time. When a guest thinks it is getting 100% of the CPU, it may not know that the processor is being shared with eight other virtual machines. See the second screen shot above.
- Time skew in virtual machines can change the sample window for time-based counters. This means the guest may have measured 10 milliseconds of elapsed time during an operation when 12 milliseconds actually passed. This is more common on older versions of ESX and when the host CPU is saturated. More on this below.
- The virtual machine is unaware that it is being de-scheduled when idle, which makes it appear to be working a greater share of the time than it is. Consider a virtual machine that is idle 90% of the time. If ESX does not schedule the VM during its idle time, the guest’s counters only advance while it is running, so the guest will think its processor queues are full 100% of the time it is executing.
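The fourth point can be made concrete with another toy model. This is a sketch under a simplifying assumption (the guest’s counters advance only while the VM is scheduled, and ESX schedules the VM only when it has work), not VMware’s actual accounting:

```python
# Toy model: a guest whose counters only advance while the VM is
# scheduled over-reports how busy it is.

def busy_as_seen(wall_time, busy_time):
    """wall_time: real elapsed seconds; busy_time: seconds of actual work.
    Assumes ESX runs the VM only while it has work, so the guest's notion
    of 'time on the CPU' advances only during busy periods."""
    observed_time = busy_time              # guest samples only while running
    guest_view = busy_time / observed_time  # always 100% in this model
    actual = busy_time / wall_time          # what the host reports
    return guest_view, actual

# A VM idle 90% of the time: 10 busy seconds out of 100 wall-clock seconds.
guest, actual = busy_as_seen(wall_time=100.0, busy_time=10.0)
print(f"guest sees {guest:.0%} busy; actual is {actual:.0%}")
# guest sees 100% busy; actual is 10%
```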
The time drift explanation (item three) was historically the most problematic for VMware. On older versions of our products time drift was common. As ESX has matured we have reduced the amount of drift, which has improved the accuracy of guest counters. But the timer hardware is still virtualized in software running on the host CPU. This means that if the host processor is fully utilized, the timer may not be scheduled on time, delaying some ticks and skewing guest time.
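To see why a skewed sample window matters, consider a rate counter built from the numbers used above: the guest believes 10 ms elapsed when 12 ms really did. The sketch below is illustrative arithmetic, not any product's code:

```python
# Illustration: a skewed sample window inflates any time-based counter.
# Numbers match the text: the guest measures 10 ms when 12 ms really elapsed.

def rate_error(events, guest_window_s, real_window_s):
    """Compute a rate from the guest's (skewed) clock and the real clock."""
    guest_rate = events / guest_window_s   # what in-guest tools would report
    real_rate = events / real_window_s     # what actually happened
    return guest_rate, real_rate

# 100 I/O operations measured over a window the guest thinks is 10 ms:
guest_iops, real_iops = rate_error(events=100,
                                   guest_window_s=0.010,
                                   real_window_s=0.012)
print(f"guest: {guest_iops:.0f} ops/s, real: {real_iops:.0f} ops/s")
# guest: 10000 ops/s, real: 8333 ops/s -- a 20% overstatement
```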
Performance Best Practices and Benchmarking Guidelines. This white paper is the last version we published that included benchmarking best practices; it contains some discussion of the need to measure performance from outside the host under test.
Timekeeping in Virtual Machines. This document (not updated since VI3, but still accurate in its theory) gives the background on VMware-based timekeeping and explains how skew can occur.
VMware vSphere™ 4: The CPU Scheduler in VMware® ESX™ 4. This white paper provides great detail on how the scheduler works and fully explains the notion of time slicing.