<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Pivot Point &#187; scheduler</title>
	<atom:link href="http://vpivot.com/tag/scheduler/feed/" rel="self" type="application/rss+xml" />
	<link>http://vpivot.com</link>
	<description>Scott Drummonds on Virtualization</description>
	<lastBuildDate>Wed, 08 Sep 2010 08:37:56 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>How Many Virtual CPUs Per VM?</title>
		<link>http://vpivot.com/2010/04/30/how-many-virtual-cpus-per-vm/</link>
		<comments>http://vpivot.com/2010/04/30/how-many-virtual-cpus-per-vm/#comments</comments>
		<pubDate>Fri, 30 Apr 2010 04:22:42 +0000</pubDate>
		<dc:creator>drummonds</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[cpu]]></category>
		<category><![CDATA[esxtop]]></category>
		<category><![CDATA[scheduler]]></category>
		<category><![CDATA[vcenter]]></category>

		<guid isPermaLink="false">http://vpivot.com/?p=403</guid>
		<description><![CDATA[Virtual machine sizing is a tricky issue for many VMware administrators.  It is important to find the right number of virtual CPUs to maximize application performance and minimize wasted CPU cycles.  The optimal number of vCPUs can never be easily identified.  But I can offer a few suggestions to help get this [...]]]></description>
			<content:encoded><![CDATA[<p>Virtual machine sizing is a tricky issue for many VMware administrators.  It is important to find the right number of virtual CPUs to maximize application performance and minimize wasted CPU cycles.  The optimal number of vCPUs can never be easily identified.  But I can offer a few suggestions to help get this number right.</p>
<p><span id="more-403"></span><br />
ESX must expend CPU cycles to maintain running virtual CPUs whether they are being used by an application or not.  This means that host efficiency drops as more vCPUs are put on the server.  But applications that scale well with CPUs will deliver greater performance when their virtual machines have been given more CPUs.  The administrator must therefore balance the desires of an individual application&#8217;s owner with the needs of the entire cluster of applications.</p>
<p>There are several resources that VI administrators can use to inform their decisions in virtual machine sizing.  I have listed some of them below.</p>
<h2>Bruce Herndon&#8217;s Cost-of-SMP Article</h2>
<p>Last summer the VMmark team&#8217;s Bruce Herndon published <a href="http://blogs.vmware.com/performance/2009/06/measuring-the-cost-of-smp-with-mixed-workloads.html">an article on the cost of SMP</a>.  I summarized his findings in <a href="http://vpivot.com/2009/09/29/four-things-you-should-know-about-esx-4s-scheduler/">a vPivot article I wrote on the ESX 4 scheduler</a>.  There are two key messages that you can take away from these posts to inform your decisions on virtual machine sizing:</p>
<ul>
<li>Over-sized virtual machines only hurt system performance when the server&#8217;s CPUs are saturated.  When utilization is low, unneeded vCPUs only penalize the system&#8217;s CPU utilization, not the applications&#8217; performance.</li>
<li>Unneeded 2-way virtual machines are not very harmful to the environment.  But administrators should be very careful with 4-way virtual machines and larger.</li>
</ul>
<h2>Co-stop and Ready Time</h2>
<p>Ready time indicates a vCPU waiting for an available core when it has work to perform.  Co-scheduling stop time (or co-stop time) indicates a vCPU being paused by the scheduler to allow its sibling vCPUs to catch up.  These two counters can help administrators recognize a certain kind of stress due to limited CPU resources.</p>
<p>Ready time is generally a sign of the unavailability of CPU.  Correction usually requires the administrator to reduce work on the host (migrating virtual machines, decreasing vCPU count, etc.) or to increase CPU capacity (more hosts or faster CPUs).  Co-stop time is a sign that the scheduler is allowing vCPUs to develop skew while it runs portions of virtual machines on available cores.  Values worth investigating for these counters are 10% ready time and 3% co-stop time.  There is no guarantee that application performance is suffering if these thresholds are crossed, but a problem may be present.</p>
<p>The important thing about ready time and co-stop time is that they are signs that you are using all of the CPU you have available to you.  This could be a Good Thing.  But it could also be a surprise to you.  When these counters get high it is a good time to start asking yourself if your capacity usage meets your expectations.  If not, you should inspect your virtual machines to be sure that the applications are using the vCPUs you have given them.  If your guest tools show poor in-guest utilization then decrease those VM sizes.  That will free up resources in the cluster for more virtual machines.</p>
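<p>As a rough illustration, the two thresholds mentioned above can be checked with a few lines of Python.  This is only a sketch: the function names are hypothetical, and the conversion helper assumes the counter is a per-vCPU millisecond total collected over a 20-second sample interval, which you should confirm for your own charts.</p>

```python
# Thresholds quoted in the article: 10% ready time and 3% co-stop time.
READY_PCT_THRESHOLD = 10.0
COSTOP_PCT_THRESHOLD = 3.0


def summation_to_percent(summation_ms, interval_s=20.0):
    """Convert a millisecond total into a percentage of the sample interval.

    Assumption (not from the article): the counter is a per-vCPU total in
    milliseconds accumulated over one sample interval of interval_s seconds.
    """
    return summation_ms * 100.0 / (interval_s * 1000.0)


def flag_cpu_pressure(ready_pct, costop_pct):
    """Return one warning string per counter at or above its threshold."""
    warnings = []
    if ready_pct >= READY_PCT_THRESHOLD:
        warnings.append(f"ready time {ready_pct:.1f}% is at or above "
                        f"{READY_PCT_THRESHOLD:.0f}%")
    if costop_pct >= COSTOP_PCT_THRESHOLD:
        warnings.append(f"co-stop time {costop_pct:.1f}% is at or above "
                        f"{COSTOP_PCT_THRESHOLD:.0f}%")
    return warnings


if __name__ == "__main__":
    ready = summation_to_percent(2400)   # hypothetical sample value
    for line in flag_cpu_pressure(ready, costop_pct=1.5):
        print(line)
```

<p>A warning here is only a prompt to investigate.  As noted above, crossing a threshold does not guarantee that application performance is suffering.</p>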
<h2>Application Scalability Information</h2>
<p>I wish we lived in a world where every ISV published data showing their applications&#8217; abilities to scale with cores.  Unfortunately for us, many software vendors have for years allowed their customers to assume that each doubling of cores would double the performance of the application.  VMware has chosen to provide some scalability information so our customers know <a href="http://www.vmware.com/pdf/Perf_ESX40_Oracle-eval.pdf">how well</a> or <a href="http://www.vmware.com/files/pdf/consolidating_webapps_vi3_wp.pdf">how poorly</a> applications scale.  But every customer of a software company deserves to have the vendor provide guidance on sizing the server.  And those vendors deserve the right to put these results out on their own products.  Go talk to your ISV to get the information you need to size your virtual machines.</p>
<h2>CPU Usage Calculations and CapacityIQ</h2>
<p>I am belatedly updating this post with a fourth way of identifying oversized virtual machines: mathematical calculation or CapacityIQ.</p>
<p>When a virtual machine consistently uses only a fraction of its vCPU resources it is possible that the virtual machine can be downsized and still deliver the same application performance.  The calculation to determine this is simple: multiply the vCPU count by utilization and round up.  Set the virtual machine&#8217;s vCPU count to the result of that calculation.</p>
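<p>The calculation can be sketched in Python as follows.  The function name is hypothetical, utilization is expressed as a fraction between 0 and 1, and the floor of one vCPU is my own assumption so that an idle virtual machine is never sized to zero.</p>

```python
import math


def recommended_vcpus(current_vcpus, utilization):
    """Multiply the vCPU count by utilization and round up.

    utilization is the sustained CPU utilization of the virtual machine
    as a fraction (0.30 means 30%).  The minimum of one vCPU is an
    assumption added for this sketch.
    """
    if current_vcpus < 1:
        raise ValueError("current_vcpus must be at least 1")
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return max(1, math.ceil(current_vcpus * utilization))


if __name__ == "__main__":
    # A 4-way VM running at 30% needs 1.2 vCPUs of work, so round up to 2.
    print(recommended_vcpus(4, 0.30))
```

<p>Use a utilization figure that reflects sustained peaks rather than a long-term average, or the result may undersize a virtual machine that is busy only part of the day.</p>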
<p>If you own CapacityIQ it will make this calculation for you for every virtual machine in your data center.  Here is a screenshot of its recommendations based on virtual machine CPU and memory utilization.  Click for a clearer picture.</p>
<div id="attachment_512" class="wp-caption alignnone" style="width: 310px"><a href="http://vpivot.com/wp-content/uploads/2010/04/capiq_vm_size_recs.png"><img src="http://vpivot.com/wp-content/uploads/2010/04/capiq_vm_size_recs-300x102.png" alt="" title="Capacity IQ Recommending VM Resize" width="300" class="size-medium wp-image-512" /></a><p class="wp-caption-text">CapacityIQ monitors CPU and memory utilization to recommend VM downsizing.</p></div>
]]></content:encoded>
			<wfw:commentRss>http://vpivot.com/2010/04/30/how-many-virtual-cpus-per-vm/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>vSphere 4.0, Hyper-Threading, and Terminal Services</title>
		<link>http://vpivot.com/2010/03/17/vsphere-4-0-hyper-threading-and-terminal-services/</link>
		<comments>http://vpivot.com/2010/03/17/vsphere-4-0-hyper-threading-and-terminal-services/#comments</comments>
		<pubDate>Wed, 17 Mar 2010 22:23:28 +0000</pubDate>
		<dc:creator>drummonds</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[benchmarking]]></category>
		<category><![CDATA[hyper-threading]]></category>
		<category><![CDATA[intel]]></category>
		<category><![CDATA[scheduler]]></category>
		<category><![CDATA[terminal services]]></category>
		<category><![CDATA[vsphere]]></category>

		<guid isPermaLink="false">http://vpivot.com/?p=333</guid>
		<description><![CDATA[I recently wrote a blog article detailing Hyper-Threading (HT) and its effect on vSphere.  As an astute reader pointed out, a recent update to Project VRC&#8217;s terminal services analysis suggests disappointment with HT on vSphere.  We spent a lot of time looking at those results to understand why they contradicted the body of performance data, which [...]]]></description>
			<content:encoded><![CDATA[<p>I recently wrote <a href="http://vpivot.com/2010/03/06/hyper-threading-on-vsphere/">a blog article detailing Hyper-Threading (HT) and its effect on vSphere</a>.  As an astute reader pointed out, a recent update to <a href="http://www.virtualrealitycheck.net/">Project VRC</a>&#8217;s terminal services analysis suggests disappointment with HT on vSphere.  We spent a lot of time looking at those results to understand why they contradicted the body of performance data, which shows HT offering a 10-30% gain on vSphere.  What we discovered led us to create a vSphere patch that would allow users to improve performance in some benchmarking environments.</p>
<p><span id="more-333"></span>Among the many results presented by VRC, the configurations that most perplexed us were the two and four virtual machine configurations, each with four vCPUs per virtual machine.  The configuration with two virtual machines looked good and matched our internal numbers.  In this configuration there are a total of eight vCPUs on the host, each of which maps to its own physical core on the Xeon 5500 series processor.  The problem arose when the virtual machine count was increased to four, resulting in 16 total vCPUs.  In this configuration each vCPU is paired with one logical, Hyper-Threaded core.  Project VRC showed this configuration supporting no more desktops than the two-VM configuration, which suggests no value to Hyper-Threading on this configuration.</p>
<p>It took us some time to understand the reason for these results, but we eventually identified a very specific condition where ESX&#8217;s scheduler enforces fairness in scheduling vCPUs at a cost of throughput.  ESX&#8217;s scheduler has long been the subject of intensive scrutiny by a large number of VMware engineers to guarantee fair access to the processor for each virtual machine.  It is because of this fairness that VMware&#8217;s customers can rely on CPU resource controls.  But, when fairness goes too far, throughput may be sub-optimal.</p>
<p>Hyper-Threading presents particular problems to fairness because of the non-linear performance it delivers.  A thread will run at one speed when it has full access to a physical core, at another speed when it is sharing a core, and at a third speed when sharing a core with a different thread.  As a result, ESX&#8217;s scheduler will sometimes pause a thread to enforce fairness.  These pauses are more common when Hyper-Threading is present to account for its lack of uniformity in thread performance.  If the host lacks vCPUs that are ready to run, the result is CPU utilization below saturation, leaving CPU cycles unused.</p>
<p>There are three specific conditions that together trigger this behavior:</p>
<ol>
<li>A Xeon 5500 series processor is present with Hyper-Threading enabled,</li>
<li>CPU utilization is near saturation, and</li>
<li>A roughly one-to-one mapping between vCPUs and logical processors.</li>
</ol>
<p>In this scenario, VMware vSphere favors fairness over throughput and sometimes pauses one vCPU to dedicate a whole core to another vCPU, eliminating gains provided by Hyper-Threading.  In cases outside of these three conditions, the performance of VMware vSphere 4 meets the high expectations of VMware&#8217;s R&amp;D team and its customers.  Of course production environments rarely (never?) have a one-to-one ratio of vCPUs to logical processors.  Such a ratio occurs when there are only four 4-way virtual machines on a Xeon 5500 system with 16 logical processors, for example.</p>
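<p>To see whether an environment resembles this case, the three conditions can be expressed as a simple check.  This is a sketch with hypothetical names.  The article does not quantify &#8220;near saturation&#8221; or &#8220;roughly one-to-one&#8221;, so the 90% utilization cutoff and the 10% ratio tolerance below are assumptions, as is the two-socket, quad-core host used in the example.</p>

```python
def logical_processors(sockets, cores_per_socket, ht_enabled):
    """Count logical processors; Hyper-Threading doubles the core count."""
    cores = sockets * cores_per_socket
    return cores * 2 if ht_enabled else cores


def matches_fairness_case(vm_vcpus, sockets, cores_per_socket, ht_enabled,
                          cpu_utilization, saturation_cutoff=0.90,
                          ratio_tolerance=0.10):
    """Check the three conditions listed above for one host.

    vm_vcpus is a list with the vCPU count of each virtual machine.
    saturation_cutoff and ratio_tolerance are assumed values, not figures
    from VMware.  The check also assumes a Xeon 5500 series host.
    """
    if not ht_enabled:
        return False
    ratio = sum(vm_vcpus) / logical_processors(sockets, cores_per_socket, True)
    near_saturation = cpu_utilization >= saturation_cutoff
    roughly_one_to_one = abs(ratio - 1.0) <= ratio_tolerance
    return near_saturation and roughly_one_to_one


if __name__ == "__main__":
    # Four 4-way VMs on an assumed two-socket, quad-core host: 16 vCPUs
    # against 16 logical processors.
    print(matches_fairness_case([4, 4, 4, 4], 2, 4, True, 0.95))
    # Two 4-way VMs on the same host: eight vCPUs, one per physical core.
    print(matches_fairness_case([4, 4], 2, 4, True, 0.95))
```

<p>A host that passes this check is only a candidate for the behavior described here.  Most production hosts run many more vCPUs than logical processors and will not match.</p>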
<p>But environments such as Project VRC&#8217;s are simplifications of production environments meant to understand the capabilities of virtual platforms.  VMware has provided a patch to Project VRC that will allow them to improve throughput in their environment.  We are going to release this patch and its documentation to the general public within a couple of weeks.  I do not expect that any of VMware&#8217;s customers will benefit from the changes it allows, but I will later document the patch and its usage for anyone who cares to experiment.</p>
]]></content:encoded>
			<wfw:commentRss>http://vpivot.com/2010/03/17/vsphere-4-0-hyper-threading-and-terminal-services/feed/</wfw:commentRss>
		<slash:comments>12</slash:comments>
		</item>
		<item>
		<title>Hyper-Threading on vSphere</title>
		<link>http://vpivot.com/2010/03/06/hyper-threading-on-vsphere/</link>
		<comments>http://vpivot.com/2010/03/06/hyper-threading-on-vsphere/#comments</comments>
		<pubDate>Sat, 06 Mar 2010 18:05:38 +0000</pubDate>
		<dc:creator>drummonds</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[cpu]]></category>
		<category><![CDATA[hyper-threading]]></category>
		<category><![CDATA[intel]]></category>
		<category><![CDATA[java]]></category>
		<category><![CDATA[scheduler]]></category>
		<category><![CDATA[vmkernel]]></category>
		<category><![CDATA[vmmark]]></category>
		<category><![CDATA[vsphere]]></category>

		<guid isPermaLink="false">http://vpivot.com/?p=328</guid>
		<description><![CDATA[I continue to receive many questions from our customers on the expected performance gains of the new version of Hyper-Threading in Intel&#8217;s Core i7 processors.  The answer requires a little bit of discussion on Hyper-Threading, a little bit on ESX, and comes with some performance data.  If you are still interested, read on.
On [...]]]></description>
			<content:encoded><![CDATA[<p>I continue to receive many questions from our customers on the expected performance gains of the new version of Hyper-Threading in Intel&#8217;s Core i7 processors.  The answer requires a little bit of discussion on Hyper-Threading, a little bit on ESX, and comes with some performance data.  If you are still interested, read on.</p>
<p><span id="more-328"></span>On VI3, many of VMware&#8217;s customers disabled Hyper-Threading on their older, NetBurst architecture Intel processors.  Intel has vaguely described the new Hyper-Threading as more efficient than the previous generation and I believe this to be due to a shorter pipeline and an improved ability to context switch pipeline stage data.  Long pipelines, such as those in the NetBurst-era Xeons of model numbers x1xx and x2xx, are more likely to suffer bubbles during context switches and are therefore penalized versus shorter pipeline products, such as the Core i7.  Furthermore, by pushing and restoring pipeline stage data during a hardware context switch, the new HT can reduce pipeline bubbles.</p>
<p>But the gains vSphere users experience as a result of the new Hyper-Threading also come from changes in ESX.  ESX&#8217;s scheduler must make decisions as to when to co-locate two worlds on a physical core to take advantage of Hyper-Threading.  In some conditions the scheduler will perform this co-location and in others it will allow a world to run on the core by itself.  The decision to execute worlds concurrently instead of serially on a physical core can be informally called the scheduler&#8217;s <em>trust</em> of Hyper-Threading.  The vSphere scheduler <em>trusts</em> Hyper-Threading more than the VI3 scheduler did.  This amplifies the effect of HT.</p>
<p>I am now going to bore you with a disclaimer before I give you any data showing the effect of Hyper-Threading.  The value of HT will vary from workload to workload and the ultimate authority on HT&#8217;s value is the end-user.  The following numbers are the result of informal analysis at VMware and should only be used as a guide in your own analysis.  Please do not make purchasing decisions on this information, which is devoid of the detail we would normally commit to a white paper.</p>
<table id="newspaper-a">
<tbody>
<tr>
<th>Workload</th>
<th>Observed Throughput Gain Due to HT</th>
</tr>
<tr>
<td>VMmark</td>
<td>24%</td>
</tr>
<tr>
<td>SPECjbb</td>
<td>10%</td>
</tr>
<tr>
<td>Consolidated SQL</td>
<td>19%</td>
</tr>
</tbody>
</table>
<p>In addition to the gains we informally cite here, I can say that we have not yet seen a workload where the new Hyper-Threading slows down consolidated performance.  As far as we can tell, the new Hyper-Threading should be left enabled in 100% of virtualized environments.</p>
]]></content:encoded>
			<wfw:commentRss>http://vpivot.com/2010/03/06/hyper-threading-on-vsphere/feed/</wfw:commentRss>
		<slash:comments>11</slash:comments>
		</item>
		<item>
		<title>Inaccuracy of In-guest Performance Counters</title>
		<link>http://vpivot.com/2010/02/10/inaccuracy-of-in-guest-performance-counters/</link>
		<comments>http://vpivot.com/2010/02/10/inaccuracy-of-in-guest-performance-counters/#comments</comments>
		<pubDate>Wed, 10 Feb 2010 23:33:43 +0000</pubDate>
		<dc:creator>drummonds</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[perfmon]]></category>
		<category><![CDATA[scheduler]]></category>
		<category><![CDATA[timekeeping]]></category>
		<category><![CDATA[vmkernel]]></category>

		<guid isPermaLink="false">http://vpivot.com/?p=268</guid>
		<description><![CDATA[Every couple of months I receive a request for an explanation as to why performance counters in a virtual machine cannot be trusted.  While it is unfairly cynical to say that in-guest counters are never right, accurate capacity management and troubleshooting should rely on the counters provided by vSphere in either vCenter or esxtop.  The [...]]]></description>
			<content:encoded><![CDATA[<p>Every couple of months I receive a request for an explanation as to why performance counters in a virtual machine cannot be trusted.  While it is unfairly cynical to say that in-guest counters are never right, accurate capacity management and troubleshooting should rely on the counters provided by vSphere in either <a href="http://communities.vmware.com/docs/DOC-5600">vCenter</a> or <a href="http://communities.vmware.com/docs/DOC-9279">esxtop</a>.  The explanation is too short to merit a white paper but I hope a blog article will serve as the authoritative comment on the subject.</p>
<p><span id="more-268"></span>Usually this issue arises at a new VMware customer or at an established customer that has added new staff to the virtualization team.  In both cases the administrators are familiar with existing tools and require good reason to retool their thinking and environment around a new measurement system.</p>
<p>I was discussing the response to these concerns with my friend and colleague Kaushik Banerjee, the head of VMware&#8217;s outbound engineering group.  Kaushik and I spend a lot of time thinking about communicating technical details to our customers and in this case we chose different approaches to answer the question.  The two responses are complementary, so choose the one that suits your needs.</p>
<h2>Kaushik&#8217;s Approach: The Killer Examples</h2>
<p>Kaushik suggested that if we show cases where the guest OS&#8217;s counters are obviously wrong, a naturally suspicious VI admin will never trust the guest counters again.  To that end, I offer the following screen shots to make our point.</p>
<div id="attachment_285" class="wp-caption alignnone" style="width: 610px"><a href="http://vpivot.files.wordpress.com/2010/02/utilization_guest_higher.jpg"><img class="size-full wp-image-285" title="Guest Utilization Higher Than Host" src="http://vpivot.files.wordpress.com/2010/02/utilization_guest_higher.jpg" alt="Guest Utilization Higher Than Host" width="600" /></a><p class="wp-caption-text">Perfmon&#39;s counters show utilization higher in the guest than the host reports.</p></div>
<p>This screen shot shows two counters available in Perfmon inside a Windows guest with the <a href="http://vpivot.com/2009/09/17/using-perfmon-for-accurate-esx-performance-counters">vmStatsProvider</a> installed (available by default since vSphere).  The darker, red line is the CPU utilization as reported by the guest.  The lighter, greenish (?) line is CPU utilization of the virtual machine, from the host&#8217;s perspective.  This is the real CPU utilization passed into the guest by vmStatsProvider.  Notice how the host is always reporting higher utilization than the guest.  This is due to one of the reasons why guest counters cannot be trusted: they are unaware of hypervisor overheads.</p>
<p>This second screen shot shows a different case where the host utilization is lower than that reported by the guest.  Again, the dark red line represents the guest OS&#8217;s report of CPU utilization and the lighter line shows the real CPU utilization as reported by ESX.</p>
<div id="attachment_284" class="wp-caption alignnone" style="width: 610px"><a href="http://vpivot.files.wordpress.com/2010/02/utilization_host_higher.jpg"><img class="size-full wp-image-284" title="Host Utilization Higher Than Guest" src="http://vpivot.files.wordpress.com/2010/02/utilization_host_higher.jpg" alt="Host Utilization Higher Than Guest" width="600" /></a><p class="wp-caption-text">Perfmon&#39;s counters report a higher CPU utilization than ESX&#39;s.</p></div>
<p>The reason the host shows lower utilization than the guest is that the guest is unaware that it is only getting a fraction of the host&#8217;s CPU, time-sliced by ESX&#8217;s scheduler.  In this case the virtual machine was contending for CPU with other active virtual machines, but the same principle would apply had a CPU limit been set.</p>
<h2>Scott&#8217;s Approach: A Detailed Explanation</h2>
<p>My approach to convincing VI admins to avoid guest tools is based on the bottomless thirst for information that is common to technophiles.  If I can provide an explanation for the underlying system of resource scheduling and manipulation, then our admins will be able to deal with the guest counter issue and maybe solve other issues with their newfound knowledge.</p>
<p>There are four reasons why guest counters cannot be trusted:</p>
<ol>
<li>The guest is unaware of virtualization overheads.  As screen shot one showed above, the hypervisor will increase the CPU load as it virtualizes the hardware for the guest operating system.  That additional CPU work is not seen by guest tools.</li>
<li>The guest is unaware that it is only seeing the portion of CPU that ESX&#8217;s scheduler is allowing it to see.  Because of contention or resource restrictions, virtual machines only get a slice of the CPU&#8217;s time.  When a guest thinks it is getting 100% of the CPU it may not know that the processor is being shared by eight other virtual machines.  See the second screen shot above.</li>
<li>Time skew in virtual machines can change the sample window for time-based counters.  This means that the guest may have measured 10 milliseconds of time passage during a read command when 12 milliseconds have actually elapsed.  This is more common on older versions of ESX and when the host CPU is saturated.  More on this below.</li>
<li>The virtual machines are unaware that they are being de-scheduled when idle, which means that they appear to be working more of the time than they are.  Consider a case where a virtual machine is idle 90% of the time.  If ESX does not schedule the VM during its idle time then the guest will think that its processor queues are full 100% of the time that it is being executed.</li>
</ol>
<p>The time drift explanation (item three) was historically the most problematic for VMware.  On older versions of our products time drift was common.  As ESX has matured we have reduced the amount of drift, which has improved the accuracy of guest counters.  But the timer hardware is still being virtualized in software running on the host CPU.  This means that if the host processor is fully utilized, the timer may not be scheduled on time, resulting in a delay in some ticks and a resultant skew in guest time.</p>
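<p>The effect of skew on a rate counter is simple arithmetic, sketched below with hypothetical function names.  A rate is a count divided by a sample window, so a guest that believes the window was 10 milliseconds when 12 milliseconds actually elapsed will overstate the rate by about 20%.</p>

```python
def rate_per_ms(events, window_ms):
    """A time-based counter: events observed divided by the sample window."""
    return events / window_ms


def skew_error_pct(guest_window_ms, actual_window_ms):
    """Percentage by which the guest misstates a rate because of time skew.

    Positive values mean the guest over-reports, which happens when it
    measures a shorter window than really elapsed.
    """
    return (actual_window_ms / guest_window_ms - 1.0) * 100.0


if __name__ == "__main__":
    events = 1200                           # hypothetical number of reads
    guest_rate = rate_per_ms(events, 10)    # what the guest reports
    true_rate = rate_per_ms(events, 12)     # what actually happened
    print(guest_rate, true_rate, round(skew_error_pct(10, 12), 1))
```

<p>The event count is invented for illustration.  Only the 10 and 12 millisecond windows come from the example in item three above.</p>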
<h2>References</h2>
<p><a href="https://www.vmware.com/pdf/VI3.5_Performance.pdf">Performance Best Practices and Benchmarking Guidelines</a>.  This white paper was the last version we printed that included benchmarking best practices.  It contains some discussion on the need to measure performance from outside the host-under-test.</p>
<p><a href="http://www.vmware.com/pdf/vmware_timekeeping.pdf">Timekeeping in Virtual Machines</a>.  This document&#8211;not updated since VI3 but still accurate in its theory&#8211;will give the background on VMware-based time keeping and provide an explanation as to how skew can occur.</p>
<p><a href="http://www.vmware.com/files/pdf/perf-vsphere-cpu_scheduler.pdf">VMware vSphere™ 4: The CPU Scheduler in VMware® ESX™ 4</a>.  This white paper provides great detail on how the scheduler works, which fully explains the notion of time slicing.</p>
]]></content:encoded>
			<wfw:commentRss>http://vpivot.com/2010/02/10/inaccuracy-of-in-guest-performance-counters/feed/</wfw:commentRss>
		<slash:comments>17</slash:comments>
		</item>
		<item>
		<title>Four Things You Should Know About ESX 4&#039;s Scheduler</title>
		<link>http://vpivot.com/2009/09/29/four-things-you-should-know-about-esx-4s-scheduler/</link>
		<comments>http://vpivot.com/2009/09/29/four-things-you-should-know-about-esx-4s-scheduler/#comments</comments>
		<pubDate>Wed, 30 Sep 2009 06:00:18 +0000</pubDate>
		<dc:creator>drummonds</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[cpu]]></category>
		<category><![CDATA[scheduler]]></category>
		<category><![CDATA[vmkernel]]></category>
		<category><![CDATA[vmmark]]></category>
		<category><![CDATA[vsphere]]></category>

		<guid isPermaLink="false">http://vpivot.com/?p=11</guid>
		<description><![CDATA[[This is the last re-post of old community content.  But this content is important enough to be worth a re-post.]
I spend a great deal of time answering customers&#8217; questions about the scheduler.  Never have so many questions been asked about such an abstruse component for which so little user influence is possible.  But [...]]]></description>
			<content:encoded><![CDATA[<p><em>[This is the last <a href="http://communities.vmware.com/blogs/drummonds/2009/08/21/four-things-you-should-know-about-esx-4s-scheduler">re-post of old community content</a>.  But this content is important enough to be worth a re-post.]</em></p>
<p>I spend a great deal of time answering customers&#8217; questions about the scheduler.  Never have so many questions been asked about such an abstruse component for which so little user influence is possible.  But CPU scheduling is central to system performance, so VMware strives to provide as much information on the subject as possible.  In this blog entry, I want to point out a few nuggets of information on the CPU scheduler.  These four bullets answer 95% of the questions I get asked.</p>
<p><span id="more-11"></span></p>
<h2>Item 1: ESX 4&#8217;s Scheduler Better Uses Caches Across Sockets</h2>
<p>On UMA systems at low load levels, virtual machine performance improves when each virtual CPU (vCPU) is placed on its own socket.  This is because providing each vCPU its own socket also gives it the entire cache on that CPU.  On page 18 of a <a class="jive-link-external" href="http://www.vmware.com/files/pdf/perf-vsphere-cpu_scheduler.pdf">recent paper on the scheduler written by Seongbeom Kim</a>, a graph highlights the case where vCPU spreading improves performance.</p>
<p><img class="jive-image-thumbnail jive-image" src="http://communities.vmware.com/servlet/JiveServlet/downloadImage/38-4886-6674/Picture+2.png" alt="Picture 2.png" width="620" /></p>
<p>The X-axis represents different combinations of VM and vCPU counts.  SPECjbb is memory intensive and shows great gains with increases in CPU cache.  The few cases that show dramatic benefit due to the ESX 4.0 scheduler are benefiting from the distribution of vCPUs across sockets.  Very large gains are possible in this somewhat uncommon case.</p>
<h2>Item 2: Overuse of SMP Only Slows Consolidated Environments At Saturation</h2>
<p>For years customers have asked me how many vCPUs they should give to their VMs.  The best guidance, &#8220;as few as possible&#8221;, seems too vague to satisfy.  It remains the only correct answer, unfortunately.  But <a class="jive-link-external" href="http://blogs.vmware.com/performance/2009/06/measuring-the-cost-of-smp-with-mixed-workloads.html">a recent experiment performed by Bruce Herndon&#8217;s team</a> sheds some light on this VM sizing question.</p>
<p>In this experiment we ran VMmark against VMs that were configured outside of VMmark specifications.  In one case some of the virtual machines were given too few vCPUs and in another they were given too many.  Because VMmark&#8217;s workload is fixed, increasing the VMs&#8217; sizes does not increase the work performed by the VMs.  In other words, the system&#8217;s score does not depend on the VMs&#8217; vCPU count.  Until CPU saturation, that is.</p>
<p><img class="jive-image-thumbnail jive-image" src="http://communities.vmware.com/servlet/JiveServlet/downloadImage/38-4886-6675/Picture+3.png" alt="Picture 3.png" width="620" /></p>
<p>Notice that the scores are similar between the undersized, right-sized, and over-sized VMs.  Up until tile 10 (60 VMs) they are nearly identical.  There is a slight difference in processor utilization that begins to impact throughput (score) as the system runs out of CPU.  At that point the additional vCPUs waste cycles which degrades system performance.  Two points I will call out from this work:</p>
<ul>
<li>Sloppy VI admins that provide too many vCPUs need not worry about performance when their servers are under low load.  But performance will suffer when CPU utilization spikes.</li>
<li>The penalty of over-sizing VMs gets worse as VMs get larger.  Using a 2-way VM is not that bad, but unneeded use of 4-way VMs when one or two processors suffice can cost up to 15% of your system throughput.  I presume that unnecessarily using eight vCPUs would be criminal.</li>
</ul>
<h2>Item 3: ESX Has Not Strictly Co-scheduled Since ESX 2.5</h2>
<p>I have documented ESX&#8217;s relaxation of co-scheduling previously (<a class="jive-link-wiki" href="http://communities.vmware.com/docs/DOC-4960">Co-scheduling SMP VMs in VMware ESX Server</a>).  But this statement cannot be repeated too frequently: ESX has not strictly co-scheduled virtual machines since version 2.5.   This means that ESX can place vCPUs from SMP VMs individually.  It is not necessary to wait for physical cores to be available for every vCPU before starting the VM.  However, as Item 2 pointed out, this does not give you free license to over-size your VMs.  Be frugal with your SMP VMs and assign vCPUs only when you need them.</p>
<h2>Item 4: The Cell Construct Has Been Eliminated in ESX 4.0</h2>
<p>In the performance best practices deck that I give at conferences I talk about the benefits of creating small virtual machines over large ones.  In versions of ESX up to ESX 3.5, the scheduler used a construct called a cell that would contain and lock CPU cores.  The vCPUs from a single VM could never span a cell.  With ESX 3.x&#8217;s cell size of four, this meant that VMs never spanned multiple four-core sockets.  Consider this figure:</p>
<p><img class="jive-image" src="http://communities.vmware.com/servlet/JiveServlet/downloadImage/38-4886-6688/Picture+1.png" alt="http://communities.vmware.com/servlet/JiveServlet/downloadImage/38-4886-6688/Picture+1.png" /></p>
<p>What this figure shows is that a 4-way VM on ESX 3.5 can only be placed in two locations on this hypothetical two-socket configuration.  There are 12 combinations for a 2-way VM and eight for a uniprocessor VM.  The scheduler has more opportunities to optimize VM placement when you provide it with smaller VMs.</p>
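<p>These placement counts are simple combinatorics and can be reproduced with a short Python sketch.  The function names are hypothetical.  The second function counts every possible set of cores once the cell restriction is removed, which is only an illustration of how much larger the search space becomes and not a description of how the ESX 4 scheduler chooses cores.</p>

```python
from math import comb


def placements_with_cells(vcpus, sockets=2, cores_per_socket=4, cell_size=4):
    """Count core sets for one VM when its vCPUs may not span a cell."""
    cells = (sockets * cores_per_socket) // cell_size
    return cells * comb(cell_size, vcpus)


def placements_without_cells(vcpus, sockets=2, cores_per_socket=4):
    """Count core sets when any combination of cores is allowed.

    Purely combinatorial illustration; the real scheduler weighs many
    other factors when it places vCPUs.
    """
    return comb(sockets * cores_per_socket, vcpus)


if __name__ == "__main__":
    for vcpus in (4, 2, 1):
        print(vcpus,
              placements_with_cells(vcpus),
              placements_without_cells(vcpus))
```

<p>With the default two-socket, four-core-per-socket host, the first function returns the 2, 12, and 8 placements quoted above for 4-way, 2-way, and uniprocessor VMs.</p>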
<p>In ESX 4 we have eliminated the cell lock so VMs can span multiple sockets, as item one states.  Continue to think of this placement problem as a challenge to the scheduler that you can alleviate.  By choosing multiple, smaller VMs you free the scheduler to pursue opportunities to optimize performance in consolidated environments.</p>
]]></content:encoded>
			<wfw:commentRss>http://vpivot.com/2009/09/29/four-things-you-should-know-about-esx-4s-scheduler/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>
