<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Pivot Point &#187; vmmark</title>
	<atom:link href="http://vpivot.com/tag/vmmark/feed/" rel="self" type="application/rss+xml" />
	<link>http://vpivot.com</link>
	<description>Scott Drummonds on Virtualization</description>
	<lastBuildDate>Wed, 08 Sep 2010 08:37:56 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>SPECvirt Released</title>
		<link>http://vpivot.com/2010/07/26/specvirt-released/</link>
		<comments>http://vpivot.com/2010/07/26/specvirt-released/#comments</comments>
		<pubDate>Mon, 26 Jul 2010 05:27:56 +0000</pubDate>
		<dc:creator>drummonds</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[benchmarking]]></category>
		<category><![CDATA[specvirt]]></category>
		<category><![CDATA[vmmark]]></category>

		<guid isPermaLink="false">http://vpivot.com/?p=607</guid>
		<description><![CDATA[SPEC has been diligently working on an industry standard version of VMmark since something like 2006.  The first version of their product is complete and was released during my recent holiday.  I have been talking with colleagues and customers about SPECvirt for years and would like to talk about what SPECvirt is and what [...]]]></description>
			<content:encoded><![CDATA[<p>SPEC has been diligently working on an industry standard version of VMmark since something like 2006.  The first version of their product is complete and was <a href="http://www.spec.org/virt_sc2010/press/release.html">released</a> during <a href="http://www.e-scott.net/blog/?p=339">my recent holiday</a>.  I have been talking with colleagues and customers about SPECvirt for years and would like to talk about what SPECvirt is and what it is not.</p>
<p><span id="more-607"></span>VMmark is clearly the reigning king of consolidation benchmarks and anything that enters its arena must stand against its standard.  VMmark pioneered a new method of benchmarking that resonates with virtualization experts.  It tests system performance by adding fixed load virtual machines instead of scaling up a single application to system saturation.  Traditional benchmarks tune up their load generation against a single instance but VMmark piles on virtual machines until the system is capable of no more work.</p>
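<p>The tile-scaling idea can be sketched in a few lines of Python.  This is only an illustration and not VMmark itself: the function names, the fixed demand of 12 work units per tile, and the host capacity of 100 units are all assumptions invented for the example.  The loop adds one fixed-load tile at a time and stops when aggregate throughput no longer rises.</p>

```python
def toy_throughput(tiles, tile_demand=12.0, host_capacity=100.0):
    """Toy model (an assumption, not VMmark's scoring): every tile asks for
    a fixed amount of work and the host delivers at most host_capacity."""
    return min(tiles * tile_demand, host_capacity)


def tiles_at_saturation(measure, max_tiles=64):
    """Add fixed-load tiles one at a time until throughput stops rising.

    measure is any callable that returns aggregate throughput for a given
    tile count.  Returns (tile_count, throughput) for the last tile count
    that still improved the result.
    """
    best_tiles, best_score = 0, 0.0
    for tiles in range(1, max_tiles + 1):
        score = measure(tiles)
        if score <= best_score:
            # The extra tile added no throughput: the host is saturated.
            break
        best_tiles, best_score = tiles, score
    return best_tiles, best_score


if __name__ == "__main__":
    tiles, score = tiles_at_saturation(toy_throughput)
    print(f"saturated at {tiles} tiles, throughput {score}")
```

<p>In a real run the <code>measure</code> callable would be replaced by an actual benchmark pass at each tile count.  The point of the sketch is only that the per-tile load never changes; the tile count is the variable.</p>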
<p>VMmark is one of VMware&#8217;s many industry-leading initiatives and was started when VMware worked closely with server vendors that wanted to benchmark their servers&#8217; ability to run virtual machines.  VMmark was conceived many years ago, well before VMware had competition.  It is because of this fact that I scratch my head at claims that VMmark is biased towards VMware.  There was no commercial implementation of Xen when VMmark was specified and Microsoft was only dreaming of entering the market.</p>
<p>But even in an environment devoid of competition, customers want certainty that their benchmarks are not hiding flaws in a product.  SPEC has for years been developing honest benchmarks that survive the crucible of debate among its large member community.  SPECvirt, or more properly SPECvirt_sc2010, is the result of this vigorous debate.  You can read up on SPECvirt in the <a href="http://www.spec.org/virt_sc2010/docs/SPECvirt_FAQ.html">FAQ</a> released coincident with the product&#8217;s launch.  But I will add a few comments and comparisons here.</p>
<ol>
<li>SPECvirt costs $3000 to purchase.  VMmark is free.  But VMmark requires commercial software and versions of SPEC benchmarks that are not free.  Depending on your licensing model, you may find VMmark or SPECvirt cheaper.  But the prices of each are essentially comparable.</li>
<li>VMmark uses the most common applications in the data center (like Apache and Microsoft Exchange).  SPECvirt does not mandate application choice for the system under test.
<ul>
<li>This is a Good Thing, because you may now choose a configuration that models your environment by running the exact applications you run.</li>
<li>This is a Bad Thing, because five different testers may choose five different application sets in their tests resulting in incomparable results.</li>
</ul>
</li>
<li>SPECvirt cannot be run against a cluster of hosts.  But VMmark cannot, either.  We will have to wait for an update to one of these benchmarks before we can properly test DRS clusters and their competitive equivalents.</li>
<li>There is only <a href="http://www.spec.org/virt_sc2010/results/specvirt_sc2010_perf.html">one published SPECvirt result</a>, courtesy of IBM running KVM.  There are a boatload of <a href="http://www.vmware.com/products/vmmark/results.html">VMmark results</a>, as one would expect of a more mature product.  It will be interesting to watch the rate of submissions of these two benchmarks over the coming year or two.</li>
<li>SPECvirt runs three workloads and an idle virtual machine in its tile.  One of those workloads, tested by SPECweb, is implemented with three virtual machines.  The end product is a six-VM tile that looks very much like VMmark&#8217;s six-VM tile.</li>
</ol>
<p>For years we have seen online and in-person griping about VMware&#8217;s misunderstood benchmark restriction in its EULA.  Both VMmark and SPECvirt can be run on any supported hypervisor.  So now it&#8217;s time for all the hypervisor vendors to put up or shut up.  Run one of these benchmarks on your product and compare the results against existing published results.  Then the world will know where your product stands.</p>
]]></content:encoded>
			<wfw:commentRss>http://vpivot.com/2010/07/26/specvirt-released/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Hyper-Threading on vSphere</title>
		<link>http://vpivot.com/2010/03/06/hyper-threading-on-vsphere/</link>
		<comments>http://vpivot.com/2010/03/06/hyper-threading-on-vsphere/#comments</comments>
		<pubDate>Sat, 06 Mar 2010 18:05:38 +0000</pubDate>
		<dc:creator>drummonds</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[cpu]]></category>
		<category><![CDATA[hyper-threading]]></category>
		<category><![CDATA[intel]]></category>
		<category><![CDATA[java]]></category>
		<category><![CDATA[scheduler]]></category>
		<category><![CDATA[vmkernel]]></category>
		<category><![CDATA[vmmark]]></category>
		<category><![CDATA[vsphere]]></category>

		<guid isPermaLink="false">http://vpivot.com/?p=328</guid>
		<description><![CDATA[I continue to receive many questions from our customers on the expected performance gains of the new version of Hyper-Threading in Intel&#8217;s Core i7 processors.  The answer requires a little bit of discussion on Hyper-Threading, a little bit on ESX, and comes with some performance data.  If you are still interested, read on.
On [...]]]></description>
			<content:encoded><![CDATA[<p>I continue to receive many questions from our customers on the expected performance gains of the new version of Hyper-Threading in Intel&#8217;s Core i7 processors.  The answer requires a little bit of discussion on Hyper-Threading, a little bit on ESX, and comes with some performance data.  If you are still interested, read on.</p>
<p><span id="more-328"></span>On VI3, many of VMware&#8217;s customers disabled Hyper-Threading on their older, Netburst architecture Intel processors.  Intel has vaguely described the new Hyper-Threading as more efficient than the previous generation and I believe this to be due to a shorter pipeline and an improved ability to context switch pipeline stage data.  Long pipelines&#8211;such as the Netburst era Xeons of model numbers x1xx and x2xx&#8211;are more likely to suffer bubbles during context switches and are therefore penalized versus shorter pipeline products, such as the Core i7.  Furthermore, by pushing and restoring pipeline stage data during a hardware context switch, the new HT can reduce pipeline bubbles.</p>
<p>But the gains vSphere users experience as a result of the new Hyper-Threading also come from changes in ESX.  ESX&#8217;s scheduler must make decisions as to when to co-locate two worlds on a physical core to take advantage of Hyper-Threading.  In some conditions the scheduler will perform this co-location and in others it will allow a world to run on the core by itself.  The decision to execute worlds concurrently instead of serially on a physical core can be informally called the scheduler&#8217;s <em>trust</em> of Hyper-Threading.  The vSphere scheduler <em>trusts</em> Hyper-Threading more than the VI3 scheduler did.  This amplifies the effect of HT.</p>
<p>I am now going to bore you with a disclaimer before I give you any data showing the effect of Hyper-Threading.  The value of HT will vary from workload to workload and the ultimate authority on HT&#8217;s value is the end-user.  The following numbers are the result of informal analysis at VMware and should only be used as a guide in your own analysis.  Please do not make purchasing decisions on this information, which is devoid of the detail we would normally commit to a white paper.</p>
<table id="newspaper-a">
<tbody>
<tr>
<th>Workload</th>
<th>Observed Throughput Gain Due to HT</th>
</tr>
<tr>
<td>VMmark</td>
<td>24%</td>
</tr>
<tr>
<td>SPECjbb</td>
<td>10%</td>
</tr>
<tr>
<td>Consolidated SQL</td>
<td>19%</td>
</tr>
</tbody>
</table>
<p>In addition to the gains we informally cite here, I can say that we have not yet seen a workload where the new Hyper-Threading slows down consolidated performance.  As far as we can tell, the new Hyper-Threading should be left enabled in 100% of virtualized environments.</p>
]]></content:encoded>
			<wfw:commentRss>http://vpivot.com/2010/03/06/hyper-threading-on-vsphere/feed/</wfw:commentRss>
		<slash:comments>11</slash:comments>
		</item>
		<item>
		<title>Top Five VROOM! Entries for 2009</title>
		<link>http://vpivot.com/2009/10/06/top-five-vroom-entries-for-2009/</link>
		<comments>http://vpivot.com/2009/10/06/top-five-vroom-entries-for-2009/#comments</comments>
		<pubDate>Tue, 06 Oct 2009 21:45:34 +0000</pubDate>
		<dc:creator>drummonds</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[citrix]]></category>
		<category><![CDATA[oracle]]></category>
		<category><![CDATA[storage]]></category>
		<category><![CDATA[vista]]></category>
		<category><![CDATA[vmmark]]></category>
		<category><![CDATA[workstation]]></category>
		<category><![CDATA[xenapp]]></category>
		<category><![CDATA[xenserver]]></category>

		<guid isPermaLink="false">http://vpivot.com/?p=118</guid>
		<description><![CDATA[I love VMware&#8217;s performance blog, VROOM!  It is our most popular performance communication vehicle and its content is backed by a stellar engineering team with unmatched integrity.  Each article details the nuances of VMware performance and educates on application and platform best practices.  I love all the articles but am always surprised as to which [...]]]></description>
			<content:encoded><![CDATA[<p>I love VMware&#8217;s performance blog, VROOM!  It is our most popular performance communication vehicle and its content is backed by a stellar engineering team with unmatched integrity.  Each article details the nuances of VMware performance and educates on application and platform best practices.  I love all the articles but am always surprised as to which our readers find most popular.  Here is a countdown of the five entries most read in 2009.</p>
<p><span id="more-118"></span></p>
<h2>Number 5: <a href="http://blogs.vmware.com/performance/2007/07/comparing-intel.html">Comparing Intel Dual-Core and Quad-Core Using VMmark</a></h2>
<p>This article received 4,700 views in 2009.  I am not surprised that this two-year-old article is so popular, though.  Even this morning at <a href="http://blogs.vmware.com/vmtn/2009/09/intel-vmware-live-chat-tuesday-oct-6.html">Intel&#8217;s live chat</a> I was asked for advice on selecting core/socket configurations.  Everyone wants to know if the cores on the largest servers can be put to good use.  The article summed up findings with this graph:</p>
<img title="Quad versus dual core parts." src="http://blogs.vmware.com/photos/uncategorized/2007/07/18/quadblog2.jpg" alt="VMmark results showing quad-core HP servers versus dual-core HP servers." width="600" />
<p>Doubling cores or sockets does not exactly double performance.  In this case the quad-core system produced 70% more throughput than its equivalent dual-core configuration.</p>
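<p>One way to read that 70% figure is as a scaling efficiency: how much of the ideal, perfectly linear gain was actually realized.  The short Python sketch below works through the arithmetic.  The function name and its parameters are my own for illustration and do not come from the original article.</p>

```python
def scaling_efficiency(throughput_gain, resource_ratio):
    """Fraction of ideal linear scaling that was achieved.

    throughput_gain: fractional gain, e.g. 0.70 for 70% more throughput.
    resource_ratio: how many times more cores, e.g. 2.0 for dual to quad.
    """
    return (1.0 + throughput_gain) / resource_ratio


if __name__ == "__main__":
    # Quad-core versus dual-core: twice the cores, 70% more throughput.
    efficiency = scaling_efficiency(0.70, 2.0)
    print(f"scaling efficiency: {efficiency:.0%}")
```

<p>With twice the cores and 1.7 times the throughput, the quad-core system delivers 85% of what perfect linear scaling would have produced.</p>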
<h2>Number 4: <a href="http://blogs.vmware.com/performance/2008/05/100000-io-opera.html">100,000 I/O Operations Per Second, One ESX Host</a></h2>
<p>In VMware&#8217;s early days customers were concerned that ESX lacked the storage throughput for demanding workloads.  In our first effort to dispel this rumor, we partnered with EMC to demonstrate a handful of VMs driving 100,000 IOPS on a single host.  A year later we updated those results on vSphere and <a href="http://blogs.vmware.com/performance/2009/05/350000-io-operations-per-second-one-vsphere-host-with-30-efds.html">showed VMs on ESX demanding 365,000 IOPS from EMC&#8217;s Enterprise Flash Devices (EFDs)</a>.  But the original article remains the more popular and has received 8,000 hits in 2009.</p>
<h2>Number 3: <a href="http://blogs.vmware.com/performance/2009/01/virtualizing-xenapp-on-xenserver-50-and-esx-35-1.html">Virtualizing XenApp on XenServer 5.0 and ESX 3.5</a></h2>
<p>I guess that people love <a href="http://www.catalyst.burtongroup.com/Na09/PlayerVideo011.html">a good argument</a>.  Our customers have heard enough misinformation on vSphere and XenServer performance that our performance team wanted to weigh in with a rare competitive comparison.  That article has received 13,400 hits in 2009.  It details an experiment in which we used a workload that will eventually be released as View Planner to demonstrate scalability performance of VMware&#8217;s and Citrix&#8217;s respective products.</p>
<img title="Maximum Desktop Counts on XenServer and vSphere" src="http://blogs.vmware.com/.a/6a00d8341c328153ef01053704d486970c-pi" alt="In a heterogeneous workload based on a large number of applications, VMware vSphere outperforms Citrix&#8217;s offering." width="600" />
<p>Desktop and terminal services performance is continually a hot topic.  Unlike enterprise applications, there is no industry standard for measuring virtual desktop performance.  VMware has favored an approach that uses many operations on the most common desktop applications.  Every other public comparison either restricts its measurements to a single operation or favors easy automation over application relevancy.</p>
<h2>Number 2: <a href="http://blogs.vmware.com/performance/2007/11/ten-reasons-why.html">Ten Reasons Why Oracle Databases Run Best on VMware</a></h2>
<p>In his first public work at VMware, our Chief Performance Architect, Richard McDougall, showed us how good he is at evangelizing technology to large audiences.  His article has received 14,300 hits in 2009 despite its 2007 publication date.  Oracle databases are incredibly popular topics among VI administrators and Richard was the first to point out that VI3&#8217;s storage throughput ability was far beyond the needs of the average 4-way Oracle database:</p>
<img title="VI3 Storage Performance Exceeds Oracle Database Needs" src="http://blogs.vmware.com/photos/uncategorized/2007/11/13/iops_2.png" alt="In 2007, VI3 supported IO capabilities to support up to 80 4-way Oracle DBs." width="600" />
<p>Back in 2007 we were trying to convince everyone to virtualize their demanding Oracle databases.  We knew that VI3 could handle it and our early adopters knew, too.  But since vSphere launched, everyone has known.  If you think that vSphere cannot handle your most demanding Oracle databases, <a href="http://www.vmware.com/pdf/Perf_ESX40_Oracle-TPC-C-eval.pdf">think again</a>.</p>
<h2>Number 1: <a href="http://blogs.vmware.com/performance/2007/05/windows_vista_p.html">Windows Vista Performance in VMware Workstation 6.0</a></h2>
<p>I attribute the fact that this aging article has received 15,000 hits in 2009 to the incredible size of consumer desktop products as opposed to enterprise software and hardware.  But it is worth mentioning that this article was written by Zhelong Pan, the developer responsible for esxtop, who has also written <a href="http://communities.vmware.com/docs/DOC-9279">the most popular article on the VMware performance communities</a>.  That guy has the magic touch!</p>
]]></content:encoded>
			<wfw:commentRss>http://vpivot.com/2009/10/06/top-five-vroom-entries-for-2009/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Four Things You Should Know About ESX 4&#039;s Scheduler</title>
		<link>http://vpivot.com/2009/09/29/four-things-you-should-know-about-esx-4s-scheduler/</link>
		<comments>http://vpivot.com/2009/09/29/four-things-you-should-know-about-esx-4s-scheduler/#comments</comments>
		<pubDate>Wed, 30 Sep 2009 06:00:18 +0000</pubDate>
		<dc:creator>drummonds</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[cpu]]></category>
		<category><![CDATA[scheduler]]></category>
		<category><![CDATA[vmkernel]]></category>
		<category><![CDATA[vmmark]]></category>
		<category><![CDATA[vsphere]]></category>

		<guid isPermaLink="false">http://vpivot.com/?p=11</guid>
		<description><![CDATA[[This is the last re-post of old community content.  But this content is important enough to be worth a re-post.]
I spend a great deal of time answering customers&#8217; questions about the scheduler.  Never have so many questions been asked about such an abstruse component for which so little user influence is possible.  But [...]]]></description>
			<content:encoded><![CDATA[<p><em>[This is the last <a href="http://communities.vmware.com/blogs/drummonds/2009/08/21/four-things-you-should-know-about-esx-4s-scheduler">re-post of old community content</a>.  But this content is important enough to be worth a re-post.]</em></p>
<p>I spend a great deal of time answering customers&#8217; questions about the scheduler.  Never have so many questions been asked about such an abstruse component for which so little user influence is possible.  But CPU scheduling is central to system performance, so VMware strives to provide as much information on the subject as possible.  In this blog entry, I want to point out a few nuggets of information on the CPU scheduler.  These four bullets answer 95% of the questions I get asked.</p>
<p><span id="more-11"></span></p>
<h2>Item 1: ESX 4&#8217;s Scheduler Better Uses Caches Across Sockets</h2>
<p>On UMA systems at low load levels, virtual machine performance improves when each virtual CPU (vCPU) is placed on its own socket.  This is because providing each vCPU its own socket also gives it the entire cache on that CPU.  On page 18 of a <a class="jive-link-external" href="http://www.vmware.com/files/pdf/perf-vsphere-cpu_scheduler.pdf">recent paper on the scheduler written by Seongbeom Kim</a>, a graph highlights the case where vCPU spreading improves performance.</p>
<p><img class="jive-image-thumbnail jive-image" src="http://communities.vmware.com/servlet/JiveServlet/downloadImage/38-4886-6674/Picture+2.png" alt="Picture 2.png" width="620" /></p>
<p>The X-axis represents different combinations of VM and vCPU counts.  SPECjbb is memory intensive and shows great gains with increases in CPU cache.  The few cases that show dramatic benefit due to the ESX 4.0 scheduler are benefiting from the distribution of vCPUs across sockets.  Very large gains are possible in this somewhat uncommon case.</p>
<h2>Item 2: Overuse of SMP Only Slows Consolidated Environments At Saturation</h2>
<p>For years customers have asked me how many vCPUs they should give to their VMs.  The best guidance, &#8220;as few as possible&#8221;, seems too vague to satisfy.  It remains the only correct answer, unfortunately.  But <a class="jive-link-external" href="http://blogs.vmware.com/performance/2009/06/measuring-the-cost-of-smp-with-mixed-workloads.html">a recent experiment performed by Bruce Herndon&#8217;s team</a> sheds some light on this VM sizing question.</p>
<p>In this experiment we ran VMmark against VMs that were configured outside of VMmark specifications.  In one case some of the virtual machines were given too few vCPUs and in another they were given too many.  Because VMmark&#8217;s workload is fixed, increasing the VMs&#8217; sizes does not increase the work performed by the VMs.  In other words, the system&#8217;s score does not depend on the VMs&#8217; vCPU count.  Until CPU saturation, that is.</p>
<p><img class="jive-image-thumbnail jive-image" src="http://communities.vmware.com/servlet/JiveServlet/downloadImage/38-4886-6675/Picture+3.png" alt="Picture 3.png" width="620" /></p>
<p>Notice that the scores are similar between the undersized, right-sized, and over-sized VMs.  Up until tile 10 (60 VMs) they are nearly identical.  There is a slight difference in processor utilization that begins to impact throughput (score) as the system runs out of CPU.  At that point the additional vCPUs waste cycles, which degrades system performance.  Two points I will call out from this work:</p>
<ul>
<li>Sloppy VI admins that provide too many vCPUs need not worry about performance when their servers are under low load.  But performance will suffer when CPU utilization spikes.</li>
<li>The penalty of over-sizing VMs gets worse as VMs get larger.  Using a 2-way VM is not that bad, but unneeded use of 4-way VMs when one or two processors suffice can cost up to 15% of your system throughput.  I presume that unnecessarily assigning eight vCPUs would be criminal.</li>
</ul>
<h2>Item 3: ESX Has Not Strictly Co-scheduled Since ESX 2.5</h2>
<p>I have documented ESX&#8217;s relaxation of co-scheduling previously (<a class="jive-link-wiki" href="http://communities.vmware.com/docs/DOC-4960">Co-scheduling SMP VMs in VMware ESX Server</a>).  But this statement cannot be repeated too frequently: ESX has not strictly co-scheduled virtual machines since version 2.5.  This means that ESX can place vCPUs from SMP VMs individually.  It is not necessary to wait for physical cores to be available for every vCPU before starting the VM.  However, as Item 2 pointed out, this does not give you free license to over-size your VMs.  Be frugal with your SMP VMs and assign vCPUs only when you need them.</p>
<h2>Item 4: The Cell Construct Has Been Eliminated in ESX 4.0</h2>
<p>In the performance best practices deck that I give at conferences I talk about the benefits of creating small virtual machines over large ones.  In versions of ESX up to ESX 3.5, the scheduler used a construct called a cell that would contain and lock CPU cores.  The vCPUs from a single VM could never span a cell.  With ESX 3.x&#8217;s cell size of four this meant that VMs never spanned multiple four-core sockets.  Consider this figure:</p>
<p><img class="jive-image" src="http://communities.vmware.com/servlet/JiveServlet/downloadImage/38-4886-6688/Picture+1.png" alt="http://communities.vmware.com/servlet/JiveServlet/downloadImage/38-4886-6688/Picture+1.png" /></p>
<p>What this figure shows is that a 4-way VM on ESX 3.5 can only be placed in two locations on this hypothetical two-socket configuration.  There are 12 combinations for a 2-way VM and eight for a uniprocessor VM.  The scheduler has more opportunities to optimize VM placement when you provide it with smaller VMs.</p>
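<p>Those placement counts can be reproduced by brute-force enumeration.  The Python sketch below is an illustration under stated assumptions, not a model of the real scheduler: the function names are mine, a placement is simply a set of distinct cores, and the second function assumes that without cells any set of cores on the host is a legal placement.</p>

```python
from itertools import combinations
from math import comb


def cell_placements(vcpus, sockets=2, cores_per_socket=4, cell_size=4):
    """Count the core sets a VM can occupy when its vCPUs may not span a cell."""
    total_cores = sockets * cores_per_socket
    cores = range(total_cores)
    count = 0
    for start in range(0, total_cores, cell_size):
        cell = cores[start:start + cell_size]
        # Every way of choosing `vcpus` distinct cores inside this one cell.
        count += sum(1 for _ in combinations(cell, vcpus))
    return count


def free_placements(vcpus, sockets=2, cores_per_socket=4):
    """Upper bound with no cells (assumption: any set of cores is allowed)."""
    return comb(sockets * cores_per_socket, vcpus)


if __name__ == "__main__":
    for vcpus in (4, 2, 1):
        print(vcpus, cell_placements(vcpus), free_placements(vcpus))
```

<p>For the two-socket, four-cores-per-socket host in the figure, the cell-constrained counts come out to 2, 12, and 8 for 4-way, 2-way, and uniprocessor VMs.  Under the no-cell assumption the same host offers 70, 28, and 8 candidate core sets, which is only a rough upper bound because a real scheduler weighs cache sharing and load rather than treating all core sets as equal.</p>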
<p>In ESX 4 we have eliminated the cell lock so VMs can span multiple sockets, as item one states.  Continue to think of this placement problem as a challenge to the scheduler that you can alleviate.  By choosing multiple, smaller VMs you free the scheduler to pursue opportunities to optimize performance in consolidated environments.</p>
]]></content:encoded>
			<wfw:commentRss>http://vpivot.com/2009/09/29/four-things-you-should-know-about-esx-4s-scheduler/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>
