Pivot Point

Scott Drummonds on Virtualization

Optimizing Memory Utilization

My recent series of blog articles has discussed ESX memory management and the performance specter of host swapping. My last article attempted to correct the misconception that VMware recommends against memory over-commit. In that article I suggested that memory over-commit is a requirement for optimizing memory utilization. Today I want to provide a specific example to show why this is true. I have also included tips for identifying host swapping in your environment.

Understanding the Bottleneck

Let me show the value of over-commit and the danger of swapping by way of an example. I will use the following typical values to demonstrate my point:

  • All virtual machines are on a single host which has 32 GB of RAM installed.
  • Each virtual machine is sized to 8 GB of RAM.
  • Each virtual machine has 25% active memory (%ACTV in esxtop and “Active” in vCenter).

| VM Count | Active Memory in Host | Comments |
|---|---|---|
| 3 | 3 * 8 GB * 25% = 6 GB | Without memory over-commit, only about 19% of the host’s memory is actively in use. What a waste! |
| 12 | 12 * 8 GB * 25% = 24 GB | Memory is over-committed by 200% but only 75% is actively being used. In this aggressive consolidation, virtual machines will run at full speed until usage exceeds 100% of host memory. |
| 18 | 18 * 8 GB * 25% = 36 GB, limited to 32 GB by host | These virtual machines want 36 GB of RAM but are limited to the 32 GB that is installed on the host. ESX must swap to allow these machines to run and performance will suffer greatly. |
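
The arithmetic in the table can be reproduced with a few lines of Python. This is only a sketch under the example's assumptions (32 GB host, 8 GB virtual machines, 25% active memory); the function name and dictionary keys are invented for illustration, and hypervisor overhead and page sharing are ignored.

```python
def host_memory_summary(vm_count, vm_size_gb=8, active_fraction=0.25, host_gb=32):
    """Summarize configured and active memory for identical VMs on one host.

    All defaults are the hypothetical values from the example above.
    """
    configured_gb = vm_count * vm_size_gb            # memory promised to the VMs
    active_demand_gb = configured_gb * active_fraction  # memory the VMs actively touch
    return {
        "configured_gb": configured_gb,
        # Percentage by which configured memory exceeds installed memory (0 if it does not).
        "overcommit_pct": max(0.0, (configured_gb - host_gb) / host_gb * 100),
        "active_demand_gb": active_demand_gb,
        "active_pct_of_host": active_demand_gb / host_gb * 100,
        # Active demand above installed memory means the host has to swap.
        "must_swap": active_demand_gb > host_gb,
    }


if __name__ == "__main__":
    for vm_count in (3, 12, 18):
        print(vm_count, host_memory_summary(vm_count))
```

With 12 virtual machines the sketch reports 200% over-commit and 75% active memory; with 18 it reports an active demand of 36 GB, which is more than the host has installed.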

A virtual machine’s active memory is dictated by the application and its usage. But the VI admin has complete control over the number of virtual machines in the environment, which means host active memory can be influenced by adding or removing virtual machines. Because a virtual machine’s active memory is always equal to or less than 100% of its configured memory, the only way to drive the host active memory to 100% is to over-commit memory. This is why hypervisors that do not support memory over-commit are simply not viable for data centers where memory optimization is a priority.
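
To make the admin's lever concrete, here is a small sketch that estimates how many identical virtual machines a host can run before active memory exceeds installed memory. The function name is hypothetical, the defaults are the example values from this article, and the estimate deliberately ignores hypervisor overhead and any savings from page sharing.

```python
import math


def max_vms_before_swapping(host_gb=32, vm_size_gb=8, active_fraction=0.25):
    """Largest VM count whose combined active memory still fits in host RAM."""
    active_per_vm_gb = vm_size_gb * active_fraction  # 2 GB per VM with the defaults
    return math.floor(host_gb / active_per_vm_gb)


if __name__ == "__main__":
    vm_count = max_vms_before_swapping()
    print(vm_count, "VMs,", vm_count * 8, "GB configured on a 32 GB host")
```

Under these assumptions the host reaches 100% active memory at 16 virtual machines, which is 128 GB of configured memory on 32 GB of RAM. A real deployment would want headroom below that point, since active memory varies with the workload.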

Identifying and Correcting the Bottleneck

The ongoing occurrence of swapping is identified by a non-zero swap rate in either esxtop or vCenter.  In addition to swap rate, esxtop provides a swap wait time in its CPU panel.  When swap rate exceeds hundreds of kilobytes per second or swap wait time exceeds a couple of percentage points, it is time for corrective action.
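
The rule of thumb above can be written as a simple check. This is a sketch, not a VMware tool: the function name is hypothetical, the two numeric limits are assumed stand-ins for "hundreds of kilobytes per second" and "a couple of percentage points", and the inputs are values you would read yourself from esxtop or vCenter.

```python
# Assumed thresholds; the article gives only rough magnitudes, so tune these.
SWAP_RATE_LIMIT_KBPS = 200.0
SWAP_WAIT_LIMIT_PCT = 2.0


def classify_swapping(swap_in_kbps, swap_out_kbps, swap_wait_pct):
    """Return 'ok', 'swapping', or 'act' for one host or virtual machine sample."""
    swap_rate_kbps = max(swap_in_kbps, swap_out_kbps)
    if swap_rate_kbps > SWAP_RATE_LIMIT_KBPS or swap_wait_pct > SWAP_WAIT_LIMIT_PCT:
        return "act"        # past the rule-of-thumb limits: take corrective action
    if swap_rate_kbps > 0:
        return "swapping"   # non-zero swap rate: swapping is occurring, keep watching
    return "ok"


if __name__ == "__main__":
    print(classify_swapping(0, 0, 0.0))
    print(classify_swapping(50, 10, 0.5))
    print(classify_swapping(500, 0, 0.5))
```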

There are three possible solutions to this problem:

  1. Balance the virtual machines’ memory usage by moving virtual machines from hosts with higher memory usage to hosts with lower memory usage.
  2. Run fewer virtual machines.
  3. Buy more memory.

Designing Your Infrastructure to Simplify Memory Management

Ultimately I owe you a full white paper on memory management to provide a sufficient answer. But I want to give you two examples of the tools and techniques that I will be describing in this future paper.  First, place host swap files on solid state disk (SSD) stores to improve their performance.  With the right SSD device it may be possible to eliminate swap penalties.  Second, even if SSDs are unavailable, consider consolidating multiple swap files onto a single store.  This will make swap rate monitoring very easy but may compound the performance penalties of swapping.

Stay tuned and VMware will provide more documentation on memory management in 2010.

10 Responses

Wouldn’t TPS make a difference? Active pages will also be deduped. So although 25% is active, ESX might store way less.

    • TPS makes some of the calculations I provided a little more complex. The amount of active memory would decrease some, as would the inactive memory, which is immaterial to the calculation. The value of TPS is also workload-dependent, making it even more difficult to apply to a hypothetical calculation. TPS can only help this calculation, but by how much I have no way of knowing. So I eliminated it for the sake of simplicity.

      • But TPS normally gives excellent results, since we tend to have too many identical OSes (even if the applications differ). I just posted this on Gabe’s blog, will paste again:

        —————————————-

        Some real data, from a customer cluster, in full production. I looked at 4 hosts from 10 available:

        Host 1 – 16 VMs – 12GB Shared – 30 GB Used – 64GB Total
        Host 2 – 14 VMs – 11GB Shared – 30 GB Used – 64GB Total
        Host 3 – 15 VMs – 10GB Shared – 32 GB Used – 64GB Total
        Host 4 – 25 VMs – 30GB Shared – 32 GB Used – 64GB Total

        Yes, I have a host right now, with 30GB Shared (I looked many times, and confirmed with the performance data on vCenter to be sure).

        These are 99% Windows Server 2003 VMs, which explains the huge amount of page sharing.

        Hosts still have too much available memory, but would probably be near the limits without TPS. Also, MEMCTL is zero in all cases; ballooning did not even kick in. I can *easily* double the VM density in this situation, since the memctl driver would save even more memory.
        ———————————

  • Hello Scott,
    Is 25% active memory true in all cases? If so, then the design which I proposed here will be totally void: http://vcp4.wordpress.com/2010/01/08/one-min-design-guide-of-40-2gb-vms/

    • There is no number that is “true in all cases”. As I say in the article, “A virtual machine’s active memory is dictated by the application and its usage”. So, the active memory will change from VM to VM, from system to system and DC to DC. 25% is an arbitrary selection, probably near average.

      Scott

  • [...] The Cost Per Application Calculator makes it clear that investing in VMware vSphere 4 significantly reduces your datacenter hardware footprint and associated costs.  Scott Drummonds, the VMware performance expert, recently explained how memory overcommit is the only way to effectively use all of the physical RAM in a hypervisor. [...]

  • [...]   Why is it important to track active memory –check this post [...]
