vPivot

Scott Drummonds on Virtualization

Optimizing Memory Utilization

My recent series of blog articles has discussed ESX memory management and the performance specter of host swapping. My last article attempted to correct the misconception that VMware recommends against memory over-commit. In that article I suggested that memory over-commit is a requirement for optimizing memory utilization. Today I want to provide a specific example to show why this is true. I have also included tips for identifying host swapping in your environments.

Understanding the Bottleneck

Let me show the value of over-commit and the danger of swapping by way of an example. I will use the following typical values to demonstrate my point:

  • All virtual machines are on a single host which has 32 GB of RAM installed.
  • Each virtual machine is sized to 8 GB of RAM.
  • Each virtual machine has 25% active memory (%ACTV in esxtop and “Active” in vCenter).

Now consider how host active memory grows with the number of virtual machines:

  • 3 VMs: 3 * 8 GB * 25% = 6 GB of active memory. Without memory over-commit, only about 19% of the host’s memory is actively in use. What a waste!
  • 12 VMs: 12 * 8 GB * 25% = 24 GB of active memory. Memory is over-committed by 200%, yet only 75% of host memory is actively in use. At this aggressive consolidation the virtual machines will run at full speed until active usage exceeds 100% of host memory.
  • 18 VMs: 18 * 8 GB * 25% = 36 GB of active memory, limited by the 32 GB installed in the host. These virtual machines want 36 GB of RAM but only 32 GB is available, so ESX must swap to let them run and performance will suffer greatly.

A virtual machine’s active memory is dictated by the application and its usage. But the VI admin has complete control over the number of virtual machines in the environment, which means host active memory can be influenced by adding or removing virtual machines. Because a virtual machine’s active memory is always at or below 100% of its configured size, the only way to drive the host’s active memory to 100% is to over-commit memory. This is why hypervisors that do not support memory over-commit are simply not viable for data centers where memory optimization is a priority.
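
For readers who want to plug in their own numbers, here is a minimal sketch of the arithmetic above. The 32 GB host, 8 GB virtual machines, and 25% active figure are the same illustrative assumptions used in this example, not measurements from a real environment.

```python
# Minimal sketch of the over-commit arithmetic above. All values are the
# illustrative assumptions from this article, not measurements.

HOST_RAM_GB = 32        # physical memory installed in the host
VM_SIZE_GB = 8          # configured memory per virtual machine
ACTIVE_FRACTION = 0.25  # share of each VM's memory that is actually active

def host_memory_picture(vm_count):
    """Return (allocated GB, active GB, over-commit ratio, swapping likely)."""
    allocated = vm_count * VM_SIZE_GB
    active = allocated * ACTIVE_FRACTION
    overcommit = allocated / HOST_RAM_GB    # > 1.0 means memory is over-committed
    swapping_likely = active > HOST_RAM_GB  # active demand exceeds installed RAM
    return allocated, active, overcommit, swapping_likely

for vms in (3, 12, 18):
    alloc, active, ratio, swap = host_memory_picture(vms)
    print(f"{vms:2d} VMs: {alloc:3d} GB allocated, {active:4.1f} GB active, "
          f"{ratio:.2f}x over-commit, swapping likely: {swap}")
```

Running it reproduces the three scenarios: host active memory only approaches the installed 32 GB once configured memory is well past 100% of physical RAM, and swapping only threatens when active memory itself exceeds what is installed.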

Identifying and Correcting the Bottleneck

Ongoing swapping is identified by a non-zero swap rate in either esxtop or vCenter. In addition to the swap rate, esxtop provides a swap wait time in its CPU panel. When the swap rate reaches hundreds of kilobytes per second or the swap wait time exceeds a couple of percentage points, it is time for corrective action.
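
As a quick illustration of that rule of thumb, here is a small sketch that flags a host for corrective action when either metric crosses those rough thresholds. The exact threshold values and the sample readings are illustrative assumptions, not published VMware limits.

```python
# Rough encoding of the rule of thumb above. Threshold values and sample
# readings are illustrative assumptions only.

SWAP_RATE_LIMIT_KBPS = 200.0  # "hundreds of kilobytes per second"
SWAP_WAIT_LIMIT_PCT = 2.0     # "a couple percentage points" of swap wait

def needs_corrective_action(swap_rate_kbps, swap_wait_pct):
    """True when host swapping is heavy enough to warrant action."""
    return (swap_rate_kbps >= SWAP_RATE_LIMIT_KBPS
            or swap_wait_pct >= SWAP_WAIT_LIMIT_PCT)

# Hypothetical readings gathered from esxtop or vCenter for two hosts
readings = {"esx01": (0.0, 0.1), "esx02": (350.0, 3.5)}
for host, (rate_kbps, wait_pct) in readings.items():
    status = "needs corrective action" if needs_corrective_action(rate_kbps, wait_pct) else "OK"
    print(f"{host}: swap rate {rate_kbps} KB/s, swap wait {wait_pct}% -> {status}")
```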

There are three possible solutions to this problem:

  1. Balance memory usage by moving virtual machines from hosts with higher memory usage to hosts with lower memory usage.
  2. Run fewer virtual machines.
  3. Buy more memory.

Designing Your Infrastructure to Simplify Memory Management

Ultimately I owe you a full white paper on memory management to provide a sufficient answer. But I want to give you two of the tools and techniques that I will be describing in that future paper. First, place host swap files on solid state disk (SSD) datastores to improve their performance. With the right SSD device it may be possible to eliminate swap penalties. Second, even if SSDs are unavailable, consider consolidating multiple swap files onto a single datastore. This will make swap rate monitoring very easy but may compound the performance penalties of swapping.

Stay tuned and VMware will provide more documentation on memory management in 2010.

16 Responses

Wouldn’t TPS make a difference? Active pages will also be deduped, so although 25% is active, ESX might store much less.

    • TPS makes some of the calculations I provided a little more complex. The amount of active memory would decrease some, as would the inactive memory, which is immaterial to the calculation. The value of TPS is also workload-dependent, making it even more difficult to apply to a hypothetical calculation. TPS can only help this calculation, but by how much I have no way of knowing. So I eliminated it for the sake of simplicity.
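
      A rough way to see the direction of the effect is to fold an assumed sharing fraction into the earlier arithmetic; the 30% figure below is purely a placeholder, since, as noted, the real value is workload-dependent.

      ```python
      # Fold a hypothetical TPS sharing fraction into the earlier sketch.
      # The 30% figure is a placeholder; real sharing is workload-dependent.

      HOST_RAM_GB = 32
      VM_SIZE_GB = 8
      ACTIVE_FRACTION = 0.25
      SHARED_FRACTION = 0.30  # assumed share of pages deduplicated by TPS

      for vms in (12, 18):
          active = vms * VM_SIZE_GB * ACTIVE_FRACTION
          active_after_tps = active * (1 - SHARED_FRACTION)
          print(f"{vms} VMs: {active:.0f} GB active, ~{active_after_tps:.0f} GB after TPS, "
                f"swapping likely: {active_after_tps > HOST_RAM_GB}")
      ```

      Whatever the true fraction, sharing only moves the numbers down, which is why TPS can only help this calculation.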

      • But TPS normally gives excellent results, since we tend to run many identical OSes (even if the applications differ). I just posted this on Gabe’s blog; I will paste it again:

        —————————————-

        Some real data, from a customer cluster, in full production. I looked at 4 hosts from 10 available:

        Host 1 – 16 VMs – 12GB Shared – 30 GB Used – 64GB Total
        Host 2 – 14 VMs – 11GB Shared – 30 GB Used – 64GB Total
        Host 3 – 15 VMs – 10GB Shared – 32 GB Used – 64GB Total
        Host 4 – 25 VMs – 30GB Shared – 32 GB Used – 64GB Total

        Yes, I have a host right now, with 30GB Shared (I looked many times, and confirmed with the performance data on vCenter to be sure).

        These are 99% Windows Server 2003 VMs, which explains the huge amount of page sharing.

        The hosts still have plenty of available memory, but they would probably be near their limits without TPS. Also, MEMCTL is zero in all cases; ballooning has not even kicked in. I can *easily* double the VM density in this situation, since the memctl driver would save even more memory.
        ———————————

  • Hello Scott,
    Is 25% active memory true in all cases? If so, then the design I proposed here will be totally void: http://vcp4.wordpress.com/2010/01/08/one-min-design-guide-of-40-2gb-vms/

    • There is no number that is “true in all cases”. As I say in the article, “A virtual machine’s active memory is dictated by the application and its usage.” So the active memory will change from VM to VM, from system to system, and from data center to data center. 25% is an arbitrary selection, probably near average.

      Scott

  • […] The Cost Per Application Calculator makes it clear that investing in VMware vSphere 4 significantly reduces your datacenter hardware footprint and associated costs.  Scott Drummonds, the VMware performance expert, recently explained how memory overcommit is the only way to effectively use all of the physical RAM in a hypervisor. […]

  • […]   Why is it important to track active memory –check this post […]

  • Hi Scott,

    You posted this message a while ago over on vmware:

    http://communities.vmware.com/blogs/drummonds/2009/09/09/love-your-balloon-driver

    Couple of questions

    [1]

    You mentioned that because Java has its own memory management, it is wise to set a reservation for Java.

    Are there ANY OTHER enterprise applications/situations where the VM’s OS cannot “see” what is going on inside its own memory, as is the case with Java? Or is it just Java that causes this issue?

    [2]

    For Java, you mention it is wise to set a reservation to account for the JVM,OS, and heap. Say I have a 4GB VM, and I calculate JVM+OS+Heap = 1GB, and I set a reservation for 1GB.

    Because the JVM/OS/heap will load first, does that mean those components will end up “landing” inside the 1GB I’ve reserved? And thus, if the balloon driver DOES inflate, at least it won’t swap out fundamental JVM/heap components? Is this the reason for recommending setting the memory reservation for Java?

    Or, to put it another way, what ends up sitting in the 1GB I’ve reserved? And can this change during the lifetime of a VM? How do I ensure that the JVM/OS is what ends up in the 1GB I’ve reserved, as is the intention?

    Regards
    mark

    • Mark,

      [1] Anything that runs an interpreter/simulator like Java in the guest OS could cause the same problem. I am defining interpreter/simulator here as something that manages its threads’ memory inside the process. I am aware of no applications other than Java that behave this way, although I would expect the same of .NET.

      [2] You cannot tell what ends up in the reserved memory. But it does not matter where the memory is until the balloon driver starts to inflate. At that time it will start to take memory from the guest OS, forcing it to page out low-priority processes. Usually the guest will do the right thing and page out the least active process(es).

      Scott

      • Hi Scott,

        Thanks for your response.

        I’m still not clear on the mechanics of why setting a memory reservation for Java workloads is beneficial.

        I am thinking of the following concepts:

        [1]
        Because there is a memory reservation, there is less memory available for the balloon driver to inflate into, thus reducing the possibility of key areas of JRE memory being swapped to disk.

        [2]
        If the guest OS can’t “see” inside the JRE memory, how is it actually going to “do the right thing” and swap out only the least active processes? Are we in fact just rolling the dice here and hoping the guest OS won’t swap out key JRE memory, or is there more science to it than that?

        Would really appreciate it if you could break this down to the next level of detail, as I’m still confused.

        Regards,
        mark

        • Mark,

          The guest will avoid swapping memory that is in use. The Java memory space is generally being used in its entirety because garbage collection will be sweeping the entire heap regularly. So, setting reservations to account for everything actively in use (including the entire heap) protects enough memory for the guest processes to run without being paged out by the guest OS.

          With respect to comment [2], this is not an exact science. But we can reasonably trust that any guest’s memory management will page out less active processes before more active ones. It is true, though, that if active memory spikes above the un-ballooned memory then the guest will have to swap. It is just as likely to swap Java memory as anything else, so at that point performance will probably suffer.

          Scott

  • Hi Scott,

    Thanks for your response.

    According to the vSphere Resource Management Guide (vsp_41_resource_mgmt.pdf) page 107, the maximum amount of memory reclaimed by the balloon driver is 65% of the total memory allocated to a VM.

    If I have a 4 GB VM, and I calculate OS, JVM + Heap = 1GB, which is 25% of total allocated, I know the balloon driver won’t hit at least 35% of my memory.

    Therefore, in situations where OS, JVM + Heap is less than 35% of total allocated VM memory, does it actually make any difference setting a reservation on that VM?

    Regards
    mark
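
For anyone following the arithmetic in that last comment, here is a quick worked version. The 65% balloon limit is the figure cited from the vSphere Resource Management Guide; the 4 GB VM and 1 GB OS + JVM + heap footprint are the commenter’s example, not a recommendation.

```python
# Worked version of the balloon-limit arithmetic from the comment above.
# The 65% limit is the cited guide value; the VM size and footprint are
# the commenter's example numbers.

vm_size_gb = 4.0
balloon_limit = 0.65                                   # max share the balloon driver can reclaim
never_ballooned_gb = vm_size_gb * (1 - balloon_limit)  # memory the balloon can never touch
footprint_gb = 1.0                                     # OS + JVM + heap to protect

print(f"Balloon can reclaim at most {vm_size_gb * balloon_limit:.1f} GB")
print(f"Memory the balloon can never reclaim: {never_ballooned_gb:.1f} GB")
print(f"Footprint fits inside the never-ballooned portion: {footprint_gb <= never_ballooned_gb}")
```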