vPivot

Scott Drummonds on Virtualization

Windows Guest Defragmentation


Today at VMware Partner Exchange I had a lunchtime discussion with a partner of ours that makes a Windows file system (NTFS) defragmentation tool. He related anecdotes of incredible performance acceleration credited to defragmentation and quoted a few numbers based on his test environment. When he asked me what VMware’s recommendations were on the subject I remained uncharacteristically silent. Do we have best practices on this?

When people ask me about file system fragmentation I explain that fragmentation can come from two sources: the guest file system or VMFS.  In 2009 we included experiments in our thin provisioning white paper that showed that both internal and external fragmentation in VMFS have no significant effect on performance.  As for guest fragmentation, VMware has avoided the business of optimizing native operating systems so there is no extant, official guidance.

More precisely, the large number of mappings from the guest file to the disk make it difficult to know how changes to each can impact the system’s performance as a whole.  But in talking with this partner I realized that there are two inescapable truths that suggest guest defragmentation is critical in a virtualized environment:

  1. Defragmentation can decrease the number of disk commands and the resultant IOPS.
  2. The fewer IOs, the more efficient the virtualization.

Guest defrag tools will order each file’s blocks sequentially in the guest file system.  This enables the guest to make a smaller number of calls to larger, contiguous regions of data than it would have made had the blocks been scattered across the guest file system.  By making fewer calls to larger blocks, the following things happen:

  • The array can leverage its faster sequential access capabilities to improve storage throughput.
  • The hypervisor handles fewer SCSI messages from the guest resulting in lower overhead.
  • The smaller number of commands results in fewer outstanding operations in the 32-element HBA queue, which allows more virtual machines to access the storage concurrently.
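
To make the effect concrete, here is a minimal Python sketch, not a measurement. It assumes a simplified model in which the guest issues one read command per contiguous run of blocks and ignores maximum transfer sizes, caching, and read-ahead. The function name `count_read_commands` and both block layouts are hypothetical, chosen only for illustration.

```python
def count_read_commands(block_addresses):
    """Count the contiguous runs in a file's on-disk block list.

    Simplifying assumption: reading the whole file sequentially costs
    one read command per contiguous run of blocks.
    """
    if not block_addresses:
        return 0
    commands = 1
    for prev, cur in zip(block_addresses, block_addresses[1:]):
        # A gap or a backwards jump starts a new run, hence a new command.
        if cur != prev + 1:
            commands += 1
    return commands


# Hypothetical layouts for the same 8-block file.
fragmented = [10, 11, 40, 41, 42, 7, 90, 91]
defragmented = list(range(100, 108))

print(count_read_commands(fragmented))    # 4
print(count_read_commands(defragmented))  # 1
```

Under this model the fragmented layout needs four commands where the defragmented one needs a single larger command. How much of that difference survives the hypervisor and the array is exactly the open question.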

I have not found out how much consolidated workloads have to gain from guest defragmentation.  Nor have I quantified the impact to shared storage of a shift from a larger number of small commands to a smaller number of large commands.  But I am going to work with this partner and see if we can publish some numbers in 2010.

10 Responses

Hi Scott,

I think in the “power user” Windows community this “speedup” always gets touted but I don’t think it is all that significant.

I wonder if it would increase the efficiency of any SAN dedupe though? That would increase cache hits and lower space requirements. Again I’m not sure this would be a speedup, and it depends on how the block sizes of the SAN and the guest NTFS compare.

Tom

  • Thanks for this post; I think the topic isn’t getting enough attention.

    A few thoughts…

    Enterprise storage arrays have optimized read and write patterns, and as such defrag can work against these optimizations.

    On thin provisioned VMDKs, defrag can cause one to lose the storage savings of the thin format as all blocks are copied and rewritten inside of the VMDK, thus increasing its size. I discussed this topic here:

    http://blogs.netapp.com/virtualstorageguy/2009/10/vce-101-thin-provisioning-part-1-the-basics.html

    Once a thin VMDK gets inflated, it takes a lot of effort to deflate, including the final step of Storage VMotion.

    This deflation process has ripple effects in terms of infrastructure scaling, manageability, and possible networking bandwidth (specifically if one is using SRM).

    In addition, defrag writes out blocks that get reversed by arrays with production data deduplication technologies.

    I covered these points here:

    http://blogs.netapp.com/virtualstorageguy/2009/10/vce-101-thin-provisioning-part-2-going-beyond.html

    It is NetApp’s official position that defragmenting GOS file systems is not required, and actually adds IO load to the controller, which is unnecessary.

    Regarding the VMware thin provisioning report… I’d like to highlight a gap. The test bed is devoid of any negative impacts associated with thin provisioning. This is a result of the tested.

    The block allocation ‘blip’ seen with TP VMDKs (note – it’s not that big of a perf hit but it is there) is only seen with VMFS. It is not seen with NFS.

    The I/O generating tool, IOMeter, pre-created the files against which the I/O is generated. As the I/O measurement came after the allocation of new blocks (for the TP VMDK to grow), the tests should provide results equal to thick VMDKs.

    An interesting test would have been to start with multiple thin and multiple thick VMDKs on a shared VMFS datastore, and time how long it takes IOMeter to create the file in each VMDK.

    Again, while I disagree with the summary of this post I do want to say thanks for bringing up this topic and hopefully we will have more discussions / comments.

    • Defragmentation seems so valuable on your desktop because you have a single large disk drive that can play ping-pong if the disk is forced to bounce the head around for almost any read operation it has to do. In larger scale environments… those VMs are spread across multiple physical disks and the arrays (regardless of vendor) have technology to try and reduce the amount of head movement you have to do. Net/Net – Put a couple of 5 disk RAID 5 disk groups on your local PC and you won’t really see the need to defragment it either 😉

    • Hi Vaughn,

      I work for one of the major defrag vendors, and agree that this topic isn’t getting enough attention as well.

      I also agree with the caveats you mention (true for many CoW-type technologies). These are important for admins to be aware of. But, I do have to disagree with your conclusion (and agree with Scott).

      There is benefit to be gained from defragmentation, largely from the fact that you eliminate unnecessary I/O to allow for greater VM density, more efficient use of existing infrastructure, etc…

      I should also add that different defrag vendors offer varying solutions to address these caveats. Not all defrag is the same, and third parties tend to keep pace with industry’s technology progression more than free or built-in tools.

      I suggest we (you, Scott, myself, and any other applicable vendors) get in contact. I think if we share ideas and opinions, maybe do some tests, we can come to a universally agreed position on the subject. Our common customers win.

  • I do not see much point in defragmenting guests stored on a SAN array; the data blocks are scattered all around anyway. Defragmentation also nullifies the benefits of vSphere’s changed block tracking, which is essential for Data Recovery to work properly.

  • Split IOs on NTFS occur because the IO request is larger than the allocation unit size of the file system. An example would be an 8K IO request on an NTFS Filesystem that was formatted with the default allocation unit size of 4K. Not too many years back, most of the server vendors shipped automated build programs for loading the Windows OS. These programs started with a FAT partition that was later converted to NTFS. That resulted in an allocation unit size of 512 bytes. As you can imagine, this resulted in a lot of excessive split IOs. It’s no accident that the default allocation unit size is 4K. The page file on a windows system is accessed in 4K pages. This extreme case mismatch did cause system slowdowns back then. To prevent unnecessary split IOs, you need an allocation unit size as close as possible to the average IO size on the volume. This is of course from the OS perspective. VMware and other hypervisor layers aggregate IO and may or may not write to the storage with the same IO size. Likewise, the storage itself may have a storage virtualization layer with a specific IO size. The disks themselves have sectors which may be as small as 512 bytes or as large as 4K on some newer drives. Choosing an allocation unit size isn’t as easy as it once was. You may just end up shifting work in the guest to work in the hypervisor or work in the storage controller or even on the disk itself.
    Disks, or technically the files that reside on disks that are larger than the allocation unit size, become fragmented for many reasons. If I have a logical drive, D:, and it only has one file on it, that file will never become fragmented. If I have multiple files, but I fill them to their maximum size on creation, those files will never become fragmented. Fragmentation happens for a combination of reasons. When you are extending multiple files, the extents are interleaved, causing “fragmentation”. If you delete a file, NTFS uses a first fit algorithm. As you write a new file in the gap, if the file is larger than the gap it could become “fragmented”. All of this discussion assumes an explicit placement on the disk where block 1 is followed by block 2 and then block 3, etc. That explicit placement doesn’t exist in virtualized or thin provisioned volumes. De-duplication as well does not use explicit placement. Even with a fixed or thick disk, defragmentation would have no impact on random IO over a large working set. It could potentially improve sequential read IO.
    So, the moral of the story is to know your stack. If the primary application is writing 4K IOs, the hypervisor is writing 8K IOs, and the storage controller is writing 16K IOs to a disk that may have a sector size of 4K, then allocation unit size would determine which layer does the work; it wouldn’t reduce the work in any but the extreme edge cases. Likewise, disk defragmentation may improve sequential read workloads, but is unlikely to improve random workloads. In addition, it is unlikely to improve performance on thin provisioned, deduped, or virtualized volumes.

    John
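
The interleaved-extent case described in the comment above can be illustrated with a small Python sketch. It assumes a toy allocator that always hands out the next free block, with no preallocation or reserved space, which is far simpler than what NTFS actually does. The names `allocate_interleaved` and `count_fragments` and the file names are hypothetical.

```python
def allocate_interleaved(appends):
    """Assign consecutive disk blocks to files in the order they append.

    `appends` is a sequence of (file_name, block_count) pairs.
    Assumption: a toy allocator that always uses the next free block.
    """
    next_free = 0
    layout = {}
    for name, count in appends:
        layout.setdefault(name, []).extend(range(next_free, next_free + count))
        next_free += count
    return layout


def count_fragments(blocks):
    """Number of contiguous runs in a file's block list."""
    return sum(
        1 for i, b in enumerate(blocks) if i == 0 or b != blocks[i - 1] + 1
    )


# Two files extended alternately, two blocks at a time.
layout = allocate_interleaved([("a.log", 2), ("b.log", 2)] * 3)
print(layout["a.log"], count_fragments(layout["a.log"]))
# [0, 1, 4, 5, 8, 9] 3

# The same amount of data written to one file filled at creation.
single = allocate_interleaved([("a.log", 6)])
print(single["a.log"], count_fragments(single["a.log"]))
# [0, 1, 2, 3, 4, 5] 1
```

In this model each file that grows alongside another ends up in three fragments, while the file filled to its full size at creation stays in one piece, which matches the two cases the comment contrasts.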

  • […] defragmentation of VMs a good thing? Scott Drummonds asks the same question in this blog post. My only comment: avoid defragmentation with thin provisioned disks (array-level or […]

  • […] I have received questions about guest defragmentation tools for years.  Until today I could only pose theories as to the value of guest defragmentation.  But previous theories spawned new research and one of VMware’s partners is now putting […]

  • Scott, Did anything come of this? Did any work happen with the partners?