Scott Drummonds on Virtualization

VMware Linked Clone IO Implications


A month ago I was in Palo Alto seeing old friends and talking VMware. I spent some time with an old friend in performance engineering talking about vSphere’s implementation of linked clones. Conceptually I know linked clones require an additional map to translate the guest’s view of a contiguous file system into a reordered collection of blocks in multiple VMDK files. But I was not sure exactly how this technology worked and love hearing details. I want to share a few comments about this discussion.

Because some of the details are key to a proprietary technology, at my friend’s request I will omit some of the low-level details. Furthermore, VMware shared with me their intentions for future versions of vSphere, and I cannot talk about futures. Talk to your VMware representative under NDA and he can fill in the details on future products and their expected release dates.

Now that the boilerplate disclaimer is out of the way, let’s continue.

Regardless of virtual disk type (thick, thin, linked clone, whatever), from the guest operating system’s perspective its disk is a “normal” direct attached storage (DAS) disk. Operating systems play some tricks with this contiguous empty space and sometimes are just lazy about file placement. In the “tricks” category OSes will often place critical files near the disk’s edge to benefit from the faster read rates at the platter’s periphery. In the “lazy” category OSes will place new files in the first large unused space that is available. I learned this from my friend Bob Nolan at PerfectDisk (Raxco) during our joint work on guest defragmentation over a year ago. This lazy placement produces fragmented files and free space, both of which harm performance.

The point of this discussion on tricks and laziness is that file placement from the perspective of an OS user is arbitrary. But from the perspective of a vSphere administrator looking at a linked clone the result is minimally sized files. A curious admin will then ask: how does the mapping from the guest’s view to the VMDK work?

VMware has built above VMFS a management layer that engineering calls the delta disk. (This is the same name used for snapshots’ delta disk files.) It is at the delta disk layer that guest IOs are mapped to the VMDKs on the VMFS volume. This mapping happens at a very granular level. The delta disk tracks IOs in units small enough to avoid internal fragmentation for every IO size.

The benefit of this mapping process is that all IOs from the guest, no matter how logically distant they are from each other, can be placed contiguously in a VMDK. This is very good for storage efficiency. The downside is that the additional layer of indirection can increase latency for IOs. At its worst, one IO in the guest could result in many IOs at VMFS: the first to read the metadata that contains the guest-to-VMDK mapping, and then possibly many more to collect the data that may be spread throughout the VMDKs.
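
To make the trade-off concrete, here is a minimal toy model in Python. It is not VMware's implementation, whose details are proprietary and omitted above. The class name ToyDeltaDisk, the word "grain" for the mapping unit, and the append-on-first-write policy are all assumptions chosen for illustration. The sketch shows two things: guest writes that are far apart still land side by side in the delta file, and one guest-contiguous read can fan out into several back-end reads.

```python
class ToyDeltaDisk:
    """Toy model of a packed delta file (illustrative only, not VMware's design).

    Offsets are counted in "grains", a hypothetical fixed-size mapping unit.
    """

    def __init__(self):
        self.grain_map = {}  # guest grain number -> slot in the packed delta file
        self.slots = 0       # number of grains stored in the delta file so far

    def write(self, guest_grain, count=1):
        """Record a guest write; new grains are appended in arrival order."""
        for g in range(guest_grain, guest_grain + count):
            if g not in self.grain_map:
                self.grain_map[g] = self.slots
                self.slots += 1

    def read_plan(self, guest_grain, count):
        """Return the back-end reads needed for one guest read.

        Each entry is (source, start, length). Unmapped grains fall through
        to the base disk, which is assumed to be laid out in guest order.
        Adjacent grains from the same source are merged into one read.
        """
        plan = []
        for g in range(guest_grain, guest_grain + count):
            if g in self.grain_map:
                source, pos = "delta", self.grain_map[g]
            else:
                source, pos = "base", g
            last = plan[-1] if plan else None
            if last and last[0] == source and last[1] + last[2] == pos:
                plan[-1] = (source, last[1], last[2] + 1)  # extend previous read
            else:
                plan.append((source, pos, 1))
        return plan


# Logically distant guest writes are packed into three adjacent slots.
packed = ToyDeltaDisk()
for grain in (0, 1_000_000, 42):
    packed.write(grain)
print(packed.slots)  # 3

# Grains written out of order: one 5-grain guest read becomes 5 back-end reads.
scattered = ToyDeltaDisk()
for grain in (7, 3, 5, 4, 6):
    scattered.write(grain)
print(len(scattered.read_plan(3, 5)))  # 5
```

The plan above counts only data reads. In this model the lookup in grain_map stands in for the metadata read that has to happen first.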

VMFS has for a long time buffered linked clone metadata. This reduces the need to fetch metadata before every linked clone IO. Note that this is not the same thing as host-based file buffering, which has never been present in any version of ESX.
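
The effect of buffering metadata can be sketched with a small counter. How VMFS actually buffers linked clone metadata is not described here, so everything in this example is an assumption: the LRU eviction policy, the name ToyMetadataCache, and the GRAINS_PER_PAGE constant are hypothetical. It only shows why keeping recently used mapping pages in memory cuts the number of metadata fetches.

```python
from collections import OrderedDict

GRAINS_PER_PAGE = 4  # hypothetical: map entries held by one metadata page


class ToyMetadataCache:
    """Counts metadata fetches under a simple LRU cache (an assumed policy)."""

    def __init__(self, capacity):
        self.capacity = capacity    # max cached metadata pages; 0 disables caching
        self.pages = OrderedDict()  # cached page numbers, least recently used first
        self.fetches = 0            # metadata reads that had to go to storage

    def lookup(self, guest_grain):
        page = guest_grain // GRAINS_PER_PAGE
        if page in self.pages:
            self.pages.move_to_end(page)  # hit: mark as most recently used
            return
        self.fetches += 1                 # miss: fetch the page from storage
        if self.capacity > 0:
            self.pages[page] = True
            if len(self.pages) > self.capacity:
                self.pages.popitem(last=False)  # evict least recently used page


def metadata_fetches(trace, capacity):
    """Replay a list of guest grain numbers and return the metadata fetch count."""
    cache = ToyMetadataCache(capacity)
    for grain in trace:
        cache.lookup(grain)
    return cache.fetches


trace = [0, 1, 2, 3, 0, 1, 8, 9, 0]
print(metadata_fetches(trace, capacity=0))  # 9: one metadata fetch per IO
print(metadata_fetches(trace, capacity=2))  # 2: only the first touch of each page
```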

The fun thing about talking to VMware engineering is you learn all the amazing things they are working on in future releases. The bad thing about being a blogger is that I know all about these things but am prohibited from sharing them here. But for those of you with an NDA with VMware I recommend you buy your system engineer a beer and ask him to tell you about where VMware is going with VMFS. Linked clones are incredibly important to VMware’s strategy with end user computing (nee VDI) and VMware has some righteous plans for continued innovation with linked clones and a future version of VMFS.

And if you don’t have an NDA with VMware or access to a system engineer who can talk you through this stuff, try to attend VMworld this year in Las Vegas. I have no doubt that VMware’s engineering teams will be previewing all of these technologies if they have not already released them.

4 Responses

Nice post Scott! Just out of curiosity, does your article imply that Full Clone (Thin Provisioned) desktops would have a lower IO latency than Linked Clones? If so, how do you think storage cache would play in a Linked Clone environment when compared to a full clone environment?


    • Hi, Andre,

      Thin disks work on a different technology than linked clones, which are sometimes called sparse disks by VMware engineering. Thin disks have no overhead for mapping fine-grained IO to packed VMDKs. This was shown with performance data in http://www.vmware.com/pdf/vsp_4_thinprov_perf.pdf.

      Thin disks are therefore better performing than linked clones. However, because they lack the ability to map IOs of any size to a packed file, the resultant VMDK is less space efficient.

  • you dickhead. all high and mighty with the “ooo, look at me. i have access to vmware secrets, i’m so special and you must love me for it and think i’m great because of it.”

    you spent more time in this post telling everyone you know something we don’t than actually writing anything of any value. that’s the trouble with most bloggers today, especially you, that they think that having a blog makes them important and in the know – well, you aren’t and it doesn’t.

    • Thank you for the thoughtful reply. It took three years and a hundred and something posts for someone to finally recognize what I have been doing. Your recognition is a wonderful validation of my effort. Really all of this would never have been possible without you. Thanks!