Last year at VMworld 2009, Irfan Ahmad and Ajay Gulati presented a preview of an unreleased technology VMware is calling Storage IO Control (SIOC). SIOC is a feature aimed squarely at the number one cause of VMware performance problems: underperforming storage. Year after year I see misconfigured storage slowing virtualized applications, with VMware blamed for the problem. Now VMware hopes to add a new tool to our administrators’ toolboxes to help them identify and mitigate underperforming storage.
Storage problems are most easily identified by high device latency. When storage takes a long time to service IOs (over 20ms by my definition), application owners will soon start complaining. The goal of SIOC is to identify this trend at the VMFS volume level and take corrective action to protect high priority virtual machines. This requires two key innovations in vSphere:
- VMFS volume latency calculation
- Throughput reduction through device queue resizing
SIOC calculates latency for a VMFS volume as a weighted average of the latencies of all virtual disks on the volume. Each disk’s latency is weighted by its IO rate (IOPS), with slight modifications to account for the expected difference in latency for different IO sizes. When the volume’s weighted average latency crosses a user-defined threshold, SIOC acts by reducing the virtual machines’ aggregate throughput to protect mission-critical virtual machines.
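The weighting described above can be sketched in a few lines of Python. This is only an illustration of an IOPS-weighted average, not VMware's implementation: the `DiskStats` type, the function names, and the sample numbers are all hypothetical, and the IO-size adjustment is left out because its details are not given here.

```python
from dataclasses import dataclass


@dataclass
class DiskStats:
    """Hypothetical per-virtual-disk sample; not a vSphere data structure."""
    name: str
    iops: float        # IOs per second observed on this virtual disk
    latency_ms: float  # average device latency observed for those IOs


def volume_latency_ms(disks):
    """IOPS-weighted average latency across all virtual disks on a volume.

    Assumption: the adjustment for IO size is omitted from this sketch.
    """
    total_iops = sum(d.iops for d in disks)
    if total_iops == 0:
        return 0.0  # an idle volume has no latency to report
    return sum(d.iops * d.latency_ms for d in disks) / total_iops


def should_throttle(disks, threshold_ms):
    """True once the volume's weighted latency crosses the chosen threshold."""
    return volume_latency_ms(disks) > threshold_ms


disks = [
    DiskStats("vm-a.vmdk", iops=100, latency_ms=10.0),
    DiskStats("vm-b.vmdk", iops=300, latency_ms=30.0),
]
print(volume_latency_ms(disks))                   # (100*10 + 300*30) / 400
print(should_throttle(disks, threshold_ms=20.0))
```

Because the busy disk carries three times the weight of the quiet one, the volume figure lands much closer to 30ms than to 10ms, which is the point of weighting by IO rate rather than taking a plain average.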
The fundamental change provided by SIOC is volume-wide resource management. With vSphere 4 and earlier versions of VMware virtualization, storage resource management is performed at the server level. This means a virtual machine on its own ESX server gets full access to the device queue. The result is unfettered access to the storage bandwidth regardless of resource settings, as the following picture shows.
SIOC will throttle virtual machine throughput once a volume’s normalized latency crosses a threshold. The throughput is limited by decreasing each virtual machine’s access to the queue to an amount defined by its relative shares. The following figure shows this in action.
The net effect of SIOC is that, under contention, virtual machines with higher shares are given prioritized access to the storage. This allows administrators to protect important virtual machines.
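To make the share-based throttling concrete, here is a rough Python sketch that splits a device queue among virtual machines in proportion to their shares. The proportional split, the `allocate_queue_slots` name, the queue depth of 32, and the share values are my assumptions for illustration only; SIOC's actual queue resizing logic is not described in this post.

```python
def allocate_queue_slots(shares, queue_depth):
    """Split a device queue among VMs in proportion to their shares.

    Hypothetical helper: assumes a simple proportional split with integer
    rounding down, and at least one slot per VM so that nobody is starved.
    With many low-share VMs the minimum of one slot can push the total
    slightly above queue_depth.
    """
    total_shares = sum(shares.values())
    if total_shares <= 0:
        raise ValueError("total shares must be positive")
    return {
        vm: max(1, (queue_depth * vm_shares) // total_shares)
        for vm, vm_shares in shares.items()
    }


# Example: one high-priority VM with twice the shares of the other two.
shares = {"prod-db": 2000, "test-web": 1000, "dev-box": 1000}
print(allocate_queue_slots(shares, queue_depth=32))
```

In this example the high-share virtual machine ends up with half of the queue and the other two split the remainder evenly, which matches the behavior described above: under contention, more shares means more of the storage.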
SIOC is going to provide a major benefit to VMware’s customers that I am sure will be appreciated by everyone. But I want to give everyone an important warning: SIOC is not a storage panacea. Storage that performs poorly before SIOC acts will still perform poorly afterward. But with SIOC enabled, VMware’s administrators will know that mission-critical virtual machines will be protected from the fluctuating demands on shared storage.
This feature has so far been demonstrated only in our lab environments at VMware. While we hope to make this tool available in production versions of vSphere as soon as possible, we have not yet committed SIOC to any specific launch date or to any specific version of vSphere.