Scott Sauer recently asked me a tough question on Twitter. My roaming best practices talk includes the phrase “do not use PVSCSI for low-IO workloads”. When Scott saw a VMware KB echoing my recommendation, he asked the obvious question: “Why?” It took me a couple of days to get a sufficient answer.
One technique for storage driver efficiency improvements is interrupt coalescing. Coalescing can be thought of as buffering: multiple events are queued for simultaneous processing. For coalescing to improve efficiency, interrupts must stream in fast enough to create large batch requests. Otherwise the timeout window passes without any additional interrupts arriving, and the lone interrupt is handled as normal, only after a useless delay.
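The buffering behavior above can be sketched in a few lines. This is a toy simulation, not driver code: the function, batch size, and timeout values are all invented for illustration.

```python
def coalesce(arrival_times, batch_size, timeout):
    """Group interrupt arrival times (ms) into batches.

    A batch is delivered when it reaches batch_size, or when
    `timeout` elapses after the batch's first interrupt.
    Returns each interrupt's added latency (delivery - arrival).
    """
    batches = []
    current = []
    for t in arrival_times:
        if current and t - current[0] >= timeout:
            # Window expired: deliver the partial batch at expiry time.
            batches.append((current[0] + timeout, current))
            current = []
        current.append(t)
        if len(current) == batch_size:
            batches.append((t, current))  # full batch, delivered immediately
            current = []
    if current:
        batches.append((current[0] + timeout, current))
    return [deliver - t for deliver, batch in batches for t in batch]

# Fast storage: an interrupt every 0.1 ms fills batches well inside the window.
fast = coalesce([i * 0.1 for i in range(32)], batch_size=8, timeout=1.0)

# Slow storage: one interrupt every 5 ms never fills a batch, so every
# single interrupt waits out the full 1 ms timeout for nothing.
slow = coalesce([i * 5.0 for i in range(4)], batch_size=8, timeout=1.0)
```

With fast arrivals the worst-case added latency stays under the timeout and most interrupts are absorbed into full batches; with slow arrivals every interrupt pays the entire timeout as pure delay, which is exactly the "useless delay" described above.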
An intelligent storage driver will therefore coalesce at high IO but not low IO. In the years we have spent optimizing ESX’s LSI Logic virtual storage adapter, we have fine-tuned the coalescing behavior to give fantastic performance on all workloads. This is done by tracking two key storage counters:
- Outstanding IOs (OIOs): Represents the virtual machine’s demand for IO.
- IOs per second (IOPS): Represents the storage system’s supply of IO.
The robust LSI Logic driver increases coalescing as OIOs and IOPS increase. No coalescing is used with few OIOs or low throughput. This produces efficient IO at large throughput and low latency IO when throughput is small.
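The LSI Logic policy described above can be caricatured as a small decision function. The thresholds and batch-depth formula here are invented for illustration; ESX's actual tuning is not public in this form.

```python
def coalesce_depth(oios, iops, min_oios=4, min_iops=2000, max_depth=8):
    """Return how many completions to batch before raising an interrupt.

    Coalesce only when both demand (outstanding IOs) and supply (IOPS)
    are high; otherwise deliver every interrupt immediately.
    """
    if oios < min_oios or iops < min_iops:
        return 1  # no coalescing at low demand or low throughput
    # Scale batch depth with demand, capped so latency stays bounded.
    return min(max_depth, oios // min_oios + 1)
```

For example, a VM with one outstanding IO on very fast storage gets no coalescing (low demand), and a VM with 64 outstanding IOs on 100-IOPS storage also gets none (low supply); only when both counters are high does the depth climb toward the cap.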
Currently the PVSCSI driver coalesces based on OIOs only, and not throughput. This means that when the virtual machine is requesting a lot of IO but the storage is not delivering, the PVSCSI driver is still coalescing interrupts. But without the storage supplying a steady stream of IOs, there are no interrupts to coalesce. The result is slightly increased latency with little or no efficiency gain for PVSCSI in low-throughput environments.
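The failure mode is easy to see if we sketch an OIO-only policy in the same toy style (again, thresholds are invented, not PVSCSI's real values):

```python
def pvscsi_depth(oios, min_oios=4, max_depth=8):
    """Toy version of an OIO-only coalescing policy: it never looks at
    throughput, so high demand alone is enough to trigger batching."""
    if oios < min_oios:
        return 1
    return min(max_depth, oios // min_oios + 1)

# A demanding VM (64 OIOs) on slow storage still gets maximum coalescing,
# even though the sparse completions leave nothing to batch -- each
# interrupt just waits out the window for no efficiency gain.
depth_on_slow_storage = pvscsi_depth(64)
```

Because the function has no IOPS input at all, there is no way for it to back off when storage is the bottleneck, which is precisely the gap the future PVSCSI versions close.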
LSI Logic is so efficient at low throughput levels that there is no need for a special device driver to improve efficiency. The CPU utilization difference between LSI and PVSCSI at hundreds of IOPS is insignificant. But at massive amounts of IO, where 10,000-50,000 IOPS are streaming over the virtual SCSI bus, PVSCSI can save a large number of CPU cycles. Because of that, our first implementation of PVSCSI was built on the assumption that customers would only use the technology when they had backed their virtual machines with world-class storage.
But VMware’s marketing engine (me, really) started telling everyone about PVSCSI without the right caveat (“only for massive IO systems!”). So everyone started using it as a general solution. This meant that in one condition, slow storage (low IOPS) paired with a demanding virtual machine (high OIOs), PVSCSI has been inefficiently coalescing IOs, resulting in performance slightly worse than LSI Logic.
But now VMware’s customers want PVSCSI as a general solution and not just for high IO workloads. As a result we are including advanced coalescing behavior in PVSCSI for future versions of ESX. More on that when the release vehicle is set.
PVSCSI In A Nutshell
If you plodded through the above technical explanation of interrupt coalescing and PVSCSI, I applaud you. If you just want a summary of what to do, here it is:
- For existing products, only use PVSCSI against VMDKs that are backed by fast (greater than 2,000 IOPS) storage.
- If you have installed PVSCSI in low IO environments, do not worry about reconfiguring to LSI Logic. The net loss of performance is very small. And clearly these low IO virtual machines are not running your performance-critical applications.
- For future products*, PVSCSI will be as efficient as LSI Logic for all environments.
(*) Specific product versions not yet announced.
Update: February 16
The simple, almost austere KB on this rare condition raised more questions than it answered. You may notice that the KB has since been updated with text from this blog. A white paper on PVSCSI that had been under construction for quite some time was also released, along with the VROOM! article we often pair with such white papers.