vPivot

Scott Drummonds on Virtualization

SIOC Event: Ignore or Panic?


A colleague of mine recently reported an alarming event witnessed at a customer of both EMC and VMware. After enabling Storage IO Control (SIOC), vCenter reported a possible problem with a datastore. The specific text of the event is:

External I/O workload detected on shared datastore running Storage I/O (SIOC) for congestion management

Googling this text will bring you to a VMware KB article with the following explanation:

This event indicates that the datastore associated with the event is accessed by some workloads that are not managed by Storage I/O Control (SIOC) congestion management. This might indicate that the datastore is running in an unsupported configuration which produces an unnecessary reduction in throughput for virtual machines managed by SIOC. … However, there are some supported configurations that can also produce this event, namely: Storage array performs a system operation such as RAID reconstruction or replication…

This means that if you are running SIOC and replication, you are probably going to see this event. If you are not running SIOC, you should be. And if you are not replicating your data, I have an EMC sales representative I would like to introduce you to. 🙂

I talked with team VMware about this to collect their thoughts on the event. I want to sum up a few salient points of the electronic conversation:

  1. We all believe that the event is very important and customers need to know when a volume’s media is being shared by workloads that SIOC cannot control. This condition means that SIOC may be unable to correct performance problems and VI administrators should be informed.
  2. From a technology perspective, vCenter’s ability to detect the presence of IO demands outside of the virtual infrastructure is pretty impressive. It works by recognizing a lack of latency improvement when the storage queue depth is decreased. The fundamental premise of SIOC is that throttling the IO sent to the storage should improve latency. When this is not the case, vSphere can assume that another workload, not managed by vSphere, is using the same storage, and it raises this event.
  3. We all recognize that it is kind of silly that an event is raised for an extraordinarily common condition like the presence of SIOC and replication. EMC and VMware are looking into improving this and a fix will come at some future (unknown) date.
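
The detection idea in point 2 can be illustrated with a small sketch. This is not VMware's implementation; it only models the stated premise that shrinking the queue should lower latency. The `Sample` type, the `external_workload_suspected` function, and the 5% `min_improvement` threshold are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One observation interval (hypothetical structure, not a vSphere type)."""
    queue_depth: int    # device queue depth in effect during the interval
    latency_ms: float   # datastore latency observed during the interval


def external_workload_suspected(samples, min_improvement=0.05):
    """Return True if latency never improved after any queue-depth reduction.

    min_improvement is an assumed threshold: the fraction by which latency
    is expected to drop once the queue has been throttled.
    """
    throttled = 0      # intervals where the queue depth was reduced
    unresponsive = 0   # ...and latency did not drop by the expected amount
    for prev, cur in zip(samples, samples[1:]):
        if cur.queue_depth < prev.queue_depth:
            throttled += 1
            if cur.latency_ms > prev.latency_ms * (1 - min_improvement):
                unresponsive += 1
    # Suspect an unmanaged workload only if throttling happened at least once
    # and it never produced the expected latency improvement.
    return throttled > 0 and unresponsive == throttled


# Queue depth is cut twice, but latency stays flat: something else is loading the array.
flat = [Sample(32, 40.0), Sample(24, 41.0), Sample(16, 40.0)]
# Queue depth is cut twice and latency falls each time: throttling is working.
responsive = [Sample(32, 40.0), Sample(24, 30.0), Sample(16, 22.0)]

print(external_workload_suspected(flat))
print(external_workload_suspected(responsive))
```

In this toy model, array-side work such as replication or a RAID rebuild would look exactly like the `flat` case, which is why a supported configuration can still trigger the event.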

While we are waiting for an improvement in the event, your best course of action when this event occurs is to neither panic nor ignore it. We should all do our part to educate VMware’s customers about the event’s purpose. VMware, EMC, and the rest of VMware’s storage partner community will get an elegant solution in place eventually.

4 Responses

nice one Scott, great info!

  • Great info. If I am not mistaken, this applies to all storage vendors doing replication or otherwise sharing IO with a SIOC-enabled datastore. VCB, for example (may you rest in peace).

  • I’ve also seen this event in response to a “create VM from template” command.

  • […] today’s post I want to update and amplify thoughts from an old post on Storage IO Control (SIOC).  VMware customers that are using SIOC may sometimes see the following vCenter alarm: Non-VI […]