vPivot

Scott Drummonds on Virtualization

Justifying SSDs


Ever since I saw the results of VMware's first performance work on EMC's Enterprise Flash Drives (EFDs) I knew the storage world was about to change. Even though I love the idea of SSDs, I still struggle to justify their purchase. I have had trouble quantifying the value of an EFD and fearlessly committing customers' money to one. In this article I want to offer a few thoughts on these devices as I formulate my own ideas as to when SSDs are needed and how we can all enjoy their benefits.

Let us start this discussion with a scientific perspective. The following graph shows trend lines of performance against capacity for three disk types. I have greatly reduced the complexity of this chart by choosing some convenient numbers for throughput (such as 100 IOPS per SATA device) and size (such as 200 GB for an EFD).

[Figure: scatter graph of storage capacity versus throughput for SATA, FC, and EFD disks. Caption: The capacity/performance trend of today's disk technologies.]

This graph shows the performance and capacity provided by up to 200 disks of three different technologies: SATA, Fibre Channel, and solid state disks.  The first salient point that this figure screams at me is that the spread between the trend lines is extreme.  The capacity/performance trends of SATA and Fibre Channel are relatively similar.  EFDs are a long way from those two trend lines.
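To show how such a chart can be built, here is a minimal sketch that generates the three trend lines. Only the 100 IOPS per SATA device and 200 GB per EFD come from the text above; every other per-device capacity and throughput below is an assumption chosen purely for illustration.

```python
# Sketch of how the trend lines could be generated.  Only the 100 IOPS/SATA
# and 200 GB/EFD figures come from the article; the other per-device numbers
# are illustrative assumptions.
DEVICE_TYPES = {
    #        (capacity in GB, IOPS) for a single device
    "SATA": (1000, 100),    # 100 IOPS from the article; 1 TB assumed
    "FC":   (300, 200),     # both values assumed
    "EFD":  (200, 5000),    # 200 GB from the article; 5,000 IOPS assumed
}

def trend_line(device, max_disks=200):
    """Return (capacity_GB, IOPS) points for groups of 1..max_disks devices."""
    cap, iops = DEVICE_TYPES[device]
    return [(n * cap, n * iops) for n in range(1, max_disks + 1)]

for name in DEVICE_TYPES:
    print(name, "at 200 disks:", trend_line(name)[-1])
```

Plotting those point lists reproduces the shape of the chart: the SATA and FC lines stretch out along the capacity axis, while the EFD line climbs steeply along the throughput axis.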

This large spread between rotating and solid state technologies suggests two things to me:

  1. The world will not much longer support the existence of two rotating disk technologies.  Assuming capacity and performance are the only criteria in disk selection, the differentiation between these two is not significant enough to merit both their presence in the market.
  2. Since real workloads’ demands will never fall exactly on a trend line, some customers are in the unfortunate position of forcing their workloads onto one of these trend lines.  This will likely result in extreme inefficiency in performance or capacity.

Let me first comment on the efficiency problem.  Any workload–or in the case of virtualization, a consolidation of multiple workloads–can be mapped to the above graph as a dot whose coordinates are defined by its capacity and performance requirements.  This dot will almost certainly fall somewhere between the SATA and EFD trend lines.

You can determine the number of devices of a single technology (SSD or rotating disk) needed for that workload by force-fitting the dot onto a trend line. Do this by tracing a line either up or to the right until it intersects the device's trend line. At that point, a homogeneous collection of that device type satisfies both the capacity and the performance requirement, with a surplus of one of them. Since these trend lines are so far from each other, if the workload falls squarely between two of them you are guaranteed to have either a very, very large surplus of performance or a very, very large surplus of capacity.
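As a rough sketch of that force-fit, the following computes how many devices of a single type a hypothetical 10 TB, 20,000 IOPS workload would need, and how large the leftover surplus would be. The per-device numbers are the same illustrative assumptions used in the sketch above.

```python
import math

def force_fit(capacity_gb, iops, per_disk_gb, per_disk_iops):
    """Force-fit a workload onto one trend line: take whichever requirement
    (capacity or IOPS) demands more devices, then report what is left over."""
    disks = max(math.ceil(capacity_gb / per_disk_gb),
                math.ceil(iops / per_disk_iops))
    surplus_gb = disks * per_disk_gb - capacity_gb
    surplus_iops = disks * per_disk_iops - iops
    return disks, surplus_gb, surplus_iops

# Hypothetical workload sitting between the trend lines: 10 TB and 20,000 IOPS.
print("SATA:", force_fit(10_000, 20_000, 1000, 100))   # (200, 190000, 0)
print("EFD: ", force_fit(10_000, 20_000, 200, 5000))   # (50, 0, 230000)
```

Under these assumed numbers the SATA-only fit buys 190 TB of capacity it does not need, and the EFD-only fit buys 230,000 IOPS it does not need: exactly the inefficiency described above.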

It is because of the extreme separation of these trends that storage vendors must offer dynamic performance capabilities such as auto-tiering and large caches that come with dynamic placement algorithms.  In this world of incredible performance/capacity divergence, big caches and auto-tiering are not just nice to have.  They are essential to making efficient storage decisions.

Now back to the idea of the vast separation between rotating and solid state disks.  Because large caches and auto-tiering are so essential, those technologies will make better decisions if they have a greater spread between the two extremes.  The greater the spread, the more cheaply capacity can be provided to low-use workloads and the more cheaply IOPS can be supplied to IO-intensive workloads.  This means that technologies like EMC FAST and FAST Cache are going to drive greater spread between disk technologies over time.

Lastly, since block autoplacement benefits most from the spread of these trend lines, the future of intermediate trends is uncertain.  I therefore reason that Fibre Channel disks will eventually fall out of the market.  But know that before this happens storage arrays will need to be placing blocks optimally.  Block placement is good today, but it is still a long way from optimal.

EMC is placing blocks in its unified storage in 1 GB chunks and in the Symmetrix line using blocks smaller than 1 MB.  As long as the storage block placement is larger than the smallest IO size, there will be some inefficiency.  But as long as the block placement is smaller than the virtual disk size, customers are benefiting from a configuration that is much more efficient than the worst case force-fit above.
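A small, hypothetical illustration of why placement granularity matters (the disk size, block size, and hot-block count below are invented and say nothing about EMC's actual implementation): with coarse chunks, every chunk that contains even one hot block gets promoted, so cold data rides along to the fast tier.

```python
import random

# Hypothetical numbers: a 100 GB virtual disk, 8 KB IOs, 2,000 hot blocks
# scattered uniformly at random across the disk.
DISK_GB, BLOCK_KB, HOT_COUNT = 100, 8, 2000

blocks_total = DISK_GB * 1024 * 1024 // BLOCK_KB
hot_blocks = set(random.sample(range(blocks_total), HOT_COUNT))

def promoted_gb(chunk_mb):
    """Capacity promoted when placement works in chunks of chunk_mb megabytes."""
    blocks_per_chunk = chunk_mb * 1024 // BLOCK_KB
    touched_chunks = {b // blocks_per_chunk for b in hot_blocks}
    return len(touched_chunks) * chunk_mb / 1024

print("1 GB chunks:", promoted_gb(1024), "GB promoted")  # roughly the whole disk
print("1 MB chunks:", promoted_gb(1), "GB promoted")     # roughly 2 GB
```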

I recently sat in on a customer presentation in Singapore from EMC’s president of the Symmetrix and Virtualization Product Group, Brian Gallagher.  Brian talked about a lot of things but a slide he showed covering this same topic offered a more precise view on the mixed use of solid state and rotating disks.

[Figure: disk skew (the share of read/write IO that falls on a given percentage of the data) and possible configurations that minimize cost and disk count. Caption: Disk skew is another means of choosing disk types and counts to minimize cost and meet performance and capacity requirements.]

We define skew here as the amount of IO that occurs on a subset of data. Unlike my simplistic analysis above, which assumes some flat capacity/performance requirements of the entire data set, real workloads access data in non-uniform ways. Locality of reference exists in the great majority of workloads and auto-tiering systems will produce outstanding performance/capacity fits when they can move a small number of hot blocks to the faster disk types.

The figure above contains some very rough calculations that in no way should be used as a substitute for careful calculation in your environment. But they do suggest four possible configurations where small amounts of flash, paired with very precise auto-tiering such as that provided by the VMAX, can result in storage layouts that are faster, cheaper, and smaller than configurations without these technologies.
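To give a feel for the kind of arithmetic behind such a slide (these are not the slide's numbers), here is a hedged sketch comparing a homogeneous Fibre Channel layout against a small amount of flash paired with SATA, for a hypothetical workload in which 90% of the IO hits 5% of the data. All capacities, IOPS figures, and prices below are invented for illustration.

```python
import math

# Invented per-device numbers and prices -- for illustration only.
DEVICES = {
    #        (capacity GB, IOPS, price USD)
    "SATA": (1000, 100, 300),
    "FC":   (300, 200, 600),
    "EFD":  (200, 5000, 3000),
}

def disks_for(kind, capacity_gb, iops):
    """Smallest homogeneous group of 'kind' that meets both requirements."""
    cap, perf, _ = DEVICES[kind]
    return max(math.ceil(capacity_gb / cap), math.ceil(iops / perf))

def cost(layout):
    return sum(count * DEVICES[kind][2] for kind, count in layout.items())

# Hypothetical workload: 20 TB, 15,000 IOPS, 90% of IO on 5% of the data.
CAP, IOPS, HOT_DATA, HOT_IO = 20_000, 15_000, 0.05, 0.90

fc_only = {"FC": disks_for("FC", CAP, IOPS)}
tiered = {
    "EFD":  disks_for("EFD", CAP * HOT_DATA, IOPS * HOT_IO),              # hot slice
    "SATA": disks_for("SATA", CAP * (1 - HOT_DATA), IOPS * (1 - HOT_IO)),  # cold bulk
}

print("FC only:", fc_only, cost(fc_only), "USD")   # 75 disks, 45,000 USD
print("Tiered: ", tiered, cost(tiered), "USD")     # 24 disks, 20,700 USD
```

Under these invented numbers the tiered layout is both smaller and cheaper while serving the hot data from a far faster tier, which is the point the slide was making.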

There is a point here that bears repeating, since it has not yet penetrated the entire market. The addition of “expensive” solid state disks can produce a storage configuration that is less expensive than the configuration that lacks them.

I think I have chosen a fun time for a visit into the world of storage. I spent so much time diagnosing storage problems at VMware that I developed a lot of strong ideas about how customers could better use storage and how storage vendors could better support their customers. The entire storage industry is developing technologies that leverage the huge spread in performance/capacity between existing disks, and those technologies are going to drive that spread even further. And as these optimizations mature, customers are going to derive even greater benefits.

10 Responses


  • In looking at storage performance, many vendors and customers don’t spend enough time looking at skew and locality of reference. I am glad you brought this up, because I think this is an area, at least for rotating media, that folks need to spend more time on. I have found that using the appropriate tracing tools to expose the makeup of a workload has opened the way to develop products that identify and re-balance hot blocks. Careful introduction of SSDs to assist in this area does lead to more optimized solutions, and I agree customers will see better performance. Disks may get larger in capacity, but don’t expect faster rotation speeds, at least in the near term, since we are approaching speeds the industry has not been able to stabilize. SSDs in integrated solutions give the customer another opportunity to either maintain or improve their performance. SSDs are still a little pricey today, but the price seems to be coming down with time. They are less expensive with more capacity today than they were two years ago, and I expect this trend to continue, so price/performance will improve over time as the technology matures…

  • Scott,

    Great post! The prediction that “the world will not much longer support the existence of two rotating disk technologies” is already coming to fruition with the rapid adoption of SAS drives. There are already storage vendors in the market that are using SAS as a replacement for both SATA and FC drives.

    In terms of the use of SSDs and tiering, I agree that a more efficient layout of data and I/O is effective for producing better performance. One area that I think may bear watching is separating out metadata and placing it on SSDs used as flash memory. Order-of-magnitude performance gains may be possible depending on the data type.

  • Hmm… so I definitely follow the graph, but there’s still a 2-4x performance difference between SAS/FC and SATA. To me that represents some practical issues in cache/tier warming.

    Thoughts?

    • Andrew,

      I have two arguments for why I think that 2-4x is not worth much. Let me try each on you and see what you think:

      (1) An efficient autotiering policy only needs two extremes: very fast (and very expensive) and very big (and very inexpensive). With a big cache in place to speed the access to slow data that has not yet been promoted, there is no need for an intermediate tier. In fact, that tier is either too slow or too expensive.

      (2) Because of the locality of reference of most data, a 2x improvement at the slow tier yields little value to the workload. Consider the 90-10 case. 90% of the access is coming from 10% of the data which is on SSD. 10% of the access is on the data that is demoted to a rotating disk. This means that a 2x improvement in rotating disk speed can only halve 10% of the accesses. This is only a net improvement of 5%. And this assumes 100% of the rotating disks are on FC disks. If this was reduced to 50%, then the net benefit of using half FC disks is only 2.5%. Is that worth the cost?
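      Here is the same arithmetic as a quick, back-of-envelope sketch (assuming the 90/10 skew above and that every IO on the slow tier took the same time before the improvement):

      ```python
      def time_saved(slow_io_fraction, rotating_speedup, fraction_on_fast_rotating):
          """Amdahl-style estimate of the total IO time removed by speeding up
          only part of the rotating tier."""
          affected = slow_io_fraction * fraction_on_fast_rotating
          return affected * (1 - 1 / rotating_speedup)

      print(time_saved(0.10, 2.0, 1.0))  # all rotating disks are FC -> 0.05 (5%)
      print(time_saved(0.10, 2.0, 0.5))  # half the rotating disks   -> 0.025 (2.5%)
      ```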

      Scott

  • Very interesting justification of EFDs. We’re currently putting together a plan to implement a relatively small SAN (20 TB) for our company, and we are starting from scratch, i.e. no existing SAN. It would seem that we are in an ideal situation to take advantage of the 3%/97% scenario, as we believe that our data has a moderate amount of skew. (We are looking to support a collection of highly transient virtualized test servers used in a software test/dev environment, where most of the servers are nearly identical.) I just hope that the sales engineers we are working with understand this concept well enough to help us get the right mix of EFD/SATA.