Scott Drummonds on Virtualization

Databases, Storage, and Solid State Disks


A colleague of mine dropped by my desk on Friday to talk about storage best practices for virtualized databases (SQL Server in this case).  He observed a VMware deployment where the data and log files for a SQL Server virtual machine were consolidated on a single VMFS volume backed by a RAID 5 LUN.  “Is this a VMware best practice?” he asked.  “Should you not put the redo logs on a RAID 10 LUN?”  The answers are ‘no’ and ‘yes’, respectively.  And with the solid state disk (SSD) auto-tiering from EMC (FAST) the second answer is an emphatic “YES!”

A perfunctory bit of guidance I include in nearly all of my performance talks (such as the enthralling, entertaining, and cancer-curing* presentations from VMworld 2010 that I will repeat in Copenhagen from 12-14 October) is “follow your application best practices”.  Audiences usually nod and immediately forget, because this is a recommendation we all know to be correct yet somehow ignore.  In that way it is like “stay away from fatty foods”, “do not drink wine with pain killers”, or “pay attention during the flight attendants’ presentation”.

Part of the reason people forget this nugget is that the advice is general, and not crystallized in a technological explanation that embeds deep in the minds of the audience.  In this case the application best practice that should be followed is to separate data from logs, putting the data on something good for random read performance (like RAID 5) and the logs on something good for sequential write performance (RAID 10).  Obviously I want everyone to consolidate their storage to VMFS and enjoy the technology, but if you are putting the VMDKs that contain each of these files on the same volume, you are ignoring application best practices.

In this case I recommend building two VMFS volumes: one backed by RAID 5 and the other by RAID 10.  Put the data on RAID 5 and the logs on RAID 10.  You will change the access profile at the array by putting multiple log files on the same RAID 10-backed LUN, but the resulting IO will still be much closer to sequential write than if you had mixed data file reads in among them.  So, consolidate multiple data files onto the same RAID 5 LUN and consolidate multiple log files onto the same RAID 10 LUN.
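To spot the anti-pattern my colleague ran into, here is a minimal sketch in Python.  It assumes you already have an inventory of which volume each database file lives on; the volume names and the "data"/"log" labels are made up for illustration, and nothing here talks to vCenter or to an array.

```python
from collections import defaultdict

def find_mixed_volumes(placements):
    """Return the volumes that hold both data files and log files.

    `placements` is an iterable of (volume_name, file_kind) pairs, where
    file_kind is "data" or "log". This is a hypothetical inventory format,
    not something exported by any VMware tool.
    """
    kinds_by_volume = defaultdict(set)
    for volume, kind in placements:
        kinds_by_volume[volume].add(kind)
    # A volume is "mixed" when it carries both access profiles at once.
    return sorted(
        volume
        for volume, kinds in kinds_by_volume.items()
        if {"data", "log"} <= kinds
    )

# Hypothetical inventory: one volume mixes data and logs, the others do not.
inventory = [
    ("vmfs-raid5-01", "data"),
    ("vmfs-raid5-01", "log"),
    ("vmfs-raid5-02", "data"),
    ("vmfs-raid10-01", "log"),
]
mixed = find_mixed_volumes(inventory)
```

Any volume this returns is a candidate for splitting into a RAID 5 data volume and a RAID 10 log volume.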

Furthermore, if you are using solid state auto-tiering to manage your volumes, you do not need to protect your database log file with this technology.  What I am talking about here is EMC’s Fully Automated Storage Tiering (FAST), which is the most popular thing EMC has created since I have been paying attention.  Despite what some people will tell you, solid state disks are the cheapest way to serve huge amounts of random reads.  But their benefits diminish when the profile is sequential write, at which point they become unattractive from a cost perspective.

EMC’s FAST works by creating a volume that is like a vertical stripe of multiple RAID groups.  LUNs, which become VMFS volumes, are then placed in that FAST volume.  FAST is a great technology for solid state disks, RAID 5 is the most cost-efficient configuration for database data, and solid state is wasted on sequential IO such as redo logs.  So my best practice for virtual storage configuration for database workloads when FAST may be present can be boiled down to the following rules:

  • Always create RAID 5 volumes for your read-intensive database data.
  • Always create RAID 10 volumes for your database logs.  If you have write-intensive data, you may consider putting them here, too.
  • If you have FAST, use it to stripe across multiple RAID 5 volumes of different disk types and put your random, read-intensive data on VMFS on this volume.
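For those who prefer rules as code, here is a rough sketch of the three bullets above in Python.  The function name, its parameters, and the returned tuple are my own invention for illustration; this is not an EMC or VMware API.  Note that the second rule only says you "may consider" RAID 10 for write-intensive data, so the sketch treats that as a default you can overrule.

```python
def recommend_volume(file_kind, write_intensive=False, fast_available=False):
    """Suggest (raid_level, fast_backed) for a database file.

    file_kind: "data" or "log" (hypothetical labels).
    write_intensive: True if the data files see heavy writes.
    fast_available: True if FAST auto-tiering is present on the array.
    """
    if file_kind == "log":
        # Logs are sequential write: RAID 10, and never on a FAST volume.
        return ("RAID 10", False)
    if file_kind == "data":
        if write_intensive:
            # Optional per the rules above: write-heavy data may join the logs' tier.
            return ("RAID 10", False)
        # Random, read-intensive data: RAID 5, striped by FAST when you have it.
        return ("RAID 5", fast_available)
    raise ValueError(f"unknown file kind: {file_kind!r}")

# Example: a read-intensive data file and a log file on an array with FAST.
data_choice = recommend_volume("data", fast_available=True)
log_choice = recommend_volume("log", fast_available=True)
```

The log recommendation ignores `fast_available` on purpose, since solid state is wasted on sequential IO.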

The last bullet is clearly the most important here.  I really love FAST, and it seems that EMC’s customers are crazy for it.  But it’s not the technology you need for sequential write workloads like redo logs.  Separate those logs onto their own “normal” (not FAST-backed) VMFS volumes that use no SSDs.  Then you will have the best of all worlds: optimally deployed disk technologies, application best practice compliance, and righteous virtualized database consolidation.

(*) The claims made by the author of this blog do not reflect the views of his employer, the conference organizers, the government of the Kingdom of Denmark, or reality, for that matter.
