Storage Performance 101 Part 1 – Back to Basics

I have recently spoken with a number of prospective customers who are evaluating new storage solutions (emerging vendors building flash-enabled storage arrays), and what I find, not to my surprise, is that a good number of vendors are touting ambiguous or misleading marketing claims.  If you are evaluating storage solutions out there, be sure to challenge and validate the vendor’s claims, and equip yourself to smell the B.S. and call it out.  This will be a two-part blog post on storage performance.  Here are some examples of the more ridiculous claims I heard:

  • “We could do 100,000 random IOPS with 512-byte block size”
  • “We see average of 0.2 ms latency with queue depth of 1”
  • “Well, you can test performance on our storage unit with 10-20GB volume, just to make it quick and simple”

I will tear these claims apart towards the end of this post…or maybe you will be able to do that yourself after reading it.

Basic terminology
What is an I/O?
I/O stands for Input/Output. It is a request for a block of data, and it can be either a read or a write operation.

What is Sequential I/O?
Blocks are read or written in a sequential fashion. Examples include video streaming, VMware Storage vMotion, and database transaction log replay. Even mechanical disks handle sequential I/O relatively well, especially for the price you pay, so most array solutions perform well here.

What is Random I/O?
Blocks are requested or overwritten at random locations. Example workloads that generate random I/O include OLTP databases, Exchange, and VDI. Most real enterprise workloads are random, so this is the area you will most likely care about when evaluating a storage solution.

Basic Storage Performance Measurement
These measurements are analogous to the 0-60 mph time for a car. In the storage world, we care about throughput and latency.

Throughput: how many I/O operations can the array complete in a second, or how many MB of data can the array process per second? The unit of measurement to use depends on the type of I/O.

  • Always measure random I/O in IOPS (I/Os processed per second) – why? Because the application will request blocks of various sizes, and since MB/s throughput = block size x IOPS, variable block-size requests yield an inconsistent MB/s number (see the quick calculation below)
  • Always measure sequential I/O in MB/s – why? Because blocks are accessed in a sequential fashion, typically at large block sizes, so the MB/s throughput number is more meaningful
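
To put that formula in concrete terms, here is a minimal sketch (in Python, with made-up IOPS and block-size figures, not measurements from any particular array) showing how the same IOPS number translates into very different MB/s depending on block size:

    # Throughput in MB/s is simply block size multiplied by IOPS, so the same
    # IOPS figure means very different bandwidth depending on the block size.

    def mb_per_sec(iops, block_size_bytes):
        """MB/s delivered at a given IOPS rate and I/O block size."""
        return iops * block_size_bytes / (1024 * 1024)

    # Hypothetical numbers for illustration only.
    for bs in (512, 4096, 8192, 32768):
        print(f"100,000 IOPS at {bs:>5}-byte blocks = {mb_per_sec(100_000, bs):7.1f} MB/s")

    # 100,000 IOPS at   512-byte blocks =    48.8 MB/s
    # 100,000 IOPS at  4096-byte blocks =   390.6 MB/s
    # 100,000 IOPS at 32768-byte blocks =  3125.0 MB/s

At 512 bytes, even 100,000 IOPS moves less than 50 MB/s of data, which is why a 512-byte IOPS headline is not the flattering number it appears to be.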

Latency: how long does the array take to complete an I/O operation?
Most mainstream applications care about average latency. Of course, applications such as trading or medical systems may have a strict requirement on the maximum tolerable latency.
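
To illustrate why the average alone can hide the outliers that trading or medical applications care about, here is a tiny sketch with invented latency samples:

    import statistics

    # Invented latency samples in milliseconds: a handful of slow outliers
    # barely move the average but dominate the worst-case experience.
    latencies_ms = [0.5] * 95 + [20.0] * 5

    ordered = sorted(latencies_ms)
    avg = statistics.mean(latencies_ms)
    p99 = ordered[int(0.99 * (len(ordered) - 1))]   # 99th-percentile sample
    worst = max(latencies_ms)

    print(f"average: {avg:.1f} ms, 99th percentile: {p99:.1f} ms, max: {worst:.1f} ms")
    # average: 1.5 ms, 99th percentile: 20.0 ms, max: 20.0 ms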

Variables that affect throughput & latency
I/O Request Size (a.k.a I/O Block Size)
I/O request size varies depending on the I/O access pattern. Sequential I/O typically involves large block sizes (since the next block will likely be needed, the array can go ahead and prefetch it). Random I/O, on the other hand, typically involves smaller block sizes. For example, an existing customer placing a new order results in a small update to the database rows for that specific customer. Keep in mind that while random I/O typically operates at small block sizes (4k-32k), it usually does not go as small as 512 bytes! When in doubt, check with your application owner, or if you are running ESX, use the vscsiStats tool to generate a histogram of I/O block sizes. My next post will cover this in greater detail.  If you hear a high IOPS number quoted at a 512-byte block size, ask to see the performance numbers at a 4k or larger block size!
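
For a feel of what such a block-size histogram looks like, here is a toy sketch (the I/O sizes are invented; vscsiStats produces a similar block-size breakdown for real VMs on ESX):

    from collections import Counter

    # Invented I/O request sizes (in bytes) as you might pull from a trace.
    io_sizes = [4096] * 60 + [8192] * 25 + [16384] * 10 + [32768] * 5

    histogram = Counter(io_sizes)
    total = sum(histogram.values())
    for size, count in sorted(histogram.items()):
        print(f"{size:>6} bytes: {count:>3} I/Os ({100 * count / total:.0f}%)")
    # In this made-up trace most requests land in the 4k-32k range,
    # and none are 512 bytes.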

Queue Depth (a.k.a Outstanding I/O)
Enterprise workloads typically have lots of outstanding I/Os. A fast storage array that can process and complete those I/Os quickly will yield lower latency for the application, which translates to faster application response time. By definition, average latency = queue depth / IOPS. If you set your host to a queue depth of 1, you will get a great latency number, but that is not a good representation of the real workload in your environment, and you are barely scratching the surface of what the array can do by sending only one outstanding I/O at a time. So beware of low-latency claims measured with a queue depth of 1 during performance testing. The key is to ask the storage vendor what the sweet-spot queue depth is, for both random and sequential I/O.
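
A quick sketch of that relationship, with invented figures, shows why a queue depth of 1 produces a flattering latency number (like the 0.2 ms claim above) while leaving most of the array idle:

    def avg_latency_ms(queue_depth, iops):
        """Average latency = outstanding I/O / completion rate."""
        return queue_depth / iops * 1000  # seconds -> milliseconds

    # Invented figures for illustration: the array completes more IOPS as more
    # I/O is kept outstanding, up to its saturation point.
    for qd, iops in ((1, 5_000), (8, 35_000), (32, 90_000), (64, 100_000)):
        print(f"QD={qd:>2}: {iops:>7,} IOPS -> avg latency {avg_latency_ms(qd, iops):.2f} ms")

    # QD= 1:   5,000 IOPS -> avg latency 0.20 ms  (flattering, but the array is mostly idle)
    # QD=32:  90,000 IOPS -> avg latency 0.36 ms  (far more useful work, still sub-millisecond)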

If you are doing a performance validation to verify a vendor’s claim, be sure to use a volume large enough to generate a data set bigger than the DRAM in the storage controller.  Otherwise, you are not fully testing the array’s ability to service I/O that does not fit in the controller’s cache.
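
If you want to run such a validation yourself, one common approach (not specific to any vendor) is to drive the array with fio at realistic block sizes, a realistic queue depth, and a data set larger than the controller’s DRAM. Below is a minimal Python sketch; it assumes fio is installed on a Linux host and that /mnt/test/fio.dat sits on a volume presented by the array under test (the path, the 100 GB size, and the queue depth of 32 are placeholders to adjust for your environment):

    import subprocess

    # Hypothetical test file on a volume presented by the array under test.
    TEST_FILE = "/mnt/test/fio.dat"

    # Random-read tests at realistic block sizes (4k-32k), a realistic queue
    # depth, and a data set sized to exceed the controller's DRAM.
    for bs in ("4k", "8k", "32k"):
        subprocess.run(
            [
                "fio",
                "--name=randread-" + bs,
                "--filename=" + TEST_FILE,
                "--rw=randread",       # random reads
                "--bs=" + bs,          # block size under test
                "--iodepth=32",        # realistic outstanding I/O, not 1
                "--direct=1",          # bypass the host page cache
                "--ioengine=libaio",
                "--size=100g",         # placeholder: make this larger than controller DRAM
                "--runtime=60",
                "--time_based",
                "--group_reporting",
            ],
            check=True,
        )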

In summary, when evaluating an emerging storage vendor’s solution (all-flash or hybrid) and you are facing claims of XXX IOPS at X ms latency, keep the questions above in mind: what block size, what queue depth, and what data set size were used to produce those numbers?

4 Comments

  1. Hi Wen,
    Great article; it gives us weapons to fight these wolves, the salesmen of the latest-best-solution-in-the-known-Universe. I’m impatient to read your next articles.
    But something bothers me: in what way is Nimble different from these wolves?
    Anyway, Yu rock!
    e-Van

  2. Hi e-Van, thanks for your comment. Glad you got something out of this post. Very interesting question you asked – how’s Nimble different? :) We make performance claims based on real customer data (we know how well our boxes live up to our claims because the autosupport in each deployed unit sends health/performance heartbeats back to corporate, and we analyze that data). I don’t want to sound like a marketing machine, so I’ll stop here. My whole intention is to ensure folks do their due diligence before making a purchase – some wolves out there leave bad marks for all of us storage enthusiasts!

    • How do you perform the collection back to Nimble? I mean, do the boxes need direct Internet access? Or do they go through a proxy? Through a modem? The last option may sound crazy, but I know a lot of companies that would be reluctant to see their storage boxes talking directly to the Net. In the past, IBM proposed a similar solution with Service Director and it was really efficient.

      • Oh, it’s over HTTP. We do provide the option to enter proxy information (since most organizations don’t allow direct access to the Internet).
