Choosing the right storage for your virtual platform


We have recently been working with a customer to help them decide upon a new storage platform, for a virtual environment they are putting together. Choosing the right storage for your virtual platform and making the correct decision took some thought, some research, some testing and some maths. If you don’t purchase storage which is quick enough to serve your virtual machines, the infrastructure could grind to a halt. Making the right choice is key. I thought we’d share that process with you. Hopefully it can help you too.

The process involves taking measurements of your current servers’ disk usage, totaling up all of the IOPS they require and then buying storage which can fulfill all of those IOPS requirements, whether that be SSD, SAS or SATA.

Calculating your storage needs

Let’s discuss each of these processes in turn.

How many virtual machines are you going to be running on the storage?


This is a fairly easy one.

Make a note of the name of each server, whether physical or virtual, which you wish to run as a virtual machine on the new storage being purchased. The customer we worked with recently came to a grand total of 18 servers which they wanted to virtualise. 2 servers were already virtual, but the process is the same regardless.

Run performance monitors on these servers to calculate the IOPS needed

This is slightly more technical and differs depending on whether you are using Windows or Linux.

Windows

Fire up the Performance Monitor utility and choose the following counters for each disk present in your Windows server.

  • Disk Reads/Sec
    • Read IOPS
  • Disk Transfers/Sec
    • Concurrent IOPS (reads and writes combined)
  • Disk Writes/Sec
    • Write IOPS

Linux

Fire up a utility which takes readings of data about your system. You could use sar, for example.

Again, you are looking to monitor disk reads, disk writes and disk transfers.

Another good Linux tool for recording IOPS is iostat. Run with the -dx options, it reports reads per second (r/s) and writes per second (w/s) for each device.
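If you would rather script the readings yourself, here is a minimal Python sketch of the idea. It assumes the standard Linux /proc/diskstats layout, where the fourth field is reads completed and the eighth is writes completed since boot. The function names and the two snapshots are made up for illustration; on a real server you would read /proc/diskstats twice, a known number of seconds apart.

```python
def parse_diskstats(text):
    """Return {device: (reads_completed, writes_completed)} from /proc/diskstats text."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # skip blank or malformed lines
        # fields: major, minor, name, reads completed, ..., writes completed (index 7)
        counters[fields[2]] = (int(fields[3]), int(fields[7]))
    return counters


def iops_between(before, after, interval_seconds):
    """Read, write and total IOPS per device between two counter snapshots."""
    result = {}
    for name, (reads_after, writes_after) in after.items():
        if name not in before:
            continue  # device appeared between snapshots, nothing to compare
        reads_before, writes_before = before[name]
        read_iops = (reads_after - reads_before) / interval_seconds
        write_iops = (writes_after - writes_before) / interval_seconds
        result[name] = {
            "read": read_iops,
            "write": write_iops,
            "total": read_iops + write_iops,
        }
    return result


# Made-up snapshots taken 10 seconds apart (not real measurements).
SNAPSHOT_1 = "8 0 sda 1000 0 8000 500 2000 0 16000 900 0 700 1400\n"
SNAPSHOT_2 = "8 0 sda 1500 0 12000 800 3200 0 25600 1500 0 1100 2300\n"

print(iops_between(parse_diskstats(SNAPSHOT_1), parse_diskstats(SNAPSHOT_2), 10))
```

The counters are cumulative, so it is the difference between two snapshots divided by the interval that gives you IOPS.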

Run this audit over the course of a week to get a really good view of your systems

Using your monitoring tool/utility, collect data from the drives within your systems and write it out to a log or CSV file. Start your log and stop it 7 days later. This will give a really good indication of the disk activity, showing all the highs and lows over the course of a week. This should be long enough to get a good idea of how your systems generally perform and the IOPS they require when running.
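Once you have a CSV log, pulling one counter out of it is straightforward. The sketch below is only an illustration: the column names (reads_per_sec, writes_per_sec) and the three-row log are hypothetical, so substitute whatever headers your own tool writes.

```python
import csv
import io


def load_counter(csv_text, column):
    """Return the numeric samples from one column of a CSV log, skipping blanks."""
    samples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        value = (row.get(column) or "").strip()
        try:
            samples.append(float(value))
        except ValueError:
            continue  # empty or non-numeric sample, e.g. a missed reading
    return samples


# Made-up three-row log; a real week-long log would have thousands of rows.
LOG = """timestamp,reads_per_sec,writes_per_sec
2015-01-05 09:00:00,12.5,30.0
2015-01-05 09:00:15,,28.0
2015-01-05 09:00:30,40.0,95.5
"""

print(load_counter(LOG, "reads_per_sec"))   # [12.5, 40.0]
print(load_counter(LOG, "writes_per_sec"))  # [30.0, 28.0, 95.5]
```

Skipping blank readings rather than counting them as zero stops gaps in the log from dragging your figures down.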

For each VM work out the IOPS needed at the 95th Percentile

Now to test your Excel skills.

Import all of your data into Excel and create a graph of the data. This is easier said than done, I know. You will need to understand Excel so that you can sort your columns and data out, and create the graphs.

What we’re looking for is the IOPS, used by each disk in a server, at the 95th percentile.

See here http://en.wikipedia.org/wiki/Percentile for a definition of percentiles.

Basically, the 95th percentile is the value that 95% of your samples fall at or below. In our case it sits NEARLY at the max, which would be the 100th percentile, but not quite, so the top 5% of samples (the spikes) are ignored. The 95th percentile is often used by internet service providers to calculate billing costs for their customers. It stops customers paying for spikes in data usage and the same is useful here. Excel has a PERCENTILE function which can be used to calculate this value.

Using the data from READS, WRITES and CONCURRENT IO from your disks, within your servers, you need to calculate this 95th P value. To check that you have the value right, you should be able to see the MAX values on your graphs and your 95th P value will be less. If you have a lot of spikes, it might be 10% less than your MAX values. If you have hardly any spikes, your value is much more likely to be closer to your average IO seen on your graphs.
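If Excel isn’t your thing, the same value can be worked out in a few lines of Python. This is a minimal sketch using linear interpolation between the two nearest samples, which is the same general idea as a spreadsheet percentile function. The function name and the sample data are made up for illustration.

```python
import math


def percentile(samples, pct):
    """Linear-interpolation percentile of a list of numbers (pct from 0 to 100)."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Position of the requested percentile within the sorted list.
    rank = (pct / 100) * (len(ordered) - 1)
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


# Made-up IOPS samples: mostly quiet, with two spikes.
samples = [20] * 18 + [400, 900]

print("max:", max(samples))
print("95th percentile:", percentile(samples, 95))
```

With these made-up samples the max is 900 but the 95th percentile comes out at roughly 425, which shows how the biggest spike is left out of the figure you size against.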

Or you could just draw a virtual line across your graph at a level which looks like a good average and takes the spikes into account. That’s not quite as technical but it will save you a lot of time. If you err on the side of caution and choose slightly higher values than calculated, you cannot go too wrong. The main thing is to make sure the IOPS value you write down is definitely above average. That way your final IOPS values for all systems will simply be on the high side, and when you choose your storage based on this calculation you’ll have given yourself more breathing space (resource) than is strictly necessary. Either way you’ll buy something capable, which is the main thing.

When we looked at one busy Windows server we calculated values of 70 IOPS for the C:\ (OS) drive and 100 IOPS for the D:\ (data) drive. This will be different for every server but gives you a rough idea of the types of values you might be looking at.

Total up all of the IOPS for all of the virtual machines you want to run

Now to collate all of the IOPS values calculated for each of your servers.

This is how it’s going to work. When you buy your new storage array you may choose to create one big array for all of your VMs to reside on. In the case of the HP P2000 units, which we support for our customers, the largest array you can create is across 16 disks. The HP P2000 SFF actually takes 24 SFF disks, so you HAVE to create at least 2 arrays on them if you want to use all of your 24 disks. You may even create 3 arrays of 8 disks each.

Some people like to divvy up virtual disks in the following manner

C:\ drives or OS partitions in Linux onto – Disk Array 1

D:\ drives (data) or Data partitions in Linux onto – Disk Array 2

X:\ drives (backups) or a partition used in Linux for backups, onto say – Disk array 3

If this is the way you envisage using your new storage, for your different drives, then you simply need to tally up:

  • all of your IOPS values taken from the OS drives
  • all of your IOPS values taken from the data drives
  • and all of your IOPS values taken from additional drives, say backup drives for example.

You tally up the IOPS values into groups of your choice, depending on how you want to place them onto the arrays you create inside your storage system. In the example above we have 3 sets of IOPS values: OS, Data and Other; Arrays 1,2 and 3
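The tallying itself can be sketched in Python as below. The server names and per-disk figures are hypothetical, as is the mapping of drive roles to arrays, which simply follows the three-array split described above.

```python
from collections import defaultdict

# Hypothetical per-disk 95th-percentile figures: (server, drive role, IOPS).
measurements = [
    ("server1", "os", 70),
    ("server1", "data", 100),
    ("server2", "os", 35),
    ("server2", "backup", 600),
]

# Which array each drive role will live on (an assumption for this example).
ARRAY_FOR_ROLE = {"os": "array1", "data": "array2", "backup": "array3"}


def tally_by_array(rows, array_for_role):
    """Sum the IOPS of every disk that will be placed on the same array."""
    totals = defaultdict(int)
    for _server, role, iops in rows:
        totals[array_for_role[role]] += iops
    return dict(totals)


print(tally_by_array(measurements, ARRAY_FOR_ROLE))
# {'array1': 105, 'array2': 100, 'array3': 600}
```

If you decide to group your disks differently, only the ARRAY_FOR_ROLE mapping needs to change.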

Example results

Just as an example, we did the same for a customer with 9 Windows machines.

All Windows machines had a C:\, obviously, and the 9 machines we tested had a grand total of:

320 IOPS for all the C:\ drives.

Some of the 9 Windows machines had data drives. These had a much higher total IOPS:

780 IOPS for all of the data drives

Now backups are very IOPS intensive. If you were going to run ALL of your system backups at the SAME time, then the disk array used might have to be very fast, if you want the backups to complete quickly. The reality is that people stagger their backups and don’t mind if they take a while. Generally backups run through the night. The backup drives we tested had IOPS values of:

600 IOPS for most servers, whilst the backup was running.

I wouldn’t try running all the backups simultaneously! But if you had to, across the 9 servers you would need a grand total of about 5400 IOPS (9 x 600).

So – You now have your TOTAL IOPS requirements for your different arrays. Now onto choosing the right storage.

Calculating IOPS performance for different drive and array types

So first of all, as I’m sure you are aware, SATA, SAS and SSD drives differ in performance. They all have different IOPS values (Input/Output Operations Per Second).

We use HP drives here at Binary and “generally” quote for current or last generation drives. As of writing (Jan 2015) these are some rough figures for different drive types.

  • HP SATA 7.2K LFF 3.5″ Disk – 80 IOPS
  • HP SAS 15K LFF 3.5″ Disk – 175 IOPS
  • HP 80GB 6G SATA Value Endurance LFF SSD – 7,700 IOPS Write and 60,000 IOPS Read
  • HP 640GB DUO IO Accelerator Card – 138,000 Mixed R/W IOPS

These are values for single drives.

RAID Penalties

When you start utilising more than 1 drive in an array, the IOPS performance of that array goes up. More drives = More spindles = More IO

The RAID type you use ALSO has an effect on the IO of the array. This is commonly known as the “write penalty” or “IOPS penalty” of a RAID set.

  • RAID 0 has a penalty of 1 write (i.e. no extra writes)
  • RAID 1 and RAID 10 have a penalty of 2 writes
  • RAID 5 and RAID 50 have a penalty of 4 writes
  • RAID 6 has a penalty of 6 writes

This is all based upon how the data is laid down onto the disks and the parity that is written. Each write from a server turns into that many IO operations on the disks.

What this means in reality is that, for writes, R0 gives the highest IO from an array and R6 the lowest.

Putting it all together

So now we have:

  1. the IOPS for a disk type
  2. the RAID penalty we will be subject to when building our arrays
  3. a number of disks we might put in an array
  4. and the total IOPS we calculated that our arrays need to be able to achieve

Using the following calculation we can work out the total IOPS our arrays need to be able to handle once the RAID penalty is taken into account:

(TOTAL IOPS × % READ) + ((TOTAL IOPS × % WRITE) × RAID Penalty)

Let’s just say, for example, we want to build an array for our C:\’s, using the 320 IOPS value we calculated from our servers. Let’s also say that our systems do 30% READS and 70% WRITES. The actual amount of IOPS these servers are going to create is:

if we use RAID 5 (penalty of 4)

(320 x 0.3) + ((320 x 0.7) x 4) = 992

if we use RAID 10 (penalty of 2)

(320 x 0.3) + ((320 x 0.7) x 2) = 544

Using RAID 5 creates nearly twice as much IO, as you can see.
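To repeat this sum for several arrays or RAID levels without reaching for a calculator each time, the formula can be sketched in Python. The write penalties are the ones from the list in the RAID Penalties section. The function and dictionary names are made up for illustration.

```python
# Write penalties as listed in the RAID Penalties section.
RAID_WRITE_PENALTY = {
    "RAID0": 1,
    "RAID1": 2,
    "RAID10": 2,
    "RAID5": 4,
    "RAID50": 4,
    "RAID6": 6,
}


def backend_iops(total_iops, read_fraction, raid_level):
    """IOPS the disks must service once the RAID write penalty is applied."""
    if not 0 <= read_fraction <= 1:
        raise ValueError("read_fraction must be between 0 and 1")
    write_fraction = 1 - read_fraction
    penalty = RAID_WRITE_PENALTY[raid_level]
    # Reads pass through unchanged; each write costs `penalty` disk operations.
    return total_iops * read_fraction + total_iops * write_fraction * penalty


# The C:\ example: 320 IOPS measured, 30% reads and 70% writes.
for level in ("RAID10", "RAID5", "RAID6"):
    print(level, round(backend_iops(320, 0.3, level)))
```

The more write-heavy your servers are, the more the choice of RAID level matters, because only the write share of the IO is multiplied by the penalty.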

We repeat this for each set of results we grouped up. C:\’s for array 1, D:\’s for array 2 and so on.

Eventually we get the “amount” of IOPS our systems will create for the different partition types, e.g. roughly 550 for array 1, 800 for array 2 and 1000 for array 3.

And finally

We can now use a calculator, such as the one here

http://www.wmarow.com/strcalc/strcalc.html

to calculate the amount of IO our SAS/SATA/SSD arrays can handle.

For example, I selected the “Seagate Cheetah 15k 3.5″ 450 GB” drive from the drop down here, defined that our array would be 8 disks wide and inserted the 30% / 70% R/W ratio discussed earlier.

The overall IOPS value for an array of this type came out at 1373 in RAID 10. Our requirement for array 1 was roughly 550 in RAID 10, so an 8 disk SAS array of this type is more than capable.

Perfect – in fact we can place another 800 IOPS on this array before we’ll start to see a slow down.

Let’s say we decided to save money and try using SATA disks instead. Again using this calculator we can see that the same 8 disk array will achieve only 350 IOPS. Ergo this drive type is insufficient at that size. In fact, we have to increase the number of disks in a SATA array to 14 to get it to achieve 606 IOPS, which would be enough.

Please don’t forget that IF you choose a different RAID type, you will also have to re-calculate the number of IOPS all of your disks will produce. Remember the RAID penalty – i.e. R5 = much higher.

Final Thoughts

  1. You can use the actual values given from manufacturers, with wmarow’s tool, if the disks shown in the drop down aren’t representative of the values for the disks you were thinking of using.
  2. SSDs have a stunningly high throughput and you’ll see that you can put huge amounts of IOPS onto a disk of this kind. In our example you could put ALL of the C:\’s, the D:\’s AND the backups onto an SSD and still have room for extra.

Summary

So to summarise, the process is like this:

  • record the IO on the disks in use today, across the servers you want to virtualise
  • calculate a 95th percentile value which is representative of the above average usage of your disks
  • total up all of the IOPS into groups as you wish to lay them down onto your arrays
  • think about the RAID levels you’d like to use on your arrays
  • calculate the actual IO you’ll use on your array based on your RAID requirements
  • start playing around with the calculator to work out how many drives, and which types of drives, you’ll need to deliver the IOPS you need
  • you’ll end up with an idea of the array size, RAID types and disk types you’ll need to cover your IOPS requirements
  • Give yourself some overhead, so you can add more virtual machines in the future

The Real World

  • Here at Binary Royale, we tend to use HP P2000s, as mentioned before.
  • We supply SFF and LFF units. The small form factor units take 24 disks and the large form factor units take 12 disks.
  • The LFF disks are cheaper and have a slightly better throughput on them. This is actually down to the disk seek latency.
  • We’ve always used SAS disks for virtual platforms as SATA disks just don’t seem to be fast enough.
  • Having said that, if a virtual machine was, say, a fileserver and needed 5TB of slower storage, then creating a VHD/VMDK on a SATA array, and giving it to the fileserver virtual machine, would be very cost effective. Again, remember it’s all about the TOTAL IO on the array. If ONLY the fileserver and, say, one more VM were using the SATA array, it would be fine.
  • This is where the maths comes in. You calculate what you need and then you can spread it out across varying disk types and styles, depending on what you need.
  • We also use HP IO accelerator cards (SSD) if the virtual machine is going to run high IO database applications. It works!

I hope this has been informative and can help you.

If you cannot be bothered with all of this or need some additional guidance, just give us a call – 01332 890 460
