Equallogic, VAAI and the Fear of Queues

Previously I posted on how using bigger VMFS volumes helps Equallogic reduce their scalability issues when it comes to total iSCSI connections. There was a comment about does this mean we can have a new best practice for VMFS size. I quickly said, “Yeah, make em big or go home.” I didn’t really say that but something like it. Since the commenter responded with a long response from Equallogic saying VAAI only fixes SCSI locks all the other issues with bigger datastores still remain. ALL the other issues being “Queue Depth.”

Here is my order of potential IO problems on with VMware on Equallogic:

  1. Being spindle bound. You have an awesome virtualized array that will send IO to every disk in the pool or group. Unlike some others you can take advantage of a lot of spindles. Even then, depending on the types of disks some IO workloads are going to use up all your potential IO. **Solution(s): **More spindles is always a good solution if you have unlimited budget. Not always practical. Put some planning into your deployment. Don’t just buy 17TB of SATA. Get some faster disk and break your Group into pools and separate the workloads into something better suited to the IO needs.

  2. Connection Limits. The next problem you will run into if you are not having IO problems is the total iSCSI connections. In an attempt to get all of the IO you can from your array you have multiple vmk ports using MPIO. This multiplies the connections very quickly. When you reach the limit, connections drop and bad things happen. Solution: The new 5.02 firmware increases the total maximum connections. Additionally, bigger datastores means less connections. Do the math.

  3. Queue Depth. There are queues everywhere, the SAN ports have queues. Each LUN has a queue. The HBA has a queue. I would need to defer to a this article by Frank Denneman (a much smarter guy than myself.) That balanced storage design is best course of action. Solution(s): Refer to problem 1. Properly designed storage is going to give you the best solution for any potential (even though unlikely) queue problems. In your great storage design, make room for monitoring. Equallogic gives you SANHQ USE IT!!! See how your front end queues are doing on all your ports. Use ESXTOP or RESXTOP to see how the queues look on the ESX host. Most of us will find that queues are not a problem when problem one is properly taken care of. If you still have a queuing problem then go ahead and make a new datastore. I would also request Equallogic (and others) release a Path Selection Policy plugin that uses a Least Queue Depth algorithm (or something smarter). That would help a lot.

So I will repeat my earlier statement that VAAI allows you to make bigger datastores and house more VM’s per store. I will add a caveat, if you have a particular application that needs a high IO workload, give it a datastore.

Written on October 29, 2010