Previously I posted on how using bigger VMFS volumes helps Equallogic reduce their scalability issues when it comes to total iSCSI connections. There was a comment asking whether this means we can have a new best practice for VMFS size. I quickly said, “Yeah, make em big or go home.” I didn’t really say that, but something like it. The commenter then responded with a long reply from Equallogic saying VAAI only fixes SCSI locks and all the other issues with bigger datastores still remain. ALL the other issues being “Queue Depth.”
Here is my order of potential IO problems with VMware on Equallogic:
- Being spindle bound. You have an awesome virtualized array that will send IO to every disk in the pool or group. Unlike some others you can take advantage of a lot of spindles. Even then, depending on the types of disks some IO workloads are going to use up all your potential IO.
Solution(s): More spindles is always a good solution if you have unlimited budget. Not always practical. Put some planning into your deployment. Don’t just buy 17TB of SATA. Get some faster disk and break your Group into pools and separate the workloads into something better suited to the IO needs.
- Connection Limits. The next problem you will run into if you are not having IO problems is the total iSCSI connections. In an attempt to get all of the IO you can from your array you have multiple vmk ports using MPIO. This multiplies the connections very quickly. When you reach the limit, connections drop and bad things happen.
Solution: The new 5.02 firmware increases the total maximum connections. Additionally, bigger datastores mean fewer connections. Do the math.
- Queue Depth. There are queues everywhere: the SAN ports have queues, each LUN has a queue, and the HBA has a queue. I would defer to this article by Frank Denneman (a much smarter guy than myself), which makes the case that a balanced storage design is the best course of action.
Solution(s): Refer to problem 1. Properly designed storage is going to give you the best solution for any potential (even though unlikely) queue problems. In your great storage design, make room for monitoring. Equallogic gives you SANHQ. USE IT!!! See how your front end queues are doing on all your ports. Use ESXTOP or RESXTOP to see how the queues look on the ESX host. Most of us will find that queues are not a problem when problem one is properly taken care of. If you still have a queuing problem, then go ahead and make a new datastore. I would also request that Equallogic (and others) release a Path Selection Policy plugin that uses a Least Queue Depth algorithm (or something smarter). That would help a lot.
So I will repeat my earlier statement that VAAI allows you to make bigger datastores and house more VMs per datastore. I will add a caveat: if you have a particular application that needs a high IO workload, give it its own datastore.
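To "do the math" on the connection-limit point, here is a rough sketch. It assumes every vmkernel port on every host opens one iSCSI connection to each volume. The host count, port count, and capacity figures are made-up illustration values, not numbers from a real deployment.

```python
import math


def connections_for_capacity(total_gb, datastore_gb, hosts, vmk_ports):
    """Estimate iSCSI connections needed to present total_gb of storage.

    Assumes one connection per (host, vmkernel port, volume) combination.
    """
    volumes = math.ceil(total_gb / datastore_gb)
    return hosts * vmk_ports * volumes


# Hypothetical environment: 8 hosts, 4 vmkernel ports each, 6000 GB to present.
small_datastores = connections_for_capacity(6000, 500, hosts=8, vmk_ports=4)
big_datastores = connections_for_capacity(6000, 1000, hosts=8, vmk_ports=4)

print("500 GB datastores: ", small_datastores, "connections")
print("1000 GB datastores:", big_datastores, "connections")
```

Doubling the datastore size halves the volume count, and the connection count drops with it.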
So I often have epiphany teasers while driving long distances or stuck in traffic. I call them teasers because they are never fully developed ideas and often disappear into thoughts about passing cars, or yelling at the person on their cell phone going 15 MPH taking up 2 lanes.
Here are some I was able to save today (VMware related):
1. What if I DID want an HA cluster to be split across two different locations? Why?
2. Why must we over-subscribe iSCSI vmkernel ports to make the best use of the 1 GbE physical NICs? Is it just the software iSCSI in vSphere? Is it just something that happens with IP storage? I should test that sometime…
3. If I had 10 Gb NICs I wouldn’t use them on Service Console or VMotion; that would be a waste. No wait, VMotion ports could use them to speed up your VMotions.
4. Why do people use VLAN 1 for their production servers? Didn’t their Momma teach em?
5. People shouldn’t fear using extents, they are not that bad. No, maybe they are. Nah, I bet they are fine, how often does just 1 LUN go down? What are the chances of it being the first LUN in your extent? Ok, maybe it happens a bunch. I am too scared to try it today.
*Disclaimer – I work for a Xsigo and VMware partner.
I was in the VMware View Design and Best practices class a couple weeks ago. Much of the class is built on the VMware View Reference Architecture. The picture below is from that PDF.
It really struck me how many IO connections (network or storage) it would take to run this POD. The minimum (in my opinion) would be 6 cables per host; with ten 8-host clusters that is 480 cables! Let’s say that 160 of those are 4 Gb Fibre Channel and the other 320 are 1 Gb Ethernet. That is 640 Gb for storage and 320 Gb for network.
Xsigo currently uses 20 Gb InfiniBand, and best practice would be to use 2 cards per server. The same 80 servers in the above cluster would have 3200 Gb of bandwidth available. Add in the flexibility and ease of management you get using virtual IO, plus the cost savings from the director-class fibre switches and datacenter switches you no longer need, and I would think the ROI pays for the Xsigo Directors. I don’t deal with pricing, so this is pure contemplation, and I will stick with the technical benefits. Being in the datacenter, I like any solution that makes provisioning servers easier, takes less cabling, and gives me unbelievable bandwidth.
So, just as VMware changed the way we think about the datacenter, virtual IO will once again change how we deal with our deployments.
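The cable and bandwidth arithmetic in this post can be checked with a quick sketch. The per-host split of 2 Fibre Channel and 4 Ethernet cables is my reading of the 160/320 figures, and the variable names are just for illustration.

```python
# Reference-architecture POD: ten clusters of 8 hosts each.
hosts = 10 * 8

# Traditional cabling: 6 cables per host (assumed 2 x 4 Gb FC + 4 x 1 Gb Ethernet).
fc_cables = hosts * 2
eth_cables = hosts * 4
total_cables = fc_cables + eth_cables

storage_gb = fc_cables * 4   # 4 Gb per Fibre Channel link
network_gb = eth_cables * 1  # 1 Gb per Ethernet link

# Virtual IO: 2 cards per server at 20 Gb InfiniBand each.
xsigo_cables = hosts * 2
xsigo_gb = xsigo_cables * 20

print(f"Traditional: {total_cables} cables, {storage_gb} Gb storage + {network_gb} Gb network")
print(f"Virtual IO:  {xsigo_cables} cables, {xsigo_gb} Gb shared")
```

These are raw link totals only; they say nothing about what the workloads actually push through those links.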
Equallogic PS Series Design Considerations
VMware vSphere introduces support for multipathing for iSCSI. Equallogic released a recommended configuration for using MPIO with iSCSI. I have a few observations after working with MPIO and iSCSI. The main lesson is to know the capabilities of the storage before you go trying to see how many paths you can have with active IO.
- EqualLogic defines a host connection as 1 iSCSI path to a volume. At VMware Partner Exchange 2010 I was told by a Dell guy, “Yeah, gotta read those release notes!”
- EqualLogic limits the number of host connections to 128 per pool or 256 per group in the 4000 series (see table 1 for full breakdown) and to 512 per pool or 2048 per group in the 6000 series arrays.
- The EqualLogic MPIO recommendation mentioned above can consume many connections with just a few vSphere hosts.
I was under the false impression that by “hosts” we were talking about physical connections to the array. Especially since the datasheet says “Hosts Accessing PS series Group”. It actually means iSCSI connections to a volume. Therefore if you have 1 host with 128 volumes singly connected via 1 iSCSI path each, you are already at your limit (on the PS4000).
An example of how fast vSphere iSCSI MPIO (Round Robin) can consume available connections can be seen in this scenario: five vSphere hosts with 2 network cards each on the iSCSI network. If we follow the whitepaper above, we will create 4 vmkernel ports per host. Each vmkernel port creates an additional connection per volume. Therefore, if we have ten 300 GB volumes for datastores, we already have 5 x 4 x 10 = 200 iSCSI connections to our Equallogic array. Really no problem for the 6000 series, but the 4000 will start to drop connections. I have not even added the connections created by the vStorage API/VCB capable backup server. So here is a formula*:
N – number of hosts
V – number of vmkernel ports
T – number of targeted volumes
B – number of connections from the backup server
C – number of connections
(N * V * T) + B = C
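As a quick sanity check, the formula can be written as a small function. This is only a sketch: the function name is mine, and the limits in the dictionary are the per-pool numbers quoted earlier in this post, not values read from an array.

```python
def iscsi_connections(hosts, vmk_ports, volumes, backup_connections=0):
    """C = (N * V * T) + B, using the variable definitions above."""
    return hosts * vmk_ports * volumes + backup_connections


# Per-pool connection limits as quoted in this post.
POOL_LIMITS = {"PS4000": 128, "PS6000": 512}

# The scenario from the text: 5 hosts, 4 vmkernel ports, 10 volumes, no backup server yet.
total = iscsi_connections(hosts=5, vmk_ports=4, volumes=10)
print("Connections:", total)

for model, limit in POOL_LIMITS.items():
    status = "over the pool limit" if total > limit else "within the pool limit"
    print(f"{model}: {status} ({total}/{limit})")
```

Treat the result as a floor rather than an exact count, since the number observed on the array can come out higher.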
Equallogic PS Series Array
Use multiple pools within the group in order to avoid dropped iSCSI connections and provide scalability. Keep in mind that this reduces the number of spindles you are hitting with your IO. Taking care to know the capacity of the array will help avoid big problems down the road.
*I have seen the connections actually be higher, and I can only figure this is because of the way EqualLogic does iSCSI redirection.
Due to the performance hit while zeroing, mentioned in the Thin Provisioning Performance white paper, this article in the VMware knowledge base could be of some good use.
I would suggest using eagerzeroedthick for any high IO tier 1 type of Virtual Machine. This can be done when creating the VMDK from the GUI by selecting the “Support Clustering Features such as Fault Tolerance” check box.
So go out and check your VMDKs.
So today I got around to putting ESXi 4 on my spare box at home. I first deployed a new virtual server and decided to use the thin provisioning built into the new version. After getting everything all set up, I was surprised to still see this.
I was like DANG! That is some awesome thin provisioning. Really, though, I was thinking something had to be wrong. A 42 GB drive with Windows 2008 only using 2.28 KB, that is sweet! I thought for sure that, since I had not seen this screen in the VM's information before, it had already refreshed. It was too good to be true, though. I clicked Refresh Storage and it ended up like this. Which made a lot more sense for a fresh and patched Windows install. So far this leads to my first question: why the manual refresh? Should this refresh automatically when the screen redraws?