Recently I had the privilege of being a Tech Field Day delegate. Tech Field Day is organized by Gestalt IT. If you want more detail on Tech Field Day, visit right here. In the interest of full disclosure, the vendors we visit sponsor the event. The delegates are under no obligation to review the sponsoring companies, favorably or otherwise.
The first place hosting the delegates was NetApp. I have worked with several different storage vendors, but I must admit I have never really experienced NetApp before, except for Storage vMotioning virtual machines from an old NetApp (I don’t even know the model) to a new SAN.
Over the 4 hours of slide shows I learned a ton. One great topic was Storage Caching vs Tiering. Some of the delegates have already blogged about the sessions here and here.
So I am going to give my super quick summary of caching as I understood it from the NetApp session, followed by a post about tiering as I learned it from one of our subsequent sessions, with Avere. NetApp made three main claims:
1. Caching is superior to Tiering because Tiering requires too much management.
2. Caching outperforms tiering.
3. Tiering drives cost up.
The NetApp method is to use really quick Flash memory to speed up the performance of the SAN. Their software attempts to predict what data will be read and keeps that data available in the cache. This “front-ends” a giant pool of SATA drives. The cache cards provide the performance, and the SATA drives provide a single large pool to manage. With a simplified management model and just one type of big disk, the cost is driven down.
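To make the idea concrete, here is a minimal sketch of a read cache sitting in front of a slow pool. It is not NetApp's algorithm: the session did not describe how their software predicts reads, so plain least-recently-used (LRU) eviction stands in for it, and the names `ReadCache` and `sata_pool` are made up for illustration.

```python
from collections import OrderedDict


class ReadCache:
    """Toy LRU read cache in front of a slow backing store (hypothetical names)."""

    def __init__(self, backing, capacity):
        self.backing = backing        # dict standing in for the big SATA pool
        self.capacity = capacity      # number of blocks the flash cache can hold
        self.cache = OrderedDict()    # block_id -> data, oldest entry first
        self.hits = 0
        self.misses = 0

    def read(self, block_id):
        if block_id in self.cache:
            self.cache.move_to_end(block_id)   # mark as most recently used
            self.hits += 1
            return self.cache[block_id]
        self.misses += 1
        data = self.backing[block_id]          # slow path: read from disk
        self.cache[block_id] = data            # keep a copy in the cache
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)     # evict the least recently used block
        return data


sata_pool = {i: f"block-{i}" for i in range(10)}
cache = ReadCache(sata_pool, capacity=3)
for block in [1, 2, 1, 3, 1, 4, 2]:
    cache.read(block)
print(cache.hits, cache.misses)
```

The only thing to manage here is the cache size. Every block still lives in the one big pool, which is the simplicity argument NetApp was making.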
My Takeaway on Tierless Caching
This is a solution that has a place and would work well for many situations. This is not the only solution. All in all the presentation was very good. The comparisons against tiering were really set up against a “straw-man”. A multi-device tiered solution requiring manual management of all the different storage tiers is of course really hard to run. It could cost more to obtain and could be more expensive to manage. I asked about fully virtualized, automated tiering solutions, the kind that manage your “tiers” as one big pool. These solutions would seem to solve the problem of managing tiers of disks, keeping the cost down. The question was somewhat deflected because these solutions move data on a schedule. “How can I know when to move my data up to the top tier?” was the question posed by NetApp. Of course this is not exactly how a fully automated tiering SAN works, but it is a valid concern.
My Questions for the Smartguys:
1. How can the NetApp caching software choices be better/worse than software that makes tiering decisions from companies that have done this for several years?
2. If tiering is so bad, why does Compellent’s stock continue to rise in anticipation of an acquisition from someone big?
3. Would I really want to pay NetApp-sized money to send my backups to a NetApp pool of SATA disks? Would I be better off with a more affordable SATA solution for backup to disk, even if I have to spend slightly more time managing the device?
14 thoughts on “Storage Caching vs Tiering Part 1”
I think this might spark some debate from a certain company that sells “Fully Automated Storage Tiering”. Good work, Jon!
I aim to please. I like good discussion.
I don’t see why there would be a discussion.
The facts are that FAST Cache and FAST are available in EMC Unified today and are point and click options in Unisphere. NetApp have said tiering is dead but then came back and made SSDs available as a drive option in their most recent launch thereby creating a Flash Storage Tier with no automated data placement.
How is that tierless?
Realistically, as tiering and caching develop, I don’t see any difference between them except for a minor thing:
With caching, the “home” of all data is the slowest area, the cached copy is in theory temporary, regardless of how many months or years the data may remain there, and a copy of the data will remain in the slowest area.
With tiering, the “home” of a block of data will change over time depending on its usage pattern compared to other pieces of data, and the data may (but doesn’t need to) be removed from the slowest data location when it’s moved to the higher tiers.
Also, it’s perfectly valid to consider a caching system with multiple levels of cache, just as a standard CPU has level 1, level 2, and level 3 cache, with some of that cache consisting of spinning disk rather than SSD, just like in the more advanced tiering models currently being used.
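The “home” distinction drawn in this comment can be sketched in a few lines. This is a toy model with hypothetical names (`TwoLevelStore`, `cache_block`, `promote_block`), and it shows the tiering variant in which the block is removed from the slow area when promoted. Real arrays protect every tier with RAID, so the simulated loss of the fast area is only there to show where the sole copy of each block lives.

```python
class TwoLevelStore:
    """Toy model of a slow area plus a fast area (all names are hypothetical)."""

    def __init__(self, blocks):
        self.slow = dict(blocks)   # slowest area, e.g. SATA
        self.fast = {}             # fast area, e.g. flash

    def cache_block(self, block_id):
        # Caching: copy the block up; its home stays in the slow area.
        self.fast[block_id] = self.slow[block_id]

    def promote_block(self, block_id):
        # Tiering: move the block up; its home is now the fast area.
        self.fast[block_id] = self.slow.pop(block_id)

    def lose_fast_area(self):
        # Return the ids of blocks whose only copy was in the fast area.
        lost = [b for b in self.fast if b not in self.slow]
        self.fast.clear()
        return lost


store = TwoLevelStore({"a": 1, "b": 2})
store.cache_block("a")      # "a" is cached: two copies exist
store.promote_block("b")    # "b" is tiered up: one copy, in the fast area
lost = store.lose_fast_area()
print(lost, sorted(store.slow))
```

The cached block survives because the slow area never gave it up. The promoted block does not, which is why the fast tier in a tiering design has to be as well protected as any other tier.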
The reason that NetApp doesn’t like to answer some of your questions is that they don’t have software that can automatically tier between different media types, EMC’s FAST being only one example. Typically, when NetApp doesn’t have an important feature, they’ll either (a) claim that it’s not needed, or (b) construct a strawman argument against something that doesn’t really exist. That’s what you’re seeing.
NetApp’s use of cache, while useful in some cases as you point out, is read-only. Since it is volatile, it doesn’t do anything to cache writes. That is OK if your application is doing mostly reads (and has good locality of reference!), but not so good if there are heavy change rates or, perish the thought, you’re doing a massive restore.
Not that EMC is against storage caches — we like them too. But our caches are non-volatile, and can be used for both reads *and* writes. The CLARiiON/Celerra implementation is kind of cool: an enterprise flash drive can either be used as a caching device (reads and writes) or as a storage target — your choice, do one, the other, or both.
As far as “sending backups to NetApp SATA”, I’d agree with you there. The EMC Data Domain product is optimized as a backup target, and — as a result — compares extremely favorably in performance, availability, efficiency, etc. vs. a generic storage product that’s been repurposed for the task.
And that’s the key here — rather than get dragged into a good vs. bad debate, understand that different approaches have different strengths and weaknesses.
I asked NetApp the same question – “why does Compellent’s stock continue to rise in anticipation of an acquisition from someone big”
Their response was: Compellent wants to be sold, and they’ve stated this for years. If their product is so good, then why haven’t they been sold yet?
Not sure I’m buying that, but it’s what they said…
There is another key consideration when debating caching vs tiering: the underlying filesystem and the way data is laid out on flash and disk. Many of the tradeoffs that have been discussed here assume a traditional file layout. For example, write-in-place filesystems generally require disk seeks for every random write, which makes write caching more important. What we call “write in free space” filesystems can get better random write performance because they coalesce logically random writes into a physically sequential write on disk. In this case, a large read cache coupled with sequential data layout gives consistently good performance for both reads and writes.
Our CTO compares filesystem architectures on our blog: http://www.nimblestorage.com/category/blog/
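A rough sketch of the write-in-place vs “write in free space” contrast from the comment above. It is a toy model that counts one seek every time a write is not physically adjacent to the previous one. It is not any vendor's filesystem, and the function names and the block map are illustrative assumptions.

```python
def write_in_place(disk, writes):
    """Write each logical block to its fixed location; return the seek count."""
    seeks = 0
    previous = None
    for block, data in writes:
        if previous is None or block != previous + 1:
            seeks += 1                 # head has to move to a new location
        disk[block] = data
        previous = block
    return seeks


def write_in_free_space(log, block_map, writes):
    """Append all writes to one sequential log; return the seek count."""
    for block, data in writes:
        block_map[block] = len(log)    # remember where the logical block now lives
        log.append(data)               # physically sequential write
    return 1 if writes else 0          # one seek to the start of free space


random_writes = [(70, "a"), (3, "b"), (41, "c"), (4, "d")]

disk = {}
in_place_seeks = write_in_place(disk, random_writes)

log, block_map = [], {}
free_space_seeks = write_in_free_space(log, block_map, random_writes)

print(in_place_seeks, free_space_seeks)
```

The price of the second approach is the extra map from logical block to physical location, plus whatever cleanup is needed to reclaim overwritten space, which this sketch leaves out.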
Well, my view is that the proof is in the pudding: check out industry-standard benchmarks such as SPECsfs and decide for yourself who has the best approach…
1) It really depends on your workload. Myself, I still prefer caching to tiering, but I don’t like the NetApp caching as much because it’s only a read cache, and I want a read and write cache. EMC Flash cache is both read and write (though I don’t use EMC either). For me, distributed RAID with wide striping does the job fine. I think it will be a few years before all of this matures to the point where it’s really good.
2) I don’t believe anyone is saying tiering is bad. NetApp might say their caching is better, probably because they don’t support tiering. They also don’t support RAID 1, which is pretty important if you want high performance. My advice if you want NetApp: get the V-series head units and put them in front of a better disk subsystem. And Compellent’s stock price is going up because they have some pretty unique technology on the market, and after the bidding war over 3PAR, folks are hopeful the same will happen for them.
3) It depends on your requirements. The idea behind bigger storage arrays is that there is less to manage, and they are easier to manage, with higher performance, availability, etc. As for whether you’d be better off, that’s a decision you’ll have to make. For me, I would much rather pay more and get more out of the system, especially if the system can be purposed for multiple tasks simultaneously, or can grow into such a system. There are too many headaches associated with tier 3/4 storage systems for me to look at them again.
just my 2c as a freelance NetApp trainer:
1) NetApp’s ‘Read-Cache’ is the NVRAM. As I always state in my courses: A NetApp writes faster than it reads. Check the published benchmarks for details. Also ‘distributed RAID with wide striping’ is standard ‘Best Practice’ on NetApp systems with up to 100TB per storage pool (‘aggregate’).
2) NetApp DOES do RAID 1. It’s called SyncMirror. Officially it would be RAID 14, 16 or 1-DP… You do get the usual almost 200% speedup in reads (whenever they’re actually coming from disk and not (Flash)Cache), provided the two halves of the mirror aren’t too far apart (under roughly 10 km, about 6 miles). They can be up to 100 km (60 miles) apart as part of a MetroCluster deployment.
3) NetApp should be right up your alley with up to 2.8PB or '1.7 gigabytes/second sustained random read/write' (real number found in a recent post) per HA Pair.
Thanks to all for the comments. I think my follow-up comment would be similar to @storagezilla’s. I favor thinking of a “cache” as a tier: anywhere the data will live for a while (even for a short while) can be a tier. I didn’t intend for an EMC v NetApp battle over the fundamentals of storage to break out. I mostly wanted to point out the rhetoric, and what it amounts to if you really think about what they are saying. It seems to me that NetApp is trying really hard to set itself apart from the crowd.