Thanks for all of the responses (Basil, Adam, John, Jeff).
So it sounds like, by and large, you've all had good experiences using NetApp to host "SAN"-type workloads, whether using NFS or iSCSI/FC (and across LUNs). On the SAN side of the house, we've grown accustomed to IBM's Easy Tier, but personally I will say that I still like all the bells and whistles in ONTAP (snapshots, SnapVault, FlexClone). Given the suggestions around FlashPool, I think that's worth exploring, potentially with some of our VM environment. I will certainly reach out to our NetApp contact to investigate/POC further.
From: Basil [mailto:firstname.lastname@example.org]
Sent: Wednesday, April 13, 2016 5:43 AM
To: Steiner, Jeffrey
Cc: Eric Peng; email@example.com
Subject: Re: running SAN on NetApp
RDMs on VMware should be avoided. This is out of scope of the initial question, but it's something I felt should be said. Other hypervisors are OK with that, but for VMware, it's better to use a datastore.
On Wednesday, 13 April 2016, Steiner, Jeffrey <Jeffrey.Steiner@netapp.com> wrote:
Hi all, Jeff from NetApp here. I feel compelled to chime in. Forgive any typos, I'm at 37,000 feet over Murmansk and there's a little turbulence.
First, I'll try to get to the point:
1) Protocol is less important than ever these days. Part of the reason for the change is that in the past it was hard to deal with the delays caused by physical drive heads moving around. Lots of solutions were offered, ranging from using just 10% of each drive to minimize drive head movement to having huge frames with massive numbers of disks and dedicated ASICs to tie it all together. These days you can get an astonishing amount of horsepower from a few Xeons, and combined with Flash all those old problems go away. I spend 50% of my time with the largest database projects on earth and for the most part I don't care what protocol is used. It's really a business decision. NFS is easier to manage, but if someone already has an FC infrastructure investment they might as well keep using it.
2) 19 out of 20 projects I see are hitting performance limits related to the bandwidth available. Customer A might have a couple of 8Gb FC HBAs running at line speed, while customer B has a couple of 10Gb NICs running at line speed. That 20th project tends to actually require SAN, but we're talking about databases that are pushing many GB/sec of bandwidth, such as lighting up ten 16Gb FC ports simultaneously. When you get that kind of demand, FC is currently faster and easier to manage than NFS.
3) Personally, I recommend FlashPool for tiering. I know there are other options out there, but I think they offer a false sense of security. There are a lot of variables that affect storage performance and lots of IO types, and it's easy to damage performance by making incorrect assumptions about precisely which IO types are important. FlashPool does a really good job of figuring out what media should host what IO and placing the data accordingly. For those fringe situations, FlashPool is highly tunable but almost nobody needs to depart from the defaults. For more common situations, such as archiving old VMs or LUNs, you can nondisruptively move the LUNs from SSD to SAS to SATA if you wish, but almost everyone these days goes directly to all-Flash. I still run into an occasional customer looking for really huge amounts of space, like 500TB, where all-flash isn't quite deemed affordable, and they seem to go with SAS+SSD in a FlashPool configuration.
4) When virtualization is used, I recommend caution with datastores. If you're looking for the best possible performance, keep the IO path simple. Don't re-virtualize a perfectly good LUN inside another container like a VMDK. The performance impact is probably minimal, but it isn't zero. Few customers would notice the overhead, but if you're really worried then consider an RDM or iSCSI LUNs owned directly by the guest itself. Let the hypervisor just manage the raw protocol itself. It also simplifies management. For example, if you use RDM/iSCSI you can move data between physical and virtual resources easily, and you can identify the source of a given LUN without having to look at the hypervisor management console.
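To put rough numbers on the bandwidth limits in point 2, here is a minimal back-of-the-envelope sketch. The per-port figures are commonly quoted nominal one-direction throughputs, not measurements, and the function and table names are made up for illustration.

```python
# Nominal per-port throughput in MB/s, one direction.
# Assumption: commonly quoted approximations; real results vary with
# protocol overhead, host CPU, and array configuration.
NOMINAL_MB_PER_SEC = {
    "8Gb FC": 800,
    "16Gb FC": 1600,
    "10Gb Ethernet": 1250,  # 10 Gbit/s divided by 8, before protocol overhead
}


def aggregate_gb_per_sec(link: str, ports: int) -> float:
    """Approximate aggregate bandwidth in GB/s for `ports` links at line speed."""
    return NOMINAL_MB_PER_SEC[link] * ports / 1000


if __name__ == "__main__":
    scenarios = [("8Gb FC", 2), ("10Gb Ethernet", 2), ("16Gb FC", 10)]
    for link, ports in scenarios:
        print(f"{ports} x {link}: ~{aggregate_gb_per_sec(link, ports):.1f} GB/s")
```

On these assumptions, the two "couple of ports" cases land around 2 GB/s or so either way, while ten 16Gb FC ports are an order of magnitude beyond that.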
Strictly speaking, I would agree that a NetApp LUN is "on top of WAFL" but unless you're using raw hard drives that's true of any storage array. Anybody's SAN implementation is running on top of some kind of virtualization layer that distributes the blocks across multiple drives. The main difference between ONTAP and the so-called "traditional" block-based arrays is that the OS that controls block placement is more complicated.
I know the competition loves to say that ONTAP is "simulating" SAN, but that's just nonsense. The protocol itself is pretty much the same sort of thing whether you're talking about NAS or SAN. You have a protocol and it processes inbound requests based on things like file offsets, sizes of reads, sizes of writes, etc. Servicing a request like "please read 1MB from an offset of 256K" isn't all that different when you're dealing with FC, iSCSI, CIFS, or NFS. SAN protocols are really more like subsets of file protocols. The hard part of SAN is mostly things like the QA process for ensuring a particular HBA with a particular firmware works well in a seriously faulty SAN situation without corruption. That's where the effort lies.
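As a rough illustration of why "read 1MB from an offset of 256K" looks similar over file and block protocols, the sketch below converts a byte-oriented request into a block-oriented one (starting block plus block count). The 512-byte block size and the function name are assumptions for the example; this is not how ONTAP or any particular initiator implements it.

```python
BLOCK_SIZE = 512  # bytes per logical block; an assumption for this example


def to_block_request(offset_bytes: int, length_bytes: int,
                     block_size: int = BLOCK_SIZE) -> tuple[int, int]:
    """Translate (byte offset, byte length) into (starting block, block count).

    Block protocols address whole blocks, so unaligned requests are rejected
    here rather than silently rounded.
    """
    if offset_bytes % block_size or length_bytes % block_size:
        raise ValueError("request is not block-aligned")
    return offset_bytes // block_size, length_bytes // block_size


if __name__ == "__main__":
    # "Please read 1MB from an offset of 256K"
    start_block, block_count = to_block_request(256 * 1024, 1024 * 1024)
    print(f"file view : offset={256 * 1024} bytes, length={1024 * 1024} bytes")
    print(f"block view: start block={start_block}, blocks={block_count}")
```

Either way, the storage side ends up servicing "this many bytes starting here"; the file protocols add naming, permissions, and locking on top.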
Sometimes, that extra stuff in ONTAP brings benefits. That's how you get things like snapshots, SnapMirror, FlexClone, SnapRestore, and so forth. Not everybody needs that, which is why we have products like the E-Series, which are mostly geared for environments that don't need all the ONTAP goodies. I used to work for an unspecified large database company somewhere near Palo Alto. Where the snapshot stuff was needed, we invariably bought ONTAP, and when it wasn't needed I almost always went for the IBM DS3000 series arrays, which eventually became E-Series. Neither one was better or worse than the other; it depended on what you wanted to do with it.
I won't lie - you don't get something for nothing. Speaking as someone who influenced, specified, or signed off on about $10M in total storage spending, the only time I really noticed a difference was with random vs sequential IO. ONTAP has more sophistication dealing with heavy random IO and it delivered generally better results, while the simplicity of SANtricity (E-Series) allowed better raw bandwidth. Both do all types of IO well, but there has to be some kind of tradeoff or all arrays would have 100% identical characteristics and we all know they don't.
We have all sorts of documents that demonstrate what the various protocols can do on ONTAP and E-Series, but if you need specific proof with your workloads you can talk to your NetApp rep and arrange a POC. We have an extensive customer POC team that does this sort of thing every day in their lab, but sometimes it can be as easy as just getting a temporary iSCSI license so you can see what iSCSI on a 10Gb NIC can do.
Sent: Wednesday, April 13, 2016 4:19 AM
Subject: running SAN on NetApp
Does anyone have experience with running production workloads on FC-based LUNs on NetApp? I'm curious to know how the performance of hosting virtual machines (including Exchange and database environments) compares to more traditional block-based SANs (EMC, 3Par, Hitachi, etc.), since what I've read is that NetApp's LUN feature still sits on top of WAFL.
We have some native FAS NetApps, along with many N-series rebranded NetApps, but all are run in 7-Mode and using NFS connections.
Also, how do you all implement data tiering in your NetApp environments? We are currently using IBM SAN (Storwize/V7000) and this has tiering capability. We'd consider moving some SAN workloads to NetApp if we could get as good SAN performance and also address the tiering capability.
Eric Peng | Enterprise Storage Systems Administrator
Esri | 380 New York St. | Redlands, CA 92373 | USA
T 909 793 2853 x3567 | M 909 367 1691