TechAdvantage 2011 – Storage Virtualization
Posted: March 24, 2011
I have been getting several off-line comments and questions from cooperatives about the storage requirements for virtualization. I mentioned in the presentation that we did not deploy with a storage area network (SAN). This seems to confuse many as to why or how we were able to do that. The purpose of this post is to address the storage options available in a virtual server environment and give you the pros and cons of setting them up.
Why Storage is Important
The importance of storage in a VMware environment is paramount. I would rate storage as more important than the processing speed of the host’s CPUs. Why? Here is the reality:
Virtual machines (VMs) need physical storage for several reasons. If you are not virtualized, your physical servers currently save their data on the disk drives attached directly to them. I call this DAS or direct attached storage. Virtual machines are a little different from their physical counterparts. A VM’s “disk” is actually a single file that is accessed by the host. The host opens it and runs the VM in its own memory when the VM is ‘powered on’. The VM’s applications and data also require disk storage for backup, restore, business continuity (BC), and disaster recovery (DR). There is a strong affinity and dependency between virtual machines, physical machines, and the storage. They go very much hand-in-hand. It is almost impossible to have one without the other.
The importance of storage in a VM environment might seem obvious. VMs are effectively data structures that are mapped into the host’s memory while they are executing. In the physical sense, they are a couple of files that reside on the host’s storage. There are also all the other files: the applications installed on top of the guest operating system and the actual data that the users access. So, storage matters both for the VM itself and for the applications installed on it.
The VMware hypervisor provides a plethora of tools and plug-ins to manage the underlying storage. Each tool and plug-in requires different capabilities of the physical infrastructure. For instance, while VMware offers the “Site Recovery Manager” option, it has a dependency on external, third-party technology such as replication, snapshots, data protection tools, management, and tiered storage.
The other aspect to this is the physical storage connectivity choices. There is I/O connectivity using serial attached SCSI (SAS), iSCSI, Fibre Channel (FC), Fibre Channel over Ethernet (FCoE), or NAS. Where the rubber meets the road, it all comes back to your storage and I/O needs: aggregating performance, reducing latency, and boosting IOPS and bandwidth in a consolidated environment. In the end, it is you who must support the availability, reliability, and serviceability for data protection, backup, BC, restore, HA, and DR for your cooperative.
Traditionally, the focus around the VM host and the storage for the VMs was consolidation to drive up utilization. You will find that as you aggregate multiple physical machines down onto a single physical host running multiple virtual machines, you might introduce a storage performance bottleneck. The individual physical machines in your legacy environment may not have had a high-throughput, high-bandwidth, or high-IOPS requirement, but as you aggregate them together on the same array you can cause latency and contention on the storage. So, in addition to looking at driving up CPU and memory utilization on your host, you must also focus on the storage.
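To make the aggregation effect concrete, here is a rough sketch in Python. The server names, the per-server IOPS numbers, and the array capability figure are all made up for illustration; they are not measurements from our environment, and you would substitute your own.

```python
# Hypothetical peak IOPS observed on each legacy physical server.
legacy_peak_iops = {
    "file-server": 150,
    "billing-db": 400,
    "mail": 250,
    "gis": 300,
}

def aggregate_iops(per_server_iops):
    """Worst case: every consolidated server peaks at the same time."""
    return sum(per_server_iops.values())

def headroom(per_server_iops, array_iops):
    """Positive means spare capacity; negative means contention is likely."""
    return array_iops - aggregate_iops(per_server_iops)

# Assumed figure for what the storage behind the host can sustain.
array_capability_iops = 1000

total = aggregate_iops(legacy_peak_iops)
spare = headroom(legacy_peak_iops, array_capability_iops)
print(f"Aggregate peak demand: {total} IOPS")
print(f"Headroom on the array: {spare} IOPS")
```

None of the four servers looks demanding by itself, yet together they ask for more than the assumed array can deliver. Summing the peaks is a deliberately pessimistic choice, since real workloads rarely all peak at once.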
What is needed for VMware storage?
Virtual machine hosts have no core requirement for shared storage that is accessible by multiple physical hosts. You can run standalone physical hosts using DAS. In our environment, we have a couple of standalone physical machines running the VMware hypervisor, and a product called Veeam moves the backups (VCB) from one hypervisor to the next. This inexpensive option kept us away from having to purchase a SAN and gives us everything we need. There are many vendors that will discourage you from proceeding with DAS because of the options available in the marketplace. They will attempt to dazzle you with cool-guy tricks that you don’t yet think you need…
Today, what is popular is automated or system-assisted data migration and tiering. There is storage that uses unified, multi-protocol support for concurrent block and file writes and incorporates data reduction techniques. These systems natively support archiving, compression, de-dupe, thin provisioning, and space-saving snapshots. Some are “application aware” and include application-enabled data protection, leveraging snapshots, replication, and continuous data protection with backup and restore.
There are several options to choose from. Which one you choose depends on what your needs are.
I needed basic functionality. I broke it all down into categories: what I need, what I want to have, and what would be nice to have or desirable if I can afford it. Each may be part of a solution, but might not be part of my requirements. Now, I get a lot of push-back on this one in terms of what I considered desirable features. In discussing storage virtualization with one vendor, they told me that the VMware VAAI API (vStorage APIs for Array Integration) was mandatory in order to virtualize my servers. Well, it probably is, particularly if their product requires VAAI. I learned that their messaging was very tightly aligned with the API. So, one would expect them to push a position that makes VAAI an absolute requirement; however, the reality is that underlying storage systems are evolving (Moore’s Law). Right now, I put this in the desirable category, and a year from now I might move it into the required category. Remember that desirable features are not really required unless you “underhandedly” present them as such.
So, no vMotion or other VMware-branded BC/DR options were viewed as being required in our environment, at least not yet. This does not imply that we don’t have a BC, DR, or HA solution. To be blunt, we did not see the value in the amount of money we’d have to pay to get the cool-guy toys that we would hopefully never have to use. Also, the fancy options were priced way out of our reach anyway. Of course, I could have made a business case using a scare tactic, but I’ve grown out of that. I’d rather operations have a couple of new bucket trucks to assist in getting the lights back on than a SAN that is going to be replaced in a couple of years due to technology improvements.
If you disagree with me, or have already made your business case for shared storage, read on: for everything else VMware offers beyond free server virtualization, shared storage is a necessity.
Now, this is where things usually become cloudy: shared storage means lots of different things to different people. When you think of shared storage, you might think of Fibre Channel (FC), Fibre Channel over Ethernet (FCoE), iSCSI, NAS, mapping a network drive, or NFS, but shared storage can also be externally shared SAS. You can attach a disk array to one, two, four or more servers using external SAS cards, very similar to how you attach Fibre Channel storage.
The key point here is that shared storage can use one of two mechanisms: block or file. What gets even more confusing is that the answer can also depend on the shared storage vendor you talk to. If that vendor uses iSCSI only, they will say, “You need iSCSI for your virtual machines!” If they are FC, they are going to recommend Fibre Channel. But it really all comes back to your base requirement of having shared storage so you can move virtual machines around. What they recommend is unimportant beyond your requirements.
Machines hosting VMs require basic reliability, availability, and serviceability features such as redundant power supplies, redundant fans, hot-swap components, and hot-swap drives. The hosts need the ability to scale in terms of performance, availability, and capacity.
There are also the requirements of each application environment. Each environment has its own constraints on performance, availability, and capacity. Maybe Application A has no performance need but needs more space, while Application B doesn’t need a lot of space but needs performance. This is where “scale” has different meanings and different dimensions that will align to your particular business application requirement.
The most important thing is that it has to be easy to buy, easy to install, configure, understand, and use on a routine basis. So often I see and read about “easy button” products. Once you buy them, you find that while the product was actually very easy to install, you don’t know how to manage it. How easy is it going to be to manage the day after it is installed? How long do you intend to keep it? If something should go wrong, how easy is that solution to acquire?
Determining what VMware external shared storage you need
What it comes back to is this: determining your needs versus your wants, your must-haves versus the nice-to-haves on the wish list you usually make for Santa during budget time (October-December). There are a lot of new things out there that I would love to have, but there is a basic capability that I need to have to get my job done in supporting the cooperative’s business activities and adhering to my cooperative’s mission statement. If something on the nice-to-have list becomes more affordable or gets integrated into a new product, it becomes more viable for me to obtain. Sorting things this way establishes my basic needs, my wants, and my wish list. So, stick to your guns: stick to what you need to meet your basic objective and then see what else you can get for a couple more dollars.
Factor in your budget, both capital and operating. What must you be able to do to get a deal on some storage right now? Is it cost effective? What are the ongoing costs going to be? Are you going to be billed for maintenance? What about ongoing software licensing or other fees? What are your performance needs in terms of read versus write, random versus sequential, large versus small I/Os, and concurrent versus parallel access? What kind of applications are you going to be supporting?
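To put numbers on the budget questions above, a back-of-the-envelope comparison like the following Python sketch can help. Every dollar figure here is invented for illustration, and the simple formula (capital plus flat annual fees) is an assumption; real quotes often include escalating maintenance or one-time fees that you would need to add.

```python
def total_cost(capital, annual_maintenance, annual_licensing, years):
    """Capital outlay plus flat yearly operating costs over the ownership period."""
    return capital + years * (annual_maintenance + annual_licensing)

# Hypothetical quotes, compared over the same five-year window.
das_cost = total_cost(capital=8000, annual_maintenance=500,
                      annual_licensing=0, years=5)
san_cost = total_cost(capital=45000, annual_maintenance=6000,
                      annual_licensing=2000, years=5)

print(f"DAS over 5 years: ${das_cost:,}")
print(f"SAN over 5 years: ${san_cost:,}")
print(f"Difference:       ${san_cost - das_cost:,}")
```

Comparing options over the same number of years keeps the ongoing fees from hiding behind the purchase price.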
Those individual servers that you are consolidating might not have much of a storage performance impact by themselves, but adding them together on your host can cause contention and introduce performance bottlenecks. Be sure to focus on IOPS to measure disk performance. If you are doing lots of small I/Os, you are not going to see a large megabyte-per-second or gigabyte-per-second rate; you are going to see lots of activity. So keep all this in perspective.
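The relationship between IOPS and throughput is simple arithmetic: throughput is roughly IOPS multiplied by the I/O size. The short Python sketch below illustrates it; the two workloads are hypothetical examples, not figures from any particular system.

```python
def throughput_mb_per_s(iops, io_size_kb):
    """Approximate throughput in MB/s for a given IOPS rate and I/O size (1 MB = 1024 KB)."""
    return iops * io_size_kb / 1024

# Hypothetical busy database: many small 4 KB I/Os.
small_io = throughput_mb_per_s(iops=5000, io_size_kb=4)

# Hypothetical backup job: few large 1 MB I/Os.
large_io = throughput_mb_per_s(iops=200, io_size_kb=1024)

print(f"5000 IOPS at 4 KB: {small_io:.1f} MB/s")
print(f"200 IOPS at 1 MB:  {large_io:.1f} MB/s")
```

The small-I/O workload keeps the disks far busier yet moves about a tenth of the data per second, which is why a megabyte-per-second figure alone can badly understate the load.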
All of this is going to help determine whether you should buy something now. What are your availability and data protection requirements? What are your preferences? All those other things aside, what is your preference for a vendor, architecture, protocol, and block or file, and how will it coexist with your current environment?
There are many different options for VMware storage. Identify what you need versus what you want and what you would like. List all of them out on a sheet and do your comparison. Categorize them as must have, need to have, would like to have, and would want to have, and then use that list as part of your criteria as you evaluate the technology. Stick to your guns on what meets your base requirements. Then see what else you can get from those “wish list” items as a part of your solution without having to increase the costs outside of what your budget can support. Consider putting together a multi-year strategy; revisit it and update as needed. Try balancing the old with the new, and don’t be afraid, but look before you leap!
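If a sheet of paper feels too informal, the same comparison can be sketched in a few lines of Python. The option names, prices, and feature lists below are invented for illustration, and the ranking rule (must-haves and budget act as filters, then wish-list items break ties) is just one reasonable assumption about how to weigh things.

```python
from dataclasses import dataclass, field

@dataclass
class Option:
    name: str
    cost: int
    features: set = field(default_factory=set)

# Hypothetical requirement lists; replace with your own.
MUST_HAVE = {"hot-swap drives", "redundant power"}
WISH_LIST = {"snapshots", "replication", "thin provisioning"}

def shortlist(options, budget):
    """Keep options that meet every must-have within budget,
    then rank by wish-list items covered (most first), then by cost."""
    qualifying = [o for o in options
                  if MUST_HAVE <= o.features and o.cost <= budget]
    return sorted(qualifying,
                  key=lambda o: (-len(o.features & WISH_LIST), o.cost))

options = [
    Option("DAS in each host", 8000, {"hot-swap drives", "redundant power"}),
    Option("Shared SAS array", 18000,
           {"hot-swap drives", "redundant power", "snapshots"}),
    Option("Entry iSCSI SAN", 45000,
           {"hot-swap drives", "redundant power", "snapshots",
            "replication", "thin provisioning"}),
    Option("Bargain NAS", 5000, {"snapshots"}),
]

ranked = shortlist(options, budget=20000)
for o in ranked:
    print(o.name, o.cost, sorted(o.features & WISH_LIST))
```

With a budget of 20000, the SAN is filtered out on cost and the bargain NAS on missing must-haves, no matter how many wish-list boxes either one ticks. That is the "stick to your guns" rule expressed as a filter.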