Wrangling Virtual Machine Sprawl
September 30, 2008
There are a number of reasons for this, and some are, unfortunately, inevitable. Virtualization, by its nature, shifts storage requirements from internal and directly attached disks to networked storage because operating systems and applications are no longer tied to a specific physical server.
So when you get involved in virtualization, you know you are going to need more storage, but the question is how much? The answer is almost certainly more than you think.
"Most companies I speak with don't plan for enough storage simply because when they start the virtualization projects, they don't think about rapid adoption or disaster recovery, and they haven't even begun thinking about desktop virtualization," said Mark Bowker, an analyst at Enterprise Strategy Group (ESG).
Features like VMware's VMotion and the newer Storage VMotion require even more space on the same storage system. What's clear, then, is that serious advance planning (including planning for new features that don't exist yet) is essential.
Once you've faced up to the fact that you are going to need a lot more storage, it's sensible to take steps to ensure you don't buy more than you actually need. But how do you do that? What can be done to stop the data center from being taken over by storage devices?
In a future article we'll look at some of the new "smart" software that is available to minimize the storage footprint of a given virtual machine strategy by building VMs on the fly, but for now we'll concentrate on some rather more straightforward but effective steps.
Thin and Dedupe Are In
An obvious step might be to embark on a parallel program of storage virtualization to try to ensure that the storage you have can be used as flexibly as possible, but Roy Illsley, a senior research analyst at the Butler Group, doubts that this is the best way to go.
"You certainly need SANs, but do you need storage virtualization? It's a moot point at the moment," he said. "I would contest that you would actually get far more valuable benefits by implementing some form of thin provisioning." This, at any rate, is the experience of many of the organizations Illsley has spoken to.
Thin provisioning mirrors server virtualization rather nicely: by eliminating or substantially reducing so-called stranded storage, which has been allocated but not used, organizations can dramatically increase their storage utilization, just as server virtualization can increase the utilization of the underlying physical servers. According to research carried out by ESG some time ago, about half of all companies waste about half of their storage capacity. Virtualization requires vast amounts of storage, and thin provisioning can help you get away with needing less by wasting less.
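To make the arithmetic concrete, here is a minimal Python sketch comparing the physical capacity needed when every allocation is reserved up front with the capacity needed when only written data consumes space. The volume names, sizes and the 20 percent growth headroom are hypothetical figures chosen for illustration, not numbers from ESG or any vendor.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    name: str
    allocated_gb: int  # capacity promised to the virtual machine
    used_gb: int       # capacity actually written so far


def thick_capacity_gb(volumes):
    """Physical capacity needed if every allocation is reserved up front."""
    return sum(v.allocated_gb for v in volumes)


def stranded_gb(volumes):
    """Capacity that has been allocated but not used."""
    return sum(v.allocated_gb - v.used_gb for v in volumes)


def thin_capacity_gb(volumes, headroom_pct=20):
    """Physical capacity needed if only written data consumes space.

    headroom_pct is an assumed growth buffer, not a recommended value.
    """
    used = sum(v.used_gb for v in volumes)
    return used * (100 + headroom_pct) / 100


# Hypothetical volumes, for illustration only.
volumes = [
    Volume("vm-web-01", allocated_gb=100, used_gb=40),
    Volume("vm-db-01", allocated_gb=500, used_gb=260),
    Volume("vm-test-01", allocated_gb=200, used_gb=50),
]

print("Thick provisioned:", thick_capacity_gb(volumes), "GB")
print("Stranded:", stranded_gb(volumes), "GB")
print("Thin provisioned:", thin_capacity_gb(volumes), "GB")
```

In this made-up case more than half of the allocated capacity is stranded, which is roughly the level of waste the ESG research describes.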
Bowker also suggested that data de-duplication should be a priority to reduce the storage needs of virtualization. Most storage vendors offer a de-duplication engine of some sort, although a balance needs to be struck between pure storage space savings and the performance hit that can result from extreme de-duping. If you've got hundreds of VMs all trying to access the same operating system file at the same time, and to save space you have only one copy of that file anywhere in storage, then clearly this could slow the VMs down substantially.
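The basic idea behind de-duplication can be sketched in a few lines of Python. This is a toy, fixed-size block scheme and not any vendor's engine. The 4 KB block size, the use of SHA-256 as a fingerprint and the tiny fake "disk images" are all assumptions made for illustration.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size; real engines vary


def dedupe_blocks(images, block_size=BLOCK_SIZE):
    """Store each distinct block once and keep a per-image list of fingerprints."""
    store = {}    # fingerprint -> block contents
    recipes = {}  # image name -> ordered list of fingerprints
    for name, data in images.items():
        fingerprints = []
        for offset in range(0, len(data), block_size):
            block = data[offset:offset + block_size]
            fingerprint = hashlib.sha256(block).hexdigest()
            store.setdefault(fingerprint, block)  # keep only the first copy
            fingerprints.append(fingerprint)
        recipes[name] = fingerprints
    return store, recipes


def restore(name, store, recipes):
    """Rebuild an image from the shared block store."""
    return b"".join(store[fp] for fp in recipes[name])


# Three hypothetical VM images sharing the same two "operating system" blocks.
os_blocks = b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE
images = {
    "vm1": os_blocks + b"1" * BLOCK_SIZE,
    "vm2": os_blocks + b"2" * BLOCK_SIZE,
    "vm3": os_blocks + b"3" * BLOCK_SIZE,
}

store, recipes = dedupe_blocks(images)
raw_bytes = sum(len(data) for data in images.values())
stored_bytes = sum(len(block) for block in store.values())
print(f"Raw: {raw_bytes} bytes, stored after de-duplication: {stored_bytes} bytes")
```

Note that all three images now read their operating system blocks from the same two stored copies, which is exactly where the contention described above would arise at scale.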
Controlling VM Sprawl
But aside from the move from local to networked storage, there's another reason why virtualization can make storage requirements explode. Without some form of control, it can be far too easy to call a new virtual machine into existence in a way that is simply not possible with a physical system. When virtual machines can be built at the touch of a button, the simple fact is that they will be, especially in development labs and for testing purposes, but also for use in a full production environment. If you're not very careful, then VMs will sprout up all over your data center, created by IT staff for their own (often perfectly legitimate and productive) purposes. And, of course, these VMs will require storage resources.
To compound the problem, virtual machines can be very difficult to inventory. In the past, it was possible to send an IT rookie around the data center with a pencil and paper to count and identify every server he or she could find. But in the absence of a lifecycle management system like VMware's Lifecycle Manager or Microsoft's System Center Virtual Machine Manager to keep track of the virtual machines that have been created and to ensure they are deleted when they are no longer required, you can easily get into a situation where you have no idea how many there are, who made them, whether they are still needed, and whether anyone even remembers that they exist. But if they exist, they are taking up storage. (There are also software license implications of an undisciplined virtual machine environment, but that's another story.)
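The kind of check a lifecycle tool automates can be sketched in Python as follows. This does not use the API of VMware's or Microsoft's products. The record fields, the 90-day idle threshold and the sample inventory are all hypothetical, and a real tool would pull this data from the virtualization platform.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class VMRecord:
    name: str
    owner: Optional[str]  # None means nobody has claimed the VM
    last_used: date
    disk_gb: int


def find_cleanup_candidates(inventory, today, max_idle_days=90):
    """Return VMs that have no known owner or have sat idle past the threshold."""
    cutoff = today - timedelta(days=max_idle_days)
    return [vm for vm in inventory if vm.owner is None or vm.last_used < cutoff]


# Hypothetical inventory, for illustration only.
inventory = [
    VMRecord("build-agent-07", "dev-team", date(2008, 9, 20), 60),
    VMRecord("test-xp-sp2", None, date(2008, 8, 30), 40),
    VMRecord("demo-crm", "sales-eng", date(2008, 3, 1), 120),
]

candidates = find_cleanup_candidates(inventory, today=date(2008, 9, 30))
reclaimable_gb = sum(vm.disk_gb for vm in candidates)

for vm in candidates:
    print(f"Review {vm.name}: owner={vm.owner}, last used {vm.last_used}")
print(f"Potentially reclaimable storage: {reclaimable_gb} GB")
```

The output is a review list, not a delete list: an idle or unowned VM may still be needed, so a person should confirm before anything is removed.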
Virtual machine lifecycle management software can also help keep storage requirements in check by controlling the configurations of the virtual machines that are created (to ensure, for example, that they are not allocated unnecessary internal storage) and by assigning chargeback metrics to virtual machine deployments, ensuring that departmental managers have incentives to minimize or eliminate the unnecessary use of virtual machines by their staff.
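A chargeback metric can be as simple as billing each department for the storage its virtual machines have been allocated. The Python sketch below assumes a flat monthly rate per gigabyte, expressed in cents to keep the arithmetic exact. The rate, the department names and the allocations are hypothetical.

```python
from collections import defaultdict


def storage_chargeback(vm_allocations, rate_cents_per_gb):
    """Total the monthly storage charge, in cents, for each department.

    vm_allocations is a list of (department, allocated_gb) pairs, one per VM.
    """
    totals = defaultdict(int)
    for department, allocated_gb in vm_allocations:
        totals[department] += allocated_gb * rate_cents_per_gb
    return dict(totals)


# Hypothetical allocations and rate, for illustration only.
vm_allocations = [
    ("engineering", 200),
    ("engineering", 100),
    ("marketing", 80),
]

charges = storage_chargeback(vm_allocations, rate_cents_per_gb=50)
for department, cents in sorted(charges.items()):
    print(f"{department}: ${cents / 100:.2f} per month")
```

Charging on allocated space, not on space actually used, is a deliberate choice in this sketch, since it gives managers a reason to question oversized or forgotten virtual machines.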
To keep on top of the storage requirements of your virtualization strategy, then, you're going to have to manage your storage tightly and manage the lifecycle of your virtual machines. This can be done mostly by software but, as Illsley points out, don't forget that virtual machines need people managing them too.
"The problem is that in the past, the app team looked after apps and the server team looked after servers," he said. "But who's looking after the virtual machines?"
It's not the storage team's job, but if you don't know whose it is, then you could be in for trouble.
Paul Rubens is an IT consultant and journalist based in Marlow on Thames, England. He has been programming, tinkering and generally sitting in front of computer screens since his first encounter with a DEC PDP-11 in 1979.
This article was originally published on Enterprise Storage Forum. Amy Newman will return to Virtually Speaking next week.