The Ins and Outs of Virtualized I/O
Hypervisors that control virtual servers have a complex job. They have to make each OS running on a server believe that it has the whole machine to itself. In reality, each of those operating systems shares part of a larger machine that supports many operating systems.
At first look, virtualized I/O seems a lot like virtualized memory. Look again.
With virtual memory, that job basically means maintaining a table that translates the memory addresses the OS expects into the memory the server hardware has really allocated to it. In most cases, the hypervisor maintains this translation table. In some newer hardware environments, that table can exist within the processor, easing the workload on the hypervisor and improving performance.
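As a rough illustration of the translation table described above, the following Python sketch maps guest page numbers to host page numbers. The class name, the 4 KB page size, and the page numbers are assumptions made for the example; they do not describe any particular hypervisor.

```python
PAGE_SIZE = 4096  # bytes per page; a common size, assumed for this example


class GuestAddressMap:
    """Hypothetical per-guest table: guest page number -> host page number."""

    def __init__(self):
        self._table = {}

    def map_page(self, guest_page, host_page):
        # Record which host page backs a given guest page.
        self._table[guest_page] = host_page

    def translate(self, guest_addr):
        # Split the guest address into a page number and an offset,
        # look up the host page, and rebuild the host address.
        guest_page, offset = divmod(guest_addr, PAGE_SIZE)
        if guest_page not in self._table:
            raise LookupError(f"guest page {guest_page} is not mapped")
        return self._table[guest_page] * PAGE_SIZE + offset


# Two guests both use "their" page 0, but land on different host pages.
vm_a = GuestAddressMap()
vm_a.map_page(0, 7)
vm_b = GuestAddressMap()
vm_b.map_page(0, 12)
```

Each guest sees the same address, yet the table sends it to a different place in the real machine, which is the illusion the hypervisor has to maintain.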
I/O is a different story entirely. When an OS needs to send data somewhere outside of its own environment, it must move that data to a device that in turn communicates with the outside world. That device might be a network card or it might be a storage adapter, but either way, the hypervisor must share a physical device between several different operating systems, each of which is designed to assume that it has the I/O path all to itself.
Initially, this seems to be another memory management problem. Operating systems generally handle I/O by sending data to a memory address using a DMA (direct memory access) operation. The OS gets data back by watching it appear at a specific memory location and passing it to whatever process requested it. It's the hypervisor's job to inspect the packet information for the identity of the process using it and make sure the data appears in the right spot for each virtual server.
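The sorting job described above can be sketched in a few lines of Python. This is only an illustration: the class, the use of a destination identifier as the lookup key, and the decision to drop unknown traffic are all assumptions, not a description of how any vendor's hypervisor works.

```python
from collections import defaultdict


class InboundDemux:
    """Hypothetical sorter that hands incoming data to the right guest."""

    def __init__(self):
        self._owner = {}                     # destination id -> VM name
        self.rx_buffers = defaultdict(list)  # VM name -> received payloads
        self.dropped = 0                     # data with no known owner

    def register(self, dest_id, vm_name):
        # Remember which virtual server owns a destination identifier.
        self._owner[dest_id] = vm_name

    def deliver(self, dest_id, payload):
        # Inspect the identifier and place the data in that guest's buffer.
        vm_name = self._owner.get(dest_id)
        if vm_name is None:
            self.dropped += 1
            return None
        self.rx_buffers[vm_name].append(payload)
        return vm_name


demux = InboundDemux()
demux.register("id-1", "vm_a")
demux.register("id-2", "vm_b")
demux.deliver("id-2", b"hello vm_b")
demux.deliver("id-9", b"nobody home")
```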
Of course, exactly how the hypervisor handles this movement of data depends on which company provides the software. "The I/O portion is in its own partition," explained IBM's Virtual Server Architect Bob Kovacs. "Its core purpose is I/O."
Kovacs said that the IBM I/O partition has two main functions, storage and networking. "In the case of storage, the OS is a storage virtualizer. It has the physical adapters, and it has storage devices that have been allocated to it," he said. According to Kovacs, the storage virtualizer can use nearly any kind of storage, from directly attached disks to network attached storage to storage area networks. But in all of these cases, the virtual server thinks that all of the storage is a standard SCSI disk.
"Devices use a standard SCSI stack," Kovacs explained. "That gives you the ability to share adapters and devices. Our boxes allow up to 256 partitions on the higher-end boxes. The reason this is important is that each operating system requires two slots, one for the NIC [network interface card] and one for storage."
VMware handles I/O somewhat differently than IBM does. Instead of using a separate partition and dedicated I/O system, the VMware hypervisor handles I/O itself. "We provide a virtual device or virtualized drivers for Intel E1000 and AMD VLANCE," said VMware's Chief Platform Architect Richard Brunner. He said that the company also provides a virtual device called VMXnet. "When these devices attempt to access physical hardware, all their writes are trapped to the hypervisor, which turns around and places the request in a stack in the hypervisor," said Brunner. He added that the hypervisor has a queue that the physical device driver then has to take care of.
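The trap-and-queue flow Brunner describes might look something like the Python sketch below. Every class and method name here is hypothetical and chosen for illustration; none of it is VMware's actual interface.

```python
from collections import deque


class Hypervisor:
    """Hypothetical hypervisor holding one queue of trapped requests."""

    def __init__(self):
        self.pending = deque()  # requests waiting for the physical driver

    def trap_write(self, vm_name, data):
        # A guest's write to its virtual device ends up here, not on hardware.
        self.pending.append((vm_name, data))


class VirtualNic:
    """Hypothetical virtual device that a guest believes is a real NIC."""

    def __init__(self, vm_name, hypervisor):
        self.vm_name = vm_name
        self._hypervisor = hypervisor

    def write(self, data):
        self._hypervisor.trap_write(self.vm_name, data)


class PhysicalDriver:
    """Hypothetical driver that empties the hypervisor's queue in order."""

    def __init__(self):
        self.sent = []

    def drain(self, hypervisor):
        count = 0
        while hypervisor.pending:
            self.sent.append(hypervisor.pending.popleft())
            count += 1
        return count


hv = Hypervisor()
nic_a = VirtualNic("vm_a", hv)
nic_b = VirtualNic("vm_b", hv)
nic_a.write(b"a1")
nic_b.write(b"b1")
nic_a.write(b"a2")
driver = PhysicalDriver()
drained = driver.drain(hv)
```

The extra hop through the queue is where the performance cost comes from: every write is handled twice before it reaches a real device.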
Of course, the entire process of placing read and write data into a queue and translating it for drivers that then send it to actual devices can hurt performance. For this reason, VMware, like IBM, is working on solutions that can bypass much of this overhead.
"We're developing new technology called VM Direct Path," Brunner said. "For well-behaved guests it is possible to allow them access the logical portion of the physical hardware." Brunner said that all of the operating systems VMware currently supports are sufficiently well-behaved to work with Direct Path.
"We can allow them to write directly to the queues," Brunner said, describing how VM Direct Path will work. He said it requires special hardware from AMD and Intel that provide an extra level of protection. "We need that because if a guest is going to talk to the hardware, that device is going to want to do a DMA to the guest. With this we have complete control over everything," Brunner explained.
While all of this I/O management with virtualized servers sounds fairly straightforward, it isn't. For one thing, network traffic and storage traffic operate differently. Networks, for example, use data packets designed to be mixed with traffic from other devices and sorted out at the other end. Storage, on the other hand, uses blocks of data and assumes that it has the device all to itself. And some I/O does both, such as iSCSI, which sends storage data over an Ethernet network.
"iSCSI storage will allow you to pass storage through your network card," said Andrew Hillier, CTO of CiRBA. He said that this allows both network traffic and storage traffic to use the same Ethernet network, although he noted this isn't necessarily a good idea. "All the VMs share the same pipe," Hillier said. "iSCSI can stress out your pipe. It never goes between servers, but straight to storage. You can overwhelm your NIC. You can overload your switch."
Hillier said that iSCSI traffic should use a separate network and separate infrastructure in a virtualized environment. Even better, he suggested, is to run servers that need to communicate heavily with each other on the same physical machine. "A lot of technologies will short circuit the network, and they only go between the VMs," Hillier explained. "In the mainframe world that's called HiperSockets." That way, storage traffic that would go from one server to a storage network will find the storage network running on the same physical machine, and will go there directly without ever hitting the network at all.
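The short circuit Hillier mentions can be pictured with a small Python sketch. The virtual switch class and its behavior are assumptions made for the example; real products differ in the details.

```python
class VirtualSwitch:
    """Hypothetical in-host switch that keeps local traffic off the wire."""

    def __init__(self, local_vms):
        self.local_vms = set(local_vms)
        self.local_deliveries = []  # traffic that never left the host
        self.uplink = []            # traffic sent to the physical NIC

    def send(self, src, dst, payload):
        # If the destination runs on this host, deliver it in memory.
        if dst in self.local_vms:
            self.local_deliveries.append((src, dst, payload))
            return "local"
        # Otherwise the traffic has to use the shared physical network.
        self.uplink.append((src, dst, payload))
        return "uplink"


switch = VirtualSwitch(["web", "db"])
first = switch.send("web", "db", b"query")         # both on this host
second = switch.send("web", "backup-san", b"blk")  # off-host destination
```

Only the second transfer touches the NIC and the switch that all of the host's VMs share.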
Hillier said that I/O, especially storage I/O, is complicated by the need to run backups. These massive data transfers can effectively monopolize a network. Fortunately, proper planning of backups and other predictable traffic can avoid this. "Most VM software has a way of working around backup traffic," Hillier said, but he added that dealing with this can be complex. "There are a lot of moving parts in this," he said. "With I/O there's almost an infinite number of ways you can set things up."
In the end, I/O on virtual machines is a lot like the plumbing in an apartment building. If everyone flushes at once, there's sure to be a mess. It's the job of the hypervisor or the virtual I/O handler to ensure those flushes happen in an orderly fashion so nothing gets overloaded, and everything goes through in a timely manner.
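One simple way to keep things orderly is round-robin scheduling, in which each virtual machine gets one turn per pass. The Python sketch below shows the idea; it is an assumed policy picked for illustration, not something the vendors quoted here said they use.

```python
from collections import deque


def round_robin(queues):
    """Return (vm, request) pairs, taking one request per VM per pass.

    `queues` maps a VM name to its list of pending requests.
    """
    active = deque((vm, deque(reqs)) for vm, reqs in queues.items() if reqs)
    order = []
    while active:
        vm, reqs = active.popleft()
        order.append((vm, reqs.popleft()))
        if reqs:
            # This VM still has work, so it goes to the back of the line.
            active.append((vm, reqs))
    return order


schedule = round_robin({
    "vm_a": ["a1", "a2", "a3"],  # a busy guest
    "vm_b": ["b1"],              # a quiet guest
})
```

The quiet guest is served on the first pass instead of waiting behind every request from the busy one.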
Wayne Rash is a freelance writer based in the Washington, D.C. area. He can be reached at firstname.lastname@example.org.