In addition to being used in traditional high-performance computing and failover applications, specialized clusters are making their way into the storage server market.
“Traditional clusters have trouble scaling and hit bottlenecks when trying to create thousands of files every minute,” says Illuminata analyst David Freund.
He says that a clustered file system breaks down under that kind of load. One remedy is to parallelize the structure, an approach HP takes with its HP StorageWorks Reference Information Storage System (RISS) active-archiving solution for e-mail and Microsoft Office documents (later releases will address other file types). RISS addresses scalability and performance issues by distributing content across a grid of storage units called “smart cells,” which are connected in a peer-to-peer fabric. EMC and network-attached storage (NAS) vendor Network Appliance also sell clustering products.
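To make the idea of spreading content across many storage units concrete, here is a minimal Python sketch of hash-based placement. It is an illustration only and does not describe how RISS actually assigns content to smart cells; the function names, the document IDs, and the choice of eight cells are all hypothetical.

```python
import hashlib
from collections import Counter


def pick_cell(doc_id: str, num_cells: int) -> int:
    """Map a document ID to a cell index using a stable hash.

    Hypothetical placement rule for illustration; not HP's algorithm.
    """
    digest = hashlib.sha256(doc_id.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then wrap
    # it into the range of available cells.
    return int.from_bytes(digest[:8], "big") % num_cells


def distribute(doc_ids, num_cells):
    """Count how many documents land on each cell."""
    return Counter(pick_cell(d, num_cells) for d in doc_ids)


if __name__ == "__main__":
    # 10,000 made-up document IDs spread over 8 made-up cells.
    docs = [f"mail-{i}" for i in range(10_000)]
    load = distribute(docs, 8)
    for cell in sorted(load):
        print(f"cell {cell}: {load[cell]} documents")
```

Because each document's destination depends only on its own ID, no single server has to sit in the path of every write, which is the bottleneck Freund describes. A simple modulo rule like this one moves most documents when cells are added, so a production system would likely use a more careful placement scheme.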
Products from smaller firms shouldn’t be overlooked. For example, the Lustre file system from Boulder, Colo.-based Cluster File Systems is used by firms such as HP and SGI, and Panasas of Fremont, Calif., sells storage clusters with throughput as high as 10 GBps and 300,000 I/O operations per second.
“When you have a computer cluster with 10,000 files created, you might have 1,000 storage servers attached,” says Freund. “A modular approach makes a better approach than big iron.”
Out of the Laboratory
As any horror-film buff knows, creatures developed in the lab always manage to escape, and Beowulf is no exception. Like a movie monster, it mutated into a variety of forms and infiltrated society. Beowulf has already taken over the supercomputer market and is now going for complete domination. The list of companies currently manufacturing or supporting clustered architectures includes most of the major hardware and software vendors. IBM, Sun, HP, and Dell all offer server clusters with associated software, wiring, and networking.
“HP, IBM, and Dell are focused on providing a solution in a box or a rack,” says IDC’s Humphreys. “Drop in the racks and you are off and running.”
Despite their use of off-the-shelf hardware, early clusters were difficult to manage, which limited their usefulness and market penetration. Now the technology has reached near plug-and-play functionality in some cases and no longer requires a team of experts to operate.
Although clusters generally run on Linux, even Microsoft is getting into the picture with its 64-bit Windows Compute Cluster Server 2003, currently in a Beta 2 release. A number of hardware and software vendors have based their business models on clustering. Two examples are Myricom of Arcadia, Calif., which makes 10G Ethernet and Myrinet cluster interconnects, and Linux Networx of Bluffdale, Utah, which builds supercomputing clusters.
“The picture is decidedly fragmented,” says Nist. “There are a number of open source cluster management projects that came out of the academic high performance computing space, many ISV and captive general server management software suites, a few commercial vendors focused on grid architecture, and a very small number of commercial cluster management players, such as Penguin Computing and Scyld Software, that offer commercial-grade solution stacks.”
As clustering technology develops, it is reaching into new areas.
Penguin Computing, for example, recently released the Penguin Personal Cluster, which packs 6 to 24 CPUs delivering up to 200 Gflops into a workstation form factor and is managed as simply as a workstation.
“Because clusters are going mainstream, Penguin Computing has created a high performance Linux cluster to-go,” says Nist. “It delivers powerful, scalable, and easy-to-use supercomputing resources in any place that customers need including project teams, departmental computing, and highly productive individual contributors.”