Cutting Through the Clustering Haze: An Old Idea Is Coming of Age

By Carl Weinschenk
Posted Mar 24, 2003


Server clustering is not a new idea, and like most concepts that have been around for a long time, it is shrouded in haze. As the concept picks up steam, it is important to cut through hazy language and commonly confused concepts to learn what clustering can do and where it is headed. Carl Weinschenk explains.

It is not a difficult concept to understand, at least on the surface. Clustering is one of several nascent techniques aimed at linking together and thus better harnessing equipment. "Server clustering takes multiple physical servers and links them together through one of many clustering architectures for the purpose of distributing the work across the cluster," says Ira Kramer, the director of product marketing for InfiniCon Systems, a provider of cluster management equipment and services.

The haze is still understandable. Technical terms often lose their precision when they are batted back and forth between the engineering and marketing departments. In this case, there are a number of similar and related concepts -- such as fabric, grid, pervasive and mesh computing -- vying for the attention of those trying to increase the efficiency of computing infrastructures. Though they mean different things, the terms are often used interchangeably.

Clusters are composed of servers linked by input/output (I/O) interconnects. They are connected to storage media and administered by distributed resource management (DRM) software. All these pieces are changing: Blade servers, fast InfiniBand I/O technology and more sophisticated DRM software are combining to make clustering a more useful tool for IT managers.

"Clustering is definitely going more mainstream," says Reier Torgerson, the solution manager for high available products for Vision Solutions. "The amount of time systems have to be up is increasing. We also see that downtime windows -- both planned and unplanned -- must be smaller and smaller."

Different Definitions

Companies name and use these concepts differently, so precise definitions should be taken with a grain of salt. Sun's take focuses on grids and clusters. Peter ffoulkes, the planning and strategy manager for the company's high performance technical computing group, defines a cluster as a localized grouping of computers or servers. The grid is the matrix in which that cluster resides, and it can span geographic locations. For instance, ffoulkes says, Sun's grid spans its California, Texas and Massachusetts locations. The company's DRM software can send a job from one cluster to another for processing if that makes the most sense. Another DRM software implementation then runs the job locally.
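The job routing ffoulkes describes can be pictured with a minimal sketch. It is not Sun's DRM software: the `Cluster` class, the `place_job` function, the CPU counts and the "prefer local, otherwise pick the cluster with the most free CPUs" rule are all assumptions made for illustration. Only the three site names come from the article.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Cluster:
    """A localized group of servers (hypothetical model, not a real DRM API)."""
    name: str
    total_cpus: int
    busy_cpus: int

    @property
    def free_cpus(self) -> int:
        return self.total_cpus - self.busy_cpus


def place_job(cpus_needed: int, local: Cluster, grid: List[Cluster]) -> Optional[Cluster]:
    """Pick a cluster for a job: stay local if possible, else go elsewhere on the grid.

    Assumed policy: among remote clusters with enough room, choose the one
    with the most free CPUs. Returns None if no cluster can take the job.
    """
    if local.free_cpus >= cpus_needed:
        return local
    candidates = [c for c in grid if c is not local and c.free_cpus >= cpus_needed]
    if not candidates:
        return None
    return max(candidates, key=lambda c: c.free_cpus)


# Illustrative numbers only.
california = Cluster("california", total_cpus=64, busy_cpus=60)
texas = Cluster("texas", total_cpus=64, busy_cpus=20)
massachusetts = Cluster("massachusetts", total_cpus=32, busy_cpus=4)
grid = [california, texas, massachusetts]

chosen = place_job(16, local=california, grid=grid)
print(chosen.name if chosen else "queued")  # prints "texas"
```

In this toy run the California cluster has only four free CPUs, so the 16-CPU job is routed to Texas. A real scheduler would also weigh factors such as data location and network cost, which this sketch ignores.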

HP focuses on the independence of the grid concept. "A grid is a loosely coupled set of machines," says Dan Cox, a manager of Linux cluster programs for an HP unit. "It is very distributed, very independent. It is not as precisely configured as a cluster."

The new economic realities -- even after the near-term downturn ends -- are changing the business model in a way that favors clustering. Increasingly, IT departments are moving from being cost centers to service centers, says Marty Ward, the director of product marketing for Veritas Software.

A service center model, in which IT charges other departments for the amount of resources they use, demands more agility and efficiency in how IT manages its resources. This dovetails with a technology that extends existing resources. InfiniCon CEO Chuck Foley points to a Sun study that says clustering can increase server efficiency from 15 percent to 80 percent. "To move to a service model you have to have automated management of resources and infrastructure," Ward says. "The technology is there to be able to do that." Though ffoulkes is not familiar with the specific study to which Foley referred, he says the numbers sound reasonable.
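To get a feel for what the 15 percent and 80 percent figures could mean in practice, here is a back-of-the-envelope sketch. The 100-server starting point and the `servers_needed` function are assumptions for illustration, and the calculation treats the workload as perfectly divisible across servers, which real workloads rarely are.

```python
def servers_needed(current_servers: int, current_util_pct: int, target_util_pct: int) -> int:
    """Servers required to carry the same work at a higher utilization.

    Work is measured in "server-percent" units so the arithmetic stays in
    integers; the result is rounded up because servers come in whole units.
    """
    work = current_servers * current_util_pct
    return -(-work // target_util_pct)  # ceiling division


# 100 servers is a hypothetical fleet size; 15 and 80 are the article's figures.
print(servers_needed(100, 15, 80))  # prints 19
```

Under these simplifying assumptions, the work of 100 servers running at 15 percent would fit on 19 servers running at 80 percent. The estimate leaves out headroom for peak loads and failover.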

The Two Clusterings

There are, in reality, two discrete uses for clustering. "One is to increase the availability of a particular application or service they are running," says Lee Johns, a director of a division of HP's industry standard services global business unit. "The other is to increase the performance of an application. Those two have very distinct requirements."

Redundancy is a growing area for clustering, says Johns. "On a basic level, nobody wants an application to fail," he says. "As the costs of hardware comes down, the cost of protecting yourself becomes more and more affordable."

Clustering's other mandate is to apply more horsepower to a particular application or problem. The goal is to process more information in a set amount of time or the same amount more quickly. There are two main approaches in this area: "Scaling out" refers to distributing the workload of a given application among servers. "Scaling up" focuses on the ability to add computing power to a single server profile doing the work.

It is possible to use the two approaches simultaneously, ffoulkes says. For instance, an automobile manufacturer may want to use a cluster to help quickly solve complex problems in the design of a new car. Simultaneously, it may want failover protection for the database underlying the project so that highly paid engineers are not sitting idle while important deadlines slip. These two cluster operations would be run independently, ffoulkes says. "The software for high availability is totally different than the software for throughput," he says.

The push given clustering by the economy is being augmented by technical innovation. The goal is to give IT managers the ability to cluster more fluidly. Historically, redundant clustering has been predicated on a one-to-one match between servers. New software approaches are making it possible to fluidly change the proportion of backup to primary servers.

This enables enterprises to become shrewder in how they deploy clusters, says Jason Buffington, the director of business continuity for NSI Software. Along with technology that breaks the one-to-one straitjacket comes the concept of assigning redundancy on an as-needed basis, as determined by the enterprise itself.

"Say you have 100 servers," Buffington says. "Everyone probably agrees that two or three are critical and always need to be up, while eight or 10 might be key to some [individual] department. The biggest misconception is that the only thing available is one-to-one, so that in most cases it's not worth it."
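Buffington's example can be turned into a rough count of backup machines. The sketch below is illustrative only: the `backups_required` function, the tier sizes (3 critical, 10 departmental, 87 unprotected) and the five-to-one sharing ratio for departmental servers are assumptions chosen to fit his ranges, not figures from NSI Software.

```python
def backups_required(tiers):
    """Count backup servers for a list of (server_count, primaries_per_backup) tiers.

    primaries_per_backup == 1 means classic one-to-one protection;
    a larger value means several primaries share one backup;
    0 means the tier is left unprotected.
    """
    total = 0
    for count, primaries_per_backup in tiers:
        if primaries_per_backup > 0:
            total += -(-count // primaries_per_backup)  # ceiling division
    return total


# Protect every one of the 100 servers one-to-one.
one_to_one = backups_required([(100, 1)])

# Assumed tiering: 3 critical servers one-to-one, 10 departmental servers
# sharing backups five-to-one, and the remaining 87 left unprotected.
tiered = backups_required([(3, 1), (10, 5), (87, 0)])

print(one_to_one, tiered)  # prints 100 5
```

With these assumed ratios, protecting only what matters takes five backup servers instead of 100, which is the gap Buffington says the one-to-one misconception hides.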

HP's Cox agrees that the IT managers are now in the position to make decisions. "It all comes down to application criticality," he says.

Parallel Advances

The additional flexibility is not all that has changed. According to Foley, server blades, which enable hundreds or even thousands of servers to sit in a room, are especially well suited to fabric-type approaches such as clustering. Further, the InfiniBand technology used to link servers to each other and to storage is, at 10 Gbps, far faster than previous interconnects. Cox points to Myrinet and 10 Gigabit Ethernet as other advanced interconnects. Finally, databases such as Oracle's 9i RAC and IBM's DB2 EEE were designed with fabric environments in mind. "The biggest difference is the ability of a single application to span multiple servers and to keep in sync at larger and larger [server] levels," Foley says.

Server clustering is also becoming more flexible in another way. The proliferation of operating systems through the enterprise means that clusters must become ecumenical. "One of the things we are seeing is more cross-platform or multi-OS environments," Torgerson says. "We think that the next big trend in clustering is how to coordinate clustering. For instance, an OS 400 cluster with a cluster on Microsoft or Linux/Unix environment to deal with integrated or distributed applications."

Cox says that different OSes always will be separate. However, they will be tied together through shared storage media.

Other advances are in the wind. ffoulkes says that the grid will eventually go international. Thus, a computer operation called for in New York City may be done in Beijing. Veritas' Ward says that Veritas is working on autodiscovery technology that automatically recognizes and appropriately reconfigures servers being brought into the cluster.

Foley is optimistic about clustering. "We as a technology world have fallen in love with servers doing work for us," he says. "[But] to have an application tied to a specific server or a database tied to a given server has become far too limiting."
