Hardware Today: Server Admins SMASH Servers
November 28, 2005
Server management can be a hassle, as most companies accumulate servers from multiple vendors and end up with a scenario along the lines of a dozen Dell boxes, 15 HP servers, 20 IBM systems, 10 white boxes, and half a dozen old Compaqs lurking somewhere in the server room. As a result, organizations must jump from one management console to another to run scripts, all the while retaining a library of such scripts for every vendor and sometimes for every type of server.
"The typical administrator spends his/her day dealing with one crisis after another; it's alert, diagnose, treat, and on to the next," says John Fruehe, enterprise marketing manager for Dell Product Group. "Through a common and consistent interface the administrator can become more productive and can spend more time adding value back to the organization instead of learning the tools."
Such an interface is now available in a standard known as Systems Management Architecture for Server Hardware, or SMASH. Some of the benefits of SMASH include reducing the management burden of server hardware, cutting the cost of server administration, improving the reach of system administrators to remotely located servers, and standardizing the management of heterogeneous environments.
"In a nutshell, SMASH makes it much easier to manage servers regardless of the vendor," says Winston Bumpus, president of the Distributed Management Task Force (DMTF), a Portland, Ore.-based not-for-profit made up of representatives from hardware and IT vendors, such as IBM, HP, Dell, Cisco, Microsoft, and Intel that developed the SMASH standard. "SMASH also makes it possible to talk to the machine regardless of the state of the OS."
The DMTF recently published SMASH v1.0, and products that implement it will begin shipping in the early part of 2006. They include add-in management cards, appliances, and system software. The intention is to make hardware management much easier and therefore lower the cost of operations. This is achieved by standardizing the interfaces each hardware vendor provides. The result is command-line consistency.
"SMASH should help server managers avoid being locked in to a particular vendor through their management software," says Jeffrey Hewitt, a research director at Gartner. "By standardizing the interface and the command line, it should facilitate the integration of management tools with the servers themselves, which should offer an opportunity to lower costs."
SMASH builds on the foundation that the Intelligent Platform Management Interface (IPMI) laid earlier. IPMI defines a common and secure interface over the existing LAN connection for monitoring server hardware. This includes such features as temperature, voltage, and fan details. IPMI also controls various components to enable capabilities, such as remotely powering on/off and cycling individual servers or individual blades. There is even a way to view the log of important server events to track things like chassis intrusions or what caused the system to hang/reset. Finally, IPMI enables the IT department to remotely manage and recover failed servers. This includes the capability to remotely view the BIOS, operating system console, and reboot process.
"IPMI should facilitate lights out management of a server," says Hewitt.
According to Steve Rokov, director of marketing for the embedded software & solutions division of management software firm Avocent, the server vendor integrates IPMI. Some tools that use it are SOLProxy from Dell and SMBridge from IBM, as well as various third-party applications, such as PowerCockpit from Mountain View Data. IPMI comes free with servers and is now in more than 50 percent of all server shipments. The low- to mid-level priced rack-based servers from IBM and Dell (including their blade systems) all have IPMI.
"It's not often you get things for free, so take advantage of what the server vendors are doing to make their lives easier today," says Rokov.
Think of SMASH, then, as the scriptable interface to IPMI inside the box. It is a more recent development than IPMI, and it gives administrators a consistent command-line interface for server monitoring and management tasks in heterogeneous server environments. The DMTF released this command-line protocol (CLP) standard in June 2005 to address the lack of a universal command-line syntax, so that systems from different vendors can be managed in the same way.
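To make the idea of command-line consistency concrete, here is a minimal sketch in Python of what "one script for every vendor" could look like. It is illustrative only: the hostnames are made up, and the verb-then-target command shape and the "/system1" target are assumptions about how a CLP implementation might present itself, not details taken from the vendors quoted here. The helper names are hypothetical.

```python
# Hypothetical inventory; hostnames and vendors are invented for illustration.
INVENTORY = [
    {"host": "mgmt-a.example.com", "vendor": "Dell"},
    {"host": "mgmt-b.example.com", "vendor": "HP"},
    {"host": "mgmt-c.example.com", "vendor": "IBM"},
]

# Assumed verb set for this sketch; a real implementation may support more.
ALLOWED_VERBS = {"show", "start", "stop", "reset"}


def build_clp_command(verb, target="/system1"):
    """Return a command string of the assumed form '<verb> <target>'."""
    if verb not in ALLOWED_VERBS:
        raise ValueError(f"unsupported verb: {verb!r}")
    return f"{verb} {target}"


def plan_commands(inventory, verb, target="/system1"):
    """Pair every host with the same command, regardless of vendor."""
    command = build_clp_command(verb, target)
    return [(server["host"], command) for server in inventory]


if __name__ == "__main__":
    # The vendor field is never consulted: that is the point of a common CLI.
    for host, command in plan_commands(INVENTORY, "reset"):
        print(f"{host}: {command}")
```

The sketch only builds the command plan. Actually sending the commands over a Telnet or SSH2 session is left out, since that depends on the management hardware in use.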
"By using servers with IPMI and SMASH, IT [departments] can realize lower operational costs by offering a consistent command line that changes little over time, thus reducing training requirements and the number of mistakes made," says Rokov.
Other savings, he says, result from needing fewer scripts to perform management tasks across multiple server vendors, having to buy fewer management tools, and predicting hardware failures (thus being able to schedule maintenance during non-peak hours).
SMASH in Action
How will SMASH help in the real world? Three likely scenarios (a large cluster, a branch office, and a mixed rack of 1U servers with blades) play out as follows.
Clusters: How do you diagnose and power cycle the various servers within the cluster when the operating system has hung? With so many machines running, it's difficult to spot issues. By configuring IPMI thresholds within the servers, potential heat and power issues can be logged and reported to management consoles, such as Avocent's DSView, ahead of meltdowns, providing time to fix the problem. This predictive alerting gives cluster managers a chance to keep compute cycles, and return on investment, at a maximum. When a server does hang, it can be power cycled by executing the "Power Cycle" SMASH script against the IPMI firmware in the servers using a Telnet or SSH2 session.
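The threshold idea in the cluster scenario can be sketched in a few lines of Python. This is a simplified model rather than how IPMI firmware actually evaluates sensors: the sensor names, the threshold values, and the two-level warning/critical scheme are all assumptions chosen for illustration, and only upper limits are modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    """Upper limits for one sensor (values here are illustrative)."""
    warning: float
    critical: float


# Hypothetical sensor names and limits, in degrees Celsius.
THRESHOLDS = {
    "cpu_temp_c": Threshold(warning=70.0, critical=85.0),
    "inlet_temp_c": Threshold(warning=35.0, critical=42.0),
}


def classify(sensor, value, thresholds=THRESHOLDS):
    """Return 'ok', 'warning', or 'critical' for a single reading."""
    limit = thresholds[sensor]
    if value >= limit.critical:
        return "critical"
    if value >= limit.warning:
        return "warning"
    return "ok"


def alerts(readings, thresholds=THRESHOLDS):
    """Collect (host, sensor, value, level) for every non-ok reading."""
    found = []
    for host, sensors in readings.items():
        for sensor, value in sensors.items():
            level = classify(sensor, value, thresholds)
            if level != "ok":
                found.append((host, sensor, value, level))
    return found
```

A warning level below the critical one is what makes the alerting predictive: the console hears about a node that is running hot while there is still time to act.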
Branch Office: Without local expertise or even personnel to keep an eye on systems, how do you fill the gap? By placing an appliance out at the branch, you can aggregate alerts and secure access at a single point. For example, an IPMI 'Server Security Alert' received at the head office indicates someone just popped open the server chassis. If that appliance supports SMASH, scripts can be run centrally, irrespective of the model or vendor of those branch servers. Opening a Telnet/SSH2 session to the SMASH appliance enables a health check to be run by pulling IPMI information. Additionally, an 'Inventory Scan' can be run using a SMASH script to identify whether changes have been made out in the field.
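The comparison step behind an inventory scan can be sketched as a diff between a stored baseline and the latest scan. The Python below is a hedged illustration: the component names and the flat name-to-description dictionaries are assumptions, and how the scan data is actually retrieved from the branch appliance is out of scope.

```python
def diff_inventory(baseline, current):
    """Compare two {component: description} mappings.

    Returns a list of (kind, component, detail) tuples, where kind is
    'added', 'removed', or 'changed'. Components are reported in sorted
    order so that repeated runs produce stable output.
    """
    changes = []
    for component in sorted(set(baseline) | set(current)):
        before = baseline.get(component)
        after = current.get(component)
        if before == after:
            continue
        if before is None:
            changes.append(("added", component, after))
        elif after is None:
            changes.append(("removed", component, before))
        else:
            changes.append(("changed", component, f"{before} -> {after}"))
    return changes


if __name__ == "__main__":
    # Hypothetical data for one branch server.
    baseline = {"dimm1": "1GB", "nic1": "gigabit"}
    current = {"dimm1": "2GB", "disk2": "73GB"}
    for change in diff_inventory(baseline, current):
        print(change)
```

An empty result means the branch hardware still matches the record held at the head office.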
Mixed Rack: IPMI and SMASH don't care whether they are being applied to a blade, a motherboard, a plug-in card, or the blade's chassis manager. To the administrator and his scripts, it all looks the same. So what happens when the rack experiences an "event"? A "Stream Console over LAN" SMASH script opens multiple operating system console sessions and records what the operating system consoles were doing right before the failure. The administrator can also set up thresholds to check on overall system hardware health.
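Recording what a console was doing "right before the failure" amounts to keeping a rolling window of recent output. The Python sketch below shows one way to do that with a bounded buffer. It is an assumption about the technique, not a description of how any vendor's console-streaming script works, and the class name and default window size are hypothetical.

```python
from collections import deque


class ConsoleRecorder:
    """Keep only the most recent lines of console output for one server."""

    def __init__(self, max_lines=200):
        # A deque with maxlen silently drops the oldest entry when full.
        self._lines = deque(maxlen=max_lines)

    def feed(self, line):
        """Store one line of console output, without its trailing newline."""
        self._lines.append(line.rstrip("\n"))

    def snapshot(self):
        """Return the retained lines, oldest first."""
        return list(self._lines)


if __name__ == "__main__":
    recorder = ConsoleRecorder(max_lines=3)
    for number in range(5):
        recorder.feed(f"line {number}\n")
    print(recorder.snapshot())
```

One recorder per console session keeps memory use fixed however long the servers run, and a snapshot taken when the event fires holds the last output from each machine.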
What to Expect
Early next year, new server systems and appliances that support both IPMI v2.0 and SMASH v1.0 are due on the market. IPMI v2.0 adds security enhancements involving authentication and encryption of the serial-over-LAN (SOL) connection as well as of remote BIOS and operating system console viewing. SMASH v1.0, on the other hand, will offer a standardized command line to the server functions provided by IPMI.
Some vendors, such as Dell and Avocent, are ahead of others in adding these features to their products. Dell has already integrated IPMI across the entire PowerEdge product line.
"In the future Dell will integrate other standards, like SMASH, across our entire line in order to make administrators' lives easier and more productive," says Dell's Fruehe.
Avocent, too, is adding SMASH to its products. In addition to DSView mentioned above, its embedded Virtual Presence Infrastructure (eVPI) product line is using SMASH to provide administrators with a consistent command-line interface for managing heterogeneous servers.