Big Box Appliance Buyer's Guide
December 22, 2010
2010 was the year of the big box appliance. Oracle Exadata II, Oracle Exalogic and EMC Greenplum are but a few recent examples. IT historians, of course, will point out that this market was begun by such firms as Netezza and Teradata.
For the purposes of this buyer's guide, only a handful of appliances will be highlighted, as there are so many from which to choose. IBM has Cloudburst, the Grid Medical System (GMAS) and its Information Archive (IA). But for this article only IBM's Netezza is covered in the data warehouse and analytics field. Oracle has two solutions covered, while only one Teradata appliance is featured. EMC Greenplum is also included.
"I like these big box solutions for some applications and deployment scenarios in that they should be easy to buy and install," said Greg Schulz, an analyst with StorageIO Group. "However, there can be a catch so look at what options you have for mixing and matching or reconfiguring of components after you buy the bundled solutions. In other words, what can you do or not do without violating terms of warranty or support ranging from reconfiguring hardware to changing software?"
Exadata II comes in quarter rack to full 42-unit rack configurations. You can pack in as many as two database servers, each with eight eight-core Xeon X7560 processors and 1 TB of memory, along with 2 TB SAS disks and 5.3 TB of Flash to serve as a fast cache, with InfiniBand providing a fast interconnect. Sales are said to be brisk.
"CEO Larry Ellison claims a $1.5 billion pipeline, and the top executives are visibly moving database and database machines to the center of the business," said Merv Adrian, an analyst at IT Market Strategy. "70 percent of Exadata customers so far are using the systems for data warehousing."
Oracle (NASDAQ: ORCL) is also making a big deal of the encryption and compression included in Exadata. On the compression side, Oracle is claiming 10-times compression.
One user is LinkShare of New York City, a provider of full-service online marketing solutions. It has a couple of Exadata II boxes acting as data warehouse and analytics engines for core applications.
"We moved from four racks into a half rack and from four times 13 kV of power usage to 6.6 kV," said Jonathan Levine, COO of LinkShare. "The flash-based cache has provided an eight to ten times performance boost."
Oracle's other big box is Exalogic. Packed into about one rack are 3 TB of RAM, 1 TB of SSD, 40 TB of SAN disk, a 4 TB read cache and 72 TB write cache. InfiniBand within the box provides 40 Gb/sec of bandwidth with latency of 1.2 microseconds. External networking is provided by 10 Gigabit Ethernet (10 GbE).
"This is by far the fastest machine for running Java applications," said Ellison. "Users can begin with a quarter rack and scale up to eight racks."
How to differentiate these Oracle products? Exadata focuses on data warehousing and data mining while Exalogic is used as the home for big Oracle applications. Users are advised to define their needs clearly to determine which path to follow.
Soon after Oracle released Exalogic, EMC came out with its alternative based on its acquisition of Greenplum.
"Greenplum is a solution to compete with solutions from IBM (Netezza), Teradata and Oracle among others in the data warehouse and business analytics market, which complement traditional database markets where EMC has supplied storage for in the past," said Schulz.
The Greenplum appliance uses a massively parallel processing architecture. The goal is to more efficiently deal with large amounts of data. EMC claims it has twice the data loading speed of its nearest competitor and the industry's best performance. It is based on the Greenplum Database and can load data at a rate of 10 TB an hour. EMC also boasted that this system is twice as fast as Oracle Exadata and five times as fast as systems from Netezza and Teradata. Users are advised to test such claims in their own environments or at least visit the vendor and view these boxes in action.
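For readers who want to check a load-rate claim against their own trial, the arithmetic can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the decimal definition of a terabyte and the sample figures (500 GB loaded in 15 minutes, a 50 TB warehouse) are assumptions made for the example, not vendor numbers. The only figure taken from the article is EMC's claimed 10 TB an hour.

```python
# Minimal sketch for comparing a measured load rate against a vendor claim.
# All names and sample figures here are hypothetical, for illustration only.

TB = 10**12  # decimal terabyte (assumption; some vendors quote binary units)

CLAIMED_RATE_TB_PER_HOUR = 10.0  # EMC's stated Greenplum load rate


def load_rate_tb_per_hour(bytes_loaded: int, elapsed_seconds: float) -> float:
    """Convert a measured load (bytes, seconds) into TB per hour."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return (bytes_loaded / TB) / (elapsed_seconds / 3600)


def hours_to_load(total_tb: float, rate_tb_per_hour: float) -> float:
    """Estimate how long a full load would take at a given rate."""
    if rate_tb_per_hour <= 0:
        raise ValueError("rate_tb_per_hour must be positive")
    return total_tb / rate_tb_per_hour


if __name__ == "__main__":
    # Hypothetical trial: 500 GB loaded in 15 minutes.
    measured = load_rate_tb_per_hour(500 * 10**9, 15 * 60)
    print(f"Measured rate: {measured:.2f} TB/hour")
    print(f"Share of claimed rate: {measured / CLAIMED_RATE_TB_PER_HOUR:.0%}")
    # Hypothetical 50 TB warehouse at the claimed and the measured rate.
    print(f"50 TB at claimed rate: {hours_to_load(50, CLAIMED_RATE_TB_PER_HOUR):.1f} hours")
    print(f"50 TB at measured rate: {hours_to_load(50, measured):.1f} hours")
```

A trial on representative data, rather than a vendor's demo set, gives the most useful comparison, since load speed depends heavily on data types, indexing and source systems.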
"The amount of data that businesses have at their disposal is greater than ever before, and new tools are required to gain business insight from it," said Bill Teuber, EMC vice chairman. "The EMC Greenplum technology provides the deep analytic processing that enables customers to leverage all of their data, taken from many sources, to make smarter decisions, faster."
The Old Guard
IBM recently bought Netezza and doesn't seem to be saying much about it so far.
The Netezza TwinFin system is the fourth-generation Netezza appliance aimed at the data warehousing and business intelligence markets. It integrates database, server and storage into one system, which scales up to petabytes with built-in compliance and security.
Teradata, meanwhile, remains independent and offers a series of appliances. The one covered here is the Teradata Data Warehouse Appliance. It features the Teradata Database, quad-core Intel processors, SUSE Linux and enterprise storage.
"Teradata data warehouse/business analytics solutions combine servers, storage and software tools," said Greg Schulz, an analyst at StorageIO Group.
Drew Robb is a freelance writer specializing in technology and engineering. Currently living in California, he is originally from Scotland, where he received a degree in geology and geography from the University of Strathclyde. He is the author of Server Disk Management in a Windows Environment (CRC Press).