Disaster Recovery Methods Expand
April 4, 2007
The bill for any kind of man-made or natural disaster can be staggering. From 2003 to 2006, more than 16 weather-related disasters in the United States caused more than $1 billion worth of damage, according to the U.S. Department of Commerce's National Climatic Data Center.
Not surprisingly, organizations throughout the country are taking steps to minimize the risk and safeguard assets. But the risks now extend beyond the traditional boundaries of disaster recovery (DR).
"When it comes to DR planning, the challenges are mounting for CIOs," says Fred Moore, an analyst at Horison Information Strategies in Boulder, Colo. "They've added concerns ranging from widespread loss of electrical power to the growing intrusion threats from hackers to employee sabotage and terrorist attacks."
Take the case of Fidelity Bank of Edina, Minn. When the bank analyzed its DR needs, it became obvious that the only way to reach its recovery objectives was to pay for space at a remote location for replicating images and data. But Fidelity Bank soon found that establishing a dedicated DR facility wasn't enough. Once it opened that site, it had to confront a wide range of networking and bandwidth issues before it could replicate business data efficiently.
"We started sending 40 to 45 GB of data per day between our main bank and collocation facility," says Rick Erickson, assistant network administrator at Fidelity Bank. "As the volume of data increased, we realized that we would have to double or triple our bandwidth to accommodate demand."
Rather than replicating entire server images across the WAN, Fidelity Bank was forced to send only data. Thus, if a disaster occurred, it would be forced to rebuild the lost server from recovered data. As this considerably slowed the process, it defeated the purpose of rapid recovery.
Thus, the organization began to look at WAN optimization solutions. It ultimately selected NX appliances by Silver Peak Systems in Santa Clara, Calif. The appliances sit between network resources and the WAN infrastructure used to connect them to remote users. The idea is to reduce the amount of data traversing the WAN and improve application performance. In addition, the technology protects stored data using encryption.
Using these tools, the bank consistently sees a 94 percent data reduction across the WAN, and 1 GB transfers have been reduced from 70 minutes to 4 minutes.
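The arithmetic behind those figures is worth a quick look. The short Python sketch below uses only the numbers quoted above and assumes decimal units (1 GB = 8,000 megabits). It also assumes, purely for illustration, that the 70-minute and 4-minute transfer times would hold across the bank's full daily volume; the article does not state the bank's actual link speed or replication schedule, and the function names are hypothetical.

```python
# Figures quoted in the article.
MINUTES_PER_GB_BEFORE = 70   # 1 GB transfer time before WAN optimization
MINUTES_PER_GB_AFTER = 4     # 1 GB transfer time after WAN optimization
DATA_REDUCTION = 0.94        # reported reduction in data crossing the WAN
DAILY_VOLUME_GB = 45         # upper end of the 40 to 45 GB sent per day


def effective_mbps(gigabytes: float, minutes: float) -> float:
    """Application-level throughput in Mbps (assumes 1 GB = 8,000 megabits)."""
    return gigabytes * 8000 / (minutes * 60)


def hours_for_volume(gigabytes: float, minutes_per_gb: float) -> float:
    """Hours needed to move a volume at a fixed minutes-per-GB rate (an assumption)."""
    return gigabytes * minutes_per_gb / 60


# How much faster a transfer completes, and how much less data hits the wire.
speedup = MINUTES_PER_GB_BEFORE / MINUTES_PER_GB_AFTER
wire_factor = 1 / (1 - DATA_REDUCTION)

print(f"Effective rate before: {effective_mbps(1, MINUTES_PER_GB_BEFORE):.2f} Mbps")
print(f"Effective rate after:  {effective_mbps(1, MINUTES_PER_GB_AFTER):.2f} Mbps")
print(f"Transfer speedup: {speedup:.1f}x, wire data cut by about {wire_factor:.1f}x")
print(f"Daily volume before: {hours_for_volume(DAILY_VOLUME_GB, MINUTES_PER_GB_BEFORE):.1f} hours")
print(f"Daily volume after:  {hours_for_volume(DAILY_VOLUME_GB, MINUTES_PER_GB_AFTER):.1f} hours")
```

Under these assumptions, a 94 percent reduction means roughly one-seventeenth of the original data crosses the WAN, which is in line with the 17.5-fold drop in transfer time. At the old rate, 45 GB would take more than two days to move, while at the new rate it would take about three hours.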
Planning, then, is a critical aspect of any DR endeavor. And it's one of the hardest parts to get right.
"A wrong guess can cost significant time and money," says Erickson. "You find out through the planning process what applications have to be up sooner than later."
Fidelity found out, for example, that e-mail was not as necessary as the IS organization originally thought. But some simple applications it didn't rate highly were found to be absolutely crucial to rapid recovery. Thus, there is no substitute for taking a fresh look at the entire infrastructure during DR planning to evaluate what is vital and separate that from what is in reality non-essential.
Chip Nickolett, principal of systems integration firm Comprehensive Consulting Solutions of Brookfield, Wis., has noticed SAN mirroring becoming more and more common as a part of the business continuity plans he's helping to implement. The reason is simple: Few firms can afford to be down one to three days while they execute a DR plan.
But even with sound planning, says Nickolett, there is no substitute for a plan that has been tested so thoroughly that it actually performs when required. He tells the tale of a severe failure at a customer's main computer system. A part was needed that would have kept production down for 24 hours or more. Luckily the data was on a SAN.
"We took a small development system, executed their DR plan (making changes on the fly to accommodate the smaller system), and had them up and running in approximately four hours," says Nickolett. "The DR plan allowed us to walk through everything that was needed, without forgetting anything, and restoring production systems in a relatively short period of time."
This example highlights the essence of modern recovery strategies: velocity. There is no longer time to hunt through piles of tape or to wait for backup tapes to be driven from offsite storage to a recovery facility. Things must be ready instantly.
"The best prepared organizations recognize that speed is essential in recovering from whatever disaster may come to pass," says Moore.
That, says Greg Schulz, an analyst at StorageIO in Stillwater, Minn., is why remote data replication has become so important. Such technology is now making its way down the food chain and is becoming affordable for small and mid-sized organizations.
"Many entry level storage systems are also supporting some form of remote replication built into the solution, as opposed to requiring external appliances or host based software," says Schulz. "The reason why is that small and medium size business data is just as much exposed and at risk as larger environments."
He also points to another trend: the rise of bandwidth and latency optimization techniques, such as those used by Fidelity.
"WAN optimization continues to be popular as people understand issues and requirements centered around data movement and effective bandwidth," says Schulz. "Continuous data protection (CDP) has gotten a lot of hype; however, adoption has been limited. That should change with more organizations leveraging it with data backup as part of an overall data protection strategy."
Not Just IT Gear
One final trend bears mentioning. It's no longer enough to protect servers, storage and networking gear alone. The entire infrastructure requires attention.
The Children's Medical Center in Dayton, Ohio, for example, has had to back up its cooling system.
Chuck Rust, senior network analyst at the facility, explains that Children's Medical Center originally had two AC units. It ran one at a time, with the other placed on standby. As the data center became overcrowded, it added a third AC unit, running two at a time, with one remaining on standby. When the medical center itself expanded, the facility opened another data center in an adjacent office space directly behind the original one. Both data centers have a mixed environment of Unix, Windows and Novell servers. Each includes an EMC SAN, and all equipment is connected via a Cisco backbone. This expansion meant all three AC units needed to be online simultaneously, leaving none on standby.
Thus, the medical facility brought in an InfraStruXure system from APC Corp. of West Kingston, R.I. This unit uses chilled water to deliver cool air directly to the racks. Rust reports he has gone from one or two heat-related failures every few months to no failures in five months.
"The hot spots have definitely diminished," says Rust. "Whereas the room used to feel hot beside certain servers, we don't have that problem anymore."
DR, then, goes way beyond IT. That's why any smart IT manager coordinates closely with facilities, HR, operations and top management before finalizing any DR plan. The good news is that such thorough coordination is actually occurring in more and more companies.
"Planning for data recovery resulting from problems with hardware, software, people, intrusion, theft and natural disasters has never existed before at this level of intensity," says Moore.