Getting Back to Business

With the start of a new school year and the fourth-quarter crunch that befalls businesses, returning to basics never hurts. Many enterprises get caught up in the importance of advanced security and advanced tools and forget that sometimes it's the more day-to-day routine stuff that helps the most.

Security and advanced tools are cool, but day-to-day routines have an even bigger impact on server security and reliability.

I've recently noticed that many companies are neglecting some standard routine activities. These include:

  1. Backups
  2. Testing environments
  3. Password strength and timeouts
  4. Standardized desktops

If I've Said It Once ...

Let's start with backups. Yes, you do need them. They aren't just for events like 9/11 or Hurricane Katrina. They are for whenever you might lose data. This can include things like user errors, viruses, hardware failures and other regular, ho-hum disasters.

The first form of backup is a complete or full backup (sometimes referred to as an archival backup). This is the whole system, including the operating system and the data itself. What this means is that if a failure occurs you can recover the system in its entirety (as of the time of backup) in one shot. This makes for a faster recovery and is one of the reasons it's preferred.

The reality is, however, that this is no longer as viable a solution as it once was, because of the size of operating systems and the vast amount of data now stored on systems. Because of this, full backups often take a long time to complete.

Usually, this isn't a viable option for user desktops, as users cannot let their systems go down for long periods of time. Plus, users oftentimes have their systems with them (since laptops have become the new desktop for many corporations). It is more common to perform a full backup at one point and then follow up with incremental or differential backups. A full backup, however, should still be done regularly. Generally, once or twice a month is a good place to start, depending on how much data goes through your systems.

Now for the more periodic options, which include incremental backups or differential backups.

The incremental backup records only what has changed since the most recent backup of any kind, full or incremental. It is obviously a lot quicker than a full backup, but it is limited in what it records: changes in data. You may also decide to create system state backups/restores (http://support.microsoft.com/?kbid=315412) to ensure the registry and other system-critical settings are recorded as well. On the other side is the differential backup, which copies all data that has changed since the last full backup, regardless of any incremental runs in between. It, too, is far faster than a full backup, but it lengthens as time goes on and more data changes. The trade-off shows up at restore time: a differential restore needs only the full backup plus the latest differential, while an incremental restore needs the full backup plus every incremental taken since.
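To make the distinction concrete, here is a minimal sketch of how each type chooses what to copy, using the conventional definitions (incremental: changed since the most recent backup of any kind; differential: changed since the last full backup). The file names, dates and function names are invented for illustration, and real backup software typically tracks changes through archive bits or catalogs rather than a simple timestamp comparison.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class FileRecord:
    path: str
    modified: datetime  # last modification time of the file


def select_incremental(files, last_backup_of_any_kind):
    """Files changed since the most recent backup, full or incremental."""
    return [f.path for f in files if f.modified > last_backup_of_any_kind]


def select_differential(files, last_full_backup):
    """Files changed since the last full backup, ignoring later partial runs."""
    return [f.path for f in files if f.modified > last_full_backup]


# Hypothetical timeline: full backup on the 1st, an incremental on the 3rd.
last_full = datetime(2006, 9, 1)
last_incremental = datetime(2006, 9, 3)

files = [
    FileRecord("budget.xls", datetime(2006, 8, 20)),  # unchanged since the full
    FileRecord("report.doc", datetime(2006, 9, 2)),   # changed before the incremental
    FileRecord("notes.txt", datetime(2006, 9, 4)),    # changed after the incremental
]

incremental_set = select_incremental(files, max(last_full, last_incremental))
differential_set = select_differential(files, last_full)

print("incremental:", incremental_set)
print("differential:", differential_set)
```

In this made-up timeline the incremental run picks up only notes.txt, while the differential run picks up both report.doc and notes.txt, which is why differentials grow until the next full backup.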

We often associate backups with DLT tape backup. Today, with high-speed access, tape backup is a rather slow option. Other options/combinations can include the following:

  1. Backing up to a NAS/SAN: Since many corporations are likely using these storage options anyway, it makes sense to use them as the backup target.

  2. Tiered backup: Again, using a SAN/NAS for original disk storage means you can back up to a NAS/SAN first. In the tiered process you then move that data to tape later, so the slower disk-to-tape speed has no performance impact on the production system.

  3. Mirroring of SAN/NAS: Use that network side of the SAN. Mirror it across the city or the country with a provider who can look after the backup live.
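As a rough illustration of the tiered idea, the sketch below moves backup files that have aged past a threshold from a disk-tier directory into a staging directory, from which a separate job could write them to tape. The directory layout, the function name and the age threshold are all assumptions made for this example; a real SAN/NAS product would normally handle this through its own tiering policy.

```python
import shutil
import time
from pathlib import Path


def stage_for_tape(disk_tier: Path, tape_staging: Path, max_age_days: float) -> list:
    """Move backup files older than max_age_days from the disk tier to tape staging.

    Returns the names of the files that were moved, in sorted order.
    """
    cutoff = time.time() - max_age_days * 86400  # seconds per day
    tape_staging.mkdir(parents=True, exist_ok=True)
    moved = []
    # sorted() takes a snapshot of the listing before any file is moved.
    for item in sorted(disk_tier.iterdir()):
        if item.is_file() and item.stat().st_mtime < cutoff:
            shutil.move(str(item), str(tape_staging / item.name))
            moved.append(item.name)
    return moved
```

Run off-hours against the backup share, something like this keeps recent backups on fast disk for quick restores while older ones drain to the slower medium.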

Test and Test Again

Some of you may be reading this and saying, "But I back up! I don't need to do anything further!"


So when was the last time you tested the backup? That is, actually recovered it to a new system to see if it worked? Too often I see people run into issues and go to their backup, only to discover that it doesn't work.
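One simple way to check a test restore is to compare checksums of the restored files against the originals. The sketch below is only an assumption about how you might do that with two directory trees; the function names are made up, and on a busy system you would more likely compare against a checksum manifest captured at backup time, since the live data may have changed since.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def tree_digests(root: Path) -> dict:
    """Map each file's path (relative to root) to its digest."""
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in root.rglob("*")
        if p.is_file()
    }


def compare_restore(original: Path, restored: Path) -> dict:
    """Report files that are missing from, differ in, or are extra in the restore."""
    source = tree_digests(original)
    target = tree_digests(restored)
    shared = source.keys() & target.keys()
    return {
        "missing": sorted(source.keys() - target.keys()),
        "changed": sorted(name for name in shared if source[name] != target[name]),
        "unexpected": sorted(target.keys() - source.keys()),
    }
```

If all three lists come back empty, the restored tree matches the source byte for byte. It does not prove the restored system will boot, so a real recovery drill is still needed.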

Any system you have should undergo some testing, whether you are building it from scratch, performing updates or verifying that a procedure actually works. A prevalent assumption is that large manufacturers have been doing this long enough that they've tested for every scenario, and thus, it must work. This assumption is a house of cards.

Computers are still built with the basic premise of CPU, memory, disks and, more recently (that is, in the last 10 to 15 years), networking components. Yet, we don't test hardware or software. Testing must be done for a variety of situations:

  1. The company has purchased new hardware: Perform a hardware diagnostic and verify that all the components in the system are, in fact, as per the order. Additionally, do a 48-hour memory test. http://www.memtest.org is a free option to use, but note that NUMA systems may not respond properly to this kind of testing. Faulty memory is the cause of many instability problems. Remember that mixed memory is a *bad* thing in servers.

  2. New software/upgrades/patches: Automated patching systems have been helpful in ensuring systems are kept up-to-date, but testing must be done to ensure that systems are not adversely affected. A recent example is an anti-virus product update that thought the lsass.exe process was a virus and "prevented" systems from booting properly. The reality is that software vendors cannot possibly test for *every* environment and configuration.

  3. New or updated policies/procedures: Any change made to the system may have adverse effects on the whole of the IT infrastructure. Again, the perfect environment doesn't exist in the corporation. There are always flaws and challenges within the corporate day-to-day that must be addressed and resolved.

As with any testing or backup procedure, having a log that indicates who did what and when is helpful.

If you end up in a situation that requires help from a vendor, being able to provide that information can help resolve an issue faster and avoid duplication of effort. Information should include the name of the person responsible, a brief but accurate description of what was done, what tools, if any, were used, the date and time it was started, and when it was completed. Keep this log in a central location so those who do testing or backups can find it and use it.
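A log like this can be as simple as a shared CSV file. The sketch below shows one possible shape for it, with one column per item listed above; the class name, field names and file format are my own choices for illustration, not a standard.

```python
import csv
from dataclasses import asdict, dataclass, fields
from pathlib import Path


@dataclass
class ActivityRecord:
    person: str       # name of the person responsible
    description: str  # brief but accurate description of what was done
    tools: str        # tools used, or an empty string if none
    started: str      # date and time the work started
    completed: str    # date and time the work was completed


def append_record(log_path: Path, record: ActivityRecord) -> None:
    """Append one record to the CSV log, writing a header row if the file is new."""
    names = [f.name for f in fields(ActivityRecord)]
    is_new = not log_path.exists() or log_path.stat().st_size == 0
    with log_path.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=names)
        if is_new:
            writer.writeheader()
        writer.writerow(asdict(record))
```

Pointing log_path at a file on a central share keeps every entry in one place, and a plain CSV is easy to hand to a vendor when they ask what was changed and when.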

This article was originally published on Enterprise IT Planet.

This article was originally published on Sep 27, 2006