The Raritan Blog

Data Center Disaster Recovery Tools

Richard Dominach
September 14, 2010

Many types of “disasters” can occur in a data center, ranging from an all-encompassing failure to the loss of certain critical IT services. They can happen at any time as a result of human error, system breakdowns or natural disasters. Time is of the essence when these disasters strike, because troubleshooting, diagnosis and repair all need to happen quickly. Modern KVM-over-IP systems with centralized management and remote power control are key tools to help data center managers and IT administrators recover when disaster strikes.

Since disasters are by their very nature unpredictable, they are likely to occur outside working hours. A reliable remote access method is therefore critical to a quick response. KVM-over-IP switches and serial console switches can provide the anytime/anywhere, secure, remote access required for recovery personnel to respond immediately when disaster strikes. KVM-over-IP switches provide BIOS-level server access, including support for remote virtual media. This expanded level of remote access may be required in a disaster when servers need to be rebooted, software reinstalled, servers re-imaged or BIOS options changed.

It is possible in a disaster that a server or other equipment will shut down completely. In this case, remote power control may be the only way to restart the equipment without going into the data center. Intelligent rack-based power distribution units (PDUs) can be connected to the KVM equipment or directly to the LAN in order to remotely restart or power-cycle servers and other devices.

Serial console switches provide access to equipment managed through serial ports, including networking equipment and headless servers running UNIX or Linux. The serial console switch provides remote console-level access to troubleshoot and reconfigure equipment. Remote power control is also available as an option.

In a disaster, the corporate network can be affected either partially or completely. Many serial console switches and KVM-over-IP switches have a modem option so that they can be used over a PSTN connection. While a PSTN modem may seem very old-fashioned, it can be very useful in a disaster situation if the corporate network is not in full operation. Another alternative is to have a separate backup or management network that connects to the KVM-over-IP and serial console switches.
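As a rough illustration of that fallback idea, the sketch below tries a list of access paths in priority order and returns the first one that responds. It is only a sketch under stated assumptions: the path names, the `choose_access_path` helper and the `tcp_reachable` probe are hypothetical and are not part of any Raritan product or API.

```python
import socket
from typing import Callable, Optional, Sequence, Tuple


def tcp_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def choose_access_path(
    paths: Sequence[Tuple[str, Callable[[], bool]]]
) -> Optional[str]:
    """Return the name of the first path whose check passes, or None.

    `paths` is ordered by preference, for example the corporate LAN first,
    then a separate management network, then a dial-up modem line.
    """
    for name, check in paths:
        if check():
            return name
    return None
```

A responder's script might pass something like `[("corporate-lan", lambda: tcp_reachable("kvm.example.internal", 443)), ("mgmt-network", lambda: tcp_reachable("10.0.0.5", 443))]`, where the host names and addresses are placeholders. Keeping each check as a plain callable means a modem line, which cannot be probed over TCP, can supply its own test.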

Since a disaster can be the result of human error, it is useful to know which servers have been recently changed. Modern KVM and access systems produce a comprehensive log of server accesses, so that the recovery staff understands which equipment has been changed, when it was changed and even why it was changed.
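To make the use of such a log concrete, here is a minimal sketch that filters access records down to the equipment touched shortly before an incident. The record layout (`AccessEvent` with a timestamp, user, target and note) is an assumption made for illustration; real access logs will have their own format and would need to be parsed into something like it first.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, List


@dataclass(frozen=True)
class AccessEvent:
    """One hypothetical access-log record."""
    timestamp: datetime
    user: str
    target: str   # server or device that was accessed
    note: str     # free-text reason, if the operator supplied one


def recently_accessed(
    events: Iterable[AccessEvent],
    incident_time: datetime,
    window: timedelta = timedelta(hours=24),
) -> List[AccessEvent]:
    """Return the latest access per target within `window` before the incident.

    Results are sorted most recent first, so the likeliest suspects come up
    at the top of the list.
    """
    start = incident_time - window
    latest = {}
    for event in events:
        if start <= event.timestamp <= incident_time:
            previous = latest.get(event.target)
            if previous is None or event.timestamp > previous.timestamp:
                latest[event.target] = event
    return sorted(latest.values(), key=lambda e: e.timestamp, reverse=True)
```

The 24-hour default window is arbitrary; recovery staff would widen or narrow it depending on how long the problem may have gone unnoticed.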

Disasters are unpredictable, and the managers responding to the emergency must be able to quickly access virtually any equipment in the data center. Centralized access management systems like Raritan’s CommandCenter Secure Gateway (CC-SG) can provide centralized remote access to thousands of servers (physical and virtual) and networking equipment, as well as remote power control. The result is that recovery workers can quickly access virtually any server or serially controlled device using a single IP address.

And finally, for larger disasters, responders may need to go into the data center to fully recover. In this case the KVM-over-IP and serial console switches provide a local port so that managers can access equipment while in the data center. Through a technology called “tiering” or “cascading”, the local ports of multiple switches can be consolidated so that hundreds of servers can be accessed from a single console.