Posted on September 20, 2017 by Rick Gonedes
Many people don't realize that the most significant cause of data center downtime, roughly 70% by most estimates, is human error on the part of your own employees. It's a serious problem that will only get worse if left uncorrected, but luckily there are a few key steps you can start taking today to mitigate the risk going forward.
The Myth of "Malicious Intentions"
The vast majority of the time, human error has absolutely nothing to do with a person's "malicious intentions" but is the product of someone being in the "wrong place at the wrong time." Think about it: data centers, in particular, are highly specialized environments, filled with room after room of sensitive (read: fragile) equipment. All it takes is one person in a situation they're not equipped to handle, making a choice they think will help the team but that turns out to be the wrong decision, and the next thing you know, you're looking at a major issue.
Combatting Human Error
Human error is not something that can ever fully be eradicated. We're all human beings and humans make mistakes. It is, however, something that you can plan for.
Color-coded power cord plugs and plug inserts, for example, are a perfect visual way to quickly identify which resources are essential and which are less important. Even someone who isn't familiar with the equipment can still recognize how important each device is.
A device that has been connected with a RED power cord, for example, might be one that is absolutely mission-critical and should never, under any circumstances, be unplugged. Any equipment that CAN be unplugged with management confirmation, on the other hand, might be connected with a YELLOW plug. GREEN power cords might be used to identify assets that are not mission-critical and therefore can be unplugged temporarily on an as-needed basis.
This is just one example of a system that you may implement and stick with, along with mandatory training for all employees, regardless of job title, to help reduce human error. These actions will go a long way towards preventing employees from making an “educated guess” that, unfortunately, turns out to be wrong, taking some of your assets offline and causing an unforeseen outage.
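For teams that also want the color scheme written down somewhere other than the cords themselves, the RED/YELLOW/GREEN rules described above can be expressed as a small lookup. This is only a minimal sketch of the example policy in this post: the names `CordColor` and `may_unplug` are hypothetical and are not part of any Raritan product or API.

```python
from enum import Enum


class CordColor(Enum):
    """Hypothetical labels mirroring the example color scheme."""
    RED = "red"        # mission-critical: never unplug
    YELLOW = "yellow"  # may be unplugged with management confirmation
    GREEN = "green"    # not mission-critical: may be unplugged as needed


def may_unplug(color: CordColor, management_confirmed: bool = False) -> bool:
    """Return True if a device on this cord color may be unplugged."""
    if color is CordColor.RED:
        return False
    if color is CordColor.YELLOW:
        return management_confirmed
    return True  # GREEN
```

Under these assumptions, `may_unplug(CordColor.YELLOW)` is false until someone passes `management_confirmed=True`, which matches the rule that yellow-corded equipment needs sign-off first.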
Another good opportunity to mitigate the risk of human error is to use rack PDU locking power cords and outlets. Even setting human error aside, IEC outlets often fail to hold plugs as securely as employees and data center managers would like. All it takes is one person who doesn't triple-check that a plug is secure, and you're a few hours away from an accidental disconnection.
With Raritan’s SecureLock power cords and outlets, however, this isn't something you need to worry about anymore. All of Raritan's PX intelligent rack power distribution units are equipped with SecureLock outlets from the time of purchase, and when paired with SecureLock power cords they prevent cables from becoming accidentally unplugged, giving data center managers peace of mind.
Another important area to focus on is physical data center security. Being able to limit access to only authorized personnel in a data center goes a long way towards reducing conditions where human error can take effect WITHOUT impacting the mission-critical functions of the rest of your organization. One simple solution that adds a layer of physical security to the cabinet is intelligent door locks. Raritan offers intelligent door locks called the SmartLock System, which provides users with an easy-to-deploy, economical, networked door locking solution for all types of data center enclosures. An optional USB webcam adds even more security in highly sensitive environments with proximity sensors, providing real-time images and video that can be viewed remotely.
This is just a small sample of the different methods of preventing human error. These steps, combined with employee education, processes, and procedures, go a long way towards reducing the risk from the number one cause of unplanned data center outages today: human error.