Data center monitoring necessary to prevent costly downtime
Tuesday, Jun 25th 2013
Data center outages and downtime are not only disruptive to business but can also be quite costly, so IT administrators need to implement data center monitoring tools and strategies to better ensure continued operations. IDG News Service reported that the French government recently suffered downtime of its accounts payable system, Chorus, an incident that demonstrates how damaging it can be to not effectively monitor data center conditions.
According to the source, the Chorus accounts payable system was hosted at a data center operated by French servers and services company Bull. One of Bull's subcontractors recently made a mistake that set off the server room's fire extinguishing system, resulting in significant damage to several important components of the storage bay holding the Chorus system. While Bull did not have much to say regarding the accident, IDG News Service reported that Chorus was severely impacted even though the disks were purposely arranged in a RAID 6 configuration for redundancy.
The French State Financial Computing Agency discovered soon after the incident that it would be impossible to recover data from, or restore, the damaged disk system. After logging many hours trying to recover the data, the agency instead restored a backup taken before the accident. By the time the issue was resolved, however, the system had been offline for more than four days.
Preventing downtime in the modern data center
The scenario involving the Chorus accounts payable system demonstrates how important it is to implement proper controls, such as environmental control systems and server room monitoring, to gain effective oversight of infrastructure and operations. Because Chorus' failure stemmed mostly from human error, the incident also underscores how automation can be an effective solution.
SiliconANGLE recently shared the perspective of Scott Lowe, Wikibon analyst and founder of The 1610 Group, who argued that CIOs need to adopt advanced management technologies and adjust their strategies so that they can get the most out of their infrastructure and avoid the substantial consequences of downtime. Lowe stressed that, in his experience, the more people have to touch things in the data center, the higher the likelihood of mistakes. CIOs and decision makers should make use of effective management systems, data center environmental monitoring tools and automation software so that administrators spend less time on routine tasks and the risk of an error leading to downtime decreases.
However, Lowe also noted that even with the most effective planning, problems will still occur eventually, and it is vital to use environmental monitoring solutions that alert IT staff to a potential problem so that appropriate steps can be taken before it grows into a full-blown disaster. Lowe recommended that monitoring go down to the application level to provide a better picture of all operating systems. He added, however, that even the most basic environmental monitor can provide foresight that helps data center administrators navigate around potential system failures and damaging downtime.
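As a rough illustration of the basic environmental alerting described above, the following Python sketch compares sensor readings against fixed limits and returns an alert message for each value out of range. It is a minimal sketch, not any vendor's implementation: the metric names, the limit values and the `check_readings` function are all hypothetical, chosen only to show the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    """Acceptable range for one environmental metric."""
    low: float
    high: float


# Hypothetical limits for illustration only; real limits depend on the
# equipment and the facility's own operating guidelines.
THRESHOLDS = {
    "temperature_c": Threshold(low=18.0, high=27.0),
    "humidity_pct": Threshold(low=20.0, high=80.0),
}


def check_readings(readings, thresholds=THRESHOLDS):
    """Return one alert message per reading that falls outside its range."""
    alerts = []
    for metric, value in readings.items():
        limit = thresholds.get(metric)
        if limit is None:
            # No configured limit for this metric, so nothing to check.
            continue
        if value < limit.low:
            alerts.append(f"{metric} low: {value} < {limit.low}")
        elif value > limit.high:
            alerts.append(f"{metric} high: {value} > {limit.high}")
    return alerts


if __name__ == "__main__":
    # Assumed sample reading: the room is too warm, humidity is normal.
    for message in check_readings({"temperature_c": 31.5, "humidity_pct": 45.0}):
        print(message)
```

In practice the returned messages would be handed to whatever notification channel the IT staff already uses, which is the step that gives administrators time to act before a failure.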
Managing increasingly complex infrastructure
In a recent Data Center Knowledge article, former Forrester analyst Vanessa Alvarez touched on the significant changes that have been underway in the data center, with advanced technologies like virtualization and the cloud creating a markedly more complex infrastructure to manage and operate.
"An integrated approach allows for infrastructure resources to be optimized automatically for business applications through intelligent software, eliminating the inefficiencies of over-provisioning and providing automatic cost savings," Alvarez wrote. "It also alleviates IT from the guessing game of how much resources are needed; managing three different boxes and multiple platforms; struggling through the myriad of licensing fees and multi-vendor red tape; and ultimately from the shackles of legacy infrastructure that no longer works. These operational challenges are often difficult to quantify but end up costing businesses a great deal of money."
As Alvarez argued, an integrated approach to infrastructure management is ideal for evolving data center environments. When paired with an effective server room monitor, enterprise IT will be able to more effectively conquer challenges and ensure management is always one step ahead of emerging problems.