New data from Uptime Institute has found that downtime costs and consequences are worsening as those involved in data infrastructure fail to find ways to curb outages.
The 2022 annual Outage Analysis report revealed that operators are still struggling to meet the high customer standards as demands continue to grow.
One in five organisations surveyed in the report said they had experienced a serious or severe outage in the past three years, which points to an upward trend in the prevalence of major outages.
Eighty percent of data center managers and operators said they had experienced some type of outage in the past three years, a marginal increase over prior figures, which generally fluctuated between 70% and 80%.
Cost is also a significant issue raised in the data. Outages costing over $100,000 have increased in recent years, with over 60% of failures resulting in at least $100,000 in total losses, up substantially from the 39% recorded in 2019. The share of outages that cost upwards of $1 million also increased, from 11% to 15%, over that same period.
Power-related outages were found to account for 43% of outages that were classified as significant (causing downtime and financial loss). The single biggest cause of power incidents was found to be uninterruptible power supply (UPS) failures, which are often extremely hard to manage.
Network problems are another frequent source of downtime. According to Uptime's 2022 Data Center Resiliency Survey, networking-related problems have been the single biggest cause of all IT service downtime incidents, regardless of severity, over the past three years. This has been attributed to the complexity of new technologies and cloud architectures, along with human error in dealing with that complexity.
Human error is another major factor: nearly 40% of organisations have suffered a major outage caused by human error over the past three years. Of these incidents, 85% stemmed from staff failing to follow procedures or from flaws in the processes and procedures themselves.
The report also found that significant pressure on external IT providers is responsible for many public outages. Third-party, commercial IT operators account for 63% of all publicly reported outages tracked by Uptime since 2016. In 2021, commercial operators caused 70% of all outages. Uptime believes that as more workloads are outsourced to external providers, these operators account for a growing share of high-profile, public outages.
"The lack of improvement in overall outage rates is partly the result of the immensity of recent investment in digital infrastructure, and all the associated complexity that operators face as they transition to hybrid, distributed architectures," highlights Uptime Institute Intelligence founding member and executive director Andy Lawrence.
"Digital infrastructure operators are still struggling to meet the high standards that customers expect and service level agreements demand despite improving technologies and the industry's strong investment in resiliency and downtime prevention."