Cloudflare ‘Deeply Sorry’ Over Global Outage That Hit Users
Cloudflare has apologised for an outage that affected traffic in 19 of its data centres, which together handle a significant proportion of its global traffic.
In a detailed blog post, the company said the global outage on Tuesday was caused by a change that was part of a long-running project to increase resilience in its busiest locations.
“A change to the network configuration in those locations caused an outage which started at 06:27 UTC (11:57 am India time). At 06:58 UTC the first data centre was brought back online and by 07:42 UTC all data centres were online and working correctly,” the company said.
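The timeline in that statement can be checked with a little arithmetic. The sketch below is illustrative only: the calendar date is a placeholder (the report says only "Tuesday"), and the helper name `at_utc` is made up for this example. It derives the time to first recovery, the total outage duration, and the India-time equivalent of the start.

```python
from datetime import datetime, timedelta, timezone

UTC = timezone.utc
IST = timezone(timedelta(hours=5, minutes=30))  # India Standard Time, UTC+05:30


def at_utc(hhmm: str) -> datetime:
    """Build a UTC timestamp from 'HH:MM' on a placeholder date (date is an assumption)."""
    hour, minute = map(int, hhmm.split(":"))
    return datetime(2000, 1, 1, hour, minute, tzinfo=UTC)


# Times quoted in Cloudflare's statement.
outage_start = at_utc("06:27")
first_dc_back = at_utc("06:58")
all_dcs_back = at_utc("07:42")

one_minute = timedelta(minutes=1)
minutes_to_first_recovery = (first_dc_back - outage_start) // one_minute
total_outage_minutes = (all_dcs_back - outage_start) // one_minute
start_in_india = outage_start.astimezone(IST).strftime("%H:%M")

print(minutes_to_first_recovery, total_outage_minutes, start_in_india)
```

On these figures, the first data centre came back 31 minutes after the outage began, and full recovery took 75 minutes.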
The outage took down several popular platforms, including Discord, DoorDash, NordVPN, Zerodha and Upstox, for users globally, including in India.
The company said that over the last 18 months, it has been working to convert all of its busiest locations to a more flexible and resilient architecture.
“In this time, we’ve converted 19 of our data centres to this architecture, internally called Multi-Colo PoP (MCP),” said the company.
Although Cloudflare invested significantly in MCP design to improve service availability, “we clearly fell short of our customer expectations with this very painful incident”.
“We are deeply sorry for the disruption to our customers and to all the users who were unable to access Internet properties during the outage,” said Cloudflare.
Even though these 19 locations make up only 4 per cent of Cloudflare's total network, the outage impacted 50 per cent of total requests.
“We are very sorry for this outage. This was our error and not the result of an attack or malicious activity,” it added.
Cloudflare suffered another outage last week, when several online platforms, including Shopify, Discord, Acko and GitLab, went down for users in India.
“The root cause of the issue was an increase in resource consumption due to a software release,” the company had said in a statement.
Several customers in India faced service outages with ‘HTTP 504’ errors, especially for long-running queries.