Google Suffers System Outage As Gmail and Web Services Knocked Offline
Published on: 25th Jan 2014
Google suffered one of its rare system failures on Friday afternoon (European time) as a wide range of web services fell over for around half an hour.
During the outage, most Google users who use logged-in services like Gmail, Google+, Calendar and Documents found they were unable to access those services for approximately 25 minutes. For about 10 percent of users, the problem persisted for as much as 30 minutes longer.
Google says that the problem has been resolved, and it is now working on measures to prevent a recurrence. Had that happened to a telecoms operator, they would usually be expected to provide reports to the telecoms regulator and could face sanctions or fines.
Most web companies, even ones upon which an increasing amount of a person's daily life depends, are not regulated in the same way, although it may only be a matter of time before there are calls for some sort of regulatory oversight of a service once it reaches a certain size.
In a blog post, the company outlined what went wrong:
At 10:55 a.m. PST this morning, an internal system that generates configurations - essentially, information that tells other systems how to behave - encountered a software bug and generated an incorrect configuration.
The incorrect configuration was sent to live services over the next 15 minutes, caused users' requests for their data to be ignored, and those services, in turn, generated errors.
Users began seeing these errors on affected services at 11:02 a.m., and at that time our internal monitoring alerted Google's Site Reliability Team. Engineers were still debugging 12 minutes later when the same system, having automatically cleared the original error, generated a new correct configuration at 11:14 a.m. and began sending it; errors subsided rapidly starting at this time.
By 11:30 a.m. the correct configuration was live everywhere and almost all users' service was restored.
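The timestamps in Google's account can be checked with a little arithmetic. The short Python sketch below computes the gaps between the events quoted above. The times are taken directly from the blog post, while the event labels and function names are our own shorthand rather than anything Google uses.

```python
from datetime import datetime

# Times are taken from the quoted blog post (all a.m. PST, same day).
# The event labels are our own shorthand, not Google's terminology.
TIMELINE = [
    ("bad config generated", "10:55"),
    ("users see errors", "11:02"),
    ("correct config generated", "11:14"),
    ("correct config live everywhere", "11:30"),
]


def minutes_between(start: str, end: str) -> int:
    """Return whole minutes from start to end, both 'HH:MM' on the same day."""
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)


def intervals(timeline):
    """Pair each event with the next one and the minutes separating them."""
    return [
        (a_label, b_label, minutes_between(a_time, b_time))
        for (a_label, a_time), (b_label, b_time) in zip(timeline, timeline[1:])
    ]


if __name__ == "__main__":
    for first, second, mins in intervals(TIMELINE):
        print(f"{first} -> {second}: {mins} min")
    print(
        "user-visible errors, start to full recovery:",
        minutes_between("11:02", "11:30"),
        "min",
    )
```

On these figures, about 28 minutes passed between the first user-visible errors and full recovery, which is broadly in line with the "approximately 25 minutes" quoted for most users, although the post does not say exactly how that figure was measured.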