This has not been a good week for Google. After a serious outage on Monday morning that disrupted nearly every Google service that requires a login, we had to contend with another Gmail outage yesterday afternoon (December 15).
Neither outage lasted very long, but both left us with many questions. What in the world is happening at Google that it would suffer two serious outages in a matter of days?
Yesterday's Gmail outage began at 4:30 p.m. ET. Not all users were affected, but many ran into error messages, email bounce-backs, high latency, and similar issues. The problem was fixed and normal service resumed at 7 p.m. ET, but Google did not say exactly what caused the outage.
Fortunately, Monday's problem was less mysterious, and Google has published preliminary details of the outage on its blog. Dubbed "Google Cloud Infrastructure Components incident 20013," the problem was caused by Google's automated quota management system. A fault in that system degraded Google's authentication system. In other words, Google was unable to discern which users were authenticated and which were not.
In short, this one problem hampered every system that requires a login, which is almost all of them. Google's own Cloud Platform and Google Workspace were particularly affected; since Workspace is the service behind Gmail, Calendar, Meet, Drive, and others, the Workspace disruption is the one most people would have noticed.
"The root cause," Google explained, "was a problem with the automated quota management system, which caused Google's central identity management system to lose capacity and return errors globally. As a result, it was unable to ensure that user requests were authenticated and provided errors to users."
With the details of Monday's outage in hand and last night's Gmail problem fixed, Google can hopefully take steps to prevent a recurrence.