I know what some of you are thinking: how could this happen if we keep multiple copies of the data in several data centers? Well, in some rare cases, software errors can affect multiple copies of the data. That is what happened here. Some copies of the mail were deleted, and we have worked hard over the last 30 hours (Treynor published his comment at 5:30 Moscow time; aleksandrit's note) to return it to the affected users.
To protect the data from these unusual errors, we also keep offline backups of it. Because those backups are offline, they are protected from such software errors. But restoring data from them takes longer than redirecting requests to another data center, which is why recovering the email took hours rather than milliseconds.
So what caused this problem? We released an update to our storage software that introduced an unexpected bug, which caused 0.02% of Gmail users to temporarily lose access to their email. When we discovered the problem, we immediately stopped rolling out the new software and reverted to the old version.
Source: https://habr.com/ru/post/114726/