Another outage at Selectel

I am posting this for information, because I could not find any details about the outage on Habr.

Not long ago, Selectel sent out a ticket about preventive maintenance on a storage cluster:

On March 28, 2012, from 8:00 to 11:00, scheduled maintenance will be carried out on one of the nodes of the storage cluster. This storage holds one or more disks of your virtual machines. You can find out which virtual machines will be affected by the work from the SR uuid in the disk properties. The work will take place on SR 409127d3-61a1-9c8d-ceb5-486c608c58aa.
No downtime is planned during the work, but fault tolerance may be reduced.

We apologize for the inconvenience.

As a result, the cloud server is down today. Tech support stayed silent until the last moment. Then, finally, the long-awaited news arrived:

There was a failure in one of the storages.
At the moment, our experts are working on troubleshooting.
Unfortunately, the exact time to resolution is not yet known; recovery will most likely take about 5 hours.
The data on the virtual machines will not be affected; in the worst case there will be a rollback of 10 minutes (consistent).
We apologize for any inconvenience caused.

At the moment, management of the cloud servers located on this SR is unavailable. A placeholder notice is displayed: "Recovery work in the St. Petersburg (2) pool, within 3 hours."

UPD 13:21 (GMT+2):
The maintenance notice has disappeared, but the machine still cannot be started. Tech support says:

The recovery process continues; the estimated time to completion is a little more than an hour.

Source: https://habr.com/ru/post/140862/
