
Behind the scenes at Hotmail

Hi, my name is Arthur de Haan, and I am responsible for testing and system design in Windows Live. I would like to give you a glimpse behind the scenes of Hotmail and tell you more about what it takes to build, deploy, and run Windows Live Hotmail on a global scale.

Storing your mail and data (and our own data) on our servers is a big responsibility, and we pay a lot of attention to quality, performance, and reliability. We make significant investments in engineering and infrastructure so that Hotmail works 24 hours a day, day after day, year after year. You rarely hear about these efforts; they tend to come up only in those rare cases when something goes wrong and our service has a problem.

Hotmail is a gigantic service by every measure. Here are some of the main ones:

As you can imagine, the user interface of Hotmail is just the tip of the iceberg; most of the innovation happens inside and is not visible to the user. In this post I will give a high-level overview of the architecture of the entire system. We will take a deeper look at some of the features in upcoming posts (translator's note: if the community likes this article, I can translate those later posts).

Architecture


Hotmail and other Windows Live services are located in several data centers around the world. Hotmail is organized into logical scalable elements - clusters. In addition, we have an infrastructure that distributes the load between the clusters in each data center:



Each cluster hosts several million users (how many depends on the age of the hardware) and is a self-contained set of servers, including:



Preventing outages and data loss is our highest priority, and we take every precaution against them. We designed the service to handle failures gracefully, on the assumption that anything that can fail eventually will. We do experience hardware failures: among the hundreds of thousands of hard drives we use, some inevitably fail. Fortunately, thanks to the architecture and the timely handling of failures, customers rarely notice them.

Here are some ways to prevent crashes:



Processes



I have talked a little about our architecture and the steps we take to ensure uninterrupted service. However, the service is not static: besides growing with use, it is updated regularly. Our processes are therefore as important as the architecture in providing you with uninterrupted service. We follow strict precautions when deploying new code, from patches and small updates to major releases.

Testing and deployment. For each developer, we have a test engineer who works hand in hand with that developer: contributing to the design, writing specifications, building test infrastructure, writing automated tests for new features, and ensuring quality. When we talk about quality, we mean not just stability and reliability, but also ease of use, performance, security, accessibility (for users with disabilities), privacy, scalability, and functionality in all browsers.

Since we are a free service funded by advertising, we must be highly efficient. Therefore, the deployment, configuration and maintenance of our systems is a highly automated process. Automation also reduces the risk of human error.

Code deployment and change management. We have thousands of servers in our test lab, where we deploy and test code long before it reaches customers. In the data centers, we also have clusters reserved specifically for testing “dogfood” and beta versions in the final stages of development. We test every change in our labs, whether it is a hardware update, a software update, or a security fix.

Once all engineering teams have signed off on a release (including testers and engineers), we begin gradually deploying the update to clusters around the world. We usually do this over several months, not only because it takes a lot of time, but also to make sure it does not affect the quality and performance of the service.
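
To make the idea of a gradual, cluster-by-cluster deployment more concrete, here is a minimal sketch in Python. It is not the actual Hotmail tooling: the function name `staged_rollout`, the `deploy` and `is_healthy` callbacks, and the cluster names are all hypothetical, and the only assumption taken from the text is that updates reach clusters in stages while quality and performance are watched along the way.

```python
def staged_rollout(clusters, deploy, is_healthy, wave_size=1):
    """Deploy to clusters in waves and stop at the first unhealthy wave.

    All names here are illustrative assumptions, not a real deployment API.
    Returns (updated, halted_at); halted_at is None if every wave was healthy.
    """
    updated = []
    for start in range(0, len(clusters), wave_size):
        wave = clusters[start:start + wave_size]
        for cluster in wave:
            deploy(cluster)              # push the new build to this cluster
        updated.extend(wave)
        unhealthy = [c for c in wave if not is_healthy(c)]
        if unhealthy:
            # Halt so the remaining clusters keep running the old build.
            return updated, unhealthy[0]
    return updated, None


# Usage with stand-in callbacks: cluster "c3" reports a problem after the update.
deployed = []
updated, halted_at = staged_rollout(
    ["c1", "c2", "c3", "c4"],
    deploy=deployed.append,
    is_healthy=lambda cluster: cluster != "c3",
)
```

In this sketch the rollout stops at the first cluster that fails its health check, so the clusters that come later in the list are never touched by a bad update.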

We can also turn individual features on or off. Sometimes we deploy an update but postpone switching it on. In rare cases, we disable a feature for security or performance reasons.
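
The separation between deploying code and switching a feature on can be sketched as a simple feature-flag store. This is an illustration only: the `FeatureFlags` class, its methods, and the feature name `new_inbox_view` are hypothetical and say nothing about how Hotmail implements it.

```python
from dataclasses import dataclass, field


@dataclass
class FeatureFlags:
    """Hypothetical in-memory flag store (illustrative names only)."""
    deployed: set = field(default_factory=set)  # code is present on the servers
    enabled: set = field(default_factory=set)   # feature has been switched on
    blocked: set = field(default_factory=set)   # disabled for security or performance

    def deploy(self, name):
        self.deployed.add(name)

    def enable(self, name):
        if name not in self.deployed:
            raise ValueError(f"{name} is not deployed yet")
        self.enabled.add(name)

    def block(self, name):
        self.blocked.add(name)

    def is_on(self, name):
        # A block always wins over an earlier enable.
        return name in self.enabled and name not in self.blocked


flags = FeatureFlags()
flags.deploy("new_inbox_view")                   # shipped, but still dark
on_after_deploy = flags.is_on("new_inbox_view")
flags.enable("new_inbox_view")                   # switched on later
on_after_enable = flags.is_on("new_inbox_view")
flags.block("new_inbox_view")                    # emergency switch-off
on_after_block = flags.is_on("new_inbox_view")
```

Keeping "deployed", "enabled", and "blocked" as separate states is one way to model the three situations described above: code that is shipped but dormant, a feature that is live, and a feature that has been turned off again without a new deployment.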

Conclusion



This post should give you a sense of the scale of engineering behind Hotmail. We are committed to technical excellence and to continuously improving our services for you. We keep studying how the service grows and we take your feedback seriously, so feel free to leave comments here with your thoughts and questions. I am a passionate fan of our services, like the whole Windows Live team: we may be engineers, but we use the services ourselves, along with millions of our users.

Source: https://habr.com/ru/post/80823/

