Pinterest architecture - 18 million visitors, 10-fold increase, 12 employees, 410 TB of data
Pinterest's story closely resembles Instagram's: phenomenal growth, a huge number of users and a large volume of stored data, yet remarkably few employees. And everything runs in the cloud.
Indeed, neither Pinterest nor Instagram has made any great scientific or technological breakthrough, but this is more likely a consequence of how easy cloud technologies are to use than a sign that the era of innovation in Silicon Valley is in decline. The figures in the title and the valuations of these companies are so large that it seems there must be some technological revolution behind them driving the rapid growth. The revolution, however, is much subtler: it shows how easy it is to achieve such growth if you are able to execute on a good idea. Get used to it. This is now the norm. This is what Pinterest looks like today:
410 terabytes of user data, in 80 million objects, are stored in Amazon S3. That is 10 times more than in August 2011. Over the same period the number of Amazon EC2 instances tripled. Monthly costs are around $39K for S3 and $30K for EC2.
Only 12 employees as of December 2011. Thanks to cloud technologies, the project can keep growing while the team supporting it stays very small. UPD: It looks like there are already 31 employees.
Paying only for the resources actually used saves money. Peak traffic occurs in the afternoon and evening, so at night the number of EC2 instances is reduced by 40%. At peak traffic EC2 costs about $52 per hour on average; at night, when the load drops, the cost is only $15 per hour.
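To make the saving concrete, here is a rough back-of-the-envelope sketch in Python. It assumes a simple two-level model (one flat peak rate, one flat night rate); the hourly rates come from the figures above, but the number of night hours is a hypothetical input, since the article does not say how long the reduced fleet runs.

```python
PEAK_RATE = 52.0   # USD per hour at peak traffic (figure from the article)
NIGHT_RATE = 15.0  # USD per hour at night (figure from the article)


def daily_ec2_cost(night_hours, peak_rate=PEAK_RATE, night_rate=NIGHT_RATE):
    """Estimated EC2 cost for one day under a two-level (peak/night) model."""
    if not 0 <= night_hours <= 24:
        raise ValueError("night_hours must be between 0 and 24")
    return night_hours * night_rate + (24 - night_hours) * peak_rate


def daily_saving(night_hours, peak_rate=PEAK_RATE, night_rate=NIGHT_RATE):
    """Money saved per day compared with running at the peak rate all day."""
    return 24 * peak_rate - daily_ec2_cost(night_hours, peak_rate, night_rate)


# Assumption: 8 night hours per day (not stated in the article).
print(daily_ec2_cost(8))  # 952.0
print(daily_saving(8))    # 296.0
```

Under that assumed 8-hour night, a 30-day month comes to about $28.6K, which is in the same range as the roughly $30K monthly EC2 bill quoted above.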
150 EC2 instances as web servers.
90 instances cache data in memory to take load off the database.
35 instances for internal use.
70 master database servers, with a parallel set of backup database servers in several regions around the world for redundancy.
Written in Python and Django.
Sharding is used: a database is split when it reaches 50% of its capacity. This makes it simple to scale the database while keeping I/O fast enough.
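The split-at-50% rule can be illustrated with a small in-memory sketch. The article does not describe how Pinterest partitions keys or moves data, so everything here is an assumption made for illustration: the `Shard` and `ShardedStore` classes are hypothetical, keys are integers partitioned by range, and "capacity" is simply a row count.

```python
from bisect import bisect_right

SPLIT_THRESHOLD = 0.5  # split a shard once it is 50% full (rule from the article)


class Shard:
    """Hypothetical shard holding every key >= `low` up to the next shard's `low`."""

    def __init__(self, low, capacity):
        self.low = low
        self.capacity = capacity  # maximum number of rows (illustrative unit)
        self.rows = {}

    def utilization(self):
        return len(self.rows) / self.capacity


class ShardedStore:
    """Range-partitioned store that splits a shard when it becomes half full."""

    def __init__(self, shard_capacity):
        self.shard_capacity = shard_capacity
        # The first shard covers all keys; the list stays sorted by `low`.
        self.shards = [Shard(float("-inf"), shard_capacity)]

    def _shard_for(self, key):
        lows = [s.low for s in self.shards]
        return self.shards[bisect_right(lows, key) - 1]

    def put(self, key, value):
        shard = self._shard_for(key)
        shard.rows[key] = value
        if shard.utilization() >= SPLIT_THRESHOLD:
            self._split(shard)

    def get(self, key):
        return self._shard_for(key).rows.get(key)

    def _split(self, shard):
        keys = sorted(shard.rows)
        if len(keys) < 2:
            return  # nothing to divide
        half = len(keys) // 2
        new_shard = Shard(keys[half], self.shard_capacity)
        for k in keys[half:]:  # move the upper half of the key range
            new_shard.rows[k] = shard.rows.pop(k)
        self.shards.insert(self.shards.index(shard) + 1, new_shard)


store = ShardedStore(shard_capacity=10)
for k in range(20):
    store.put(k, f"pin-{k}")
print(len(store.shards), store.get(7))
```

Splitting well before a shard is full leaves headroom on every shard, which is one plausible reading of why the threshold sits at 50% rather than closer to 100%.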
Amazon ELB is used for load balancing between EC2 instances. The ELB API makes it easy to add instances to the pool and remove them from it.
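As an illustration of adding and removing instances through the ELB API, here is a minimal sketch using the classic ELB client calls from boto3. The article does not say which library or tooling Pinterest used, so boto3 is an assumption, and the helper names, load balancer name and instance IDs are hypothetical. The client is passed in as a parameter, so the helpers themselves have no AWS dependency.

```python
def add_to_rotation(elb_client, load_balancer_name, instance_ids):
    """Register instances so the load balancer starts sending them traffic."""
    return elb_client.register_instances_with_load_balancer(
        LoadBalancerName=load_balancer_name,
        Instances=[{"InstanceId": i} for i in instance_ids],
    )


def remove_from_rotation(elb_client, load_balancer_name, instance_ids):
    """Deregister instances, for example before shutting them down at night."""
    return elb_client.deregister_instances_from_load_balancer(
        LoadBalancerName=load_balancer_name,
        Instances=[{"InstanceId": i} for i in instance_ids],
    )


# Usage (requires boto3 and AWS credentials; names below are placeholders):
#   import boto3
#   elb = boto3.client("elb")
#   add_to_rotation(elb, "web-frontend", ["i-0123456789abcdef0"])
#   remove_from_rotation(elb, "web-frontend", ["i-0123456789abcdef0"])
```

Combined with the nightly scale-down described earlier, a scheduled job could deregister instances from the balancer first and stop them only afterwards, so that no traffic is routed to a machine that is going away.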
One of the fastest-growing sites in history. Using Amazon Web Services, the site was able to serve 18 million visitors in March with a minimal IT infrastructure; that is 1.5 times more visitors than a month earlier.
The cloud makes it easy and cheap to experiment with the service without buying expensive new servers.
Elastic MapReduce, which is based on Apache Hadoop, is used for data analysis and costs only a few hundred dollars per month.