
Highly loaded sites and applications in Python / Django (29 projects)

Our company has been developing large Python / Django web applications for quite some time. In RuNet, there is very little information about this wonderful programming language and the framework built on it. We decided to fill this gap and talk about popular high-load sites built with this technology, along with plenty of interesting technical details. To make the sites easier to compare, we also give some statistics for each project. There are, of course, a great many Python / Django sites in the world; we cover only the ones we find most interesting.

A small table of known projects (based on http://builtwith.com/ ), clickable:


The big version of the table here: http://seclgroup.ru/article_vysokonagruzhennyye_sayty_i_prilozheniya_na_python_django.html

Pinterest (social network)


Used technologies:

Python was chosen as the programming language and Django as the framework. Hosting is on Amazon. MySQL is the main database management system, and both memcached and Redis are used for object caching. Solr is the search platform, and Hadoop is used to implement search and contextual mechanisms for data analysis.

A little bit about statistics:

According to the latest data, the company has more than 140 employees. As of December 2011, the site had about 11 million unique visitors per week. It runs 500 virtual machines in EC2 and stores 410 terabytes of user data, or 80 million different objects, in Amazon S3. As of July 2013, Pinterest had about 70 million users, according to the French agency Semiocast. Overall, Pinterest is the fourth most popular social network in the USA, after Facebook, Twitter and LinkedIn.

( http://en.wikipedia.org/wiki/Pinterest )

(original http://www.businessinsider.com/how-we-scaled-pinterest-2013-4?op=1 , http://highscalability.com/blog/2012/5/21/pinterest-architecture-update-18-million-visitors-10x-growth.html )

Disqus (service)


Used technologies:

Python was chosen as the programming language and Django as the framework. The operating system is Linux. PostgreSQL is the main database management system; objects are cached by memcached, as at Pinterest. HAProxy is responsible for load balancing, and Slony handles data replication.
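As an illustration of what a load balancer does with incoming requests, the sketch below implements round-robin selection, one of the balancing strategies HAProxy supports. The backend names are made up, and this is a toy model of the idea, not HAProxy configuration or Disqus's actual setup.

```python
import itertools


class RoundRobinBalancer:
    """Hands out backends in a fixed rotating order."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one backend is required")
        self._cycle = itertools.cycle(list(backends))

    def next_backend(self):
        return next(self._cycle)


# Hypothetical application servers behind the balancer.
balancer = RoundRobinBalancer(["app1:8000", "app2:8000", "app3:8000"])
assigned = [balancer.next_backend() for _ in range(6)]
print(assigned)
```

Six consecutive requests are spread evenly, two per server, which is why this strategy works well when all backends have similar capacity.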

A little bit about statistics:

The company has 33 employees. The number of registered users keeps growing and currently stands at 50 million. The site handles approximately 17 thousand requests per second and receives more than 144 million unique visits per month from the United States alone. The service is used on about 750,000 sites and blogs.

(source http://www.insight-it.ru/masshtabiruemost/arkhitektura-disqus/ , original http://highscalability.com/blog/2010/10/26/scaling-disqus-to-75-million-comments-and-17000-rps.html , http://en.wikipedia.org/wiki/Disqus )

Instagram (photo and video application)


Used technologies:

Ubuntu Linux 11.04 is the main operating system. Python was chosen as the programming language and Django as the framework. PostgreSQL is the main database management system, objects are again cached by memcached, and Redis serves as an additional data store. HAProxy is responsible for load balancing. The project uses Amazon infrastructure, in particular EC2, ELB, Route 53, S3 and CloudFront. Solr is the search platform, and Gearman is used for task processing.

A little bit about statistics:

It all started with one small, weak server and two developers. About 25,000 users registered on the first day. Today, more than 200 million people use Instagram, more than 7 million of them daily. About 20 billion photos have been published, and 60 million more are published every day. Facebook purchased the service for $1 billion in April 2012.

(source http://expandedramblings.com/index.php/important-instagram-stats/ , http://en.wikipedia.org/wiki/Instagram )

Reddit (news site)


Used technologies:

Python was chosen as the programming language and Pylons as the framework. PostgreSQL is again the main database management system, and objects are cached by memcached. RabbitMQ is used for offline data processing. HAProxy is responsible for load balancing. Amazon CloudSearch is the search platform.

(source http://en.wikipedia.org/wiki/Reddit )

A little bit about statistics:

About 112 million unique visits per month. 5.46 billion page views per month. Reddit is enjoyed by 2.89 million people. And all this is served by a team of 28 people.

(source http://expandedramblings.com/index.php/reddit-stats/ )

Dropbox (cloud storage)


Used technologies:

Both the Dropbox server and the client application are written in Python. The client uses GUI toolkits such as wxWidgets and Cocoa, along with other important Python libraries such as Twisted, ctypes and pywin32. Dropbox depends on the librsync library, which is written in C. Information about files is kept in a store built on MySQL. Amazon S3 is used to store the files themselves.

(source http://en.wikipedia.org/wiki/Dropbox_(service) )

A little bit about statistics:

The staff consists of 110 employees. Over 50 million registered users. Every 3 minutes, more than a million files are saved using the service. 500 million files are saved daily.

(source https://www.dropbox.com/static/docs/DropboxFactSheet.pdf )
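The two throughput figures above can be checked against each other with a little arithmetic. The sketch below takes the "one million files every 3 minutes" figure as a lower bound and converts it to daily and per-second rates; the variable names are ours.

```python
FILES_PER_INTERVAL = 1_000_000   # "more than a million files" per interval
INTERVAL_MINUTES = 3
MINUTES_PER_DAY = 24 * 60

intervals_per_day = MINUTES_PER_DAY // INTERVAL_MINUTES      # 480 intervals
files_per_day = FILES_PER_INTERVAL * intervals_per_day
files_per_second = files_per_day / (MINUTES_PER_DAY * 60)

print(files_per_day)            # 480000000
print(round(files_per_second))  # 5556
```

At least 480 million files per day is consistent with the stated 500 million, and it corresponds to roughly 5.5 thousand file saves per second on average.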

Pitchfork (electronic music magazine)


Used technologies:

Python was chosen as the programming language and Django as the framework. Hosting is on Amazon. MySQL was chosen as the main database management system; PostgreSQL is also used. Redis is responsible for caching objects. ElasticSearch and Solr are used as search platforms. Amazon services (EC2, RDS, SES) are also used.

(source http://www.siteclass.com/www/pitchfork.com )

A little bit about statistics:

The project gets more than 14.5 million visits per month (580 thousand per day), of which 5.5 million are unique (410 thousand per day). Page views total 38.5 million per month (1.6 million per day). As of March-April 2014, the number of registered users is 4.3 million.

(source https://www.quantcast.com/pitchfork.com )

Lanyrd (portal)


Used technologies:

Python was chosen as the programming language and Django as the framework. PostgreSQL is the main database management system, and objects are cached by memcached. Redis is used together with Celery to store the intermediate results of tasks that Celery runs asynchronously. HAProxy is responsible for load balancing. The project uses Amazon infrastructure, in particular S3. Solr is the search platform.

(source http://www.slideshare.net/InfoQ/inside-lanyrds-architecture )
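The arrangement described above, where asynchronous tasks write their results into a separate store, can be sketched with the standard library alone. This is a toy model and not Celery's API: a `queue.Queue` plays the role of the message broker, a plain dict plays the role of Redis as the result store, and `submit` and `count_attendees` are hypothetical names.

```python
import queue
import threading
import uuid

task_queue = queue.Queue()   # stand-in for the message broker
result_store = {}            # stand-in for Redis as a result store
result_lock = threading.Lock()


def worker():
    """Pull tasks off the queue and record each outcome in the result store."""
    while True:
        item = task_queue.get()
        if item is None:          # sentinel: stop the worker
            task_queue.task_done()
            break
        task_id, func, args = item
        try:
            outcome = {"status": "SUCCESS", "result": func(*args)}
        except Exception as exc:
            outcome = {"status": "FAILURE", "result": repr(exc)}
        with result_lock:
            result_store[task_id] = outcome
        task_queue.task_done()


def submit(func, *args):
    """Enqueue a task and return an id the caller can use to look up the result."""
    task_id = str(uuid.uuid4())
    with result_lock:
        result_store[task_id] = {"status": "PENDING", "result": None}
    task_queue.put((task_id, func, args))
    return task_id


def count_attendees(names):
    """Hypothetical background job."""
    return len(set(names))


thread = threading.Thread(target=worker, daemon=True)
thread.start()

task_id = submit(count_attendees, ["ann", "bob", "ann"])
task_queue.join()        # wait until the worker has processed the task
task_queue.put(None)     # ask the worker to stop
thread.join()

print(result_store[task_id])
```

The web request only needs the task id; it can return immediately and poll the result store later, which is the main reason to pair a task queue with a fast key-value store.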

A little bit about statistics:

The development team consists of only 6 people, as Andrew Godwin stated in his presentation. The site had about 900,000 unique visitors in March 2014; the peak was in October 2013, at just over a million visitors (http://www.trafficestimate.com/lanyrd.com). Daily page views are about 55,000.

(source http://www.slideshare.net/InfoQ/inside-lanyrds-architecture )

Mozilla (software)


Used technologies:

Mozilla uses various programming languages, including Python, which is used widely: for build scripts, the company's website, Webmaker and other components, and also for the synchronization server, a minimalist WSGI application that uses Paste for deployment and Sqlite3 as a database.
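For readers unfamiliar with WSGI, here is a rough sketch of what a minimalist WSGI application backed by SQLite can look like. It uses only the standard library (`sqlite3`, `wsgiref`) rather than Paste, and the `items` table and its contents are invented for the example; it says nothing about Mozilla's actual schema or code.

```python
import json
import sqlite3
from wsgiref.util import setup_testing_defaults

# In-memory database with a hypothetical table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, payload TEXT)")
conn.execute("INSERT INTO items (payload) VALUES (?)", ("bookmark",))
conn.commit()


def application(environ, start_response):
    """Minimal WSGI callable: returns all stored items as JSON."""
    rows = conn.execute("SELECT id, payload FROM items ORDER BY id").fetchall()
    body = json.dumps([{"id": r[0], "payload": r[1]} for r in rows]).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


# Call the app directly with a synthetic request, as a WSGI server would.
environ = {}
setup_testing_defaults(environ)
captured = {}


def start_response(status, headers):
    captured["status"] = status
    captured["headers"] = headers


response_body = b"".join(application(environ, start_response))
print(captured["status"], response_body.decode("utf-8"))
```

Because the application is just a callable, any WSGI server can host it; for local experiments `wsgiref.simple_server.make_server` is enough.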

A little bit about statistics:

To date, more than 500 million people use Mozilla projects. This is a very large figure, given the competition. More than a thousand developers are involved in Mozilla projects.

(source http://expandedramblings.com/index.php/internet-browser-stats/ )

Yelp (portal with social network elements)


Used technologies:

Ubuntu Linux is the main operating system. Python was chosen as the programming language and Django as the framework. MySQL is the main database management system. Yelp also uses Amazon services, in particular S3 for storing logs and photos, and EMR. Solr / Lucene is used as the search platform. HAProxy and LVS are responsible for load balancing.

(source http://engineeringblog.yelp.com/ , http://aws.amazon.com/solutions/case-studies/yelp/ )

A little bit about statistics:

The site gets about 200 million visits per month, of which about 120 million are unique ( https://www.quantcast.com/yelp.com ). Over the lifetime of the project, users around the world have written more than 53 million reviews.

(source http://expandedramblings.com/index.php/yelp-statistics/ )

Foursquare (social network with the function of the site)


Used technologies:

The project is written in several languages, including Python, which is used to automate operational tasks and other processes. CentOS Linux is the main operating system. HAProxy handles load balancing and API requests. MongoDB is the main database management system, and caching is done by Memcache. Some of the data, namely user photos, is stored in Amazon S3. Hadoop handles analytics. Solr and Elasticsearch are the search platforms. For geo-indexing, the s2 library from Google is used together with PostGIS. Kestrel is responsible for handling asynchronous tasks.

(source https://foursquare.com/about )

A little bit about statistics:

The staff consists of approximately 140 employees. According to 2013 data, about 40 thousand developers were engaged in the project. The service has about 45 million users. The total number of check-ins worldwide is 5 billion, and approximately 3 million are added every day.

Rdio (music service)


Used technologies:

Rdio uses several programming languages. In particular, part of the backend is written in Python, with Django as the framework. Two databases, MongoDB and MySQL, are used for storage. Redis was chosen as an alternative to memcached.

A little bit about statistics:

The catalog holds more than 20 million songs. The site gets about 200 thousand page views a day. In the United States, about 300 thousand people use the service each month.

Google (search engine)


Used technologies:

Google uses many programming languages. Since the creator of Python once worked at Google, it is easy to assume that this language is used there too, and indeed it is: part of YouTube and of the search engine, as well as many other components, are written in Python. LevelDB is used as the main database management system. Google also uses Closure, a toolkit created by its own developers, for working with JavaScript.

A little bit about statistics:

Google's statistics are known to everyone, but here are a few numbers anyway. Every month 12.477 billion queries pass through the search engine. Google holds about 67% of the United States search market. It gets 191 million unique visitors per month. The company employs 53,891 people.

(source http://expandedramblings.com/index.php/by-the-numbers-a-gigantic-list-of-google-stats-and-facts/ )

Quora (social service Questions and Answers)


Used technologies:

Python is the programming language. Hosting is on Amazon. MySQL is the main database management system, and memcached handles object caching. HAProxy is responsible for load balancing.

A little bit about statistics:

The staff consists of 72 employees. The number of unique visits in February 2014 was more than one million. 1,126.00 people use the Quora service each month (data for 2013).

(source http://www.quora.com/How-much-traffic-does-Quora-get , http://techcrunch.com/2013/11/12/quora-confirms-its-favoring-search-ads-for-eventual-monetization-launches-author-stats-tool/ )

Summary


As you can see, quite a few well-known projects use Python / Django, and with good reason. Python is a very interesting modern programming language that is quickly gaining popularity, both in the West and on the Russian Internet. We have already written a short note about the advantages of Python / Django. Right now we see great demand for Python / Django development, because it means quality! And demand for projects creates demand for specialists.

If you want to develop a high-load project, we recommend considering Python / Django as the technical platform. And to colleagues: if you want to become a programmer or learn a new programming language, Python is one of the best options.


Original article: http://seclgroup.ru/article_vysokonagruzhennyye_sayty_i_prilozheniya_na_python_django.html

The authors:
Andrey Astafyev, Middle Project Manager, SECL GROUP / Internet Sales Technologies
Nikita Semenov, President, SECL GROUP / Internet Sales Technologies

Source: https://habr.com/ru/post/218921/

