Baseline magazine has published a detailed technical analysis of the MySpace.com site infrastructure. MySpace is one of the largest web services on the Internet and is now competing with the Yahoo portal for the title of the most visited site on the Web.
The topic is all the more interesting because MySpace was launched quite recently, in July 2003, and its popularity has grown very rapidly. As a result, infrastructure upgrades were carried out in emergency mode, and the server architecture has already been overhauled five times.
MySpace now has more than 140 million registered users. Users do not just visit the site daily but spend a lot of time on it and publish content, which puts a heavy load on the servers. In November 2006, 38.7 billion page views were recorded, only three years after the site opened. Each page is generated dynamically, pulling information on the fly from numerous databases.
MySpace servers run on Microsoft software, which is not coping with the load. MySpace users see error messages literally every day. According to independent experts, at peak load between 20% and 40% of attempts to log in to the site fail with an “Unexpected Error” message.
The Internet-wide average for this figure does not exceed 1%. A 20–40% failure rate would be unacceptable for any commercial site, but MySpace’s main audience is teenagers, who seem willing to put up with it.
On the night of July 24, 2006, MySpace.com went completely offline for 12 hours (there was a power outage in California, and MySpace had no backup data center). Users were greeted by an apology page with a Flash game so that they could kill time while waiting for their favorite site to reopen. Interestingly, according to Hitwise statistics, the site had more visitors during the downtime than on a typical day. This shows how attached users are: they checked on the state of their virtual home again and again.
According to technical experts, the MySpace infrastructure was not originally designed for such a load. Unlike the creators of Yahoo, eBay or Google, the creators of MySpace, who came from the small spam company Intermix Media, did not plan for scaling. Evidently, they were not ready for such rapid audience growth.
MySpace came about as follows. In 2003, the US passed the CAN-SPAM anti-spam law, so the owners of Intermix Media decided to change their line of business and open their own social network. To build the web application, they hired a programmer, Duc Chau, who wrote the first version of MySpace. The site ran on Perl with the Apache web server and the MySQL DBMS. However, other Intermix Media programmers, who had experience with ColdFusion, did not like this program, so they rewrote it in ColdFusion. Naturally, it now ran on Windows and Microsoft SQL Server, and Duc Chau quit.
MySpace launched at exactly the moment when Friendster, then the most popular social network, started having performance problems. Users had to wait 20-30 seconds for each page to load, and the developers could do nothing about it for lack of funds. Very quickly, users switched en masse to MySpace, whose servers were working quite well.
Initially, the MySpace site ran on just two Dell servers (4 GB of memory and two processors) with a single database server. As incoming requests grew, new web servers were purchased. The problems began in early 2004, when the number of registered users reached 400 thousand and the database server could no longer cope with the load. Adding database servers is not as easy as adding web servers, so the engineers decided to set up a group of three SQL Server databases (one main and two copies).
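The "one main and two copies" arrangement can be illustrated with a small routing sketch. This is only an assumption about how such a setup is typically used: the article does not say how requests were distributed, so the rule here (reads rotate across the copies, everything else goes to the main database) and all names such as `ReplicatedDatabase` and `db-main` are hypothetical.

```python
import itertools


class ReplicatedDatabase:
    """Routes writes to one main server and spreads reads over its copies."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = list(replicas)
        # Round-robin iterator over the read-only copies.
        self._next_replica = itertools.cycle(self.replicas)

    def server_for(self, sql):
        """Pick a server name for a statement based on its first keyword."""
        verb = sql.lstrip().split(None, 1)[0].upper()
        if verb == "SELECT":
            return next(self._next_replica)
        # INSERT, UPDATE, DELETE and anything else go to the main server.
        return self.primary


# Hypothetical server names, for illustration only.
db = ReplicatedDatabase("db-main", ["db-copy-1", "db-copy-2"])
routes = [
    db.server_for("SELECT * FROM profiles WHERE id = 1"),
    db.server_for("SELECT * FROM profiles WHERE id = 2"),
    db.server_for("INSERT INTO comments VALUES (1, 'hi')"),
    db.server_for("SELECT * FROM profiles WHERE id = 3"),
]
print(routes)  # ['db-copy-1', 'db-copy-2', 'db-main', 'db-copy-1']
```

A scheme like this helps only while reads dominate: every write still lands on the single main server and has to be propagated to the copies.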
The next upgrade was needed in mid-2004, when the number of users approached 2 million and the database server could no longer keep up with the volume of read and write requests. One symptom was that newly posted comments appeared on the site with a delay of up to five minutes. The solution was to separate the data storage system from the DBMS.
The third upgrade (3 million users) followed shortly after the second, because the DBMS still could not cope with the load. This time it was decided to build a large distributed system out of relatively inexpensive database servers that could easily be scaled in the future. The site software also had to be rewritten. It was the biggest upgrade of all. To balance the load across the servers, users are divided into clusters so that each database server serves 2 million people.
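Dividing users among database servers in fixed-size groups can be sketched as follows. The article gives only the group size (2 million users per server); the assumption that users have sequential integer IDs starting at 1 and are assigned by ID range, as well as the function names, are illustrative and not taken from the source.

```python
USERS_PER_SERVER = 2_000_000  # figure quoted in the article


def shard_for(user_id, users_per_server=USERS_PER_SERVER):
    """Return the zero-based index of the database server holding this user."""
    if user_id < 1:
        raise ValueError("user IDs are assumed to start at 1")
    return (user_id - 1) // users_per_server


def servers_needed(total_users, users_per_server=USERS_PER_SERVER):
    """Ceiling division: how many servers cover the whole user base."""
    return -(-total_users // users_per_server)


print(shard_for(1))               # 0
print(shard_for(2_000_000))       # 0
print(shard_for(2_000_001))       # 1
print(servers_needed(3_000_000))  # 2
```

With range-based assignment, growth is handled by adding a server for the next block of IDs, and existing users never need to move.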
When the user base reached 9 million in early 2005, engineers began migrating from the ColdFusion servers to a new version of the web software written in Microsoft C# and running under ASP.NET. It immediately turned out that programs run much more efficiently under ASP.NET: with the new code, 150 servers did the work that had previously required 246 (roughly 39% fewer). In addition, a new professional-grade data storage system that can withstand a heavy load was installed. A little later (at 17 million users), a tier of cache servers was added to the system as intermediaries between the web servers and the database servers.
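The role of a cache tier between web servers and database servers can be shown with a minimal cache-aside sketch. The article does not describe how MySpace's cache servers actually worked, so this is a generic illustration: the `CacheTier` class, the in-process dictionary standing in for a separate cache server, and the `fake_profile_query` function are all hypothetical.

```python
class CacheTier:
    """Cache-aside lookup: try memory first, fall back to the database."""

    def __init__(self, load_from_db):
        self._load_from_db = load_from_db  # callable: key -> value
        self._store = {}
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._store:
            self.hits += 1
            return self._store[key]
        # Miss: ask the database once, then remember the answer.
        self.misses += 1
        value = self._load_from_db(key)
        self._store[key] = value
        return value

    def invalidate(self, key):
        """Drop a cached entry after the underlying row changes."""
        self._store.pop(key, None)


db_calls = []


def fake_profile_query(user_id):
    # Stand-in for a real database round trip.
    db_calls.append(user_id)
    return {"id": user_id, "name": f"user{user_id}"}


cache = CacheTier(fake_profile_query)
cache.get(42)
cache.get(42)
cache.get(7)
print(cache.hits, cache.misses, db_calls)  # 1 2 [42, 7]
```

Repeated requests for the same profile reach the database only once, which is how a cache tier takes read load off the database servers.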
The last MySpace upgrade took place in mid-2005 (26 million users), when the site migrated to the new SQL Server 2005 DBMS while it was still in beta testing. The rush was because this was the first version of SQL Server to support 64-bit processors with extended memory addressing, and memory was the bottleneck in the MySpace infrastructure at the time.
Although a year and a half has passed since then and the number of registered users has grown to 140 million, no further major upgrades have been carried out. In mid-2005, media mogul Rupert Murdoch bought MySpace for $580 million. It was a good deal, because the site’s value has since risen to about $6 billion (according to Murdoch himself). In 2007, the site is expected to bring in $400 million in revenue, mainly from advertising.
Not surprisingly, the MySpace infrastructure is still faltering. Microsoft software cannot handle a load it was not designed for. In November 2006, the site hit SQL Server's limit on the number of simultaneous connections, which is the main reason for the constant failures. In addition, the Windows 2003 servers have shut down unexpectedly when their built-in protection against DoS attacks was triggered by mistake.
The next MySpace upgrade is planned for mid-2007: a distributed storage system, so that the site no longer depends on the data center in Los Angeles.
MySpace infrastructure
Task | Product | Manufacturer |
Web application technology | Microsoft Internet Information Services, .NET Framework | Microsoft |
Server operating system | Windows 2003 | Microsoft |
Programming language and environment | C# applications for ASP.NET | Microsoft |
Programming language and environment | Initially, the site was launched under Adobe's ColdFusion; the remaining ColdFusion code now runs under New Atlanta's BlueDragon.NET | Adobe, New Atlanta |
DBMS | SQL Server 2005 | Microsoft |
Data storage | 3PAR Utility Storage | 3PARdata |
Web application acceleration | NetScaler | Citrix Systems |
Servers | Standard HP 585 (see below) | Hewlett-Packard |
Advertising engine | DART Enterprise | DoubleClick |
Search engine and contextual advertising | Google search | Google |
The standard database server (there are now 65 of them) is built on the HP 585 with four dual-core 64-bit AMD Opteron processors and 64 GB of RAM (recently increased from 32 GB). It runs Windows 2003 with Service Pack 1 and Microsoft SQL Server 2005 with Service Pack 1, and has a 10 Gbps network card.