Overview of the system architecture of the social network Campus.ru

Here is an overview of the high-level system architecture of the social network www.campus.ru, developed by Creative Media LLC. I think this material is interesting because it lets you judge how applicable these approaches and technologies are when developing Internet resources. When our company started the Campus project, I badly lacked this kind of information.

The social network Campus.ru helps schoolchildren and students build their future careers by running special contests from large employers, organizing internships and educational seminars, and providing functionality for communication, information sharing and other study-related needs, such as drawing up a class schedule. The functionality of Campus.ru largely overlaps with that of the social networks we are used to: registered users can communicate, form communities, write blogs, post photos, and so on. However, since this social network was aimed at schoolchildren and students from the start, some features were implemented specifically for their needs, for example the “Training Portfolio”, a folder for storing study materials, or the “Planner”, a service for drawing up a class timetable.

I will start with the fact that Campus.ru is implemented on the Java platform. Judging by general statistics, choosing Java to implement a media Web resource is a rather non-standard decision, but I chose this platform for the following reasons:

• My previous experience in developing high-load, fault-tolerant, horizontally scalable Java applications was successful.
• The Java community has accumulated a huge code base of high-quality open source libraries and frameworks for almost every occasion.
• When building a project team from scratch, it is easier to find several suitable Java developers on the labor market with the necessary qualifications, teamwork experience and an understanding of software architecture (Design Patterns) than to find, for example, Ruby developers (according to colleagues, the language is not yet widespread) or PHP developers (in my own experience, there are too many low-skilled people who take on work with barely any experience).
Anticipating caustic comments about the performance of Java applications, I should note that when I decided in favor of Java, I was well aware that the JVM is not the fastest or most economical runtime. However, firstly, the Java 6 JVM is productive enough for the needs of a Web application, and secondly, programmers' time is now more expensive than hardware, so spending a long time reinventing expensive wheels to win 5%-10% of performance would be unreasonable for a startup. I adhere to the principle “it is better to work quickly with a minimum of code and a beautiful architecture than to save 10% of hardware and get bogged down in debugging”.

The project's main framework is Tapestry 5. We chose it for a number of reasons, the main ones being:
• Tapestry lets you completely separate the layout from the presentation logic. There is no need to add special tags to the HTML that the browser does not interpret, as in JSP. This is especially important because a web project usually involves a complex layout, which is typically created and maintained by a dedicated specialist. By the way, in terms of layout optimization, we tried to follow best practices.
• It contains a number of architectural decisions aimed at high performance, such as the Page Pool.
• It has a sound and coherent architecture that lets you focus on the application's business logic rather than on configuration and on wiring components together, thanks to extensive use of annotations and naming conventions, as well as built-in IoC.
Since none of us had any previous experience with Tapestry 5, we first wrote a test prototype of the application and load-tested it with JMeter. Testing showed fast response times, stable utilization of the hardware and no lock contention, after which the decision to use Tapestry 5 was finalized.

Consequently, the internal architecture of the Campus.ru Web application follows the rules for developing applications on the Tapestry 5 framework.
In addition to Tapestry, our web application uses the following libraries:
• DOJO 1.0 - JavaScript framework with AJAX support.
We chose DOJO as the main JavaScript framework because it offers a large number of ready-made controls, including widgets, and because it lets you perform all stylistic transformation of components on the client side. When a resource is first accessed, all the necessary CSS and JavaScript files are loaded and cached in the browser, and after that only HTML pages with minimal markup are loaded from the server. This significantly reduces traffic and server load.
Last but not least, the team already had experience with DOJO. As for difficulties, we had to write the DOJO-Tapestry integration layer ourselves; in the future we plan to release it as an open source project.
• SwfUpload 2 - JavaScript component for uploading files to the server.
• Hibernate 3 - ORM framework. Tapestry 5 integrates seamlessly with Hibernate. We use annotations for entity mapping.
• Hibernate Search 3 - Lucene-based full-text search engine. It indexes the contents of Hibernate entities and is conveniently configured with annotations on the entities. It can work in a cluster.
• Spring Security 2.0 - authentication and authorization system that integrates seamlessly with Tapestry 5 and is configured with annotations. It is feature-rich, but it had no ACL mechanism suitable for assigning per-user access rights to domain objects, so we had to build our own. We plan to release it as an open source project in the future.
• Quartz 1.6 - scheduler for performing background and asynchronous operations. We try to implement long operations, such as sending mail or processing a file on the server, as asynchronous actions so as not to block the user interface for a long time.
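The idea of moving long operations off the request thread can be sketched as follows. This is only a minimal illustration: it uses the JDK's `ExecutorService` as a stand-in for Quartz, and the `AsyncMailer` class with its methods is hypothetical, not taken from the Campus.ru code base.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: a JDK thread pool stands in for the Quartz scheduler.
class AsyncMailer {
    private final ExecutorService workers = Executors.newFixedThreadPool(2);
    // Records "sent" messages so the effect is observable in this sketch.
    final List<String> sent = new CopyOnWriteArrayList<>();

    // Queues the slow work and returns immediately to the caller.
    void sendLater(String recipient, String body) {
        // A real implementation would talk to the mail server here.
        workers.execute(() -> sent.add(recipient + ": " + body));
    }

    // Stops accepting work and waits up to five seconds for queued jobs.
    boolean shutdownAndWait() {
        workers.shutdown();
        try {
            return workers.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

A request handler would call `sendLater(...)` and render the page right away, while the message is processed on a background thread.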
With luck, the number of users (and therefore the number of requests) of a social network and the volume of user-related data grow like an avalanche. For this reason it was very important to ensure horizontal scaling of the system, as well as a sufficient level of fault tolerance. We achieved these goals both by having redundant hardware servers for critical system nodes and by using special software at the OS level: HAProxy, Heartbeat, etc. So we are hardly afraid of failures or growing pains.

Speaking of software, the Campus infrastructure uses:
• Nginx 0.6 - load balancer for HTTP requests and HTTP server for serving static content (JavaScript, CSS, icons, etc.). The balancing of HTTP requests across the Tomcat application servers is configured so that within one session all requests from one user go to the same Tomcat instance. This load balancing setup is called a sticky session. It is done so that user sessions do not have to be replicated between Tomcat servers. The advantage is resource savings and linear horizontal scaling. The drawback is that if a server fails, its users have to log in to the resource again.
• JDK 6 - set of development tools and the Java virtual machine (JVM) on which the J2EE application server runs.
• Tomcat 6 - J2EE application server (J2EE container for web applications). The Campus.ru application runs under it. We did not combine the Tomcat servers into a cluster, in order to preserve linear horizontal scaling at this level.
• PostgreSQL 8.3 - DBMS with multiversion concurrency control (MVCC). We chose between MySQL and PostgreSQL and settled on Postgres after analyzing various reviews, expert blogs and the availability of tools for database clustering.
• PgPool-II 3.4 - load balancer and data replicator, used to combine two PostgreSQL servers into a cluster. Requests that change data are sent to both database servers simultaneously, while read requests go to one of the servers in turn. Thus, the load is distributed between the two machines.
• PgBouncer 1.3 - lightweight connection pooler for PostgreSQL. When a large number of application servers process user requests simultaneously, a large number of database connections must be maintained. PgBouncer spends about 2 KB of memory on each connection and significantly relieves the load on the PostgreSQL DBMS. For this reason, the JDBC pools on the Tomcat servers are configured to connect to PgBouncer, not to PgPool. An additional advantage is that if the database is briefly unavailable (for example, during a quick restart), PgBouncer keeps trying to connect to it, and if it succeeds before the configured timeout, the application server will not even know that the database was temporarily unavailable.
• ActiveMQ 5.2 - JMS server. It is used for asynchronous JMS messaging in the Hibernate Search full-text search system, which is configured in JMS Master/Slave mode (details here). It is also used to exchange information with the asynchronous task execution subsystem based on the Quartz framework.
• Sendmail - well-known mail server. We use it for mailings.
• Zabbix 1.6 - infrastructure monitoring system. It monitors the state of the JVM and Tomcat over the JMX protocol through MBean interfaces. Monitors up to 30 hosts for free.
• Smssend - package for FreeBSD that allows you to send SMS. We use it to send SMS messages to the system administrator about problems on the resource.
• Chandler Server - open source Web application, a calendar server that communicates with the outside world over the open CalDAV protocol. In order not to reinvent the wheel, we use this server in our Planner service to store event schedules for users and communities.
• Amazon S3 - used to store user files reliably. In addition, this service helps take load off the application servers, since users download files directly from Amazon servers. The file metadata is stored in the Campus database.
A client-side Java API is available for working with Amazon. The service is paid but inexpensive for an organization. If you decide to build something like this yourself, I recommend MogileFS; a Java API is available for it as well.
• Fotki.com - our partners' photo hosting, which stores and converts photos uploaded by users to photo albums. Photos are again loaded into the browser directly from the photo hosting servers, which offloads the Campus.ru application servers. If I had to convert photos myself, I would probably look at ImageMagick, which also comes with a Java API.
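The routing rule described for PgPool-II in the list above (data-changing requests go to both servers, reads alternate between them) can be illustrated with a toy model. This is only a sketch of the idea in Java, not how PgPool-II is implemented; the `ReplicatingRouter` class, the backend names and the simplistic "starts with SELECT" check are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.concurrent.atomic.AtomicInteger;

// Toy model of replication-mode routing; not PgPool-II's actual logic.
class ReplicatingRouter {
    private final List<String> backends;
    private final AtomicInteger nextReader = new AtomicInteger();

    ReplicatingRouter(List<String> backends) {
        this.backends = new ArrayList<>(backends);
    }

    // Returns the backends that should execute the given SQL statement.
    List<String> route(String sql) {
        String head = sql.trim().toUpperCase(Locale.ROOT);
        if (head.startsWith("SELECT")) {
            // Reads go to a single backend, chosen in turn.
            int i = Math.floorMod(nextReader.getAndIncrement(), backends.size());
            return Collections.singletonList(backends.get(i));
        }
        // Anything that may change data is sent to every backend.
        return new ArrayList<>(backends);
    }
}
```

With two backends, consecutive reads land on alternating servers while every write reaches both, which is why read load is shared but write load is not.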

Put together, it works like this:
1) The user's request arrives at the load balancer. Nginx checks whether the request contains a header with information about the user's binding to one of the Tomcat servers. If not, such a header is added, and the request is forwarded to the appropriate Tomcat server. Initially, the server is selected in round-robin fashion. This mechanism provides the sticky session.
2) The Tomcat server generates the dynamic content of the HTML page: in effect only data, with minimal style markup. If data from the database is needed to generate the page, the Campus.ru application takes a connection from this server's JDBC pool. Connections in the JDBC pool are made to PgBouncer, which in turn connects to PgPool, and PgPool connects to PostgreSQL. Since the connections in the JDBC pool are established as soon as the Tomcat server starts, obtaining a connection from the pool while the application is running is very fast.
3) If the user's request changes information involved in full-text search, the application sends a JMS message to the Master Node search server, which is responsible for synchronizing the full-text indexes on each of the cluster nodes.
4) The content of the HTML page is returned to the user's browser. The browser then starts downloading static content (JavaScript, CSS, images) from the Nginx server, if it has not been cached yet, as well as photos and avatars from the external photo hosting servers.
5) The user's browser applies the CSS and JavaScript transformations to the loaded DOM tree and finally renders the page to the user.
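The sticky-session logic from step 1 can be sketched as follows. This is an illustration in Java rather than an Nginx configuration; the `StickyBalancer` class, the plain string used as the binding value and the server names are assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical model of a balancer that pins each user to one server.
class StickyBalancer {
    private final List<String> servers;
    private final AtomicInteger next = new AtomicInteger();

    StickyBalancer(List<String> servers) {
        this.servers = new ArrayList<>(servers);
    }

    // binding: the server named by the request, or null on the first visit.
    String choose(String binding) {
        if (binding != null && servers.contains(binding)) {
            // Keep the user on the same Tomcat instance.
            return binding;
        }
        // No valid binding yet: pick the next server in round-robin order.
        int i = Math.floorMod(next.getAndIncrement(), servers.size());
        return servers.get(i);
    }
}
```

The value returned for a first request is what the balancer would attach to the request, so that later requests from the same user carry it back. A binding to a server that is no longer in the list falls through to round-robin, which matches the drawback mentioned earlier: the user ends up on another instance and has to log in again.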

This scheme, firstly, reduces the load on the application server and minimizes traffic and, secondly, speeds up overall page loading by working around the browser restriction that allows only four simultaneous downloads from a single host. The flip side is a side effect seen in old browsers like IE 6, which do not manage to apply all the dynamic style transformations before the page is displayed, so the user can watch a distorted page turn into a beautiful one before their eyes. In addition, the initial download of all CSS and JavaScript files may take some time on slow connections.
By the way, this reminds me of the problem of updating cached static content (CSS, JS) in the user's browser. To solve it, we add a version number to the URL of each static resource. When scripts change, the version number in their URL changes too, and the browser downloads the updated files from the server.
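A minimal sketch of such URL versioning is shown below. The article does not say where in the URL the version goes, so the `v` query parameter, the `StaticUrls` class and the example paths are assumptions.

```java
// Hypothetical helper that makes a static resource URL depend on its version.
class StaticUrls {
    // Appends the version so that a new release produces a new URL,
    // which the browser treats as a resource it has not cached yet.
    static String versioned(String path, String version) {
        String separator = path.contains("?") ? "&" : "?";
        return path + separator + "v=" + version;
    }
}
```

For example, `StaticUrls.versioned("/static/app.js", "17")` yields `/static/app.js?v=17`; after the next release the version changes and the old cached copy is simply no longer requested.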

To make the above easier to keep in your head, the figure below shows a high-level deployment diagram of the Campus.ru Web application. To keep the diagram simple, some of the connections between the components are not shown.

Figure 1. Campus.ru Deployment Chart

Experience shows that the JVM works efficiently with memory sizes up to 2 GB, and it is recommended that each JVM instance on a server have at least two processor cores. For these reasons, 4 instances of the Tomcat application server were deployed on each hardware server. The Campus and Chandler web applications were deployed in each Tomcat container. Live testing showed that, with the GC settings we chose, this configuration utilizes the hardware resources almost linearly as the load grows and gives acceptable response times.
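The arithmetic behind the figure of 4 instances can be spelled out as a small sketch. It assumes the web application servers described later in the article (two quad-core CPUs, 16 GB of RAM) and a 2 GB memory limit per JVM; the `JvmSizing` class is hypothetical, and the exact memory settings used on Campus.ru are not given in the article.

```java
// Hypothetical helper for the sizing arithmetic; not part of Campus.ru.
class JvmSizing {
    // Number of JVM instances if each one should get its own share of cores.
    static int instances(int totalCores, int coresPerJvm) {
        return totalCores / coresPerJvm;
    }

    // Memory used by all instances together, in GB.
    static int totalMemoryGb(int instances, int memoryGbPerJvm) {
        return instances * memoryGbPerJvm;
    }
}
```

With 8 cores and 2 cores per JVM this gives 4 instances, and 4 instances of 2 GB each take 8 GB, which leaves half of the 16 GB for the operating system and everything else.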

On all hardware servers, applications run under FreeBSD 7.1. We chose FreeBSD rather than Linux because, all other things being equal, each Linux distribution has its own peculiarities, and it would be harder for system administrators to transfer knowledge to each other. In our company all projects so far run on FreeBSD, so it has become something like a corporate standard. Among the drawbacks of FreeBSD for a Java project: Sun does not release a JDK for FreeBSD, and the FreeBSD ports lag behind Sun's updates. The JDK ports also lack some of the debugging utilities.

Fault tolerance on load balancer servers is implemented through the HAProxy utility.

Campus hardware servers have the following configuration:
1) Servers for load balancing and serving static content:
CPU Intel Xeon Dual Core 2.67 GHz \ RAM DDR2 8 GB \ HDD 4xSAS 73 GB 15000 rpm
2) Servers for web application deployment:
CPU 2xIntel Xeon Quad Core 2.66 GHz \ RAM DDR2 16 GB \ HDD 4xSATA 300 GB 15000 rpm
The main principle when choosing a server for a Web application: the more cores/processors and RAM, the better.
3) Servers for the database:
CPU 2xIntel Xeon Quad Core 2.66 GHz \ RAM DDR2 16 GB \ HDD 8xSAS 147 GB 15000 rpm
The main principle when choosing a database server that is expected to handle large amounts of data: the more disks and the faster they are, the better. Hardware RAID is a must. After that, the priorities are RAM, followed by processors.

The servers are hosted in the data center of a large Moscow Internet provider, behind a hardware firewall from Cisco.

In conclusion, I would like to say that, despite the large amount of work done, we still have things to improve. For example, with the release of IE 8 we face a painful migration to DOJO 1.3. We also plan to make the site layout lighter (removing shadows, translucency and unnecessary rounded corners), because students in the regions in practice have a 128 Kbit/s connection and IE 6 in schools and universities. In the very near future we plan to implement data caching with the Memcached distributed cache. We also have not yet solved the problem of data sharding, should the volume of data suddenly start to grow threateningly. Therefore, in the future we intend to explore Hibernate Shards, PL/Proxy and other tools. If someone has practical experience and is willing to share it, I will be very glad.

In the next article, if you support me, I plan to talk about the management of the Campus project and our team.

Thank you for your attention!
Technical Director, Creative Media LLC
Sergey Sedov
Published with the permission of Ivan Sokolov, Director General of Creative Media LLC

Source: https://habr.com/ru/post/57162/
