
On April 4, we held the first FailOver Conference - a conference entirely dedicated to the resilience of websites and their uninterrupted operation.
For the whole day, more than 7 hours, we listened to talks from developers and architects of cloud services, system administrators, hosting providers and experienced DBAs, discussed them, argued ...
The topic turned out to be very important and in demand - more than 1,200 people registered for the online broadcast.
Of course, it is difficult to "digest" such a large amount of information at once. Some things you will want to come back to later and try to apply to a specific project. That is why one of the most frequent questions, both from the audience and online (on Twitter), was: “Will the materials be published?”
Yes, of course! We are publishing both the presentations and the videos of the talks, and we are happy to share them with you!
* * *
1. "Resilience Economics" (Alexander Demidov, "1C-Bitrix")
Do all projects need resiliency? And if so, to what extent? How many “nines” of uptime should we strive for? And most importantly - how much does it cost? Let's try to figure it out and find answers to these questions.
2. "Optimization of costs for fault tolerance" (Nikolay Dvas, Clodo.ru)
How to formulate business requirements for fault tolerance and use cloud tools to achieve the level of fault tolerance that is optimal for those requirements.
3. “Building reliable, flexible, expandable systems in the real world” (Yuri Ustinov, Rusonix)
* Hardware or cloud? The truth is somewhere in between.
* N + 1 reasons why hardware is a difficult but poor choice.
* N + 1 reasons why the cloud is an easy but poor choice.
* The Jedi technique of building flexible systems: we eliminate infrastructure bottlenecks, avoid cloud problems, expand, scale, operate.
4. “Failover Code” (Ilya Pyatin, LineMedia)
* Theoretical approach (what to pay attention to, how and why)
* Streaming data processing (saving memory, preparing for processing, queues)
* Asynchrony and timeouts (working with external data sources and the failure of external services)
* Efficient and secure resource sharing
* Adding MongoDB to the code (not only as a cache; data consistency)
5. “New replication features in MySQL 5.6” (Konstantin Osipov, Mail.Ru Group)
One of the major innovations of MySQL 5.6 is global transaction identifiers (GTIDs). In my talk I will try to explain how the new identifiers work, as well as the new 5.6 features based on this concept. In particular, I will cover the new, simplified fail-over mechanism, multi-source replication, and replication conflict resolution.
6. “Monitoring of Web Projects: Operational Response Headquarters and Analytical Center” (Alexander Serbul, “1C-Bitrix”)
* Why monitor web cluster systems; a review of priority risks
* What to measure and why, useful tools and techniques
* Tactics - actions in emergency situations, emergency restoration of web systems when backups are “not useful enough”
* Strategy - analysis of system trends and development of the web cluster architecture, forecasting of the required capacity, selection and evaluation of new technologies
7. “Stability of the project in the context of continuous integration” (Denis Mitrofanov, QSOFT)
* Long-term intensive development with permanent releases
* Work organization, division into major and minor iterations
* Different mindsets and KPIs of tech support and production
* Processes of production, shipment and quality control
* Architectural solutions and technical tools that allow for large updates without stopping the project
8. “Redundancy and Backup: Different Scenarios and Recovery Doctrines” (Dmitry Sizikov, Sum IT)
Database backup:
* Tools for creating hot MySQL backups: mysqldump, Percona XtraBackup, mysqlhotcopy.
* Automating database backups.
* Replication is not a backup. Monitoring replication status. Multimaster replication: Rubyrep.
Redundancy and backup of file systems / physical media:
* RAID: software / hardware, LVM. Monitoring RAID status.
* Common backup methods: tar.gz -> ftp, scp, sftp, key-based authentication, chroot.
* Services for storing backups: Amazon S3, Glacier, others.
* Full, incremental and differential backup tools: bacula, amanda, rsync, lsync.
* Snapshots: ZFS, LVM, rsnapshot.
* Backup retention depth and backup frequency.
* Optimizing data for backup. Since the resulting file can be quite large, unnecessary data should be excluded from backups to save resources.
* Backup security and encryption. Since backups contain confidential data, you must either rule out the possibility of them falling into the wrong hands or encrypt them.
* Restoring and checksumming backups. When the recovery process takes a long time, you need to check the integrity of the backup file before starting.
* Monitoring and control of the process. The backup system should be able to raise an alert about problems at every stage: creation, archiving, uploading, etc.
9. “About DDoS on a Web Cluster” (Alexander Krizhanovsky, NatSys Lab)
Modern DDoS attacks use random (or pseudo-random) HTTP headers and URLs, slow requests, requests to the site's most load-sensitive resources, and request characteristics that change periodically. All this makes DDoS attacks more destructive and harder to block. We'll consider:
* how, in most cases, DDoS traffic differs from normal traffic;
* how the load from DDoS differs from the usual high load caused by, for example, a flash crowd;
* what types of load DDoS puts on application services and the operating system;
* DDoS protection on your own using open-source tools
10. “How to reduce the load on a high-traffic project?” (Vitali Gavrilov, Lenvendo)
* 1. Conditional division of projects into two groups:
Highly dynamic projects
Low-dynamic projects
* 2. Reducing the load on highly dynamic projects
Separation of content into fully dynamic and conditionally static content
Re-composing pages for optimal use of AJAX
Static caching of conditionally static content
* 3. Reducing the load on low-dynamic projects
Separation of content into completely static, conditionally static and highly dynamic content
Managed caching of conditionally static content
* 4. Search engines and dynamic content in AJAX mode
* * *
It seems the conference turned out to be great and useful. :)
Many thanks to all who participated! See you at the next #FailOverConf! :)