
Creating an authorization system in a high-load project using MemcacheDB

Hello!

In this article I want to talk about the authorization problems that any popular website can run into as it grows.

Where to store the user authentication database?
How to quickly authorize a user by their string login?
How to collect user data distributed across multiple shard tables and multiple databases?
How to make it all work, and how can MemcacheDB help us?

A few months ago, on our project, we ran into the fact that the database architecture we had originally created could not cope with the current load. This concerned the user database first of all.

Problem


Yes, we used memcached to cache queries, we kept a lot in sessions, and we sharded user data across multiple tables and databases, but something as basic as authenticating a user on the site began to take an indecently long time.

In addition, to collect data about a user's friends while the user was working with the site, having only their identifiers, the application had to query the users table, which by that time had already accumulated one and a half million rows.

Under high load, selects by the text index on the user's login began to take a very long time. Most visitors to the site were new users, so registrations ran at several tens of thousands per day.

Search for a solution


There were several possible solutions.

On the one hand, we considered partitioning users across a set of authentication tables in MySQL, such as user1, user2, ..., userN, where N is somehow computed from the user's login. But what do we do when the number of tables in one database becomes too large for a relational database to perform normally?
How do we keep scaling flexible? Large portals build their own authentication systems, but our team works primarily for results with limited resources, and such work would have taken too much time, so we began to look for an existing solution.
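
To make the table-partitioning idea concrete, here is a minimal sketch of how N might be computed from the login. The article does not say which function was considered; the CRC32 hash, the table count of 16, and the name auth_table_for are all assumptions made for illustration.

```python
import zlib

NUM_TABLES = 16  # hypothetical number of authentication tables


def auth_table_for(login: str, num_tables: int = NUM_TABLES) -> str:
    """Map a login to one of the tables user1..userN with a stable hash."""
    normalized = login.strip().lower()
    # crc32 gives the same value in every process, unlike Python's hash().
    bucket = zlib.crc32(normalized.encode("utf-8")) % num_tables
    return f"user{bucket + 1}"
```

Note that changing num_tables moves most logins to a different table, which is one face of the scaling problem raised above.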

Then we turned to key-value databases, which you can read about here: http://habrahabr.ru/blogs/hi/55077/
For a number of reasons, above all its relative popularity, we settled on MemcacheDB.

Architecture


Now, about the architecture.

As an experiment, we decided to store in MemcacheDB not only the basic authorization data but also information about where each user's data is located, for quick retrieval.

We decided to duplicate the basic MemcacheDB write operations in the MySQL database, because we were not sure that MemcacheDB would reliably preserve data integrity, and we are still not sure. The author of MemcacheDB, Steve Chu, recommended the same approach in his reply to our letter, which we discuss below.

Reads were served only from MemcacheDB, and once a day a cron job checked that the MySQL and MemcacheDB data were identical.
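
The daily check could look roughly like the sketch below. It assumes that the MySQL rows have already been fetched as dictionaries, that the key-value store exposes a dict-like get, and that each user is stored under the two keys described later in this article. The function name find_mismatches is hypothetical.

```python
def find_mismatches(mysql_rows, kv_store):
    """Return the keys whose stored value is missing or differs from MySQL."""
    mismatched = []
    for row in mysql_rows:
        # Each user is expected under both the login key and the id key.
        for key in (f"user_{row['login']}", f"user_{row['id']}"):
            if kv_store.get(key) != row:
                mismatched.append(key)
    return mismatched
```

A cron job could log the returned keys or rewrite them from MySQL, which is treated here as the source of truth.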

The project runs on PHP5. Initially we used the pecl-memcache module as the MemcacheDB client, but load testing on a large amount of data showed that the module tends to drop the connection when the read delay is large (by memcache standards). After much trial and error we wrote to the author, Steve Chu, who replied that he was aware of problems of this kind and recommended the recently published pecl-memcached module as the client (note the added "d"). This solved our problems, and the author's quick response finally convinced us that we had made the right choice.

For quick recovery if the database is lost, a second MemcacheDB instance runs on a separate server, and the main data is replicated to it. If the data fails, we will arrange a quick switch to the backup server.

So each user is stored as a key => value pair twice.
The value is always the user data, and the keys are user_login, where login is the user's login, and user_id, where id is the user's ID.

Thus, the registration process looks like this:
  1. User data is stored in the MySQL database.
  2. User data is stored in MemcacheDB under the key user_login, where login is the user's login.
  3. User data is stored in MemcacheDB under the key user_id, where id is the user's ID.
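
The three registration steps can be sketched as follows. Plain dictionaries stand in for both MySQL and MemcacheDB so that the example runs on its own. The JSON serialization and the name register_user are assumptions, and the sketch also assumes that a login can never look like a numeric ID, since otherwise the two keys could collide.

```python
import json


def register_user(db_rows, kv_store, user):
    """Store the user in the primary database, then under both keys."""
    # Step 1: the relational database keeps the authoritative copy.
    db_rows[user["id"]] = dict(user)
    # Steps 2 and 3: the same serialized record goes under two keys.
    payload = json.dumps(user, sort_keys=True)
    kv_store[f"user_{user['login']}"] = payload
    kv_store[f"user_{user['id']}"] = payload
    return payload
```

With a real memcached-protocol client, the two dictionary assignments would become set calls on that client.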

The authorization process:
  1. The user enters their authentication data: login and password.
  2. The system retrieves the user data from MemcacheDB by the user_login key and performs authentication.
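
A minimal sketch of the lookup-and-verify step is shown below. The article does not describe how passwords are stored, so the salted PBKDF2 hash, the record fields salt and password_hash, and the function names are all assumptions. A dictionary again stands in for MemcacheDB.

```python
import hashlib
import hmac


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a password hash (assumed scheme: PBKDF2-HMAC-SHA256)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def authenticate(kv_store, login: str, password: str):
    """Return the user record if the password matches, otherwise None."""
    # A single key lookup replaces the slow select by text index.
    record = kv_store.get(f"user_{login}")
    if record is None:
        return None
    candidate = hash_password(password, record["salt"])
    # Constant-time comparison avoids leaking how many bytes matched.
    if hmac.compare_digest(candidate, record["password_hash"]):
        return record
    return None
```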

The process of obtaining information about the user's friends:
  1. The user loads a list of friends.
  2. The system obtains the list of the user's friends' identifiers (possibly from the same MemcacheDB), retrieves information about each friend by the user_id key, and uses that information to collect the rest of the user data distributed across the databases.
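
The friend lookup can be sketched as below, assuming each stored record carries a hypothetical shard field that says where the rest of that user's data lives. Grouping the identifiers by shard lets the application send one query per shard instead of one per friend. The field name and the function name are illustrative, not taken from the project.

```python
from collections import defaultdict


def group_friends_by_shard(kv_store, friend_ids):
    """Look up each friend by user_<id> and group the ids by data location."""
    by_shard = defaultdict(list)
    for friend_id in friend_ids:
        record = kv_store.get(f"user_{friend_id}")
        if record is None:
            continue  # unknown or deleted friend: skip it
        by_shard[record["shard"]].append(record["id"])
    return dict(by_shard)
```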

Benefits


The main advantages of this approach are:
1. On both small and large amounts of data, MemcacheDB performs several times faster than MySQL on the same task.
2. MemcacheDB easily handles a large number of key-value pairs; the database currently holds about 4 million.
3. Sharding and separating data by entity are also possible with MemcacheDB, by running several MemcacheDB instances with different databases.
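
As an illustration of the third point, the sketch below picks a MemcacheDB instance first by entity (the key prefix) and then by a hash of the key. The host names, the port, the friends entity, and the CRC32 hash are all hypothetical; the article does not describe a concrete routing scheme.

```python
import zlib

# Hypothetical layout: one group of instances per entity type.
INSTANCES = {
    "user": ["mcdb-user-1:21201", "mcdb-user-2:21201"],
    "friends": ["mcdb-friends-1:21201"],
}


def instance_for(key: str) -> str:
    """Choose an instance by entity prefix, then by a stable hash of the key."""
    entity = key.split("_", 1)[0]
    servers = INSTANCES[entity]
    return servers[zlib.crc32(key.encode("utf-8")) % len(servers)]
```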

These changes have been running in production for two months now, and so far there have been no problems.
I really hope this article is useful to you, both for learning and for solving problems that have piled up.

Thanks for your attention!

Source: https://habr.com/ru/post/55484/

