Three's a crowd: how we implemented mail collection using OAuth 2.0

"Maybe you'd also like the key to the apartment where the money is?" That is roughly how a normal person reacts when a third-party service asks for the password to their main mailbox. Yet most of us regularly have to hand our passwords over to third-party services. Today I want to talk about how we implemented authorization through OAuth 2.0 for collecting mail from our mailboxes, so that Mail.Ru users no longer have to entrust the "keys" to their mail to a third party.

Typically, when you set up a mail collector, a mail client, or a third-party mobile application, you have to enter a name, a mailbox address, and a password. The most unpleasant part of this procedure is entering the password. If you care about security, you deliberately chose a complex password for this mailbox and have only ever entered it on the mail service's own website. Now you have to entrust it to a third party that will store it and send it over the network. Transmission is the lesser concern (Mail.Ru Mail supports SSL for IMAP), but storage can be dangerous. How is the password stored? Can it be stolen? Can someone else read the mail? And will the third-party service use it only to access mail? Might it accidentally delete, say, files from the cloud? Users often ask questions like these.

You can avoid storing a password on a third party's server. The solution is obvious: let everyone work through OAuth 2.0 when collecting mail from Mail.Ru over IMAP into mailboxes at other mail providers, as well as when connecting mail clients and third-party mobile applications. We have taken this step. Now, everything in order.

OAuth in brief

What is OAuth? The full protocol specification is given in RFC 6749. It defines more than one way to obtain authorization: for example, a mobile application accesses a resource in a slightly different way than a web application or a device does. For simplicity, we will restrict ourselves to the special case of a web application.

There are several roles in OAuth.

A resource owner is a user who wants your application to perform actions on his behalf.

A resource server is a server that serves what the resource owner owns (for example, a resource server can be the mail server where the user’s mailbox is located).

The authorization server is the server through which the OAuth provider performs authorization. In the simplest case, the authorization server and the resource server are the same, at least from the point of view of the outside world.

The client, in OAuth terminology, is the web application that accesses a resource on the user's behalf. Each client must be registered with the authorization server, and on registration it receives a client_id and a client_secret. In effect, these are the login and password by which the OAuth provider identifies the client application. It is important that this login and password pair serves only to identify the client and has nothing to do with the user's own username and password. Thus the user never hands the password to third parties: the user exchanges this data only with the authorization server, which is as safe as logging in to the mailbox itself.

How it works

So, a user (the resource owner) of a certain site (the OAuth provider) wants to give another site (the client) the right to use some of the first site's functions on the user's behalf. This procedure is called the OAuth authorization grant. To carry it out, the client asks the user to go to the OAuth provider's server and obtain an authorization code there, passing along certain parameters that are discussed below. Technically, this is a browser redirect to a URL known in advance. When the user follows this URL, the OAuth provider asks the user to log in and to confirm that the application should really be given the requested access. If the user agrees, the OAuth provider redirects the user's browser back to the client's server and passes the authorization code along. The client then makes a special HTTP request, authenticating itself with its client_id and client_secret, to exchange the received authorization code for an access token (access_token). This request is made from the server side. The token will serve as the application's password for the OAuth provider's API.
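The two requests in this flow can be sketched in Python. This is a minimal illustration, not our production code: the endpoint, client credentials, and redirect URI are made-up placeholders, while the parameter names are the ones defined in RFC 6749 (sections 4.1.1 and 4.1.3). The client secret is sent in the request body here, which the RFC permits; many providers also accept HTTP Basic authentication.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Hypothetical endpoint and credentials, for illustration only.
AUTHORIZE_URL = "https://oauth.example.com/authorize"
CLIENT_ID = "my-client-id"
CLIENT_SECRET = "my-client-secret"
REDIRECT_URI = "https://client.example.com/callback"


def build_authorization_url(scope: str, state: str) -> str:
    """URL the client redirects the user's browser to in order to request access."""
    params = {
        "response_type": "code",  # ask for an authorization code
        "client_id": CLIENT_ID,
        "redirect_uri": REDIRECT_URI,
        "scope": scope,
        "state": state,
    }
    return AUTHORIZE_URL + "?" + urlencode(params)


def build_token_request_body(code: str) -> str:
    """Form-encoded body of the server-side POST that exchanges the code for a token."""
    params = {
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": REDIRECT_URI,
        "client_id": CLIENT_ID,
        "client_secret": CLIENT_SECRET,
    }
    return urlencode(params)
```

The first function produces the address for the browser redirect; the body from the second is what the client's server would POST to the provider's token endpoint after the user returns with the code.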

In OAuth, a password travels only between the user who owns it and the one server that can verify it. The user enters the password only on the OAuth provider's server. The client application sends its client_secret only to the OAuth provider. At the same time, the provider can make sure that this particular user granted exactly this level of access to this particular application. The application gets the access it needs to work but never learns the user's password. The user can be sure that the password is known only to them, since they never give it to any third party.

One of the parameters passed at the authorization grant stage is scope. It determines exactly which rights the application wants to receive. Its value is a string of space-separated values that the OAuth provider understands. Notably, the access_token will allow the client application to perform only the actions listed in the scope parameter. The OAuth provider shows the same list of permissions to the user before the user agrees to grant these rights to the application.
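As a small illustration of how a provider might enforce this, here is a sketch of a scope check in Python. The function names are ours, and any scope values passed in (such as "mail.read") are invented for the example; real scope names are defined by each provider.

```python
def parse_scope(scope: str) -> frozenset:
    """Split a space-separated scope string into a set of individual permissions."""
    return frozenset(scope.split())


def is_permitted(granted_scope: str, required: str) -> bool:
    """True if every permission in `required` is present in `granted_scope`."""
    return parse_scope(required) <= parse_scope(granted_scope)
```

With a token granted for "mail.read", a request that also needs a hypothetical "cloud.delete" permission would be refused.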

Another interesting parameter of the authorization grant stage is called state, and it helps avoid a non-obvious security problem. When redirecting the user to the OAuth provider's site, the application generates a random token (a CSRF token) and sends it in the state parameter. The OAuth provider does nothing with it except return it along with the authorization code. The application compares the received state with the one it sent and aborts the authorization grant stage if they do not match. Without this check, an attacker could authorize our application to access the attacker's own mailbox and then slip the resulting authorization code to our application in a victim's session.

Suppose the application uses a linked external account, the external mailbox, to sign users in. In that case the attacker, by logging in to the attacker's own external account, gains access to the victim's account in our application. We therefore recommend that everyone who implements OAuth use state, even though the parameter is optional.
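A minimal sketch of the state check in Python, using only the standard library. The function names are illustrative, and where the expected value is kept between the redirect and the callback (for example, in the user's server-side session) is left as an assumption.

```python
import hmac
import secrets
from typing import Optional


def new_state() -> str:
    """Generate an unpredictable state value to store in the user's session."""
    return secrets.token_urlsafe(32)


def state_matches(expected: Optional[str], received: Optional[str]) -> bool:
    """Compare the state returned by the provider with the one we sent."""
    if not expected or not received:
        return False  # a missing state on either side aborts the grant
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), received.encode())
```

The client would call new_state() before the redirect and refuse to exchange the authorization code unless state_matches() returns True on the callback.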

In some cases the OAuth provider issues a refresh_token to the client along with the access_token. This token makes it possible to obtain a new access_token, or even several. In the simplest case, the user grants the application permission for one-time use. Say your application wants to add an event to the user's calendar. Each time this happens, the user is asked whether to allow the application to perform that action. If the user agrees, an access_token is issued for a short time, for example an hour. If your application tries to add another event tomorrow, the user will be asked for access again. The App Store on Apple devices works in a similar way: to install an application you must enter a password, but for the next 15 minutes you can install other applications without it. If you try to install another application more than 15 minutes later, you will have to enter the password again.

In other cases, the user wants to give the application the right to act on their behalf permanently. Mail collectors are a prime example. Whether the user is online or has gone hiking in the Altai Mountains for a month, the collector must keep collecting mail from one or several mailboxes. This is exactly the situation in which a refresh_token is needed. The client application can request so-called offline access and receive a refresh_token in the response, and with it the ability to authorize with the OAuth provider without the user's involvement, obtaining new access_tokens again and again.
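The refresh step can be sketched as follows. The request parameters are those of RFC 6749, section 6; the function names and the five-minute safety margin are assumptions made for the example, not values from our services.

```python
from urllib.parse import urlencode


def build_refresh_request_body(refresh_token: str, client_id: str, client_secret: str) -> str:
    """Form-encoded body of the POST that trades a refresh_token for a new access_token."""
    return urlencode({
        "grant_type": "refresh_token",
        "refresh_token": refresh_token,
        "client_id": client_id,
        "client_secret": client_secret,
    })


def needs_refresh(expires_at: float, now: float, margin: float = 300.0) -> bool:
    """True if the access_token expires within `margin` seconds (hypothetical default)."""
    return expires_at - now <= margin
```

No user interaction is involved: the client only needs the stored refresh_token and its own credentials.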

How we do it: client

We recently enabled OAuth support in our mail collectors. We no longer make the user enter the mailbox password, and even when collecting mail from a Mail.Ru mailbox, the collector acts as an OAuth client towards the mail server. We support OAuth for the services that allow working over this protocol, namely Google and Microsoft. To store tokens, we wrote an internal service called Fluor. Besides storing the token database, its job is to issue tokens to collectors and other internal consumers on request with minimal delay. Exchanging the user's consent for a token from the external service is handled by a separate daemon responsible for authorization. It guides the user through granting the rights the application needs (the authorization grant stage) and stores the received tokens in Fluor.

For services that support refresh_token and limit the lifetime of access_token, tokens in the database must be refreshed in time. We also have to stay within the OAuth providers' limits on the number of requests per day from a single application or a single IP address. This is the job of the fluor-refresh daemon. The Fluor family of daemons is written in Perl. They process requests asynchronously using the AnyEvent library. Fluor talks to the OAuth daemon and the collectors over our own protocol, IPROTO. We also have our own HTTP server in Perl, but because HTTP requires parsing headers, request processing over IPROTO is five times faster. The most CPU-intensive parts have been moved from Perl to XS, which lets you write a piece of code in C and return its results to Perl.

Several instances of Fluor and fluor-refresh can run at the same time. Token storage and communication between the daemons go through Tarantool (also developed at Mail.Ru; it is an open source project that has been written about on Habr more than once). Tarantool is a NoSQL database that keeps all data in the server's memory but can also write it to disk. It has replication and lets you write fairly complex procedures in Lua, which helps a great deal in organizing our rather specific token refresh queue.

The queue is specific in two ways. First, it is endless: tokens must be refreshed all the time. Second, each task in the queue must be completed before a certain time, its deadline. In addition, a single task must never be taken by two refreshers at once, otherwise work is wasted and the request limits of third-party services may be exceeded. We implemented all of this logic in Lua.

Fluor-refresh simply calls a function in Tarantool and receives a list of tokens to refresh. For each of these tasks it obtains a fresh access_token and saves it to Tarantool through another Lua function. The Lua functions guarantee that refreshing one token is never assigned to several refreshers and that the tokens selected are always those that expire within a given interval. This saves us several database queries that we would have to make if we used, say, memcached instead of Tarantool.
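To make the queue semantics concrete, here is an in-memory Python sketch of the two operations described above. It is not our Lua code and not a Tarantool API: the class and method names are invented, a single Python process stands in for the database, and the lease timeout (after which an unfinished task may be taken again) is our own assumption about how a crashed refresher could be handled.

```python
from dataclasses import dataclass


@dataclass
class TokenTask:
    token_id: str
    expires_at: float           # deadline: the token must be refreshed before this time
    claimed_until: float = 0.0  # lease held by a refresher; 0.0 means the task is free


class RefreshQueue:
    def __init__(self) -> None:
        self._tasks: dict = {}

    def put(self, token_id: str, expires_at: float) -> None:
        self._tasks[token_id] = TokenTask(token_id, expires_at)

    def take(self, now: float, horizon: float, limit: int, lease: float) -> list:
        """Claim up to `limit` unclaimed tokens that expire within `horizon` seconds."""
        due = [
            t for t in self._tasks.values()
            if t.expires_at <= now + horizon and t.claimed_until <= now
        ]
        due.sort(key=lambda t: t.expires_at)  # most urgent deadline first
        taken = due[:limit]
        for t in taken:
            t.claimed_until = now + lease  # no other refresher gets it during the lease
        return [t.token_id for t in taken]

    def complete(self, token_id: str, new_expires_at: float) -> None:
        """Record the new deadline and release the claim; the task stays in the queue."""
        task = self._tasks[token_id]
        task.expires_at = new_expires_at
        task.claimed_until = 0.0
```

Because a completed task is rescheduled rather than removed, the queue never empties, which matches the "endless" property described above. In the real system, selecting and claiming happen in one call inside the database, so a refresher needs only a single round trip.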

If a token for a mailbox still was not refreshed in time and has expired, the collector can ask Fluor to obtain a new access_token immediately, bypassing the queue. It also happens that a user revokes the application's access on the OAuth provider's side. OAuth gives applications no mechanism for being notified of this. We learn about the problem only when the refresh_token stops working. In that case the token has to be deleted, and the collector goes into the extra_auth state, which means that access must be requested from the user again.

Fluor currently stores 4.8 million tokens for various services, which take up 7 GB of memory. About 100 million token refreshes happen per day. On top of that, Fluor handles 125 million requests from collectors per day. Physically, a single server can cope with this load, leaving aside redundancy in case of failures.
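For a sense of scale, these figures can be turned into per-token and per-second averages. The calculation below assumes decimal gigabytes and an even load across the day; actual peaks are not given here and will be higher than the averages.

```python
SECONDS_PER_DAY = 24 * 60 * 60

tokens = 4.8e6
memory_bytes = 7e9            # 7 GB, assuming decimal gigabytes
refreshes_per_day = 100e6
requests_per_day = 125e6

bytes_per_token = memory_bytes / tokens                      # roughly 1.5 KB per token
refreshes_per_token_per_day = refreshes_per_day / tokens     # roughly 21 per token
refreshes_per_second = refreshes_per_day / SECONDS_PER_DAY   # roughly 1,160 per second
requests_per_second = requests_per_day / SECONDS_PER_DAY     # roughly 1,450 per second
```

An average of about 21 refreshes per token per day is consistent with access tokens that live for around an hour.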

How we do it: server

In the simplest case, an OAuth server should be able to:

Verify authorization.
Generate access and refresh tokens, as well as authorization codes.
Check, store, disable, and delete tokens.
Issue a new access_token in exchange for a refresh_token, and a refresh_token and access_token in exchange for an authorization code.

Verification of authorization is, as a rule, performed by a separate service. It authenticates the user by a login and password pair or by more complex combinations (for example, with two-factor authentication). If you are writing an OAuth server, you already have such a service.

Token generation. General advice: tokens should be as random as possible, and the randomness must be cryptographically strong.

Token management. Each token has a lifetime and is bound to a user. A simple database table is enough to store tokens together with their user binding and lifetime. There is not much data, but high speed is required, so a database that keeps data in RAM is desirable. You will also need a daemon that walks the database and deletes expired tokens.

Issuing new access tokens by refresh_token is a rather routine procedure, so we will not dwell on it. We use Tarantool for all of this. It keeps data in memory and ensures its integrity. Most importantly, it encapsulates the logic for deleting expired tokens, which can be implemented as an internal Lua procedure. Another interesting point is deleting tokens when the user changes their password. For that, you need to fetch all the tokens bound to the user. This requires a secondary index on the user, which Tarantool, unlike many other databases, supports.
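The token generation and management points above can be sketched together as a small in-memory store in Python. This is a stand-in for illustration only: it is not Tarantool and not our server code, the class and method names are invented, and the 32-byte token length is an assumption. The second dictionary plays the role of the secondary index on the user.

```python
import secrets
import time
from typing import Optional


class TokenStore:
    """In-memory stand-in for a token table with a secondary index on the user."""

    def __init__(self) -> None:
        self._tokens: dict = {}   # primary index: token -> (user, expires_at)
        self._by_user: dict = {}  # secondary index: user -> set of tokens

    def issue(self, user: str, ttl: float, now: Optional[float] = None) -> str:
        now = time.time() if now is None else now
        token = secrets.token_urlsafe(32)  # 32 random bytes from the OS CSPRNG
        self._tokens[token] = (user, now + ttl)
        self._by_user.setdefault(user, set()).add(token)
        return token

    def check(self, token: str, now: Optional[float] = None) -> Optional[str]:
        """Return the owning user, or None if the token is unknown or expired."""
        now = time.time() if now is None else now
        entry = self._tokens.get(token)
        if entry is None or entry[1] <= now:
            return None
        return entry[0]

    def _delete(self, token: str) -> None:
        user, _ = self._tokens.pop(token)
        tokens = self._by_user[user]
        tokens.discard(token)
        if not tokens:
            del self._by_user[user]

    def purge_expired(self, now: Optional[float] = None) -> int:
        """What the cleanup daemon would do: drop every expired token."""
        now = time.time() if now is None else now
        expired = [t for t, (_, exp) in self._tokens.items() if exp <= now]
        for t in expired:
            self._delete(t)
        return len(expired)

    def revoke_user(self, user: str) -> int:
        """Drop all tokens bound to a user, for example after a password change."""
        tokens = list(self._by_user.get(user, ()))
        for t in tokens:
            self._delete(t)
        return len(tokens)
```

Without the user-to-tokens index, revoke_user would have to scan every stored token, which is the cost a secondary index avoids.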

System configuration. Three things matter here: speed, hardware utilization, and fault tolerance. Speed comes from Tarantool, which works only with RAM and offers a secondary index. For hardware utilization, we shard Tarantool, which lets us make full use of the server's processor cores. Fault tolerance is achieved through replication across different data centers. Replication lets us restart both individual daemons and entire machines.
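One common way to spread data across shards is to hash a key and take the remainder. The sketch below illustrates that general idea only: using the user as the shard key, CRC32 as the hash, and the function name are all assumptions, not a description of our actual setup.

```python
import zlib


def shard_for(user: str, shard_count: int) -> int:
    """Map a user to one of `shard_count` shards, deterministically."""
    return zlib.crc32(user.encode("utf-8")) % shard_count
```

Keying on the user keeps all of one user's tokens on the same shard, so revoking them after a password change touches a single instance.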

So, today we announced that you can connect to the Mail.Ru mail service over IMAP using OAuth authorization. We encourage developers of mail clients for desktop and mobile devices to implement it when collecting mail from our mailboxes.

Connection documentation is available on our website. For our part, we already collect mail in this more secure way from the services that offer it, and we would like their number to grow. We hope that OAuth 2.0 will soon become as much of a gold standard for mail services as HTTPS.

Source: https://habr.com/ru/post/264049/
