How safe is it to use R packages for working with the APIs of advertising systems?

Lately I have often been asked how safe it is to use ready-made extensions, i.e. packages written for the R language, and whether there is a chance that an advertising account will fall into the wrong hands.

In this article I will explain in detail how the authorization mechanism works inside most packages and advertising service APIs, and how to use the packages described here as safely as possible.

The material is technically not the easiest, so to keep the text from being as dry as ordinary reference documentation I will take on the role of a tour guide and walk you through it step by step.

Our tour bus has arrived, so take a comfortable seat and let's go over today's route.


How does authorization work in most modern advertising services?

The meeting point for our tour group is the OAuth protocol.

Almost every service whose API I have worked with authorizes via OAuth 2.0. It has been covered on Habr more than once, so anyone who wants to wander through that jungle can do so here and here.

In short, OAuth lets an application to which you have granted permission (in our case the application is an R package) perform actions on your behalf without you having to hand over the username and password of your advertising account, which is again a matter of security.

Instead of a login and password, the OAuth protocol uses a token: a generated string of letters and numbers that stores the following information in encrypted form:

On whose behalf the application performs the request
Whether the user really allowed this application to access their data
Whether the user has the necessary rights to work with the advertising materials being requested
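
To make the role of the token concrete, here is a minimal sketch in R of how a token typically takes the place of a login and password in an API request. It assumes the common Bearer scheme from RFC 6750; the token value and the make_auth_header helper are invented for illustration, and a particular advertising API may expect the token in a different place.

```r
# Hypothetical token value, for illustration only.
# Real tokens should never be written into a script.
token <- "AQAAAAAexampletoken"

# Build an Authorization header in the Bearer scheme (RFC 6750).
# make_auth_header is an illustrative helper, not part of any package.
make_auth_header <- function(token) {
  c(Authorization = paste("Bearer", token))
}

headers <- make_auth_header(token)
print(headers)
```

Note that the request carries only the token: the account login and password never appear in it.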

To authorize and work with an API, you usually need to register an application with it. The application then has to be approved by the API support team of the advertising system in question: the author describes in detail how and why the API will be used, the request is checked and moderated, and only if the platform's support team has no security concerns does the package author get access to the API. After that, the package lets you authenticate through the author's registered application, using the ID and secret issued to that application.
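
As a rough illustration of where the application ID comes into play, the sketch below builds the kind of URL a package opens in the browser at the start of authorization. The parameter names (response_type, client_id, state) come from the OAuth 2.0 specification (RFC 6749); the endpoint, the client ID and the build_authorize_url helper are hypothetical and do not describe any specific package.

```r
# Build an OAuth 2.0 authorization URL (parameter names follow RFC 6749).
# The endpoint and client_id used below are hypothetical placeholders.
build_authorize_url <- function(endpoint, client_id,
                                response_type = "code", state = NULL) {
  params <- list(response_type = response_type,
                 client_id = client_id,
                 state = state)
  # Drop optional parameters that were not supplied.
  params <- params[!vapply(params, is.null, logical(1))]
  encoded <- vapply(
    params,
    function(x) utils::URLencode(as.character(x), reserved = TRUE),
    character(1)
  )
  paste0(endpoint, "?", paste0(names(params), "=", encoded, collapse = "&"))
}

url <- build_authorize_url("https://oauth.example.com/authorize",
                           client_id = "my-app-id")
print(url)
# In an interactive session a package would then open it in the browser:
# utils::browseURL(url)
```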

Where did the security question come from

Moving on, let's figure out where the safety concerns about using packages come from.

In general, concern about the security of advertising accounts is more than justified: advertising accounts hold money, often quite a lot of it, so their security is a much more serious matter than the security of, say, an ordinary social network profile.

In most cases, during authorization R sends the package user to the browser, first to confirm access to the account; at this stage you are on a page of the service whose API you are going to work with. After confirmation, the user is redirected to a page that generates either the token itself or an authorization confirmation code, which must then be entered into the R console.

Most users worry because the sites that generate the token or the confirmation code are third-party sites that have nothing to do with the advertising service itself. These sites naturally carry a Google Analytics or Yandex.Metrica counter, and many people believe that the site owner, who in most cases is also the package author, can use the site to get hold of their tokens and, through them, gain control over their advertising materials.

What happens if a token is intercepted

Let's talk about what a token allows an intruder to do if it falls into their hands.

A token should be stored as carefully as any other data needed to access the account. If it falls into the wrong hands, whoever holds it can manage your advertising materials: delete them or change them, for example by changing the text of an ad and the link it leads to.

The good news is that, as I wrote above, OAuth lets you hand over management of your advertising materials without providing the login and password of your account, so even someone who has your token cannot use it to hijack the account. No API lets you request a password, let alone change one, so the account will not be taken away from you. An attacker could, however, easily advertise their own website through your account.

What to do if someone has got hold of your token

If you have accidentally given away your token, do not panic. It is not the end of the world: in most cases there are ways to reset previously issued tokens. For example, this section of the Yandex.Direct API help describes in detail how to revoke previously issued tokens.

In most cases, regardless of which advertising system's API you work with, simply changing the account password will be enough.

How to use R packages for working with advertising system APIs most safely

Now we have reached the most interesting part of our excursion: I will tell you how to use the packages I developed as safely as possible, since I am familiar with the APIs they work with.

Note that all issued tokens are stored on the advertising platform's side, not on the side of the application through which the R package works, so even the person who registered the application with the platform's API has no access to the token itself.

ryandexdirect and rym - Packages for working with the Yandex.Direct and Yandex.Metrica APIs

Both packages use Yandex's OAuth service; you can read more about it at this link.

The ryandexdirect package has two functions for authorization:

yadirAuth - two-step authorization
yadirGetToken - request an authorization token

With yadirAuth, which is the function I recommend when working with ryandexdirect, authorization follows the scheme described here. The only weak point is the period between the moment the confirmation code is generated and the moment it is entered into the R console.

Let me explain why. In Google Analytics, a visit to the page that generates the confirmation code is recorded together with the code: it comes after the '?' sign and is treated as a GET parameter, which the Google Analytics counter captures. However, the confirmation code stops being valid as soon as it is used, i.e. immediately after you enter it into the R console. The maximum lifetime of such a code is 10 minutes.

The second function, yadirGetToken, authorizes according to a different scheme, described here. No confirmation code is generated: after you give the package permission to access your data, you land directly on the token generation page. The token is returned in the URL after the '#' sign, so it is not a GET parameter but an anchor, also called the hash or fragment. The browser does not send this part of the URL to the server, so it does not reach Google Analytics reports either, and a visit to this page appears in the reports without the token.
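
The difference between the two URL shapes can be sketched in a few lines of base R. The URLs and the split_url helper below are invented examples, not the real pages used by ryandexdirect; the point is only that everything after '#' stays in the browser, while everything after '?' is part of the address that gets requested and logged.

```r
# Split a URL into the part that is requested from the server
# and the fragment (after '#') that stays in the browser.
split_url <- function(url) {
  parts <- strsplit(url, "#", fixed = TRUE)[[1]]
  list(
    sent_to_server = parts[1],
    fragment = if (length(parts) > 1) paste(parts[-1], collapse = "#") else ""
  )
}

# Hypothetical redirect URLs for the two schemes.
code_flow  <- split_url("https://example.com/auth?code=1234567")
token_flow <- split_url(
  "https://example.com/token#access_token=abcdef&expires_in=31536000"
)

print(code_flow$sent_to_server)   # the confirmation code is part of the requested URL
print(token_flow$sent_to_server)  # the token is not
print(token_flow$fragment)        # it is visible only on the client side
```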

In this second case there are no such risks, but the downside of yadirGetToken is that it does not save the credentials to a file on your PC, so they cannot be reused between R sessions, which is not particularly convenient. You have to store the resulting token yourself and use it in scripts as a text string. Such a token lives for 1 year, after which the package cannot renew it automatically, as happens when you use the yadirAuth function.

The rym package has the rym_auth function for authorization; it is a complete analogue of yadirAuth, whose operation I have already described in detail.

rfacebookstat - Package for working with the Facebook advertising account

How authentication works in the Facebook Marketing API is described in detail here.

For authorization, rfacebookstat provides the fbGetToken function. It works just like yadirGetToken from the ryandexdirect package, i.e. everything is done through one-step authentication. There is no danger of your token being intercepted through Google Analytics reports, because a visit to the token generation page is recorded there without the token.

rvkstat - Package for working with the Vkontakte advertising account

The Vkontakte authentication process is described in the API help.
For authorization in rvkstat you can use one of two functions:

vkAuth - authorization using the Authorization code flow scheme, i.e. two-step authorization.
vkGetToken - authorization using the Implicit flow scheme, i.e. one-step authorization, with the token bound to the device.

vkAuth performs two-step authentication; in essence it is the same as the yadirAuth function described earlier, only for the Vkontakte API rather than Yandex.

What is special about the Vkontakte API is that registering your own application and getting API access is quite simple: you do not need to fill out forms describing in detail how and why you will use the API. Since you use your own application when working with rvkstat, even an intercepted confirmation code is useless: the code is tied to your application, and exchanging it for a token requires the id and secret of that application. The code alone is not enough to obtain a token.
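
Why the code alone is not enough can be seen from the shape of the request that exchanges it for a token. The sketch below builds the form body described in the OAuth 2.0 specification (RFC 6749, section 4.1.3); all values and the build_token_request helper are hypothetical, nothing is sent over the network, and the exact request used by the Vkontakte API may differ in details.

```r
# Form body for exchanging an authorization code for a token
# (generic OAuth 2.0 shape). All values below are made up.
build_token_request <- function(code, client_id, client_secret) {
  fields <- c(
    grant_type    = "authorization_code",
    code          = code,
    client_id     = client_id,
    client_secret = client_secret
  )
  encoded <- vapply(fields, utils::URLencode, character(1), reserved = TRUE)
  paste0(names(fields), "=", encoded, collapse = "&")
}

body <- build_token_request("abc123",
                            client_id = "1234567",
                            client_secret = "s3cr3t")
print(body)
```

Someone who intercepted only the code would still be missing client_secret, which never appears in the browser's address bar.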

The vkGetToken function is the fastest way to get a token. In addition, the token is bound to the device from which it was requested, so even if someone obtains it, it can only be used from the same PC. The token is returned in the URL after the '#' sign and, as I said earlier, does not end up in Google Analytics reports.

rmytarget - Package for working with the MyTarget advertising account

At the moment the MyTarget API has 3 authorization schemes; you can learn more about each in the documentation.

In rmytarget, authorization in the MyTarget API is handled by the myTarAuth function. By default it uses the Authorization Code Grant scheme, which lets you work with the MyTarget API without obtaining personal API access. In other words, I have already registered an application, MyTarget API support has approved it, and you grant it access to work with the account on your behalf.

Authorization Code Grant is a two-step authorization scheme, similar in meaning to the one implemented by yadirAuth in the ryandexdirect package.

It works as follows:

You run the function, and the browser opens.
On the MyTarget service page, you grant access to your account.
You are redirected to the package page, where the confirmation code is generated. The maximum lifetime of this code is 1 hour, but it expires as soon as you obtain a token with it.
You enter the copied confirmation code into the R console and get a token for working with the API.

In this case the confirmation code is a GET parameter and is recorded in Google Analytics reports.

But if you look closely, besides the code (the GET parameter code) the URL contains another parameter, state. This is a string, also a token, which the rmytarget package itself generates and sends to the browser as soon as the function starts. It is unique, and the authorization confirmation code is bound to it. Even if both the confirmation code and the state token are intercepted, the combination cannot be used: there is nowhere to enter the state token, and, as I wrote, it is unique, so even if there were somewhere to enter it, it could not be sent a second time. Therefore this authorization scheme is safe.
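
The idea behind the state parameter can be illustrated with a small generic sketch. This is not the rmytarget implementation: new_state and make_state_checker are hypothetical helpers, and sample() is used only for brevity (it is not a cryptographically secure generator). The sketch shows a value that is generated once, checked when the redirect comes back, and refused on any second use.

```r
# Generate a one-time state value (illustration only; sample() is not
# cryptographically secure and a real client would use a stronger source).
new_state <- function(n = 32) {
  paste(sample(c(letters, LETTERS, 0:9), n, replace = TRUE), collapse = "")
}

# Return a function that accepts the expected state exactly once.
make_state_checker <- function(expected) {
  used <- FALSE
  function(returned) {
    ok <- !used && identical(returned, expected)
    if (ok) used <<- TRUE  # a state value is accepted only once
    ok
  }
}

state <- new_state()
check <- make_state_checker(state)

check("forged-value")  # FALSE: does not match the generated value
check(state)           # TRUE: first use of the right value
check(state)           # FALSE: replay of an already used value
```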

If this option still seems suspicious to you, rmytarget and its myTarAuth function let you use the remaining two authorization schemes:

Client Credentials Grant, used to work with the data of your own account through the API.
Agency Client Credentials Grant, used to work with the data of the clients of agencies / managers.

In this case you will have to obtain access to the MyTarget API yourself. At the moment only legal entities can get it; access is granted manually, and you need to request it through the feedback form. You can find all the details here.

If you did manage to register your application with the MyTarget API, you can authenticate with either of the two schemes listed above using the myTarAuth function. To do so, pass FALSE to the code_grant argument and use the following arguments:

grant_type - the type of your account; takes the values "client_credentials" (a regular client account) or "agency_client_credentials".
agency_client_name - the client login from the agency account; used only if grant_type = "agency_client_credentials".
client_id - the ID issued to you when your access to the MyTarget API is confirmed.
client_secret - the secret issued to you together with the Client ID when your access to the MyTarget API is confirmed.

Sample code for authorization under the Client Credentials Grant scheme

 myTargetAuth <- myTarAuth(code_grant = FALSE, grant_type = "client_credentials", client_id = "XXXXXXXXXX", client_secret = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx")

Sample code for authorization using the Agency Client Credentials Grant scheme

 myTargetAuth <- myTarAuth(code_grant = FALSE, grant_type = "agency_client_credentials", client_id = "XXXXXXXXXX", client_secret = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx", agency_client_name = "xxxxxxxxx@agency_client")

With this approach, authentication takes place without any interaction with the rmytarget package site.

Conclusion

This is where our tour ends. Since more than 10,000 packages have now been published in the main repository, CRAN, and more than 80,000 on GitHub, I want to conclude with a few more words about using them safely.

First of all, check whether the package you need is on CRAN. It is the official repository for the R language, and before publication packages undergo fairly strict moderation by the repository's team. A package will not be published until it fully complies with the CRAN policy. So if a package is on CRAN, you can be reasonably confident that it is safe to use.

I also want to note that the code of all R packages is open: you can always read the code of any function before running it.

Also try to find articles about using the package. R users share information quite willingly, and for any more or less popular package you will surely find case studies. If people write about a package, it is being used, and evidently nobody has had problems with it.

Also look at who the author of the package is. There are two ways to do this:

After installing the package, execute the command utils::packageDescription("_")$Author
View the DESCRIPTION file in the package source.
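
The first of these checks can be extended slightly to show where a package was installed from as well as who wrote it. The sketch below is an assumption-laden example: package_origin is a hypothetical helper, and the RemoteUsername and RemoteRepo fields are, to my knowledge, written by installers such as remotes for GitHub installs and are simply absent otherwise. It is run on utils only because that package ships with every R installation.

```r
# Collect the fields that say who wrote a package and where it came from.
# Fields missing from the DESCRIPTION are returned as NA.
package_origin <- function(pkg) {
  desc <- utils::packageDescription(pkg)
  fields <- c("Author", "Maintainer", "Repository",
              "RemoteUsername", "RemoteRepo")
  out <- lapply(fields, function(f) {
    if (is.null(desc[[f]])) NA_character_ else desc[[f]]
  })
  stats::setNames(out, fields)
}

# "utils" is used here only so the example runs anywhere;
# substitute the name of the package you are checking.
str(package_origin("utils"))
```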

Try to find some information about the author online. If a person is even slightly public, they are unlikely to risk their reputation to get access to your advertising account and advertising materials. Reputation is often worth more than money obtained by dubious means.

If you install a package from GitHub, install it from the author's repository rather than from a fork; popular repositories usually have many forks:

ryandexdirect forks

Forks are not updated by the package author, so you will not get the most current version. Besides, the GitHub user who created the fork can change its code, and it is up to you whether to trust those changes.

On a fork's GitHub page you can see which repository it was created from.

Do not give your tokens to anybody under any circumstances, and store them the same way you store account passwords. Even if you need to show a code example, do it without the tokens.
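
One common way to keep tokens out of shared code is to read them from an environment variable and to mask them whenever they have to be printed. The sketch below assumes a made-up variable name, MY_ADS_TOKEN, and two hypothetical helpers; it is one possible approach, not something the packages above require.

```r
# Read a token from an environment variable instead of writing it
# into the script. MY_ADS_TOKEN is a made-up variable name.
get_token <- function(var = "MY_ADS_TOKEN") {
  token <- Sys.getenv(var, unset = "")
  if (!nzchar(token)) stop("Environment variable ", var, " is not set")
  token
}

# Show only the last few characters when a token must appear in output.
mask_token <- function(token, keep = 4) {
  n <- nchar(token)
  if (n <= keep) return(strrep("*", n))
  paste0(strrep("*", n - keep), substr(token, n - keep + 1, n))
}

# Set here only so the example runs; normally the variable would be
# defined outside the script, for example in ~/.Renviron.
Sys.setenv(MY_ADS_TOKEN = "abcdef1234567890")
print(mask_token(get_token()))
```

A script written this way can be published or sent to a colleague as is, because the token itself never appears in it.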

Remember that in the overwhelming majority of cases using R packages is completely safe. I hope this article has convinced you of that and shown how authorization works in the APIs of the most popular advertising platforms.

Good luck, and stay vigilant without giving in to paranoia.

Source: https://habr.com/ru/post/430888/
