The task was to build a visitor accounting system for a couple of dozen sites. The sites belong to gaming associations (clans) within a single gaming community. In other words, the customer needed a pivot table that shows at a glance which site is more popular. The customer agreed that unique visitors would be counted.
Since the customer had no firm idea of what was needed or how to do it, I had a free hand (there was no technical specification either). I wrote an accounting system (PHP, MySQL). Unique hosts were identified by IP, cookies, and DOM storage entries. In effect it was an experiment, and it was never completed. The new version of the accounting system was to rely on the counters of a ready-made service such as Yandex.Metrica, Google Analytics, LiveInternet, and the like. I chose Yandex because it has an API and sensible documentation. I found a usage example on Habr. To work with the Metrica API you need an OAuth token. I will not describe how to obtain one; it is all in the documentation.
A bit of specifics.
First we get a list of counters:
api-metrika.yandex.ru/counters.json?oauth_token=000000000000000000000000000000&pretty=1
The token, of course, is unique to each developer.
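As a rough illustration of the first request, here is a minimal Python sketch that builds the counters URL and pulls the ids out of a decoded response. The `https://` scheme, the helper names, and the response shape (a top-level `counters` list whose items carry an `id`) are assumptions made for the example; check them against the API reference before relying on them.

```python
from urllib.parse import urlencode

# The article gives only the host; the scheme here is an assumption.
API_HOST = "https://api-metrika.yandex.ru"


def counters_url(token: str) -> str:
    """Build the counters-list request URL shown above."""
    query = urlencode({"oauth_token": token, "pretty": 1})
    return f"{API_HOST}/counters.json?{query}"


def counter_ids(payload: dict) -> list[int]:
    """Extract counter ids from a decoded JSON response.

    Assumes a top-level "counters" list of objects with an "id" field.
    """
    return [int(counter["id"]) for counter in payload.get("counters", [])]


# Hypothetical response fragment, for illustration only.
sample = {"counters": [{"id": "12345678", "site": "clan-one.example"}]}

print(counters_url("000000000000000000000000000000"))
print(counter_ids(sample))
```

In a real run the payload would come from fetching the URL and decoding the JSON body; that step is left out so the sketch works without a token.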
Next, for each counter from the resulting list, we get the number of visitors for the reporting period, say, January 2013:
api-metrika.yandex.ru/stat/traffic/summary.json?id=12345678&oauth_token=000000000000000000000000000000&pretty=1&date1=20130101&date2=20130131&group=month&per_page=1
Here id=12345678 is the counter id.
The obtained data is shown in the pivot table.
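The per-counter step might be sketched like this in Python: compute the first and last day of the reporting month, build the summary URL for one counter, and rank sites by the visitor figure once it has been collected. The function names, the `https://` scheme, and the site names in the example are hypothetical; the query parameters are the ones from the request above.

```python
import calendar
from urllib.parse import urlencode

# The article gives only the host; the scheme here is an assumption.
API_HOST = "https://api-metrika.yandex.ru"


def month_range(year: int, month: int) -> tuple[str, str]:
    """Return (date1, date2) in YYYYMMDD form for a whole calendar month."""
    last_day = calendar.monthrange(year, month)[1]
    return f"{year:04d}{month:02d}01", f"{year:04d}{month:02d}{last_day:02d}"


def summary_url(counter_id: int, token: str, year: int, month: int) -> str:
    """Build the traffic summary request for one counter and one month."""
    date1, date2 = month_range(year, month)
    query = urlencode({
        "id": counter_id,
        "oauth_token": token,
        "pretty": 1,
        "date1": date1,
        "date2": date2,
        "group": "month",
        "per_page": 1,
    })
    return f"{API_HOST}/stat/traffic/summary.json?{query}"


def rank_sites(visitors_by_site: dict) -> list:
    """Sort (site, visitors) pairs with the most popular site first."""
    return sorted(visitors_by_site.items(), key=lambda item: item[1], reverse=True)


print(summary_url(12345678, "000000000000000000000000000000", 2013, 1))
# Made-up figures, only to show the ordering of the pivot table rows.
print(rank_sites({"clan-one.example": 310, "clan-two.example": 845}))
```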
It would seem the task was done, but no such luck! Some bad actors turned up who wanted to skew the figures. These unscrupulous comrades go to a traffic exchange and buy visits to one site in the group, or to several at once. Such traffic is of no benefit to the site: in more than 99% of cases, the people working on these exchanges do not stay on the site. Another point is that the traffic exchange settings can include an option such as “do not send the referrer”. In that case the visits look as if the visitor arrived via a browser bookmark. I came up with three solutions to the problem, but since none of them has gone into production yet, I cannot pick the best one.
- Count only visits that lasted more than 30 seconds.
- Count only visits in which two or more pages were viewed.
- Create a goal for each Metrica counter (a visit of two pages) and count how many times the goal was reached for a given visit duration, for example more than 30 seconds. This is a hybrid of the first two options.
For the first two options, we obtain the data as follows:
api-metrika.yandex.ru/stat/traffic/deepness.json?id=12345678&oauth_token=000000000000000000000000000000&pretty=1&date1=20130101&date2=20130131&group=month&per_page=1
Then we sum up the visits in the entries whose name matches the required depth (the name takes page-count values, ... 3 ... 14, 15+, according to the number of pages viewed). Or we sum up the visits in the entries whose name matches the required visit time (the name takes the values "0 - 10 sec.", "11 - 30 sec." and so on up to "10 - 30 min.", "More than 30 min.").
The third option is to count goal visits; for that you need to know the goal id (goal_id) for each counter:
api-metrika.yandex.ru/counter/12345678/goals.json?pretty=1&oauth_token=000000000000000000000000000000
And then, as in the first option, we add up the necessary data (for example, visits longer than 30 seconds).
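The summation used in these options could look roughly like the Python sketch below. It assumes the response has already been decoded into a list of rows, each with a `name` bucket label and a `visits` count; that shape, the helper name, and the sample numbers are assumptions for illustration, and the bucket labels are written as they appear in this article, so the real API may spell them differently.

```python
def sum_visits(rows, keep):
    """Add up visits over the rows whose bucket name passes the `keep` test."""
    return sum(int(row["visits"]) for row in rows if keep(row["name"]))


# Hypothetical visit-time rows; the numbers are made up.
time_rows = [
    {"name": "0 - 10 sec.", "visits": 120},
    {"name": "11 - 30 sec.", "visits": 40},
    {"name": "10 - 30 min.", "visits": 15},
    {"name": "More than 30 min.", "visits": 5},
]

# Buckets that fall at or under the 30-second threshold.
SHORT_BUCKETS = {"0 - 10 sec.", "11 - 30 sec."}

# Option 1: keep only visits longer than 30 seconds.
long_visits = sum_visits(time_rows, lambda name: name not in SHORT_BUCKETS)
print(long_visits)  # 20
```

The same helper works for the page-depth option by passing the depth rows and a test that rejects the one-page bucket.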
I do not like any of these approaches. After all, I want to count visitors, not visits. There is also an approach that requires adding a couple of lines to the JavaScript code of the Metrica counters. Changing the code affects how visits are recorded: the visitor is counted not immediately but after a specified delay. That is, if the user leaves the site before the timer fires, the visit is not counted (the defer property and the hit method). This option is not very convenient either: the counters have to be changed on a pile of sites, and then you have to make sure they are not changed back (each site has its own owner). The Metrica team promised to build a tool for filtering out "bots", but when that will happen is unknown.
As a result, we have an almost working system. I am leaning toward the third, combined method of counting visits. Let's see what the customer says...
Perhaps some readers of this article have already fought this kind of "cheating" and won? It would be very interesting to hear how.
Metrica help
http://help.yandex.ru/metrika
Metrica API help
http://api.yandex.ru/metrika