
Every day several million drivers (static .exe and .zip files) are downloaded from our Download servers. To analyze user behavior, we needed to measure when, how much, how often, and even by whom the drivers are downloaded.
The most obvious solution would be to use a tool like AWStats, GoAccess, the ELK stack or Splunk, or, as a last resort, to collect and parse the raw Nginx logs.
But each option has its drawbacks: an inconvenient interface, too little data, a complex setup and, most importantly, no way to build user segments in the reports.
So we decided to make Nginx itself send an event to Google Analytics immediately after a file is downloaded. We were also able to pass a unique ClientID to GA.
As a result, we got analytics for static files, which previously could not be tied to a GA counter at all.
Below the cut: a ready-made config and example reports from our system.
In the last post, we talked about how to send events from the DriverPack Online application using the Analytics Measurement Protocol.
Today we will show how we track downloads of static files on our Download servers.
Information on downloads arrives in "real time".
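To make the idea concrete, here is a minimal sketch of the kind of Measurement Protocol event hit a server has to assemble for each download. It is written in JavaScript purely for illustration and is not the Nginx config itself. The parameter names (v, tid, cid, t, ec, ea, el, uip, ua, cd1..cd3) come from the Universal Analytics Measurement Protocol; the event category "download", the use of the HTTP status as the event action, and all sample values are assumptions.

```javascript
// Sketch: build the URL of a Measurement Protocol event hit for one download.
// Category/action naming and the sample request below are illustrative assumptions.
function buildDownloadHit(req) {
  const params = new URLSearchParams({
    v: "1",                          // protocol version
    tid: req.trackingId,             // counter ID, e.g. UA-XXXXXX-1
    cid: req.clientId,               // ClientID read from the cookie
    t: "event",                      // hit type
    ec: "download",                  // event category (assumed name)
    ea: String(req.status),          // event action: HTTP status (assumed mapping)
    el: req.uri,                     // event label: the requested file
    uip: req.remoteAddr,             // IP override, so GA sees the real user IP
    ua: req.userAgent,               // User-Agent override
    cd1: req.clientId,               // dimension1: ClientID
    cd2: String(req.requestTime),    // dimension2: request_time
    cd3: String(req.bodyBytesSent),  // dimension3: body_bytes_sent
  });
  return "https://www.google-analytics.com/collect?" + params.toString();
}

// Hypothetical request data, shaped like the values Nginx has after a download.
const hitUrl = buildDownloadHit({
  trackingId: "UA-XXXXXX-1",
  clientId: "1111111111.2222222222",
  status: 200,
  uri: "/drivers/pack.zip",
  remoteAddr: "203.0.113.7",
  userAgent: "Wget/1.20",
  requestTime: "1.250",
  bodyBytesSent: 5000000,
});
console.log(hitUrl);
```

URLSearchParams percent-encodes every value, which is exactly the step that plain Nginx variables cannot perform on their own.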

Works out of the box
Now we can see how many real downloads of our product come from the servers. Moreover, unique downloads are counted far more accurately than they would be by IP address, since each user is associated with a unique identifier, the ClientID.
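A small sketch of why counting by ClientID differs from counting by IP address. The records are made up: two different users share one office IP, and a third user downloads twice.

```javascript
// Hypothetical download records (IP addresses and ClientIDs are invented).
const downloads = [
  { ip: "203.0.113.7", clientId: "1111.1" },
  { ip: "203.0.113.7", clientId: "2222.2" },  // a different user behind the same NAT
  { ip: "198.51.100.4", clientId: "3333.3" },
  { ip: "198.51.100.4", clientId: "3333.3" }, // the same user downloading again
];

// Count distinct values of one field across all records.
function countUnique(records, key) {
  return new Set(records.map((record) => record[key])).size;
}

const uniqueByIp = countUnique(downloads, "ip");
const uniqueByClientId = countUnique(downloads, "clientId");
console.log({ uniqueByIp, uniqueByClientId });
```

With this sample, the IP-based count merges the two office users into one, while the ClientID-based count keeps them apart.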

404 and 500 errors are tracked through events.

Because we pass the user's real IP address with each event, we can use the user's location in the analysis.

Reports “by behavior” can be trusted, since the user's real ClientID is forwarded, which lets us evaluate:
- the number of new and returning users,
- how often a user visits and how much time has passed since the last visit,
- user engagement,
- which keywords originally brought the user to our site.
Nginx itself passes on the correct User-Agent, which makes it possible to build reports on browsers and operating systems. In the report you can find wget (which our DriverPack Online uses to download drivers), real browsers, and robots and parsers of all kinds.

For some, information on downloads from mobile devices will be very valuable.

Unfortunately, we still cannot pass the referrer to GA, because Nginx does not support urlencode().
Therefore, channel reports will not work (details at the end of the post).
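The following sketch shows why a missing urlencode() matters. It uses the Measurement Protocol's dr (document referrer) parameter; the referrer URL is a made-up example. Without encoding, the "&" inside the referrer splits it into separate parameters.

```javascript
// A hypothetical referrer that itself contains a query string.
const referrer = "https://example.com/search?q=driver&page=2";

// What happens when the value is pasted into the hit as-is (no urlencode).
const rawQuery = "v=1&t=event&dr=" + referrer;
// What the hit should look like with the value percent-encoded.
const encodedQuery = "v=1&t=event&dr=" + encodeURIComponent(referrer);

const rawParams = new URLSearchParams(rawQuery);
const encodedParams = new URLSearchParams(encodedQuery);

console.log(rawParams.get("dr"));     // the referrer, cut off at the first "&"
console.log(rawParams.get("page"));   // a stray parameter leaked out of the referrer
console.log(encodedParams.get("dr")); // the full referrer, intact
```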

How to set up the same thing: instructions
1. Create a GA counter that all Download servers will report to, or use the ID of an existing counter.
2. In the counter's “Custom Dimensions” settings, add these custom dimensions:
- dimension1. Name “ClientID”, scope “User”;
- dimension2. Name “request_time”, scope “Hit”;
- dimension3. Name “body_bytes_sent”, scope “Hit”.
This will help us calculate the download speed and even the percentage of incomplete downloads.
3. Create a config file named "google-analytics" in the directory "/etc/nginx/".
4. Bring the configuration file “/etc/nginx/conf.d/default.config” to the following form (using our example):
server {
    include google-analytics;

    listen 80;
    server_name localhost;
    autoindex on;
    autoindex_exact_size off;
    access_log off;

    location / {
        root /usr/share/nginx/html;
        index index.html index.htm;
        post_action @GAlog;
    }

    error_page 404 /404.html;
    location = /404.html {
        root /usr/share/nginx/html;
        post_action @GAlog404;
    }

    error_page 500 502 503 504 /50x.html;
    location = /50x.html {
        root /usr/share/nginx/html;
        post_action @GAlog500;
    }
}
5. Reload Nginx so that it picks up the new configuration:
$ sudo service nginx reload
6. Configure the counter on the site so that the ClientID is stored in a cookie (do not forget to substitute your counter ID and the name of your main domain).
<script>
  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
  m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
  })(window,document,'script','//www.google-analytics.com/analytics.js','ga');

  ga('create', 'UA-XXXXXX-1', 'auto');
  ga(function(tracker) {
    var clientId = tracker.get('clientId'); // Get clientId from Google Analytics
    document.cookie = "_ga_cid=" + clientId + "; path=/; domain=.< >"; // Write cookie for web-server
    ga('set', 'dimension1', clientId); // Write clientId in custom dimension
  });
  ga('require', 'displayfeatures');
  ga('send', 'pageview');
</script>
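On the server side, the ClientID then has to be read back from the "_ga_cid" cookie written by the script above (Nginx exposes request cookies through its $cookie_<name> variables). The sketch below illustrates that lookup in JavaScript. The fallback value for clients that send no cookie, such as wget, is an assumption; the post does not say what is used in that case.

```javascript
// Sketch: extract the ClientID from a raw Cookie header.
// `fallback` is a hypothetical placeholder for requests without the cookie.
function clientIdFromCookie(cookieHeader, fallback) {
  for (const part of (cookieHeader || "").split(";")) {
    const eq = part.indexOf("=");
    if (eq === -1) continue;                 // skip malformed fragments
    const name = part.slice(0, eq).trim();
    if (name === "_ga_cid") {
      return part.slice(eq + 1).trim();      // the value stored by the site script
    }
  }
  return fallback;                           // no cookie: browserless client, first visit, etc.
}

console.log(clientIdFromCookie("_ga=GA1.2.1111.2222; _ga_cid=1111.2222", "anonymous"));
console.log(clientIdFromCookie(undefined, "anonymous"));
```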
Finally, about accuracy
The statistics are up to 99% accurate! We took several files and compared the GA data with the log data.

The comparison shows that GA counts unique downloads even more accurately than we can by hand.
Disadvantages
The script works fine and fully meets our requirements, but there is room for several improvements:
- The script adds a little load to the server, but for us this is not critical at all.
- Nginx does not support urlencode(), so links like example.com/?SomeOptions will break. One way to solve this problem is to use a Lua script.
- The referrer is not passed to GA (this also requires urlencode()).
- In Nginx 1.8, the $content_length variable does not work, so we cannot send the file size to GA. This parameter would make it possible to build reports on the percentage of incompletely downloaded files.
- Service information from Nginx, such as the number of connections, could also be sent.
- The download time could be sent directly to Google Analytics in the &plt parameter, but Nginx reports the time in seconds, and GA does not accept this format (it expects milliseconds). Therefore, this data has to be sent in dimension2.
- The script uses the undocumented post_action directive. There is a risk that it will be removed in newer versions.
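As a rough illustration of what can be derived from the values stored in dimension2 and dimension3, the sketch below converts a request_time in seconds to the integer milliseconds that plt expects, computes the average speed, and flags an incomplete download. The fileSize argument is hypothetical: it assumes the expected size is known from some other source, which the setup above does not provide.

```javascript
// Sketch: derive metrics from request_time (seconds, as a string) and body_bytes_sent.
// `fileSize` is an assumed input; the post notes it is not available from Nginx 1.8.
function downloadStats(requestTime, bodyBytesSent, fileSize) {
  const seconds = parseFloat(requestTime);
  return {
    pltMs: Math.round(seconds * 1000),                        // milliseconds, the unit plt expects
    bytesPerSecond: seconds > 0 ? bodyBytesSent / seconds : null,
    complete: bodyBytesSent >= fileSize,                      // fewer bytes than the file: aborted
  };
}

const finished = downloadStats("1.250", 5000000, 5000000);
const aborted = downloadStats("0.500", 1000, 4000);
console.log(finished, aborted);
```

Note that responses to partial (Range) requests would also send fewer bytes than the file size, so a real report would need to treat them separately.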
Friends, please tell us in the comments which problems this method would help you solve.
Real-world examples of use will be very helpful in further improving our product.
Thanks in advance!