The task was to track file downloads from a site (images, documents, videos, distributions, ...), because regular statistics services cannot do this without changing the URLs of the files. The statistics also had to be visible in the usual place (for example, Google Analytics or Firebase).
After going through several plugins (most of them have the words Download and Manager in their names), I found that all of them rely on manually compiling a list of files to monitor. Many of them also implement protection against unauthorized downloads, which is redundant for this task. They could be used, but with many files two problems appear:
- creating an item for each file is too inconvenient and slow;
- files can change location, and then the item has to be corrected again.
As a result, I wrote my own implementation as a WordPress plugin: you simply specify a directory (a path relative to the site), and the plugin then tracks downloads of its contents.
A link to the free plugin is here, for those who need nothing more than the information above. Below are examples of the resulting statistics and details of the technical implementation.
Where statistics are sent
For now, the two most basic destinations for aggregated statistics are supported.
Google Analytics
Statistics are published as events. The Event Category is set in the plugin settings, the Event Action contains the URI of the file, and the Event Label contains the request parameters if the corresponding setting is enabled. As a result, you can comfortably watch the download dynamics of each file in the directory in the Google Analytics console.
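The mapping described above can be sketched in a few lines. This is only an illustration in Python, not the plugin's code: the function name event_fields is hypothetical, and it assumes the events are sent with the Universal Analytics Measurement Protocol field names (ec, ea, el), which the article does not state. The property ID and client ID are placeholders.

```python
from urllib.parse import urlencode, urlsplit

def event_fields(category, request_uri, include_params=True):
    """Map one download request onto Google Analytics event fields."""
    parts = urlsplit(request_uri)
    fields = {
        "t": "event",      # hit type
        "ec": category,    # Event Category: comes from the plugin settings
        "ea": parts.path,  # Event Action: URI of the file
    }
    if include_params and parts.query:
        fields["el"] = parts.query  # Event Label: request parameters
    return fields

fields = event_fields("Downloads", "/mypath/docs/manual.pdf?ref=mail")
# A full hit also carries the protocol version, property ID and client ID.
# "UA-XXXXX-Y" and "555" are placeholders, not real identifiers.
body = urlencode({"v": "1", "tid": "UA-XXXXX-Y", "cid": "555", **fields})
print(body)
```

Keeping the URI in the action and the parameters in the label means each file gets its own row in the event reports, while the query string stays optional detail.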
Table in the WordPress database
This is mainly for debugging. It simply counts downloads, so changes over time are not visible. Table fields: IP, file URI, request parameters (if any), and counter. The data can be viewed with any SQL editor (for example, phpMyAdmin).
Each entry is assigned an ID so that entries can be deleted individually if necessary.
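To make the table layout concrete, here is a small sketch using Python's built-in sqlite3 module in place of the WordPress MySQL database. The table name dl_stat, the column names, and the choice to keep one counter per combination of IP, URI and parameters are all assumptions made for illustration; the article only lists the fields.

```python
import sqlite3

def create_table(conn):
    # One row per (ip, uri, params); "id" allows deleting entries individually.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS dl_stat (
               id      INTEGER PRIMARY KEY AUTOINCREMENT,
               ip      TEXT NOT NULL,
               uri     TEXT NOT NULL,
               params  TEXT NOT NULL DEFAULT '',
               counter INTEGER NOT NULL DEFAULT 0,
               UNIQUE (ip, uri, params)
           )"""
    )

def count_download(conn, ip, uri, params=""):
    # Increment the existing row, or create it on the first download.
    cur = conn.execute(
        "UPDATE dl_stat SET counter = counter + 1 "
        "WHERE ip = ? AND uri = ? AND params = ?",
        (ip, uri, params),
    )
    if cur.rowcount == 0:
        conn.execute(
            "INSERT INTO dl_stat (ip, uri, params, counter) VALUES (?, ?, ?, 1)",
            (ip, uri, params),
        )
    conn.commit()

def delete_entry(conn, entry_id):
    conn.execute("DELETE FROM dl_stat WHERE id = ?", (entry_id,))
    conn.commit()

conn = sqlite3.connect(":memory:")
create_table(conn)
count_download(conn, "203.0.113.7", "/mypath/a.zip")
count_download(conn, "203.0.113.7", "/mypath/a.zip")
count_download(conn, "203.0.113.7", "/mypath/a.zip", "v=2")
rows = conn.execute(
    "SELECT ip, uri, params, counter FROM dl_stat ORDER BY id"
).fetchall()
print(rows)
```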
Intercepting file access
Files are served by the Apache web server itself, so a handler was added to .htaccess that redirects requests to a PHP script.
It looks like this:
<FilesMatch "\.(.*)$">
<IfModule mod_rewrite.c>
RewriteEngine On
# Skip system files
RewriteCond %{REQUEST_URI} !\.(htaccess|php|js|css)$
# Only handle requests under the monitored directory
RewriteCond %{REQUEST_URI} ^/mypath/(.*)
# Pass the original URI to the plugin through index.php
RewriteRule ^(.*) /index\.php\?seraph_dlstat_api=Get&uri=%{REQUEST_URI} [L,QSA]
</IfModule>
</FilesMatch>
System files with the extensions htaccess, php, js, and css are deliberately excluded.
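The effect of these conditions can be checked outside Apache with a short Python sketch that reuses the same regular expressions. It is a simplified model: the function name is_intercepted is hypothetical, /mypath/ stands for the configured directory, and FilesMatch is approximated by testing the last path segment of the URI instead of the file name on disk.

```python
import re

HAS_EXTENSION = re.compile(r"\.(.*)$")               # FilesMatch "\.(.*)$"
EXCLUDED = re.compile(r"\.(htaccess|php|js|css)$")   # first RewriteCond (negated)
TRACKED = re.compile(r"^/mypath/(.*)")               # second RewriteCond

def is_intercepted(request_uri):
    """Return True if the request would be redirected to the PHP script.

    request_uri is the path only; in Apache, %{REQUEST_URI} carries no query string.
    """
    filename = request_uri.rsplit("/", 1)[-1]
    if HAS_EXTENSION.search(filename) is None:
        return False
    if EXCLUDED.search(request_uri) is not None:
        return False
    return TRACKED.search(request_uri) is not None

for uri in ("/mypath/docs/manual.pdf", "/mypath/tool.php", "/other/manual.pdf"):
    print(uri, is_intercepted(uri))
```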
To minimize the response time, the script is invoked through the seraph_dlstat_api parameter of index.php, which is checked almost immediately after all the WordPress scripts needed for processing have been loaded. This is done on the do_parse_request filter hook, the very first callback after the entire working environment has been loaded (that is, after wp-load.php has run).
Next, the script processes and records the URI and returns the contents of the file through PHP's readfile function. Partial downloads via HTTP_RANGE are also supported; in that case the file is read in blocks.
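Partial downloads come down to two steps: parsing the Range header and streaming only the requested bytes in blocks. The sketch below shows one way to do this in Python. It is not the plugin's PHP code; the names parse_range and read_blocks are hypothetical, and only the single-range form of the header is handled.

```python
import io
import re

RANGE_RE = re.compile(r"^bytes=(\d*)-(\d*)$")

def parse_range(header, size):
    """Return (start, end), inclusive, for a single-range header, or None."""
    match = RANGE_RE.match(header.strip())
    if match is None or size == 0:
        return None
    first, last = match.groups()
    if first == "":
        if last == "" or int(last) == 0:
            return None
        # Suffix form "bytes=-N": the last N bytes of the file.
        return max(size - int(last), 0), size - 1
    start = int(first)
    end = min(int(last), size - 1) if last else size - 1
    if start > end:
        return None  # unsatisfiable, e.g. start beyond the end of the file
    return start, end

def read_blocks(stream, start, end, block_size=8192):
    """Yield the bytes start..end of a seekable stream, block by block."""
    stream.seek(start)
    remaining = end - start + 1
    while remaining > 0:
        chunk = stream.read(min(block_size, remaining))
        if not chunk:
            break
        remaining -= len(chunk)
        yield chunk

data = io.BytesIO(b"0123456789")
start, end = parse_range("bytes=2-5", 10)
body = b"".join(read_blocks(data, start, end, block_size=3))
print(start, end, body)
```

Reading in blocks keeps memory use constant regardless of file size, which matters for the videos and distributions mentioned at the start.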
Deferred data sending
To minimize the response time, asynchronous sending of statistics is supported. When a file is accessed, an entry is created in the database and the file is returned to the client immediately. Later, when the WordPress scheduler (WP Cron) fires, the data is taken from the table and the statistics are sent.
For Google Analytics this is valid because it supports asynchronous reception of messages: each message can specify the delay time between the event and the moment it is sent.
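The deferred scheme can be sketched as a queue that stores the time of each download and reports the delay when it is finally flushed. This is an illustration in Python, not the plugin's code: the class DeferredStats is hypothetical, an in-memory deque stands in for the database table, and it assumes the delay is passed as the Measurement Protocol qt (queue time) parameter in milliseconds.

```python
from collections import deque

class DeferredStats:
    """Collect download records now, send them later with their delay."""

    def __init__(self):
        self._pending = deque()

    def record(self, uri, now_ms):
        # Called while serving the file: cheap, no network access.
        self._pending.append((uri, now_ms))

    def flush(self, send, now_ms):
        # Called later by the scheduler: send every pending record.
        sent = 0
        while self._pending:
            uri, recorded_ms = self._pending[0]
            send({"ea": uri, "qt": now_ms - recorded_ms})
            self._pending.popleft()  # remove only after a successful send
            sent += 1
        return sent

queue = DeferredStats()
queue.record("/mypath/a.zip", now_ms=1_000)
queue.record("/mypath/b.pdf", now_ms=4_000)
hits = []
sent = queue.flush(hits.append, now_ms=10_000)
print(sent, hits)
```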
By default, WP Cron is triggered when any page loads. You can run WP Cron from the system scheduler instead to further optimize the response time.
Conclusion
As a result, for the client the file download is indistinguishable from standard processing by the web server, and it is now possible to track it.
I would appreciate any feedback.