
Geo-targeting with nginx: a special case

The task was to set up geo-targeting for the regions of Russia on a news site: a visitor who opens the main page should be redirected to the regional page of the site, with an address of the form region/[region number]. The redirect had to be done by nginx itself, without passing the request on to Apache, since that would put unnecessary load on the server.

The site averages about 40 thousand visits per day. It runs on Drupal with the Boost caching module, which generates static pages that nginx serves.

A Google search turned up two kinds of solutions: a JavaScript redirect on the client side, or passing the request to Apache and querying a database for the target URL. Neither suited me from the start.

After looking through the available IP address databases (www.wipmania.com/ru/base, www.maxmind.com/en/home, ipgeobase.ru), I had a “brilliant idea”: if the database itself contained the URLs I needed, that is, [region number], happiness would be complete.
Given that, and given that geo-targeting was needed only for the regions of Russia, I settled on the nginx geo module and the address database from ipgeobase: the geo module can take a text file as its address database, and the ipgeobase database is distributed as text. All that remained was to convert the address database to the required format...

So:
Download the database archive from ipgeobase.ru/files/db/Main/geo_files.tar.gz and unpack it to get two files: cidr_optim.txt and cities.txt.
cidr_optim.txt has the following record format:
<block start> <block end> <address block> <country> <city identifier>

<block start> - the number computed from the first IP address a.b.c.d of the block (range) by the formula a * 256 * 256 * 256 + b * 256 * 256 + c * 256 + d
<block end> - the number computed from the last IP address e.f.g.h of the block (range) by the formula e * 256 * 256 * 256 + f * 256 * 256 + g * 256 + h
<address block> - the block (range) of IP addresses, written as a.b.c.d - e.f.g.h, for which the location is determined
<country> - the two-letter code of the country the block belongs to
<city identifier> - the identifier of the city from the cities.txt file. A dash instead of an identifier means that either the city could not be determined or the country of the block is neither Russia nor Ukraine.
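
To make the formula for <block start> and <block end> concrete, here is a minimal Python sketch. The function name ip_to_int and the sample address are made up for illustration; the standard ipaddress module produces the same number, which the sketch uses as a cross-check.

```python
import ipaddress

def ip_to_int(ip: str) -> int:
    """Convert a dotted IPv4 address a.b.c.d to a*256^3 + b*256^2 + c*256 + d."""
    a, b, c, d = (int(part) for part in ip.split("."))
    return a * 256 * 256 * 256 + b * 256 * 256 + c * 256 + d

# Cross-check against the standard library for an arbitrary sample address.
sample = "192.168.1.10"
assert ip_to_int(sample) == int(ipaddress.IPv4Address(sample))
print(ip_to_int(sample))  # 3232235786
```

Storing the address as a single integer makes range checks plain comparisons: an address belongs to a block when its number lies between <block start> and <block end>.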

cities.txt has the following record format:
<city identifier> <city name> <region name> <district name> <city center latitude> <city center longitude>
The files are described here: ipgeobase.ru/Help.html#35
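
A rough sketch of reading both record formats in Python might look like this. It assumes the fields are separated by tabs, which should be checked against the help page; the names Block, City, parse_block and parse_city, as well as the sample records, are invented for illustration and are not real data from the files.

```python
from typing import NamedTuple, Optional

class Block(NamedTuple):
    start: int               # <block start>
    end: int                 # <block end>
    first_ip: str            # left part of <address block>
    last_ip: str             # right part of <address block>
    country: str             # two-letter country code
    city_id: Optional[int]   # None when the file has a dash

class City(NamedTuple):
    city_id: int
    name: str
    region: str
    district: str
    lat: float
    lon: float

def parse_block(line: str) -> Block:
    # Assumption: fields are tab-separated.
    start, end, ip_range, country, city = line.rstrip("\r\n").split("\t")
    first_ip, last_ip = (part.strip() for part in ip_range.split("-"))
    city = city.strip()
    return Block(int(start), int(end), first_ip, last_ip, country,
                 None if city == "-" else int(city))

def parse_city(line: str) -> City:
    # Assumption: fields are tab-separated.
    city_id, name, region, district, lat, lon = line.rstrip("\r\n").split("\t")
    return City(int(city_id), name, region, district, float(lat), float(lon))

# Invented sample records.
block = parse_block("3232235776\t3232236031\t192.168.1.0 - 192.168.1.255\tRU\t42\n")
city = parse_city("42\tSample City\tSample Region\tSample District\t55.0\t37.0\n")
print(block.first_ip, block.last_ip, city.region)
```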

I parsed both files into a database, which gave two tables with the data from the files. Then, using the region names, I brought them to the form

“block (range) of IP addresses a.b.c.d - e.f.g.h” -> (required URL) [region number]

Only a small matter remained: converting the
“block (range) of IP addresses a.b.c.d - e.f.g.h” into the format the geo module understands:
0.0.0.0/0 (starting address / mask length in bits).
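
For readers facing the same conversion, one possible way to do it (not the method used in this article) is Python's standard ipaddress module, whose summarize_address_range function splits an arbitrary range into the smallest set of address/mask networks. The wrapper name range_to_cidrs and the sample ranges are illustrative.

```python
import ipaddress
from typing import List

def range_to_cidrs(first_ip: str, last_ip: str) -> List[str]:
    """Split the range first_ip - last_ip into networks in address/mask form."""
    first = ipaddress.IPv4Address(first_ip)
    last = ipaddress.IPv4Address(last_ip)
    return [str(net) for net in ipaddress.summarize_address_range(first, last)]

# A range that is exactly one network.
print(range_to_cidrs("192.168.1.0", "192.168.1.255"))  # ['192.168.1.0/24']
# A range that does not fit a single mask is split into several networks.
print(range_to_cidrs("10.0.0.0", "10.0.2.255"))        # ['10.0.0.0/23', '10.0.2.0/24']
```

Note that one range from the source file can turn into several lines in the resulting file, because not every range can be described by a single mask.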

Here, oddly enough, the fun began. I questioned every admin I know. They all said that they had once learned how to convert a range into the format I needed but, never having needed it, had long since forgotten, and had no time to recall it. Google, always ready to help, offered either instructions for computing the range from an address and a mask, or a thorough study of how IPv4 networking works.
I chose a third option. I found ip-calculator.ru, contacted the site's administrator, who kindly agreed to help with the conversion and to explain how addresses are converted into the required format (thanks again).

The result was a file with lines of the form “0.0.0.0/0 (required URL);”, 57 thousand lines in all. Let's call it, say, geo_ru.conf.
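
As an illustration of the line format, here is a small Python sketch that turns (network, region number) pairs into lines for such a file and rejects malformed networks before they reach nginx. The function name geo_line and the sample pairs, including the region numbers, are invented.

```python
import ipaddress

def geo_line(cidr: str, region_number: str) -> str:
    """Format one 'network value;' line for the geo block."""
    # ip_network raises ValueError for a malformed network or for an
    # address with host bits set, e.g. "192.168.1.5/24".
    net = ipaddress.ip_network(cidr)
    return f"{net} {region_number};"

# Invented sample data, not taken from the real database.
pairs = [("192.168.1.0/24", "77"), ("10.0.0.0/23", "54")]
lines = [geo_line(cidr, region) for cidr, region in pairs]
print("\n".join(lines))
```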

Now for nginx itself.
In the http {} block, enable the module:
geo $region_number {
    default all;
    include [   ]/geo_ru.conf;
}

That is, after a request arrives, if the client's address is in the file, the variable $region_number contains the corresponding value, namely [region number]; otherwise it contains 'all'. (More information: nginx.org/ru/docs/http/ngx_http_geo_module.html)
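Next, the redirect itself.
In the server block of the site:
 # by default, set $get_redirect to donot_redirect
 set $get_redirect donot_redirect;
 # if the main page is requested, switch the flag to do_redirect
 if ($uri = '/') { set $get_redirect do_redirect; }
 # if nginx did not find the address in the database, $region_number is 'all', so do not redirect
 if ($region_number = 'all') { set $get_redirect donot_redirect; }
 # if the cookie is set, i.e. the client has already been redirected (and may now want the main page), do not redirect
 if ($cookie_geolocate = 1) { set $get_redirect donot_redirect; }
 # if the flag is still do_redirect, redirect
 if ($get_redirect = do_redirect) { rewrite ^(.*)$ http://example.com/region/$region_number redirect; }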
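
The flag variable is there because an nginx if cannot combine several conditions with "and" or "or", so each check adjusts $get_redirect in turn. The decision the config makes can be restated as a short Python sketch; the function name should_redirect and the sample values are invented for illustration.

```python
def should_redirect(uri: str, region_number: str, geolocate_cookie: str = "") -> bool:
    """Mirror the chain of nginx 'if' checks that set $get_redirect."""
    if uri != "/":
        return False   # only the main page is redirected
    if region_number == "all":
        return False   # address not found in the geo database
    if geolocate_cookie == "1":
        return False   # client was already redirected once
    return True

print(should_redirect("/", "77"))        # True
print(should_redirect("/", "all"))       # False
print(should_redirect("/", "77", "1"))   # False
```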

(That is, in the end we got what we wanted: a redirect to region/[region number].)

And finally, so that the client can still view the main page, send the client a cookie in the location / {} block:
 add_header Set-Cookie "geolocate=1;Path=/;Domain=.example.com;"; 


That's all. I hope this helps someone.
You can see it working on fedpress.ru.
I wrote this article because the solution, seemingly obvious, did not come to me right away. I would be glad to get comments, advice, and clarifications.

P.S. Dear administrators on Habr, please write a step-by-step “for dummies” article about what IP addresses are, how to compute a mask from a range and vice versa, and why <block start> (a * 256 * 256 * 256 + b * 256 * 256 + c * 256 + d) and <block end> (e * 256 * 256 * 256 + f * 256 * 256 + g * 256 + h) are needed. I think many would be grateful.

Source: https://habr.com/ru/post/159335/

