
ipgeobase in nginx

When the task comes up of getting the city and the tax (vehicle) region code from a visitor's IP address, it seems trivial: the internet is full of such tools!

And then you look closer: some are paid, others cannot be deployed locally, a third kind can be but are resource-hungry, and a fourth kind knows nothing about the regions of the Russian Federation ...

And here the sick brain of a programmer hurries to the rescue with an obsessive idea: "If nobody else has it, build it yourself."







Once you start thinking along these lines, it turns out that nginx has an excellent geo module, which "is not only fast, but also optimized to the point of impossibility". The bad luck is that it does not understand any of the well-known database formats (MaxMind, Sypex, ipgeobase).


A couple of hours in the arms of Python, and there is a decent converter that pulls everything we need from ipgeobase.ru.

(Yes, there were rumors that everyone there was laid off half a year ago, but the databases are still updated regularly, which is good news.)



And so that nobody has to worry about what it does, I will comment on the code below (if you are not interested, you can skip straight to the configuration).

Code





1. Download the database
Nothing complicated here, requests + zipfile:

    from io import BytesIO
    from zipfile import ZipFile

    import requests

    # error() is the script's own helper; assumed here to report the message and abort
    archive = requests.get("http://ipgeobase.ru/files/db/Main/geo_files.zip")
    if archive.status_code != 200:
        error("IPGeobase no answer: %s" % archive.status_code)
    # Open the downloaded archive in memory, without writing it to disk
    extracteddata = ZipFile(BytesIO(archive.content))
    filelist = extracteddata.namelist()
    if "cities.txt" not in filelist:
        error("cities.txt not downloaded")
    if "cidr_optim.txt" not in filelist:
        error("cidr_optim.txt not downloaded")
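The archive check can be tried offline, without touching ipgeobase.ru. The sketch below is only an illustration: `missing_members` and `build_sample_zip` are hypothetical helper names, not part of the original converter, and the sample archive is built in memory with empty files.

```python
from io import BytesIO
from zipfile import ZipFile

# The two files the converter relies on
REQUIRED = ("cities.txt", "cidr_optim.txt")


def missing_members(zip_bytes, required=REQUIRED):
    """Return the required file names that are absent from the zip archive."""
    with ZipFile(BytesIO(zip_bytes)) as archive:
        present = set(archive.namelist())
    return [name for name in required if name not in present]


def build_sample_zip(names):
    """Build an in-memory zip with empty files, for testing only."""
    buffer = BytesIO()
    with ZipFile(buffer, "w") as archive:
        for name in names:
            archive.writestr(name, "")
    return buffer.getvalue()


complete = build_sample_zip(["cities.txt", "cidr_optim.txt"])
incomplete = build_sample_zip(["cities.txt"])
print(missing_members(complete))
print(missing_members(incomplete))
```

Returning the list of missing names, instead of aborting on the first one, makes it possible to report everything that is wrong with the download in a single message.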






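2. Load the dictionary of regions

    # Each line is "code<TAB>name"; reversing the pair gives a name -> code mapping
    with open("regions.tsv", encoding="utf8") as regions_file:
        REGIONS = dict(l.rstrip().split("\t")[::-1] for l in regions_file)


where regions.tsv is a tab-separated list of automotive / tax region codes, each followed by the region name, one pair per line, of the form: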

66

77

78 -
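To make the direction of the mapping explicit, here is a minimal sketch of the same parsing step applied to an in-memory sample. The helper name `load_regions` and the region names in the sample are illustrative assumptions; in the real file the names have to match the region names used in cities.txt exactly.

```python
def load_regions(lines):
    """Build a region name -> region code mapping from 'code<TAB>name' lines."""
    regions = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        code, name = line.split("\t", 1)
        regions[name.strip()] = code.strip()
    return regions


# Illustrative sample only, not the contents of the real regions.tsv
sample_lines = [
    "66\tСвердловская область\n",
    "77\tМосква\n",
    "78\tСанкт-Петербург\n",
]

REGIONS_SAMPLE = load_regions(sample_lines)
print(REGIONS_SAMPLE["Москва"])
```

The lookup key is the region name because that is what cities.txt provides for each city; the code is what ends up in the nginx variable.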







3. Get a dictionary of cities
For each city we need to know its id, name and region code:

    from base64 import b64encode

    CITIES = {}
    for line in extracteddata.open("cities.txt").readlines():
        # Columns: city id, city name, region name, then three fields we do not need
        cid, city, region_name, _, _, _ = line.decode("cp1251").split("\t")
        if region_name in REGIONS:
            CITIES[cid] = {'city': b64encode(city.encode("utf8")).decode("ascii"),
                           'reg_id': REGIONS[region_name]}
            if cid == "1199":  # Zelenograd fix
                CITIES[cid]['reg_id'] = "77"




Note that, with an eye to the future, the UTF-8 city name is base64-encoded right away. This widens where the value can be used (for example, in the nginx logs) without having to deal with transliteration.
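Whoever reads the value later (a log parser, a backend) has to reverse the encoding. A minimal sketch of the round trip, where `encode_city` and `decode_city` are hypothetical helper names rather than functions from the converter:

```python
from base64 import b64decode, b64encode


def encode_city(name):
    """Encode a city name as ASCII-only base64 of its UTF-8 bytes."""
    return b64encode(name.encode("utf8")).decode("ascii")


def decode_city(value):
    """Recover the original city name from its base64 form."""
    return b64decode(value).decode("utf8")


encoded = encode_city("Москва")
print(encoded)
print(decode_city(encoded))
```

The encoded form contains only ASCII letters, digits, `+`, `/` and `=`, so it has no spaces or semicolons that could break an entry in the generated geo file.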





4. Join the address ranges with the cities

    database = {}
    for line in extracteddata.open("cidr_optim.txt").readlines():
        _, _, ip_range, country, cid = line.decode("cp1251").rstrip().split("\t")
        if country == "RU" and cid in CITIES:
            # Drop the whitespace inside the range to get the start-end form used with "ranges"
            database["".join(ip_range.split())] = CITIES[cid]


Obviously, if the country is not Russia, ipgeobase has no regions or cities for it, and such ranges are of no use for our task.
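As a worked example of this step on a single record, the sketch below parses one line and normalizes its range. It assumes the third column looks like `192.0.2.0 - 192.0.2.255`, which is what the whitespace stripping in the code suggests; the sample line is made up and `parse_cidr_line` is a hypothetical helper name.

```python
def parse_cidr_line(line):
    """Split one cidr_optim.txt record into (range, country, city id)."""
    # The first two columns are not used by the converter
    _, _, ip_range, country, cid = line.rstrip().split("\t")
    # Remove all whitespace so the range reads "start-end"
    return "".join(ip_range.split()), country, cid


# Made-up record using a documentation address range, not real ipgeobase data
sample = "3221225984\t3221226239\t192.0.2.0 - 192.0.2.255\tRU\t1199\n"

ip_range, country, cid = parse_cidr_line(sample)
print(ip_range, country, cid)
```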





5. Generate the files for the geo module

    with open("region.txt", "w") as reg, open("city.txt", "w") as city:
        for ip_range in sorted(database):
            info = database[ip_range]
            # One "range value;" entry per line, as the geo module expects
            city.write("%s %s;\n" % (ip_range, info['city']))
            reg.write("%s %s;\n" % (ip_range, info['reg_id']))
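One detail worth knowing: `sorted(database)` compares the keys as strings, so `192.0.2.0-...` sorts before `5.0.0.0-...`. The nginx documentation notes that a geo base loads faster when addresses are in ascending order, so a numeric sort may be preferable. This is a possible variation, not what the original converter does; `range_start` and `render_geo_lines` are hypothetical names and the sample entries are made up.

```python
from ipaddress import IPv4Address


def range_start(ip_range):
    """Return the first address of 'a.b.c.d-e.f.g.h' as an integer."""
    start, _, _ = ip_range.partition("-")
    return int(IPv4Address(start))


def render_geo_lines(database, field):
    """Yield 'range value;' entries ordered by numeric start address."""
    for ip_range in sorted(database, key=range_start):
        yield "%s %s;" % (ip_range, database[ip_range][field])


# Made-up sample entries, not real ipgeobase data
sample_db = {
    "192.0.2.0-192.0.2.255": {"city": "0JzQvtGB0LrQstCw", "reg_id": "77"},
    "5.0.0.0-5.0.0.255": {"city": "Y2l0eQ==", "reg_id": "66"},
}

for entry in render_geo_lines(sample_db, "reg_id"):
    print(entry)
```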






Nginx configuration





For everything to work, the geo module (nginx.org/ru/docs/http/ngx_http_geo_module.html) must be enabled in nginx. Put the generated files in a known place and add this config to the http section:

    geo $region {
        ranges;
        include geo/region.txt;
    }

    geo $city {
        ranges;
        include geo/city.txt;
    }


After these manipulations, two variables, $city and $region, will appear in nginx and can be used anywhere nginx accepts variables, for example in the log format.







In practice, this setup works almost instantly, puts no load on nginx, and, because database updates are easy to automate, is fairly accurate (it all comes down to how much you trust the ipgeobase.ru databases). So it seemed that it might be useful to someone else. Feel free to use it and, perhaps, to write converters for other data providers.



GitHub code (ipgeobase-importer branch)



P.S. Some time after writing the article, I rewrote everything in Go and added support for MaxMind.

Source: https://habr.com/ru/post/264219/


