
How to determine (the user's city) by IP

Knowing a person's location, you can do a thousand things, some useful and some not so much: offer the right product and quote the delivery price in advance, show where Pokémon live, display local news, or recommend a cafe nearby.

Location is important.



What are the methods of geolocation?


There are 2 basic methods of geolocation, if we exclude parsing photo geotags and spying from satellites.

  1. Take the user's IP address and look up the city and country in a special directory.
  2. Find out the location through the HTML5 Geolocation API.



In this article we explain how to find the user's city, because that level of accuracy is usually enough. The city is enough for online stores, courier services, news aggregators, and weather forecast sites.

The city is best determined by IP: the method always works and does not bother the user. In IP geolocation, the main thing is to find a directory that is easy to plug in and returns the city without errors. The second part of the article is about this.

How we chose an IP address directory


There is a big problem in comparing directories: it is impossible to verify whether an IP address really belongs, right now, to the city the directory returns. Yesterday an IP belonged to Saint Petersburg, and today it is in Nizhny Novgorod.

Therefore, we compared the directories by these criteria:

  1. Cost.
  2. Update frequency.
  3. The number of IP address ranges for Russia.
  4. The number of addresses "on the ground", or completeness. To measure completeness, we ran all the addresses from each directory through the "Dadata" standardization API. The service brought the addresses to a common format and sorted them by type: region, district, city. We counted these standardized addresses.
  5. Format: how convenient it is to use.
  6. Libraries and integration with popular frameworks.
  7. What can be pulled out of the database besides the city.
  8. Whose Crimea is (politics is politics, but business has to work).
  9. Level of detail by locality. To find out, we ran 35,000 random unique IP addresses through each directory, then compared how many unique cities each directory returned.
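The level-of-detail measurement from item 9 can be sketched roughly as follows. This is a minimal illustration, not the code used for the comparison: `lookup_city` stands for whatever function wraps a given directory, and the toy dictionary is made-up data that replaces a real database.

```python
from typing import Callable, Iterable, Optional


def count_unique_cities(
    ips: Iterable[str],
    lookup_city: Callable[[str], Optional[str]],
) -> int:
    """Count the distinct cities a directory returns for a sample of IPs."""
    cities = set()
    for ip in ips:
        city = lookup_city(ip)
        if city:  # skip addresses the directory cannot resolve to a city
            cities.add(city)
    return len(cities)


# Toy stand-in for a real directory (hypothetical data).
TOY_DIRECTORY = {
    "192.0.2.1": "Saint Petersburg",
    "192.0.2.2": "Saint Petersburg",
    "198.51.100.7": "Nizhny Novgorod",
    "203.0.113.9": None,  # resolved to a country only
}

sample = list(TOY_DIRECTORY)
print(count_unique_cities(sample, TOY_DIRECTORY.get))
```

Running the same sample through each directory's lookup function gives numbers that can be compared directly.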

We considered these directories:


IPGeoBase


Cost. Free.

Updates. Daily.

IP address pools in Russia. 43,751 pools, which is first place.

Completeness. 728 objects:


Third place for this parameter.

Database format. Tab-separated text files. One file holds the cities with their IDs, the other holds the IP ranges linked to them.



The file encoding is a pain called Windows-1251. Well, there is iconv, and we got UTF-8 with a flick of the wrist:

iconv -f WINDOWS-1251 -t UTF-8 cities.txt > cities_utf8.txt 

The database is slow (it is a text file, after all): going through 35,000 addresses took several minutes.
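Much of that time goes into scanning text for every address. One possible workaround, which is our assumption and not something the directory ships, is to load the ranges into memory once, sort them by start, and use binary search. The column layout below (start integer, end integer, city ID) is a simplified, hypothetical stand-in for the real files, and the rows are made up.

```python
import bisect
import ipaddress

# Hypothetical, simplified rows: range start, range end (integers), city ID.
RANGES_TSV = "84098048\t84098303\t1\n3232235520\t3232235775\t2\n"
# Hypothetical, simplified rows: city ID, city name.
CITIES_TSV = "1\tSaint Petersburg\n2\tNizhny Novgorod\n"


def load_ranges(text):
    """Parse tab-separated ranges into a list sorted by range start."""
    rows = []
    for line in text.splitlines():
        start, end, city_id = line.split("\t")
        rows.append((int(start), int(end), city_id))
    rows.sort()
    return rows


def load_cities(text):
    """Parse tab-separated cities into an {id: name} dictionary."""
    return dict(line.split("\t") for line in text.splitlines())


def find_city(ip, ranges, starts, cities):
    """Binary-search the range that contains ip; return the city or None."""
    value = int(ipaddress.IPv4Address(ip))
    i = bisect.bisect_right(starts, value) - 1
    if i >= 0 and ranges[i][0] <= value <= ranges[i][1]:
        return cities.get(ranges[i][2])
    return None


ranges = load_ranges(RANGES_TSV)
starts = [r[0] for r in ranges]  # precomputed keys for bisect
cities = load_cities(CITIES_TSV)

print(find_city("5.3.60.200", ranges, starts, cities))
print(find_city("8.8.8.8", ranges, starts, cities))
```

The sketch assumes the ranges do not overlap; with overlapping ranges the search would need an extra pass.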

Libraries. There are ready-made ones for Perl, Ruby, and Python, but the newest one is from 2013. In the 4 years since, Trump became president of the United States, PHP 7 was released, and a million JS frameworks appeared, but none of the libraries for this directory were updated.

It took an hour to port the library to Python 3.

What can be pulled out of the database.

 ('RU', '-', '-', '59.939037', '30.315784') 

Crimea. Ours.

Level of detail. In the sample of 35,000 addresses, the directory found 372 different localities.

This is third place, slightly behind second.

Verdict. IPGeoBase is a collection of cities and IP address ranges packaged in tab-separated .txt files. It is updated frequently.

The downsides are libraries from ancient times, and a text file is not the most convenient way to access data.

A Tu-154 that has seen a lot of life but still flies.




SypexGEO


Cost. Free, distributed under the BSD license.

Updates. A couple of times a month.

IP address pools in Russia. There are 1,696,337 ranges in total, but it is unclear how many of them belong to Russia: the data is buried inside the directory. No place is awarded for this parameter.

Completeness. 832 objects:


Second place. Not bad!

Format. A strange .dat file with an offset-based structure. We could not quickly dig into its insides: the creator says on the forum that there is no converter for translating the database into a human-readable form.

How to work with the directory other than through a library is unclear. For the curious, there is a specification on the directory's site.

The speed is good: going through 35,000 addresses took a few seconds.

Libraries. There are libraries for Python, PHP Yii, PHP Laravel, Java, and Ruby. They were updated 2-3 years ago. There is also an integration with Symfony and a plugin for WordPress.

What can be pulled out of the database:

 {'city': {'id': 498817, 'lat': 59.93863, 'lon': 30.31413, 'name_ru': '-', 'name_en': 'Saint Petersburg'},
  'region': {'id': 536203, 'name_ru': '-', 'name_en': 'Sankt-Peterburg', 'iso': 'RU-SPE'},
  'country': {'id': 185, 'iso': 'RU', 'lat': 60.0, 'lon': 100.0, 'name_ru': '', 'name_en': 'Russia'},
  'region': '-', 'tz': ''}

Crimea. Not ours.

Level of detail. In the sample of 35,000 addresses, the directory found 400 different localities.

This is second place.

Verdict. The speed is very appealing: offsets are a powerful thing. The authors say that they specifically optimized the database for high loads.

In terms of content and accuracy, it is similar to IPGeoBase: it has a few more objects, and 10% of the addresses resolve differently.
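A figure like "10% of the addresses resolve differently" can be computed by running the same sample through two directories and counting mismatches. A minimal sketch, assuming each directory is wrapped in a lookup function; the two toy dictionaries are made-up data, not real results.

```python
def disagreement_share(ips, lookup_a, lookup_b):
    """Return the fraction of IPs for which two directories disagree."""
    ips = list(ips)
    if not ips:
        return 0.0
    differing = sum(1 for ip in ips if lookup_a(ip) != lookup_b(ip))
    return differing / len(ips)


# Toy stand-ins for two directories (hypothetical data).
DIRECTORY_A = {
    "192.0.2.1": "Saint Petersburg",
    "192.0.2.2": "Saint Petersburg",
    "198.51.100.7": "Nizhny Novgorod",
    "203.0.113.9": "Moscow",
}
DIRECTORY_B = {
    "192.0.2.1": "Saint Petersburg",
    "192.0.2.2": "Saint Petersburg",
    "198.51.100.7": "Nizhny Novgorod",
    "203.0.113.9": "Kazan",
}

share = disagreement_share(DIRECTORY_A, DIRECTORY_A.get, DIRECTORY_B.get)
print(f"{share:.0%}")
```

Note that a mismatch only says the directories differ, not which one is right.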

The database is fully open.

Unfortunately, Crimea does not resolve to Russia.

A Black Hawk helicopter: it flies great, but it will not suit everyone.




MaxMind Lite


Cost. Free under a Creative Commons license. There is a paid version, which costs $1,470 per year.

Updates. The first Tuesday of each month (just like a passport office).

IP address pools in Russia. 91,432. If you remove the IP addresses that resolve to Russia without a city, 42,822 remain. This is second place.

Completeness. 1392 objects:


First place by a wide margin!

Database format. Its own .mmdb format. Cities and IP address ranges are also available in .csv files that are archived with the database.

The database comes in versions with different precision: down to the country, down to the city, plus an ASN directory (autonomous system numbers, which identify networks such as Internet providers). There is also a database for IPv6 addresses.

Libraries. Everything is in order here: there are several dozen libraries for working with the database on GitHub.

What can be pulled out of the database. The output is super detailed and multilingual. MaxMind returns an interesting parameter, accuracy_radius: the accuracy radius around the coordinates, in kilometers.

 { "city": { "geoname_id": 498817, "names": { "de": "Sankt Petersburg", "en": "Saint Petersburg", "es": "San Petersburgo", "fr": "Saint-Pétersbourg", "ja": "サンクトペテルブルク", "pt-BR": "São Petersburgo", "ru": "-", "zh-CN": "圣彼得堡" } }, "continent": { "code": "EU", "geoname_id": 6255148, "names": { "de": "Europa", "en": "Europe", "es": "Europa", "fr": "Europe", "ja": "ヨーロッパ", "pt-BR": "Europa", "ru": "", "zh-CN": "欧洲" } }, "country": { "geoname_id": 2017370, "iso_code": "RU", "names": { "de": "Russland", "en": "Russia", "es": "Rusia", "fr": "Russie", "ja": "ロシア", "pt-BR": "Rússia", "ru": "", "zh-CN": "俄罗斯" } }, "location": { "accuracy_radius": 20, "latitude": 59.9321, "longitude": 30.1968, "time_zone": "Europe/Moscow" }, "postal": { "code": "191023" }, "registered_country": { "geoname_id": 2017370, "iso_code": "RU", "names": { "de": "Russland", "en": "Russia", "es": "Rusia", "fr": "Russie", "ja": "ロシア", "pt-BR": "Rússia", "ru": "", "zh-CN": "俄罗斯" } }, "subdivisions": [ { "geoname_id": 536203, "iso_code": "SPE", "names": { "en": "St.-Petersburg", "es": "San Petersburgo", "fr": "Léningrad", "ru": "-" } } ], "traits": { "ip_address": "109.205.249.212" } } 
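A record like the one above is worth reading defensively, since not every language is present for every object. The sketch below works on a trimmed copy of that record with plain dictionaries; the helper `city_summary` is hypothetical and is not part of any MaxMind library.

```python
import json

# Trimmed copy of the record shown above.
RECORD_JSON = """
{
  "city": {"geoname_id": 498817,
           "names": {"de": "Sankt Petersburg", "en": "Saint Petersburg"}},
  "country": {"iso_code": "RU", "names": {"en": "Russia"}},
  "location": {"accuracy_radius": 20, "latitude": 59.9321,
               "longitude": 30.1968, "time_zone": "Europe/Moscow"}
}
"""


def city_summary(record, lang="en"):
    """Extract city name, country code, coordinates and accuracy radius."""
    names = record.get("city", {}).get("names", {})
    location = record.get("location", {})
    return {
        # Fall back to English when the requested language is missing or empty.
        "city": names.get(lang) or names.get("en"),
        "country": record.get("country", {}).get("iso_code"),
        "lat": location.get("latitude"),
        "lon": location.get("longitude"),
        "radius_km": location.get("accuracy_radius"),
    }


record = json.loads(RECORD_JSON)
print(city_summary(record, lang="de"))
print(city_summary(record, lang="ja"))
```

Using `.get` throughout means a record that stops at the country level yields `None` for the city fields instead of raising an error.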

Crimea. Not ours.

Level of detail. In the sample of 35,000 addresses, the directory found 749 address objects.

This is first place.

But there are nuances:


Verdict. A detailed database with elegant output.

In 50% of cases, the results diverge from the previous two databases: MaxMind Lite's accuracy and level of detail are higher.

But there are fundamental drawbacks: the update frequency and Crimea.

A tricked-out spacecraft that is updated once a month and does not consider Crimea Russian.




ip2ruscity


Cost. Paid: 5,000 rubles a year.

Updates. Once a month.

IP address pools in Russia. 34,907 pools, third place.

Completeness. 486 objects:


Fourth place, far behind third.

Database format. Tab-separated text files or SQL files. They contain cities, regions, and IP address ranges. There are also city telephone codes, but for some reason they are only available in the MySQL format. In general, as in the program of the “Uncertain Russia” party: it will be average (not exactly great, but okay).

To save space, the start and end of each IP address range are stored as unsigned integers (uint). You have to convert them to the usual dotted IP address form yourself.



Not very convenient, but you can live with it. In Python, it is easy to do:

 import socket, struct

 # Pack the uint as 4 big-endian bytes, then format them as a dotted IP address
 socket.inet_ntoa(struct.pack('!I', 84098303))  # '5.3.60.255'
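The standard `ipaddress` module does the same conversion in both directions, which is handy for checking whether an address falls inside a stored range. A minimal sketch; the helper names and the range bounds are made up for illustration.

```python
import ipaddress


def uint_to_ip(value):
    """Convert an unsigned 32-bit integer to dotted-quad notation."""
    return str(ipaddress.IPv4Address(value))


def ip_to_uint(ip):
    """Convert dotted-quad notation back to an unsigned 32-bit integer."""
    return int(ipaddress.IPv4Address(ip))


def in_range(ip, start, end):
    """Check whether ip lies within the inclusive integer range."""
    return start <= ip_to_uint(ip) <= end


print(uint_to_ip(84098303))      # 5.3.60.255
print(ip_to_uint("5.3.60.255"))  # 84098303
# Hypothetical range covering 5.3.60.0 to 5.3.60.255.
print(in_range("5.3.60.17", 84098048, 84098303))  # True
```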

Libraries. Not a single one was found :(. For the research I had to cobble together my own hack, and I will not publish the code.

The service recently added an API. It returns:


The API is conditionally free: no more than 20 requests per day from a single IP address. The paid version allows 3,000 requests per hour.

What can be pulled out of the base.

 {'city': '-', 'region': '-', 'region_id': '78'} 

If you use the MySQL-format database, the city's telephone code is also returned.

Crimea. Ours.

Level of detail. In the sample of 35,000 addresses, the directory found 273 localities. This is last place.

Verdict. It seems inexpensive, but for the money it could be better.

A propeller-driven ATR-72 of Air Serbia.




Summing up (like Channel One)


The free MaxMind Lite is faster, higher, and stronger than the rest in almost all parameters. However, it has 2 important drawbacks: it is updated only once a month and does not consider Crimea Russian.

At Dadata, we lay awake at night thinking about which directory to choose for our geolocation API. In the end, we took IPGeoBase as the basis and bolted all sorts of advantages on top.

Compared with the “bare” IPGeoBase, “Dadata” is more convenient.

Updated automatically. The service updates the directory as soon as a new version is released; you do not need to remember to do it.

Libraries are not needed. The directory is available through an API, and any HTTP library can connect to it. The request is very simple: you send only the IP address and the token that you receive when registering at DaData.ru.

 curl -X GET \
   -H "Accept: application/json" \
   -H "Authorization: Token ${yoursecrettoken}" \
   "https://suggestions.dadata.ru/suggestions/api/4_1/rs/detectAddressByIp?ip=213.180.193.3"
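The same request can be made from Python with only the standard library. A minimal sketch: the endpoint and headers are copied from the curl command above, the token is a placeholder, the function names are ours, and the response is returned as parsed JSON without assuming any field names.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://suggestions.dadata.ru/suggestions/api/4_1/rs/detectAddressByIp"


def build_request(ip, token):
    """Build the GET request without sending it."""
    url = API_URL + "?" + urllib.parse.urlencode({"ip": ip})
    return urllib.request.Request(
        url,
        headers={
            "Accept": "application/json",
            "Authorization": "Token " + token,
        },
        method="GET",
    )


def detect_address_by_ip(ip, token, timeout=5.0):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(ip, token), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Building the request does not touch the network.
req = build_request("213.180.193.3", "yoursecrettoken")
print(req.full_url)
```

Calling `detect_address_by_ip("213.180.193.3", token)` with a real token sends the request; keeping the builder separate makes it possible to inspect the URL and headers first.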

The service returns much more data than the "bare" directory. In addition to the name of the found object:


In total, there are several dozen fields per IP, and there is a full specification on DaData.ru.

We turned a working but unpainted Tu-154 into an Airbus A-380.

In economy class we carry you free of charge: you can make 10,000 requests per day to the API simply by registering. If you need more, it costs from 4,000 rubles per year.

Source: https://habr.com/ru/post/340466/

