
Sypex Geo: fast city detection by IP

At the beginning of the year I published the article "Determining the country by IP: testing the speed of the algorithms", which mentioned my own "reinvented wheel", notable for its high speed. One of the most frequent questions was whether the city could also be determined by IP.

A few months later, what started "just for fun" has grown into an independent project.
A separate site dedicated to Sypex Geo has been opened, where you can download the latest versions of the API and the databases and read the documentation.

For those who want to check how accurately the city is determined by IP, there is a demo page. Below the cut, I describe some technical details and give the results of a small benchmark.

Sypex Geo 2.1 format


Since the previous article was published, the Sypex Geo format (SxGeo for short) has been optimized, and the database can now include two reference tables: cities and regions.
When creating a new format, the following priorities were set:

The format allows data to be stored in different encodings. Once the project leaves beta, scripts for converting a database from MySQL into a binary database file will be published.

The following information is stored in the database:

This list can be modified and/or expanded if necessary. More information about the SxGeo 2.1 format can be found on the website.

Our own database


After getting very closely acquainted with GeoLite City, the popular geolocation database from MaxMind, I decided to create my own database. The problem is that GeoLite City contains a lot of inaccuracies, junk, duplicate cities, and excessively fragmented ranges, and it has trouble with cities of the former USSR (for example, company names or names of responsible persons from Whois appear instead of city names).
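To illustrate what "excessively fragmented ranges" means, here is a minimal sketch that merges neighbouring IP ranges pointing to the same city. It is not how the Sypex Geo database is actually built. The function name `mergeRanges`, the array layout, and the sample data (documentation-only address blocks) are assumptions made for this example, and it assumes 64-bit PHP so that `ip2long` returns non-negative integers.

```php
<?php
// Hypothetical helper, not part of the Sypex Geo API.
// $ranges: list of array(startIp, endIp, city) with IPs as integers,
// sorted by startIp and non-overlapping.
function mergeRanges(array $ranges) {
    $merged = array();
    foreach ($ranges as $range) {
        list($start, $end, $city) = $range;
        $last = count($merged) - 1;
        // Extend the previous range if it is the same city and directly adjacent.
        if ($last >= 0 && $merged[$last][2] === $city && $merged[$last][1] + 1 === $start) {
            $merged[$last][1] = $end;
        } else {
            $merged[] = array($start, $end, $city);
        }
    }
    return $merged;
}

// Two halves of one /24 listed separately, plus an unrelated range.
$ranges = array(
    array(ip2long('192.0.2.0'),    ip2long('192.0.2.127'),    'Moscow'),
    array(ip2long('192.0.2.128'),  ip2long('192.0.2.255'),    'Moscow'),
    array(ip2long('198.51.100.0'), ip2long('198.51.100.255'), 'Kyiv'),
);
$merged = mergeRanges($ranges); // $merged now holds 2 ranges instead of 3
```

Fewer ranges mean a smaller database file and fewer steps per lookup, which is why fragmentation matters for a speed-oriented format.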

At the moment the database is derived from GeoLite City, but it already contains completely revised coverage of Russia, Ukraine and Belarus. Other countries will be refined gradually, starting with the CIS and Europe. The Sypex Geo City database contains the names of cities and regions in Russian in UTF-8 (some names are not translated yet), as well as the time zone.

In addition, the other databases available on the site have been converted to the SxGeo 2.1 format.

Usage


Using the Sypex Geo API is as simple as possible.

1. Download SxGeo.php and SxGeoCity.dat.

2. Include SxGeo.php in your script:

    include("SxGeo.php");

3. Create an SxGeo object:

    //$SxGeo = new SxGeo(); // no arguments: SxGeo.dat is used
    $SxGeo = new SxGeo('SxGeoCity.dat', SXGEO_BATCH | SXGEO_MEMORY);

4. Get the city information (SxGeo City, GeoLite City, IpGeoBase):

    $SxGeo->get($ip);
    // $SxGeo->getCityFull($ip);

Performance testing


And for dessert, a small comparative performance test. The opponents are the GeoLite API and the Geobaza API. All participants use a binary database in their own format and a PHP API. Testing was done on Windows 7 (the proportions are the same on Linux) with PHP 5.2.17.

The results are averaged over 10 runs for each API in two modes (normal and with in-memory caching) and rounded to the nearest ten. For each run, an array of 50,000 random IP addresses is generated, and each algorithm looks up every address in a loop.
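The procedure described above can be sketched roughly as follows. This is not the original test script: the function names `randomIps` and `benchmark` are hypothetical, the metric (lookups per second) is an assumption, and the built-in `ip2long` only stands in for a real geolocation lookup.

```php
<?php
// Generate $count random IPv4 addresses as dotted strings.
function randomIps($count) {
    $ips = array();
    for ($i = 0; $i < $count; $i++) {
        $ips[] = mt_rand(1, 254) . '.' . mt_rand(0, 255) . '.'
               . mt_rand(0, 255) . '.' . mt_rand(0, 255);
    }
    return $ips;
}

// $lookups: array of name => callable taking one IP string.
// Returns name => average lookups per second, rounded to the nearest ten.
function benchmark(array $lookups, $runs = 10, $count = 50000) {
    $totals = array_fill_keys(array_keys($lookups), 0.0);
    for ($r = 0; $r < $runs; $r++) {
        $ips = randomIps($count); // a fresh array per run, shared by all APIs
        foreach ($lookups as $name => $lookup) {
            $start = microtime(true);
            foreach ($ips as $ip) {
                call_user_func($lookup, $ip);
            }
            $elapsed = max(microtime(true) - $start, 1e-9); // avoid division by zero
            $totals[$name] += $count / $elapsed;
        }
    }
    $result = array();
    foreach ($totals as $name => $total) {
        $result[$name] = (int) round($total / $runs, -1); // round to tens
    }
    return $result;
}

// Stand-in lookup; a real test would pass e.g. array($SxGeo, 'get').
$rates = benchmark(array('stub' => 'ip2long'), 3, 1000);
```

Sharing one array of addresses per run keeps the comparison fair, since every API sees exactly the same input.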


Suggestions and wishes are welcome. I am also looking for help with porting the API to other languages and with creating modules for Apache and nginx.

Source: https://habr.com/ru/post/146597/

