
At the beginning of the year I published the article "Determining the country by IP: testing the speed of the algorithms", which mentioned my own “bicycle”, a homegrown implementation notable for its high speed. One of the most frequent questions was whether the city could be determined by IP as well.
A few months later, what started "for fun" has grown into an independent project.
A separate site dedicated to the Sypex Geo project has been opened, where you can download the latest versions of the API and the databases, as well as read the documentation.
For those who want to check how accurately the city is determined by IP, there is a demo page. Below the cut, I will describe some technical details and give the results of a little testing.
Sypex Geo 2.1 format
Since the previous article was published, the Sypex Geo format (SxGeo for short) has been optimized, and the database can now include two reference tables: cities and regions.
When creating a new format, the following priorities were set:
- high speed
- low resource consumption
- openness (specifications of the format of the binary database file and API are open)
- universality (the ability to create databases with any data sets)
The format allows data to be stored in different encodings. Once the beta is over, scripts for converting a database from MySQL to a binary database file will be published.
The following information is stored in the database:
- Country ID
- ISO 3166-1 country code (two-character)
- Region code FIPS 10-4 (two-character)
- Region name (optional)
- City
- Latitude
- Longitude
- Timezone (optional)
This list can be modified and/or expanded if necessary. More information about the SxGeo 2.1 format can be found on the website.
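To illustrate how a fixed set of fields like this can be stored compactly in a binary file, here is a minimal sketch using PHP's pack() and unpack(). The layout, the field names and the functions pack_record() and unpack_record() are hypothetical and are not the actual SxGeo 2.1 layout, which is described in the specification on the website. The sample values are for illustration only.

```php
<?php
// Hypothetical fixed-width record, NOT the real SxGeo 2.1 layout:
//   C   country ID (1 byte, unsigned)
//   a2  ISO 3166-1 country code (2 characters)
//   a2  FIPS 10-4 region code (2 characters)
//   l   latitude  * 1000000 (signed 32-bit integer)
//   l   longitude * 1000000 (signed 32-bit integer)
function pack_record($r)
{
    return pack(
        'Ca2a2ll',
        $r['country_id'],
        $r['iso'],
        $r['region'],
        (int) round($r['lat'] * 1000000),
        (int) round($r['lon'] * 1000000)
    );
}

// Reverse of pack_record(): turn 13 bytes back into an associative array.
function unpack_record($bin)
{
    $r = unpack('Ccountry_id/a2iso/a2region/llat/llon', $bin);
    $r['lat'] /= 1000000;
    $r['lon'] /= 1000000;
    return $r;
}

// Illustrative values only; the country ID is an arbitrary number.
$bin = pack_record(array(
    'country_id' => 185,
    'iso'        => 'RU',
    'region'     => '48',
    'lat'        => 55.7558,
    'lon'        => 37.6173,
));
print_r(unpack_record($bin));
```

Storing coordinates as scaled integers keeps every record the same length, so a record can be located by multiplying its index by the record size. Variable-length fields such as the city name would need a different approach.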
Own database
After getting closely acquainted with GeoLite City, the popular geolocation database from MaxMind, I decided to create my own database. The problem is that GeoLite City contains a lot of inaccuracies, garbage, duplicate cities and excessively fragmented ranges, and it has trouble with cities of the former USSR (for example, company names or the names of responsible persons from Whois are used instead of cities).
At the moment, the database is based on GeoLite City, but it already contains completely revised coverage of Russia, Ukraine and Belarus. Other countries will be refined gradually, starting with the CIS and Europe. The Sypex Geo City database contains the names of cities and regions in Russian in UTF-8 (some names have not been translated yet), as well as the time zone.
In addition, the other databases available on the site have been converted to the SxGeo 2.1 format.
Usage
Using the Sypex Geo API is extremely simple:
1. Download SxGeo.php and the SxGeoCity.dat database file.
2. Include SxGeo.php in your script: include("SxGeo.php");
3. Create an SxGeo object.
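The steps above can be sketched roughly as follows. Only the file names and the include() call come from the text. The constructor argument and the getCity() method name are assumptions here and should be checked against the documentation on the site. The wrapper function lookup_city() is a hypothetical helper added so that the sketch fails gracefully when the files are missing.

```php
<?php
// Sketch only: new SxGeo($dbFile) and getCity() are ASSUMED names,
// verify them against the documentation for the version you downloaded.
function lookup_city($ip, $apiFile = 'SxGeo.php', $dbFile = 'SxGeoCity.dat')
{
    // Step 1: both downloaded files must be present.
    if (!is_file($apiFile) || !is_file($dbFile)) {
        return null;
    }

    // Step 2: include the API.
    include_once $apiFile;

    // Step 3: create the SxGeo object and query it.
    $sxGeo = new SxGeo($dbFile);
    return $sxGeo->getCity($ip);
}

// Prints NULL when SxGeo.php or SxGeoCity.dat is not in the current directory.
var_dump(lookup_city('8.8.8.8'));
```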
Performance testing
And for dessert, a little comparative performance testing. The opponents are the GeoLite API and the Geobaza API. All participants use a binary database in their own format and a PHP API. Testing was done under Win 7 (the relative results are the same on Linux) with PHP 5.2.17.
The results are taken after 10 runs of each API in two modes (normal and with in-memory caching), averaged and rounded to the nearest ten. For each run, an array of 50,000 random IP addresses is created, and each algorithm looks up every address in a loop.
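A minimal sketch of such a test harness is shown below. The functions random_ips() and benchmark() are hypothetical names, not the script actually used for the measurements, and the built-in ip2long() is only a stand-in for a real lookup call so that the example runs without any database.

```php
<?php
// Build an array of random IPv4 addresses in dotted notation.
function random_ips($count)
{
    $ips = array();
    for ($i = 0; $i < $count; $i++) {
        $ips[] = mt_rand(1, 223) . '.' . mt_rand(0, 255) . '.'
               . mt_rand(0, 255) . '.' . mt_rand(0, 255);
    }
    return $ips;
}

// Run $lookup over a fresh random array on every run and
// return the mean time per run in seconds.
function benchmark($lookup, $runs = 10, $count = 50000)
{
    $total = 0.0;
    for ($run = 0; $run < $runs; $run++) {
        $ips = random_ips($count);          // generation is not timed
        $start = microtime(true);
        foreach ($ips as $ip) {
            call_user_func($lookup, $ip);
        }
        $total += microtime(true) - $start;
    }
    return $total / $runs;
}

// Stand-in lookup; replace 'ip2long' with a callable that queries a real API.
$seconds = benchmark('ip2long');
printf("%.4f s per %d lookups (mean of 10 runs)\n", $seconds, 50000);
```

Generating the addresses outside the timed section keeps the cost of mt_rand() out of the measurement, so only the lookup loop itself is compared.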

Suggestions and wishes are welcome. I am also looking for help with porting the API to other languages and with creating modules for Apache and nginx.