While developing a new service, we ran into an interesting problem. We needed to determine which administrative-territorial or municipal unit each enterprise belongs to and assign the enterprise to the district or okrug in which it is located. For the end user, this should appear as a filter in the search form that finds organizations only in a given district or okrug of a city. And it had to work for companies all over Russia.
The starting point: we had an accumulated database with a fairly large number of organizations throughout Russia. It included the enterprises' addresses, each stored as a plain string. Consequently, there was no obvious way to tie an organization to a territory.
Implementing this seemingly simple task took quite a bit of head-scratching. The first idea was to draw district outlines in Google Maps using custom maps and get the coordinates of organizations through the Yandex geocoder. But this idea turned out to be utopian: hardly anyone could manage to draw district maps for all of Russia.
A suitable solution came to mind suddenly: use the ready-made database of administrative-territorial divisions, KLADR. This database contains a complete list of settlements, streets and houses in Russia. KLADR also stores the OKATO code (All-Russian Classifier of Objects of Administrative-Territorial Division) for each territorial unit. Note that the OKATO database itself is not part of KLADR and must be downloaded separately.
So, a database from which the district or okrug can be determined is available. It remains to figure out how to match our existing addresses against it. House data in KLADR is stored in a rather specific way: a house record can carry many extra attributes, such as the building (korpus), the structure (stroenie) and odd/even parity, all of which must be taken into account when determining the district. So we need to parse the addresses we have. This can be done in two ways:
The first one is the simplest and the least reliable: feed the addresses of the existing companies to the Yandex geocoder, which will parse each address into its parts. But this method has a big drawback: if for some reason the geocoder's database has no such address, it will return the building closest to the specified location. Or it may return nothing at all...
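One way to soften this drawback is to accept a geocoder answer only when it is marked as an exact match. The sketch below is in Python and works on an already parsed JSON response, so no network call is shown. The nesting (`response` → `GeoObjectCollection` → `featureMember` → `GeoObject` → `metaDataProperty` → `GeocoderMetaData`) and the `precision` field are assumptions about the geocoder's JSON format and may differ from the version used at the time; `exact_components` is a hypothetical helper name.

```python
from typing import Optional


def exact_components(geocoder_json: dict) -> Optional[dict]:
    """Return {kind: name} for the top result, or None unless the match is exact.

    The response layout used here is an assumption, not a verified contract.
    """
    members = (
        geocoder_json.get("response", {})
        .get("GeoObjectCollection", {})
        .get("featureMember", [])
    )
    if not members:
        return None  # the geocoder returned nothing at all

    meta = members[0]["GeoObject"]["metaDataProperty"]["GeocoderMetaData"]
    if meta.get("precision") != "exact":
        return None  # a "nearest building" style answer: do not trust it

    components = meta.get("Address", {}).get("Components", [])
    # If two components share a kind, the later one overwrites the earlier one.
    return {c["kind"]: c["name"] for c in components}
```

Addresses for which this returns `None` would still need another route, which is one more argument for the second approach.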
The second way is the Jedi way: implement the address parser by hand. Since accurate address resolution was critical for our service, we decided to implement the parser ourselves. The simplest implementation example is here. In that example, the address string is parsed into an array whose keys are the types of territorial units. The example has one caveat: the address must already be in the “correct” format. That is, for example, the house number must come after the street, not before it.
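To make the idea concrete, here is a minimal sketch of such a parser in Python. It is not the implementation referred to above. It assumes a comma-separated address in which every part starts with a type abbreviation such as `г.`, `ул.` or `д.`; the abbreviation table and the name `parse_address` are illustrative assumptions, and a real table would be far longer.

```python
import re

# Illustrative mapping from address-part abbreviations to unit types.
UNIT_TYPES = {
    "обл": "region",
    "г": "city",
    "ул": "street",
    "пр-кт": "street",
    "д": "house",
    "к": "building",
    "корп": "building",
    "стр": "structure",
}

# "<abbreviation>[.] <value>", e.g. "ул. Арбат" or "д 10"
PART_RE = re.compile(r"^(?P<abbr>[^\s.]+)\.?\s*(?P<value>.+)$")


def parse_address(address: str) -> dict:
    """Split a well-formed address string into {unit type: value}."""
    parsed = {}
    for part in address.split(","):
        match = PART_RE.match(part.strip())
        if not match:
            continue
        unit = UNIT_TYPES.get(match.group("abbr").lower())
        if unit is None:
            continue  # unknown abbreviation: skip it rather than guess
        # Keep the first value seen for each unit type.
        parsed.setdefault(unit, match.group("value").strip())
    return parsed


print(parse_address("г. Москва, ул. Арбат, д. 10, стр. 1"))
```

Like the example described above, this sketch does not reorder or repair anything: parts without a known abbreviation are silently dropped, so malformed addresses need to be normalized first.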
Now that the address has a more understandable structure, it can be matched against the KLADR database to obtain the OKATO code. By itself, KLADR gives no idea of which district or okrug a territory belongs to. At most, it gives you the OKATO code, plus a postal code. The representation we need comes from the OKATO database. It contains information on intracity districts and on the okrugs of cities of republic, krai and oblast subordination.
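The matching step can be sketched roughly as follows, in Python. Two assumptions are made. First, that a KLADR house record lists house numbers as comma-separated items such as `Н(1-9)` (odd numbers in a range), `Ч(2-10)` (even numbers in a range), plain ranges and single numbers; items with a building or structure suffix are left out here. Second, that OKATO codes are hierarchical, so a shorter prefix (8, 5 or 2 digits) names the enclosing unit. The function names and the sample rows in the usage note are illustrative, not taken from the original scripts.

```python
import re
from typing import Optional


def house_matches(spec: str, house: int) -> bool:
    """Check a house number against a list like 'Н(1-9),Ч(2-10),15'."""
    for item in spec.split(","):
        item = item.strip()
        parity_range = re.fullmatch(r"([ЧН])\((\d+)-(\d+)\)", item)
        if parity_range:
            low, high = int(parity_range.group(2)), int(parity_range.group(3))
            # "Ч" means even numbers, "Н" means odd numbers.
            remainder = 0 if parity_range.group(1) == "Ч" else 1
            if low <= house <= high and house % 2 == remainder:
                return True
        elif re.fullmatch(r"\d+-\d+", item):
            low, high = map(int, item.split("-"))
            if low <= house <= high:
                return True
        elif item == str(house):
            return True
    return False


def find_district(okato_code: str, okato_names: dict) -> Optional[str]:
    """Return the name of the most specific OKATO unit known for this code."""
    for length in (8, 5, 2):  # most specific prefix first
        name = okato_names.get(okato_code[:length])
        if name is not None:
            return name
    return None
```

With a small illustrative table such as `{"45": "Москва", "45286": "Центральный", "45286552": "Арбат"}`, the call `find_district("45286552000", table)` resolves to the 8-digit entry, and a code with no 8-digit entry falls back to the 5-digit one.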
So, the scripts are written and the codes are mapped. As a result, the following functionality saw the light of day:
Implemented and described by zdanchik