📜 ⬆️ ⬇️

Creating a directory of address information with blackjack and API

Part 1. The tragic. “Why do I need all this ?!”


We somehow faced the need to enter correct information about the location (registration, registration) of users, and with the fact that this problem is not solved quite as easily and simply as we would like. At first we tried KLADR, in its free incarnation. Not that we really didn’t like everything, but there was at least one very annoying circumstance - some addresses were missing in the directory. For example, there is house 10, and 10k1 - sorry, they did not deliver. In general, KLADR was attractive because it has a simple API and plugins (jQuery in particular) that can be easily embedded into the application, but repelled by content. We thought - if there is no such resource containing the full and most relevant address information, with API and plugins, the only way out is to create such a resource yourself.

Part 2. Seeker. “Where is the dog buried ?!”


And we heard about FIAS. About how he is infinitely full and beautiful. And this is the way out! There are bases on the FIAS website, and all that is needed is a young growing information system! However, the size of the base is more than 4 Gb, but oh well, this is all of Russia! The base is updated regularly, so there is where to roam. Things are easy - to deploy the base, to tie API and plugins. It is necessary to make a reservation that there are several relevant and important articles. For example, a series of these articles here , which initially helped a lot.

Part 3. "How it works"


The problems began with the fact that the FIAS database is supplied in the .dbf format, the data from which had to somehow be transferred to PostgreSQL, which was decided to be used as a database.
')
It should be noted that in addition to the .dbf format, the database is in xml format. As you know, a person wants everything at once. To turn on and work. But to deal immediately with what FIAS gives is quite problematic.

So, at the fias.nalog.ru address in the update section there is the following:


Note: to work with .dbf files, you need to install a third-party extension for php_dbase.dll.

So what is all the same to choose the version to download .xml or .dbf? The differences are as follows:


So, unzipped the files. What information do we need? We see the following set of files:


In addition to these tables, there are a number of other service tables that contain information about abbreviations in other tables.

In most cases, it is enough to form an address right up to the house. Although, if anyone needs it, you can go further.

Thus, we will create 2 tables in the postgresql database.

Table with addresses:

CREATE TABLE addrs ( "ACTSTATUS" integer, "AOGUID" character varying(36) COLLATE pg_catalog."default", "AOID" character varying(36) COLLATE pg_catalog."default", "AOLEVEL" integer, "AREACODE" integer, "AUTOCODE" integer, "CENTSTATUS" integer, "CITYCODE" integer, "CODE" character varying(20) COLLATE pg_catalog."default", "CURRSTATUS" integer, "ENDDATE" timestamp, "FORMALNAME" character varying(120) COLLATE pg_catalog."default", "IFNSFL" integer, "IFNSUL" integer, "NEXTID" character varying(36) COLLATE pg_catalog."default", "OFFNAME" character varying(120) COLLATE pg_catalog."default", "OKATO" VARCHAR(11), "OKTMO" VARCHAR(11), "OPERSTATUS" integer, "PARENTGUID" character varying(36) COLLATE pg_catalog."default", "PLACECODE" integer, "PLAINCODE" character varying(20) COLLATE pg_catalog."default", "POSTALCODE" integer, "PREVID" character varying(36) COLLATE pg_catalog."default", "REGIONCODE" integer, "SHORTNAME" character varying(15) COLLATE pg_catalog."default", "STARTDATE" timestamp, "STREETCODE" integer, "TERRIFNSFL" integer, "TERRIFNSUL" integer, "UPDATEDATE" timestamp, "CTARCODE" integer, "EXTRCODE" integer, "SEXTCODE" integer, "LIVESTATUS" integer, "NORMDOC" character varying(36) COLLATE pg_catalog."default", "PLANCODE" integer, "CADNUM" integer, "DIVTYPE" integer ) WITH ( OIDS = FALSE ) TABLESPACE pg_default; ALTER TABLE address OWNER to postgres; 

Table with house numbers:

 CREATE TABLE hous ( "AOGUID" character varying(36) COLLATE pg_catalog."default", "BUILDNUM" character varying(10) COLLATE pg_catalog."default", "ENDDATE" timestamp, "ESTSTATUS" integer, "HOUSEGUID" character varying(36) COLLATE pg_catalog."default", "HOUSEID" character varying(36) COLLATE pg_catalog."default", "HOUSENUM" character varying(15) COLLATE pg_catalog."default", "STATSTATUS" integer, "IFNSFL" integer, "IFNSUL" integer, "OKATO" VARCHAR(11), "OKTMO" VARCHAR(11), "POSTALCODE" integer, "STARTDATE" timestamp, "STRUCNUM" VARCHAR(15), "STRSTATUS" integer, "TERRIFNSFL" integer, "TERRIFNSUL" integer, "UPDATEDATE" timestamp, "NORMDOC" character varying(36) COLLATE pg_catalog."default", "COUNTER" integer, "CADNUM" VARCHAR(50), "DIVTYPE" integer ) WITH ( OIDS = FALSE ) TABLESPACE pg_default; 

Data import is carried out in a simple way. Open files in Excel and save them as csv. Additionally, it is recommended to change the encoding, because unlike xml files that are represented in utf-8 encoding, dbf files are encoded in win-866. Open the files in the editor (notepad ++ is suitable for this purpose) and convert to utf-8.

Import a table with addresses:

 COPY addrs FROM 'PathToTheFile\ADDROB01.csv' DELIMITER ';' CSV; 

Import tables with houses:

 COPY addrs FROM 'PathToTheFile\HOUSE30.csv' DELIMITER ';' CSV; 

What is the ADDROBXX table made of?

Despite the abundance of fields, only some of them will be needed.


Note: on fias-nalog.ru you can find a complete description of all fields.

Important in the HOUSEXX table:


Fields can contain similar or identical information and have to leave the situation experimentally. In the QuerryController class (the link to the repository is below) there is a method: chooseBuilding, in which we tried to solve this problem. Maybe someone will get better. Queries to the database are in the same place.

Then, after the base was imported into Postgres, we started creating an API and a plug-in for our system.

For the API, in order not to get stuck, used Laravel. The query scheme turned out quite simple. The hierarchy of objects is as follows:


By the way, testing the system, we are faced with the fact that not all settlements have streets, and not all streets have houses, which surprised us a lot and puzzled us. This circumstance did not give rest because if you give the user the opportunity to enter information himself, they will write a question that involuntarily raises the question - “Do these people learn anything at school ?!”.
Therefore, it was decided not to give the user the slightest opportunity for self-activity, and for such “missing” addresses to provide the opportunity to apply for the inclusion of the missing address in the help system.

The query scheme looks simple:


For the last item, a list of houses with letters, buildings, buildings, and other things is formed.
After all the ordeals made a plugin on vue.js, to work with the directory, and its alternative to jQuery.

The region and the district can be removed as unnecessary, as they are pulled together with the cities.
As you type, autocomplete hints pop up, as in KLADR. True, the difference is that the KLADR plugin is intended for autocompletion, and here only the address selected from the prompts is considered valid.

In the ASPUDcomponent folder, there is a VueJs component for working with the address base.
Sources are available in our repository .

Part 4. How to update all this?


With updates, the situation is the following: you first need to get the update versions via the SOAP protocol. You can see how this is done in the UpdateController class (method: filesVersions ()).

Note: the version that is listed last in the list is not necessarily the one that can be downloaded on the main page. But do not rush to download only the latest version, as it may be “bat”. There have been such precedents. Further the archive with the latest version is downloaded and unpacked. To work, you must use the extension for php (php_rar.dll).

Well, then select the desired region file (or, if necessary, all files) to update the database.

Source: https://habr.com/ru/post/333424/


All Articles