
FIAS or KLADR: choosing an address directory

On July 1, 2014, one of the most significant events in the history of the Russian state took place: from that moment on, our country finally has a reference database of addresses covering every locality, even the smallest. This database is called FIAS. Strictly speaking, the FIAS directory itself appeared much earlier, but on July 1 Federal Law 443 came into force, under which all state and municipal bodies must now rely on it as the single authoritative address database. We decided to find out whether moving to FIAS is worth it, and what pitfalls await those who decide to do so.

After reading the article, you will learn:


Why not KLADR?


At the moment, KLADR is considered the main address directory of Russia. So what did people dislike about it, and where did the need for a new one come from?

KLADR was most likely conceived as a clear, structured directory containing up-to-date information on addresses across all of Russia. Unfortunately, that is currently far from the truth. KLADR records have many quirks, and below we describe the most interesting ones.
Hell in house numbers or a programmer's nightmare

The house number and its extensions (everything that follows the number: korpus (block), stroenie (structure), letter) are stored in KLADR in a single string, separated by commas. The general rules for forming the house part, as described in the documentation, are not always followed in practice. So if you decide to use KLADR down to the house level, you will have to figure out what to do with entries like the following.
Not for the faint of heart:
1kA, 1_A, 31b, 2k1_A, 1p, 21_25, 5 / 34k1, 21/13 / a, 6ld2, 5/2lld2b, 42lld1_4, 21k5 / 2str2b, 2k6str2_7, N (1-700), dvld14_14A, 5kDyoDyoEdk_DeDo, dkdst_2, 7k .2, construction of buildings1, vl22 / 7construction of 3EST, construction of VPL_11 ...
In total there are 6436 distinct patterns of house-part records, not counting the digits.
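One plausible way to arrive at such a count is to collapse every run of digits into a placeholder and count the distinct patterns that remain. How the figure above was actually computed is not stated, so the sketch below is only an assumption; the function names and the "#" placeholder are made up for illustration.

```python
import re
from collections import Counter


def house_pattern(house: str) -> str:
    """Replace every run of digits with '#', so '2k1_A' and '7k3_A' share a pattern."""
    return re.sub(r"[0-9]+", "#", house.strip())


def count_patterns(houses):
    """Map each digit-free pattern to the number of records that use it."""
    return Counter(house_pattern(h) for h in houses)


# A few of the spellings listed above, plus one extra record with the same shape.
sample = ["1kA", "1_A", "31b", "2k1_A", "7k3_A", "21_25"]
patterns = count_patterns(sample)

for pattern, count in patterns.most_common():
    print(pattern, count)
```

Sorting the patterns by frequency is a cheap way to decide which spellings deserve a dedicated parsing rule and which can be routed to manual review.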


It seems that the abundance of spellings in the directory confuses even its creators, since one street often has several valid entries for the same house. For example, in the village of Novy (Krasnogorsk district, Moscow region) KLADR has one entry for house 8 and a separate one for dvld8. In theory, a household (dvld) and a house are different things, but in practice few people write "household", and we can safely assume that dvld8 and plain house 8 are one and the same.
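Under that assumption (a dvld entry and a plain house entry with the same number are the same object), duplicates on one street can be found by stripping the prefix and grouping. This is a minimal sketch; the prefix list uses the transliterated spelling from this article and is not taken from the KLADR documentation.

```python
from collections import defaultdict

# Assumed, transliterated prefix; the real directory stores Cyrillic text.
ASSUMED_PREFIXES = ("dvld",)


def normalize_house(house: str) -> str:
    """Lowercase the entry and drop a leading 'household' prefix, if present."""
    value = house.strip().lower()
    for prefix in ASSUMED_PREFIXES:
        if value.startswith(prefix):
            return value[len(prefix):]
    return value


def find_duplicates(houses):
    """Group the entries of one street by normalized form; keep groups with 2+ entries."""
    groups = defaultdict(list)
    for house in houses:
        groups[normalize_house(house)].append(house)
    return {key: entries for key, entries in groups.items() if len(entries) > 1}


print(find_duplicates(["8", "dvld8", "10"]))
```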

In theory, KLADR is the address directory that everyone should rely on when compiling any directory with addresses, so those directories should store a key into this database in order to synchronize with KLADR and receive updates. But the KLADR code, the only identifier in the directory, can change from version to version for the same object. As a result, other directories do not use it as a key: everywhere the address is stored as plain text without any id. This is bad because such addresses may contain errors, be out of date, or not exist at all, and matching them back to KLADR takes a lot of effort (or the dadata.ru service).

Where is this street, where is this lane?

In KLADR, an address is divided into levels (region, district, city, settlement, and street), and each level has a type and a name. For example, the type is "autonomous okrug" and the name is "Yamalo-Nenetsky". Unfortunately, it is not always possible to tell exactly which part is the name and which is the type. Nor is it always clear whether the problem lies in KLADR or the object really is called that.

For example, you can find addresses like these (again, not for the faint of heart):
Type: Autonomous Okrug
Name: "Khanty-Mansiysk Autonomous Okrug - Ugra"
According to KLADR, the correct address is: Russia, Autonomous Okrug, Khanty-Mansiysk Autonomous Okrug - Ugra,…

Type: "Chuvashia"
Name: "Chuvash Republic -"
Yes, yes, just like that - with a hyphen at the end. And the type is a nice touch, too.

Type: "Street"
Name: "KVARTAL NOVYE CHERYOMUSHKI 32A"
We regularly receive remarkable addresses of the form "Moscow, kvartal Novye Cheryomushki 32A, k8, apt. XXX". Note that, according to KLADR, the house number is part of the street name, and the street type is "street", not "kvartal" (quarter).

Type: "Lane"
Name: "Ul. Sovetskaya"
In the village of Dosotuy in the Chita region there is both Sovetskaya Street and a lane named "Ul. Sovetskaya". As a result, "Dosotuy, ul. Sovetskaya" and "Dosotuy, per. Ul. Sovetskaya" are different addresses.
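Records like the ones in this section can at least be flagged automatically for manual review. The sketch below applies three simple heuristics to a (type, name) pair: the name starts with a type word, ends with a hyphen, or ends with something that looks like a house number. The word list is a hypothetical, transliterated stand-in for the real Cyrillic abbreviations, and the heuristics will also flag some legitimate names.

```python
from dataclasses import dataclass

# Hypothetical, transliterated type words; the real directory uses Cyrillic ones.
TYPE_WORDS = ("ul.", "per.", "kvartal", "quarter")


@dataclass
class AddressObject:
    type: str
    name: str


def suspicious_reasons(obj: AddressObject) -> list:
    """Return human-readable reasons why the record deserves a manual look."""
    reasons = []
    name = obj.name.strip()
    if any(name.lower().startswith(word) for word in TYPE_WORDS):
        reasons.append("name starts with a type word")
    if name.endswith("-"):
        reasons.append("name ends with a hyphen")
    tokens = name.split()
    if tokens and any(ch.isdigit() for ch in tokens[-1]):
        reasons.append("name may end with a house number")
    return reasons


for record in (
    AddressObject("Lane", "Ul. Sovetskaya"),
    AddressObject("Chuvashia", "Chuvash Republic -"),
    AddressObject("Street", "Sovetskaya"),
):
    print(record, suspicious_reasons(record))
```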


Leo or Tolstoy?

There are a lot of mistakes in KLADR: five-character postal codes (Russian postal codes have six digits), duplicate records for houses with double numbering, and so on.

Here are some of them in more detail:


The likely cause of the errors is that local authorities are responsible for keeping the directory current, and the information they enter may not be verified at all. In any case, the directory's problems are aggravated by the lack of support: we have repeatedly written to the Federal Tax Service pointing out errors, but none of them have been corrected.

So if an address is in KLADR, there is no guarantee that it exists in real life, and vice versa.
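Some of these errors are cheap to detect before the data reaches your own system. For instance, the five-character postal codes mentioned in this section fail a simple format check, since Russian postal codes consist of six digits. The sketch below assumes records arrive as (record id, postal code) pairs; that shape is chosen for illustration only.

```python
import re

# Russian postal codes are exactly six ASCII digits.
POSTAL_CODE_RE = re.compile(r"[0-9]{6}")


def is_valid_postal_code(code: str) -> bool:
    """True if the value consists of exactly six digits (surrounding spaces ignored)."""
    return POSTAL_CODE_RE.fullmatch(code.strip()) is not None


def invalid_codes(records):
    """records: iterable of (record_id, postal_code); return the pairs that fail the check."""
    return [(rid, code) for rid, code in records if not is_valid_postal_code(code)]


print(invalid_codes([("a", "101000"), ("b", "10100"), ("c", "")]))
```

A format check says nothing about whether a six-digit code actually belongs to the address, so it only catches the crudest errors.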

What about FIAS?


Let's see what FIAS is and whether it solves the problems of KLADR.

Data and structure

The first thing you notice when working with FIAS is that it contains more information than KLADR. But less useful information was added than we would like. I have summarized the most significant address information in the comparison table below.

Field                                                                  KLADR   FIAS
Regions and cities of federal significance                             +       +
Districts                                                              +       +
Cities and rural districts                                             +       +
City districts                                                         -       -
Streets                                                                +       +
Houses and extensions                                                  +       +
Postal code                                                            +       +
Center status                                                          +       +
Action status (what happened to the object: renamed, reassigned, ...)  + [1]   +
Relevance status                                                       +       +
Record start and end date                                              -       +
Condition of the house (whether it needs repair, and how much)         -       + [2]
Object geo-coordinates                                                 -       -
Apartment data (list, number or range)                                 -       -
Population (at any level)                                              -       -
Company-town flag                                                      -       -
Unique ID for each house                                               -       +
Purpose of the building (residential / non-residential)                -       -
Floors, commissioning year, wall material                              -       -

[1] Conditionally: it is encoded in the KLADR code, but the explanation of the codes is very poor.
[2] The reliability of the data is in doubt, since more than 95% of houses have the same status.

Thus, of the useful additions, only two stand out: the permanent house ID, which is supposed to never change and can serve as a key for external systems, and the record start and end dates. The rest of the new information consists of identifiers that periodically duplicate one another or are parts of others.
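To illustrate why these two additions matter, here is a minimal sketch of how an external system might use them: it stores only the stable house ID and resolves the version of the record that was valid on a given day. The class and field names are hypothetical and are not the actual FIAS column names.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Optional


@dataclass(frozen=True)
class HouseVersion:
    house_id: str  # stable house ID (hypothetical field name)
    number: str    # house number as written in this version of the record
    start: date    # first day on which this version is valid
    end: date      # last day on which this version is valid


def version_on(versions: Iterable[HouseVersion], house_id: str, day: date) -> Optional[HouseVersion]:
    """Return the version of the given house that is valid on `day`, or None."""
    for version in versions:
        if version.house_id == house_id and version.start <= day <= version.end:
            return version
    return None


# Made-up history: the same house was renumbered at the start of 2014.
history = [
    HouseVersion("id-1", "8", date(2010, 1, 1), date(2013, 12, 31)),
    HouseVersion("id-1", "8A", date(2014, 1, 1), date(9999, 12, 31)),
]
print(version_on(history, "id-1", date(2014, 7, 1)))
```

Because the key does not change when the house is renumbered, the external system keeps pointing at the same object and simply picks up the new spelling.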

Quality of information about houses

FIAS has two tables for houses. The data structure itself is very good: there is a field for everything.

The first table, HOUSE, contains the house numbers, and for each there is the following information:



What are the main differences from the table of houses in KLADR?

Pros:


Cons:


The second table with houses, HOUSEINT, contains house intervals. In KLADR, the table of houses contains records like N (1-999), which means all odd houses in the interval 1 - 999. In FIAS these are split into fields: the start of the interval, the end, and its attribute. Unfortunately, the contents of this table are as far from the truth as in KLADR: for example, Kirov has an incredibly long Shchors street with every house from 1 to 9999.
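The difference between the two representations can be sketched as follows: a KLADR-style interval string is parsed into the three separate values (start, end, parity) that FIAS keeps in its own fields. The marker letter "N" for odd houses follows the transliterated "N (1-700)" example earlier in this article; the "Ch" marker for even houses and all names in the code are assumptions.

```python
import re
from typing import NamedTuple, Optional


class HouseInterval(NamedTuple):
    start: int
    end: int
    parity: Optional[str]  # "odd", "even", or None when every house is included


# Assumed transliterated markers: "N" for odd houses, "Ch" for even houses.
_PARITY = {"N": "odd", "CH": "even", "": None}
_INTERVAL_RE = re.compile(r"^\s*(N|Ch)?\s*\(\s*(\d+)\s*-\s*(\d+)\s*\)\s*$", re.IGNORECASE)


def parse_interval(text: str) -> Optional[HouseInterval]:
    """Parse strings such as 'N (1-999)'; return None if the text is not an interval."""
    match = _INTERVAL_RE.match(text)
    if match is None:
        return None
    marker = (match.group(1) or "").upper()
    return HouseInterval(int(match.group(2)), int(match.group(3)), _PARITY[marker])


def contains(interval: HouseInterval, number: int) -> bool:
    """Check whether a house number falls inside the interval, respecting parity."""
    if not interval.start <= number <= interval.end:
        return False
    if interval.parity == "odd":
        return number % 2 == 1
    if interval.parity == "even":
        return number % 2 == 0
    return True


interval = parse_interval("N (1-999)")
print(interval, contains(interval, 7), contains(interval, 8))
```

As the Shchors street example shows, a membership check like this only proves that a number fits the declared range, not that the house exists.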

The quality of everything else

Let's move a level up, to address objects from the region down to the street. They are stored in the ADDROBJ table.

Pros:


Cons:


Format

FIAS is available in three formats: KLADR format, dbf and xml. The last seemed the most convenient to me: unlike dbf, the files are not split by region but are stored grouped together in xml. However, the source directory in this format takes about 14 GB.

FIAS in dbf format takes 9 GB rather than 14 GB, but its structure is not very convenient: the tables of houses and regulatory documents are split by region, so FIAS in this form consists of 187 files.

FIAS in the KLADR format is, with rare exceptions, essentially the same as KLADR itself and takes the same 330 MB. A line-by-line comparison of the KLADR tables with FIAS in the KLADR format revealed less than 0.1% of discrepancies, probably because the KLADR and FIAS databases we examined were exported at different times.
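At a size of around 14 GB, an xml export cannot reasonably be loaded into memory in one piece, so it has to be read as a stream. Below is a minimal sketch using the standard library's xml.etree.ElementTree.iterparse. The element and attribute names in the sample (AddressObjects, Object, AOGUID, FORMALNAME, SHORTNAME) are assumptions made for illustration and should be checked against the actual files.

```python
import io
import xml.etree.ElementTree as ET


def iter_objects(source, tag="Object"):
    """Yield the attributes of each <tag> element without loading the whole file."""
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == tag:
            yield dict(elem.attrib)
            root.clear()  # drop already processed children to keep memory flat


# Tiny in-memory stand-in for a multi-gigabyte file; all values are made up.
sample = io.BytesIO(
    b"<AddressObjects>"
    b'<Object AOGUID="hypothetical-guid-1" FORMALNAME="Sovetskaya" SHORTNAME="ul"/>'
    b'<Object AOGUID="hypothetical-guid-2" FORMALNAME="Novy" SHORTNAME="p"/>'
    b"</AddressObjects>"
)
names = [obj["FORMALNAME"] for obj in iter_objects(sample)]
print(names)
```

The same function accepts a file path instead of the in-memory buffer, which is how it would be used on a real export.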

What does business think

How can switching from KLADR to FIAS affect work? Is the business ready to move to this directory?

Our colleagues from banks, for which address information is key at every stage, see no business benefit in switching to FIAS, but plan to do so in order to meet the regulator's requirements. As all federal agencies, ministries and departments move to FIAS, requirements to use FIAS when communicating with them (public services, SMEV, reporting, the Central Bank) may appear in the future.

Conclusions


The biggest problem with official directories in Russia was, and remains, out-of-date information. Until there is a proper, well-established process for updating FIAS, for checking data quality, and for cleaning up what is already in the directory, we will keep running into the same problems as in KLADR.

The main advantages of FIAS are the initial attempts to standardize addresses and the presence of a stable key for each house.

Summarizing:


So for now, switching to FIAS makes sense only as an investment in the future. If you already work with KLADR and do not interact with external systems, you can skip the switch and keep using KLADR. If you are just getting started with addresses and plan to build them into your product, or you need reporting and integration, you should choose FIAS.

PS: All the information in the article is current as of KLADR version 03.07.2014 and FIAS version 30.06.2014.

Source: https://habr.com/ru/post/230823/

