
Parsing websites, or long-term construction projects in the Moscow region

After reviewing the primary housing market in the Moscow region, we inevitably ran into defrauded co-investors and problem objects, the so-called “long-term construction projects” (dolgostroy). Naturally, the question arose: how likely is such an outcome?


The goal was to classify primary-market construction objects using a comprehensive set of features: information about the object, the developer, and so on. However, publicly available data turned out to be rather scarce. Nevertheless, we managed to collect some descriptive statistics.


Data sources


To estimate the share of long-term construction projects, we need a database of new buildings with attributes that let us classify an object as a long-term construction project. We chose novostroykin.ru because, at the time of the analysis, it listed more objects than comparable sites (novostroy-m.ru, mskguru.ru, novostroev.ru, novostroy.ru, novostroykirf.ru): 1756 versus 1093, 674, 426, 1296 and 392, respectively.


The initial parsing of the novostroykin.ru pages revealed duplicates: a small share of the new buildings had been entered once per construction phase. There were also objects marked “PROJECTS”, for which, as it turned out, not a single sale had been made. After the correction (duplicate records merged, projects excluded), 1,641 objects remained in our dataset, still more than on the competing sites.
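
The correction step can be sketched roughly as follows. The article does not say how duplicates were matched, so the normalized-name key and the field names `name`, `mark` and `buildings` are assumptions made only for illustration.

```python
def clean_catalog(records):
    """Drop 'PROJECTS' entries and merge records that share a normalized name."""
    merged = {}
    for rec in records:
        if rec.get("mark") == "PROJECTS":
            continue  # projects are excluded from the dataset
        # Hypothetical duplicate key: lower-cased name with collapsed whitespace.
        key = " ".join(rec["name"].lower().split())
        if key not in merged:
            merged[key] = {"name": rec["name"], "buildings": []}
        # Keep every building once, preserving first-seen order.
        for building in rec.get("buildings", []):
            if building not in merged[key]["buildings"]:
                merged[key]["buildings"].append(building)
    return list(merged.values())


# Made-up records: one object entered twice plus one project.
sample = [
    {"name": "Complex A", "buildings": ["1"]},
    {"name": "complex  a", "buildings": ["2"]},
    {"name": "Complex B", "mark": "PROJECTS", "buildings": []},
]
cleaned = clean_catalog(sample)
```

In practice, matching on name alone is probably too coarse, and the address or developer would likely be part of the key as well.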


It is important to note that the listed resources are advertising sites. Despite the legal requirement that advertising be truthful (Article 5 of Federal Law No. 38-FZ “On Advertising”), the accuracy of the information is left to the conscience of the site owners. We therefore ran a spot check on an object whose state we know first-hand from people with an interest in providing reliable data: the residential complex “White Dew” (Kotelniki urban district). Of the sites above:


  1. novostroykin.ru and novostroev.ru give the correct construction status and developer name;
  2. mskguru.ru gives incorrect deadlines;
  3. novostroy-m.ru shows unfinished buildings as delivered and shows buildings whose construction has not yet started as being erected;
  4. novostroy.ru names the wrong developer: GVSU Center (the oldest Moscow developer, based on the Main Military Construction Directorate of the Ministry of Defense, founded on January 27, 1964) instead of Stroycomfort LLC (a legal entity created specifically for this project);
  5. novostroykirf.ru has no information on the object.

Thus, novostroykin.ru remains the best source of information for assessments and market analysis.


Besides advertising sites, there are official open sources. On January 1, 2018, the Unified State Information System for Housing Construction (UISM) was launched in the Russian Federation to make participatory construction more transparent. However, as of January 29, 2018, our verification object was not in it, even though under clause 5.6 of Article 23.3 of Federal Law No. 214-FZ the system must contain information not only from developers but also from the regulatory authorities, namely Rosreestr (information about the land plot) and Glavstroynadzor (results of inspections of the object). Nor could we compare UISM data with the advertising sites by number of objects: in UISM information is entered per construction phase, and some new buildings have no information on their individual buildings, so automated verification is impossible.


In the end, we decided to use the novostroykin.ru data for the evaluative classification of objects.


The parsing itself


Parsing was done with the Scrapy framework. An object's page address has the form www.novostroykin.ru/novostroyki/name_in_latin_letters/ . The site was crawled in stages. The first stage walks the catalog pages, which contain links to new buildings. The second reads the data for each new building from its page and saves the results to a file. The spider code in Python:


```python
import scrapy

# Catalog search URL; the page number is appended to the end.
CATALOG_URL = (
    "https://www.novostroykin.ru/novostroyki/find/"
    "?t=2&pt=0&ignorenf=1&nf_mode=0&pot=0&pdo=0&reg_mo=1&dometro=0"
    "&k1=&k2=&k3=&k4=&beza=&ipoteka=&fav=&econom=&s2017=&rassrochka="
    "&razr=&prod=&warh=&dometrot=&me=&snewmetro=&sdan2=&sdan="
    "&sr_2017_4=&sr_2018_1=&sr_2018_2=&sr_2018_3=&sr_2018_4=&sr_2019="
    "&sr_2020_0=&napl=&otdelka=&balkon=&r=&ord=3&pg="
)

# Per-building info blocks on a new-building page.
KORP = "//*[contains(@id,'korpinfo_') and string-length(@id)>10]"


class NovostroykinSpider(scrapy.Spider):
    name = "novostroykin"

    def start_requests(self):
        # Stage 1: request catalog pages 1..74.
        for i in range(1, 75):
            yield scrapy.Request(url=CATALOG_URL + str(i), callback=self.parse)

    def parse(self, response):
        # Follow every new-building link, skipping adrai.novostroykin.ru/wow/ links.
        for url in response.xpath("//*[@id='nff_list']/div/div[2]/div/a/@href").extract():
            next_url = response.urljoin(url)
            if "adrai.novostroykin.ru/wow/" not in next_url:
                yield scrapy.Request(next_url, callback=self.parse_n)

    def korp_field(self, response, p):
        # Collect the text of paragraph number p from every building block
        # and join the non-blank pieces with "/".
        xp = "{0}/p[{1}]/span[2]/text()|{0}/p[{1}]/text()".format(KORP, p)
        parts = [s for s in response.xpath(xp).extract() if s != " "]
        return "/".join(parts)

    def parse_n(self, response):
        # Stage 2: read the fields of one new building.
        key_date = self.korp_field(response, 2).replace("\xa0", "")
        state = self.korp_field(response, 3).replace("+ /", "+ ")
        yield {
            'Name': response.xpath("//*[@id='main_container_table']/tr/td[2]/div[2]/h1/text()").extract(),
            'Address': response.xpath("//*[@id='main_container_table']/tr/td[2]/div[2]/div/index/div[5]/text()").extract(),
            # The h3 heading texts tested by contains() in the next two
            # expressions are not preserved in this listing (only spaces remain).
            'Description': " ".join(response.xpath("//*[@id='tab1_1']/div/div/div/div/h3[contains(.,'  ')]/parent::*/p/text()").extract()),
            'PayAttention': " ".join(response.xpath("//*[@id='tab1_1']/div/div/div/div/h3[contains(.,'   ')]/parent::*/p/text()").extract()),
            'Buildings': "/".join(response.xpath("//*[contains(@id,'nkt2_')]/a/nobr/text()").extract()),
            'KeysDate': key_date,
            'State': state,
            'BuildingType': response.xpath("//*[@id='tab2_1']/div/p[1]/text()").extract(),
        }
```

During parsing we found that the structure of the new-building pages changed over time: the DOM tree changed, and fields were added and/or removed. We solved this by updating the corresponding XPath expressions in the spider code and downloading the pages again.
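
An alternative to editing the spider and re-crawling, which is not what was done here, is to keep several candidate XPath expressions per field and take the first one that matches. Below is a minimal sketch of that idea; the helper name `extract_with_fallback` and the second, shorter XPath are hypothetical.

```python
def extract_with_fallback(query, xpaths):
    """Return the first non-empty result among candidate XPath expressions.

    `query` is any callable that maps an XPath string to a list of strings,
    for example `lambda xp: response.xpath(xp).extract()` in a Scrapy callback.
    """
    for xp in xpaths:
        values = [v.strip() for v in query(xp) if v.strip()]
        if values:
            return values
    return []


# First expression: the one used in the spider. Second: a made-up newer layout.
NAME_XPATHS = [
    "//*[@id='main_container_table']/tr/td[2]/div[2]/h1/text()",
    "//h1/text()",
]

# Stand-in for a page on which only the second layout matches.
fake_page = {"//h1/text()": ["  Complex A "]}
name = extract_with_fallback(lambda xp: fake_page.get(xp, []), NAME_XPATHS)
```

Passing the query as a callable keeps the helper independent of Scrapy, so it can be exercised without a live response object.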


Once the data had been downloaded, the routine technical work began: determining the municipality from the address, cleaning the data, identifying duplicates, aggregating buildings into objects, and so on.


Analysis: the criterion for "long-term construction"


We took the list of new buildings with missed construction deadlines from publicly available sources:


  1. The consolidated list of problem objects on the territory of Moscow Region as of December 29, 2017, with the progress of construction works, from the Ministry of the Construction Complex of the Moscow Region;
  2. The list of objects discussed in 2017 at meetings of the Moscow Regional Duma working group on solving the problems of deceived real estate investors in the Moscow region;
  3. The completion status of the SU-155 problem objects, which covers all unfinished SU-155 objects, even those where the work has been fully completed by another developer on behalf of JSCB Russian Capital;
  4. The list of objects from the register of “defrauded real estate investors” (citizens whose money was attracted for the construction of apartment buildings and whose rights were violated as of May 01, 2017) contains all the objects included in this register, even if they were handed over as of the reporting date (in general is located).

All objects found in these sources were flagged as long-term construction projects.


For objects that did not appear in the above lists of long-term construction projects but had the value “construction suspended” in the “Condition” field on novostroykin.ru, the latest forum posts were automatically downloaded and parsed. If the messages contained phrases such as “appeal of deceived real estate investors to V. V. Putin”, “rally”, “fraudsters”, “bankruptcy”, etc., the object was also classified as a long-term construction project.
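
The two rules can be sketched as a single function. The dictionary keys `name` and `state`, the English trigger phrases, and the sample data are assumptions for illustration only; the actual phrase list was longer and matched Russian-language posts.

```python
# Trigger phrases quoted in the text; the real list ended with "etc.".
TRIGGERS = ("deceived real estate investors", "rally", "fraudsters", "bankruptcy")


def is_long_term(obj, official_names, forum_posts):
    """Apply rule 1 (official lists), then rule 2 (suspended + forum keywords)."""
    if obj["name"] in official_names:
        return True
    if obj.get("state") != "construction suspended":
        return False
    text = " ".join(forum_posts).lower()
    return any(phrase in text for phrase in TRIGGERS)


# Made-up objects and forum posts.
official = {"Complex A"}
objects = [
    {"name": "Complex A", "state": "under construction"},
    {"name": "Complex B", "state": "construction suspended"},
    {"name": "Complex C", "state": "construction suspended"},
    {"name": "Complex D", "state": "under construction"},
]
posts = {
    "Complex B": ["Rally on Saturday, the developer filed for bankruptcy"],
    "Complex C": ["When will the parking open?"],
    "Complex D": ["bankruptcy rumours"],
}
flags = {o["name"]: is_long_term(o, official, posts.get(o["name"], [])) for o in objects}
```

Plain substring matching is crude: a short phrase can match inside a longer unrelated word, so a real filter would probably need word boundaries or manual review of the hits.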


However, the resulting list of long-term construction projects is not exhaustive, because:


  1. Since July 2017, the Ministry of Construction and Social Affairs has removed the register of “defrauded real estate investors” from open access, so the most recent data we have is from May 2017;
  2. Information is entered into the register of “defrauded co-investors” with a significant delay. Documents for entry into the register are accepted for the whole Moscow region by only two officials, who work part-time. They cannot see everyone, so the queue to submit documents stretches for several months. In addition, the submitted package of documents is reviewed by a commission that meets once a month. A co-investor is entered in the register only after a positive decision of the commission;
  3. The “Consolidated list of problem objects on the territory of Moscow Region” does not contain all problem objects. Under Moscow Region Law No. 84/2010-OZ, an object must be recognized as a problem object if the developer is more than 9 months late in fulfilling its obligations under contracts concluded with citizens and (or) other persons whose funds were raised to build the apartment building. The relevant municipality is supposed to add the building to the list. But local authorities naturally have no interest in doing so, since a low level of problem construction is one of the criteria by which their work is judged. At present, objects get onto this list only by court decision. For example, a court forced the municipality of Khimki urban district to recognize the residential complex “Aviator” as a problem object, but a month later the municipality reversed its decision. The above-mentioned residential complex “White Dew” in Kotelniki is not recognized as a problem object although the delay is 2 years 4 months; the municipality is currently contesting on appeal the court decision recognizing the object as a problem object. The residential complex “Green City” (Lyubertsy) was removed from the list after a change of developer, but the new investor is not carrying out construction either;
  4. Many buyers are not active enough and do not know the legislation well enough to get onto the official lists, and they do not use the novostroykin.ru forum to discuss their problems. For example, the residential complex “Lobnya City”, with a delivery delay of more than 2 years, and the residential complex “Kotelnichesky skyscrapers”, with a delivery delay of 6 months, do not appear in any of the lists above;
  5. Cases in which a problem object reliably “recovers” are vanishingly rare and can be neglected within the statistics obtained.

Thus, there is every reason to believe that the sample contains many more problem objects than we were able to identify. We will work with those we managed to classify from open sources.


Results


The probability of ending up with a long-term construction project when buying in a new building was calculated as the ratio of the number of long-term construction projects to the number of all new buildings in which at least one building had a planned deadline (the “Commissioning” field) no later than June 30, 2017. This cutoff was chosen because, under the legislation, an object can be included in the official lists of long-term construction projects only after a delay of 6 months or more. If the “Commissioning” field was empty or contained “no data”, the object was not removed from the base for the calculation. Note that on novostroykin.ru the “Commissioning” field holds the adjusted deadline rather than the initial one, which biases the probability of long-term construction downward. For example, for the residential complex “Kotelnichesky skyscrapers” the project declaration gives the 2nd quarter of 2017 as the deadline, while novostroykin.ru gives the 1st quarter of 2018. This complex is therefore not included in our calculations.
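
The calculation described above can be written as a short function. This is a sketch under assumptions: the keys `deadlines` and `long_term` are hypothetical, deadlines are taken to be already parsed into dates (with `None` for an empty or “no data” field), and the sample objects are invented.

```python
from datetime import date

CUTOFF = date(2017, 6, 30)


def long_term_share(objects, cutoff=CUTOFF):
    """Share of flagged objects among the objects counted in the base.

    Each object has `deadlines` (one date or None per building)
    and `long_term` (bool).
    """
    def counted(obj):
        known = [d for d in obj["deadlines"] if d is not None]
        # Objects without any known deadline stay in the base, as stated above.
        return not known or min(known) <= cutoff

    base = [o for o in objects if counted(o)]
    if not base:
        return 0.0
    return sum(o["long_term"] for o in base) / len(base)


# Invented sample: the last object is due after the cutoff and is not counted.
sample_objects = [
    {"deadlines": [date(2016, 12, 31), date(2018, 3, 31)], "long_term": True},
    {"deadlines": [date(2017, 6, 30)], "long_term": False},
    {"deadlines": [None], "long_term": False},
    {"deadlines": [date(2018, 3, 31)], "long_term": True},
]
share = long_term_share(sample_objects)
```

With the invented sample, three objects fall into the base and one of them is flagged, which illustrates how a deadline shifted past the cutoff removes a flagged object from both the numerator and the denominator.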


According to our data, the probability of buying into a long-term construction project in the Moscow region was 15%. For the reasons discussed above (not all long-term construction projects are identified, delivery dates are constantly revised), this figure is a lower bound, and the real value may be higher. Even so, it is twice the “7% toxicity level of the construction industry” from the official statement of M. A. Men, Minister of Construction and Housing of the Russian Federation. We do not undertake to judge whether the discrepancy comes from differences in the data, in the calculation method, or from the particular scale of the problem in the Moscow region. Besides the notorious long-term construction projects (222 in our sample), there is also the risk of buying into unauthorized construction (samostroy): according to the Ministry of the Construction Complex of the Moscow Region, there were about 200 such objects at the end of 2017, compared with 400 at the end of 2012. We did not assess how unauthorized construction overlaps with our sample.


We presented results for the entire database, that is, for new buildings whose construction began or ended in different periods. It would be more interesting to calculate the probability of long-term construction by the year construction began or by the initial completion date of the object. However, we do not have such data.


The tables below give the probability of long-term construction by municipality of the Moscow region (not a guide to action!).


Table 1. Probability of long-term construction in municipalities with weak construction activity (fewer than 10 objects on sale)


image


Table 2. Probability of long-term construction in municipalities with medium and strong construction activity (more than 10 objects on sale)


image



Source: https://habr.com/ru/post/347996/

