While reviewing the primary housing market in the Moscow region, we inevitably ran into deceived co-investors and problem properties, the so-called “long-term construction projects” (dolgostroi). Naturally, the question arose of how likely this outcome is.
The goal was to classify primary-market construction objects by a comprehensive set of features: information about the object, the developer, and so on. Publicly available data turned out to be rather scarce, but it was still possible to collect some descriptive statistics ...
To estimate the share of long-term construction projects in the total, we need a database of new buildings with attributes by which an object can be classified as long-term construction. We opted for novostroykin.ru because, at the time of the analysis, it listed more objects than similar sites (novostroy-m.ru, mskguru.ru, novostroev.ru, novostroy.ru, novostroykirf.ru): 1,756 against 1,093, 674, 426, 1,296 and 392, respectively.
The initial parsing of the Novostroykin pages revealed duplicates: a small share of the new buildings were listed building by building. There were also objects marked “PROJECTS”, none of which turned out to have any sales. After the correction (per-building records merged, projects excluded), 1,641 objects remained in our dataset, still more than on the competing sites.
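The clean-up described above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the field names `name`, `mark` and `buildings`, and the idea of matching duplicates by a normalized name, are assumptions made for the example.

```python
def collapse_records(records):
    """Merge per-building rows into one object per complex and drop 'PROJECTS' rows.

    Assumption: duplicates can be detected by a case-insensitive, trimmed name.
    """
    merged = {}  # dicts keep insertion order in Python 3.7+
    for rec in records:
        if rec.get("mark") == "PROJECTS":
            continue  # objects with this mark had no sales, so they are excluded
        key = rec["name"].strip().lower()
        obj = merged.setdefault(key, {"name": rec["name"].strip(), "buildings": []})
        obj["buildings"].extend(rec.get("buildings", []))
    return list(merged.values())


# Hypothetical sample rows
sample = [
    {"name": "Complex A", "buildings": ["1"]},
    {"name": "complex a ", "buildings": ["2"]},
    {"name": "Complex B", "mark": "PROJECTS", "buildings": []},
    {"name": "Complex C", "buildings": ["1", "2"]},
]
objects = collapse_records(sample)
```

In practice name matching alone is likely too crude, and the address would probably need to be compared as well.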
It is important to note that the listed resources are advertising sites. Despite the legal requirement of accuracy (Article 5 of No. 38-FZ “On Advertising”), the correctness of the information is left to the conscience of the site owners. We therefore ran an additional spot check on an object whose state we knew first-hand from people interested in providing reliable data. The object chosen was the residential complex "White Dew" (Kotelniki urban district). Of the above sites:
Thus, Novostroykin remains the leading source of information for assessments and market analysis.
In addition to advertising sites, there are official open sources. The State Unified Information System for Housing Construction (UISM), designed to increase the transparency of shared-equity construction, has been operating in the Russian Federation since January 1, 2018. However, on January 29, 2017, our verification object was not in it, even though under clause 5.6 of Art. 23.3 of No. 214-FZ not only developers but also the regulatory authorities must post information in the system, namely Rosreestr (information about the land plot) and Glavstroynadzor (the results of inspections of the object). Nor was it possible to compare UISM data with the advertising sites by number of objects: information is entered into UISM building by building, and some new buildings have no building information at all, so automated data verification is impossible.
As a result, we decided to use the Novostroykin data for the classification of objects.
Parsing is done with the Scrapy framework. The page address of an object has the form www.novostroykin.ru/novostroyki/name_in_latin/ . The site was crawled in stages. The first stage walks the catalogue pages containing links to new buildings. The second reads the data for each new building from its page and saves the results to a file. The Python spider code:
import scrapy


class NovostroykinSpider(scrapy.Spider):
    name = "novostroykin"

    def start_requests(self):
        # Stage 1: catalogue pages 1..74; the page number is appended to "pg=".
        urls = [
            "https://www.novostroykin.ru/novostroyki/find/?t=2&pt=0&ignorenf=1&nf_mode=0&pot=0&pdo=0&reg_mo=1&dometro=0&k1=&k2=&k3=&k4=&beza=&ipoteka=&fav=&econom=&s2017=&rassrochka=&razr=&prod=&warh=&dometrot=&me=&snewmetro=&sdan2=&sdan=&sr_2017_4=&sr_2018_1=&sr_2018_2=&sr_2018_3=&sr_2018_4=&sr_2019=&sr_2020_0=&napl=&otdelka=&balkon=&r=&ord=3&pg=" + str(i)
            for i in range(1, 75)
        ]
        for url in urls:
            yield scrapy.Request(url=url, callback=self.parse)

    def parse(self, response):
        # Follow every object link except those leading to adrai.novostroykin.ru/wow/.
        for url in response.xpath("//*[@id='nff_list']/div/div[2]/div/a/@href").extract():
            next_url = response.urljoin(url)
            if "adrai.novostroykin.ru/wow/" not in next_url:
                yield scrapy.Request(next_url, callback=self.parse_n)

    def parse_n(self, response):
        # Stage 2: read the fields of one new building from its page.
        korp = "//*[contains(@id,'korpinfo_') and string-length(@id)>10]"

        # Commissioning dates, one entry per building, joined with "/".
        l = response.xpath(korp + "/p[2]/span[2]/text()|" + korp + "/p[2]/text()").extract()
        l2 = [s for s in l if s != " "]
        StrKeyDate = "/".join(l2).replace("\xa0", "")

        # Construction state, one entry per building, joined with "/".
        l = response.xpath(korp + "/p[3]/span[2]/text()|" + korp + "/p[3]/text()").extract()
        l2 = [s for s in l if s != " "]
        StrState = "/".join(l2).replace("+ /", "+ ")

        yield {
            'Name': response.xpath("//*[@id='main_container_table']/tr/td[2]/div[2]/h1/text()").extract(),
            'Address': response.xpath("//*[@id='main_container_table']/tr/td[2]/div[2]/div/index/div[5]/text()").extract(),
            # The heading text inside contains() is blank in the published listing.
            'Description': " ".join(response.xpath("//*[@id='tab1_1']/div/div/div/div/h3[contains(.,' ')]/parent::*/p/text()").extract()),
            'PayAttention': " ".join(response.xpath("//*[@id='tab1_1']/div/div/div/div/h3[contains(.,' ')]/parent::*/p/text()").extract()),
            'Buildings': "/".join(response.xpath("//*[contains(@id,'nkt2_')]/a/nobr/text()").extract()),
            'KeysDate': StrKeyDate,
            'State': StrState,
            'BuildingType': response.xpath("//*[@id='tab2_1']/div/p[1]/text()").extract(),
        }
During parsing we found that the structure of the new-building pages changed over time: the DOM tree changed, and fields were added and/or removed. We solved this by re-downloading the pages after changing the corresponding XPath in the spider code.
Once the data was downloaded, the technical work began: determining the municipality from the address, cleaning the data, identifying duplicates, collapsing buildings into objects, and so on.
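One alternative to editing the spider every time the layout changes is to keep several candidate paths per field and take the first that matches. The sketch below is not what the authors did. It uses the standard library's `xml.etree.ElementTree` (which supports only a subset of XPath) instead of Scrapy selectors so that it runs on its own, and both layouts and all paths in it are hypothetical.

```python
import xml.etree.ElementTree as ET


def first_match(root, paths):
    """Return the stripped text of the first path that yields non-empty text, else ''."""
    for path in paths:
        node = root.find(path)
        if node is not None and node.text and node.text.strip():
            return node.text.strip()
    return ""


# Hypothetical paths for the object name, newest layout first
NAME_PATHS = [
    ".//div[@id='title']/h1",
    ".//td[@class='head']/h1",
]

# Two hypothetical page layouts
old_page = ET.fromstring(
    "<html><body><table><tr><td class='head'><h1>Complex A</h1></td></tr></table></body></html>"
)
new_page = ET.fromstring(
    "<html><body><div id='title'><h1> Complex B </h1></div></body></html>"
)

old_name = first_match(old_page, NAME_PATHS)
new_name = first_match(new_page, NAME_PATHS)
```

An empty result for a field that used to be filled is a useful signal that the page structure has changed again.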
We took the list of new buildings with missed construction deadlines from publicly available sources:
All objects found in these sources were flagged as “protracted”.
For objects that did not appear in the above lists of dolgostroi but had the value “construction suspended” in the “Condition” field on Novostroykin, the latest forum posts were automatically downloaded and parsed. If the messages contained phrases such as “appeal of deceived real estate investors to V. V. Putin”, “rally”, “fraudsters”, “bankruptcy”, etc., the object was also classified as long-term construction.
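The forum check can be illustrated with a simple phrase match. This is a sketch under assumptions: the marker list below contains only English renderings of some phrases named above (the real posts are in Russian and the full list is not given), and the function name and arguments are invented for the example.

```python
import re

# Illustrative marker phrases; the actual list used in the analysis is not published
MARKERS = ["deceived real estate investors", "rally", "fraudsters", "bankruptcy"]

# Word boundaries keep "rally" from matching inside words such as "generally"
MARKER_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(m) for m in MARKERS) + r")\b",
    re.IGNORECASE,
)


def looks_protracted(state, posts):
    """Flag an object whose construction is suspended and whose posts contain a marker."""
    if state != "construction suspended":
        return False
    return any(MARKER_RE.search(post) for post in posts)


flagged = looks_protracted("construction suspended", ["Bankruptcy hearing next week"])
```

A match on a single word is a weak signal, so results of this kind are worth a manual review before an object is labeled.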
However, the resulting list of protracted projects is not exhaustive, since:
Thus, there is every reason to believe that the sample contains many more problem objects than we were able to identify. We will work with those we managed to classify from open sources.
The probability of ending up with a protracted project when buying in a new building was calculated as the ratio of the number of protracted objects to all new buildings in which at least one building had a planned deadline on or before June 30, 2017 (the “Commissioning” field). June 30, 2017 was chosen because, under the legislation, an object can be included in the official lists of long-term construction projects only after a delay of 6 months or more. If the “deadline” field was empty or “no data”, the object was not removed from the calculation base. Note that on Novostroykin the “Commissioning” field contains not the initial but the adjusted deadline, which leads to an underestimate of the probability of long-term construction. For example, for the residential complex "Kotelnichesky skyscrapers" the project declaration gives a deadline of Q2 2017, while Novostroykin gives Q1 2018. This complex therefore does not take part in our calculations.
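The calculation can be sketched as below. The record layout (`deadlines`, `protracted`) is hypothetical, and two readings of the text are assumed: objects with no deadline data stay in the base, and only protracted objects inside the base are counted in the numerator.

```python
from datetime import date

CUTOFF = date(2017, 6, 30)


def in_base(obj):
    """True if at least one building was due by the cutoff, or if no deadline is known."""
    known = [d for d in obj["deadlines"] if d is not None]
    if not known:
        return True  # assumption: objects without deadline data are kept
    return min(known) <= CUTOFF


def protracted_probability(objects):
    """Share of protracted objects among those in the calculation base."""
    base = [o for o in objects if in_base(o)]
    if not base:
        return 0.0
    return sum(1 for o in base if o["protracted"]) / len(base)


# Hypothetical sample: the third object is due only in 2018 and falls outside the base
sample = [
    {"deadlines": [date(2016, 12, 31)], "protracted": True},
    {"deadlines": [date(2017, 3, 31), date(2018, 6, 30)], "protracted": False},
    {"deadlines": [date(2018, 3, 31)], "protracted": False},
    {"deadlines": [None], "protracted": False},
    {"deadlines": [date(2017, 6, 30)], "protracted": False},
]
probability = protracted_probability(sample)
```

Because the site shows adjusted rather than initial deadlines, an object whose deadline was moved past the cutoff drops out of the base, which is the source of the underestimate noted above.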
According to our data, the probability of buying into a protracted project in the Moscow region was 15%. For the reasons discussed above (not all long-term construction projects are identified, delivery dates are constantly revised), this figure is a lower bound, and the real value may be higher. Even so, it is twice as high as the “7% toxicity level of the construction industry” from the official statement of the Minister of Construction and Housing of the Russian Federation, M. A. Men. We do not undertake to judge whether the discrepancy is caused by differences in the data, in the calculation method, or by the particular scale of the problem in the Moscow region. Besides the notorious dolgostroi (222 in our sample), there is also the risk of buying into samostroi (unauthorized construction): according to the Ministry of Construction of the Moscow Region, there were about 200 such objects at the end of 2017 and 400 at the end of 2012. We did not assess how unauthorized construction overlaps with our sample.
We presented results for the entire database, that is, for new buildings whose construction began or ended in different periods. It would be more interesting to calculate the probability of long-term construction by the year construction began or by the initial completion date of the object. However, we do not have such data.
The tables below give the probabilities of protracted construction by municipality of the Moscow region (not a guide to action!).
Table 1. Probability of protracted construction in municipalities with weak construction activity (fewer than 10 objects are on sale)
Table 2. Probability of protracted construction in municipalities with medium and strong construction activity (more than 10 objects are on sale)
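A breakdown of this kind could be produced along the following lines. This is only a sketch: the field names are hypothetical, and since the captions do not say where a municipality with exactly 10 objects belongs, it is placed in the second group here by assumption.

```python
from collections import defaultdict


def split_by_activity(objects, threshold=10):
    """Return (weak, active): municipality -> share of protracted objects."""
    counts = defaultdict(lambda: [0, 0])  # municipality -> [protracted, total]
    for obj in objects:
        entry = counts[obj["municipality"]]
        entry[0] += 1 if obj["protracted"] else 0
        entry[1] += 1
    weak, active = {}, {}
    for name, (bad, total) in counts.items():
        # Assumption: exactly `threshold` objects counts as medium/strong activity
        target = weak if total < threshold else active
        target[name] = bad / total
    return weak, active


# Hypothetical sample: 12 objects in "A" (3 protracted), 4 objects in "B" (1 protracted)
sample = (
    [{"municipality": "A", "protracted": i < 3} for i in range(12)]
    + [{"municipality": "B", "protracted": i < 1} for i in range(4)]
)
weak, active = split_by_activity(sample)
```

With fewer than 10 objects a single protracted project shifts the share by more than ten percentage points, which is a good reason to report such municipalities separately.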
Source: https://habr.com/ru/post/347996/