
Optimizing the search for violators of land-use legislation

Good afternoon, Habr! I have long dreamed of contributing to this wonderful project, which I have been reading for several years.

But since I am not a programmer by training, my projects were never as elegant as those presented here, so I thought for a long time about what I could post without being showered with downvotes. Eventually, in the course of my work, an idea appeared for optimizing employees' workflow that (in my opinion) makes life simpler.

The essence of the idea: there are land plots on which only private residential buildings may be built (individual housing construction), and using such premises for commercial activity is prohibited. Not that this has ever stopped anyone in Russia, so it turns out that employees must go around on foot and check whether a house built as a dwelling is actually being used as a shop. As a result, you have to walk long and far, and you constantly need access to reference data to clarify what kind of house each one is. Or else you sit in the office, select addresses for inspection, then harness the camels, stock up on water, and set off on an amazing journey.

Since the media keep telling us that the digital economy is supposedly upon us, along with the Internet of Things, the "look it up in the database, then go for a walk" approach looks rather strange.

As a result, while implementing the idea of building a heat map on top of the public cadastral map and analyzing the data on the site, I first obtained data on all the cadastral quarters within the district. The extraction method was not very humane: while analyzing the tiles loaded onto the map, I saved all the SVG files and exported the quarter numbers from the tags describing the geometry.
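
For illustration, a minimal sketch of that export step, assuming each saved tile is an SVG whose geometry tags carry the quarter number in an id-like attribute (the actual attribute names on the cadastral map tiles may differ):

import glob
import xml.etree.ElementTree as ET

quarters = set()
for tile in glob.glob('tiles/*.svg'):
    for elem in ET.parse(tile).getroot().iter():
        # assumption: the quarter number sits in 'id' or 'data-name'
        name = elem.get('id') or elem.get('data-name')
        if name:
            quarters.add(name)
print(len(quarters), 'quarters found')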

Later, within each quarter, I obtained all the objects inside its boundaries. A problem appeared at the data-analysis stage: the object coordinates are given in Web Mercator (EPSG:3857), so I had to add a conversion to the usual coordinate system with longitude and latitude. The coordinates of each object come in two forms: the exact center of the plot and the extreme points of the plot's overall geometry.

json data
"center": {
    "x": 5135333.187724394,
    "y": 6709637.473119974
},
"extent": {
    "xmax": 5135346.924914406,
    "xmin": 5135320.375756669,
    "ymax": 6709666.568613776,
    "ymin": 6709611.5618947
}


I used the first pair of coordinates, after conversion, for the heat map. While fiddling with the map in QGIS (which I saw for the first time), a crazy idea occurred to me: you can search for objects by coordinates in the popular Yandex.Maps and 2GIS services. I immediately decided to test this theory, and indeed, if you type in the coordinates, both services return information about the house at that point. The conclusion suggests itself: if you select objects whose permitted use is residential construction only and check them against the data on organizations, then whenever the maps return a list of organizations, the building is not being used for its intended purpose.

The initial check was carried out on the plot-center data, and as practice showed, the data did not always match: a plot is larger than the house on it, and the house is not always in the middle of the plot. But it was good enough to confirm the theory. I tested it on an area where I know for certain there is a private housing sector with small shops and services.

To get data from the maps, I used a blunt Python script (I apologize in advance if this code offends programmers):

code
import csv
import time
from random import choice, randint

import requests
from pandas import read_csv


def get_proxy():
    """Pick a random working HTTP proxy from proxy.txt (not wired into main here)."""
    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 8; WOW32; rv:54.0) Gecko/20100101 Firefox/54.0'}
    proxies = open('proxy.txt').read().split('\n')
    result = None
    proxy = ''
    while result is None:
        try:
            proxy = {'http': 'http://' + choice(proxies)}
            r = requests.get('http://ya.ru', headers=headers, proxies=proxy, timeout=(60, 60))
            if r.status_code == 200:
                result = proxy
        except requests.RequestException:
            pass  # dead proxy, try the next one
    return proxy


def main():
    # Browser-like headers (with a session cookie) so the requests pass as ordinary traffic.
    headers = {
        'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
        'Accept-Encoding': 'gzip, deflate',
        'Accept-Language': 'ru-RU,ru;q=0.8,en-US;q=0.6,en;q=0.4',
        'Cache-Control': 'max-age=0',
        'Connection': 'keep-alive',
        'Cookie': '_ym_uid=1495451406361935235; _ym_isad=1; topCities=43-2; _2gis_webapi_session=0faa6497-0004-47c5-86d8-3bf9677f972a; _2gis_webapi_user=7e256d33-4c6e-44ab-a219-efc71e2d330f',
        'DNT': '1',
        'Host': 'catalog.api.2gis.ru',
        'Upgrade-Insecure-Requests': '1',
        'User-Agent': 'Mozilla/5.0 (Windows NT 7; WOW32; rv:54.0) Gecko/20100101 Firefox/54.0',
        'X-Compress': 'null',
    }
    print('start')

    # Output: one row per plot - cadastral id, centre coordinates, shop count.
    csvfile = open('GetShopsByCoordinats.csv', 'a', newline='')
    fieldnames = ['id', 'x', 'y', 'shops', 'names']
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames, delimiter=';', quotechar='|')
    writer.writeheader()

    # Input: plot id and lon/lat of the plot centre, prepared earlier.
    inputCSV = read_csv('for2GIS.csv', sep=';', skiprows=[0], header=None)

    for i in range(0, 5):  # first 5 rows as a test run
        s = ('point=' + str(inputCSV[1][i]) + '%2C' + str(inputCSV[2][i]) +
             '&fields=search_attributes%2Citems.links&key=rutnpt3272')
        print(s)
        with requests.Session() as session:
            r = session.get('https://catalog.api.2gis.ru/2.0/geo/search?' + s,
                            headers=headers, timeout=(60, 60))
            JsonData = r.json()
            try:
                # 'branches' holds the number of organizations registered at this point
                shops = JsonData['result']['items'][0]['links']['branches']['count']
                print(shops)
                writer.writerow({'id': str(inputCSV[0][i]), 'x': str(inputCSV[1][i]),
                                 'y': str(inputCSV[2][i]), 'shops': shops})
            except (KeyError, IndexError):
                writer.writerow({'id': str(inputCSV[0][i]), 'x': str(inputCSV[1][i]),
                                 'y': str(inputCSV[2][i]), 'shops': 'none'})
            time.sleep(randint(1, 3))  # pause so as not to hammer the API

    csvfile.close()
    print('fin')


if __name__ == '__main__':
    main()
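
For reference, the input file for2GIS.csv is expected to look roughly like this (the script skips the first row and reads columns 0-2 as id, longitude, latitude; the cadastral numbers and coordinates below are made up):

id;x;y
64:48:040815:15;46.1312;51.4142
64:48:040815:16;46.3005;51.5001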


The result was a file listing every checked object with its cadastral number, the coordinates of the plot center, and the service's data on the number of shops at that point. Names and contact details were not needed.

A banal autofilter in Excel on the shop-count column (values greater than 0), and we get a list of potential violators; all that remains is to visit the already "pre-verified" addresses.
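
The same filter is a few lines of pandas; a minimal sketch over the output file described above:

import pandas as pd

df = pd.read_csv('GetShopsByCoordinats.csv', sep=';')
df['shops'] = pd.to_numeric(df['shops'], errors='coerce')  # 'none' -> NaN
violators = df[df['shops'] > 0]
violators.to_csv('potential_violators.csv', sep=';', index=False)
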
Plans for the future: search not at a single point but within a radius covering the plot's borders, since the API appears to allow it. And if the number of identified objects turns out to be large enough, one could even build an optimal-route planner and generate inspection report templates from the available data.
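
A hypothetical sketch of that radius search: the extent is in EPSG:3857, whose unit is the metre, so half the diagonal of the bounding box gives a radius that covers the whole plot (the radius parameter name alongside point is an assumption, not verified against the API):

import math

extent = {"xmax": 5135346.924914406, "xmin": 5135320.375756669,
          "ymax": 6709666.568613776, "ymin": 6709611.5618947}
dx = extent["xmax"] - extent["xmin"]
dy = extent["ymax"] - extent["ymin"]
# Web Mercator metres overstate ground distance at this latitude,
# which only makes the covering radius safer.
radius = math.ceil(math.hypot(dx, dy) / 2)
params = {'point': '46.13,51.41', 'radius': radius}  # assumed parameter names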

The result is a very simple system that lets employees check only the necessary objects instead of just walking around. Once the identified objects are confirmed, and the algorithm is adjusted if needed, verification lists can be generated automatically by simply selecting the required area on the map.
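
A minimal sketch of that selection step, assuming the plot centers have already been converted to lon/lat (uses shapely; the polygon and cadastral numbers here are made up):

from shapely.geometry import Point, Polygon

# area of interest drawn on the map (lon, lat vertices)
area = Polygon([(46.10, 51.40), (46.20, 51.40), (46.20, 51.45), (46.10, 51.45)])
plots = [('64:48:040815:15', 46.1312, 51.4142),
         ('64:48:040815:16', 46.3005, 51.5001)]
to_check = [cad for cad, lon, lat in plots if area.contains(Point(lon, lat))]
print(to_check)  # -> ['64:48:040815:15']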

As for downloading the data: I have since obtained a legitimate API key.

Source: https://habr.com/ru/post/336972/

