
Data service for elections and candidates

Good afternoon, colleagues!

As many of you know, September 14, 2014 is the single voting day: many regions of Russia will elect deputies and, in some places, mayors.
In my opinion, the information support for these elections is poor. The main drawback is that candidates cannot be viewed as a list with details. There is only a list without details (split into pages of 20 people) and a separate page with the details of one person.

One sunny summer day I had the idea of extracting this information so that candidates could be analyzed and chosen conveniently, visually and wisely. Unfortunately, the CEC (Central Election Commission) does not provide any export option covering all elections (at least I did not find one), so the solution is to parse the pages with a robot.

The first implementation was in Scala: I wanted to consolidate my knowledge of the language and get to grips with the Play framework, which was new to me. I wrote and tested the parser but, unfortunately, could not get through the Play documentation; for a long time I could not find the answer to some basic question. After that I decided to try the Django framework, since its documentation was much better, so the parser was rewritten in Python.

The project can be viewed on GitHub; the Scala parser remains in the “scala-parser” folder.

During development, while designing the models, a remarkable bonus emerged: we can get the entire history of a candidate's participation in elections (since 2007, when the CEC switched to the current format; I did not parse the old format, since it would add at most one more election to the history, the resource itself having started in 2003). This can in fact be considered the main value of the project, since a voter can now get complete information about where, when and alongside whom a given candidate ran in elections. The election list shows a column with the number of times each candidate has run, and from there you can go to the candidate's page to see all of their elections and all the information. As far as I know, the pair of full name + date of birth is unique for Russian citizens, so there should be no mistakes.
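The idea of assembling a history by the name + date of birth pair can be sketched roughly as follows. This is only an illustration, not the project's actual code: the record layout, the sample rows and the `build_history` function are assumptions made for the example.

```python
from collections import defaultdict
from datetime import date

# Hypothetical flat records as a parser might produce them:
# (full name, date of birth, election title, election date).
records = [
    ("Ivanov Ivan Ivanovich", date(1970, 5, 1), "City council", date(2009, 10, 11)),
    ("Ivanov Ivan Ivanovich", date(1970, 5, 1), "Regional duma", date(2014, 9, 14)),
    ("Ivanov Ivan Ivanovich", date(1982, 3, 7), "City council", date(2014, 9, 14)),
]


def build_history(rows):
    """Group elections by the (full name, date of birth) key."""
    history = defaultdict(list)
    for name, born, election, held_on in rows:
        history[(name, born)].append((held_on, election))
    # Sort each person's elections chronologically.
    for entries in history.values():
        entries.sort()
    return dict(history)


history = build_history(records)
# The "number of elections" column is the length of each history.
participation_count = {key: len(entries) for key, entries in history.items()}
```

Note that the two namesakes with different birth dates end up as two separate people, which is exactly what the composite key is for.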

The models are straightforward: election objects (name, date and link), person objects (full name and date of birth), and information objects holding all the data about a candidacy, with references to a specific election and a specific person.
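As a rough sketch of these three kinds of objects, here they are as plain dataclasses rather than Django models, so that the example runs without a configured project. All class and field names, as well as the example URL, are my assumptions; the real models are in the repository.

```python
from dataclasses import dataclass, field
from datetime import date, datetime


@dataclass(frozen=True)
class Election:
    name: str
    held_on: date
    url: str  # link to the election page on the source site


@dataclass(frozen=True)
class Person:
    full_name: str
    born: date  # together with full_name identifies the person


@dataclass
class Candidacy:
    """Information object: one person's data in one election."""
    election: Election
    person: Person
    details: dict = field(default_factory=dict)  # party, district, status...
    updated_at: datetime = field(default_factory=datetime.now)


def elections_of(person, candidacies):
    """Return all elections in which the given person took part."""
    return [c.election for c in candidacies if c.person == person]


# Usage with made-up data (the URL is a placeholder).
duma = Election("Moscow City Duma", date(2014, 9, 14), "https://example.org/election/1")
ivanov = Person("Ivanov Ivan Ivanovich", date(1970, 5, 1))
entries = [Candidacy(duma, ivanov, {"district": "1"})]
found = elections_of(ivanov, entries)
```

In Django the two references would naturally become `ForeignKey` fields on the information model.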

The Python code that parses the site with the BeautifulSoup library can be found here . During development I had to deal with election commissions that sometimes mix up candidates' dates of birth and full names. To handle this, when writing to the database I check the update date of every information record at the end of processing an election. If a candidate's information was last updated much earlier than the election itself, that record is superfluous and can be deleted.
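The update-date check can be illustrated like this. It is a minimal sketch under my own assumptions: the records are plain dictionaries with an `updated_at` key, and "much earlier" is expressed through a `tolerance` parameter whose one-hour default is arbitrary.

```python
from datetime import datetime, timedelta


def split_stale(records, election_updated_at, tolerance=timedelta(hours=1)):
    """Split records into (fresh, stale) by their update timestamp.

    A record is stale if the latest parsing pass did not touch it, i.e.
    its updated_at is earlier than the election's update time by more
    than `tolerance`.
    """
    cutoff = election_updated_at - tolerance
    fresh = [r for r in records if r["updated_at"] >= cutoff]
    stale = [r for r in records if r["updated_at"] < cutoff]
    return fresh, stale


# Made-up example: the election was re-parsed on September 1; one record
# was left over from an earlier pass with a misspelled name.
parsed_at = datetime(2014, 9, 1, 12, 0)
rows = [
    {"name": "Ivanov Ivan Ivanovich", "updated_at": datetime(2014, 9, 1, 11, 58)},
    {"name": "Ivanov Ivan Ivanovitch", "updated_at": datetime(2014, 8, 20, 9, 0)},
]
fresh, stale = split_stale(rows, parsed_at)
```

Here `stale` contains only the leftover record, which can then be deleted from the database.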

And then comes the most ordinary Django project, which is of no particular interest.

For dynamic filtering and sorting in the table, I use the JS library http://tablefilter.free.fr/

Initially the project was hosted on Heroku, but I quite quickly exceeded the free database limit (no more than 10,000 rows); now, after parsing the elections of the Moscow Region, Moscow and St. Petersburg, the number of candidates is about 50,000. A call for sponsorship on Facebook got me a free virtual server from Sergey Arsentiev, for which many thanks to him!

This was my first experience of setting up a Linux server over SSH for a Django project running on Gunicorn behind Nginx, so I learned an amazing amount. One question remains: for some reason the logs are not written when the service is started via Upstart. If anyone knows about this, please help. The Upstart and Nginx configs can also be found on GitHub.

And here is the link to the working site:
elections.istra-da.ru

For example, information on the elections to the Moscow City Duma can be found here:
elections.istra-da.ru/election/1399

If there is a need, name your regions and districts and I will add them to the robot's task list. I have not scanned all regions yet, only the Moscow Region, Moscow and St. Petersburg, because I am afraid the CEC would take offense and block the parser.

Comments, suggestions, ideas for further development and help with development are welcome.

Source: https://habr.com/ru/post/235977/

