
In this article, I will describe how a service that searches rental housing ads from Vkontakte works and how it evolved, why a service-oriented architecture was chosen, and which technologies and solutions were used to build it.
The service has been running for more than nine months.
During this time:
- The service now covers the 21 largest cities of Russia, including Moscow, St. Petersburg, Yekaterinburg, and Kazan.
- The total number of metro stations covered grew from 65 to 346.
- The average number of ads increased from 131.2 to 519.41 per day.
- A settings control panel was added.
- Bots for Telegram and Vkontakte were added. They automatically notify subscribers of new ads.
From here on, I use the word "service" to mean an SOA module, not the entire web service.
I chose SOA because it made it possible to:
- Use different technologies to solve different problems.
- Develop each service independently of the others.
- Deploy services independently.
- Scale services horizontally.
You could call it a microservice architecture, but there are some minor differences. Services exchange data through a shared database using the MDBWP protocol, instead of the HTTP API and per-service databases that are usual for microservices. I took this approach because it allowed rapid development while keeping all the advantages of SOA described above.
Ansible was chosen to automate deployment.
It is a configuration management system with a low entry threshold.
MongoDB was chosen as the database. This document-oriented database was perfect for storing ads with a list of metro stations, contact details of landlords, and a description of the ad.
At the moment, the general scheme of interaction between services is as follows:

Services:
rent-view - service for displaying and searching ads
github.com/mrsuh/rent-view
The service is written in NodeJS because the most important quality criterion was the speed of the server's response to the user.
The service requests ads from MongoDB, renders HTML pages using the doT.js template engine, and sends them to the browser.
The service is built with Grunt.
Browser-side scripts are written in plain JS, and styles are written in LESS.
Nginx is used as a proxy server; it caches some of the responses and provides the HTTPS connection.
rent-collector - ad collection service
github.com/mrsuh/rent-collector
The service collects ads, classifies them and writes them to the database.
It is written in PHP for two reasons: I already knew the libraries needed to write the service, and development in PHP is fast.
The Symfony 3 framework is used.
Beanstalk was selected as the queuing service. It is lightweight, but does not have its own message broker. This is exactly what is needed for a small virtual server and for data whose loss is not critical.
Using Beanstalk, I set up four messaging channels:
- parser - extracts facts such as ad type, price, description, and links from the text. To speed up processing, I run several consumers for this channel. Note: the consumer communicates with the rent-parser service.
- collector - writes the processed ad data to the database.
- notifier - notifies users of new ads. Note: the consumer communicates with the rent-notifier service.
- publisher - publishes ads in several Vkontakte groups.
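The hand-off between the parser and collector channels can be sketched in Go, with in-memory Go channels standing in for the Beanstalk channels. This is only an illustration under stated assumptions: the real rent-collector is written in PHP, and the names `RawAd`, `ParsedAd`, `parseStub`, and `runPipeline` are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// RawAd is a hypothetical message placed on the "parser" channel.
type RawAd struct {
	ID   int
	Text string
}

// ParsedAd is a hypothetical message placed on the "collector" channel.
type ParsedAd struct {
	ID   int
	Text string
}

// parseStub stands in for the call to rent-parser; here it only trims whitespace.
func parseStub(ad RawAd) ParsedAd {
	return ParsedAd{ID: ad.ID, Text: strings.TrimSpace(ad.Text)}
}

// runPipeline starts several parser consumers and a single collector,
// and returns the collected ads keyed by ID (a stand-in for the database).
func runPipeline(ads []RawAd, workers int) map[int]ParsedAd {
	parserCh := make(chan RawAd)
	collectorCh := make(chan ParsedAd)

	// Several consumers read from the parser channel in parallel.
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for ad := range parserCh {
				collectorCh <- parseStub(ad)
			}
		}()
	}

	// Producer: puts raw ads on the parser channel, then closes it.
	go func() {
		for _, ad := range ads {
			parserCh <- ad
		}
		close(parserCh)
	}()

	// Close the collector channel once every parser consumer has finished.
	go func() {
		wg.Wait()
		close(collectorCh)
	}()

	// Collector: a single consumer that stores the processed ads.
	stored := make(map[int]ParsedAd)
	for ad := range collectorCh {
		stored[ad.ID] = ad
	}
	return stored
}

func main() {
	ads := []RawAd{{ID: 1, Text: "  room for rent "}, {ID: 2, Text: "studio near metro  "}}
	stored := runPipeline(ads, 3)
	fmt.Println(len(stored), "ads stored")
}
```

Running several consumers on the slow parsing step while keeping one writer mirrors the setup described in the list above; with a real queue, each consumer would be a separate process reserving jobs from its channel.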
rent-parser - ad classification service
github.com/mrsuh/rent-parser
The service is written in Golang.
To extract structured data from the text, the service uses the Tomita parser from Yandex. It also preprocesses the text and post-processes the parsing results.
So that you can test the service, I made an open API.
Try the parser online
Request:
curl -X POST -d ' 30 . + 7 999 999 9999' 'http://api.socrent.ru/parse'
Response:
{"type":2,"phone":["9999999999"],"price":30000}
Types of ads:
+ 0 - room
+ 1 - 1-room apartment
+ 2 - 2-room apartment
+ 3 - 3-room apartment
+ 4 - apartment with 4 or more rooms
+ 5 - studio
+ 6 - not an ad
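A client could decode this response along the following lines. This is a minimal sketch that assumes the response contains only the three fields visible in the sample; the names `ParseResult`, `DecodeParseResult`, and `TypeLabel`, as well as the short English labels, are hypothetical and not part of the service.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ParseResult mirrors the fields visible in the sample response.
// The real API may return more fields; they would be ignored here.
type ParseResult struct {
	Type  int      `json:"type"`
	Phone []string `json:"phone"`
	Price int      `json:"price"`
}

// adTypeLabels maps the numeric type codes to short, illustrative labels.
var adTypeLabels = map[int]string{
	0: "room",
	1: "1-room",
	2: "2-room",
	3: "3-room",
	4: "4+ rooms",
	5: "studio",
	6: "not an ad",
}

// DecodeParseResult unmarshals a raw JSON response body.
func DecodeParseResult(body []byte) (ParseResult, error) {
	var r ParseResult
	err := json.Unmarshal(body, &r)
	return r, err
}

// TypeLabel returns the label for a type code, or "unknown" for other values.
func TypeLabel(code int) string {
	if label, ok := adTypeLabels[code]; ok {
		return label
	}
	return "unknown"
}

func main() {
	body := []byte(`{"type":2,"phone":["9999999999"],"price":30000}`)
	r, err := DecodeParseResult(body)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Printf("%s, price %d, phones %v\n", TypeLabel(r.Type), r.Price, r.Phone)
}
```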
I wrote more about ad classification here:
habrahabr.ru/post/328282
rent-control - settings management service
github.com/mrsuh/rent-control
It is written in PHP for the same two reasons: I already knew the libraries needed to write the service, and development in PHP is fast.
The Symfony 3 framework is used.
Styles come from the Bootstrap 3 library.
The settings managed by the service include:
- ads;
- blacklist;
- publication configurations;
- parsing configurations.
Initially, all the data that controls parsing was kept in configuration files. As the number of cities grew, it became necessary to visualize this data and make records easier to edit. Adding new parameters also had to become simpler.
rent-notifier - bot service that sends new ads to Telegram and Vkontakte
github.com/mrsuh/rent-notifier
Example of subscribing to ads:

The service is written in Golang because the speed of response to the user is critical.
The service works as follows: you subscribe to new ads, and as they are added, the bot sends you messages about them. The service inserts a link to the original ad into the message text.
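The matching step can be sketched as follows. The article does not describe the subscription criteria, so filtering by city and ad type is an assumption, and every name here (`Subscription`, `Ad`, `Message`, `Notify`) and the example link are hypothetical rather than taken from rent-notifier.

```go
package main

import "fmt"

// Subscription is a hypothetical record of what a chat wants to receive.
type Subscription struct {
	ChatID int64
	City   string
	AdType int
}

// Ad is a hypothetical new ad with a link to the original post.
type Ad struct {
	City   string
	AdType int
	Price  int
	Link   string
}

// Message is what the bot would send to one chat.
type Message struct {
	ChatID int64
	Text   string
}

// Notify builds one message per subscription that matches the new ad.
// The link to the original ad is inserted into the message text.
func Notify(subs []Subscription, ad Ad) []Message {
	var out []Message
	for _, s := range subs {
		if s.City != ad.City || s.AdType != ad.AdType {
			continue
		}
		out = append(out, Message{
			ChatID: s.ChatID,
			Text:   fmt.Sprintf("New ad, price %d: %s", ad.Price, ad.Link),
		})
	}
	return out
}

func main() {
	subs := []Subscription{
		{ChatID: 1, City: "Moscow", AdType: 2},
		{ChatID: 2, City: "Kazan", AdType: 2},
	}
	ad := Ad{City: "Moscow", AdType: 2, Price: 30000, Link: "https://example.com/ad/1"}
	for _, m := range Notify(subs, ad) {
		fmt.Println(m.ChatID, m.Text)
	}
}
```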
Auxiliary repos
Common database code for PHP
github.com/mrsuh/rent-schema
General database schema:

When the rent-control service was added, the database schema code became duplicated. I therefore moved that code into a separate package. Now any PHP service only needs to add this package to its dependencies via composer.
composer require mrsuh/rent-schema
ODM for MongoDB
github.com/mrsuh/mongo-odm
The first ODM for PHP and MongoDB that came to mind was Doctrine 2. It comes with Symfony 3 and has good documentation.
But at the time I wrote the service, getting this ODM to work with the latest version of the PHP drivers for MongoDB required installing an extra package as a layer between the new and the old API.
Doctrine 2 is a fairly large project in itself, and with the additional package it became even bigger. I wanted something lightweight instead, so I decided to write an ODM myself with a minimal feature set. It worked out: the ODM fully handles what it is responsible for.
Some statistics
The service adds an average of 519.41 ads per day to the site.
The most popular metro stations in the largest cities of Russia were the following:
- St. Petersburg - Devyatkino
- Moscow - Komsomolskaya
- Kazan - Prospekt Pobedy
- Yekaterinburg - Uralmash
- Nizhny Novgorod - Avtozavodskaya
- Novosibirsk - Ploshchad Marksa
- Samara - Moskovskaya
More statistics can be viewed on the site itself.
Conclusion
If you have not yet decided whether you need SOA, build a monolithic application split into modules. That will make it easier to move the application to services if needed. But if you do decide to use SOA, keep in mind that it may increase the complexity of development and deployment, the amount of code, and the volume of messages between services.
PS I found my last two apartments with the help of my service. I hope it helps you too.