An Ad Exchange for Real-Time Bidding (RTB) is one of the AdTech solutions reshaping the online advertising market. Its main function is to connect a large number of SSPs and DSPs that have no direct integration with each other, and to resell advertising traffic between them.
Thanks to an order for the US market, we dove into the specifics of building an Ad Exchange platform. In this article we share some ideas and results.
Problem statement
Real-Time Bidding (RTB) sells advertising space on sites in real time in order to show relevant ads to the target audience.
In short, the process works as follows:
- the end user requests a web page or opens a mobile application where space is reserved for a banner (the page embeds code for selling advertising inventory from an SSP, a Supply Side Platform);
- to get the highest price for its inventory, the SSP puts it up for sale through the Ad Exchange, which organizes bidding among different DSPs (Demand Side Platforms), whose goal is to buy inventory as cheaply as possible;
- after the auction winner is announced, the winning DSP sends the SSP the banner code, which is shown to the user;
- another party to the process is the DMP, a third-party system that provides the DSP with detailed information about the end user (beyond what the SSP can pass along, such as cookies) to justify the purchase and the proposed price.
There are quite a few ad exchanges today. In addition, many SSPs run their own auctions (effectively covering the Ad Exchange functionality themselves). But our customer was confident that a few new ideas would let him enter the market quickly and withstand the competition.
The exchanges operate on different principles: some take a large margin, some a smaller one, some trade unique inventory, others focus on mass-market inventory. The market is quite young and actively developing, so there are no business models proven over the years: everything is built on bold hypotheses and experiments. Most players work in a simple way: they receive a request from one of the several SSPs they have managed to sign up and send it to all integrated DSPs, hoping for the best bid. The Ad Exchange's income is the difference between the price at which it sells inventory to the DSP and the price at which it buys it from the SSP, minus operating costs.
Our client proposed optimizing this scheme by distributing SSP requests among DSPs selectively: not sending out obviously "losing" requests, thereby reducing operating costs. This makes it possible to lower the exchange's commission without losing income, and to make the offer more attractive than those of competing ad exchanges in the fight for SSPs and DSPs. Connecting more partners, in turn, brings both income and stability in the market.
To implement this strategy in the US market, we were tasked with building an Ad Exchange with intelligent request distribution that would provide a good buy-out rate. In theory, such distribution can use a lot of the information accompanying a request, even data from the third-party systems mentioned above (DMPs). However, complex analytics requires resources, so the real task is to find a balance between the cost of smart distribution and the gain it brings (compared to other market players). In a relatively new, immature market, building very complex solutions to squeeze out a few tenths of a percent simply does not make sense.
An important feature of the project, in addition to the expected high loads, was the non-functional requirement on auction speed imposed by the SSP. The norm in this market segment is an SSP response timeout of 300 ms, which we had to meet while also making calls to external systems (DSPs).
The project started in the fall of 2016. Thanks to the team's experience in this area, we built the first prototype in three months, and after three more months the MVP (Minimum Viable Product) was ready. It let us collect the first analytics needed to launch smart request distribution within the Ad Exchange.
The launch of the MVP showed that the hypothesis about the project's commercial success was correct: the Ad Exchange began to earn money for the client. The original plan for developing the Ad Exchange involved deeper data analysis, feeding end-user information from external systems into the analytics. But at the MVP stage it was decided to use only the data the SSP has. That was enough to get the expected profit.
Solution Architecture
The solution is built on the Chain of Responsibility pattern, which avoids hard-coding the request route within the system and makes it easy to add handlers and services, from the auction itself to filtering tools.
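The idea can be illustrated with a minimal sketch. It is not the project's actual code: the names BidRequest, Handler, HandlerChain and ChainDemo, as well as the two example steps, are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical bid request; field names are illustrative only. */
final class BidRequest {
    final String sspId;
    final String userAgent;
    final List<String> trace = new ArrayList<>(); // names of handlers that saw the request

    BidRequest(String sspId, String userAgent) {
        this.sspId = sspId;
        this.userAgent = userAgent;
    }
}

/** One step of the pipeline: return false to stop processing the request. */
interface Handler {
    boolean handle(BidRequest request);
}

/** Runs handlers in order; the route is just a list, so steps are easy to add or reorder. */
final class HandlerChain {
    private final List<Handler> handlers = new ArrayList<>();

    HandlerChain add(Handler handler) {
        handlers.add(handler);
        return this;
    }

    /** Returns true if every handler let the request through. */
    boolean process(BidRequest request) {
        for (Handler handler : handlers) {
            if (!handler.handle(request)) {
                return false;
            }
        }
        return true;
    }
}

final class ChainDemo {
    /** Example route: a trivial filter followed by a placeholder auction step. */
    static HandlerChain defaultChain() {
        return new HandlerChain()
            .add(r -> { r.trace.add("filter"); return !r.userAgent.isEmpty(); })
            .add(r -> { r.trace.add("auction"); return true; });
    }

    public static void main(String[] args) {
        BidRequest request = new BidRequest("ssp-1", "Mozilla/5.0");
        System.out.println(defaultChain().process(request) + " " + request.trace);
    }
}
```

Because each step only sees the request and decides whether to pass it on, a new filter or service can be inserted into the list without touching the other handlers.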
The customer did not limit us in the choice of technology stack. So, with the future development and support of the project in mind, we built a horizontally scalable solution using Postgres and Hadoop.
The Ad Exchange itself is written in Java. We did not use any frameworks, so as not to lose performance under load (we worked at a low level).
To meet the SSP timeout mentioned above, we tuned the garbage collector parameters (we use G1) and handled the large number of requests asynchronously: we used an HTTP client that does not block the thread, as well as HTTP keep-alive, which allows several requests to be sent over a single TCP connection.
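A non-blocking fan-out with a deadline can be sketched as follows. This is an illustration under stated assumptions, not the project's client: it relies on the standard CompletableFuture API of Java 9 and later, the transport is abstracted as a function returning a future, and the names FanOut and askAll are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;
import java.util.stream.Collectors;

final class FanOut {
    /**
     * Sends the request to every DSP client without blocking the calling thread
     * and gives each one at most timeoutMs to answer. DSPs that are late or fail
     * are simply left out of the result.
     */
    static <Q, A> CompletableFuture<List<A>> askAll(
            Q request,
            List<Function<Q, CompletableFuture<A>>> dspClients,
            long timeoutMs) {
        List<CompletableFuture<Optional<A>>> pending = new ArrayList<>();
        for (Function<Q, CompletableFuture<A>> client : dspClients) {
            CompletableFuture<Optional<A>> answer = client.apply(request)
                .<Optional<A>>thenApply(Optional::of)
                // A DSP that misses the deadline counts as "no answer".
                .completeOnTimeout(Optional.empty(), timeoutMs, TimeUnit.MILLISECONDS)
                // A DSP that fails also counts as "no answer".
                .exceptionally(error -> Optional.empty());
            pending.add(answer);
        }
        return CompletableFuture
            .allOf(pending.toArray(new CompletableFuture<?>[0]))
            .thenApply(ignored -> pending.stream()
                .map(CompletableFuture::join) // already completed, so this does not block
                .flatMap(Optional::stream)
                .collect(Collectors.toList()));
    }
}
```

The per-DSP deadline would have to be noticeably shorter than the 300 ms budget, since the exchange still needs time to pick a winner and reply to the SSP.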
Software components are deployed on hardware leased from a hosting provider, since the task did not allow using the cloud: virtual machines there share oversubscribed resources (allocating the needed resources may take time, which we do not have). At the moment, the Ad Exchange uses four physical servers, one of which is a spare (for seamless updates and the like).
The open source Apache Kafka is used as the message broker. It fits our "one subscriber, many publishers" model well, although it had to be tuned a little so that duplicate messages did not arrive.
Each server is designed to process on the order of 10 thousand requests per second in normal mode (this figure was set when the solution was designed). The average load is now 15-20 thousand requests per second, and at peak the request flow reached 40 thousand per second for several hours; the Ad Exchange coped with it without problems.
Requests are distributed between servers by the software load balancer nginx, configured for our task. In our experience, nginx can handle up to 60-70 thousand requests per second without a separate hardware balancer. If the Ad Exchange load exceeds this threshold in the future, we plan to purchase a hardware balancer that will distribute requests among several identical nginx instances.
Given the continuous growth in load, a monitoring system that is part of the Ad Exchange keeps track of what is happening.
Storage
Given our bet on analytics in request distribution, the database is an integral part of our Ad Exchange. The system stores information about bids, bidders and completed deals.
It makes no sense to keep this volume of data for the Ad Exchange's entire lifetime, so the storage has a multi-layered architecture. All auction data is stored for a week. Higher-level intermediate aggregates are built from it and stored for several months. Final aggregates are then built from the intermediate ones and used in long-term analytics and for reconciliation with SSPs and DSPs. Among other things, these aggregates record how many bids were made and how much money the exchange will pay an SSP or expects to receive from a DSP.
Final aggregates are stored for the lifetime of the Ad Exchange.
Separate services collect the analytics and build the aggregates.
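The step from raw auction data to an intermediate aggregate might look like the sketch below. The article does not describe the real schema or granularity, so the record fields, the hourly bucket and the names AuctionRecord, HourlyAggregate and Rollup are all assumptions.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical raw auction record; the real schema is not described in the article. */
final class AuctionRecord {
    final Instant time;
    final String dspId;
    final long dueFromDspMicros; // what the exchange expects to receive from the DSP
    final long dueToSspMicros;   // what the exchange will pay the SSP

    AuctionRecord(Instant time, String dspId, long dueFromDspMicros, long dueToSspMicros) {
        this.time = time;
        this.dspId = dspId;
        this.dueFromDspMicros = dueFromDspMicros;
        this.dueToSspMicros = dueToSspMicros;
    }
}

/** Intermediate aggregate: one row per DSP per hour (the granularity is an assumption). */
final class HourlyAggregate {
    long bids;
    long dueFromDspMicros;
    long dueToSspMicros;
}

final class Rollup {
    /** Folds raw records into hourly rows keyed by "hour|dspId". */
    static Map<String, HourlyAggregate> hourly(List<AuctionRecord> records) {
        Map<String, HourlyAggregate> rows = new TreeMap<>();
        for (AuctionRecord raw : records) {
            String key = raw.time.truncatedTo(ChronoUnit.HOURS) + "|" + raw.dspId;
            HourlyAggregate row = rows.computeIfAbsent(key, k -> new HourlyAggregate());
            row.bids++;
            row.dueFromDspMicros += raw.dueFromDspMicros;
            row.dueToSspMicros += raw.dueToSspMicros;
        }
        return rows;
    }
}
```

Once such rows exist, the raw records behind them can be dropped after their retention period, and the same folding can be repeated on the hourly rows to produce the final aggregates.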
For the storage to keep up with the speed of the system itself, we had to work on it as well. In particular, for some time we struggled with the hosting provider because transaction data simply was not being written to the database in time. It turned out that a hardware problem with the RAID array was to blame. After it was replaced, we managed to squeeze 90 thousand queries per second out of Postgres (inserts into the database).
The rest of the Ad Exchange is stateless, which makes future horizontal scaling easy. It stores no request data, at most the derived information about which DSPs to choose. So we can add new servers to process requests as needed.
Traffic filtering
The key element of the system, which reduces the load and makes it possible to meet the timeouts specified by the customer, is traffic filtering.
By the nature of its task, any Ad Exchange:
- accepts requests from the SSP;
- conducts an auction (sends requests to several DSPs, compares the prices offered, identifies the winner);
- agrees the win with the SSP (reports the winner's price minus its own commission and waits for a response with the final impression price);
- completes the transaction (informs the winning DSP of its victory and handles the user traffic).
Smart request distribution in our Ad Exchange kicks in at the start of the auction.
When we receive a request from an SSP with certain information (IP, user agent), we enrich it with information accumulated in the system: known data about the user, the list of DSPs to which similar requests were sent, their responses, and so on. This is needed to select the most advantageous combination of DSPs for each request. Thanks to this selection, the system avoids sending requests to DSPs that do not bid or whose bids are too low. To do this, a separate service builds, in real time, maps of how each DSP responds to requests (these maps are stored in Redis).
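A minimal sketch of such a response map and the selection based on it follows. An in-memory map stands in for Redis here, and the notion of a traffic "segment", the two thresholds and the names DspStats and ResponseMap are assumptions for illustration; the article does not say which features or rules the real system uses.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Running statistics for one DSP on one kind of traffic. */
final class DspStats {
    private long requests;
    private long bids;
    private double bidSum;

    synchronized void record(boolean bid, double price) {
        requests++;
        if (bid) {
            bids++;
            bidSum += price;
        }
    }

    synchronized double bidRate() {
        return requests == 0 ? 0.0 : (double) bids / requests;
    }

    synchronized double averageBid() {
        return bids == 0 ? 0.0 : bidSum / bids;
    }
}

/** In-memory stand-in for the response map that the article keeps in Redis. */
final class ResponseMap {
    private final Map<String, DspStats> stats = new ConcurrentHashMap<>();

    /** Records how a DSP reacted to one request of the given (hypothetical) segment. */
    void record(String dspId, String segment, boolean bid, double price) {
        stats.computeIfAbsent(dspId + "|" + segment, k -> new DspStats()).record(bid, price);
    }

    /**
     * Keeps DSPs that bid often enough and high enough on this segment.
     * DSPs with no history are kept so that data about them can accumulate.
     */
    List<String> select(List<String> dspIds, String segment,
                        double minBidRate, double minAverageBid) {
        List<String> chosen = new ArrayList<>();
        for (String dspId : dspIds) {
            DspStats s = stats.get(dspId + "|" + segment);
            if (s == null || (s.bidRate() >= minBidRate && s.averageBid() >= minAverageBid)) {
                chosen.add(dspId);
            }
        }
        return chosen;
    }
}
```

Letting unknown DSPs through is a design choice of this sketch: without some exploratory traffic a newly connected partner would never build up a history.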
In parallel, we check the status of each DSP: if the share of responses arriving within the timeout drops, the system automatically reduces the number of requests to that DSP. As soon as the load on the DSP eases (and the share of responses arriving in time recovers), the number of requests gradually returns to its previous level.
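One way to express this back-off and gradual recovery is sketched below. The window size, the halving on trouble, the 0.1 recovery step and the 5% floor are assumptions chosen for illustration, and DspThrottle is a hypothetical name; the article does not give the actual rules.

```java
import java.util.concurrent.ThreadLocalRandom;

/** Adjusts the share of requests sent to one DSP based on its on-time answers. */
final class DspThrottle {
    private final int window;          // number of answers per evaluation window
    private final double healthyShare; // on-time share considered healthy, e.g. 0.9
    private double sendShare = 1.0;    // fraction of requests currently sent to the DSP
    private int total;
    private int onTime;

    DspThrottle(int window, double healthyShare) {
        this.window = window;
        this.healthyShare = healthyShare;
    }

    /** Call once per request sent to the DSP. */
    synchronized void record(boolean answeredInTime) {
        total++;
        if (answeredInTime) {
            onTime++;
        }
        if (total == window) {
            double share = (double) onTime / total;
            if (share < healthyShare) {
                // Back off quickly, but keep a trickle so recovery can be noticed.
                sendShare = Math.max(0.05, sendShare * 0.5);
            } else {
                // Recover gradually.
                sendShare = Math.min(1.0, sendShare + 0.1);
            }
            total = 0;
            onTime = 0;
        }
    }

    synchronized double sendShare() {
        return sendShare;
    }

    /** Randomly decides whether the next request should go to this DSP. */
    boolean shouldSend() {
        return ThreadLocalRandom.current().nextDouble() < sendShare();
    }
}
```

Cutting fast and restoring slowly keeps a struggling DSP from being pushed straight back into overload the moment it starts answering again.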
Among the DSPs that responded on time, we hold an internal auction: we pick the best offer and send it to the SSP. No more than 300 ms pass between the SSP's request and our response, in line with the industry requirement.
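The internal auction step can be sketched like this, assuming the highest bid wins and that the exchange's commission is a fixed percentage expressed in basis points. Both are assumptions, as are the names Bid and InternalAuction; prices are kept as integer micro-units to avoid floating-point rounding.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/** Hypothetical bid that arrived from a DSP within the deadline. */
final class Bid {
    final String dspId;
    final long priceMicros; // price in millionths of a currency unit

    Bid(String dspId, long priceMicros) {
        this.dspId = dspId;
        this.priceMicros = priceMicros;
    }
}

final class InternalAuction {
    /** Picks the highest bid among the DSPs that answered in time, if there is one. */
    static Optional<Bid> winner(List<Bid> onTimeBids) {
        return onTimeBids.stream().max(Comparator.comparingLong((Bid b) -> b.priceMicros));
    }

    /** Price reported to the SSP: the winning bid minus the exchange's commission. */
    static long priceForSsp(Bid winner, int commissionBasisPoints) {
        return winner.priceMicros - winner.priceMicros * commissionBasisPoints / 10_000;
    }
}
```

For example, with a commission of 1000 basis points (10%), a winning bid of 2,000,000 micro-units would be passed to the SSP as 1,800,000.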
Since we pass our bid on to the SSP, which runs its own auction, we need to track which bids win there. The auction server records these wins at the next stage, when it processes user traffic. This enriches the DSP response map with new data (along with the collected information about the end user).
Comparing data obtained at the auction stage with parameters known from user traffic makes it possible to screen out bots (ad clickers, search bots, etc.). DSPs do not pay for such traffic, and without its own filtering system it turns into losses for the customer, which have to be covered out of the margin.
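A very simple instance of such a comparison is sketched below. The article does not list the actual rules, so the specific checks (a missing or crawler-like user agent, and a mismatch between the IP or user agent seen at the auction and those seen on the impression) are assumptions, as is the name ImpressionCheck.

```java
import java.util.Locale;
import java.util.Objects;

final class ImpressionCheck {
    /**
     * Flags an impression whose traffic-stage parameters look automated or do not
     * match what was declared in the bid request at the auction stage.
     */
    static boolean looksSuspicious(String auctionIp, String auctionUserAgent,
                                   String trafficIp, String trafficUserAgent) {
        if (trafficUserAgent == null || trafficUserAgent.trim().isEmpty()) {
            return true; // real browsers always send a user agent
        }
        String ua = trafficUserAgent.toLowerCase(Locale.ROOT);
        if (ua.contains("bot") || ua.contains("crawler") || ua.contains("spider")) {
            return true; // self-declared crawlers
        }
        // The impression should come from the same client that triggered the auction.
        return !Objects.equals(auctionIp, trafficIp)
            || !Objects.equals(auctionUserAgent, trafficUserAgent);
    }
}
```

A strict IP comparison would also flag some legitimate users, for example on mobile networks where the address changes between requests, so a production rule would probably need to be softer than this.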
It should be noted that bot traffic filtering was not launched right away. But after simple blocking rules were switched on, the margin grew by about 50%.
By the way, in addition to the automatic traffic filtering tools, our system lets the customer manually change the threshold values of a number of parameters, thereby adjusting the margin.
User traffic itself is critical for us, but processing it no longer has to fit within 300 ms. It goes through a separate processing system, which may delay the user slightly but will not lose the request.
To keep the solution stable, we introduced a subsystem that, knowing the current Ad Exchange load, "cuts off" auction requests that the system physically cannot process. This protects the system from uncontrolled growth in load from the SSPs.
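Such load shedding can be sketched with a fixed number of in-flight slots. This is a simplification under assumptions: the real subsystem is described only as reacting to the current load, the handler here is synchronous for brevity, and the name LoadShedder and the slot-count limit are hypothetical.

```java
import java.util.Optional;
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/** Rejects new auctions immediately once too many are already in flight. */
final class LoadShedder {
    private final Semaphore slots;

    LoadShedder(int maxInFlight) {
        this.slots = new Semaphore(maxInFlight);
    }

    /**
     * Runs the auction if a slot is free. Otherwise returns empty right away
     * instead of queueing, so an overloaded server does not build up a backlog
     * of requests that would miss the timeout anyway.
     */
    <T> Optional<T> tryRun(Supplier<T> auction) {
        if (!slots.tryAcquire()) {
            return Optional.empty();
        }
        try {
            return Optional.ofNullable(auction.get());
        } finally {
            slots.release();
        }
    }
}
```

Rejecting early is cheaper for everyone than answering late: the SSP gets a quick "no bid" and the capacity goes to requests that can still be served within the deadline.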
Outlook
To date, the Ad Exchange we built is running and brings in a good profit. We support it and integrate new partners (DSPs and SSPs) as needed. In total, several dozen systems have already been integrated. Each such integration involves not only a software connection but also comprehensive testing of the service, because under heavy load the problems of a connected service may affect other partners.
In general, the market is moving toward SSPs and DSPs connecting directly, which would make exchanges unnecessary. But integration is limited by the capabilities of the SSPs and DSPs. Although an open, documented API exists (the OpenRTB protocol), it is not yet universally adopted in the market. For example, a player as large as AppNexus added OpenRTB support only quite recently.
Essentially, an Ad Exchange is a liquidity provider, so the solution is unlikely to lose its relevance in the near future. Moreover, the exchange model is gaining popularity in the rest of the advertising market.
Article author: Nikolay Eremin
P.S. We publish our articles on several Runet sites. Subscribe to our pages on VK, FB or our Telegram channel to learn about all of our publications and other news from Maxilect.