
Airbnb pricing algorithm secrets





What would you charge to let strangers stay in your home? How much would you pay to stay in someone else's? Would you pay more or less for a planned vacation than for a spontaneous trip?

These questions are not easy to answer. At one point we realized that by forcing landlords and guests to answer them, we were shrinking our pool of active listings. In focus groups, we watched people list their homes for rent on our site, and most got stuck when it came time to set a price. Many looked at what nearby listings charged, opening a bunch of browser tabs and trying to compare their own offer with similar ones. Some arrived with a specific goal in mind, perhaps earning a little money to help pay the mortgage or fund a vacation; they set a price based on that goal, without regard to the actual market. And some, unfortunately, simply gave up and never set a price at all.



We concluded that landlords needed a convenient automated tool to help them decide on a rental price. Development began in 2012, and we still refine it periodically. This summer we introduced dynamic pricing: price tips are recalculated daily based on current market conditions. We tuned the algorithm to account for unusual, even surprising, features of a listing. We also introduced what we believe is a unique machine learning mechanism that lets the system not only learn from experience but also, when necessary, draw on a bit of “human” intuition.



Pricing and bidding algorithms are used by many online services. eBay, for example, shows a list of similar items that have sold and suggests a price on that basis. But eBay's problem is relatively simple: it doesn't matter where sellers and buyers are located, or exactly when an item is sold or bought. Geography and time do matter for Uber and Lyft. But those companies set prices themselves, and they have no need for transparency in how prices are determined.


We face a much harder problem. Each of the more than one million listings on our site is unique: each has its own address, size, and decor. Each landlord has their own ideas about hosting, cooking for guests, or acting as a guide. Add to this all sorts of holidays and events, some regular, like seasonal weather changes, and others rare and unpredictable.



Three years ago, we began building a tool that suggests prices to owners based on a listing's characteristics: the number of rooms and beds, the neighborhood, local amenities, and much more. The first version was released in 2013 and for the most part did its job well. But it had certain limitations. For example, the algorithms could not change the way they determined prices. If they concluded that, say, the Pearl District in Portland had a certain price range, or that waterfront homes should cost so much more than homes a block inland, they kept using those metrics until someone corrected them manually. Nor was the pricing dynamic: the tips did not depend on the time of year or on demand in the area.



Starting in mid-2014, we set out to improve the tool. We wanted it to learn from its mistakes and successes through interactions with users. We also wanted it to adapt to current demand, lowering prices when necessary to fill vacant listings or raising them when demand is high.



Three situations



To assess the difficulties that we had to overcome, let's consider three different situations.



Imagine that you live in Brazil and the World Cup is about to begin. Tourists and fans from all over the world will pour into your city. You have a spare room and want to earn some extra money by renting it to football fans. To suggest a reasonable price, our tool had to take several factors into account. First, the FIFA World Cup is an exceptionally rare event for the country, so we had no historical data to build on. Second, every hotel room was already booked, so there was a huge imbalance between supply and demand. Third, arriving tourists had already paid a lot for international flights, so they were surely prepared to spend generously on accommodation. All of this had to be considered on top of the basic criteria, such as the size of the home, its location, and the number of rooms.



The second situation: you have inherited a castle in the Scottish Highlands, with all the trimmings, from your rich Uncle MacLeod. You regularly have to pay a tidy sum to clean the moat, keep the distillery running, feed the falcons, and so on. To ease the financial burden, you decide to turn one of the castle's towers into a rental apartment. Unlike the World Cup case, there is something to build on here, given the number of castles in the area. For some criteria, such as the seasonality of tourism, statistics go back many years. And you can estimate with high confidence that supply and demand are currently balanced. Yet your castle is still one of a kind. How does a system put a monetary value on how attractive that uniqueness is to visitors?



And finally, the third example. Suppose you own a typical two-room apartment in Paris. You want to take a few weeks of vacation in August and head to the south of the country, to Montpellier. Your listing can be compared with a huge number of others, so the price is fairly easy to determine. But after the initial listing, you see that people are showing interest in your place, and you decide to raise the price a bit to earn more. This is a delicate matter: what happens if you raise the rent too much, or raise it too close to the dates you want booked? You might end up with no guests and earn nothing at all. But if you don't dare to experiment and settle for the recommended, fairly low price, you may spend months reproaching yourself for indecision and worrying about lost profit. How can we give owners the best information so that they are left with neither uncertainty nor regret?



These are the kinds of problems we ran into. Still, we wanted to build an easy-to-use tool, with transparent rules, that would help in making pricing decisions.



Architecture



By our reckoning, the architecture of the tool should have been surprisingly simple. When an owner adds a listing to our site, the system extracts what we call the listing's “key parameters.” It then compares the listing with others in the same area that have the same or similar parameters, selects those that have been successfully booked, accounts for seasonality, and uses average values to calculate a recommended rental price.
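The pipeline described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the production system: the `Listing` fields, the "capacity within one guest" similarity rule, and the single `seasonal_factor` multiplier are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Optional


@dataclass
class Listing:
    # Hypothetical "key parameters"; the real set is much larger.
    neighborhood: str
    room_type: str   # e.g. "entire_home" or "private_room"
    capacity: int    # how many guests can sleep there
    price: float     # nightly price
    booked: bool     # whether it was successfully booked at that price


def suggest_price(new: Listing, listings: List[Listing],
                  seasonal_factor: float = 1.0) -> Optional[float]:
    """Average the prices of booked, similar listings in the same area."""
    comparables = [
        other.price
        for other in listings
        if other.booked
        and other.neighborhood == new.neighborhood
        and other.room_type == new.room_type
        and abs(other.capacity - new.capacity) <= 1  # assumed similarity rule
    ]
    if not comparables:
        return None  # nothing comparable to build on
    return round(mean(comparables) * seasonal_factor, 2)
```

A listing with no booked comparables gets no tip at all, which is exactly the gap that the castle and World Cup situations expose.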



The difficulties began as soon as we tried to define the key parameters. No two listings are identical in decor or layout. Listings are scattered throughout a city, and many are not apartments or houses at all but castles, huts, and yurts. So we decided that our tool should focus on three main types of data: similarity, novelty, and location.



To determine similarity, we started by collecting every quantifiable parameter. We then analyzed which of them correlated best with the prices guests are willing to pay for specific listings. We looked at how many people a place can sleep, whether the entire property or just a room is for rent, the type of housing (apartment, castle, yurt), the number of reviews, and much more.



The most surprising finding concerned the number of reviews. In our observation, people are willing to pay more for listings with many reviews. Amazon, eBay, and other services rely on the ratings given in reviews to help users choose products and sellers, but there the number of reviews has no clear effect on price. On our site, a listing with even a single review is priced very differently from a similar listing with none.



We chose the “novelty” parameter, meaning how recent the data is, because markets change often, especially in tourism. The business is highly seasonal, so an estimate has to reflect today's conditions; at the same time last year, or even last month, things may have been quite different.



In mature markets such as London or Paris, all this data is fairly easy to collect thanks to a long booking history. New and developing markets we divide into groups according to their size, level of tourism, and the growth rate of listings on our site. This lets us compare a listing not only with others in the same city but also with similar listings in other markets. For example, if someone lists a home in Kyoto for the first time, we can compare it with listings in Tokyo, Okayama, or even Amsterdam.



Finally, we evaluate a listing's location, which is much harder for us than for hotels. Hotels are usually clustered in a few main districts, while our listings are scattered across the entire city.



Early versions



At first, our pricing algorithm drew a circle centered on the address the owner provided. The radius was chosen based on similar listings nearby. For a while this approach worked well, but then we discovered a critical flaw. Take an apartment in the center of Paris, say, right by the Pont Neuf, near the Louvre and the Tuileries Garden. The circle around it also takes in very different houses and apartments on the other side of the river, which are much less expensive and prestigious. In Paris, the price of housing varies greatly depending on which bank of the Seine it is on. In other cities the divisions are even more pronounced. In London, for example, apartments in the Greenwich area can cost more than twice as much as housing in the docks area across the river.







An example of price changes in Austin, Texas during the SXSW and Austin City Limits festivals.



As a result, we had to draw schematic maps of neighborhoods and districts in major cities around the world, taking local conditions into account. This let us cluster listings very precisely around major geographic features and structures: rivers, transport routes, and so on. Now, on a weekend in October, a double room in Greenwich costs about $130 per night, while similar accommodation across the river costs about $60.
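The difference between the two approaches, a circle around the address versus a hand-drawn neighborhood, can be illustrated with a small Python sketch. Everything here is assumed for the example: the coordinates are rough, the `neighborhood` labels are invented, and the distance is a plain haversine formula rather than anything taken from the actual service.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometers."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def comparables_in_circle(target, listings, radius_km):
    """Early approach: everything within a fixed radius counts as comparable."""
    return [
        x for x in listings
        if haversine_km(target["lat"], target["lon"], x["lat"], x["lon"]) <= radius_km
    ]


def comparables_in_neighborhood(target, listings):
    """Later approach: only listings in the same mapped neighborhood count."""
    return [x for x in listings if x["neighborhood"] == target["neighborhood"]]


# Illustrative data only, not real listings.
target = {"lat": 48.857, "lon": 2.341, "neighborhood": "right_bank_center"}
listings = [
    {"id": "a", "lat": 48.860, "lon": 2.340, "neighborhood": "right_bank_center"},
    {"id": "b", "lat": 48.854, "lon": 2.340, "neighborhood": "left_bank_center"},
    {"id": "c", "lat": 48.900, "lon": 2.340, "neighborhood": "north_edge"},
]

in_circle = [x["id"] for x in comparables_in_circle(target, listings, 0.5)]
in_neighborhood = [x["id"] for x in comparables_in_neighborhood(target, listings)]
```

With a half-kilometer radius the circle picks up both listing "a" and listing "b" across the river, while the neighborhood filter keeps only "a".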



We were very pleased with ourselves when we rolled out the next version of the algorithm. After all, we had fixed a bug that made the system recommend a price of $99 for many listings regardless of their characteristics. The bug did not last long or affect every region, but had we left things as they were, users would sooner or later have started to question whether the tips could be trusted.



Over time we improved our algorithms repeatedly, teaching them to analyze thousands of different factors and to evaluate geographic location in fine detail. But the tool still had two flaws. First, its tips were static. True, it could assess the impact of some local events and of the tourist season, recommending different prices for the same property over the course of a year. But it did not recalculate prices as the booking date approached, as airlines do, and it did not vary prices in response to market dynamics.



Second, the tool itself remained static. Its estimates drew on more and more historical data, but the algorithm itself did not improve.



New algorithm



Last summer, we launched a project designed to solve both problems. First, we wanted dynamic pricing, so that our tool could predict a price for each day in the future. The approach is far from new: airlines have used it for decades, changing prices, often in real time, to fill their planes and maximize revenue per passenger. Today large hotel chains are following a similar path, analyzing statistics from all their properties. And since by then we too had accumulated a great deal of information about rental markets around the world, we decided to pursue dynamic pricing as well, despite its higher demands on computing resources.
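To make "a price for each day in the future" concrete, here is a minimal Python sketch. It assumes a deliberately simple model in which a base price is multiplied by a per-night demand factor and a lead-time factor; the weekend uplift and last-minute discount below are invented numbers for illustration, not values from the actual service.

```python
from datetime import date, timedelta


def daily_tips(base_price, today, horizon_days, demand_factor, lead_time_factor):
    """Return a {night: suggested price} mapping for the next `horizon_days` nights."""
    tips = {}
    for days_ahead in range(horizon_days):
        night = today + timedelta(days=days_ahead)
        price = base_price * demand_factor(night) * lead_time_factor(days_ahead)
        tips[night] = round(price, 2)
    return tips


def weekend_demand(night):
    # Hypothetical rule: Friday (4) and Saturday (5) nights are in higher demand.
    return 1.2 if night.weekday() in (4, 5) else 1.0


def last_minute_discount(days_ahead):
    # Hypothetical rule: discount nights less than three days away to fill them.
    return 0.9 if days_ahead < 3 else 1.0


tips = daily_tips(100.0, date(2024, 1, 1), 7, weekend_demand, last_minute_discount)
```

Recomputing such a table every day is what makes the pricing dynamic: the same night gets a different tip as it moves closer and as demand estimates change.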



Making the algorithms self-improving was harder, especially since we wanted the system's rules to be transparent and understandable to humans, so that we could adjust them if necessary. Complex, powerful machine learning systems that could handle our problems often work in opaque ways. Google Brain, for example, which learned to find cats in online videos, contains several layers of algorithms that classify data. A person is practically unable to reconstruct the chain of “conclusions” by which Google Brain decides whether or not it is looking at a cat.



We chose a machine learning model called a “classifier.” It analyzes all the parameters of the listings in the database in light of current market conditions and then predicts how successful a given listing will be. The system calculates a recommended price from hundreds of parameters, such as whether breakfast is included or whether there is a bath. We then began training the system, having it analyze how often and how quickly a given property was rented, and at what price. In this way the algorithm evaluated both the soundness of the recommended prices and the probability that a listing would be rented.



Of course, owners are free to ignore the tool's advice and set a price higher or lower than recommended. In that case, the system recalculates the probability of a booking at the new price and later compares its forecast with what actually happened. The results of that comparison feed into future recommendations.
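A rough Python sketch of this step, assuming the classifier is a simple logistic model: the signal names, the weights, and the use of log loss to score a forecast against the real outcome are all assumptions made for the example, since the text does not specify the model's internals.

```python
import math

# Hypothetical weights: pricing above the tip lowers the odds of a booking,
# including breakfast raises them slightly.
WEIGHTS = {"price_over_tip": -4.0, "breakfast_included": 0.3}


def signals_for(price, tip, breakfast):
    """Turn an owner's chosen price into model inputs."""
    return {
        "price_over_tip": price / tip - 1.0,  # 0.0 when the owner follows the tip
        "breakfast_included": 1.0 if breakfast else 0.0,
    }


def booking_probability(signals, weights, bias=0.0):
    """Logistic model: weighted sum of signals squashed into (0, 1)."""
    z = bias + sum(weights[name] * value for name, value in signals.items())
    return 1.0 / (1.0 + math.exp(-z))


def log_loss(predicted, booked):
    """Score a forecast against what actually happened (lower is better)."""
    return -math.log(predicted if booked else 1.0 - predicted)


p_at_tip = booking_probability(signals_for(100.0, 100.0, True), WEIGHTS)
p_raised = booking_probability(signals_for(130.0, 100.0, True), WEIGHTS)
```

When the owner raises the price from the tip of 100 to 130, the predicted booking probability drops; once the night has passed, `log_loss` measures how far the forecast was from the actual outcome.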







The algorithm uses rental price statistics to group listings into microclusters by degree of similarity.



This is the learning part of the algorithm. Knowing how successful a recommended price turned out to be, the system adjusts the weights assigned to different listing parameters, or “signals,” as we call them. We started with a number of assumptions: for example, that geographic location is extremely important and a hot tub is not. We also revised the set of parameters, dropping some and adding others. Some of the new signals, such as the number of days remaining until check-in, feed the dynamic pricing function.
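One common way to adjust weights from outcomes is a gradient step on a logistic model, sketched below in Python. This is an assumed stand-in, not a description of the real training procedure; the signal names, starting weights, and learning rate are hypothetical.

```python
import math


def predict(weights, signals):
    """Predicted booking probability from weighted signals (logistic model)."""
    z = sum(weights.get(name, 0.0) * value for name, value in signals.items())
    return 1.0 / (1.0 + math.exp(-z))


def update_weights(weights, signals, booked, learning_rate=0.1):
    """Nudge each weight toward the observed outcome; returns a new dict."""
    error = (1.0 if booked else 0.0) - predict(weights, signals)
    updated = dict(weights)
    for name, value in signals.items():
        updated[name] = updated.get(name, 0.0) + learning_rate * error * value
    return updated


# Starting assumptions: location matters a lot, a hot tub not at all.
weights = {"location_score": 1.0, "hot_tub": 0.0, "days_until_checkin": 0.0}
signals = {"location_score": 0.5, "hot_tub": 1.0, "days_until_checkin": 0.2}

after_booking = update_weights(weights, signals, booked=True)
after_vacancy = update_weights(weights, signals, booked=False)
```

Because the weights stay in a plain named dictionary, a person can read them, and override them by hand, which fits the goal of keeping the rules transparent.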



We added some signals only because analysis of the data revealed their importance. For example, certain photos make a booking more likely. But the overall picture may surprise you: professionally shot photos of stylish, brightly lit rooms attract far less attention from users than pictures of cozy bedrooms decorated in warm colors. In the future, we plan to introduce a mechanism that continuously and automatically refines the weights of such signals to improve the accuracy of price recommendations.



When necessary, we can intervene in the calculation of the weights if we believe this will improve the accuracy of the recommendations. Our system can produce a list of the factors and weights that affect the rental price of each listing. If we decide that some aspect is not fully represented, we manually add the necessary signals to the model. For example, we know that in Seattle a listing without Wi-Fi has very little chance of being rented at any price. We do not need to wait for the algorithm to figure this out on its own; we can adjust the metrics ourselves.



Our system also continually redraws its maps of areas that share a set of characteristics. We do not rely on existing official maps but work from our own accumulated statistics on prices and bookings. This approach also lets us identify neighborhoods we had not previously considered. They may bring together many popular listings, and their boundaries do not necessarily coincide with municipal ones. Smaller neighborhoods can also be singled out within a “big” district when local features there attract tourists.



Today, many homeowners who list on our site use our price recommendations every day. But we believe the technology can do much more. That is why we have released Aerosolve, the platform on which it is built, under an open source license. It should make it easier for many developers to get started with machine learning. With a clear view of how the platform works, you can approach an unfamiliar field without fear and experiment freely with the tools. For example, using Aerosolve we built a system that paints pictures in the pointillist style. We invite everyone who is interested to work with the platform on their own. Who knows how many great products the community will build on it.

Source: https://habr.com/ru/post/267127/


