Hello! I'm sure many of you have at some point bought a jersey, a ball, sneakers, or some other sports gear in our stores, but few know what Sportmaster is from a technical point of view.
A small sample of Sportmaster circa 2003, from web.archive.org

My name is Dmitry, I'm a senior Java developer at Sportmaster, and today I would like to tell you about our online store: the path it took to become what you know today, how we started, how we developed, what worked and what didn't, the problems we face today, and our plans for the future. Interested? Welcome under the cut!
Our company's web presence dates back to 1999 and the first Sportmaster site, which was just a business card and a product catalog for wholesale buyers. The company's online store itself has existed since 2001. At that time the company did not have its own online-project development team, and the store went through several home-grown platforms (I no longer remember how many). The first relatively stable solution was built for us by yet another integrator in 2011, in PHP, on top of the 1C-Bitrix CMS. The site was fairly unpretentious: essentially the boxed Bitrix functionality with some customization around checkout. Hardware-wise, the starting configuration was 2 application servers and one database server.
Meanwhile, the company was actively building up its own competencies in online sales, primarily on the business side, which, I must say, quickly acquired a taste for it, and the development team had to grow rapidly in every sense to keep up with its needs. Less than a year later, three teams were responsible for developing and supporting the site: the integrator itself, Sportmaster's internal team, which at that time numbered literally a few people, and another contractor, brought in because the integrator could not provide as many people as we needed at that point.
What problems did we have at that time? There were plenty, but the main one was the unstable operation of our online store.
The site could go down simply because the business ran some sort of newsletter that brought ~2,000-2,500 people to the site, and once, as I recall, an advertising banner on Yandex knocked us out cold. Of course, such things are unacceptable: it is not just lost revenue, it is also the company's image. In short, we understood that something had to change. First of all, we realized that standard solutions would not cope with our loads (not huge at the time, but not small either). Back then we had ~1,000 visitors online in normal mode and ~2,500 at peaks, plus plans to grow x2 every year.
We immediately scaled up the hardware: added 2 more application servers and built a cluster of 2 database servers. Our stack at the time was nginx, MySQL, and PHP. In parallel, we tried to optimize the existing solution: we looked for bottlenecks and rewrote everything we could. Since the database was our bottleneck and always "died" first, we decided to offload it as much as possible. We introduced Sphinx for full-text search and for rendering product tiles with facets over the selected filters, and hooked up caches. And voila: the loads that had been fatal for us yesterday, we now handled with ease.
At the same time, a pilot was launched in parallel to carry out a technological overhaul of the site: a move to a fundamentally different platform. There were plenty of ideas: at the time, personalization of everything and everyone, personal recommendations, mailings, discounts, and other useful things were gaining popularity, and we, of course, wanted to use all of it. We looked at what the market had to offer and bought the most expensive platform on the principle of "more expensive means cooler." The implementation was planned with the help of an integrator, while we kept supporting and developing the old online store until the new one on the new platform went into production.
But since the pace of functional development of the existing site was very high, we decided to start implementing the new e-commerce platform on the Austin online store, which at the time was smaller and simpler and was also serviced by the Sportmaster IT team. In the process we realized that the thing was hefty and feature-packed, but technologically outdated, and finding people to fully implement it turned out to be a huge problem. In addition, the sizing done before the project started had promised much lower requirements for hardware and the number of licenses; reality turned out to be much harsher. In the end we understood one thing: we would not build Sportmaster on it. And since the team for the migration to the platform was already being recruited, the guys decided to start prototyping their own solution based on the requirements the business had set for the new platform.
The technology stack was chosen as follows: Java, Spring, Tomcat, ElasticSearch, Hazelcast.
As a result, by roughly the end of 2014 we had a new, completely self-written version of the online store ready, and we successfully switched to it. That was the first version of the site you see today. Naturally, today's version is much more functional and technologically advanced, but the underlying platform is the same.
Main tasks
Of course, when we talk about a large online store, we are talking about the ability to cope not only with everyday but also with peak loads, and to stay stable for the business and for end users.
The main approaches here are horizontal scaling and caching data at different levels. And so, just as some time ago, we decided to optimize access to our data. But we cannot use regular page caching. At all. This is a business requirement, and a quite reasonable one: if you show a site user the wrong price or the wrong availability of a product at a given moment, it will most likely lead to an abandoned purchase and a drop in customer loyalty.
It would be one thing if the client ordered 15 pairs of socks at 299 rubles each and found out in the store that there were actually only 14 pairs at 300 rubles each. You can somehow live with that: accept it, buy them, and go on living with that little scar on your soul. But if the discrepancies are serious, or you were after a specific size and "bought" it while reading the reviews of happy owners of checkered shorts, it gets sadder. That means both losing a customer who was satisfied up to that point, and losing the time and money of the call center, which that customer will call to find out what happened and why.
Consequently, the user must always see the latest price and the most up-to-date data on stock balances, which means our caches have to be smart and know when the data in the database changes. For caching we use Hazelcast.
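To make that idea concrete, here is a minimal sketch of what such a "smart" cache could look like on top of a Hazelcast IMap (assuming Hazelcast 4+, where IMap lives in com.hazelcast.map). The map name, key type, and the PriceRepository interface are hypothetical, purely for illustration; the point is that the entry is overwritten the moment the price changes in the database, so page rendering always picks up a fresh value, with a short TTL as a safety net.

```java
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.map.IMap;

import java.math.BigDecimal;
import java.util.concurrent.TimeUnit;

public class PriceCache {
    // Distributed map shared by all application servers in the cluster.
    private final IMap<Long, BigDecimal> prices;

    public PriceCache(HazelcastInstance hz) {
        this.prices = hz.getMap("prices"); // map name is illustrative
    }

    /** Read-through: return the cached price or load it from the database. */
    public BigDecimal priceFor(long skuId, PriceRepository repository) {
        BigDecimal cached = prices.get(skuId);
        if (cached != null) {
            return cached;
        }
        BigDecimal fresh = repository.loadPrice(skuId);
        // Short TTL as a safety net in case an invalidation is ever missed.
        prices.put(skuId, fresh, 15, TimeUnit.MINUTES);
        return fresh;
    }

    /** Called whenever the price changes in the database. */
    public void onPriceChanged(long skuId, BigDecimal newPrice) {
        prices.put(skuId, newPrice); // overwrite, so readers never see a stale value
    }

    /** Hypothetical stand-in for the real data access layer. */
    public interface PriceRepository {
        BigDecimal loadPrice(long skuId);
    }
}
```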
Speaking of stock balances
It is important to note here that our stock depth is small, and a very large number of orders are picked up in stores (very large). So the customer should be able to reliably reserve the goods in the right store, which means we have to keep track of the balances. Back in the Bitrix days, the balance problem was dealt with by simply treating any balance above 10 units as infinity: everything greater than 10 was reported as 10, and only the smaller values were considered worth calculating precisely and uploading to the site.
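For illustration, that old Bitrix-era rule boils down to a one-line clamp (a hypothetical helper, not the actual code from back then):

```java
/** Old workaround: any balance above 10 units was treated as "infinity", i.e. reported as 10. */
static int normalizeStock(int unitsInStore) {
    return Math.min(unitsInStore, 10);
}
```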
That no longer works, so now we load balances from every store we operate in every 15 minutes. And we have about 500 stores, plus a number of regional warehouses, plus several retail chains, and all of it has to be kept up to date. The cherry on top is that across Russia the working conditions of courier companies change very often, so delivery parameters have to be loaded as well. On top of that, goods flow into the company's warehouses continuously, so the quantities there keep changing, and that, too, has to be re-pulled.
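A sketch of how such a periodic refresh could be wired up with Spring's scheduling support (this assumes @EnableScheduling on a configuration class; the job, client, and cache names are made up for illustration and are not the real components):

```java
import java.util.Map;

import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

@Component
public class StockRefreshJob {

    /** Hypothetical client wrapping the back-office feed of per-store balances. */
    public interface StoreStockClient {
        /** storeId -> (skuId -> units available). */
        Map<Long, Map<Long, Integer>> loadAllBalances();
    }

    /** Hypothetical holder the site reads from; replaced wholesale on each refresh. */
    public interface StockSnapshotCache {
        void replaceAll(Map<Long, Map<Long, Integer>> balances);
    }

    private final StoreStockClient storeStockClient;
    private final StockSnapshotCache snapshotCache;

    public StockRefreshJob(StoreStockClient storeStockClient, StockSnapshotCache snapshotCache) {
        this.storeStockClient = storeStockClient;
        this.snapshotCache = snapshotCache;
    }

    /** Pull fresh balances for all ~500 stores and the regional warehouses every 15 minutes. */
    @Scheduled(fixedDelay = 15 * 60 * 1000)
    public void refreshStock() {
        snapshotCache.replaceAll(storeStockClient.loadAllBalances());
    }
}
```

Delivery parameters of courier companies would be reloaded on the same cadence by a similar job.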
And this is how product item identifiers (SKUs) come about. We have about 40,000 so-called color models of goods. Break them down further by size and you get about 200,000 SKUs. And for all 200,000 of them, balances have to be updated across 500 stores.
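To make the arithmetic concrete: a color model crossed with its size grid gives an SKU, so ~40,000 color models with roughly five sizes each lands at about 200,000 SKUs. A toy sketch of that keying (the record names are invented for illustration):

```java
import java.util.ArrayList;
import java.util.List;

public class SkuExpansion {

    /** A "color model": one product in one color, e.g. green shorts. */
    record ColorModel(long id, List<String> sizes) {}

    /** A concrete stock-keeping unit: color model + size, the thing balances are tracked for. */
    record Sku(long colorModelId, String size) {}

    static List<Sku> expand(List<ColorModel> colorModels) {
        List<Sku> skus = new ArrayList<>();
        for (ColorModel model : colorModels) {
            for (String size : model.sizes()) {
                skus.add(new Sku(model.id(), size));
            }
        }
        return skus; // ~40,000 color models * ~5 sizes each ≈ 200,000 SKUs
    }
}
```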
We also deliver to tens of thousands of cities and towns, from stores or from warehouses. So the cache variability for a single product page alone (city * SKU) runs into the millions. Our approach is the following: the availability of a particular item is calculated on the fly when the user opens the product card. We look at the couriers working in the user's region, check their schedules, build the delivery chain, and estimate its duration. Alongside that, we analyze balances in nearby stores from which delivery could be arranged.
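A rough sketch of that per-request calculation, assuming hypothetical lookup components for courier clusters and store balances (none of these names come from the real codebase); the key point is that nothing per (city, SKU) is cached, only the small inputs are:

```java
import java.time.Duration;
import java.util.List;

public class AvailabilityService {

    /** Hypothetical by-ID indexes, refreshed every 15 minutes. */
    public interface CourierClusterIndex {
        List<CourierOption> couriersFor(long cityId);
    }
    public interface StoreStockIndex {
        List<StoreStock> nearbyStockFor(long cityId, long skuId);
    }

    public record CourierOption(long courierId, Duration deliveryTime) {}
    public record StoreStock(long storeId, int units) {}
    public record Availability(List<StoreStock> pickupStores, Duration bestDelivery) {}

    private final CourierClusterIndex couriers;
    private final StoreStockIndex stocks;

    public AvailabilityService(CourierClusterIndex couriers, StoreStockIndex stocks) {
        this.couriers = couriers;
        this.stocks = stocks;
    }

    /**
     * Called when the user opens a product card: instead of caching city * SKU pages
     * (millions of combinations), availability is computed on the fly from small caches.
     */
    public Availability availabilityFor(long cityId, long skuId) {
        List<StoreStock> pickup = stocks.nearbyStockFor(cityId, skuId).stream()
                .filter(s -> s.units() > 0)
                .toList();
        Duration bestDelivery = couriers.couriersFor(cityId).stream()
                .map(CourierOption::deliveryTime)
                .min(Duration::compareTo)
                .orElse(null); // no courier covers this city
        return new Availability(pickup, bestDelivery);
    }
}
```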
To make all of this easier to manage, we keep a number of very fast caches in the application, which let us fetch any data we need by ID and sort it out on the fly. The same goes for couriers: we group them into clusters, and the cluster is then stored in the database. All of this is refreshed every 15 minutes; for each incoming request we compute the relevant courier cluster with the necessary parameters, aggregate the data, and quickly give the customer an answer: everything is fine, we definitely have these green shorts in size 50, you can either pick them up in person right now in these three stores nearby, or have them delivered to the store across the street (or even to your home) in 3 days, your choice.
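One common way to build such very fast in-process by-ID caches is an immutable map behind an AtomicReference that the periodic refresh swaps wholesale; this is only a guess at the pattern under those assumptions, not the actual implementation:

```java
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

/**
 * In-process by-ID cache: readers always see a consistent, fully built snapshot,
 * and the periodic refresh replaces it in one atomic swap (no locks on the read path).
 */
public class SnapshotCache<K, V> {

    private final AtomicReference<Map<K, V>> snapshot = new AtomicReference<>(Map.of());

    /** O(1) lookup used on every incoming request. */
    public V byId(K id) {
        return snapshot.get().get(id);
    }

    /** Called by the scheduled refresh (e.g. every 15 minutes) with a freshly built map. */
    public void replaceAll(Map<K, V> freshData) {
        snapshot.set(Map.copyOf(freshData));
    }
}
```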
For Moscow this may seem like overkill, but for the regions it is a different matter: people there often order goods to a particular store (which, on top of that, they may have to make a special trip to reach).
Numbers
Today the site receives thousands of requests per second counting static content, and 500-1,000 requests per second to the application servers. The number of application servers has not changed, but their configuration has grown significantly. On average we see about 3 million page views per day.
DDoS attacks on the site happen from time to time. They hammer us with botnets, and homegrown ones at that, from Russia. At one time there were attempts to hit us with botnets from Mexico and Taiwan, but that no longer happens.
There are a number of cloud DDoS protection solutions on the market, and quite good ones. But because of certain security policies, we cannot use cloud solutions of this kind.
What now
We are now moving to a genuinely platform-based solution, splitting the teams not vertically (one team builds one site, another builds a different one) but horizontally: we carve out a common platform layer, split it into parts, and build a team around each part. On top of that platform we put the site and more, including various clients of the company, both external and internal. So there is plenty of difficult and interesting work ahead.
For obvious reasons, the front-facing stack has not changed much over this time: Java, Spring, Tomcat, ElasticSearch, and Hazelcast still serve our needs well. The difference is that the site now sits in front of a lot of back-office systems built on various technologies. And, of course, re-engineering is actively underway, because requests to internal systems, and working with them in general, need to be optimized, and we must not forget the requirements of the business and new business functions.
And feel free to drop any suggestions for improving the site into my PM (or into the comments): new features, the visual side, the overall user experience. We will try to respond promptly and take everything into account. And if you want to become part of the team and help build it all from the inside,
welcome .