
12 billion requests per month for $120 in Java

When you launch your product, you never really know what will happen next. You may remain a project nobody needs, you may get a small stream of customers, or you may get a whole tsunami of users right away if the leading media write about you. We did not know either.

This post is about the architecture of our system, its evolution over almost 3 years, and the trade-offs between development speed, performance, cost and simplicity.

The simplified task looked like this: connect a microcontroller to a mobile application over the Internet. Example: you press a button in the application and an LED on the microcontroller lights up; you turn the LED off on the microcontroller and the button in the application changes its state accordingly.
Since we started the project on Kickstarter, we already had a fairly large base of first users before launching the server in production: 5,000 people. Many of you have probably heard of the famous "habr effect" that has taken down many a web resource. We certainly did not want to share that fate, and this was reflected in the choice of technical stack and application architecture.

Immediately after the launch, our entire architecture looked like this:



It was a single virtual machine from Digital Ocean for $80 per month (4 CPU, 8 GB RAM, 80 GB SSD), taken with a margin, "just in case the load spikes". Back then we really thought we would launch and thousands of users would rush in. As it turned out, attracting and retaining users is a task in itself, and server load is the last thing worth worrying about. The technology at that time was only Java 8 and Netty with our own binary protocol over SSL/TCP sockets (yes, yes, without a database, Spring, Hibernate, Tomcat, WebSphere and other delights of the bloody enterprise).
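The sources are open (the link is at the end of the post), but to give a feel for how small the stack was: a bare Netty server over SSL/TCP with a custom binary protocol boils down to roughly the sketch below. The port, certificate paths and the 2-byte length-prefixed framing here are assumptions made for this illustration, not the real project's code.

import io.netty.bootstrap.ServerBootstrap;
import io.netty.buffer.ByteBuf;
import io.netty.channel.ChannelHandlerContext;
import io.netty.channel.ChannelInitializer;
import io.netty.channel.EventLoopGroup;
import io.netty.channel.SimpleChannelInboundHandler;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.channel.socket.SocketChannel;
import io.netty.channel.socket.nio.NioServerSocketChannel;
import io.netty.handler.codec.LengthFieldBasedFrameDecoder;
import io.netty.handler.ssl.SslContext;
import io.netty.handler.ssl.SslContextBuilder;
import java.io.File;

public class HardwareServer {

    public static void main(String[] args) throws Exception {
        // TLS context from a certificate/key pair (paths are placeholders)
        final SslContext sslCtx = SslContextBuilder
                .forServer(new File("server.crt"), new File("server.pem"))
                .build();

        EventLoopGroup bossGroup = new NioEventLoopGroup(1);
        EventLoopGroup workerGroup = new NioEventLoopGroup();
        try {
            ServerBootstrap b = new ServerBootstrap()
                .group(bossGroup, workerGroup)
                .channel(NioServerSocketChannel.class)
                .childHandler(new ChannelInitializer<SocketChannel>() {
                    @Override
                    protected void initChannel(SocketChannel ch) {
                        ch.pipeline().addLast(
                            sslCtx.newHandler(ch.alloc()),                      // TLS termination
                            new LengthFieldBasedFrameDecoder(8192, 0, 2, 0, 2), // assumed 2-byte length prefix
                            new SimpleChannelInboundHandler<ByteBuf>() {
                                @Override
                                protected void channelRead0(ChannelHandlerContext ctx, ByteBuf msg) {
                                    // decode the binary command and route it between app and hardware here
                                }
                            });
                    }
                });
            b.bind(8443).sync().channel().closeFuture().sync();
        } finally {
            bossGroup.shutdownGracefully();
            workerGroup.shutdownGracefully();
        }
    }
}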

All user data was simply stored in memory and periodically flushed to files:

try (BufferedWriter writer = Files.newBufferedWriter(fileTo, UTF_8)) {
    writer.write(user.toJson());
}


The whole process of bringing the server up came down to one line:

 java -jar server.jar & 

Peak load immediately after launch was 40 req/sec. The tsunami never came.

Nevertheless, we worked hard, constantly adding new features and listening to feedback from our users. The user base grew slowly but steadily, by 5-10% every month. The server load grew along with it.

The first serious feature was reporting. By the time we started implementing it, the load on the system had already reached 1 billion requests per month, and most of those requests were real data, such as temperature sensor readings. Storing every request would have been very expensive, so we resorted to a trick: instead of saving each request, we compute an average value in memory with minute granularity. That is, if you send the numbers 10 and 20 within one minute, you get the value 15 for that minute.
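A minimal sketch of that idea (the class and key names are made up here, not taken from the project's code): keep a running sum and count per pin, and let a background job flush the average once a minute.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Running average of all values reported within the current minute, one instance per user pin.
public class MinuteAverage {

    private double sum;
    private long count;

    public synchronized void add(double value) {
        sum += value;
        count++;
    }

    // Called by a scheduled job once a minute: returns the average and resets the accumulator.
    public synchronized double flushAverage() {
        double average = count == 0 ? 0 : sum / count;
        sum = 0;
        count = 0;
        return average;
    }

    public static void main(String[] args) {
        Map<String, MinuteAverage> reporting = new ConcurrentHashMap<>();
        MinuteAverage pin = reporting.computeIfAbsent("user1-V1", key -> new MinuteAverage());
        pin.add(10);
        pin.add(20);
        System.out.println(pin.flushAverage()); // prints 15.0 -- one value stored instead of two requests
    }
}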

At first I succumbed to the hype and implemented this approach with Apache Spark. But when it came to deployment, I realized the game was not worth the candle. Yes, it was "right" and "enterprise", but now I had to deploy and monitor 2 systems instead of my cozy monolith, plus there was the overhead of serializing the data and shipping it around. In the end I got rid of Spark and simply compute the values in memory and dump them to disk once a minute. The output looks like this:


The single-server monolith worked fine, but it had quite obvious disadvantages:


8 months after launch, the flow of new features slowed down a bit and I had time to change the situation. The task was simple: reduce latency in different regions and reduce the risk of the whole system going down at once. And, of course, do it all quickly, easily, cheaply and with minimal effort. A startup, after all.

The second version looked like this:





As you probably noticed, I chose GeoDNS. It was a very quick solution: the whole setup in Amazon Route 53 took about 30 minutes to read up on and configure. Pretty cheap: Amazon's geo DNS routing costs $50 per month (I looked for cheaper alternatives but could not find any). Quite simple, because no load balancer was needed. And it required minimal effort: I only had to tweak the code a little (it took less than a day).

Now we had 3 monolithic servers at $20 each (2 CPU, 2 GB RAM, 40 GB SSD) + $50 for geo DNS. The entire system cost $110 per month, and the servers alone now had 2 more cores than before while costing $20 less. At the time of the transition to the new architecture the load was 2,000 req/sec, and the old VM was only 6% loaded.



All the problems of the monolith above were solved, but a new one appeared: when a person travels to another zone, they get routed to a different server and nothing works for them. It was a conscious risk, and we took it. The motivation was very simple: users did not pay (at that time the system was completely free), so let them suffer. We also relied on statistics according to which only 30% of Americans have ever left their country and only 5% travel regularly, so we assumed the problem would affect only a small percentage of our users. The prediction came true. On average we received about one email every 2-3 days from a user saying "My projects are gone. What do I do? Help!". Over time such emails became very annoying (despite the detailed instructions on how the user could quickly fix the issue). Moreover, such an approach would hardly suit a business, which we were just starting to become. Something had to be done.

There were many ways to solve the problem. I decided the cheapest one would be to route a user's microcontrollers and applications to the same server (to avoid the overhead of forwarding messages from one server to another). So the requirements for the new system looked like this: different connections of the same user must land on the same server, and we need shared state between the servers to know where to send the user.

I had heard a lot of good things about Cassandra, which seemed to suit this task perfectly, so I decided to try it. My plan looked like this:



Yes, I was being naive. I thought I could run a single Cassandra node on the cheapest $5 virtual machine (512 MB RAM, 1 CPU). I had even read a happy article about someone running a cluster on Raspberry Pi. Unfortunately, I failed to repeat that feat, although I removed and trimmed all the buffers as described there. I managed to start a Cassandra node only on a 1 GB instance, and even then it immediately died with an OOM (OutOfMemory) at a load of 10 req/sec. Cassandra behaved more or less stably with 2 GB, but I could not push a single node to 1,000 req/sec without hitting OOM again. At this point I gave up on Cassandra, because even if it had shown decent performance, a minimal cluster in a single data center would have cost around $60, which was expensive for me considering our income at the time was $0. Since it had to be done yesterday, I moved on to plan B.



Good old Postgres. It has never let me down (well, almost never, right, full vacuum?). Postgres started perfectly on the cheapest virtual machine, barely touched the RAM, and inserting 5,000 rows took 300 ms while loading a single core at 10%. Just what we needed! I decided not to deploy a database in each data center but to have one shared storage, since scaling/sharding/master-slave with Postgres is harder than with Cassandra, and this setup had enough headroom anyway.
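For a sense of what that write path looks like, here is a rough JDBC sketch of a batched insert similar to the 5,000-row case above; the table, columns and connection settings are invented for this example.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.Timestamp;

public class ReportingWriter {

    public static void main(String[] args) throws Exception {
        // Connection settings are placeholders; the PostgreSQL JDBC driver must be on the classpath.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost:5432/reporting", "postgres", "secret")) {

            String sql = "INSERT INTO reporting_average_minute (user_id, pin, ts, value) VALUES (?, ?, ?, ?)";
            try (PreparedStatement ps = conn.prepareStatement(sql)) {
                for (int i = 0; i < 5000; i++) {
                    ps.setInt(1, 1);
                    ps.setInt(2, i % 8);
                    ps.setTimestamp(3, new Timestamp(System.currentTimeMillis()));
                    ps.setDouble(4, Math.random() * 100);
                    ps.addBatch();   // accumulate rows on the client side
                }
                ps.executeBatch();   // ship the whole batch to the server at once
            }
        }
    }
}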

Now I had to solve another problem: route the client and their microcontrollers to the same server. Essentially, implement sticky sessions for TCP/SSL connections and our own binary protocol. Since I did not want to make drastic changes to the existing cluster, I decided to reuse Geo DNS. The idea was this: when the mobile application receives an IP address from Geo DNS, it opens a connection and sends a login command to that IP. The server, in turn, either processes the login command and continues working with the client (if it is the "correct" server), or returns a redirect command with the IP the client should go to. In the worst case the connection process looks like this:



But there was one small nuance: the load. At the time of implementation the system was handling 4,700 req/sec. About 3k devices were constantly connected to the cluster, periodically up to ~10k. At the current growth rate, in a year that would already be 10k req/sec. In theory, many devices could connect to the same server at once (for example, on a restart during the ramp-up period), and if they all suddenly connected to the "wrong" server, the database could get too much load and fall over. So I decided to play it safe and put the user-to-server-IP mapping in Redis. The final system ended up like this.
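As a rough illustration of that login/redirect step (the class name, the Redis key format and the Jedis usage below are my own sketch, not the actual protocol implementation):

import redis.clients.jedis.Jedis;

public class LoginHandler {

    private final String currentServerIp;
    private final Jedis redis;

    public LoginHandler(String currentServerIp, Jedis redis) {
        this.currentServerIp = currentServerIp;
        this.redis = redis;
    }

    // Returns null when the login is accepted on this server,
    // or the IP the client should reconnect to (sent as a redirect command in the real protocol).
    public String processLogin(String username) {
        String assignedIp = redis.get("user-server:" + username);

        if (assignedIp == null) {
            // First login: pin the user to the server Geo DNS happened to pick.
            redis.set("user-server:" + username, currentServerIp);
            return null;
        }

        if (assignedIp.equals(currentServerIp)) {
            return null; // correct server, continue the session
        }

        // The user's state lives on another region's server: redirect the client there.
        return assignedIp;
    }
}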



With the current load of 12 billion requests per month, the entire system is on average 10% loaded, and network traffic is ~5 Mbps (in/out, thanks to our simple protocol). 12 billion requests per month is roughly 4,600 req/sec on average, so in theory this $120 cluster can withstand up to about 40k req/sec. On the plus side: no load balancer is needed, deployment is simple, maintenance and monitoring are rather primitive, and there is room for about 2 orders of magnitude of vertical growth (10x by fully utilizing the current hardware and 10x with more powerful virtual machines).

The project is open source; the sources can be found here.

That's all. I hope you liked the article. Any constructive criticism, advice and questions are welcome.

Source: https://habr.com/ru/post/316370/

