
Our Firebase bill went up 70-fold, and no one warned us.

On its Medium blog, the team behind the startup HomeAutomation told an instructive story, with a relatively happy ending, about how a seemingly harmless service like Firebase can turn into a trap, and how short-sighted planning in the early stages became a catastrophe for the company several years later.



The beginning


“Like many others, our startup began with a very simple idea: a tool designed to help programmers of “smart home” automation systems troubleshoot problems, integrate devices and save time.

We gave our product away for free, and it soon began to gain popularity. It was so exciting! In just a few months, our audience grew from ten beta testers on a Skype call to hundreds, and then thousands, of users. We were in seventh heaven! I remember just sitting and staring at the Google and Woopra statistics, watching what users were doing.

I’m the first to admit that at this stage we made some serious mistakes (which, I hope, I can warn others against). At the time we were busy building and shipping new features, trying with all our might to keep up with an endless stream of requests.

The mistake was not that we failed to read the documentation, nor that we chose services with missing functionality or poor performance. No, we made a small but dangerous misstep (and I suspect many other application developers are making it at this very moment): we let a service become a trap.

As a result, we found ourselves in the following situation: the service we relied on completely changed its policy, and the price of its services suddenly jumped 70-fold. And we simply had no solution we could apply quickly to keep the costs from reaching tens of thousands of dollars. After all, we are just a bootstrapped, self-funded startup.

Trapped! The start of a dark streak


It all started in April of this year. Ever since Firebase became part of Google, we had been on the Flame plan, which cost us $25 a month. We started using the service long before Google bought it and made it the star of its mobile cloud storage initiatives.

Only a few days after the latest payment, I received an alert that our application was being disabled for exceeding the traffic limit. What?! Deeply alarmed, I logged in to see what had happened.

The Firebase Flame plan includes 20 gigabytes of outgoing traffic per month for $25. The system does not let you exceed that limit, so there are no unpleasant surprises after the fact, which is good. However, according to the statistics, we had already used 30 gigabytes in those few days.



Of course, we could not let the application be disabled and our users be locked out. Besides, paying $10 extra for 10 extra gigabytes is no great trouble. We immediately contacted technical support and, after discussing the matter, switched to another plan, “Pay As You Go”, the only alternative they could offer us.

I had barely pressed Enter when the reported traffic consumption jumped to 100 gigabytes. As you can imagine, I felt uneasy. But I shrugged it off and decided to raise the issue with technical support the next day.

When I arrived at the office the next morning, I found that we had already burned through 180 gigabytes of traffic. Considering that we were paying a dollar per gigabyte, you can imagine my horror when I calculated what this would cost us. What is happening?! Why?! How do we fix it? What do we do?! This would mean thousands and thousands of dollars in unplanned expenses!
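
The arithmetic behind that panic is simple enough to sketch. The snippet below is only an illustration: the rate of $1 per gigabyte and the 180 gigabytes come from the story, while the function names and the three-day window in the usage example are assumptions made for the sake of the example.

```python
# Rough pay-as-you-go cost estimate. The per-gigabyte rate is the one quoted
# in the story; the function names are hypothetical.
PRICE_PER_GB_USD = 1.00


def payg_cost(gigabytes: float, price_per_gb: float = PRICE_PER_GB_USD) -> float:
    """Return the traffic cost in USD for the given volume."""
    if gigabytes < 0:
        raise ValueError("traffic volume cannot be negative")
    return gigabytes * price_per_gb


def projected_monthly_cost(gb_so_far: float, days_elapsed: float,
                           days_in_month: int = 30) -> float:
    """Extrapolate the current daily rate linearly to a full month."""
    if days_elapsed <= 0:
        raise ValueError("days_elapsed must be positive")
    daily_gb = gb_so_far / days_elapsed
    return payg_cost(daily_gb * days_in_month)


if __name__ == "__main__":
    print(payg_cost(180))                  # cost of the traffic already used
    print(projected_monthly_cost(180, 3))  # assumes the 180 GB took 3 days
```

A linear extrapolation is crude, but it is enough to show that the monthly bill would land far above the usual $25.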

Firebase does not provide any analytics or breakdown of where the traffic goes; all you get is a blue bar on a graph showing how many gigabytes you have used.



It made no sense: where did so much traffic suddenly come from? Why? Who are all these people? Unfortunately, Firebase does not answer such questions.

I did discover, however, that they have a database profiling tool. I launched it and checked the results an hour later: according to its data, we had used only a few megabytes. Meanwhile, the analytics showed two gigabytes consumed over the same period!



In the end, I contacted tech support, and they told me that they had changed the way traffic is calculated: they now also count the SSL overhead of requests, including failed attempts blocked by their security rules. In the same conversation, they mentioned that the database profiling tool does not show the SSL overhead. In my case, this meant that none of the tools available to me revealed any problem at all, yet the bill had grown 70-fold!

According to them, in most cases the change in the algorithm has little effect ... unless you use the REST API. Our application was written in a language that Firebase does not support, so we resorted to the REST API to compensate. All it did was read a boolean value once a minute to check whether any further processing was needed. For two years this caused no problems, and then it suddenly began to ruin us.
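
To see why a once-a-minute poll can matter once connection overhead is billed, here is a back-of-the-envelope estimate. Every number in it is an assumption chosen for illustration, not a figure from Firebase or from the story: 1,000 installations, roughly 100 bytes for the payload alone, and roughly 5,000 bytes when each poll also pays for a fresh SSL handshake.

```python
# Back-of-the-envelope traffic estimate for a fleet of polling devices.
# Every number passed in below is an illustrative assumption.
MINUTES_PER_DAY = 60 * 24


def monthly_traffic_gb(installations: int, bytes_per_poll: int,
                       polls_per_minute: float = 1.0, days: int = 30) -> float:
    """Return the traffic in gigabytes (10**9 bytes) per month."""
    polls = installations * polls_per_minute * MINUTES_PER_DAY * days
    return polls * bytes_per_poll / 1e9


if __name__ == "__main__":
    payload_only = monthly_traffic_gb(1000, 100)     # assumed payload size
    with_handshake = monthly_traffic_gb(1000, 5000)  # assumed handshake cost
    print(f"payload only:   {payload_only:.2f} GB")
    print(f"with handshake: {with_handshake:.2f} GB")
```

Under these assumed sizes, the same polling loop moves from a few gigabytes a month to a few hundred, without a single line of application code changing.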

After a closer look at the situation, the tech support staff stated the following: the traffic overrun was caused by the fact that we did not use something called TLS Tickets (we do now, but before that I had never seen them mentioned in any library or package) and that “Keep Alive” was not set to true.
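
A minimal sketch of what “Keep Alive” means in practice, using Python's standard `http.client`: the poller opens one HTTPS connection and reuses it, so the SSL handshake is paid once rather than on every poll. The class name, host and path are hypothetical, and this is not the code from the application, which was written in another language. TLS session tickets are a separate mechanism and are not shown here.

```python
import http.client
import json


class FlagPoller:
    """Polls a JSON boolean over one persistent HTTPS connection."""

    def __init__(self, host, path, timeout=10.0,
                 connection_factory=http.client.HTTPSConnection):
        self.host = host
        self.path = path
        self.timeout = timeout
        # The factory is injectable so the class can be exercised offline.
        self._factory = connection_factory
        self._conn = None

    def _connection(self):
        # Create the connection lazily and keep it for later polls.
        if self._conn is None:
            self._conn = self._factory(self.host, timeout=self.timeout)
        return self._conn

    def close(self):
        if self._conn is not None:
            self._conn.close()
            self._conn = None

    def read_flag(self) -> bool:
        """Fetch the flag, reconnecting once if the connection was dropped."""
        for attempt in range(2):
            conn = self._connection()
            try:
                conn.request("GET", self.path,
                             headers={"Connection": "keep-alive"})
                response = conn.getresponse()
                body = response.read()  # drain the body so the socket is reusable
            except (http.client.HTTPException, OSError):
                self.close()
                if attempt == 1:
                    raise
                continue
            if response.status != 200:
                raise RuntimeError(f"unexpected HTTP status {response.status}")
            return bool(json.loads(body))
        raise RuntimeError("unreachable")
```

Calling `read_flag()` once a minute on the same `FlagPoller` instance, for example `FlagPoller("example-project.invalid", "/needs_processing.json")` with a placeholder host, keeps reusing the connection for as long as the server leaves it open.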

There is not a word about this in the documentation. I think that is because it caused no particular problems before the billing rules changed. We read the documentation honestly and built our requests on it. I personally re-read everything at least thirty times.



At first it seemed that the Firebase team was ready to meet us halfway. They set a time to discuss the situation with us and assured us they would help solve the problem. On the phone, they said they would take care of everything and give us time to get the application into proper shape. We also received an email stating that the team was working on the problem and would soon provide us with a credit.

And then the money was charged to my card, and they stopped answering our emails. In fact, they stopped responding to any attempt to get in touch, leaving us only one way out: to cancel the payment, after which the service would be disabled, the application would go down, and the users would leave. We had only one communication channel, email, and for more than a month our messages were simply ignored despite repeated requests. We were trapped, deadlocked, with no way to act.

To sum up:

  1. The Firebase team changed the way traffic is calculated, and as a result began charging us 70 times more, although our actual usage stayed the same. All this happened after we had been using the service for several years.
  2. We received no alerts or warnings about the policy change; we were contacted only when our application was about to be disabled.
  3. The profiling tools they offer do not reflect the growth in traffic consumption; you only learn about it from a bloated bill.

Rock bottom


Well, at least the fix seems easy. If one thing is ruining everything, just change it! If you said that to me now, with the application's architecture already reworked, I would answer: “No problem, it will be done in an hour.”

Now we no longer hard-code URLs. Instead, we go through proxies that can be changed if necessary. We put an intermediate layer in front of third-party APIs, which lets us switch to a similar service if something goes wrong, and in general we learn more useful techniques every day. We are a small, self-funded startup, so we welcome ideas and suggestions.
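
One way to picture such an intermediate layer is sketched below. It is an illustration only: `FlagBackend`, `InMemoryBackend` and `BackendRouter` are hypothetical names, and the in-memory backend stands in for a real service client. The router would sit behind the proxy, so devices in the field keep calling the same address while the service behind it changes.

```python
from typing import Dict, Protocol


class FlagBackend(Protocol):
    """Anything that can answer "does this device need processing?"."""

    def read_flag(self, device_id: str) -> bool: ...


class InMemoryBackend:
    """Stand-in for a real service client; keeps flags in a dict."""

    def __init__(self, flags: Dict[str, bool]):
        self._flags = dict(flags)

    def read_flag(self, device_id: str) -> bool:
        return self._flags.get(device_id, False)


class BackendRouter:
    """Forwards calls to whichever backend is currently active."""

    def __init__(self, backends: Dict[str, FlagBackend], active: str):
        if active not in backends:
            raise KeyError(f"unknown backend: {active}")
        self._backends = dict(backends)
        self._active = active

    @property
    def active(self) -> str:
        return self._active

    def switch_to(self, name: str) -> None:
        # Changing the active backend is a one-line operation on the server.
        if name not in self._backends:
            raise KeyError(f"unknown backend: {name}")
        self._active = name

    def read_flag(self, device_id: str) -> bool:
        return self._backends[self._active].read_flag(device_id)


if __name__ == "__main__":
    router = BackendRouter(
        {"old": InMemoryBackend({"dev-1": True}),
         "new": InMemoryBackend({"dev-1": False})},
        active="old",
    )
    print(router.read_flag("dev-1"))
    router.switch_to("new")
    print(router.read_flag("dev-1"))
```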

But back then, alas, it was not that simple. You see, this is one of the first lines of code we wrote, back when everything was first being integrated with Firebase. It is hard-coded in version 1.0 (a blunder on our part!) and deployed to thousands of systems around the globe. That snippet was written when the application was just a curious idea. For years it caused no problems, and even now none of the profiling tools available to us shows anything unusual. From launch until this incident, the bill had never exceeded $25.

However, since all of this is baked into the first version of the application (which is installed locally on automation systems in most countries of the world), the only thing we can do is shut the service down entirely until we prepare a new version. That would undoubtedly cost us our user base, because the consequences would be disastrous: many of our users would have to travel hundreds of miles to their customers' homes to fix the problem.

When the application began to gain momentum, we quickly realized that by tying ourselves tightly to a single service we could face unpleasant consequences of this kind. But it never occurred to me that it would all happen so suddenly and on such a scale!

After this incident, we moved to a model that, thanks to a serverless architecture, lets us make the necessary changes dynamically. It would solve all our problems, but we cannot fully roll it out until we pay Firebase $5,000 to $10,000 in unplanned expenses, which, unfortunately, we cannot afford at the moment.

So I beg you: learn from our idiotic and, as it now turns out, fatal mistakes.


Conclusion


I cannot believe I am writing this article. I am not the kind of person who likes to tell such stories and air dirty laundry in public. Like many here, my team and I spent long hours learning, planning and writing code to launch the startup of our dreams. It has been quite a roller coaster: ups and downs, joyful excitement followed right away by great stress.

In fact, I am very ashamed that we got into this mess. The sheer amount of knowledge that so many people online seem to possess can be quite intimidating. It was hard to overcome the fear of being judged for our mistakes and to force myself to write this text. I only hope that some of you will benefit from the story of these stupid mistakes and come up with new solutions to the growing problem of service traps.

It’s hard to think of anything more frightening and exciting than taking a leap into the unknown, quitting your job and deciding to build a career around one of your crazy ideas. I can say without hesitation that nothing else in my life compares to it. But naturally, when you are learning something new, you cannot avoid blunders. Unfortunately, it does not get any easier when such minor mistakes come back to bite you much later, especially when a service decides to change how it works without warning.

This is not a dig at them


I am not out to expose specific services or platforms. Firebase did not act out of spite or bad intentions. It all happened because a slight change on their side turned into a significant change in our bill (70 times!), as a result of which our $25 subscription fee became as much as $1,750, if not more (the amount keeps growing).

What I want is to save you from our mistakes, mistakes that now endanger the future of our project simply because Firebase began counting traffic differently.

Protect yourself: take the time to analyze the services you are going to adopt. Keep in mind that they can change the fate of your product so fundamentally that all your positive momentum goes to hell in a matter of hours.

UPD: I am glad to announce that this morning I received a call from Firebase. They offered their sincere apologies for the situation and explained in more detail what had happened to the account.

We have not reached a final agreement yet, but they took note of the circumstances I described in this article. Many of the complaints raised (for example, those related to metrics) they identified as points to be addressed in the near future.

I will keep you posted if something new appears.

UPD 2: After further discussion, we sorted out the credits and worked out a plan to quickly resolve the remaining problems. Andrew (the founder of Firebase) and his team got down to business very actively and commented on the whole situation here.

Of course, no company wants to be in a situation where it has to make an official statement, but the way they handled this speaks in their favor. We feel much calmer now.

UPD 3: Firebase has provided us with a credit, and this week we are going to discuss the traffic consumption in detail. Ever since I got in touch with Alex and Andrew, they have been excellent. In my opinion, they offered a comprehensive solution to the problem and, as far as I can tell, worked out how to act in the future so that this does not affect other users.”

Source: https://habr.com/ru/post/329282/

