This text is a free translation of a wonderful post by Kaushik Sathupadi on distributed systems and the inherent limits on building them.
When developing distributed systems, you will often hear references to the CAP theorem. Let's try to understand it through a situation that could arise in real life.
Part 1: The idea of a new service - “Call, I will remind!”
Yesterday, when your spouse once again appreciated that you remembered her birthday and gave her a chic gift, a funny idea came to mind. "Hmm, people always forget everything." And you happen to have a brilliant memory! Why not start a new service that puts your talent to full use? The more you think about the idea, the more you like it. You have even come up with an ad that could be printed in a newspaper:
“Call, I will remind” - Never forget, even if you do not remember that you forgot!
Feeling bad because you forgot something? Do not worry. Help is just one phone call away!
If you need to remember something, just call and tell us about it! Say you call and tell us your boss's phone number. Then forget about it. When you need it again, call back and we will definitely remind you.
Only 3 rubles per call.
A typical call to your service would look like this:
- Client: - Hi, could you remember my neighbor's birthday?
- You: - Of course, what date?
- Client: - January 2nd.
- You: - (Writes the date in a notebook) Everything is recorded. Call us any time you need something.
- Client: - By the way, it just occurred to me that I should give him a radio-controlled helicopter as a gift; he adores such things!
- You: - A great idea for a gift, we will definitely remind you of it.
- Client: - Thank you!
- You: - You are always welcome. 3 rubles have been withdrawn from your account.
A few months pass ...
- Client: - Hi, I think I forgot something ...
- You: - Good afternoon, of course, just a couple of seconds ... (Looks for the client's page in the notebook)
- You: - Thank you for waiting. Your neighbor's birthday is in a week; do not forget to congratulate him and give him a radio-controlled helicopter.
- Client: - Exactly, thank you very much! And what a great gift idea, how did you guess?
- You: - We are always happy to help. 3 rubles have been withdrawn from your account.
Part 2: Business is growing
Hurray, the business has taken off! Your idea is as simple as it is effective. The service is a hit, and you get hundreds of calls every day.
Everything is good, but there is a problem: more and more customers are waiting for their turn to talk with you. Some run out of patience and just hang up. Moreover, when you fell ill and could not work, you lost a whole day of business and all its revenue. And that is not to mention the dissatisfied customers who could not get their information. It is settled: time to expand! You bring in your spouse to help.
So, the plan is simple:
- You and your wife each get your own phone extension.
- Customers continue to call the same number: there is no need to memorize several numbers.
- The PBX routes each customer call to whoever is currently free.
You are very excited about this idea, because:
- You can serve twice as many customers.
- Even if someone gets sick and cannot work, the service does not stop and continues to function.
Gorgeous! This is a distributed system! And why do all those software people make such a fuss about distributed systems when it is all so simple?
Everything goes according to plan, until ...
Part 3: First Failure
Two days after the introduction of the new system, you receive a call from your regular customer, Ivan Andreevich:
- Ivan: - Good afternoon.
- You: - Hello, “Call, I will remind!”, How can we help you?
- Ivan: - Remind me, please: have I forgotten anything?
- You: - One second ... (you look in the notebook, but there is nothing worth mentioning on Ivan Andreevich's page)
- You: - No, everything is fine, you have not forgotten about anything, Ivan Andreevich!
- Ivan: - Great, thank you very much.
A day later, Ivan Andreevich calls you again:
- Ivan: - You let me down badly; your service is terrible. I had an important business trip to New York, and I missed my plane. Worst of all, I asked you to remind me, but no, you lied to me. I am very angry. (hangs up)
- You: - But how ...
How could this ever happen? Maybe Ivan Andreevich just lied? You think about what happened, and a thought strikes you. Maybe Ivan Andreevich called your wife? You find her notebook and, sure enough, the page on Ivan Andreevich says that he had to leave for New York yesterday.
What a terrible flaw of your seemingly beautiful scheme!
Your distributed system is not consistent! There is always a chance that a client will report something to you or your wife, and that his next call will reach the other person, who is not aware of the latest changes.
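The failure in Part 3 can be reproduced in a few lines. This is only an illustrative sketch: the `Operator` class and its `remember` and `remind` methods are names invented here, standing in for a person with a private notebook.

```python
class Operator:
    """One person with a private notebook and no synchronization (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.notebook = {}  # client name -> list of notes

    def remember(self, client, note):
        # An update: only this operator's notebook changes.
        self.notebook.setdefault(client, []).append(note)

    def remind(self, client):
        # A search: answered from this operator's notebook only.
        return list(self.notebook.get(client, []))


you = Operator("you")
wife = Operator("wife")

# The PBX happens to route Ivan's first call to your wife...
wife.remember("Ivan Andreevich", "flight to New York")
# ...and his second call to you.
print(you.remind("Ivan Andreevich"))   # your notebook has no entry for him
print(wife.remind("Ivan Andreevich"))  # her notebook has the flight
```

Each notebook is correct on its own; the inconsistency only appears because the same client can be served by either replica.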
Part 4: Solving the consistency problem
Your competitors could safely ignore poor service, but you certainly care about your customers and reputation. While your wife slept, the problem gave you no rest: you thought all night, trying to find a way out of the situation. Bingo! As soon as your wife wakes up, you tell her about the new plan:
- When a client calls either of you and wants something remembered, before saying “Thank you, we have recorded everything” you call the other person and report the change.
- So you both have all the latest updates.
- When a customer calls to be reminded of something, you do not need to call your spouse: you always have the correct, up-to-date information at hand.
You notice only one problem: updates cannot be handled in parallel. Every time your colleague calls you to synchronize, you cannot answer customer calls because you are busy. But that is not so bad, because most calls are requests to be reminded of something (search) rather than to remember something (update).
The main thing is to answer the client correctly at any cost.
“Excellent,” your spouse tells you, “but there is another gap that you have not thought of. What if one day one of us cannot come to work? Then we will not be able to accept a single request to remember something, because the other person will not be able to write the change in their notebook. That, my dear, is an availability problem: if an update request comes to me, I cannot complete the customer's call. Even if I write the change in my notebook, I cannot write it in yours, so I can never say goodbye to the client!”
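The Part 4 plan, and the gap the spouse points out, can be sketched as synchronous replication. The names below (`Operator`, `PeerUnavailable`, `at_work`, `pair_up`) are assumptions made for illustration. The sketch also makes one design choice the story does not spell out: it refuses the update before writing anything, so the two notebooks never diverge.

```python
class PeerUnavailable(Exception):
    """Raised when an update cannot be copied to the other notebook."""


class Operator:
    def __init__(self, name):
        self.name = name
        self.notebook = {}   # client name -> list of notes
        self.at_work = True
        self.peer = None     # set by pair_up()

    def _write(self, client, note):
        self.notebook.setdefault(client, []).append(note)

    def remember(self, client, note):
        # Assumption: check the peer first, so a refused update leaves
        # both notebooks untouched.
        if not self.peer.at_work:
            raise PeerUnavailable(f"{self.peer.name} is not at work")
        self._write(client, note)
        self.peer._write(client, note)  # the synchronizing phone call
        return "Thank you, we have recorded everything"

    def remind(self, client):
        # Searches never need the peer: the local notebook is up to date.
        return list(self.notebook.get(client, []))


def pair_up(a, b):
    a.peer, b.peer = b, a


you, wife = Operator("you"), Operator("wife")
pair_up(you, wife)

wife.remember("Ivan Andreevich", "flight to New York")
print(you.remind("Ivan Andreevich"))  # you now see the wife's update

you.at_work = False                   # you fall ill
try:
    wife.remember("Ivan Andreevich", "buy souvenirs")
except PeerUnavailable as err:
    print("update refused:", err)     # consistent, but not available for updates
```

Searches still work when one person is away; only updates are blocked, which is exactly the gap described above.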
Part 5: The Brilliant Idea Comes to You
You have already realized why distributed systems are not as simple as you thought. Is it really so hard to come up with a solution that is both “available and consistent”? Someone else might have given up, but not you! Your competitors never dreamed of what you have come up with. Once again you impatiently wake your wife ...
“Look, this is what we need for availability and consistency.” The plan is almost the same as last time, but with important changes:
- When a client calls either of you and wants something remembered, before saying “Thank you, we have recorded everything” you check whether your colleague is available and, if so, call them and report the change.
- If your colleague is not available, you send them an e-mail describing the update.
- After any absence, the first thing the returning person does is check their e-mail and write down the changes before taking any calls.
Brilliant! You cannot find a single flaw in the resulting solution. Now “Call, I will remind” is both an available and a consistent service.
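The three rules of this plan can be sketched as follows. It is a minimal illustration under assumed names (`Operator`, `inbox`, `return_to_work`, `pair_up`); the e-mail is modeled as a plain list of pending updates.

```python
class Operator:
    def __init__(self, name):
        self.name = name
        self.notebook = {}   # client name -> list of notes
        self.inbox = []      # e-mailed updates not yet copied into the notebook
        self.at_work = True
        self.peer = None     # set by pair_up()

    def _write(self, client, note):
        self.notebook.setdefault(client, []).append(note)

    def remember(self, client, note):
        self._write(client, note)
        if self.peer.at_work:
            self.peer._write(client, note)          # rule 1: phone call
        else:
            self.peer.inbox.append((client, note))  # rule 2: e-mail
        return "Thank you, we have recorded everything"

    def return_to_work(self):
        # Rule 3: copy every e-mailed update before taking any call.
        for client, note in self.inbox:
            self._write(client, note)
        self.inbox.clear()
        self.at_work = True

    def remind(self, client):
        return list(self.notebook.get(client, []))


def pair_up(a, b):
    a.peer, b.peer = b, a


you, wife = Operator("you"), Operator("wife")
pair_up(you, wife)

you.at_work = False                                  # you are away for a day
wife.remember("Ivan Andreevich", "flight to New York")
you.return_to_work()                                 # read the mail first
print(you.remind("Ivan Andreevich"))                 # the update has arrived
```

The sketch stays consistent only because an absent operator answers no calls until the inbox has been drained.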
Part 6: Spouses sometimes quarrel
It seems that everything is fine now. Your system is consistent. Everything works, even if one of you cannot come to work. But what happens if you both come to work, yet one of you cannot update the other? Remember how you kept waking your wife with your latest stroke of genius?
What if your wife decides to take calls, but is so offended that she refuses to talk to you all day? Your whole business will turn into a pumpkin again! Your idea is still good for consistency and availability, but it is not tolerant of a breakdown in communication, that is, of a partition! Of course, you could refuse to take any calls while you are quarrelling, but then your system would be unavailable the whole time ...
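The quarrel leaves only two options, and a small sketch makes the trade-off explicit. The `Pair` class, its `talking` flag, and the `refuse_when_partitioned` switch are hypothetical names introduced for this illustration.

```python
class Pair:
    """Two notebooks plus a communication link that can break (hypothetical model)."""

    def __init__(self, refuse_when_partitioned):
        self.notebooks = {"you": {}, "wife": {}}
        self.talking = True  # False models the quarrel (a partition)
        self.refuse_when_partitioned = refuse_when_partitioned

    def remember(self, operator, client, note):
        other = "wife" if operator == "you" else "you"
        if self.talking:
            targets = (operator, other)  # normal case: both notebooks updated
        elif self.refuse_when_partitioned:
            return False                 # stay consistent, give up availability
        else:
            targets = (operator,)        # stay available, give up consistency
        for name in targets:
            self.notebooks[name].setdefault(client, []).append(note)
        return True

    def remind(self, operator, client):
        return list(self.notebooks[operator].get(client, []))


strict = Pair(refuse_when_partitioned=True)
strict.talking = False
print(strict.remember("wife", "Ivan Andreevich", "flight"))  # update refused

relaxed = Pair(refuse_when_partitioned=False)
relaxed.talking = False
print(relaxed.remember("wife", "Ivan Andreevich", "flight"))  # update accepted
print(relaxed.remind("you", "Ivan Andreevich"))               # but you do not know about it
```

While the link is down, no setting of the switch gives both a successful update and matching notebooks.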
Part 7: Conclusions
So now let's take a look at the CAP theorem. It states that when designing a distributed system you cannot achieve all three of the following properties at once: consistency, availability, and partition tolerance. You can choose only two:
- Consistency: once your clients have updated their information, they always get the most recent data on any subsequent call, no matter how quickly they call back.
- Availability: “Call, I will remind” is always available for calls as long as at least one of the employees has come to work.
- Partition tolerance: “Call, I will remind” keeps working correctly even if you have lost contact with your wife.
Bonus: eventual consistency with the help of a courier.
Here is one more thing to think about. You can hire a courier who updates the other notebook whenever the information in one notebook changes. The biggest benefit of this approach is that it works in the background, so an update taken by one person no longer blocks the other. Many NoSQL solutions work this way: one node applies the update locally, and a background process synchronizes the other nodes. The only problem is that consistency is lost for a while. For example, a customer calls your wife and then, before the courier reaches you, calls back and gets you on the line. As you can see, he will receive an inconsistent answer. Still, it remains a great idea as long as such cases are rare. After all, our clients do not suffer from amnesia and do not forget what they told us 5 minutes ago.
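The courier can be sketched as a queue of pending updates that is delivered later. This is an illustration only: `Operator`, `outbox`, and `courier_run` are invented names, and the background process is modeled as an explicit function call so that the sketch stays deterministic.

```python
from collections import deque


class Operator:
    def __init__(self, name):
        self.name = name
        self.notebook = {}     # client name -> list of notes
        self.outbox = deque()  # updates waiting for the courier

    def _write(self, client, note):
        self.notebook.setdefault(client, []).append(note)

    def remember(self, client, note):
        # Write locally and answer at once; nobody else is blocked.
        self._write(client, note)
        self.outbox.append((client, note))
        return "Thank you, we have recorded everything"

    def remind(self, client):
        return list(self.notebook.get(client, []))


def courier_run(sender, receiver):
    """Carry every pending update from one notebook to the other."""
    while sender.outbox:
        client, note = sender.outbox.popleft()
        receiver._write(client, note)


you, wife = Operator("you"), Operator("wife")

wife.remember("Ivan Andreevich", "flight to New York")
print(you.remind("Ivan Andreevich"))  # courier has not arrived: stale answer

courier_run(wife, you)
print(you.remind("Ivan Andreevich"))  # after delivery both notebooks agree
```

The window between `remember` and `courier_run` is the period in which a client can receive an inconsistent answer.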
I tried to explain the CAP theorem and eventual consistency in simple, accessible language. I will be glad to receive your questions and comments.