📜 ⬆️ ⬇️

Everything you didn't know about the CAP theorem

During my first experience with distributed systems, I was constantly confronted with a certain CAP-theorem, I had to dig a little to learn and understand it from all sides. I am not a database wizard, but I hope that my little exploration of the world of distributed systems will be useful for ordinary developers. In the article I will talk about what CAP is, its problems and alternatives, and also consider some popular database systems through the CAP prism.

CAP theorem


This theorem was presented at a symposium on the principles of distributed computing in 2000 by Eric Brewer. In 2002, Seth Gilbert and Nancy Lynch of MIT published a formal proof of the Brewer hypothesis, making it a theorem.

According to Brewer, he wanted the community to start a discussion about compromises in distributed systems and after some years began to amend it and make reservations.

What is behind CAP


CAP states that in a distributed system it is possible to select only 2 of the 3 properties:
')

There are already sufficiently visual proofs of this theorem, therefore I will give references to Bauman University and proof in the form of the service “Call me, remind!” .

Basically this is all a triangle.


Many articles boil down to such a simple triangle.

image


We put into practice


To use CAP theorems in practice, I chose the 3 most, in my opinion, suitable and quite popular database systems: Postgresql, MongoDB, Cassandra.

Let's look at Postgresql


The following items refer to the Postgresql abstract distributed database.


Thus, the system can not continue to work in the case of a partition, but provides strong consistency and availability. This is a CA system!

Let's look at MongoDB


The following items refer to the MongoDB abstract distributed DB.


Thus, the system can continue to work in case of network separation, but the CAP-availability of all nodes is lost. This is a CP system!

Look at Cassandra


Cassandra uses the master-master replication scheme, which in effect means an AP system in which network separation leads to the self-sufficient operation of all nodes.

It would seem that everything is simple ... But this is not so.

CAP problems


A lot of detailed and interesting articles are written on the topic of problems in the CAP theorem, here, on Habré, so I will leave the link to CAP no longer relevant and the myths about the CAP theorem . Be sure to read them, but treat each article as a kind of new look and do not take it too close to your heart, because some are scolded, others are praised. I myself will not go too far, but I will try to give some necessary compilation.

So, the problems of the CAP theorem:


What is wrong with the definitions?


Consistency in CAP actually means linearizability (and it is really difficult to achieve). To explain what linearity is, let's look at the following picture:

image

In the case described, the referee finished the game, but not every client gets the same result. To make its system linearized, we need to instantly synchronize data between the referee and other data sources so that when the referee finishes the game, each client will receive the correct information.

Availability in CAP, based on the definition, has two serious problems. The first is that there is no concept of partial availability, or some degree of it (percentages for example), but there is only full accessibility. The second problem is unlimited response time to requests, i.e. even if the system responds in an hour, it is still available.

Resistance to distribution does not include fallen nodes, and here's why:

  1. By definition. In the availability and spelled "... every node (if not failed) always ..."
  2. Based on the evidence. The proofs of the CAP theorem state that some code must be executed on the nodes.
  3. Well, some of my (and not only) conjectures. In case of a node fall, the system can recover, communicate with other nodes and continue working as if nothing had happened. In case of network separation, you will have to wait for the connection to be restored.

Therefore, you need to remember about the ability of the system to recover, but beyond the CAP theorem.

AP / CP selection


Communication between nodes usually occurs through an asynchronous network, which can delay or delete messages. The Internet and all our data centers have this property, and these are not unlikely incidents, so CA systems are rarely considered as part of the development.

Many systems are just P


Imagine a system in which two nodes (Master, Slave) and a client. If suddenly you have lost contact with the Master, the client can read from the Slave, but cannot write - there is no CAP-availability.

Ok, like the CP system, but if the Master and Slave are synchronized asynchronously, then the client can request data from the Slave before successful synchronization - we lose CAP-consistency.
image

Clean AP and CP systems


Pure AP systems may include just 2 number generators. Pure CP systems may not be available at all. I will try to come to an agreed state and will not answer us. Go ahead, CP systems give us not the strong consistency that we expect, but eventual consistency. We'll talk about it a little later.

How to live with it


In the end, it's just an attempt to classify something abstract, so you don't need to reinvent the wheel. I recommend using the following approach when trying to work with distributed databases:


PACELC


The PACELC theorem was first described and formalized by Daniel J. Abadi from Yale University in 2012. Since the PACELC theorem is based on CAP, it also uses its definitions.

The whole theorem reduces to IF P -> (C or A), ELSE (C or L).

Latency is the time for which the client will receive an answer and which is regulated by some level of consistency. Latency (latency), in a sense, represents the degree of availability.

image


A little bit about BASE


BASE is a peculiar contrast to ACID, which tells us that true coherence cannot be achieved in the real world and cannot be modeled in highly scalable systems.

What is behind BASE:


I was asked several times about what is better than ACID or BASE - it depends on your project. For example, if your data is not critical, and the user really cares about the speed of interaction, BASE would be the best option. If it's the other way around - ACID will help you make the system as reliable as possible in terms of data.

A fresh look


Now that we know about most of the pitfalls, let's try to look at the same popular database systems from the perspective of knowledge gained.

Postgresql


Postgresql does allow for many different system configurations, so it’s very difficult to describe them. Let's just take the classic Master-Slave replication with implementation through Slony.


MongoDB


Let's find out something new about MongoDB:


Cassandra



findings



Thanks for attention!

Source: https://habr.com/ru/post/328792/


All Articles