
NoSQL and Big Data: are working people being deceived?

Recently we had the chance to talk with the great Monty, Michael Widenius, the author of the original version of the open source database MySQL, who is now working on its fork, MariaDB. (By the way, both of these databases are supported in Jelastic.)

As you know, the world produces and processes more and more data (the so-called “Big Data” phenomenon). It is generally accepted that there is now so much data that it is difficult or impossible to process it using traditional databases and software methods. This triggered a wave of non-relational databases (NoSQL), in which emphasis is placed on high scalability. An expert in the field of databases, Monty, shared with us his thoughts on the current and future state of SQL, NoSQL and Big Data. Some of his answers were somewhat unexpected, so we are happy to provide here a Russian translation of the transcript of our conversation:

Could you tell us a little about the history of NoSQL and big data? Why is this topic so interesting lately?
All of this “new NoSQL movement” began with a blog post from Twitter employees who believed that MySQL was not good enough for them. They needed something “better,” something like Cassandra.

The main cause of Twitter's problems with MySQL was improper use of the database itself. In addition, the solution they proposed could have been implemented in MySQL just as easily as in Cassandra.

I cannot find the original article, but I did find a slightly later mention that Cassandra would replace MySQL.

Today (3 years later) Twitter is still using MySQL as the main repository for tweets. Cassandra, ultimately, was unable to replace MySQL.

The main reason for the popularity of NoSQL is that, unlike SQL, you can start using it without any additional development. In fact, starting with NoSQL is easy, but you will pay for it later when you lose control of your data.

Thus, the main advantages of most NoSQL solutions (at least until MariaDB appeared) are:
• Quick access to data (if all the data fits in RAM).
• Fast replication / distribution of data across many nodes.
• Flexible schema (you can add new columns instantly).

What do you personally think about the future of NoSQL / Big Data? Your predictions?

I believe that most people use NoSQL mainly because of the "hype" around this technology. Most companies do not actually have data volumes like those of Facebook and Google, and they cannot afford to hire specialists to set up and continuously develop the database.

Relational databases - SQL - will not go anywhere. NoSQL simply cannot replace them. Almost everyone needs joins to use data.

However, there are situations when using NoSQL makes sense. I think in the future we will see more combined solutions that include SQL and NoSQL.

That is why we are expanding the functionality of MariaDB to be able to access NoSQL databases, such as Cassandra and LevelDB.

If NoSQL is needed only in rare cases, why do people still use them? What are the main reasons?

Because it is always much easier to start with NoSQL. You do not need to learn SQL and define a database schema before using it. Some use NoSQL because they believe it scales better than SQL.

Can SQL beat NoSQL? What unique advantages make SQL better than NoSQL?

In cases where the data does not fit in memory, SQL, as a rule, is superior to NoSQL.

There are also many other things that NoSQL simply cannot do. Most NoSQL solutions are optimized for single-key access. For anything more, you need to write a program. In that case it is very difficult to beat the SQL optimizer on complex tasks, especially for queries that are generated automatically from user requests (which most websites require).

SQL can also outperform NoSQL when working on a single machine. In a cluster where everything is in memory, on the contrary, NoSQL usually beats SQL on lookups by key.
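To illustrate the single-key point, here is a minimal sketch that is not from the interview itself. It assumes a hypothetical pair of collections (`USERS`, `ORDERS`) and uses a plain Python dict as a stand-in for a key-value store and SQLite as a stand-in for an SQL engine. In the key-value version the application has to write the scan and the lookups by hand; in the SQL version the query only describes the result and the engine plans the join.

```python
import sqlite3

# Hypothetical sample data, invented for this illustration.
USERS = {1: "alice", 2: "bob"}                      # user_id -> name
ORDERS = [(1, 1, 30.0), (2, 1, 12.5), (3, 2, 7.0)]  # (order_id, user_id, amount)


def totals_key_value():
    """Key-value style: only lookups by key, so the join is hand-written."""
    users = dict(USERS)
    orders = {oid: (uid, amount) for oid, uid, amount in ORDERS}
    totals = {}
    for uid, amount in orders.values():  # full scan coded by the application
        name = users[uid]                # single-key lookup
        totals[name] = totals.get(name, 0.0) + amount
    return totals


def totals_sql():
    """SQL style: state the result wanted and let the engine plan the join."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, amount REAL)"
    )
    conn.executemany("INSERT INTO users VALUES (?, ?)", list(USERS.items()))
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", ORDERS)
    rows = conn.execute(
        "SELECT u.name, SUM(o.amount) "
        "FROM users u JOIN orders o ON o.user_id = u.id "
        "GROUP BY u.name"
    ).fetchall()
    conn.close()
    return dict(rows)
```

Both functions return the same per-user totals. The difference is where the access logic lives: in application code for the key-value version, in the query planner for the SQL version.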

What do you think about Cloudera's latest announcement about raising investment for commercial Hadoop?

The main problem with Hadoop is that there is no proven business model that would guarantee investors the tenfold return they expect. In this regard, it is difficult for me to understand how Cloudera can survive in the long run.

A good product alone is not enough; you also need to be able to make money from it.

Who is most actively promoting Big Data and NoSQL?

All the NoSQL vendors, of course ;)

If all this is just a hoax, why so much hype?

It is not a hoax for everyone. There are many big companies and projects that can benefit from Big Data.

However, I want to say that most people do not need either Big Data or NoSQL, since it will be more expensive in the long run, when you finally discover that NoSQL cannot solve all the needs of your business.

And finally, how does MariaDB fit into all this?

We are committed to making MariaDB a kind of bridge between NoSQL and SQL. That's why we added Cassandra support first and are currently working on adding LevelDB support.

We realize that NoSQL is trying to satisfy some really important needs, and that is why we added dynamic columns (which make the SQL schema as flexible as most NoSQL schemas) and faster replication.

In MariaDB 10.0, replication will be even faster, more flexible and fault tolerant.

We also work closely with Galera to provide a multi-master solution in MariaDB.

All this in order to better adapt to the changing world and to meet the existing needs of the people - perhaps even artificial needs;)
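The dynamic columns idea mentioned above can be sketched roughly as follows. This is an analogy, not MariaDB's implementation or API: MariaDB exposes dynamic columns through SQL functions such as `COLUMN_CREATE` and `COLUMN_GET`, while the sketch below imitates the same idea with a JSON text column in SQLite. The table `items` and the helpers `add_item` and `get_attr` are hypothetical names chosen for the example.

```python
import json
import sqlite3


def open_store():
    """Create a table with fixed columns plus one free-form attribute column."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, attrs TEXT)"
    )
    return conn


def add_item(conn, name, **attrs):
    """Store per-row attributes as JSON; no ALTER TABLE is needed for new ones."""
    conn.execute(
        "INSERT INTO items (name, attrs) VALUES (?, ?)",
        (name, json.dumps(attrs)),
    )


def get_attr(conn, name, key):
    """Read one attribute of one row; returns None if the row or key is absent."""
    row = conn.execute(
        "SELECT attrs FROM items WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        return None
    return json.loads(row[0]).get(key)


conn = open_store()
add_item(conn, "shirt", color="blue", size="XL")
add_item(conn, "phone", storage_gb=64)  # a different set of attributes per row
```

Each row carries its own set of attributes next to the ordinary relational columns, which is the kind of schema flexibility the answer refers to. A real dynamic-columns setup would keep the packing and unpacking inside the database rather than in application code.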

Please tell us about the new MariaDB Foundation. What does it mean for developers?

The MariaDB Foundation was created so that many independent companies could work together toward a common goal: actively developing MariaDB as an open source project. The Foundation employs developers to produce the builds, run quality control, review patches, and do everything else that is necessary for the project to move forward.

Many thanks, Monty! MariaDB continues to be very popular among users of the Jelastic platform. All the best!

Source: https://habr.com/ru/post/166845/

