
Since the
previous publication, the expensive editorial has received a large number of reviews. Most of them were positive, which undoubtedly strengthens the faith of the dear editors in humanity. There were also some serious additions in the form of critical comments about MySQL, which I either forgot or never heard of. Which led to the creation of the second part, which in fact is an addition to the first and was not originally in my plans.
So, we continue the analysis of typical misconceptions about MySQL in the framework of
cultural exchange and autumn aggravation. First, a few critical reviews about the first part.
“A long list of MySQL-using companies means nothing, because they use MySQL as key-value storage”
Sometimes they add that they heard it only about Facebook / Twitter, but they do not know about the rest. Someone says they don't use JOINs. But no one indicates the source.
')
First, even if it were, it would change nothing. I gave this short and far from complete list of companies only to show: whether someone likes it or not, but MySQL is the most popular DBMS in the largest web companies, so
legacy can be called it only by living in some of its own. parallel reality.
Secondly, you need to understand that we are talking about a dozen large companies with dozens and hundreds of internal services, each with its own structures and data volumes, response time requirements, read / write distribution, etc. Not all of these services use MySQL, but to say that in each case MySQL is used solely as key-value storage is at least naive.
The easiest way is with Wikipedia - the source code of MediaWiki is open, everyone can look at the types of requests on their own. In addition, Wikipedia employees
talk about the infrastructure in more detail than representatives of other companies.
Returning to Facebook, some of the most heavily loaded services undoubtedly work as key-value. Just because the data is shaded across thousands of servers, and before the DBMS there is a key-value cache, which
somewhat limits the use of aggregation, JOINs, etc. But Facebook has many services and different requirements for them.
This is a fairly recent description of the storage technologies used on Facebook. If you find there a confirmation of the thesis about the exclusively key-value nature of using MySQL - let me know.
"A long list of companies using MySQL means nothing, because now they would not choose MySQL"
Naturally, no justification is given, but I will try it myself. This would be its own logic if it were about static companies that do not grow, where new services and new tasks for data storage and processing do not appear. Or about small companies, where resources are limited, which means the best DBMS is the one with which the “team is familiar”. Returning to the notorious Facebook, neither applies to it. There are no "religious" prejudices, different technologies are used on the basis of specific technical problems. It also happens that new technologies are being developed, because none of the existing ones fit the requirements. Examples: Cassandra, Presto, Scuba.
If you try to summarize the reasons why companies choose and continue to choose MySQL, I will not discover America, if I say that MySQL is used as a reliable and scalable storage for OLTP loads (with key-value as a special case). I diligently avoid PostgreSQL criticism, so I’m going to keep it meaningfully silent - there will be a separate publication.
And if someone says that there is a
magic DBMS for all occasions, buy him a candy.
"MyISAM is actually relevant because ..."
Someone says that “household” (by the way, what is this?) CMS uses MyISAM. Someone says that the system tables are still in MyISAM (and this is partly true even for 5.7, although there were
hopes that this would be fixed). Someone uses MyISAM as a repository for temporary / append-only data such as logs.
I can only repeat myself: MyISAM is really a
legacy in the full sense of the word. The restrictions are well known to everyone, the code is not developed, and it is hardly even supported, and there is a full replacement. If someone else finds a use for her, that's fine, but I would advise you to look at InnoDB. Especially in 5.7, where performance versus MyISAM was one of the priorities. Just say no to MyISAM
drugs , all her problems are not worth any modest performance bonuses.
“But there is such a report and such an article, which proves that MySQL is the path to nowhere”
I know, thanks. The review of replication in general and the report / article in particular should have been published instead of this publication, but I have less time than I would like. Everything will be as promised.
And a couple of new myths that have been added in the comments.
“MySQL is proprietary!”
This is the person who allegedly never imagines and spreads any myths about MySQL. MySQL is a free project under a free license. And the speech seems to be about
MySQL Enterprise . I see almost a direct correspondence between MySQL Enterprise and
EnterpriseDB products. But nobody calls this “proprietary” PostgreSQL.
Updated 11/16/2015: For the few but insistent requests in the comments, I add the following text:
Yes, the GPL license, under which MySQL is distributed, does not allow for a license change, including for a proprietary one. Therefore, there are no closed forks. Oracle, as the owner of copyright, can change the license at its discretion.
The BSD license, under which PostgreSQL is distributed, permits a license change. Therefore, many proprietary forks.
Updated 11/12/2016: As I predicted in the comments a year ago, Postgres Pro
quietly and imperceptibly released its own proprietary fork . The talk about the “proprietaryity” of MySQL somehow came to naught by itself.
“MySQL is not needed, because <funny example from the Internet>”
Usually there is a link to
this article, which, probably, is the most serious addition of all the read comments on the previous article. The article itself, in my opinion, is an example of competent criticism of MySQL without “evangelical” conclusions like “MySQL - the path to nowhere”, “MySQL is not needed”, “obviously defective DBMS” and other things desired. A person lists examples of strange or non-intuitive behavior. Half of them are based on one simple fact: in MySQL, there is an
implicit type conversion , which applies not only to operators, but also to functions. All examples with LEAST () from the mentioned article exploit this single fact. This is also where the popular example with SELECT 0 = 'Sean' applies.
That may look strange for a person who often deals with strong typing (for me, for example, it looks strange), but it is absolutely natural for programmers in JavaScript, PHP, Perl and other languages ​​from the world of the web. Many similar examples can be cited for these languages, and this does not surprise anyone.
For the rest of the examples from this article:
- examples with ENUM: yes, strange, albeit
documented behavior. But in general there are much more
serious reasons for not using ENUM, which apply to any DBMS.
- examples with UNION and INNER JOIN - yes, a lot of problems with the parser, and there are no objections here. I have heard about plans to rewrite the parser in MySQL for many years, and the work has finally
begun , which is good news.
- example from
bug # 27877 : an infamous nuance with the sorting rules for the German language and a bunch of compatibility issues and upgrades. But the solution to this problem has existed for 3 years as basically MySQL, and in MariaDB and Percona Server the solution appeared even earlier.
Updated 03/24/2016: But a
recent example of a no less serious problem with encodings in PostgreSQL.
Summing up, I do not see in such things reasons for the typical conclusions that they make. If someone thinks that there is no such thing in other DBMS, then
I have bad news . Since we are talking here in the context of PostgreSQL, you can search for the keywords “postgresql gotchas”, where funny things happen.
If you try to generalize it and formulate it correctly, then I would say so. In MySQL, compliance with SQL standards has never been a priority, and overall compliance is lower than in many other DBMSs, including PostgreSQL, where compliance with standards is one of the top priorities.
But love MySQL is not for it , and more on that later.
Updated 11/22/2015: There are exceptions to the general rule, i.e. Cases where MySQL supports the standard (s) better than PostgreSQL.
An article on the topic .
Conclusion
I'm still going to write separate publications on the replication and comparison of MySQL with PostgreSQL from the user's MySQL point of view. But before that I would like to close the topic of MySQL criticism from the PostgreSQL community. This is important because such publications are expected to cause a flurry of comments in the style of "no community and divides by zero quietly!". In order not to answer many times, it is more convenient to collect all the answers in one place. Thank you all for the comments, I hope I did not miss anything.