
In the wake of MySQL Users Conference 2011

I want to share my impressions from my trip to the MySQL Users Conference, held in Santa Clara, California, from April 14 to April 17, 2011.

Unlike in previous years, nothing happened to MySQL during the conference, which is nice in itself (let me remind you that two years ago it was on the first day of the conference that the acquisition of Sun Microsystems by Oracle was announced).

First and foremost, a conference for me is about meeting people. There were not that many participants this year (about 1,100 people), but the proportion of speakers and experts among the attendees was very high.
A lot of topics were discussed during the conference week; there is no single connected story to tell, so I will describe what I found interesting in the form of notes.

What's new in MySQL from Oracle

The presence of Oracle speakers at the conference was quite limited, because the company had initially planned to shift its focus to Collaborate 11, which ran in parallel with O'Reilly MySQL on the other side of the United States. I must say it was a mistake: speakers who went to Collaborate 11 reported very few listeners (20-30 people at most), whose interest in MySQL was lukewarm at best. Indeed, Collaborate is mostly about Oracle's application software, and it would be strange to expect much interest in an RDBMS there.

In the main keynote, Tomas Ulin spoke about MySQL 5.6, currently at release 5.6.2, whose key new features relate to replication; also worth noting are a few query optimizer features with roots in MySQL 6.0. The rest is an incremental release that extends the capabilities of 5.5, 5.1 and 5.0.

Among the new features that look “hot” at first glance, Tomas talked about the memcached API for InnoDB. Maybe I don't understand anything about NoSQL technologies, but this idea does not look particularly viable to me: the API can only be used for reads without violating data consistency, yet read workloads are already quite easy to scale with MySQL. For writes the API is poorly suited, primarily because memcached clients very rarely commit transactions.

COMMIT is an expensive operation, and without regular commits all of InnoDB's advantages over any other NoSQL solution are lost. If you really want to use MySQL as a NoSQL store, then in my opinion MySQL Cluster with its Cluster API is a much better fit out of the box: it gives significantly better performance without sacrificing data consistency.
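To illustrate why commit frequency matters so much here, a minimal sketch (the table and values are hypothetical; the effect is most visible with innodb_flush_log_at_trx_commit=1, where every commit forces a log flush to disk):

-- With autocommit, each statement is its own transaction and pays for a flush:
INSERT INTO kv (k, v) VALUES ('user:1', 'a');
INSERT INTO kv (k, v) VALUES ('user:2', 'b');

-- Grouping writes into one transaction amortizes that cost over the whole batch:
START TRANSACTION;
INSERT INTO kv (k, v) VALUES ('user:3', 'c'), ('user:4', 'd');
COMMIT;

A memcached client that never (or very rarely) sends the equivalent of COMMIT either pays the per-operation price or gives up durability, which is exactly the trade-off described above.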

Other news, such as join pushdown in MySQL Cluster 7.2 or the new Windows Installer, was not especially interesting to me personally.

About what MariaDB is doing

In a nutshell, Monty Program has released MariaDB 5.2 GA and is preparing to release 5.3 GA.
The main feature of 5.2 is virtual columns, that is, the ability to define additional, “computed” columns in a table. This is most useful when there is an index on such a column: you can quickly search by the result of the computation without recalculating the value every time. The second convenient application of this functionality, in my opinion, is together with partitioning: the PARTITION BY clause accepts far from any expression, so as a workaround you can put the result of a calculation into a stored virtual column and specify that column in PARTITION BY.
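A minimal sketch of the indexed case (MariaDB 5.2 syntax; the table and the expression are made up for illustration):

CREATE TABLE orders (
  created DATETIME NOT NULL,
  amount DECIMAL(10,2) NOT NULL,
  created_year INT AS (YEAR(created)) PERSISTENT,  -- stored on disk, so it can be indexed
  KEY (created_year)                               -- search by the computed value without recalculating it
);

A PERSISTENT (stored) column like created_year is also the kind of column the author suggests referring to in PARTITION BY when the desired expression itself is not accepted there.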

Further examples can be found at:
kb.askmonty.org/v/virtual-columns
www.openlife.cc/blogs/2010/october/what-would-you-use-virtual-columns

5.3 contains quite a large number of features. In a sense, it is MySQL 5.5 from MariaDB: all those optimizer features that we once worked on together in the now-dead MySQL 6.0 branch have been polished by the Monty Program Ab developers and included in MariaDB 5.3.

The most serious items in 5.3 are a new cost-based optimizer for subqueries, hash joins, and microsecond support in the temporal types. All of these are complex tasks that require long, hard work.
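Purely as an illustration (hypothetical tables; DATETIME(6) is the MariaDB 5.3 syntax for fractional seconds):

-- The kind of IN-subquery the new cost-based subquery optimizer is aimed at:
SELECT o.id
FROM orders o
WHERE o.customer_id IN (SELECT c.id FROM customers c WHERE c.country = 'US');

-- Microsecond precision in temporal types:
CREATE TABLE events (happened_at DATETIME(6) NOT NULL);
INSERT INTO events VALUES ('2011-04-14 10:15:30.123456');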

The fact that the 6.0 code has essentially been rewritten from scratch, and the fact that the best MySQL testing engineers work at Monty Program Ab (Philip Stoev alone is worth a whole department), give me confidence that these features will be fairly stable.

MariaDB 5.3 is based on MySQL 5.1, so to get the benefits of both MySQL 5.5 and MariaDB 5.3 it is worth waiting until the end of the year when, according to Monty (with allowance made for his optimism), MariaDB 5.5 will be released, combining the capabilities of the two.

My personal attitude towards MariaDB changed at this conference. Two years ago it seemed that Monty simply wanted a fork of his own and was ready to include every patch that came along. Now that the dust has settled, it is clear that the weak patches have fallen away by themselves, and the development process has settled down and is producing good results.
I think that in a few years MariaDB will be able to offer very serious competition to Oracle's MySQL.

References:
en.oreilly.com/mysql2011/public/schedule/detail/19899
assets.en.oreilly.com/1/event/36/New%20Query%20Engine%20Feature

Drizzle

For those interested in this fork, it is no longer news that Drizzle has shipped its first stable release, Drizzle7. What is news is that Rackspace, a major American hosting provider and the main sponsor of Drizzle development, stopped its financial support for the project once the stable release was out. Whether other companies are willing to sponsor its development is unclear.

I would not predict the sudden demise of this project, especially since many of the ideas in Drizzle deserve attention. Take Firebird, the open-source version of InterBase, a DBMS that was quite popular in its day: few people know that the project is still under development.

Unfortunately, unlike other MySQL forks, Drizzle, in my opinion, has not found "its" user. I am not aware of such users, nor of the reasons why they would prefer Drizzle.

Related Links:

krow.livejournal.com/700783.html
en.oreilly.com/mysql2011/public/schedule/detail/17806

Keynote by Baron Schwartz

Baron Schwartz, Percona's lead architect, has always reminded me a bit of a Protestant missionary. His keynote on the future of open source databases was also missionary in spirit.

However, there is little to say about the report itself. Apart from fairly obvious observations about technological changes in the industry, it resembled a letter to Santa Claus with a list of gifts and wishes.

Indeed, open source databases do not meet the requirements of today's technological realities and cannot yet completely replace commercial DBMSs. However, to predict how our industry will develop it is enough to look at the operating system market, where architectural stabilization arrived, for the most part, together with the stabilization of the hardware environment. Fortunately, the hardware environment for DBMSs continues to evolve, which allows our industry to stay forever young.

I would like to mention this report for one reason: the stated thesis that the relational data model will remain dominant in the industry. I fully agree with that. My reasons, however, are quite specific and have long been known: the separation of the data model from its presentation, simplicity, and a clear mathematical apparatus. The relational model is the greatest common divisor uniting a multitude of approaches and views. It is this unsurpassed versatility that has kept the model relevant for 40 years.

The report also contained a nod to Oracle, mentioning MySQL 5.5 as one piece of evidence that Oracle is paying attention to MySQL. Unfortunately, the length of the system software development cycle is always overlooked in such cases: MySQL 5.5 was born back in 2005, long before an Oracle acquisition was in anyone's plans. The desire to support Oracle is laudable, but citing 5.5 as proof of Oracle's intentions has already gone stale.
If any one company deserves thanks for the 5.5 release, it is Sun Microsystems, where marketing timpani had long been successfully replaced by a culture of support for technological innovation.

en.oreilly.com/mysql2011/public/schedule/detail/17808

Talk with Bruce Momjian

My first meeting with Bruce was in 2004, at the O'Reilly Open Source Convention in Portland. I was working at our booth then, talking with users and showing new product features. Bruce came up and asked when we would finally have transactions :)
This year PostgreSQL and EnterpriseDB took an active part in the conference: a plenary talk, a few regular talks, a huge booth at the expo. Remembering that first meeting, I started the conversation by asking when replication would finally appear in PostgreSQL.

Replication in PostgreSQL has, of course, existed for quite some time, but, which is gratifying, since version 9.0 it has been included in the server itself. Unlike MySQL, PostgreSQL does not need a separate replication log (a binary log): it simply ships its write-ahead log to the replica. This not only reduces the amount of writing to disk, but also removes the group commit problem and the need to support distributed transactions (XA), something MySQL has been trying to implement efficiently for a long time.

The replication log format in PostgreSQL is thus "physical": it contains the actual changes to pages and files on disk, without being tied to tables and rows. This, in turn, makes it possible to implement replication in an MVCC (multi-version concurrency control) environment without additional locks.

In general, replication in PostgreSQL is simpler and more reliable, which is both an advantage and a disadvantage, for example, in cases where there is a need to create complex replication topologies.
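For the curious, a quick way to watch streaming replication on 9.0 is to compare WAL positions on the two servers (function names as I recall them from the 9.0 documentation; treat this as a sketch and verify against the docs):

-- On the primary: the current write-ahead log position being generated.
SELECT pg_current_xlog_location();

-- On the hot standby: the last WAL positions received and replayed;
-- the gap to the primary's value gives a rough idea of replication lag.
SELECT pg_last_xlog_receive_location(), pg_last_xlog_replay_location();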

developer.postgresql.org/pgdocs/postgres/high-availability.html
kristiannielsen.livejournal.com/12254.html
www.theserverside.com/feature/Comparing-MySQL-and-Postgres-90-Replication

Tungsten replication

Continuing the topic of replication: there was an interesting talk from Continuent about their replication product for MySQL and PostgreSQL, the Tungsten Replicator. It solves many of the problems that arise with more or less "non-standard" uses of MySQL replication:

- support for a global replication ID, which greatly simplifies recovery when one of the replicas fails (see the sketch after this list for the stock-MySQL procedure this replaces)
- multi-source replication, that is, when the same replica can receive data from several servers; to resolve conflicts, Tungsten lets you write triggers in Java
- better performance thanks to applying replication in several threads (currently replication in MySQL is always executed in a single thread)
- the ability to set arbitrary filters on replication events (MySQL uses server options such as replicate-do-db, replicate-wild-do-table, etc. for this purpose)
- replica consistency checking
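As a reference point, here is roughly what a failover looks like with stock MySQL replication, where replicas are positioned by binary log file name and offset; this is the part a global replication ID is meant to replace (host, user and positions below are made up for illustration):

CHANGE MASTER TO
  MASTER_HOST = 'new-master.example.com',
  MASTER_USER = 'repl',
  MASTER_PASSWORD = '...',
  MASTER_LOG_FILE = 'mysql-bin.000042',  -- has to be located manually on the new master
  MASTER_LOG_POS = 107;                  -- likewise; a wrong position silently corrupts the replica
START SLAVE;

With a global replication ID every event carries a cluster-wide identifier, so a replica can simply continue from the last ID it has applied, whichever server it is pointed at.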

In fairness, I must say that many of the new replication features in MySQL 5.6.2 repeat the Tungsten Replicator. On the other hand, this only confirms the relevance of the solutions it implements. At the conference, Continuent's Tungsten Replicator won the award for best product of the year.

tungsten.sourceforge.net/docs/Tungsten-Replicator-Guide/Tungsten-Replicator-Guide.html
en.oreilly.com/mysql2011/public/schedule/detail/19268

Report by Mårten Mickos

Mårten has always appealed to me first of all as a speaker who is a pleasure to listen to.
In his talk he tried to work out how the software world will change as cloud computing develops. The main points are as follows:

- The Internet is awaiting a further qualitative leap as the number of users grows to several billion

- The amount of data on the Internet is also growing exponentially. This creates a need for ever new solutions such as NoSQL. NoSQL technologies are justified primarily because, with a huge number of users, the hardware savings that NoSQL provides have a significant cumulative effect.

- The development of cloud infrastructure eliminates the need to work on software distribution channels such as Linux distributions. Being able to download a package for your operating system fades into the background; what matters is simply getting access to the service in the cloud. Quoting Mårten:
“The GPL was about distribution of derivative works. But few companies still distribute them.”

- cloud technologies will be built on new data stores and infrastructure, in addition to traditional DBMSs

- the open source software development model is changing in the cloud infrastructure: the question of derivative works no longer arises, because the software is no longer distributed.

- the software development model for the Web is changing: many relatively weak single-processor virtual machines in a cloud infrastructure are the new development platform, even for initially small sites.

en.oreilly.com/mysql2011/public/schedule/detail/17807

My own report

My talk was devoted to the new metadata locking subsystem in MySQL 5.5. I will not repeat the abstract and conclusions here; the talk is fully available in electronic form, and the slides and text can be found in the conference program and online:

en.oreilly.com/mysql2011/public/schedule/detail/17340
www.slideshare.net/kostjaosipov/metadata-locking-in-mysql-55

Source: https://habr.com/ru/post/119418/

