Introducing CUBRID - Web-Optimized DBMS

Greetings to all Habr users!

We have not yet introduced our project to Habr users ourselves, but you have most likely already read about CUBRID in Lev Khomich's Habr post. Some points in that article are not entirely correct, and I want to fix them here. So I propose that we get acquainted and look in more detail at why we present CUBRID as the DBMS most optimized for Web applications. I will also cover details that you will not find anywhere else (yet), even on the official project website, http://www.cubrid.org. You will learn a lot and, I hope, share your ideas, advice, and opinions in the comments. I am sure you will enjoy the acquaintance.

First, when did the development of CUBRID start?
Different sources give different dates: 15 years ago, or 2006. In truth, the DBMS that CUBRID is based on was sold and was in very great demand long before MySQL appeared, let alone CUBRID itself. It was one of the first to have an object-oriented architecture, which is widely used today in the gaming and multimedia industries. The DBMS became so popular that Oracle offered 1 billion US dollars for the source code and the license to develop and sell it further. The developers rejected the offer and instead found sponsors with assets of $2 billion. That was back in the early 90s. This is why Lev Khomich's post and some other sources speak of fifteen or more years of experience.

Officially, however, we, the developers, started work on CUBRID in 2006, when NHN, the main player in the South Korean search market with a 74% share, brought our main architects and programmers together into a team of 40 people and organized the CUBRID project. Ranked 13th in the global IT industry, NHN had sufficient human and financial resources to launch the project successfully. By that time, NHN already provided over 100 Web services to users in South Korea, Japan, China, and the United States, including many online games, search, social, and other services. We were confident that Web services, which were becoming more popular and diverse throughout the world, would determine the course of the IT industry. Therefore, we set the goal of developing the most optimized database management system for Web services and opening its code under the GPL version 2.0 license.

Thus, the company decided to create an object-relational database management system that would provide the benefits of both an OODBMS, which is so often used in online games and multimedia services, and an RDBMS, which had become the most popular solution in all other industries. For this purpose, the company acquired a license for that same 15-year-old OODBMS and took the open SQL-92 standard, already established by that time, as the basis of the relational part. This was the beginning of the development of the CUBRID DBMS.

First open source

We developed CUBRID for two years, and by October 2008 we released version 1.0 of the new DBMS, focused on use with Web applications. The first stable release was deployed in NHN's own internal services. Then, within the next month, we finalized the database and publicly announced CUBRID as the first open source database of South Korea.

The popularity of CUBRID in the domestic market grew so quickly that within a year several thousand users began to develop and adapt various applications for the CUBRID DBMS, such as the LACP (Linux, Apache, CUBRID, PHP/Perl/Python) and LnCP (Linux, nginx, CUBRID, PHP/Perl/Python) stacks, Windows installers, and well-known CMSs (WordPress, phpBB, Joomla). During this first year, CUBRID was introduced into the internal management systems of the White House of Korea, the National Tax Service of Korea, and numerous ministries and corporations.

Thus, the first year was considered very successful. However, most of the user-contributed development supported only Korean, and I and many others on the team did not like that. After all, we were developing a DBMS not only for Korean users but for the entire IT community. Therefore, in October 2009, exactly one year after the first release, we moved the project's source code to a new home on SourceForge.net so that users all over the world could follow the development of the project. Since then, SF.net has been the main SVN repository, and English has been the main language of development and documentation.

Key features and specifications

Today, the CUBRID DBMS is developed for two main operating systems, Linux and Windows, for which the CUBRID server, all client applications, and the programming interfaces are available. For Mac OS X, only client applications are currently available; with them you can work fully with remote CUBRID servers. Development of the CUBRID server itself for Mac OS is not planned yet.

The CUBRID server is written in C/C++ and is distributed under the GPL license, version 2.0 or higher. Client applications are developed in various languages and are usually distributed under the BSD license (I will cover CUBRID's licensing policy in more detail in the next post). The main database administration tools, CUBRID Manager, Query Browser, and Migration Toolkit, are written in Java. The programming interfaces are developed in C.

As I said earlier, the relational part of CUBRID is based on the open SQL-92 standard. Many DBMSs support it, but each implements it differently. Take the system tables that store metadata about all existing databases or about a specific one. MySQL, MSSQL, and some other DBMSs keep this metadata in separate system databases, for example INFORMATION_SCHEMA, which only the system itself can edit directly. This is convenient in principle, except that when databases are moved to another server, the system databases/tables on the new server (and on the old one too) must be updated. This usually happens automatically when the databases are restored, which requires additional resources, especially if the database contains hundreds or thousands of tables. That much is tolerable. It is worse when the system tables are not updated at all or access to them changes. In that case, direct administrator intervention or modification of the client applications is required.

In CUBRID, system tables are implemented a little differently. Each database you create in CUBRID has its own system catalog classes and virtual catalog classes, which hold the data about that database, including all indexes, columns, users, triggers, and so on. Database transfers go without a headache. Personally, I like this implementation more.

There has been talk that CUBRID has no tables, columns, and much else that is found in ordinary relational DBMSs. CUBRID has tables, columns, procedures, and everything else. Data in CUBRID can simply be accessed in different ways. You can work with tables (the relational approach) or classes (the object approach); with rows (the relational approach) or class instances (the object approach); with columns or attributes; with procedures or methods. Thus, you can use regular SQL (SELECT index_name FROM db_index) to retrieve, for example, the names of all indexes used throughout the database. There is no need to refer to an external database. You can also restrict the result to only primary, reverse, or unique indexes, or only foreign keys. If you are used to the relational concept, you will not notice any difference from any other RDBMS.
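As a rough illustration of restricting the catalog query to one kind of index, here is a minimal Python sketch that only builds the SQL text. The function name and the mapping are hypothetical, and the flag column names (is_primary_key, is_foreign_key, is_unique, is_reverse) and their 'YES'/'NO' values are assumptions that should be checked against the catalog documentation for your CUBRID version.

```python
# Sketch only: builds SQL text against the db_index catalog view.
# The flag column names and their 'YES'/'NO' values are assumptions;
# verify them in the CUBRID catalog documentation before relying on them.
ASSUMED_FLAG_COLUMNS = {
    "primary": "is_primary_key",
    "foreign": "is_foreign_key",
    "unique": "is_unique",
    "reverse": "is_reverse",
}


def index_names_query(kind=None):
    """Return SQL that lists index names, optionally restricted to one kind."""
    sql = "SELECT index_name FROM db_index"
    if kind is None:
        return sql
    if kind not in ASSUMED_FLAG_COLUMNS:
        raise ValueError(f"unknown index kind: {kind!r}")
    return f"{sql} WHERE {ASSUMED_FLAG_COLUMNS[kind]} = 'YES'"


if __name__ == "__main__":
    print(index_names_query())          # all indexes in the current database
    print(index_names_query("unique"))  # only unique indexes
```

The resulting string can be passed to whichever CUBRID client or driver you use; because the catalog lives in the database itself, no second connection to a separate system database is involved.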

CUBRID implements ACID (Atomicity, Consistency, Isolation, Durability), so transactions are fully supported. CUBRID can partition, replicate, compress, check, and recover data. It also supports hot/online backups, updatable views, triggers, and hierarchical and nested queries. CUBRID places no restrictions on the size of the database, the number of tables or rows, or even the size of certain data types such as BLOB and CLOB. It has cursors as well as built-in counter functions, which Lev described in detail. CUBRID also lets you cache and schedule queries. There are many ways to optimize queries on the spot using SQL hints. One of the main features of CUBRID is its native support for High Availability. This built-in feature is a rather large topic in itself, so I will cover it in more detail in a separate post.

Where do we use CUBRID ourselves?

In general, CUBRID is a full-featured database management system that can handle data without interruption under very high load. For example, at NHN we use CUBRID on the servers of the NAVER search engine, which receives requests from more than 17 million unique users per day. CUBRID is used in the search results monitoring system on NAVER.com and is directly responsible for storing data on the quality of the results. To improve the relevance of search results and combat spam sites, we need to record the keywords used in searches and associate each of them with all Web pages already indexed by the search engine. Millions of records are inserted, then updated, and of course retrieved from the database, and CUBRID copes with it flawlessly.

You are most likely wondering how well CUBRID copes with system crashes. The causes may vary, but for us at NHN it is important that availability be nine nines, with six nines as the lower limit. Therefore, we always enable CUBRID's High Availability option on all servers. There was once a case when the master server of one of the services failed, and because of a physical problem at that. The failure of the main server could have taken down the entire service, but thanks to CUBRID's High Availability feature, the master role was automatically transferred to the primary slave server. This happened so quickly, within the timeout set in the notification system, that even the database administrators did not notice the hardware failure until they looked at the logs during a scheduled check. It was the first time, and so far the only one, that an active server went down in production.
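To put the "nines" in perspective, here is a small Python sketch of the arithmetic. It assumes that "nine nines" means 99.9999999% availability and that the budget is measured over a 365-day year; both readings are my assumptions, not figures from NHN.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000; leap years ignored


def allowed_downtime_seconds(nines):
    """Yearly downtime budget for an availability of `nines` nines."""
    unavailability = 10 ** -nines  # e.g. six nines -> 0.000001
    return SECONDS_PER_YEAR * unavailability


if __name__ == "__main__":
    for n in (6, 9):
        print(f"{n} nines: {allowed_downtime_seconds(n):.6f} s per year")
```

Under these assumptions, six nines leaves roughly 31.5 seconds of downtime per year and nine nines roughly 31.5 milliseconds, which is why failover has to be automatic rather than something an administrator does by hand.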

Current status

To date, we have developed a very large number of features in CUBRID, many of which are fully compatible with other RDBMSs such as MySQL or MSSQL, alongside many unique ones. For user convenience, we strive for maximum compatibility with MySQL so that users switching to CUBRID can adapt their applications easily. To this end, we have planned several "MySQL compatibility" phases at the level of SQL and the programming interfaces. The first phase, a fairly broad package of MySQL-compatible functions, was completed and included in CUBRID 8.3.0. The programming interfaces were updated in parallel. A few PHP functions remain that are not yet fully compatible with mysql. At the beginning of next month (May 2011) we plan to release a new version, CUBRID 8.4.0, with the second phase, which will cover almost 90% of MySQL syntax. The third and final phase is planned for the end of the summer. Thus, by the beginning of autumn, I hope we will have closed all the gaps between CUBRID and MySQL.

I will cover additional details, the course of development, plans, and other interesting stories from the life of CUBRID in future posts. I hope this article gives you plenty of food for thought. Please download CUBRID, work with it, and share your impressions, remarks, and wishes in the comments.

Source: https://habr.com/ru/post/117687/
