📜 ⬆️ ⬇️

Why PostgreSQL is better than other open source SQL databases. Part 1

Today, let's talk about the benefits of Postgres over other open source systems. We will certainly reveal this topic in more detail on PG Day'16 Russia, which is only two months away.

You may be asking yourself: “Why PostgreSQL?” There are other options for open source relational databases (in this article, we looked at MySQL, MariaDB and Firebird), so what can Postgres offer something they don’t have? The PostgreSQL slogan states that it is "The most advanced open source database in the world." We give several reasons why Postgres makes such statements.

In the first part of this series, we’ll talk about data storage — model, structure, types, and size restrictions. And in the second part we focus more on sampling and data manipulation.
')


Data model


PostgreSQL is not just a relational, but an object-relational database. This gives it some advantages over other open source SQL databases, such as MySQL, MariaDB and Firebird.

The fundamental characteristic of an object-relational database is the support of user objects and their behavior, including data types, functions, operations, domains, and indices. This makes Postgres incredibly flexible and reliable. Among other things, he knows how to create, store and retrieve complex data structures. In some examples below, you will see nested and composite constructs that are not supported by standard RDBMS.

Structures and data types


There is an extensive list of data types that Postgres supports. In addition to numeric, floating-point, text, boolean, and other expected data types (as well as their many variations), PostgreSQL boasts support for uuid, money, enumerated, geometric, binary types, network addresses, bit strings, text search, xml, json , arrays, composite types and ranges, as well as some internal types for identifying objects and log locations. It is fair to say that MySQL, MariaDB and Firebird also have some of these data types, but only Postgres supports them all.

Let's take a closer look at some of them:

Network addresses

PostgreSQL provides storage of various types of network addresses. The CIDR data type (classless Internet Domain Routing, Classless Internet Domain Routing) follows the convention for IPv4 and IPv6 network addresses. Here are some examples:

Also available for storing network addresses is the INET data type, used for IPv4 and IPv6 hosts, where subnets are optional. The MACADDR data type can be used to store MAC addresses to identify hardware, such as 08-00-2b-01-02-03.

MySQL and MariaDB also have INET functions for converting network addresses, but they do not provide data types for internal storage of network addresses. Firebird also has no types for storing network addresses.

Multidimensional arrays

Since Postgres is an object-relational database, value arrays can be stored for most existing data types. This can be done by adding square brackets to the data type specification for the column or by using the ARRAY expression. The size of the array can be specified, but this is optional. Let's look at the holiday picnic menu to demonstrate the use of arrays:

--  ,      CREATE TABLE holiday_picnic ( holiday varchar(50) --   sandwich text[], --  side text[] [], --   dessert text ARRAY, --  beverage text ARRAY[4] --   4-  ); --      INSERT INTO holiday_picnic VALUES ('Labor Day', '{"roast beef","veggie","turkey"}', '{ {"potato salad","green salad","macaroni salad"}, {"chips","crackers"} }', '{"fruit cocktail","berry pie","ice cream"}', '{"soda","juice","beer","water"}' ); 

MySQL, MariaDB, and Firebird do not know how. To store such arrays of values ​​in traditional relational databases, you have to use a workaround and create a separate table with rows for each of the array values.

Geometric data

Location data is quickly becoming a basic requirement for many applications. PostgreSQL has long supported many geometric data types, such as points, lines, circles, and polygons. One of these types is PATH, it consists of a set of consecutive points and can be open (the start and end points are not connected) or closed (the start and end points are connected). Let's take the hiking trail as an example. In this case, the hiking trail is a loop, so the starting and ending points are connected, and it means that my path is closed. Parentheses around a set of coordinates indicate a closed path, and square brackets indicate an open one.

  --      CREATE TABLE trails ( trail_name varchar(250), trail_path path ); --    , --        - INSERT INTO trails VALUES ('Dool Trail - Creeping Forest Trail Loop', ((37.172,-122.22261666667), (37.171616666667,-122.22385), (37.1735,-122.2236), (37.175416666667,-122.223), (37.1758,-122.22378333333), (37.179466666667,-122.22866666667), (37.18395,-122.22675), (37.180783333333,-122.22466666667), (37.176116666667,-122.2222), (37.1753,-122.22293333333), (37.173116666667,-122.22281666667))); 

The PostGIS extension for PostgreSQL complements the existing properties of geometric data with auxiliary spatial types, functions, operators, and indexes. It provides location support and supports both raster and vector data. It also provides compatibility with many third-party geospatial tools (copyrighted and open source) for displaying, drawing, and working with data.

Note that in MySQL 5.7.8 and in MariaDB, starting with version 5.3.3, data type extensions have been added to support the OpenGIS geographic information standard. This version of MySQL and subsequent versions of MariaDB offer data type storage similar to that provided by Postgres geodata. However, in MySQL and MariaDB, data values ​​must first be converted to geometric format with simple commands before being inserted into a table. Firebird does not currently support geometric data types.

JSON support

JSON support in PostgreSQL allows you to move to storing schema-less data in an SQL database. This can be useful when the data structure requires a certain amount of flexibility: for example, if during the development process the structure is still changing or it is not known which fields will contain the data object.

The JSON data type provides validation JSON, which allows the use of specialized JSON operators and functions built into Postgres to execute queries and manipulate data. The JSONB type is also available - a binary version of the JSON format, in which spaces are removed, the sorting of objects is not preserved, instead they are stored in the most optimal way, and only the last value for duplicate keys is stored. JSONB is usually the preferred format, because it requires less space for objects, can be indexed and processed faster, as it does not require re-parsing.

Support for embedded JSON objects has been added to MySQL 5.7.8 and MariaDB 10.0.1. But, although there are many functions and operators for JSON that are now available in these databases, they are not indexed like JSONB in ​​PostgreSQL. Firebird has not yet joined the trend and only supports text JSON objects.

Creating a new type

If it so suddenly happens that there is not enough of an extensive list of Postgres data types, you can use the CREATE TYPE command to create new data types, such as composite, enumerated, range, and base. Consider an example of creating and sending requests of a new composite type:

  --     "wine" CREATE TYPE wine AS ( wine_vineyard varchar(50), wine_type varchar(50), wine_year int ); --  ,     "wine" CREATE TABLE pairings ( menu_entree varchar(50), wine_pairing wine ); --        ROW INSERT INTO pairings VALUES ('Lobster Tail',ROW('Stag''s Leap','Chardonnay', 2012)), ('Elk Medallions',ROW('Rombauer','Cabernet Sauvignon',2012)); /*        ( ,        ) */ SELECT (wine_pairing).wine_vineyard, (wine_pairing).wine_type FROM pairings WHERE menu_entree = 'Elk Medallions'; 

Since they are not object-relational, MySQL, MariaDB and Firebird do not provide such powerful functionality.

Data size


PostgreSQL can handle a lot of data. Current published restrictions are listed below:

Maximum database sizeIs not limited
Maximum table size32 TB
Maximum row size1.6 TB
Maximum field size1 GB
Maximum number of rows in the tableNot limited
Maximum number of columns in a table250-1600 depending on the type of column
Maximum number of indexes in a tableNot limited

In Compose [app. Lane: the organization in which the author of the original article works] we automatically scale your installation so that you don’t have to worry about increasing the amount of data. But, as any database administrator knows, it is worth being wary of too large and unlimited possibilities. We advise you to use common sense when creating tables and adding indexes.

By comparison, MySQL and MariaDB are notorious for limiting the size of 65,535 byte rows. Firebird also offers only 64KB as the maximum line size. Typically, the amount of data is limited to the maximum file size of the operating system. Since PostgreSQL can store table data in many smaller files, it can bypass this limitation. But it is worth noting that too many files can negatively affect performance. MySQL and MariaDB support a larger number of columns in the table (up to 4.096 depending on the type of data) and larger individual table sizes than PostgreSQL, but the need to exceed the existing Postgrese constraints occurs only in extremely rare cases.

Data integrity


Postgres seeks to conform to the ANSI-SQL: 2008 standard, meets the requirements of ACID (atomicity, consistency, isolation and reliability) and is known for its referential and transactional integrity. Primary keys, limiting and cascading foreign keys, unique constraints, NOT NULL constraints, check constraints, and other data integrity functions ensure that only correct data is stored.

MySQL and MariaDB are working more to conform to the SQL standard with the InnoDB / XtraDB table engines. They now offer the STRICT option using SQL modes, which sets validation checks on the data used. Despite this, depending on which mode you use, invalid or even truncated data without your knowledge may be inserted or created during the update. None of these databases currently support CHECK restrictions. In addition, they have many peculiarities regarding the referential integrity constraints on foreign keys. In addition to the above, data integrity can be significantly affected depending on the storage engine chosen. MySQL (and fork MariaDB) makes no secret of the fact that they have traded integrity and compliance with standards for speed and efficiency.

Summing up


Postgres has many possibilities. Created using the object-relational model, it supports complex structures and a wide range of built-in and user-defined data types. It provides enhanced data capacity and has earned trust with respect for data integrity. You may not need all the advanced data storage functions that we explored in this article, but since needs can grow quickly, there is an undoubted advantage to have it all at your fingertips.

If it seems to you that PostgreSQL does not meet your needs, or you prefer to shoot from the hip, then you should pay attention to the NoSQL databases that we offer in Compose, or to think about other SQL databases that we mentioned. Each of them has its advantages. Compose is firmly convinced that it is very important to choose the right database for a specific task ... sometimes it means that you need to select several databases!

Want more Postgres? In the second part of this series, we will look at data manipulations and searches in PostgreSQL, including virtual table functions, query capabilities, indexing and language extensions.

Source: https://habr.com/ru/post/282764/


All Articles