
Embracing SQL in Postgres

One thing that makes me look askance at ORMs is how hard they try to hide and abstract away all the power and expressiveness of SQL. Before I write further, let me say that Frans Bouma reminded me yesterday that there is a difference between ORMs and the people who use them. ORMs are just tools, and I agree with that; I also agree that fast food doesn't make people fat by itself, it's the people who eat too much of it.

Instead of scolding ORMs, I would like to tell you why I don't put an OO abstraction on top of my database. In short: because SQL can help you a great deal in expressing the value of your application in terms of its data. The only way to know how your application is doing is to know the data it generates.

Try to devote some time to studying it, and you will see how your favorite database engine builds on the SQL standard. Let's look at some examples. All the functionality I use is described in the documentation, where there are many more things to discover; my examples below cover only part of what I use.

Postgres Built-in Fun


Right from the start, there is a lot of syntactic sugar in Postgres, and it's really fun to work with. SQL is an ANSI-standardized language, which means you can count on certain rules when moving from one system to another. Postgres follows the standard almost to the letter, but goes beyond it with some very fun additions.

Regular expressions


At some point you may need to query based on some kind of pattern matching. Many databases, including SQL Server (sorry for the link to MSDN), let you use regex patterns through functions or other constructs. In Postgres it is a pleasure to work with. The easy way (using psql against my old Tekpub database):

select sku, title from products where title ~* 'master';

    sku     |              title
------------+---------------------------------
 aspnet4    | Mastering ASP.NET 4.0
 wp7        | Mastering Windows Phone 7
 hg         | Mastering Mercurial
 linq       | Mastering Linq
 git        | Mastering Git
 ef         | Mastering Entity Framework 4.0
 ag         | Mastering Silverlight 4.0
 jquery     | Mastering jQuery
 csharp4    | Mastering C# 4.0 with Jon Skeet
 nhibernate | Mastering NHibernate 2
(10 rows)

The ~* operator says that a POSIX regular expression pattern follows (case-insensitive).
You can make it case-sensitive by omitting the *.
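For instance, a sketch against the same products table (my example, not from the original article), showing the two operators side by side:

```sql
-- case-sensitive: only matches titles containing a capital-M "Master"
select sku, title from products where title ~ 'Master';

-- case-insensitive: matches 'master', 'Master', 'MASTER', ...
select sku, title from products where title ~* 'master';
```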
Regular expressions can be painful to work with, but you can improve this query using the full-text search functions, which can take advantage of indexes:

select products.sku, products.title
from products
where to_tsvector(title) @@ to_tsquery('Mastering');

    sku     |              title
------------+---------------------------------
 aspnet4    | Mastering ASP.NET 4.0
 wp7        | Mastering Windows Phone 7
 hg         | Mastering Mercurial
 linq       | Mastering Linq
 git        | Mastering Git
 ef         | Mastering Entity Framework 4.0
 ag         | Mastering Silverlight 4.0
 jquery     | Mastering jQuery
 csharp4    | Mastering C# 4.0 with Jon Skeet
 nhibernate | Mastering NHibernate 2
(10 rows)

This is a little more involved. Postgres has a field type for full-text search called tsvector. You can even have a column of this type in your table if you want, which is great, since the search data isn't hidden away in some binary index somewhere. Here I converted the title on the fly to a tsvector using the to_tsvector function, which tokenizes and prepares the string for searching. I then query it with the to_tsquery function, building a query from the term "Mastering". The @@ bit simply says to return true if the tsvector field matches the tsquery. The syntax is a little hard on the eyes, but it works very well and very fast. You can use the concat function to pull in additional fields:

select products.sku, products.title
from products
where to_tsvector(concat(title, ' ', description)) @@ to_tsquery('Mastering');

    sku     |              title
------------+---------------------------------
 aspnet4    | Mastering ASP.NET 4.0
 wp7        | Mastering Windows Phone 7
 hg         | Mastering Mercurial
 linq       | Mastering Linq
 git        | Mastering Git
 ef         | Mastering Entity Framework 4.0
 ag         | Mastering Silverlight 4.0
 jquery     | Mastering jQuery
 csharp4    | Mastering C# 4.0 with Jon Skeet
 nhibernate | Mastering NHibernate 2
(10 rows)

Combining the title and description into one field lets you search both at the same time, using all the features of full-text search.
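To actually get the index-backed speed mentioned above, you can index the tsvector expression itself. A sketch (the index name is my own; note that the query must use the same expression, including the text-search configuration, for the planner to pick up the index):

```sql
-- a functional GIN index over the combined title + description
create index idx_products_fts on products
  using gin (to_tsvector('english', concat(title, ' ', description)));

-- queries that match the indexed expression can use it
select sku, title
from products
where to_tsvector('english', concat(title, ' ', description)) @@ to_tsquery('Mastering');
```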

Generating Series


There is a handy generate_series function that outputs a sequence you can use in queries for various purposes:

select * from generate_series(1,10);

 generate_series
-----------------
               1
               2
               3
               4
               5
               6
               7
               8
               9
              10

If a sequential ordering doesn't suit you, you can mix in other functions, such as random():

select * from generate_series(1,10,2) order by random();

 generate_series
-----------------
               3
               5
               7
               1
               9
(5 rows)

Here I added an extra argument that tells the series to step by 2. It also works with dates:

select * from generate_series(
  '2014-01-01'::timestamp,
  '2014-12-01'::timestamp,
  '42 days');

   generate_series
---------------------
 2014-01-01 00:00:00
 2014-02-12 00:00:00
 2014-03-26 00:00:00
 2014-05-07 00:00:00
 2014-06-18 00:00:00
 2014-07-30 00:00:00
 2014-09-10 00:00:00
 2014-10-22 00:00:00
(8 rows)

That gives me dates through 2014 at 42-day intervals. You can run it in the opposite direction simply by using a negative interval. Why is this useful? You can alias the function and join against the generated series however you need:

select x as first_of_the_month
from generate_series('2014-01-01'::timestamp,
                     '2014-12-01'::timestamp,
                     '1 month') as f(x);

 first_of_the_month
---------------------
 2014-01-01 00:00:00
 2014-02-01 00:00:00
 2014-03-01 00:00:00
 2014-04-01 00:00:00
 2014-05-01 00:00:00
 2014-06-01 00:00:00
 2014-07-01 00:00:00
 2014-08-01 00:00:00
 2014-09-01 00:00:00
 2014-10-01 00:00:00
 2014-11-01 00:00:00
 2014-12-01 00:00:00
(12 rows)

Aliasing the function as f(x) lets you reference the result column in the rest of the SQL statement.
Things like this are useful for analytics and for sanity-checking your data. Also note the '1 month' specification: the Postgres interval is something you'll use often in queries.
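As mentioned above, the series can also run backwards. A sketch with a negative interval (my example, not from the original):

```sql
-- from December 2014 back toward January, stepping -1 month
select * from generate_series(
  '2014-12-01'::timestamp,
  '2014-01-01'::timestamp,
  '-1 month');
```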

Working with dates


Intervals combine nicely with date handling in Postgres. For example, if you want to know the date one week from today:

select '1 week' + now() as a_week_from_now;

        a_week_from_now
-------------------------------
 2015-03-03 10:08:12.156656+01
(1 row)

Postgres sees now() as a timestamp, and the + operator tells it to cast the string '1 week' to an interval. The result, 2015-03-03 10:08:12.156656+01, is interesting:
it shows the current date and time down to the microsecond, plus the time zone offset (+01, which is where I am now, in Italy). If you have ever struggled with dates and UTC, you know what a pain this is. Postgres has a built-in timestamptz data type (a timestamp with time zone), and conversions happen automatically when dates are calculated.
For example, I can ask Postgres what time it is in California:

SELECT now() AT TIME ZONE 'PDT' as cali_time;

         cali_time
----------------------------
 2015-02-24 02:16:57.884518
(1 row)

It's around 2 AM there, so better not to call Jon Galloway to tell him his SQL Server is on fire. Subtracting one timestamp from another returns the difference as an interval; let's see how many hours are between me and Jon:

select now() - now() at time zone 'PDT' as cali_diff;

 cali_diff
-----------
 08:00:00
(1 row)

Note that it returns a value of 8 hours as an interval, not an integer. Why does this matter? Time is a relative thing, and it is very important to know your server's time zone when doing time-based calculations. For example, in my Tekpub database I recorded when orders were placed. If 20 orders came in around the end of the year, my accountant wanted to know which orders came before or after January 1, 2013. My server is in New York. My company is registered in Hawaii.
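A sketch of how that accountant question could be answered (the invoices table and its columns here are my assumption, not from the article): convert the stored timestamp into the company's time zone before comparing against the year boundary:

```sql
-- orders placed before midnight on Jan 1, 2013, Hawaii time,
-- regardless of the server's own time zone
select id, created_at
from invoices
where created_at at time zone 'Pacific/Honolulu' < '2013-01-01';
```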

These are the important things Postgres handles well: intervals, time zones, and its many other date functions make working with dates genuinely enjoyable.

Aggregation


Aggregation and rollups in Postgres can feel tedious because it is very, very standards-compliant: you can be sure that every non-aggregated column in your SELECT has to appear in your GROUP BY. If you want to see sales for a given month, grouped by sku, you need a query like this:

select sku, sum(price), date_part('month', created_at)
from invoice_items
group by sku, date_part('month', created_at)
having date_part('month', created_at) = 9;

This syntax is a bit heavy. A better option is a great Postgres feature: window functions:

select distinct sku,
  sum(price) OVER (PARTITION BY sku)
from invoice_items
where date_part('month', created_at) = 9;

The same data with far less ceremony (window functions are also available in SQL Server).
Here I'm running a calculation over the result set, saying that I want SUM computed over the partition of rows sharing this row's sku. If I hadn't specified DISTINCT, the query would have returned a row for every sale, just as a plain SELECT would.
Window functions really shine when combined with aggregate functions:

select distinct sku,
  sum(price) OVER (PARTITION BY sku) as revenue,
  count(1) OVER (PARTITION BY sku) as sales_count
from invoice_items
where date_part('month', created_at) = 9;

This gives me, per sku, the month's revenue and sales count. I can also show the total sales for the month in another column:

select distinct sku,
  sum(price) OVER (PARTITION BY sku) as revenue,
  count(1) OVER (PARTITION BY sku) as sales_count,
  sum(price) OVER (PARTITION BY 0) as sales_total
from invoice_items
where date_part('month', created_at) = 9;

I use PARTITION BY 0 to say that the partition is the entire result set, so it outputs the total of all sales in September. And by wrapping this in a CTE (a Common Table Expression), I can run some interesting calculations:

with september_sales as (
  select distinct sku,
    sum(price) OVER (PARTITION BY sku) as revenue,
    count(1) OVER (PARTITION BY sku) as sales_count,
    sum(price) OVER (PARTITION BY 0) as sales_total
  from invoice_items
  where date_part('month', created_at) = 9
)
select sku,
  revenue::money,
  sales_count,
  sales_total::money,
  trunc((revenue/sales_total * 100), 4) as percentage
from september_sales;

In the final select I cast the revenue and sales_total fields to the money type, which formats them nicely with a currency symbol. This is a fairly comprehensive sales query: I get each sku's revenue, its sales count, and its percentage of the month's sales, all in fairly simple SQL. I use trunc to cut the percentage to 4 decimal places, since the raw result can run quite long.

Strings


I showed you some regex tricks above, but there is much more you can do with strings in Postgres. Consider a query I use quite often:

select products.sku,
  products.title,
  downloads.list_order,
  downloads.title as episode
from products
inner join downloads on downloads.product_id = products.id
order by products.sku, downloads.list_order;

This query fetches all my videos together with their individual episodes (called downloads). I could use it on pages shown to the user, but what if I want to summarize the episodes per product? I can use aggregate functions for that. The simplest example is a comma-separated string of titles:

select products.sku,
  products.title,
  string_agg(downloads.title, ', ') as downloads
from products
inner join downloads on downloads.product_id = products.id
group by products.sku, products.title
order by products.sku;
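One nicety worth knowing (my addition, not from the original article): string_agg accepts an ORDER BY inside the aggregate, so the episode list can come out in playing order rather than in arbitrary row order:

```sql
select products.sku,
  products.title,
  string_agg(downloads.title, ', ' order by downloads.list_order) as downloads
from products
inner join downloads on downloads.product_id = products.id
group by products.sku, products.title
order by products.sku;
```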

string_agg works like String.Join() in your favorite language. But we can do better by combining fields with concat and aggregating into an array:

select products.sku,
  products.title,
  array_agg(concat(downloads.list_order, ') ', downloads.title)) as downloads
from products
inner join downloads on downloads.product_id = products.id
group by products.sku, products.title
order by products.sku;

Here I use array_agg to pull the list_order and title of each download and collect them into an array per product, using the concat function to prefix each title with its list_order.

If you're consuming this from Node.js, you can iterate over the resulting array directly.
Also with Node in mind, you can have Postgres emit JSON:

select products.sku,
  products.title,
  json_agg(downloads) as downloads
from products
inner join downloads on downloads.product_id = products.id
group by products.sku, products.title
order by products.sku;

This returns each product's related downloads (i.e. its child records) as a JSON array that is easy to work with on the client.

Conclusion


If you don't know SQL very well, especially the way your favorite DBMS implements it, take this week to get to know it better. It is a very powerful tool for the thing that makes your application work: your data.

Source: https://habr.com/ru/post/258153/

