📜 ⬆️ ⬇️

Three interesting problems on SQL knowledge - Solutions

In my recent topic, I gave three interesting MySQL tasks in my opinion with a desire to look at possible solutions with habrami. As I promised, I give my solutions to these problems. In order for the text to be interesting and informative, I decided to thoroughly chew on what and why. So…



First, go through the solutions that were proposed in the comments.
')
1 task.

There was nothing complicated here, the moreover, there were few right decisions. The most common mistake is that few people thought of considering the option when the category could not be published. They came across with incorrect grouping or just gross syntax errors in the code. There were solutions, after watching EXPLAIN which arose suicidal thoughts. :-)

But after correcting mistakes, many worked correctly, except for some "minor" sins.

2 task.

In fact, it turned out easier than I thought - but this is more my fault, because the condition was not sufficiently accurate, and no one could figure it out. In the end, it turned into a trick to which no one paid attention. The point is that the ordi field is responsible for sorting, but it was not clarified that it is unique (although there was no reverse either, and this is my fault, yes). This field is indeed sorted, but for each affiliation there are different meanings, that is, if in one photo album there is 1,2,3, then in the other not 4,5,6 but also 1,2,3. Because of this, half the decisions are both right and not at the same time.

Of the errors, almost nowhere are there checks for the occurrence flags above. Apart from this, almost everyone worked.

If we take into account the conditions I have cited, then this task is quite complicated, namely, the request is very cumbersome. Next stant is clear why.

3 task.

Solutions, as expected, were few. Only one thing worked as it should - the proposed pharod . Because the task is really very difficult, and here it’s not the bulkiness of requests as in the second, but directly in the solution itself.

Now solutions.


To begin with, we will build a template query that will correctly take into account all the above conditions and select everything at all. I always do this if the same data in the tables must be shown, but depending on certain external conditions. Now it does not matter, I just want to clarify how I solve this step by step.

We link all three tables in a single query by setting join conditions using JOIN ... ON (...). It turns out about the following:

SELECT
*
FROM photo_category as c
JOIN photo_gallery as g ON (g.c_id=c.id AND c.is_published AND g.is_published)
JOIN photo_image as i ON (i.g_id = g.id AND i.is_published)
GROUP BY g.id
HAVING COUNT (i.id) > 0
ORDER BY c.ordi, g.ordi, i.ordi;



I am writing the conditions of joining in ON (...) so that it would be immediately clear what belongs to what and not to overload WHERE (in fact, one hell where they are written). Grouping is needed to filter empty galleries and categories (for them the HAVING COUNT (i.id) condition will be NULL). Sort by fields present.

All initial conditions are fulfilled in this request, although this is not a solution at all. We have made the preparation. And now let's go.

Task one.


We take the original request, which already takes into account everything. Add a condition of sampling by category ID and another JOIN table, taking into account the is_main_foto flag. But since this may not be the case then we will make it LEFT JOIN:

SELECT
*
FROM photo_category as c
JOIN photo_gallery as g ON (g.c_id=c.id AND c.is_published AND g.is_published)
JOIN photo_image as i ON (i.g_id = g.id AND i.is_published)
LEFT JOIN photo_image as i2 ON (i2.g_id = g.id AND i2.is_main_foto)
WHERE c.id = 1
GROUP BY g.id
HAVING COUNT (i.id) > 0
ORDER BY c.ordi, g.ordi, i.ordi;



Now if the value of im2.id IS NULL, then in i.id absolutely any arbitrary photo will lie. By the way, a normal DBMS in this case should swear because it will not be clear what value to take in i . Now, for complete happiness, we write the answer using IF:

SELECT
c.id as cid,
c.title as ctitle,
g.id as gid,
g.title as gtitle,
IF (i2.id IS NULL , i.id, i2.id) as image_id
FROM photo_category as c
JOIN photo_gallery as g ON (g.c_id=c.id AND c.is_published AND g.is_published)
JOIN photo_image as i ON (i.g_id = g.id AND i.is_published)
LEFT JOIN photo_image as i2 ON (i2.g_id = g.id AND i2.is_main_foto)
WHERE c.id = 1
GROUP BY g.id
HAVING COUNT (i.id) > 0
ORDER BY c.ordi, g.ordi, i.ordi;



For those who do not know how IF works, I will explain - IF (<condition>, <expression value if true>, <expression value if false>) . Trenarny operator practically.

Everything, the problem is solved.

Task two.

The solution is not very difficult: you need to get the ORDI of the current element, then make MAX for satisfying the condition of the sample under the condition that the received ordi is less and thus get the previous one, then similarly - the next one. If you do not go into details, then everything is quite simple, but if you go into it, you get quite complex queries.

First we get the ordi (sort field) of the current element - it's very simple:

SELECT ordi FROM photo_image WHERE id = 1


Next, we will take into account what to look for if you enter only into the current gallery, that is, you must determine the id of the gallery (by condition it can be obtained from the outside, but for the sake of completeness, we calculate it ourselves):

SELECT g_id FROM photo_image WHERE id = 1


Now we will build a request for a previous ORDI (we are looking only for published ones):

SELECT MAX (ordi) from photo_image
WHERE is_published AND g_id = ( SELECT g_id FROM photo_image WHERE id = 1)
AND ordi < ( SELECT ordi FROM photo_image WHERE id = 1)



Similarly for the following:

SELECT MIN (ordi) from photo_image
WHERE is_published AND g_id = ( SELECT g_id FROM photo_image WHERE id = 1)
AND ordi > ( SELECT ordi FROM photo_image WHERE id = 1)



Now we know the ORDI of the following and previous ones, we need to get the ID of the corresponding elements, again, provided that we are only in the current gallery:

SELECT id, title
FROM photo_image
WHERE
g_id = ( SELECT g_id FROM photo_image WHERE id = 1)

AND
(
ordi =
(
SELECT MAX (ordi) from photo_image
WHERE is_published AND g_id = ( SELECT g_id FROM photo_image WHERE id = 1)
AND ordi < ( SELECT ordi FROM photo_image WHERE id = 1)
)

OR
ordi =

(
SELECT MIN (ordi) from photo_image
WHERE is_published AND g_id = ( SELECT g_id FROM photo_image WHERE id = 1)
AND ordi > ( SELECT ordi FROM photo_image WHERE id = 1)
)
);



It seems to be everything, but we still have no condition checks. And it is also not clear how to separate the following from the previous one if we have one result in the sample. To solve this, we will add our photo to the sample (for which we calculate) and also use our “template” query, and in the end we will get:

SELECT ordi, id, title
FROM photo_image
WHERE
EXISTS
(
SELECT
i.id
FROM photo_category as c
JOIN photo_gallery as g ON (g.c_id=c.id AND c.is_published AND g.is_published)
JOIN photo_image as i ON (i.g_id = g.id AND i.is_published)
WHERE i.id = 1
)

AND
g_id = ( SELECT g_id FROM photo_image WHERE id = 1)

AND
(
ordi =
(
SELECT MAX (ordi) from photo_image
WHERE is_published AND g_id = ( SELECT g_id FROM photo_image WHERE id = 1)
AND ordi < ( SELECT ordi FROM photo_image WHERE id = 1)
)

OR
ordi =

(
SELECT MIN (ordi) from photo_image
WHERE is_published AND g_id = ( SELECT g_id FROM photo_image WHERE id = 1)
AND ordi > ( SELECT ordi FROM photo_image WHERE id = 1)
)

OR
id = 1
);



Condition

EXISTS
(
SELECT
i.id
FROM photo_category as c
JOIN photo_gallery as g ON (g.c_id=c.id AND c.is_published AND g.is_published)
JOIN photo_image as i ON (i.g_id = g.id AND i.is_published)
WHERE i.id = 1
)



will remove everything from the selection if the current photo, gallery or category is not published. There is no point in expressing HAVING and grouping here, since if there are no pictures in the gallery, then the selection will still be empty. The result will be three, two or one row. Accordingly, the desired photo will be the one that has the given ID and the previous one is determined based on the ordi - it is less or more than the one with the given ID.

It seems that there are a lot of subqueries here and it will work slowly. Actually, no - MySQL caches repeated zaryas (you can do SQL_CACHE for greater certainty), so each of the subqueries will be executed only 1 time. Under the condition of the existence of indices (well, where without them) it will work quickly enough.

Problem solved.

Note: I said that I do not know how to solve this problem in one request. This is so, because in reality I still needed to get a large pile of parameters for each photo, and if I did this by a request, it becomes just huge and really slows down. Here is tezhe 2 requests (on the following and previous) truth with the general filter.

The third task.

The ambush is that you actually need to do LIMIT for a specific group. You cannot do this as a normal limit, since it limits the result of the entire request and only that. Combine UNION requests for each category using LIMIT ... - well, the solution is of course, but if there are a lot of categories? Not okay. The next thing that comes to mind is filtering with WHERE. But then you need some kind of indication of what will be filtering, that is, you need to distinguish in the WHERE clause the rows that are more than our N. Only by giving your serial number to a series of albums for each category. How to make it? Only with local variables. I’m a little less than completely sure that this task and others like it are not solved in any other way.

If we take the original query and modify it a bit, like this:

SELECT
@a:=@a+1, *
FROM photo_category as c
JOIN photo_gallery as g ON (g.c_id=c.id AND c.is_published AND g.is_published)
JOIN photo_image as i ON (i.g_id = g.id AND i.is_published)
GROUP BY g.id
HAVING COUNT (i.id) > 0
ORDER BY c.ordi, g.ordi, i.ordi;



That will get the numbered rows. For each row, the value of @a will increase by one. Fine. But we need to count all of a certain group, so, provided that c_id changes, you need to reset @a. Let us introduce one more variable for this, and a reset condition (at the same time, instead of the asterisk, we add the fields we need):

SELECT
@a:=@a+1,
IF (@cid=cid, @a:=@a+1, (@cid:=cid) AND (@a:=1)),
c.id as cid, c.title as ctitle,
g.id as gid, g.title as gtitle,
FROM photo_category as c
JOIN photo_gallery as g ON (g.c_id=c.id AND c.is_published AND g.is_published)
JOIN photo_image as i ON (i.g_id = g.id AND i.is_published)
GROUP BY g.id
HAVING COUNT (i.id) > 0
ORDER BY c.ordi, g.ordi, i.ordi;



What does it mean. If the value of the variable cid is equal to the current id value of the category, then the variable @ a is incremented by 1, after which the result is equal to @a. Otherwise, the value of cid becomes equal to the current id of the category, @a becomes equal to 1 and the result of the expression becomes = 1 (that is, the same @a, although in reality it is just a good match). Now it would seem possible to just filter by the condition WHERE @a <N but ... this will not work.

And that's why. This scheme will be executed correctly only if the rows are already sorted. But sorting is done after the query has been processed, and not during. It turns out that you need to first sort, and only then numbered.

Now I also draw attention to the fact that, as I have already said, by a happy coincidence, the expression IF ( cid = cid, @a: = @ a + 1, ( cid : = cid) AND (@a: = 1)) will be equal @a in both cases. Then it can be used by writing it directly in WHERE. Let me remind you that by the condition it was necessary to show the picture, but this does not cause any problems, since we first execute the query and then we will only perform counting operations with filters. Then this is what we get:

SELECT q.* FROM
(
SELECT --
c.id as cid, c.title as ctitle,
g.id as gid, g.title as gtitle,
IF (i2.title IS NULL , i.title, i2.title)
FROM photo_category as c
LEFT JOIN photo_gallery as g ON (g.c_id=c.id AND c.is_published AND g.is_published)
LEFT JOIN photo_image as i ON (i.g_id = g.id AND i.is_published)
LEFT JOIN photo_image as i2 ON (i2.g_id = g.id AND i2.is_main_foto)
GROUP BY g.id
HAVING COUNT (i.id) > 0
ORDER BY c.ordi, g.ordi DESC
) as q
WHERE IF (@cid=q.cid, @a:=@a+1, (@cid:=q.cid) AND (@a:=1)) <= N;



This will already work properly. Problem solved. By the way, I'd note from myself - that local variables are a very strong thing with skillful use of them. And yet - this request can be called a universal solution for many typical tasks of this kind: the last 5 added or best or purchased goods per category, 5 best articles for each user, 10 most bought books of each genre, and so on. Because it doesn't matter what the subquery will be.

Well, in general, that's all. Thanks for attention. I hope you were interested.

PS: about the title of the gallery in the condition of the problem - well, of course it was a typo :-) Thanks to those who reported this, although I could not fix it on time.

PS2:
* This source code was highlighted with Source Code Highlighter .

Source: https://habr.com/ru/post/70479/


All Articles