Instead of an introduction
I was sorting through my old notes, so to speak, and came across this one. I thought that I still have no invites for Habr anyway, and decided to publish it. In this article I will explain how to understand Dijkstra's algorithm for finding the shortest paths from a given vertex in a graph, and I will arrive at it naturally, starting from breadth-first graph traversal.
In the comments I was asked to say more about the data structure hidden behind priority_queue in the C++ STL. A brief description and an implementation of it are given at the end of the article.
Breadth-first search
The first algorithm I would like to describe, and which definitely cannot be skipped, is breadth-first search. What is it? Let's step away from the formal description of graphs for a moment and picture the following. Lay ropes of equal length, soaked in something flammable, on the ground so that none of them cross, but some of them touch each other at their ends. Now set fire to one of the ends. How will the fire behave? It will spread evenly along the ropes to the neighboring junctions, until everything is burning. It is not hard to generalize this picture to three-dimensional space. This is what breadth-first traversal of a graph looks like in real life. Now let's describe it more formally. Suppose we start a breadth-first traversal from some vertex V. At the next moment in time we examine the neighbors of V (a neighbor of V is a vertex that shares an edge with V), then the neighbors of those neighbors, and so on, until every vertex of the graph has been visited.
Implementing breadth-first search
Suppose we are at some vertex during the traversal. We maintain a queue, to which we add all of that vertex's neighbors, excluding the ones we have already visited.
void bfs(int u) {
    used[u] = true;
    queue<int> q;
    q.push(u);
    while (!q.empty()) {
        int u = q.front();
        q.pop();
        for (int i = 0; i < (int) g[u].size(); i++) {
            int v = g[u][i];
            if (!used[v]) {
                used[v] = true;
                q.push(v);
            }
        }
    }
}
In this implementation g is the adjacency list, i.e. g[u] contains the list of vertices adjacent to u (std::vector is used as the list), and used is an array that tells us which vertices we have already visited. Here the breadth-first search does nothing besides the traversal itself. What is the point, you might ask? However, it is easy to modify it to find whatever we need, for example the distance and the path from a given vertex to all the others. Note that the edges have no weight, i.e. the graph is unweighted. Here is the implementation that also computes distances and paths.
void bfs(int u) {
    used[u] = true;
    p[u] = u;
    dist[u] = 0;
    queue<int> q;
    q.push(u);
    while (!q.empty()) {
        int u = q.front();
        q.pop();
        for (int i = 0; i < (int) g[u].size(); i++) {
            int v = g[u][i];
            if (!used[v]) {
                used[v] = true;
                p[v] = u;
                dist[v] = dist[u] + 1;
                q.push(v);
            }
        }
    }
}
Here p is the array of parents, i.e. p[v] holds the parent of vertex v, and dist[v] is the distance from the starting vertex to vertex v. How do we restore the path? It is quite easy: just walk up the parent array from the vertex we are interested in. For example, recursively:
void print_way(int u) {
    if (p[u] != u) {
        print_way(p[u]);
    }
    cout << u << ' ';
}
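To see how the pieces fit together, here is a minimal, self-contained sketch. The array size LIM and the sample graph are my own assumptions; bfs and print_way are the functions defined above.

#include <iostream>
#include <vector>
#include <queue>
using namespace std;

const int LIM = 100;      // assumed upper bound on the number of vertices
vector<int> g[LIM];       // adjacency lists
bool used[LIM];
int p[LIM], dist[LIM];

void bfs(int u) {         // the distance-and-parent version from above
    used[u] = true;
    p[u] = u;
    dist[u] = 0;
    queue<int> q;
    q.push(u);
    while (!q.empty()) {
        int u = q.front();
        q.pop();
        for (int i = 0; i < (int) g[u].size(); i++) {
            int v = g[u][i];
            if (!used[v]) {
                used[v] = true;
                p[v] = u;
                dist[v] = dist[u] + 1;
                q.push(v);
            }
        }
    }
}

void print_way(int u) {
    if (p[u] != u) {
        print_way(p[u]);
    }
    cout << u << ' ';
}

int main() {
    // a small undirected graph: edges 0-1, 1-2, 0-3
    int edges[3][2] = {{0, 1}, {1, 2}, {0, 3}};
    for (int i = 0; i < 3; i++) {
        g[edges[i][0]].push_back(edges[i][1]);
        g[edges[i][1]].push_back(edges[i][0]);
    }
    bfs(0);
    cout << dist[2] << endl; // prints 2
    print_way(2);            // prints: 0 1 2
    return 0;
}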
Shortest paths
All further reasoning holds only if the edge weights are non-negative.
We now turn to weighted graphs, i.e. graphs whose edges have weights: for example, the length of an edge, a toll for crossing it, or the time needed to traverse it. The task is to find the shortest path from one vertex to another. In this article I will start from breadth-first search; I do not remember seeing this approach anywhere else, maybe I just missed it. So let's look again at the implementation of breadth-first search, specifically at the condition for adding a vertex to the queue. In BFS we only add the vertices we have not visited yet. Now we change this condition: we add the vertices whose distance can be decreased. Obviously, the queue becomes empty if and only if there is no vertex left whose distance can be decreased. The process of decreasing the distance to a vertex V will be called relaxing the vertex V.
Note that initially the distance to every vertex is infinity (for infinity we take some sufficiently large value, namely one that no path can ever exceed). This algorithm follows directly from breadth-first search, and it is exactly what I came up with when I solved my first shortest-path problem. It is worth mentioning that this method finds the shortest paths from the starting vertex to all the others. Here is the implementation:
const int INF = 1e+9;

vector< pair<int, int> > g[LIM]; // g[u] holds pairs (edge length, neighbouring vertex)
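The relaxation loop itself, as a sketch: it is identical to the improved version given a bit further down, except for the in-queue flag; the arrays dist and p and the vertex count n are assumed to be declared as in the previous sections.

void shortcut(int u) {
    fill(dist, dist + n, INF);
    dist[u] = 0;
    p[u] = u;
    queue<int> q;
    q.push(u);
    while (!q.empty()) {
        int u = q.front();
        q.pop();
        for (int i = 0; i < (int) g[u].size(); i++) {
            int v = g[u][i].second, len = g[u][i].first;
            // relax: if going through u shortens the path to v, update it and requeue v
            if (dist[v] > dist[u] + len) {
                p[v] = u;
                dist[v] = dist[u] + len;
                q.push(v);
            }
        }
    }
}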
Why do we store such pairs in the adjacency list? It will become clear a little later, when we improve this algorithm; frankly, for this implementation the pairs could just as well be stored the other way around. The insertion into the queue can be slightly improved: introduce a bool array in which we mark whether a vertex that still needs to be relaxed is already in the queue.
const int INF = 1e+9;

vector< pair<int, int> > g[LIM];
bool inque[LIM];

void shortcut(int u) {
    fill(dist, dist + n, INF);
    dist[u] = 0;
    p[u] = u;
    queue<int> q;
    q.push(u);
    inque[u] = true;
    while (!q.empty()) {
        int u = q.front();
        q.pop();
        inque[u] = false;
        for (int i = 0; i < (int) g[u].size(); i++) {
            int v = g[u][i].second, len = g[u][i].first;
            if (dist[v] > dist[u] + len) {
                p[v] = u;
                dist[v] = dist[u] + len;
                if (!inque[v]) {
                    q.push(v);
                    inque[v] = true;
                }
            }
        }
    }
}
If you look closely, this algorithm is very similar to Levit's algorithm, but it is not quite the same, although in the worst case it has the same asymptotic complexity. Let's estimate it roughly: in the worst case we have to relax every time we traverse any edge, which gives O(n * m) in total. The estimate is rather rough, but at this stage it is quite enough. Note also that this is the worst case; in practice even this implementation works fairly quickly. And now the most interesting part... drum roll... let's improve our algorithm into Dijkstra's algorithm!
Dijkstra's Algorithm
The first optimization that comes to mind: what if we always relax the vertex whose current distance is the smallest? This is exactly the idea that occurred to me one day, but, as it turned out, I was far from the first to have it. It first came to the remarkable scientist Edsger Dijkstra. Moreover, he proved that if we choose the vertex to relax in this way, we perform at most n relaxations! Intuitively it is clear that if the distance to some vertex is currently the minimum, we cannot make it any smaller. A more formal proof can be read here or on Wikipedia.
Now the only thing left is to figure out how to efficiently find the vertex with the minimum distance. For this we use a priority queue (a heap). The STL has a class that implements it, called priority_queue. Here is an implementation which, again, follows directly from the previous algorithm described in this article.
void dijkstra(int u) {
    fill(dist, dist + n, INF);
    dist[u] = 0;
    p[u] = u;
    priority_queue< pair<int, int>, vector< pair<int, int> >, greater< pair<int, int> > > q;
    q.push(make_pair(0, u));
    while (!q.empty()) {
        pair<int, int> u = q.top();
        q.pop();
        if (u.first > dist[u.second]) continue;
        for (int i = 0; i < (int) g[u.second].size(); i++) {
            int v = g[u.second][i].second, len = g[u.second][i].first;
            if (dist[v] > dist[u.second] + len) {
                p[v] = u.second;
                dist[v] = dist[u.second] + len;
                q.push(make_pair(dist[v], v));
            }
        }
    }
}
Most likely it is not entirely clear what is going on here, so let's start with the declaration of the priority queue. The first template argument is the type of the data stored in the queue, namely pairs of the form (distance to a vertex, vertex number); the second argument is the container in which the data is kept; the third argument is the comparator (found, by the way, in the header <functional>). Why do we need a non-default comparator? Because with the standard declaration priority_queue< pair<int, int> > the top of the queue would be the vertex that is farthest away, while we need the opposite. The asymptotic complexity of this solution is O(n * log(n) + m * log(n)).
Indeed, there are only n relaxations, and the vertex with the minimum distance is extracted in O(log(n)) (this is exactly the complexity of the standard STL priority queue); in addition, each of the m edges may push a new element into the queue, also at a cost of O(log(n)). It should also be noted that the same vertex can end up in the queue several times with different distances: for example, we relaxed from vertex A, which has C among its neighbors, and later from vertex B, which also has C among its neighbors. To avoid the problems this causes, we simply skip the vertices popped from the queue whose recorded distance is out of date, i.e. greater than the current shortest one. That is what the line if (u.first > dist[u.second]) continue; is for.
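As a small self-contained illustration of the comparator point above (this snippet is only a sanity check, not part of the algorithm): with greater<> the pair with the smallest first component is always on top.

#include <iostream>
#include <queue>
#include <vector>
#include <functional>
#include <utility>
using namespace std;

int main() {
    priority_queue< pair<int, int>, vector< pair<int, int> >, greater< pair<int, int> > > q;
    q.push(make_pair(5, 1)); // (distance, vertex)
    q.push(make_pair(2, 3));
    q.push(make_pair(7, 2));
    cout << q.top().first << ' ' << q.top().second << endl; // prints "2 3"
    return 0;
}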
Instead of a conclusion
As it turns out, it is actually very easy to write Dijkstra in O(n * log(n) + m * log(n))!
Addendum
In the comments I was asked to explain how to implement priority_queue by hand.
Let us set ourselves the following task (the one that arises in Dijkstra's algorithm). We need to maintain a data structure that meets the following requirements (a sketch of the interface we are after follows the list):
1) Insertion in at most O(log(n)) time.
2) Removal of the minimum element in at most O(log(n)) time.
3) Finding the minimum in O(1) time.
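That interface, matching the class given in full at the end of the article, looks like this (declarations only):

template<typename value_t, typename cmp_t>
class heap_t {
public:
    bool empty() const;              // is the structure empty?
    const value_t& top() const;      // the minimum element, O(1)
    void push(const value_t& value); // insertion, O(log(n))
    void pop();                      // removal of the minimum, O(log(n))
};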
It turns out that these requirements are satisfied by the data structure that is usually called a heap, or a priority queue.
Generally speaking, a heap is a tree. Let's look at the properties of this tree in more detail. Suppose we have already built a heap; then for every vertex of the heap it holds that the element stored at that vertex is no greater than the elements stored in its descendants. Consequently, the minimum element sits at the root of the tree, which lets us find the minimum in O(1).
Now let's define the heap so that the other requirements are met as well. Call the level of a vertex its distance from the root. We forbid starting a new level of the heap until the previous one is completely filled. More formally: if the current maximum height of the tree is H, then it is forbidden to add a vertex at height H + 1 as long as one can still be added at height H. Under this condition the height of the heap is always O(log(n)): a tree of height H built this way contains at least 2^H vertices, so H <= log2(n).
It remains to learn how to add and remove items in the heap.
We will implement the heap as a binary tree in which at most one vertex has exactly one child (every other vertex has either two children or none). The tree is stored in an array: the children of the vertex at position pos lie at positions 2 * pos and 2 * pos + 1, and its parent lies at position pos / 2.
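Written out as tiny helper functions (the names are my own, they do not appear in the class below), the index arithmetic looks like this:

#include <cstddef>

// 1-based layout: index 0 is unused, the root lives at index 1.
std::size_t left_child(std::size_t pos)  { return 2 * pos; }
std::size_t right_child(std::size_t pos) { return 2 * pos + 1; }
std::size_t parent(std::size_t pos)      { return pos / 2; }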
We add a new element to the first vacant place on the current maximum level and then perform an operation called "sifting up": while the parent of the added element is greater than the element itself, we swap the added element with its parent, repeating this, if necessary, all the way up to the root. Let's check that this does not violate the heap property. It is simple: since the current element is smaller than its parent, it is also smaller than all of the parent's descendants.
void push(const value_t& value) {
    data.push_back(value);
    sift_up(data.size() - 1);
}

void sift_up(size_t position) {
    if (position <= 1) {
        return; // reached the root (index 0 holds a dummy element)
    }
    size_t parent_position = position / 2;
    if (!cmp(data[position], data[parent_position])) {
        swap(data[position], data[parent_position]);
        sift_up(parent_position);
    }
}
We delete the minimum element as follows: first swap the root of the tree with the last element on the deepest level, then remove the vertex where that element (the old minimum) now sits. After this, the heap property may be violated, and to restore it we perform an operation called "sifting down". We start from the root and proceed recursively: for the current element we find the minimum among its children, swap it with the current element if it is smaller, and recurse into the vertex the current element was swapped with. Note that after this operation the heap property holds again.
void pop() {
    swap(data[1], data.back());
    data.pop_back();
    sift_down(1);
}

void sift_down(size_t position) {
    size_t cmp_position = position;
    if (2 * position < data.size() && !cmp(data[2 * position], data[cmp_position])) {
        cmp_position = 2 * position;
    }
    if (2 * position + 1 < data.size() && !cmp(data[2 * position + 1], data[cmp_position])) {
        cmp_position = 2 * position + 1;
    }
    if (cmp_position != position) {
        swap(data[cmp_position], data[position]);
        sift_down(cmp_position);
    }
}
Since the height of the tree is O(log(n)), both insertion and removal work in O(log(n)).
The complete code
template<typename value_t, typename cmp_t>
class heap_t {
public:
    heap_t() : data(1) { } // data[0] is a dummy element, real elements start at index 1

    bool empty() const {
        return data.size() == 1;
    }

    const value_t& top() const {
        return data[1];
    }

    void push(const value_t& value) {
        data.push_back(value);
        sift_up(data.size() - 1);
    }

    void pop() {
        swap(data[1], data.back());
        data.pop_back();
        sift_down(1);
    }

private:
    void sift_up(size_t position) {
        if (position <= 1) {
            return; // reached the root
        }
        size_t parent_position = position / 2;
        if (!cmp(data[position], data[parent_position])) {
            swap(data[position], data[parent_position]);
            sift_up(parent_position);
        }
    }

    void sift_down(size_t position) {
        size_t cmp_position = position;
        if (2 * position < data.size() && !cmp(data[2 * position], data[cmp_position])) {
            cmp_position = 2 * position;
        }
        if (2 * position + 1 < data.size() && !cmp(data[2 * position + 1], data[cmp_position])) {
            cmp_position = 2 * position + 1;
        }
        if (cmp_position != position) {
            swap(data[cmp_position], data[position]);
            sift_down(cmp_position);
        }
    }

    cmp_t cmp;
    vector<value_t> data;
};
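Finally, a minimal usage sketch (not from the original text): instantiating heap_t with greater< pair<int, int> >, the same comparator as in the dijkstra function above, turns it into a min-heap, so it can stand in for std::priority_queue there. The example assumes the heap_t class above is in scope.

#include <iostream>
#include <vector>
#include <functional>
#include <utility>
using namespace std;

// ... heap_t as defined above ...

int main() {
    heap_t< pair<int, int>, greater< pair<int, int> > > q;
    q.push(make_pair(3, 7)); // (distance, vertex)
    q.push(make_pair(1, 4));
    q.push(make_pair(2, 9));
    while (!q.empty()) {
        cout << q.top().first << ' ' << q.top().second << endl;
        q.pop();
    }
    // prints the pairs in increasing order of distance:
    // 1 4
    // 2 9
    // 3 7
    return 0;
}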