At leisure, I got an interesting idea that I developed into an algorithm for finding the smallest common ancestor (LCA) of two vertices in a tree. Before the emergence of this idea of other algorithms for searching LCA, I did not know. Having checked the correctness of the work, I hurried to study other algorithms for solving this problem, but I did not find any similar ones. Now I hasten to share it with the community.
Introduction
A tree is an undirected connected graph of
N vertices and
N-1 edges. From any vertex to any other, there is exactly one simple path.
The root of the tree will be called such a vertex, from which the direction of movement along the tree is specified.
The smallest common ancestor of two vertices
u and
v will be called such a vertex
p that lies on the path from the root to the vertex
v , and to the vertex
u , as well as the most distant from it.
Input data
Information about the tree is input:
N is the number of vertices,
N-1 is a pair of vertices that are connected by an edge, and
M is the number of requests. Next, the program receives requests: two vertices
u and
v , for which you want to find their smallest common ancestor.
The idea of the algorithm
For each vertex, we will store the distance to the root and the previous vertex (ancestor) from which we came to it. Further we will rise from the vertices
u and
v to the root of the tree. At each step, we will choose the vertex
u or
v that which is the most distant from the root and then consider its ancestor instead, until the paths formed from the initial
u and
v are in one vertex - their smallest common ancestor. With different tree variants, this path may consist of
N steps, which, with a large number of vertices and queries, will work too slowly. This implementation requires O (N) time to execute each request.
Now we improve the algorithm. For each vertex, we will store the distance to the root of the tree
dist , the number of descendants of its
kids , as well as the ancestor (the choice of which will be determined below) from which we came to it
last and the number of the vertex where the ancestor goes an edge on the way to this vertex
turn .
We declare the necessary variables, for convenience, this will be done in global memory.
const int SIZE = 100050; vector<int> v[SIZE];
Now for each vertex using the recursive function (call
k_go (0) ) we calculate the number of descendants from it:
int k_go(int s) { int res = 0; vis[s] = true; for(int i = 0, maxi = v[s].size(); i < maxi; i++) { if(!vis[v[s][i]]) res += k_go(v[s][i]) + 1; } kids[s] = res; return res; }
And now we finish the preparatory part of the algorithm and fill in the necessary information for each vertex. Call the function
l_go (0, 0, 0, 1) :
void l_go(int s, int d, int l, int r) { vis[s] = true; dist[s] = d; last[s] = l; turn[s] = r; int maxval = 0; int idx = -1; for(int i = 0, maxi = v[s].size(); i < maxi; i++) { if(!vis[v[s][i]]) { int k = kids[v[s][i]]; if(k > maxval) idx = v[s][i], maxval = k; } } for(int i = 0, maxi = v[s].size(); i < maxi; i++) { if(!vis[v[s][i]]) { if(idx == v[s][i]) l_go(v[s][i], d + 1, l, r); else l_go(v[s][i], d + 1, s, v[s][i]); } } }
')
Now I will explain how the last part of the code works, since the search for the number of descendants at the top is a fairly typical task.
For each vertex we look at its descendants and from them we choose such a vertex with the maximum number of descendants. For her, all information about the ancestor will be inherited from the current, and only the distance to the root will change. For all other descendants of this vertex, now the ancestor of the
last will be the current vertex, and the
turn is the descendant itself.
Now I argue that if we take some vertex
a in the tree and until we reach the root from it, then the number of transitions to the
last ancestor does not exceed the binary logarithm of the number of vertices in the tree.
Proof: suppose we are at some vertex
v which has one descendant. Then it is obvious that the descendant of the
last link to the ancestor will not change. In the case of two or more vertices, we get one descendant (which has the largest number of descendants among all the descendants of the current vertex) from whom the link will inherit from the current one and all the others whose updates will be updated. Denote the total number of descendants of the current vertex for
N. Then from the descendant of this vertex, in which we update the reference to the ancestor, there will be no more than
N / 2 descendants (otherwise this number will be the maximum, but then updating the reference is not required). Thus, on each of the vertices, which is the ancestor of the
last for someone, no more than half of the vertices are left, of all the descendants coming from it. The total length of such a path will not exceed the binary logarithm of
N.We now turn to the main function of the algorithm and the explanation of why it works.
So, we have two vertices
u and
v , for which we need to know the answer to the query. Call the function
p = lca (u, v) :
int lca(int a, int b) { if(turn[a] == turn[b]) { if(dist[a] < dist[b]) return a; else return b; } if(dist[last[a]] > dist[last[b]]) return lca(last[a], b); return lca(last[b], a); }
The function recursively rises in the direction of the root. For each subsequent vertices
a and
b, we first check the vertex at which the last turn was made. If it coincides, this means that either the vertex
a lies on the path to the vertex
b from the root, or vice versa. Depending on which vertex is closer to the root, that is the desired smallest common ancestor.
Otherwise, it is necessary to bring vertex
a or
b to the root. We will approach on the basis of which of them will be farther from the root in the event of a transition to their ancestor (one vertex will constantly try to catch up with the second vertex on the way to the root). Thus, sooner or later the vertices will come either to one vertex, which will immediately be their smallest common ancestor, or one of the vertices will lie on the path from the root to another vertex, as described in the first step.
Evaluation of the algorithm
To implement the algorithm,
O (N) memory is required (the entire memory consumes
6N ),
O (N) preliminary calculation and
O (M * log (N)) time to answer all requests.
The algorithm allows you to respond to requests on a predetermined tree and respond to requests immediately as they arrive for
O (log (N)) .
Efficiency of other algorithms for solving the LCA search problem
There are several algorithms for solving this problem, each of which is characterized by the complexity of writing, the time for preliminary calculation, the answer to the request and the size of the required memory.
1)
Answer to the request for O (log N), preliminary calculation for O (N). The complexity of the implementation of the need to use the data structure "tree of segments" or "sqrt-decomposition."
2)
Binary ascent method . Answer to the request for O (log N), preliminary calculation for O (log N), used memory (N * log (N)). A fairly simple implementation with a slightly better runtime than the algorithm I have cited, but at the cost of additional memory used.
3)
Algorithm Farah-Colton and Bender . This is the most optimal algorithm that allows you to respond to a request for O (1), but also requires a lot of additional memory and is very complex in implementation.
As well as an algorithm that allows you to respond to a request for O (1), but for this you need to know the information about all requests in advance.
4)
Tarjan AlgorithmConclusion
Thus, the algorithm presented in the implementation complexity is slightly inferior to the slightly faster binary recovery algorithm, but it requires less memory and time for a preliminary calculation. I would like to hear on this score the opinion of those who are familiar with the algorithms for solving this problem in order to evaluate the expediency of its use and uniqueness. Thanks to everyone who took the time to understand this article.
UPD: Knowledgeable people told me that my algorithm is just one of the applications of
Heavy-Light decomposition. Before that, I was not familiar with this decomposition, but in any case it is nice to invent something for myself, even if it exists.