The theory of graphs in the Game of Thrones

Recently, on Geektimes I published an article where I cited some superficial statistics from the “Song of Ice and Flame” series of books. But I did not go into the most interesting part, in the graph of social relations, because the topic deserves special attention. In this article, I will demonstrate how graph theory can help in analyzing such data and present the implementation of the algorithms that I used.

Anyone interested, welcome under cat.

From the survey of the aforementioned article, I concluded that more than a third of the Geektimes audience was familiar with the book of fantasy novel, and more than half with the plot of the series. I hope the audience Geektimes and Habrahabr intersect and you may be interested in not only the algorithms used, but also the results.
')

Data

Let's start, of course, with the main thing - the data. We have a social activity graph:

It depicts characters from the world of thrones at the vertices of the graph, where the size of the vertex depends on the words spoken, and the dialogues of the characters on the edges of the graph, where the thickness of the edge depends on the number of conversations held.

General information about the graph:

The graph is undirected.
Summits (read characters): 1105
Reber (read dialogs): 3505

It's all good, but who made the list of all the conversations, and even determined their identity, you ask me (in 5 books, there are almost 2 million words ). The method of conditional random fields , I answer. Yes, machine learning was attracted to collect data and, as the model “trainer” states, the accuracy of determining the ownership of a conversation is about 75% . I will add a questionnaire at the end of the article to find out whether it is worth translating this article about how the method of conditional random fields was applied.

So, the input data format for all algorithms will be the same:

The first line contains V and E , the number of vertices and edges in the graph, respectively.

Next come the V lines with the character ID and name. After this, E lines of the form uvw are the description of edges. It means that between u and v there is an edge with weight w , where u and v is the character ID . Let me remind you that the weight means the number of dialogues between the respective characters. The weight of the vertices themselves will not be taken into account anywhere. I did not find a worthy application for him. Oh yeah, the code of the algorithms will be presented in C ++.

We, of course, need to read the data. For this, I created the definitions.h and definitions.cpp files. Further I will connect definitions.h everywhere that the code was less and the code was read easier. Each subsequent algorithm is implemented as a separate function and is called in the function main .

defenitions.h

#ifndef DEF_H #define DEF_H #define _CRT_SECURE_NO_DEPRECATE #include <cstdio> #include <iostream> #include <string> #include <vector> #include <map> #include <algorithm> #include <queue> #include <set> #include <iterator> #include <functional> using namespace std; extern int reverseID[]; extern vector<int> id; extern vector< vector<int> > weight, edge; extern map<int, string> name; extern int V, E; void read_data(); #endif

definitions.cpp

 #include "definitions.h" int reverseID[3000]; vector<int> id; // origin ID of characters vector< vector<int> > weight; // weights of edges vector< vector<int> > edge; // the graph's edges map<int, string> name; // names of characters int V, E; // amount of vertices and edges void read_data() { freopen("input.in", "r", stdin); // read from the input file cin >> V >> E; id.resize(V); weight.resize(V); edge.resize(V); int u; char s[100]; for (int i = 0; i < V; i++) { // read the names scanf("%d %[^\n]", &u, s); name[i] = string(s); id[i] = u; // origin ID by assigned ID reverseID[u] = i; // assigned ID by origin ID } int a, b, w; for (int i = 0; i < E; i++) { // read the edges and weights cin >> a >> b >> w; edge[reverseID[a]].push_back(reverseID[b]); edge[reverseID[b]].push_back(reverseID[a]); weight[reverseID[a]].push_back(w); weight[reverseID[b]].push_back(w); } }

main.cpp

 #include "definitions.h" #include "degree.h" #include "dfsbfs.h" #include "bridges_articulations.h" #include "cliques.h" #include "spanning.h" #include "cut.h" int main() { read_data(); degree(); count_components(); calculate_diameter(); find_cliques(); get_spanning_tree(); find_bridges_and_articulations(); find_cut(); }

Degree of vertex

To begin with, we will try to implement something very simple, which is even embarrassing to call an algorithm, but it will be interesting for the reader / viewer. Find the degree of each vertex. The degree of a vertex is the number of edges incident to (adjacent to) a vertex. This parameter in the context of the graph, which we have, will show us how many “friends” the character has.

This can be done in a single pass and the complexity of this algorithm is O (V + E). If we apply and sort the result, as I did, then the complexity will be O (E + V * log (V)).

Algorithm for determining the degree of vertices

 #include "definitions.h" void degree() { freopen("degree.txt", "w", stdout); // output file vector< pair<int,int> > degree(V); for (int i = 0; i < V; i++) { degree[i] = make_pair(edge[i].size(), i); } stable_sort(degree.begin(), degree.end()); for (int i = V-1; i >= 0; i--) { cout << name[degree[i].second] << ": " << degree[i].first << endl; } }

Top 10 multiply characters:

Tyrion Lannister: 168
John Snow: 128
Arya Stark: 104
Jamie Lannister: 102
Cersei Lannister: 86
Catelyn Stark: 85
Theon Greyjoy: 76
Daineris Targaryen: 73
Brienne: 71
Sansa Stark: 69

The whole list

It is interesting that in the top there was no one who is familiar with many and is obliged to maintain contact with people, Varis (he is in 25th place). But the seemingly secondary character of Brienne, on whose behalf the chapter is not yet, is in ninth place.

Bypassing the graph

So, we now turn to simple algorithms, namely, to search in depth ( Depth-first search ) and search in width ( Breadth-first search ). Both algorithms are very similar, differ only in the way around the graph. They are used when you want to go through the edges from a given vertex in the graph to another. In the case of a depth search, the graph is traversed starting from the vertex and alternately at the maximum depth along one of the outgoing edges. In the case of a search in width, the graph first goes through all its neighboring vertices, further along the neighbors of neighboring vertices, and so there are no pending vertices left. Both algorithms have O (V + E) complexity.

Graph connectivity

Find the connected components of our graph. A connected component is a subgraph in which all vertices have a path to any other vertex of the component. To do this, run the search algorithm in depth by one of the vertices, and after completion, in the next, not marked by the algorithm, vertex. Thus, each start of the search will mean a new component of connectivity.

Counting code for connected components

 #include "definitions.h" vector<bool> useddfs; vector<int> comp; void dfs(int u) { useddfs[u] = true; comp.push_back(u); for (vector<int>::iterator i = edge[u].begin(); i != edge[u].end(); i++) if (!useddfs[*i]) dfs(*i); } void count_components() { freopen("components.txt", "w", stdout); // output file int comp_num = 0; useddfs.resize(V, false); for (int i = 0; i < V; i++) { if (!useddfs[i]) { comp.clear(); dfs(i); comp_num++; } } cout << comp_num; }

It would be extremely strange if the graph were not connected and there was more than one component in it, wouldn’t it? But to my surprise, there were already 3 components in the graph! But looking at them, I saw that one large component was present, and 2 others, in which there was one character each (It was a smiling old man ( A smiley old man ) who talked with Arya near God's eye ( God's eye ). And the cook ( A Cook ), who played with Tirion on the ship). Nothing unusual, they have no names and it is surprising that the participants in the conversation were generally defined correctly. I, of course, added the missing edges and as a result the whole graph turned out to be connected.

Handshake Theory

The next thing we will find using the search algorithm is already wide - the maximum number of handshakes in the graph, in other words, the diameter of the graph, that is, the longest of the shortest paths between all the vertices of the graph. As it turned out, the maximum number of handshakes is 8. A good result for a work about “medieval centuries”, if we take into account the Theory of 6 handshakes . For example, one of these communication chains:

Mother Mole - Thistle - Varamyr - Jon Snow - Ned Stark - Barristan Selmy - Quentyn Martell - The Torn Prince Tattered Prince ) - Lewis Lanster ( Lewis Lanster )

This algorithm works for O (V * V + V * E). Since you need to run the BFS algorithm from each vertex of the graph. If this graph were a tree , it would take only 2 times to run BFS. First, from any vertex of the graph, and then from the most distant vertex from it. And the greatest depth of the last launch would be the diameter of the graph.

And since I found the longest paths for all the vertices of the graph, then we can calculate the average value, as well as make up the distribution of maximum paths. The average value for the length of the maximum paths turned out to be 6.16 (on average, it fits perfectly into the theory of 6 handshakes), and the total average distance is 3.6, for comparison, this parameter is 4.57 for Facebook.

The distribution of the lengths of the maximum paths:

If you look at the distribution, we see that 77 characters are “in the center” of the graph, in other words, they can contact any other character in no more than 4 others. Among them are all the main characters of the story, except for Dayneris Targaryen and Sansa Stark, and among those to whom less than five chapters in the books are on the list are Barristan Selmi, John Connington and Melisander.

Whole result

The code for finding the given parameters

 #include "definitions.h" vector<bool> usedbfs; void bfs(int u, vector<int> &distance, vector<int> &path) { queue<int> q; q.push(u); usedbfs[u] = true; path[u] = -1; while (!q.empty()) { int u = q.front(); q.pop(); for (vector<int>::iterator i = edge[u].begin(); i != edge[u].end(); i++) { int to = *i; if (!usedbfs[to]) { usedbfs[to] = true; q.push(to); distance[to] = distance[u] + 1; path[to] = u; } } } } void calculate_diameter() { freopen("diameter.txt", "w", stdout); // output file int diameter = 0; int current_max = 0; double average_max = 0; double average_average = 0; vector< vector<int> > distribution(V, vector<int> (0)); vector< vector<int> > max_path; vector<int> farthest_node; for (int i = 0; i < V; i++) { vector<int> distance_to(V, 0); vector<int> path(V,-1); usedbfs.clear(); usedbfs.resize(V, false); bfs(i, distance_to, path); current_max = 0; for (int j = 0; j < V; j++) { average_average += distance_to[j]; if (distance_to[j] > current_max) current_max = distance_to[j]; if (distance_to[j] > diameter) { diameter = distance_to[j]; farthest_node.clear(); max_path.clear(); max_path.push_back(path); farthest_node.push_back(j); } else if (distance_to[j] == diameter){ max_path.push_back(path); farthest_node.push_back(j); } } average_max += current_max; distribution[current_max].push_back(i); } average_max /= V; average_average /= V*V; cout << "Diameter: " << diameter << endl; cout << "Average path: " << average_average << endl; cout << "Average farthest path: " << average_max << endl; vector < vector<int> > farthest_path; for (int i = 0; i < farthest_node.size(); i++) { farthest_path.push_back(vector<int>()); for (int u = farthest_node[i]; u != -1; u = max_path[i][u]) farthest_path[i].push_back(u); } cout << "Farthest paths:" << endl; for (int i = 0; i < farthest_path.size(); i++) { cout << i+1 << ": "; for (vector<int>::iterator j = farthest_path[i].begin(); j != farthest_path[i].end(); j++) cout << name[*j] << ". "; cout << endl; } int minimum = V; cout << "Farthest paths distribution:" << endl; for (int i = 0; i <= diameter; i++) { if (distribution[i].size() != 0 && i < minimum) minimum = i; cout << i << ": " << distribution[i].size() << endl; } cout << "Characters with minimum farthest path:" << endl; for (vector<int>::iterator i = distribution[minimum].begin(); i != distribution[minimum].end(); i++) { cout << name[*i] << endl; } }

Clicks in the graph

The Bron-Kerbosh algorithm allows you to find the maximum clicks in the graph, in other words, subsets of vertices, any two of which are connected by an edge. In the context of the graph under consideration, this will make it possible to find strongly related companies that have communicated with each other in history. The complexity of the algorithm is linear with respect to the number of clicks in the graph, but in the worst case, it is exponential O (3 ^{V / 3} ), the algorithm solves the NP complete problem all the same. The algorithm itself is a recursive function, which for each vertex finds the maximum click, i.e. one in which no other vertices can be added. So, the results:

Click size distribution:

As you can see, the maximum click size is 9 characters. For example, one of these is the Lannister company: Tyrion, Jamie, Cersei, Varis, Tywin, Kevan, Pyzel, Petir and Mace Tyrell. Interestingly, all clicks of size 8 and 9 are formed in or near Royal Harbor. And the maximum click with daneris size 5.

Whole result

Click Location Code

 #include "definitions.h" vector < vector<int> > cliques; bool compare_vectors_by_size(const vector<int> &i, const vector<int> &j) { return (i.size() > j.size()); } void bron_kerbosch(set <int> R, set <int> P, set <int> X) { // where R is probable clique, P - possible vertices in clique, X - exluded vertices if (P.size() == 0 && X.size() == 0) { // R is maximal clique cliques.push_back(vector<int>(0)); for (set<int>::iterator i = R.begin(); i != R.end(); i++) { cliques.back().push_back(*i); } } else { set <int> foriterate = P; for (set<int>::iterator i = foriterate.begin(); i != foriterate.end(); i++) { set <int> newR; set <int> newP; set <int> newX; newR = R; newR.insert(*i); set_intersection(P.begin(), P.end(), edge[*i].begin(), edge[*i].end(), inserter(newP, newP.begin())); set_intersection(X.begin(), X.end(), edge[*i].begin(), edge[*i].end(), inserter(newX, newX.begin())); bron_kerbosch(newR, newP, newX); P.erase(*i); X.insert(*i); } } } void find_cliques() { freopen("cliques.txt", "w", stdout); // output file set <int> P; for (int i = 0; i < V; i++) { P.insert(i); stable_sort(edge[i].begin(), edge[i].end()); } bron_kerbosch(set<int>(), P, set<int>()); stable_sort(cliques.begin(), cliques.end(), compare_vectors_by_size); cout << "Distribution:" << endl; int max_size = 0; int max_counter = 0; for (int i = cliques.size()-1; i >= 0 ; i--) { if (cliques[i].size() > max_size) { if (max_counter > 0) { cout << max_size << ": " << max_counter << endl; } max_size = cliques[i].size(); max_counter = 0; } max_counter++; } if (max_counter > 0) { cout << max_size << ": " << max_counter << endl; } cout << "Cliques:" << endl; for (int i = 0; i < cliques.size(); i++) { cout << i + 1 << "(" << cliques[i].size() << "): "; for (int j = 0; j < cliques[i].size(); j++) { cout << name[cliques[i][j]] << ". "; } cout << endl; } }

Spanning tree

And now let's simplify our link graph a bit, leaving only the “core” in it, the so-called spanning tree , i.e. the most important edges of connection and at the same time turning the graph into a tree with all the original vertices The Kruskal algorithm will help us in this. The algorithm is greedy, for its implementation it is only necessary to sort all edges by their weight and add each edge in turn if it connects two components. If you use the correct data structure (usually use a system of disjoint sets ), then the complexity of the algorithm is O (E (logE) + V + E) = O (E (logV)). Below is the result of the graph that has undergone these manipulations. I also removed the edges with a weight of 1 for readability.

the picture is clickable

So, from this graph you can see the main characters and their relationship with each other. As I expected, Tyrion Lannister is in the "center" of all ties, with 4 large branches emanating from him: Cersei Lannister, John Snow, Sansa Stark and Jorah Mormont (Daineris branch). The last, in turn, heroes of the next level of connections. Quite an expected result, isn't it?

Whole result

Spanning tree construction code

 #include "definitions.h" vector<int> p; int dsu_get(int v) { return (v == p[v]) ? v : (p[v] = dsu_get(p[v])); } void dsu_unite(int u, int v) { u = dsu_get(u); v = dsu_get(v); if (rand() & 1) swap(u, v); if (u != v) p[u] = v; } void get_spanning_tree() { freopen("spanning.txt", "w", stdout); // output file vector < pair < int, pair<int, int> > > edges_weights, result; vector < vector<bool> > used; for (int i = 0; i < V; i++) { used.push_back(vector<bool>(V, false)); } for (int i = 0; i < V; i++) { for (int j = 0; j < edge[i].size(); j++) { int cur_v = edge[i][j]; if (!used[i][cur_v]) { edges_weights.push_back(make_pair(weight[i][j], make_pair(i, cur_v))); used[i][cur_v] = true; used[cur_v][i] = true; } } } sort(edges_weights.begin(), edges_weights.end(), greater< pair < int, pair<int, int> > >()); for (int i = 0; i < V; i++) { p.push_back(i); } for (int i = 0; i < E; i++) { int u = edges_weights[i].second.first; int v = edges_weights[i].second.second; int w = edges_weights[i].first; if (dsu_get(u) != dsu_get(v)) { result.push_back(edges_weights[i]); dsu_unite(u, v); } } for (vector < pair < int, pair<int, int> > >::iterator i = result.begin(); i != result.end(); i++) { cout << id[(i->second).first] << " " << id[(i->second).second] << " " << i->first << endl; } }

But how many spanning trees, including not only the maximum can be built? Incredibly much in this column, of course. But they can be calculated using the Kirchhoff theorem . The theorem says that if we construct a matrix where all elements of the adjacency matrix are replaced with opposite ones, and the degrees of the corresponding vertices are on the main diagonal, then all algebraic complements of the resulting matrix will be equal and this value will be the number of spanning trees of the graph.

Bridges and Hinges

Bridges in the graph are called edges, as they are removed, the number of connected components increases. Similar definitions are for hinges, but unlike bridges, hinges are vertices, as they are removed, the number of connected components increases. In this column, the presence of bridges will mean that in the world of thrones there are connections that are the only source of communication between individual groups of characters. The hinges are heroes who are the only mediator to a group of other characters. Bridges and hinges are not very hard to look for. To do this, we need a familiar depth-depth search algorithm, but slightly modified. It is necessary to use the so-called entry time to the vertex and the exit time function ( fout ) on the vertex, which will take the values of the minimum entry time and the vertex fout values at which the depth search is passed, thus, when passing the edge, it turns out that the function value fout at the top is more than the time of entry into it, then it will be a bridge, as well as a hinge, and if it is equal, then only a hinge. In other words, we check if we return to the same edge / vertex after the detour and only to it in the detour of a part of the graph, if so, this will be the required bridge or hinge.

As expected, all the hinges and bridges covered only irrelevant characters. The number of vertices in the components that would be isolated, by removing the hinges and bridges, does not exceed 4x. This means that there is no hero who could not be sacrificed. All are interconnected through at least two other characters.

Whole result

Bridge and joint search code

 #include "definitions.h" vector<bool> used, usedcnt; int timer = 0; vector<int> tin, fout, articulations; vector < pair<int,int> > bridges; int count_rec(int v, int skip) { int cnt = 1; usedcnt[v] = true; for (int i = 0; i < edge[v].size(); i++) { int to = edge[v][i]; if(to == skip || usedcnt[to]) continue; cnt += count_rec(to, skip); } return cnt; } int count_from(int v, int skip, bool clean = true){ usedcnt.resize(V, false); if (clean) { for (int i = 0; i < usedcnt.size(); i++) usedcnt[i] = false; } return count_rec(v, skip); } void dfs(int v, int prev = -1) { used[v] = true; tin[v] = fout[v] = timer++; int root_calls = 0; bool in_ar_list = false; for (int i = 0; i < edge[v].size(); i++) { int to = edge[v][i]; if (to == prev) continue; if (used[to]) fout[v] = min(fout[v], tin[to]); else { dfs(to, v); fout[v] = min(fout[v], fout[to]); if (fout[to] > tin[v]) { bridges.push_back(make_pair(v, to)); } if (fout[to] >= tin[v] && prev != -1 && !in_ar_list) { articulations.push_back(v); in_ar_list = true; } root_calls++; } } if (prev == -1 && root_calls > 1) articulations.push_back(v); } void find_bridges_and_articulations() { freopen("bridges_and_articulations.txt", "w", stdout); // output file used.resize(V, false); tin.resize(V); fout.resize(V); for (int i = 0; i < V; i++) if (!used[i]) dfs(i); cout << "Bridges:" << endl; for (int i = 0; i < bridges.size(); i++) { cout << name[bridges[i].first] << " (" << count_from(bridges[i].first, bridges[i].second) << ") - " << name[bridges[i].second] << " (" << count_from(bridges[i].second, bridges[i].first) <<")" << endl; } cout << endl << "Articulation points:" << endl; for (int i = 0; i < articulations.size(); i++) { int cur = articulations[i]; cout << name[cur]; for (int i = 0; i < usedcnt.size(); i++) usedcnt[i] = false; for (int j = 0; j < edge[cur].size(); j++) { if (!usedcnt[edge[cur][j]]) { int comp_size = count_from(edge[cur][j], cur, false); cout << " (" << comp_size << ")"; } } cout << endl; } }

Minimum cut

But it’s still interesting to know how many heroes you need to “kill” in order to completely eliminate the contact of the two most popular (according to the above statistics) characters, such as Tirion Lannister and Arya Stark, or John Snow and Cersei Lannister. For this we will use the algorithm Dinnitsa . First, let's understand how the algorithm works.

By the Ford-Fulkerson theorem, it follows that the magnitude of the maximum flow in a graph of paths is equal to the value of the carrying capacity of its minimum cut. In other words, we can find the minimum cut between two vertices (sink and source) if we find the maximum flow between these vertices. Algorithm Dinitsa just allows you to find the value of the maximum flow and, in fact, the flow itself. I will not analyze in detail the algorithm itself and give you the opportunity to understand it yourself. To understand the result obtained, it is enough to know that the flow is a certain function that, for each edge lying in the path from the source to the drain, determines a positive value that does not exceed the capacity of this edge. For simplicity, one can imagine a pipe system with water, where the volume of water flowing from the source to the drain at time t is the flow quantity.

The task I set is slightly different from the one that the algorithm solves. The fact is that the flow is defined on the edges, and if we find the minimal cut, then it will cut the graph along the edges, i.e. connections in the graph, but we need to cut the graph not at the edges, but at the vertices, i.e. characters. To solve this problem it is necessary to subject the graph to small changes. We split each vertex in the graph, say, instead of the vertex v , we have vertices v ₁ and v ₂ , then if there was an edge between two vertices v and u , then we redirect v ₁ to u ₂ and u ₁ to v ₂ , instead of an edge (u , v) we get the edges (v ₁ , u ₂ ) and (u ₁ , v ₂ ) . And between each forked vertices v ₁ and v ₂ we draw an edge (v ₂ , v ₁ ) . In the resulting, already oriented graph, all the links will be preserved, but where there were vertices before, edges will appear. If the weight of these edges is 1, and all the others, say, infinity (in practice, you can use the number of vertices in the graph), then the edge between the split vertices will be a weak link in the graph and, according to the “fuse” principle, will not allow more flow through will give the opportunity to find out the number of vertices that need to be removed in order for the graph to become disconnected. And then we run the algorithm Dinica on the resulting graph.

The complexity of the algorithm is O (n ² m). Therefore, we will search for the value of the maximum flow not for all pairs of vertices, but only for the characters that have the greatest number of connections, otherwise for a given graph the complexity will be O (n ⁵ ), where n = 1000 and everything will work for many hours. Below I will give the results on the top 100 characters on the links.

I was a little surprised at the results. It turned out that in the top 100, you can cut the graph by getting rid of at least 6 characters. All such connections contain either John Connington or Euron Greyjoy as a drain / source. But this is not the main characters. Interestingly, in the top 10, basically the minimum flow is about 45. For example, between Tirion and Arya, the flow is 53, and between John Snow and Cersei 42. But there is also a character who can be separated by removing 16 other characters. This is Dayneris Targaryen, not surprising, because she is the most distant heroine on the map of thrones.

It would also be interesting to find out who is the most "important" hero in the streams. Those.through whom most often flows the maximum flow. There is something to be surprised. Top 10 important characters in the streams (in brackets the corresponding number of entries in the streams):

Tyrion Lannister (4109)
Jamie Lannister (3067)
Arya Stark (3021)
John Snow (2883)
Eddard Stark (2847)
Cersei Lannister (2745)
Theon Greyjoy (2658)
Brienne (2621)
Sandor Klegane (2579)
Sansa Stark (2573)

, , ( ) . , , , . , 10 , , , 10 :

11.
12.
13.
14.
15.
16.
17.
18.
19.
20.

6 5. . ?

, , «» , , , . (A Man) (A Guard). , .

 #include "definitions.h" const int INF = 1000000000; struct edge_s{ int a, b, c, flow; // a->b, c-capacity, flow=0 edge_s(int a, int b, int c, int flow) : a(a), b(b), c(c), flow(flow) {} }; vector< pair<int, int> > top_degree; vector< pair<int, int> > top_flow; vector<int> dis; vector<int> ptr; vector<edge_s> edgelist; vector< vector<int> > edges_dup; bool compare_pairs_by_second(const pair<int, int> &i, const pair<int, int> &j) { return (i.second > j.second); } bool bfs_dinic(int s, int t) { queue<int> q; q.push(s); dis[s] = 0; while (!q.empty() && dis[t] == -1) { int v = q.front(); q.pop(); for (int i = 0; i < edges_dup[v].size(); i++) { int ind = edges_dup[v][i], next = edgelist[ind].b; if (dis[next] == -1 && edgelist[ind].flow < edgelist[ind].c) { q.push(next); dis[next] = dis[v] + 1; } } } return dis[t] != -1; } int dfs_dinic(int v, int t, int flow) { if (!flow) { return 0; } if (v == t) { return flow; } for (; ptr[v] < (int) edges_dup[v].size(); ptr[v]++) { int ind = edges_dup[v][ptr[v]]; int next = edgelist[ind].b; if (dis[next] != dis[v] + 1) { continue; } int pushed = dfs_dinic(next, t, min(flow, edgelist[ind].c - edgelist[ind].flow)); if (pushed) { edgelist[ind].flow += pushed; edgelist[ind ^ 1].flow -= pushed; return pushed; } } return 0; } long long dinic_flow(int s, int t) { long long flow = 0; while (true) { fill(dis.begin(), dis.end(), -1); if (!bfs_dinic(s, t)) { break; } fill(ptr.begin(), ptr.end(), 0); while (int pushed = dfs_dinic(s, t, INF)) { flow += pushed; } } return flow; } void dinic_addedge(int a, int b, int c) { edges_dup[a].push_back((int) edgelist.size()); edgelist.push_back(edge_s(a, b, c, 0)); edges_dup[b].push_back((int) edgelist.size()); edgelist.push_back(edge_s(b, a, 0, 0)); } void find_cut() { freopen("cut_min.txt", "w", stdout); // output file dis.resize(2 * V, 0); ptr.resize(2 * V, 0); edges_dup.resize(2 * V, vector<int>(0)); const int top = min(10, V); for (int i = 0; i < V; i++) top_flow.push_back(make_pair(0, i)); for (int i = 0; i < V; i++) { top_degree.push_back(make_pair(edge[i].size(), i)); for (int j = 0; j < edge[i].size(); j++) { int to = edge[i][j]; dinic_addedge(i, to + V, V); } dinic_addedge(i + V, i, 1); } stable_sort(top_degree.rbegin(), top_degree.rend()); cout << "Minimum cut between characters:" << endl; for (int i = 0; i < top; i++) { for (int j = i + 1; j < top; j++) { vector<int>::iterator p = find(edge[top_degree[i].second].begin(), edge[top_degree[i].second].end(), top_degree[j].second); if (p != edge[top_degree[i].second].end()) continue; int flow = dinic_flow(top_degree[i].second, top_degree[j].second + V); cout << name[top_degree[i].second] << " -> " << name[top_degree[j].second] << " = " << flow << endl; for (int k = 0; k < edgelist.size(); k++) { if ((edgelist[k].a == (edgelist[k].b + V)) && (edgelist[k].flow == 1)) { top_flow[edgelist[k].b].first++; } edgelist[k].flow = 0; } } } stable_sort(top_flow.rbegin(), top_flow.rend()); cout << endl << "Top involved characters in the flows:" << endl; for (int i = 0; i < top; i++) { cout << name[top_flow[i].second] << " (" << top_flow[i].first << ")" << endl; } }

. , . , , . GitHub, .