About a month ago I published an article about habrakota on Habr. A byproduct of that article was a dump of Habr users' pages, and I wanted to extract some more information from it. Articles analyzing users, articles, comments, and karma appear on Habr regularly, but I did not find a single one that analyzed Habr invites. So I built the graph of Habr invites and looked at some of its characteristics.

Let me remind you that the pages were downloaded in January 2016, so nothing that happened afterwards (registration of new users, deletion of old users, changes in karma) is taken into account. After removing all read-only and deactivated users from the downloaded list, 79870 users remain. As far as I know, this roughly matches the actual number of Habr users (give or take a thousand). To get a graph without holes, I then had to add back 955 read-only and 382 deactivated users: these are users who invited someone or were invited, but were later removed from Habr or moved to read-only for one reason or another. The result is a graph on 81207 vertices.
It is worth noting that getting a list of Habr users is not easy. Most names were collected a couple of years ago, when hub subscriber lists were still available. Those lists no longer exist, so the user names for 2015 and 2016 were extracted from articles, comments, pages of already known users, lists of subscribers, and lists of users from particular cities and countries. I also took frequently occurring user-name prefixes (such as Alex*, admin*, Captain*) and ran several thousand queries on the Habr search page. Active users of Geektimes and Megamozg were added as well, so if you are not on my list, you are well hidden.
So, we have a directed graph with 81207 vertices and 20195 arcs. As you can see, only about 20 thousand users registered through invites from other users; the rest either registered before invites were introduced (more than 40 thousand) or were invited by the UFO.
We call a weakly connected component of this directed graph a habraklan. It is worth noting that these components are, generally speaking, not trees, since one person may receive several invites. As a result, there are cycles (for example, @tangro invited @Milla, and @Milla invited @tangro), loops (for example, @aavezel invited himself), and vertices with several incoming arcs (the user @shara was invited 6 times: by @Deeman, @myagi, @homm, @Azya, @veveve, and @shifttstas). These are exceptions, though, and the graph as a whole looks like a forest.
Our graph has 61021 habraklans. Their size distribution is as follows:
Component size | Number of components |
---|---|
more than 1000 | 1 |
101–1000 | 6 |
11–100 | 436 |
2–10 | 3110 |
1 | 57468 |
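The component count and size buckets above can be reproduced along the following lines. This is a minimal, dependency-free sketch, not the script used for the article: the function names `weak_components` and `size_bucket` and the toy invite list are hypothetical, and a graph library's weakly-connected-components routine would do the same job.

```python
from collections import Counter

def weak_components(nodes, arcs):
    """Group nodes into weakly connected components using union-find."""
    parent = {v: v for v in nodes}

    def find(v):
        # Walk up to the representative, halving the path as we go.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for inviter, invitee in arcs:
        # Arc direction is ignored: weak connectivity treats arcs as undirected.
        parent[find(inviter)] = find(invitee)

    groups = {}
    for v in nodes:
        groups.setdefault(find(v), set()).add(v)
    return list(groups.values())

def size_bucket(size):
    """Map a component size to the buckets used in the table."""
    if size == 1:
        return "1"
    if size <= 10:
        return "2-10"
    if size <= 100:
        return "11-100"
    if size <= 1000:
        return "101-1000"
    return "more than 1000"

# Hypothetical toy data: (inviter, invitee) pairs with a 2-cycle and a self-loop.
users = ["a", "b", "c", "d", "e", "f"]
invites = [("a", "b"), ("b", "a"), ("b", "c"), ("d", "d"), ("e", "f")]

clans = weak_components(users, invites)
distribution = Counter(size_bucket(len(c)) for c in clans)
print(sorted(len(c) for c in clans))
print(dict(distribution))
```

Cycles and self-loops need no special handling here, because only connectivity matters for this step.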
Let's look at the biggest components.
No | Size | Root vertices |
---|---|---|
1 | 1027 | @Davekeinz (sent 412 invites, more than anyone else on Habr; this component also contains @Mithgol, who sent 78 invites) |
2 | 584 | @Mudhoney (sent 242 invites) @valemak |
3 | 316 | @XaocCPS (sent 65 invites) |
4 | 272 | @Alaunquirie (invited @BarsMonster, who invited 73 users) @kip |
5 | 189 | @Deeman @homm @DorBer @myagi @Azya @maovrn @fil9 @yoihj |
6 | 106 | @Rossomachin |
7 | 104 | @Garyan |
8 | 97 | @Kukutz (the Yandex component) |
9 | 90 | @Eosunknown |
10 | 85 | @Cigulev @tyr |
11 | 80 | @Mdevils |
12 | 80 | @Nuzgul |
13 | 77 | @Ni404 @tronix286 @Rembish |
14 | 77 | @Tigger |
15 | 76 | @Gaidar |
16 | 70 | @Auren |
17 | 69 | @Saltommeister |
18 | 68 | @Kalan |
19 | 68 | @Alisadenisova |
20 | 67 | @Horsev |
Below are the pictures of these 20 graphs. Green circles are users with positive karma, red circles are users with negative karma, blue circles are users with zero karma, and gray circles are read-only or deactivated users. The area of a circle is proportional to the absolute value of the karma (if that value is greater than 1). All images are links to a larger version.
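The colour and size rules just described can be expressed as a small helper. This is only a sketch: the name `node_style` is hypothetical, and giving every user with |karma| below 1 a unit-area circle is my reading of the "greater than 1" remark rather than something the text states.

```python
import math

def node_style(karma, inactive=False):
    """Return (color, radius) for one user under the colour scheme above."""
    if inactive:
        color = "gray"      # read-only or deactivated
    elif karma > 0:
        color = "green"
    elif karma < 0:
        color = "red"
    else:
        color = "blue"
    # Assumption: the area equals |karma|, but never drops below 1.
    area = max(1.0, abs(karma))
    # Convert area to radius so that the area, not the radius, scales with karma.
    radius = math.sqrt(area / math.pi)
    return color, radius

print(node_style(25.0))
print(node_style(-3.5))
print(node_style(0.0))
print(node_style(120.0, inactive=True))
```

Scaling the radius by the square root keeps a user with four times the karma at four times the area, not sixteen times.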

Let's also look at the "heights" of the habraklans. If we discard the negligible number of graphs with cycles, then dag_longest_path_length(G) gives the following result.
Length of the longest chain | Number of components |
---|---|
9 | 1 |
7 | 2 |
6 | 11 |
5 | 39 |
4 | 125 |
3 | 479 |
2 | 2888 |
1 | 57468 |
The longest chain is as follows: @Garyan invited @Andrey_Rogovsky, who invited @DmitryGushin, who invited @Uncle_Sam, who invited @RootHell, who invited @alexey_qwe, who invited @Doom2, who invited @Odnoklassniki_ru, who finally invited @DarkDefender.
The analysis matches the expectation that most habraklans are small and have a small "height".
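For readers who want to see what the longest-chain computation involves, here is a dependency-free sketch of a longest-path calculation on a directed acyclic graph. The function name `longest_chain_length` is hypothetical. It returns the number of arcs, as networkx's `dag_longest_path_length` does for unweighted graphs. The table above appears to count users instead (a lone user is a chain of length 1, and the longest chain lists 9 users), so the sketch adds 1 when printing. That adjustment is my assumption.

```python
from collections import defaultdict, deque

def longest_chain_length(nodes, arcs):
    """Number of arcs on the longest directed path of a DAG.

    Raises ValueError if the graph contains a cycle (self-loops included).
    """
    successors = defaultdict(list)
    indegree = {v: 0 for v in nodes}
    for inviter, invitee in arcs:
        successors[inviter].append(invitee)
        indegree[invitee] += 1

    # Kahn's algorithm: process vertices in topological order.
    queue = deque(v for v in nodes if indegree[v] == 0)
    longest = {v: 0 for v in nodes}  # longest path (in arcs) ending at v
    processed = 0
    while queue:
        v = queue.popleft()
        processed += 1
        for w in successors[v]:
            longest[w] = max(longest[w], longest[v] + 1)
            indegree[w] -= 1
            if indegree[w] == 0:
                queue.append(w)

    if processed != len(indegree):
        raise ValueError("graph has a cycle, longest path is undefined")
    return max(longest.values(), default=0)

# Hypothetical toy component: a chain a -> b -> c plus a shortcut a -> c.
chain_arcs = longest_chain_length(["a", "b", "c"],
                                  [("a", "b"), ("b", "c"), ("a", "c")])
print(chain_arcs + 1)  # chain length counted in users
```

Components with cycles raise an error here, which mirrors the decision above to set those few graphs aside.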
Now, remember that users have karma. Simple summation shows that Habr holds at least 450323.4 units of positive karma in total. (By the way, 10579 users have karma greater than or equal to 10, so theoretically this article could gain 10578 pluses.)
Let's see which habraklans have the largest reserves of karma.
No | Total karma | Root vertices |
---|---|---|
1 | 6184.4 | @Mudhoney @valemak |
2 | 5333.7 | @Davekeinz |
3 | 4720.8 | @XaocCPS |
4 | 3587.1 | @Alaunquirie @kip (@BarsMonster is here) |
5 | 2464.5 | @Deeman @homm @DorBer @myagi @Azya @maovrn @fil9 @yoihj |
6 | 2390.1 | @Horsev (@PapaBubaDiop and @Milfgard are here) |
7 | 1984.9 | @Cigulev @tyr (@Zelenyikot is here) |
8 | 1780.2 | @Ni404 @tronix286 @Rembish |
9 | 1606.1 | @Eosunknown |
10 | 1526.9 | There is no root here; everything starts with the @tangro and @Milla cycle |
11 | 1319.3 | @Kit |
12 | 1304.1 | @Ocelot |
13 | 1299.5 | @Auren |
14 | 1104.5 | @Kalan |
15 | 1009.1 | @Rossomachin |
16 | 985.5 | @Easy_john |
17 | 932.3 | @Assuri |
18 | 871.7 | @Sourcerer |
19 | 845.2 | @LukaSafonov |
20 | 838.6 | @Mdevils |
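A ranking like the one above comes down to summing karma over each component. The sketch below assumes the total is a plain sum of all members' karma, negative values included, which the text does not spell out. The names `total_karma`, `clans`, and `karma`, and the numbers themselves, are hypothetical toy data.

```python
def total_karma(component, karma):
    """Sum the karma of every user in a component (missing users count as 0)."""
    return round(sum(karma.get(user, 0.0) for user in component), 1)

# Hypothetical toy data: three components and per-user karma.
clans = [{"a", "b", "c"}, {"d"}, {"e", "f"}]
karma = {"a": 12.5, "b": -3.0, "c": 40.1, "d": 0.0, "e": 7.2, "f": 1.3}

# Rank components by total karma, largest first.
ranking = sorted(clans, key=lambda c: total_karma(c, karma), reverse=True)
totals = [total_karma(c, karma) for c in ranking]

# Site-wide figures of the kind quoted earlier in the text.
positive_total = round(sum(k for k in karma.values() if k > 0), 1)
can_vote = sum(1 for k in karma.values() if k >= 10)

print(totals)
print(positive_total, can_vote)
```

Rounding to one decimal matches the precision of the karma values in the table and avoids floating-point noise in the sums.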
Below are pictures of the graphs that did not appear above.

Some users also specify a country in the "From" field of their page. The top countries by number of users can be found on Habr itself; I was more interested in invites where the inviter and the invitee are in different countries. Such invites characterize the “geographical” connectivity of the Habr community.
At first I wanted to build a so-called chord diagram, but I didn't find an easy way to do this in Python, so I am showing the upper left corner of the corresponding matrix instead. (If someone can tell me how to build such a diagram, I will be grateful.) The bluer a cell in the picture, the greater the logarithm of the number of invites from country 1 to country 2.

The connectivity between Russia, Ukraine, Belarus, the USA, and Germany is clearly noticeable.
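A matrix of this kind can be assembled with a counter over (inviter country, invitee country) pairs. The sketch below is an illustration under assumptions, not the original script. The names `cross_country_counts` and `log_matrix` are hypothetical. Users without a country are skipped. The cell value is log(1 + count), so that empty cells map to 0, and the article's exact logarithm may differ.

```python
import math
from collections import Counter

def cross_country_counts(arcs, country):
    """Count invites whose inviter and invitee list different countries."""
    counts = Counter()
    for inviter, invitee in arcs:
        src, dst = country.get(inviter), country.get(invitee)
        # Skip users with no country and invites within one country.
        if src and dst and src != dst:
            counts[(src, dst)] += 1
    return counts

def log_matrix(counts, countries):
    """Rows are inviter countries, columns are invitee countries."""
    return [[math.log1p(counts[(src, dst)]) for dst in countries]
            for src in countries]

# Hypothetical toy data; user "d" has no country on the profile page.
invites = [("a", "b"), ("a", "c"), ("b", "c"), ("d", "a")]
country = {"a": "Russia", "b": "Ukraine", "c": "Russia"}

counts = cross_country_counts(invites, country)
matrix = log_matrix(counts, ["Russia", "Ukraine"])
print(dict(counts))
print(matrix)
```

The resulting nested list can be passed to any heatmap routine; the diagonal stays at zero because same-country invites are excluded.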
Another piece of information that is not related to invites but is easy to extract from the users' pages is the registration date and the date of the last visit. The following table shows how many users registered in a given year and how many of them have visited the site since January 1, 2015 (otherwise we consider the user no longer active on Habr).
Year of registration | Registered | Active since January 1, 2015 |
---|---|---|
2006 | 3091 | 909 |
2007 | 19433 | 5511 |
2008 | 22031 | 6348 |
2009 | 6032 | 3094 |
2010 | 6826 | 3345 |
2011 | 9341 | 6355 |
2012 | 5841 | 4160 |
2013 | 4029 | 2819 |
2014 | 2684 | 2100 |
2015 | 1473 | 1473 |
Total | 80781 | 36114 |
The same in the form of a diagram.

We see that half of the users registered in 2007 and 2008, and that many old-timers are still active.
That's all. The table with the source data and a script for drawing the graphs are available on GitHub. An archive with the raw data is available on request.
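The per-year tally can be sketched as follows, assuming each user record carries a registration date and a last-seen date. The function name `yearly_activity` and the sample records are hypothetical; the cut-off date is the one stated above.

```python
from collections import Counter
from datetime import date

ACTIVE_SINCE = date(2015, 1, 1)  # cut-off used in the text

def yearly_activity(users):
    """Map registration year -> (registered, still active) counts."""
    registered = Counter()
    active = Counter()
    for reg_date, last_seen in users:
        registered[reg_date.year] += 1
        if last_seen >= ACTIVE_SINCE:
            active[reg_date.year] += 1
    return {year: (registered[year], active[year])
            for year in sorted(registered)}

# Hypothetical toy records: (registration date, last-seen date).
sample = [
    (date(2007, 3, 14), date(2016, 1, 5)),
    (date(2007, 9, 1), date(2010, 6, 30)),
    (date(2012, 2, 2), date(2015, 1, 1)),
    (date(2015, 7, 7), date(2015, 7, 7)),
]

table = yearly_activity(sample)
for year, (n_registered, n_active) in table.items():
    print(year, n_registered, n_active)
```

Anyone registered in 2015 is active by definition under this rule, which is why the two columns coincide for that year in the table.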