
Habraklans

About a month ago I published an article about habrakota on Habr. A byproduct of that article was a dump of Habr users' pages, and I wanted to extract some more information from it. Articles analyzing users, articles, comments, and karma appear on Habr regularly, but I did not find a single one that analyzed Habr invites. So I built a graph of Habr invites and looked at some of its characteristics.



Let me remind you that the pages were downloaded in January 2016, so nothing that happened afterwards (registration of new users, deletion of old ones, changes in karma) is taken into account. After removing all read-only and deactivated users from the list of downloaded users, we get 79870. As far as I know, this roughly matches the actual number of Habr users (give or take a thousand). Then, to get a graph without holes, we had to add back 955 read-only and 382 deactivated users (those who invited someone or were invited, but were later removed from Habr or moved to read-only for one reason or another). As a result, we get a graph on 81207 vertices.

It is worth noting that getting a list of Habr users is not easy. Most of the names were collected a couple of years ago, when hub subscriber lists were still available. Those lists no longer exist, so the user names for 2015 and 2016 were extracted from articles, comments, pages of already known users, subscriber lists, and lists of users from given cities and countries. I also took frequently occurring prefixes of user names (such as Alex*, admin*, Captain*, etc.) and made several thousand queries on the Habr search page. Finally, I added the active users of Geektimes and Megamozg, so if you are not on my list, you are well hidden.
So, we have a directed graph with 81207 vertices and 20195 arcs. As you can see, only about 20 thousand users registered via invites from other users; the rest either registered before invites were introduced (more than 40 thousand) or were invited by the UFO.

We call a weakly connected component of this directed graph a habraklan. Note that these components are, generally speaking, not trees, since one person may receive several invites. As a result, the graph has cycles (for example, @tangro invited @Milla, and @Milla invited @tangro), loops (for example, @aavezel invited himself), and vertices with several incoming arcs (the user @shara was invited 6 times: by @Deeman, @myagi, @homm, @Azya, @veveve, and @shifttstas). These are exceptions, though, and the graph as a whole looks like a forest.
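To make the definition concrete, here is a minimal sketch of how weakly connected components and the three kinds of anomalies could be found. It is not the script used for the article: the function name `weak_components` and the user "loner" are made up for illustration, and only a few of the arcs mentioned above are included.

```python
from collections import defaultdict

def weak_components(nodes, arcs):
    """Group nodes into weakly connected components using union-find."""
    parent = {v: v for v in nodes}

    def find(v):
        # Path halving keeps the union-find trees shallow.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for inviter, invitee in arcs:
        # Arc direction is ignored: that is what "weak" connectivity means.
        parent[find(inviter)] = find(invitee)

    groups = defaultdict(set)
    for v in nodes:
        groups[find(v)].add(v)
    return list(groups.values())

# Toy data: user names are taken from the text, "loner" is hypothetical.
nodes = ["tangro", "Milla", "aavezel", "Deeman", "myagi", "shara", "loner"]
arcs = [
    ("tangro", "Milla"), ("Milla", "tangro"),  # a cycle of length 2
    ("aavezel", "aavezel"),                    # a loop
    ("Deeman", "shara"), ("myagi", "shara"),   # invited more than once
]

clans = weak_components(nodes, arcs)

# Loops are arcs whose two ends coincide.
self_loops = [u for u, v in arcs if u == v]

# Users with more than one incoming arc received several invites.
in_degree = defaultdict(int)
for _, invitee in arcs:
    in_degree[invitee] += 1
multi_invited = [v for v, d in in_degree.items() if d > 1]
```

On this toy input there are four components: the @tangro and @Milla pair, @aavezel alone, the three users around @shara, and the isolated hypothetical user.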

Our graph has 61021 habraklans. The size distribution is as follows:
Component size | Number of components
more than 1000 | 1
101–1000 | 6
11–100 | 436
2–10 | 3110
1 | 57468
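A distribution like this can be produced by bucketing the component sizes. The sketch below assumes the sizes are already known. The helper name `size_bucket` is made up, and the sample sizes are hypothetical apart from 1027, 584, and 316, which are sizes of actual components.

```python
from collections import Counter

def size_bucket(size):
    """Map a component size to one of the ranges used in the table."""
    if size == 1:
        return "1"
    if size <= 10:
        return "2-10"
    if size <= 100:
        return "11-100"
    if size <= 1000:
        return "101-1000"
    return "more than 1000"

# Sample sizes: 1027, 584 and 316 come from the article, the rest are made up.
sizes = [1027, 584, 316, 12, 5, 2, 1, 1, 1]
histogram = Counter(size_bucket(s) for s in sizes)
```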
Let's look at the biggest components.
No. | Size | Root vertices
1 | 1027 | @Davekeinz (sent 412 invites, more than anyone else on Habr; this component also contains @Mithgol, who sent 78 invites)
2 | 584 | @Mudhoney (sent 242 invites), @valemak
3 | 316 | @XaocCPS (sent 65 invites)
4 | 272 | @Alaunquirie (invited by @BarsMonster, who invited 73 users), @kip
5 | 189 | @Deeman, @homm, @DorBer, @myagi, @Azya, @maovrn, @fil9, @yoihj
6 | 106 | @Rossomachin
7 | 104 | @Garyan
8 | 97 | @Kukutz (the Yandex component)
9 | 90 | @Eosunknown
10 | 85 | @Cigulev, @tyr
11 | 80 | @Mdevils
12 | 80 | @Nuzgul
13 | 77 | @Ni404, @tronix286, @Rembish
14 | 77 | @Tigger
15 | 76 | @Gaidar
16 | 70 | @Auren
17 | 69 | @Saltommeister
18 | 68 | @Kalan
19 | 68 | @Alisadenisova
20 | 67 | @Horsev
Below are pictures of these 20 graphs. Green circles are users with positive karma, red ones have negative karma, blue ones have zero karma, and gray ones are read-only or deactivated users. The area of a circle is proportional to the absolute value of the karma (if that value is greater than 1). All images link to a larger version.

Let's also look at the "heights" of habraklans. If we discard the negligible number of graphs with cycles, then dag_longest_path_length(G) gives the following result.
Length of the longest chain | Number of components
9 | 1
7 | 2
6 | 11
5 | 39
4 | 125
3 | 479
2 | 2888
1 | 57468
The longest chain is as follows: @Garyan invited @Andrey_Rogovsky, who invited @DmitryGushin, who invited @Uncle_Sam, who invited @RootHell, who invited @alexey_qwe, who invited @Doom2, who invited @Odnoklassniki_ru, who finally invited @DarkDefender.
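For readers without networkx at hand, the longest chain in an acyclic invite graph can be found with a short memoized recursion. This is a sketch under assumptions, not the code used for the article: the function name `longest_chain` and the user "hypothetical_user" are invented, and the input must contain no cycles. As far as I know, networkx's `dag_longest_path_length` counts arcs rather than vertices, while the table above appears to count users, since the chain of length 9 consists of 9 users and 8 arcs.

```python
from functools import lru_cache

def longest_chain(arcs):
    """Return the longest inviter -> invitee chain of an acyclic invite graph."""
    children = {}
    for inviter, invitee in arcs:
        children.setdefault(inviter, []).append(invitee)
        children.setdefault(invitee, [])

    @lru_cache(maxsize=None)
    def best_from(user):
        # Longest chain that starts at `user`, as a tuple of user names.
        tails = [best_from(child) for child in children[user]]
        return (user,) + max(tails, key=len, default=())

    return max((best_from(u) for u in children), key=len, default=())

# The chain from the article plus one made-up side branch.
chain = ["Garyan", "Andrey_Rogovsky", "DmitryGushin", "Uncle_Sam", "RootHell",
         "alexey_qwe", "Doom2", "Odnoklassniki_ru", "DarkDefender"]
arcs = list(zip(chain, chain[1:])) + [("Garyan", "hypothetical_user")]

path = longest_chain(arcs)
users_in_chain = len(path)     # what the table seems to report
arcs_in_chain = len(path) - 1  # what an edge-counting function would report
```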

The analysis matches the expectation that most habraklans are small and have a small "height".

Now recall that users have karma. Simple summation shows that there are at least 450323.4 units of positive karma on Habr in total. (By the way, 10579 users have karma greater than or equal to 10, so in theory this article could collect 10578 pluses.)

Let's see which habraklans have the largest reserves of karma.
No. | Total karma | Root vertices
1 | 6184.4 | @Mudhoney, @valemak
2 | 5333.7 | @Davekeinz
3 | 4720.8 | @XaocCPS
4 | 3587.1 | @Alaunquirie, @kip (@BarsMonster is here)
5 | 2464.5 | @Deeman, @homm, @DorBer, @myagi, @Azya, @maovrn, @fil9, @yoihj
6 | 2390.1 | @Horsev (@PapaBubaDiop and @Milfgard are here)
7 | 1984.9 | @Cigulev, @tyr (@Zelenyikot is here)
8 | 1780.2 | @Ni404, @tronix286, @Rembish
9 | 1606.1 | @Eosunknown
10 | 1526.9 | There is no root here; everything starts with the cycle between @tangro and @Milla
11 | 1319.3 | @Kit
12 | 1304.1 | @Ocelot
13 | 1299.5 | @Auren
14 | 1104.5 | @Kalan
15 | 1009.1 | @Rossomachin
16 | 985.5 | @Easy_john
17 | 932.3 | @Assuri
18 | 871.7 | @Sourcerer
19 | 845.2 | @LukaSafonov
20 | 838.6 | @Mdevils
Below are pictures of the graphs that have not appeared above.
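The karma figures come down to plain summation once the components are known. A minimal sketch, assuming a mapping from user name to karma; all names and values below are hypothetical, as is the function name `clan_karma_totals`.

```python
def clan_karma_totals(clans, karma):
    """Sum karma over each clan and sort the clans by total, largest first."""
    totals = [(sum(karma[u] for u in clan), sorted(clan)) for clan in clans]
    return sorted(totals, reverse=True)

# Hypothetical users and karma values, for illustration only.
karma = {"a": 12.5, "b": -3.0, "c": 40.0, "d": 0.0, "e": 10.0}
clans = [{"a", "b"}, {"c"}, {"d", "e"}]

ranking = clan_karma_totals(clans, karma)

# Total positive karma: negative and zero values are skipped.
positive_total = sum(k for k in karma.values() if k > 0)

# Number of users with karma greater than or equal to 10.
karma_at_least_10 = sum(1 for k in karma.values() if k >= 10)
```

Note that a clan total includes negative karma, so a large clan with many downvoted members can rank below a smaller one.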


Also, some users have a country listed in the "From" field of their page. The top countries by number of users can be found on Habr itself; I was more interested in invites where the inviter and the invitee are in different countries. Such invites characterize the "geographical" connectivity of the Habr community.

At first I wanted to build a so-called chord diagram, but I did not find an easy way to do this in Python, so I am showing the upper left corner of the corresponding matrix instead. (If someone can tell me how to build such a diagram, I will be grateful.) The bluer a cell in the picture, the greater the logarithm of the number of invites from country 1 to country 2.

The connectivity between Russia, Ukraine, Belarus, the USA, and Germany stands out.
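The matrix behind such a picture can be sketched as a counter over pairs of countries. The code below is an illustration under assumptions, not the original script: the function names and the toy users are made up, same-country invites are skipped because only cross-country invites are of interest here, and `log(1 + n)` is used for the colour intensity so that a cell with a single invite is still distinguishable from an empty one.

```python
import math
from collections import Counter

def cross_country_matrix(arcs, country):
    """Count invites per (inviter country, invitee country) pair.

    Users without a known country and same-country invites are skipped.
    """
    counts = Counter()
    for inviter, invitee in arcs:
        src, dst = country.get(inviter), country.get(invitee)
        if src and dst and src != dst:
            counts[(src, dst)] += 1
    return counts

def log_intensity(counts):
    """Colour intensity per cell; log(1 + n) is an assumption, not the article's formula."""
    return {pair: math.log1p(n) for pair, n in counts.items()}

# Hypothetical users; "u6" has no country on the profile page.
country = {"u1": "Russia", "u2": "Ukraine", "u3": "Russia",
           "u4": "Russia", "u5": "Germany"}
arcs = [("u1", "u2"), ("u1", "u3"), ("u4", "u2"), ("u2", "u5"), ("u1", "u6")]

counts = cross_country_matrix(arcs, country)
intensity = log_intensity(counts)
```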

Another piece of information that is unrelated to invites but easy to extract from users' pages is the registration date and the date of the last visit. The following table shows how many users registered in each year and how many of them have visited the site since January 1, 2015 (otherwise we consider the user no longer active on Habr).
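Year | Registered | Active since January 1, 2015
2006 | 3091 | 909
2007 | 19433 | 5511
2008 | 22031 | 6348
2009 | 6032 | 3094
2010 | 6826 | 3345
2011 | 9341 | 6355
2012 | 5841 | 4160
2013 | 4029 | 2819
2014 | 2684 | 2100
2015 | 1473 | 1473
Total | 80781 | 36114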
The same in the form of a diagram.


We see that about half of the users registered in 2007 and 2008, and that many old-timers are still active.
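A table of this kind can be built from pairs of dates. The sketch below assumes each user is represented by a registration date and a last-visit date. The function name `activity_by_year` and the sample users are hypothetical, and only the cutoff of January 1, 2015 comes from the text.

```python
from collections import Counter
from datetime import date

ACTIVE_SINCE = date(2015, 1, 1)  # activity cutoff stated in the article

def activity_by_year(users):
    """users: iterable of (registered, last_seen) date pairs.

    Returns {year: (registered_count, active_count)} sorted by year.
    """
    registered, active = Counter(), Counter()
    for reg, last_seen in users:
        registered[reg.year] += 1
        if last_seen >= ACTIVE_SINCE:
            active[reg.year] += 1
    return {year: (registered[year], active[year]) for year in sorted(registered)}

# Hypothetical users, for illustration only.
users = [
    (date(2007, 3, 1), date(2015, 6, 1)),   # old-timer, still active
    (date(2007, 5, 9), date(2010, 1, 1)),   # old-timer, gone
    (date(2008, 1, 1), date(2016, 1, 10)),
    (date(2015, 2, 2), date(2015, 2, 2)),   # registered after the cutoff
]
table = activity_by_year(users)
```

Anyone who registered in 2015 was necessarily last seen after the cutoff, which is why the two counts coincide for that year.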

That's all. The table with the source data and the script for drawing the graphs are available on GitHub. The archive with the raw data is available on request.

Source: https://habr.com/ru/post/232769/

