
The 20/80 principle, using Habr as an example

At some point I got curious whether Pareto's law holds on a self-regulating resource like Habr. A quick reminder: Pareto's law is the "20/80 principle", which in this case can be read as: 20% of users bring 80% of the result. And since our resource has a very precise method for evaluating users' results, based on the opinions of a heterogeneous, independent, decentralized crowd, why not use it? For more on this evaluation method, read the book "The Wisdom of Crowds" by James Surowiecki (I think every Habr user should know it). For the evaluation I took karma (the total result of activity over all time, in the opinion of other users) and rating (the result of activity over the last 50 days, in the opinion of Habr's synthetic algorithms).

To obtain the data I took the statistics of "zahabrennye" and "otkhabrennye" users. This ranking lists only those users who have done at least something for Habr rather than just registering, so we will consider it quite relevant. Applying some programming knowledge, I parsed the data from the site. I think the method does not matter much for this article; if there is interest, I may write about it separately. As a result I got a list of users with their current karma and rating, and that list is what I processed.

A picture, just for looks: [image]
The calculations covered 24049 users. Their total positive karma was 190371.89 and their total positive rating was 229145.98. Only positive values were taken, because they best match the notion of a result. Of all non-zero users, 20% is 4810, and for them the sums of karma and rating are 150318.87 and 188463.37, respectively. Note that the sums are taken from lists sorted in descending order separately for karma and for rating (the top Habr users).

Now, dividing these values, we get a result close to 80% with an accuracy of ±3%, which suggests that the dependency is present.

For karma: 150318.87 / 190371.89 ≈ 0.790 (79.0%)

For rating: 188463.37 / 229145.98 ≈ 0.822 (82.2%)
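The division above can be sketched in a few lines of Python. This is only an illustration, not the script used for the article: the function name `top_share` and the toy list are made up here, while the four totals are the ones quoted above.

```python
def top_share(values, fraction=0.2):
    """Share of the positive total held by the top `fraction` of users."""
    ranked = sorted(values, reverse=True)       # top users first
    top_n = round(len(ranked) * fraction)       # e.g. 4810 of 24049
    positive_total = sum(v for v in ranked if v > 0)
    return sum(ranked[:top_n]) / positive_total


# Toy list (hypothetical): 2 of 10 users hold 80 of 100 positive points.
toy = [50, 30, 10, 5, 3, 1, 1, 0, 0, 0]
toy_share = top_share(toy)

# The same ratio computed from the totals quoted in the article.
karma_share = 150318.87 / 190371.89
rating_share = 188463.37 / 229145.98
print(f"toy: {toy_share:.1%}, karma: {karma_share:.1%}, rating: {rating_share:.1%}")
```

On the real data one would call `top_share` on the full karma list and on the full rating list separately, since each list is sorted on its own.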

Well, that is what Pareto's law predicted. But what other properties do these 20% of users have?

So, the total karma and rating of all users (minuses included) are 150403.63 and 186244.84, respectively, which even to the naked eye coincide with the totals for the top 20%. But let's calculate anyway.

For karma: 150318.87 / 150403.63 ≈ 0.999 (99.9%)

For rating: 188463.37 / 186244.84 ≈ 1.012 (101.2%)

We accept the hypothesis with an error of ±3%.
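The same comparison as a small sketch. It assumes that 150403.63 is the net karma and 186244.84 the net rating, which is what you get by adding the positive and negative totals quoted in this article. The variable names are illustrative only.

```python
# Totals quoted in the article.
pos_karma, neg_karma = 190371.89, -39968.26
pos_rating, neg_rating = 229145.98, -42901.14
top20_karma, top20_rating = 150318.87, 188463.37

# Net totals: positive and negative values together.
net_karma = pos_karma + neg_karma
net_rating = pos_rating + neg_rating

# How close the top 20% come to the net total of everyone.
karma_ratio = top20_karma / net_karma
rating_ratio = top20_rating / net_rating
print(f"karma: {karma_ratio:.3f}, rating: {rating_ratio:.3f}")
```

A ratio near 1 means the top 20% alone hold about as much as everyone combined once the minuses are counted.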

With a touch of pathos one could say: incredibly, the top 20% of users hold in total the same amount of karma and rating as all users together, the "laggards" included. That is, the laggards can be said to cancel each other out, and one may assume that the same happens in other systems. And how do things stand with the average?

Over all users, the average positive karma and the average positive rating came out to 7.92 and 9.53, respectively. The number of users with karma or rating >= the corresponding average turned out to be 5449 and 5008, respectively. Comparing that with the total number of non-zero users:

For karma: 5449 / 24049 ≈ 0.227 (22.7%)

For rating: 5008 / 24049 ≈ 0.208 (20.8%)

In total, 20% of users have a rating or karma above the average positive (i.e. productive) value, with a deviation of ±3%.
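A sketch of this step, under one assumption worth stating: the published averages of 7.92 and 9.53 are reproduced when the positive total is divided by all 24049 users, so that is the definition used here. The function name and the toy list are hypothetical.

```python
def share_at_or_above_average(values):
    """Return (average positive value, share of users at or above it)."""
    n = len(values)
    # Assumption: positive total divided by ALL users, which is what
    # reproduces the averages quoted in the article.
    average = sum(v for v in values if v > 0) / n
    at_or_above = sum(1 for v in values if v >= average)
    return average, at_or_above / n


# Toy list (hypothetical): positive total 20 over 10 users, average 2.0.
toy = [10, 5, 3, 1, 1, -2, 0, 0, 0, 0]
toy_average, toy_share = share_at_or_above_average(toy)

# Figures quoted in the article.
users = 24049
avg_karma = 190371.89 / users
avg_rating = 229145.98 / users
karma_share = 5449 / users
rating_share = 5008 / users
print(f"{avg_karma:.2f} {avg_rating:.2f} {karma_share:.1%} {rating_share:.1%}")
```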

An effect of negative values on karma and rating was also noticed: in total, all negative karma and all negative rating amount to 20% of the sum of the positive values. All negative karma and rating sum to -39968.26 and -42901.14, while the positive values are 190371.89 and 229145.98.

For karma: 39968.26 / 190371.89 ≈ 0.210 (21.0%)

For rating: 42901.14 / 229145.98 ≈ 0.187 (18.7%)

That is, 20% within the given error of ±3%.
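The negative-to-positive comparison can be sketched the same way. The helper `negative_to_positive` and the toy list are made up for illustration; the totals are the ones quoted above.

```python
def negative_to_positive(values):
    """Size of the negative total relative to the positive total."""
    positive = sum(v for v in values if v > 0)
    negative = sum(v for v in values if v < 0)
    return abs(negative) / positive


# Toy list (hypothetical): 20 points of minuses against 100 points of pluses.
toy = [40, 30, 20, 10, -5, -15]
toy_ratio = negative_to_positive(toy)

# Totals quoted in the article.
karma_ratio = abs(-39968.26) / 190371.89
rating_ratio = abs(-42901.14) / 229145.98
print(f"karma: {karma_ratio:.1%}, rating: {rating_ratio:.1%}")
```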

To everything written above I attach a document in a progressive format with the source data: Results.xlsx

Obviously, the values change dynamically, so this data may become outdated :) Of course, for complete statistics you would need to capture the data periodically over a long period, but parsing HTML is not suitable for that. And nobody will solve the problem of collecting this data better than Chip and Dale. A shift in the ratios may well correlate with certain events or be cyclical.

Also, the results should by no means be taken as an immutable law of nature with specific numerical parameters. All calculations are purely empirical. And if we talk about keeping only the 20% of useful users, then by Pareto's law they will eventually split again according to the 20/80 rule.
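To make the last remark concrete, here is a purely hypothetical calculation. It assumes the split repeats exactly at every level, which the data in this article does not establish; `nested_pareto` is an illustrative name.

```python
def nested_pareto(levels, users=0.2, result=0.8):
    """Shares after applying the 20/80 split `levels` times in a row."""
    return users ** levels, result ** levels


for level in (1, 2, 3):
    user_share, result_share = nested_pareto(level)
    print(f"level {level}: {user_share:.1%} of users, {result_share:.1%} of the result")
```

Under that assumption, two rounds of the split would leave 4% of users with 64% of the result.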

From all the calculations one can confirm that: "Most successful events are due to the action of a small number of highly productive forces; most troubles are due to the action of a small number of highly destructive forces."

One can also say that, on average, out of every 5 users invited to Habr only 1 will be useful to Habr.

At the request of the commenters, here is a logarithmic graph of the distribution of karma and rating for the lists sorted in descending order. The horizontal axis is the place in the ranking: for the blue graph, the place in the ranking by karma (from highest to lowest); for the red graph, likewise by rating.



The straight segment on the red (rating) graph from place 10867 to place 12631 means a large number of users with a rating of 2.
And the blue line that partially coincides with it, from place 11573 to place 12603, means Habr users with karma 2. Where so many people with karma and rating of 2 come from, the statistics do not say. But it could be a reason for new research :)
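Such flat segments can also be located without a graph. The sketch below is an illustration with a hypothetical helper `plateau` and a toy list; the counts at the end assume the place ranges quoted above are inclusive.

```python
def plateau(sorted_desc, value):
    """Return (first, last) 1-based places holding `value`, or None."""
    places = [place for place, v in enumerate(sorted_desc, start=1) if v == value]
    if not places:
        return None
    return places[0], places[-1]


# Toy list (hypothetical), already sorted in descending order.
toy = [9, 5, 2, 2, 2, 1]
toy_first, toy_last = plateau(toy, 2)

# Number of places covered by the segments described in the article,
# assuming both endpoints are included.
rating_twos = 12631 - 10867 + 1
karma_twos = 12603 - 11573 + 1
print(toy_first, toy_last, rating_twos, karma_twos)
```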

For those who read only the beginning and the end:

Brief conclusions, with an accuracy of ±3%:
  1. 20% of Habr users hold 80% of all positive rating.
  2. 20% of Habr users hold 80% of all positive karma.
  3. The top 20% of Habr users hold as much karma as the total karma of all Habr users.
  4. The top 20% of Habr users hold as much rating as the total rating of all Habr users.
  5. 20% of Habr users have karma above the average positive karma.
  6. 20% of Habr users have a rating above the average positive rating.
  7. All negative rating amounts to 20% of the positive.
  8. All negative karma amounts to 20% of the positive.

Source: https://habr.com/ru/post/80140/

