Hi, Habr.
In the first part, we looked at some patterns in the development of such an interesting resource as Habrahabr. The material turned out to be long, so it continues here. In this part we will also look at how to build word-cloud pictures, and finally complete our statistics and the rating.

Those interested in the results are welcome under the cut.
Titles of articles (word cloud)
Before showing the article ranking, I was curious which keywords are most popular in the titles. The popularity of different technologies obviously changes over time, and I wanted to see that visually.
This is easy to do with Python:
import re
import pandas as pd
import matplotlib.pyplot as plt
from wordcloud import WordCloud

# Assumes df (the articles DataFrame with a 'title' column) and year are already defined

def to_ascii(s):
    # Drop apostrophes, hyphens and pipes, then discard all non-ASCII characters
    s = s.replace("'", '').replace("-", '').replace("|", '')
    return s.encode("ascii", errors="ignore").decode()

def split_words(s):
    # Split a title on punctuation and spaces; non-string values (NaN) yield no words
    try:
        words = re.split(r'[:?., "()<>\[\]|!]', s)
        return [to_ascii(w) for w in words]
    except TypeError:
        return []

def filter_words(s):
    # Keep only words longer than two characters
    return len(s) > 2

titles = df['title'].str.lower()
ts = titles.apply(lambda x: pd.Series(list(filter(filter_words, split_words(x))), dtype=object).value_counts()).sum(axis=0)
ts = ts.sort_values(ascending=False)
print(ts[:50])
print()

# Repeat each of the 200 most frequent words as many times as it occurs
s_all = ""
for p in range(min(ts.shape[0], 200)):
    s_all += (ts.index[p] + ' ') * int(ts.values[p])

wc = WordCloud(width=1600, height=1200, background_color="white", relative_scaling=1.0,
               collocations=False).generate(s_all)

plt.figure(figsize=(9,6))
plt.imshow(wc, interpolation="bilinear")  # draw the cloud on the figure
plt.title("%d" % year)
plt.xticks([])
plt.yticks([])
plt.tight_layout()
file_name = 'habr-words-%d.png' % year
plt.savefig(file_name)
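As a side note, the word-counting step can be sketched with the standard library alone, which avoids building a large intermediate table of per-title counts. This is only an illustration: the helper names title_words and count_title_words and the sample titles are made up for the example, and the delimiter set simply mirrors the one in the listing above.

```python
import re
from collections import Counter

# Hypothetical helpers; the delimiter set mirrors the listing above.
_DELIMITERS = re.compile(r'[:?., "()<>\[\]|!]')

def title_words(title):
    """Lowercase a title, split it, and keep ASCII-only words longer than two characters."""
    if not isinstance(title, str):
        return []  # missing titles contribute no words
    words = []
    for raw in _DELIMITERS.split(title.lower()):
        word = raw.replace("'", "").replace("-", "")
        word = word.encode("ascii", errors="ignore").decode()
        if len(word) > 2:
            words.append(word)
    return words

def count_title_words(titles):
    """Accumulate word frequencies over an iterable of titles."""
    counts = Counter()
    for title in titles:
        counts.update(title_words(title))
    return counts

# Made-up sample titles, for illustration only
sample_titles = ["Google Chrome vs. Opera", "Wi-Fi и Google: новости", None]
word_counts = count_title_words(sample_titles)
```

Here word_counts.most_common(200) would give the (word, count) pairs needed to build the text passed to WordCloud.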
Displaying a “sheet” of 12 separate pictures would be inconvenient, so we will turn them into a GIF animation. We move the code into a separate function and call it in a loop over the desired range of years.
import imageio

images = []
for y in range(2006, 2019+1):
    file_name = make_words_cloud(df, y)
    images.append(imageio.imread(file_name))
imageio.mimsave('habr-words.gif', images, duration=2)
One last point: to make the words easier to compare, let's make sure the same word always gets the same color.
from wordcloud import WordCloud, random_color_func

colors_dict = dict()

def random_color_func_(word=None, font_size=None, position=None, orientation=None, font_path=None, random_state=None):
    # Reuse the color already assigned to this word, if any
    if word in colors_dict:
        return colors_dict[word]
    else:
        c = random_color_func(word, font_size, position, orientation, font_path, random_state)
        colors_dict[word] = c
        return c

wc = WordCloud(width=2600, height=2200, background_color="white", relative_scaling=1.0,
               collocations=False, color_func=random_color_func_).generate(text)
...
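A possible alternative to the shared dictionary, shown here only as a sketch and not as the approach used in this article, is to derive the color from a hash of the word itself. The colors then stay stable between runs without keeping any state. The name stable_color_func is made up for this example, and it assumes that color_func may return an "hsl(...)" color string, as the library's own random_color_func does.

```python
import hashlib

def stable_color_func(word=None, font_size=None, position=None,
                      orientation=None, font_path=None, random_state=None):
    """Return an HSL color string whose hue depends only on the word."""
    digest = hashlib.md5((word or "").encode("utf-8")).digest()
    hue = int.from_bytes(digest[:2], "big") % 360  # hue in degrees, 0..359
    return "hsl(%d, 80%%, 50%%)" % hue

google_color = stable_color_func("google")
```

It would be passed in the same way as above, for example color_func=stable_color_func. The trade-off is that different words can land on the same or a very similar hue.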
The final result in the form of a GIF:

Both the size of the words (proportional to the number of occurrences) and their diversity speak for themselves. Some patterns are interesting: only Google has held firmly onto first place; Flash, Opera and Yahoo are gone; 10 years ago nobody wrote about Amazon; and names like Tesla, Kotlin or GDPR did not exist.
A similar attempt was made to build a distribution for Russian words, but it failed completely: because of declensions, prefixes and endings in Russian, the result looked more like a random number generator. “Normalizing” all of this (singling out nouns, bringing them to the nominative case, and so on) would probably take not an article but a whole thesis. It would have been tempting to see how, for example, the words “Roskomnadzor” or “Duma” rose in frequency (but let's not talk about sad things).
With that, we are finally done with programming and can move on to the rating itself.
Rating
Once again, the rating is unofficial, and absolute accuracy is not guaranteed. If, for example, the server returned a timeout for some article, that article will not be in the list. There may also be some hidden indexes that I do not know about. Checking 206 thousand articles from 2006 to 2019 by hand is hardly feasible. If any author did not find themselves in the rating but is sure they should be there, write to me and I will add the article manually. Some articles from 10 years ago may already be outdated, but that makes it even more interesting: some forgotten moments can be recalled.
Let's go :) And congratulations in absentia to all the authors on making it into the super-top. Author names were not analyzed during parsing and are not recorded in the rating, but I think those who wrote the articles will recognize themselves.
Edit: as some users noticed, a couple of articles appear twice. This is not a parsing error. These articles really were published twice: the links are different, but both redirect to the same article.
Top 20 articles by number of views
Hack Wi-Fi for ... 3 seconds, 2,000,000 views, 63 comments, rating +112.0 / -21.0
Hidden smiles in Skype, 1,655,000 views, 69 comments, rating +173.0 / -74.0
We write our first application on Android, 1,535,000 views, 95 comments, rating +123.0 / -15.0
300 amazing free services, 1,482,000 views, 104 comments, rating +325.0 / -16.0
Hack Wi-Fi for 10 hours, 1,416,000 views, 164 comments, rating +294.0 / -10.0
Networks for the smallest. Part zero. Planning, 1,388,000 views, 133 comments, rating +100.0 / -4.0
Wi-Fi: unobvious nuances (for example, home network), 1,186,000 views, 138 comments, rating +231.0 / -3.0
Learning Python qualitatively, 1,181,000 views, 87 comments, rating +59.0 / -27.0
New Java programmers, 1,084,000 views, 58 comments, rating +113.0 / -7.0
1000+ hours of video on Java in Russian, 1,076,000 views, 38 comments, rating +111.0 / -9.0
Programming for Android for beginners. Part 1, 1,043,000 views, 29 comments, rating +50.0 / -34.0
Practice setting Mikrotik for dummies, 1,006,000 views, 114 comments, rating +34.0 / -5.0
5 practical tips on the use of lithium-ion batteries, 999,000 views, 34 comments, rating +21.0 / -2.0
Once again about IP-addresses, subnet masks and generally, 972,000 views, 203 comments, rating +261.0 / -25.0
How to start working with GitHub: a quick start, 948,000 views, 50 comments, rating +165.0 / -17.0
27+ resources for online learning, 939,000 views, 68 comments, rating +163.0 / -11.0
Memo to users of ssh, 925,000 views, 135 comments, rating +352.0 / -8.0
What is CRM-systems and how to choose them correctly?, 916,000 views, 30 comments, rating +21.0 / -7.0
A simple strategy game 2048, 897,000 views, 43 comments, rating +63.0 / -20.0
Candid photos of Jennifer Lawrence and dozens of other celebrities leaked through iCloud, 895,000 views, 328 comments, rating +183.0 / -23.0
Top 20 articles by rating
Making a private monitor from an old LCD monitor, 320 comments, rating +1466.0 / -18.0, 486,000 views
The source of 3300 global Internet projects were received, 909 comments, rating +1190.0 / -36.0, 240,000 views
Story toys. Field of Miracles, 302 comments, rating +923.0 / -10.0, 150,000 views
How Denis Kryuchkov bought Habr from Mail.ru, 337 comments, rating +817.0 / -35.0, 275,000 views
Voronezh concluded a contract with the bank, making his edits, and is going to sue 24 million rubles, 860 comments, rating +778.0 / -25.0, 397,000 views
What exactly do I hate for some individual marketers - or as an IT specialist, went shopping, 777 comments, rating +769.0 / -45.0, 591,000 views
Steve Jobs has died, 648 comments, rating +783.0 / -75.0, 22,700 views
The principle of cicada and why it is important for web designers, 119 comments, rating +682.0 / -14.0, 172,000 views
As we searched for Mars-3, 169 comments, rating +669.0 / -8.0, 225,000 views
Stop twisting!, 337 comments, rating +667.0 / -15.0, 865,000 views
The history of the online store, which has become a world monopolist for $5,000, 189 comments, rating +641.0 / -5.0, 81,800 views
Sleep a little, but right?, 420 comments, rating +670.0 / -43.0, 464,000 views
What is wrong with the Habrakhabra design, 361 comments, rating +673.0 / -62.0, 143,000 views
We read the QR code, 103 comments, rating +612.0 / -9.0, 490,000 views
Vulnerability on Habrahabr or how to steal an invite, 138 comments, rating +600.0 / -19.0, 160,000 views
Wooden mouse Project history, 440 comments, rating +574.0 / -6.0, 137,000 views
Nifiga himself went for some bread, or the story of one hacking, 147 comments, rating +576.0 / -16.0, 102,000 views
30 kopecks for Mikhalkov, 295 comments, rating +588.0 / -29.0, 28,700 views
How I punished Firaxis or the story of how to iterate through a binary engine through a silencer, 176 comments, rating +547.0 / -4.0, 95,100 views
Badges for Habr, version, 143 comments, rating +552.0 / -10.0, 18,500 views
Top 20 articles by relative rating
Update Android versions: sad statistics, 10,000 views, rating +412.0 / -46.0
Schedule of articles on Habr (for any week), 12,000 views, rating +418.0 / -14.0
Let's help Sberbank, 12,300 views, rating +424.0 / -18.0
Happy programmer's day!, 13,100 views, rating +0.0 / -0.0
Steve Jobs has died, 22,700 views, rating +783.0 / -75.0
QIP - Minute of hatred (history on the server), 12,100 views, rating +413.0 / -44.0
Advanced technologies, as a way to squeeze a maximum from the server, 10,300 views, rating +314.0 / -4.0
Badges for Habr, version, 18,500 views, rating +552.0 / -10.0
About the system administrator, a search in his apartment and the illegal seizure of computer equipment, 10,800 views, rating +344.0 / -37.0
About the system administrator, a search in his apartment and the illegal seizure of computer equipment, 10,800 views, rating +344.0 / -37.0
How was hacked Vkontakte.ru, 11,300 views, rating +381.0 / -71.0
Mediamagia: You come home, take the remote and choose to watch from the tracker, 11,700 views, rating +318.0 / -12.0
Russian Business, 12,500 views, rating +355.0 / -31.0
Russian Business, 12,500 views, rating +355.0 / -31.0
Thoughts out loud about the protocol X, 10,600 views, rating +283.0 / -11.0
Sad statistics or never rely on freelancers, 12,400 views, rating +365.0 / -49.0
What did yablofily and yablofoby did not understand?, 16,700 views, rating +0.0 / -0.0
Mal, yes Del: Trojan-Downloader.Win32.Tiny, 13,300 views, rating +351.0 / -14.0
Did you buy the program? Come on ..., 13,000 views, rating +371.0 / -42.0
IT emigration to the Land of Smiles, to Thailand, 10,400 views, rating +271.0 / -11.0
Are you afraid that they will close ex.ua again? Not worth it - everything can be downloaded on the server of the Ministry of Internal Affairs of Ukraine, 13,000 views, rating +332.0 / -7.0
Live frame model of a motorcycle, 11,200 views, rating +302.0 / -24.0
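The exact formula behind the "relative rating" is not given in this part. One reading that matches the order of the list above is net votes divided by views, computed only for articles above some minimum number of views. The sketch below follows that assumption; the function name, the field names and the 10,000-view threshold are hypothetical, and the last sample row is made up to show the cutoff.

```python
def top_by_ratio(articles, numerator, denominator, min_denominator=10000, n=20):
    """Rank articles by numerator/denominator, skipping rows below the threshold."""
    eligible = [a for a in articles if a[denominator] >= min_denominator]
    return sorted(eligible, key=lambda a: a[numerator] / a[denominator], reverse=True)[:n]

# "votes" is upvotes minus downvotes; the first three rows use figures from the lists above
sample = [
    {"title": "Hack Wi-Fi for 10 hours", "votes": 294 - 10, "views": 1416000},
    {"title": "Steve Jobs has died", "votes": 783 - 75, "views": 22700},
    {"title": "Update Android versions: sad statistics", "votes": 412 - 46, "views": 10000},
    {"title": "Made-up article with few views", "votes": 50, "views": 900},
]
ranked = [a["title"] for a in top_by_ratio(sample, "votes", "views")]
```

The same helper would work for any count-to-views ratio, such as bookmarks per view.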
Top 20 by the number of bookmarks
300 amazing free services, 1,482,000 views, 9,119 bookmarks
Memo to users of ssh, 925,000 views, 5,822 bookmarks
27+ resources for online learning, 939,000 views, 4,851 bookmarks
Networks for the smallest. Part zero. Planning, 1,388,000 views, 4,347 bookmarks
Stop twisting!, 865,000 views, 4,330 bookmarks
Sleep a little, but right?, 464,000 views, 3,946 bookmarks
1000+ hours of Java video in Russian, 1,076,000 views, 3,616 bookmarks
Know the complexity of the algorithms, 522,000 views, 3,563 bookmarks
We make a private monitor from the old LCD monitor, 486,000 views, 3,539 bookmarks
The principle of cicada and why it is important for web designers, 172,000 views, 3,511 bookmarks
Acceleration of loading Windows for fun and profit, 448,000 views, 3,497 bookmarks
Hack Wi-Fi for 10 hours, 1,416,000 views, 3,405 bookmarks
Guide to the design of HTML/CSS code from Google, 266,000 views, 3,349 bookmarks
Cheat sheet for design patterns, 785,000 views, 3,344 bookmarks
Several useful services, 121,000 views, 3,319 bookmarks
Top 5 most impressive books that every software developer should read, 319,000 views, 3,277 bookmarks
Many free books on programming, 282,000 views, 3,203 bookmarks
A selection of useful for fans of Twitter Bootstrap, 248,000 views, 3,079 bookmarks
Wi-Fi: unobvious nuances (for example, home network), 1,186,000 views, 3,070 bookmarks
Want to be an iOS developer? Be it!, 377,000 views, 2,980 bookmarks
Top 20 by bookmarks-to-views ratio
Cribs for those who make the first steps, 1,114 bookmarks, 15,900 views
Linux hacking exercises, 876 bookmarks, 13,300 views
Anatomy of a font, 1,495 bookmarks, 22,700 views
38 articles on creating rounded corners on sites, 677 bookmarks, 10,400 views
Where to nibble the granite of science, 830 bookmarks, 13,200 views
Advanced technologies, as a way to squeeze a maximum from the server, 646 bookmarks, 10,300 views
Lecture torrent Lectorium, 947 bookmarks, 16,200 views
Useful links for learning CSS animation, 1,280 bookmarks, 22,300 views
New free online courses from Stanford, 623 bookmarks, 11,300 views
UICloud: The largest database of user interfaces, 1,780 bookmarks, 32,400 views
A selection of html / javascript / css tools and libraries from SmashingMagazine, 1,000 bookmarks, 18,400 views
8 useful services for a web developer and designer, 1,658 bookmarks, 33,800 views
Lectorium recorded almost a thousand lectures in a year, 2,516 bookmarks, 54,200 views
IPO for dummies. Part I: shares, majority shareholders, control over the company, 564 bookmarks, 12,200 views
Cross-browser monochrome translucency, 499 bookmarks, 11,000 views
Mediamagia: You come home, take the remote and choose to watch from the tracker, 518 bookmarks, 11,700 views
Screencasts on how to cut and pull, 448 bookmarks, 10,300 views
Noty - an unusually flexible jQuery plugin for displaying notifications, 1,099 bookmarks, 26,200 views
Data Visualization, 491 bookmarks, 12,200 views
9 articles on the topic of round buttons, 396 bookmarks, 10,100 views
Top 20 most "controversial" articles
First post, 667 comments, rating +596.0 / -445.0
Discrimination of VKontakte users, 319 comments, rating +399.0 / -258.0
Why fell VKontakte, 380 comments, rating +306.0 / -255.0
It's time to tie tabs in code, 217 comments, rating +258.0 / -234.0
And the reader, and the dude igrets, 175 comments, rating +337.0 / -233.0
Goodbye, Karma, or Who Needs an iPad?, 520 comments, rating +661.0 / -223.0
Non-Apple products from Apple, 504 comments, rating +397.0 / -218.0
Pointless 'Operating System', 325 comments, rating +394.0 / -215.0
Cho! Mail.ru What, 497 comments, rating +316.0 / -205.0
Pepyaka, 255 comments, rating +239.0 / -204.0
Chanterelle-Firefox Costume [photo], 105 comments, rating +285.0 / -204.0
Let's talk about Microsoft, 990 comments, rating +261.0 / -201.0
OpenSource-CURATCH, or force the teacher to precipitate, 538 comments, rating +276.0 / -200.0
Yandex browser, 825 comments, rating +266.0 / -199.0
Merchant API, 136 comments, rating +231.0 / -198.0
God is a stupid game designer, 531 comments, rating +351.0 / -195.0
Why I refused Mozilla Firefox, 324 comments, rating +225.0 / -193.0
Vogue magazine's 'secret', 199 comments, rating +225.0 / -189.0
Order of the White Knights of Habr, 553 comments, rating +213.0 / -188.0
All PHP in two lines, 322 comments, rating +240.0 / -187.0
Top 20 Most Commented Articles
My disappointment in the software, 2,435 comments, 278,000 views
How to distribute invites on Google+, 2,266 comments, 17,600 views
Hello, world!, 2,194 comments, 10,300 views
The best computer games of all times and peoples according to the version of the habrasoobshchestva 2013, 1,887 comments, 163,000 views
About the IT market in Russia to be honest, 1,834 comments, 128,000 views
Distributing elephants or invites on Google+, 1,829 comments, 1,500 views
More + 2GB for your DropBox account. This time, DropBox and Three.com.hk promotion, 1,729 comments, 13,600 views
Festive distribution of invites!, 1,663 comments, 1,300 views
Slack ban accounts from Crimea, 1,660 comments, 64,200 views
The first version of Opera 15 for computers, 1,585 comments, 187,000 views
Stop suspecting developers in imposture. Learn to better interview, 1,579 comments, 111,000 views
Completed the most extensive study of the effects of GMO on human health, 1,579 comments, 224,000 views
Invites to Google Wave, 1,476 comments, 408 views
Why do we need priests in high school?, 1,475 comments, 157,000 views
What are you missing to complete the transition from windows to linux?, 1,381 comments, 17,600 views
Rocket 9M729. A few words about the “violator” of the INF Treaty, 1,371 comments, 83,000 views
Invites to Turbofilm!, 1,313 comments, 2,200 views
Adherents of static and dynamic typifications will never understand each other. And TypeScript will not help them, 1,301 comments, 49,300 views
Come on! @ # With your 'toxicity', 1,300 comments, 176,000 views
Amazon gave up and raised employee salaries, 1,288 comments, 63,600 views
Antitop-20 articles with the largest number of dislikes
First post, 667 comments, rating +596.0 / -445.0
Discrimination of VKontakte users, 319 comments, rating +399.0 / -258.0
Why fell VKontakte, 380 comments, rating +306.0 / -255.0
It's time to tie tabs in code, 217 comments, rating +258.0 / -234.0
And the reader, and the dude igrets, 175 comments, rating +337.0 / -233.0
Goodbye, Karma, or Who Needs an iPad?, 520 comments, rating +661.0 / -223.0
Non-Apple products from Apple, 504 comments, rating +397.0 / -218.0
Pointless 'Operating System', 325 comments, rating +394.0 / -215.0
Cho! Mail.ru What, 497 comments, rating +316.0 / -205.0
Pepyaka, 255 comments, rating +239.0 / -204.0
Chanterelle-Firefox Costume [photo], 105 comments, rating +285.0 / -204.0
Let's talk about Microsoft, 990 comments, rating +261.0 / -201.0
OpenSource-CURATCH, or force the teacher to precipitate, 538 comments, rating +276.0 / -200.0
Yandex browser, 825 comments, rating +266.0 / -199.0
Merchant API, 136 comments, rating +231.0 / -198.0
God is a stupid game designer, 531 comments, rating +351.0 / -195.0
Why I refused Mozilla Firefox, 324 comments, rating +225.0 / -193.0
Vogue magazine's 'secret', 199 comments, rating +225.0 / -189.0
Order of the White Knights of Habr, 553 comments, rating +213.0 / -188.0
All PHP in two lines, 322 comments, rating +240.0 / -187.0
Bonus
And a small bonus for those who have read this far: a mini-rating of articles written in English. This rating effectively covers a single year, since English-language articles simply did not exist before, but we work with what we have. To get it, it is enough to add one line of code that keeps only the articles with "/en/" in the link field:
df = df[df['link'].str.contains("/en/")]
The results are shown below. I will not list all the categories, because there are still few English-language articles and many entries repeat.
Top English-language articles by number of views
I ruin developers' lives with my code reviews and I'm sorry, 164,000 views, 12 comments, rating +33.0 / -3.0
A small notebook for a system administrator, 98,300 views, 56 comments, rating +88.0 / -3.0
Flightradar24 - how it works?, 91,000 views, 12 comments, rating +74.0 / -1.0
I lost faith in the industry, I burned out, but the cult of the tool saved me, 30,400 views, 2 comments, rating +21.0 / -2.0
PC Speaker To Eleven, 24,600 views, 0 comments, rating +31.0 / -2.0
Making a DIY text laser projector, 22,900 views, 5 comments, rating +25.0 / -1.0
A bot for Starcraft in Rust, C or any other language, 21,200 views, 3 comments, rating +44.0 / -1.0
Hello world! Or Habr in English, v1.0, 21,000 views, 249 comments, rating +178.0 / -2.0
Running image viewer from Windows XP on modern Windows, 8,900 views, 1 comment, rating +25.0 / -2.0
Yet another plea against using public WiFi, 8,000 views, 0 comments, rating +17.0 / -1.0
Real-time edge detection using FPGA, 7,500 views, 45 comments, rating +41.0 / -14.0
Stack-based calculator on the Cyclone IV FPGA board, 7,200 views, 27 comments, rating +58.0 / -17.0
On higher education, programmers and blue-collar job, 7,100 views, 7 comments, rating +22.0 / -1.0
I am a useless idiot, so I want to quit my job: 10 questions to a software developer, a pilot episode, 7,000 views, 6 comments, rating +24.0 / -0.0
Vim for beginners, 6,200 views, 2 comments, rating +19.0 / -0.0
Do more with patterns in C# 8.0, 5,700 views, 5 comments, rating +18.0 / -2.0
Naive Math: Earnshaw’s theorem, 5,300 views, 1 comment, rating +44.0 / -1.0
Wanna Play a Detective? Find the Bug in a Function from Midnight Commander, 5,100 views, 0 comments, rating +31.0 / -0.0
How does a barcode work?, 4,500 views, 0 comments, rating +20.0 / -2.0
How to learn English, 4,400 views, 17 comments, rating +15.0 / -1.0
Top English-language articles by number of bookmarks
Flightradar24 - how it works?, 91,000 views, 28 bookmarks
How to learn English, 4,400 views, 21 bookmarks
A small notebook for a system administrator, 98,300 views, 19 bookmarks
Vue, Storybook, TypeScript-starting a project with 2,700 views, 17 bookmarks
Hello world! Or Habr in English, v1.0, 21,000 views, 16 bookmarks
Vim for beginners, 6,400 views, 15 bookmarks
A bot for Starcraft in Rust, C or any other language, 21,200 views, 14 bookmarks
Kalman Filter, 2,000 views, 11 bookmarks
Things you need to know should you want to switch from PHP to Python, 2,700 views, 11 bookmarks
Isometric Plugin for Unity3D, 1,500 views, 10 bookmarks
Ternary computing: basics, 2,100 views, 10 bookmarks
I ruin developers' lives with my code reviews and I'm sorry, 164,000 views, 9 bookmarks
Currying and partial application in C++14, 1,300 views, 9 bookmarks
Time Series Modelling, 1,100 views, 8 bookmarks
Generic Methods in Rust: How Exonum Shifted from Iron to Actix-web, 3,300 views, 8 bookmarks
.NET Reference Types vs Value Types. Part 1, 1,400 views, 8 bookmarks
How do technical indicators on stock market actually work?, 791 views, 7 bookmarks
Making a DIY text laser projector, 22,900 views, 7 bookmarks
Send an email with attachements by JavaMailSender from SpringFramework, 563 views, 7 bookmarks
Low-budget stereo frames, anaglyph, stereoscope, 1,100 views, 7 bookmarks
Conclusion
There will be no conclusions. Thank you all for your attention, and happy reading :)