"Rock is hard", or Yandex.Music as a graph


I have long used Yandex.Music to find something new to listen to. Most of the time I just browse the artists similar to the bands I enjoy, but this approach stopped yielding anything new a long time ago. For a while the radio with a genre filter covered my needs, but its repertoire is surprisingly meager. It was time to solve the problem globally, and here is what came of it =)

What I wanted


I should note right away that I did not set out to find out how, or how well, the recommendations work, or to run a cluster analysis. The collected data can be explored in many different directions, but my goal did not require that.

What I wanted, first of all, was to find the rock and metal artists that everyone knows about but that I, being inexperienced, do not.

A silly problem, you might say: take any publication's top 100 and enjoy. But it does not work that way: every top list I have seen is either too "newfangled" for my taste, or so classic that I have already heard all of it.

My journey through music is a story of its own: it was fun to watch my metalhead friends when I told them about Black Sabbath, a new discovery for me, along the lines of "just listen to how it sounds!" =)

So I decided to collect all rock and metal artists together with the links to other artists from their pages, count the in-degrees, and build my own top. And since this is clearly a network structure, I could not resist visualizing it as a graph.
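
Counting in-degrees from the collected links can be sketched as follows. This is a minimal illustration, not the author's script: the function name `top_by_in_degree` and the toy edge list are made up for the example, and it assumes each link is stored as a (source, target) pair.

```python
from collections import Counter

def top_by_in_degree(edges, n=10):
    """Return the n most-linked-to artists as (artist, in_degree) pairs.

    `edges` is an iterable of (source, target) pairs, meaning that the page
    of `source` lists `target` as a similar artist.
    """
    # Count each distinct link once and ignore links from an artist to itself.
    unique_edges = {(src, dst) for src, dst in edges if src != dst}
    in_degree = Counter(dst for _, dst in unique_edges)
    # Sort by degree (descending), then by name to keep the order stable.
    return sorted(in_degree.items(), key=lambda kv: (-kv[1], kv[0]))[:n]

# Toy links for illustration only, not taken from the real dataset.
toy_edges = [
    ("Deep Purple", "Black Sabbath"),
    ("Whitesnake", "Deep Purple"),
    ("Whitesnake", "Black Sabbath"),
    ("Judas Priest", "Black Sabbath"),
    ("Judas Priest", "Black Sabbath"),  # duplicate link, counted once
]
top = top_by_in_degree(toy_edges, n=2)
print(top)
```

Deduplicating links first is a design choice of this sketch: it keeps the in-degree equal to the number of distinct artists that link to a page.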

How I collected it


I had sat down to this project more than once, but each time my interest faded before I got any meaningful result (a textbook case). This time I decided to follow the hackathon ideologists and build a minimum working product in minimum time. For that reason I limited myself to the genres closest to me, rock and metal, instead of collecting all music.

The scraper is written in Python with Selenium, plus PostgreSQL for storage, since I had a ready-made project on this stack at hand. Selenium in production is certainly a debatable choice, but this is not production.

For a start, I collected data on rock artists; there turned out to be about six thousand of them.
When I started collecting the links to similar artists, it turned out that there are actually far more rock artists, and only a small part of them appears in the genre index. I collected the subgenres (Russian rock and others) separately; they overlap with the main index very little. Metal brought no surprises, except that subgenres were added to it in the middle of scraping and I had to collect everything again.
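
The way artists outside the genre index surface while following "similar" links can be sketched as a breadth-first walk. This is an assumption about the general approach, not the author's scraper: `fetch_similar` is a hypothetical callable standing in for whatever loads an artist page (Selenium in the article) and returns the artists listed there.

```python
from collections import deque

def crawl_similar(seed_artists, fetch_similar):
    """Breadth-first walk over "similar artist" links.

    seed_artists  -- artists taken from the genre index
    fetch_similar -- hypothetical callable: artist -> list of similar artists
    Returns (artists, links): every artist discovered and every
    (source, target) link seen along the way.
    """
    seeds = list(dict.fromkeys(seed_artists))  # drop duplicate seeds, keep order
    seen = set(seeds)
    queue = deque(seeds)
    links = []
    while queue:
        artist = queue.popleft()
        for similar in fetch_similar(artist):
            links.append((artist, similar))
            if similar not in seen:  # an artist missing from the index
                seen.add(similar)
                queue.append(similar)
    return seen, links

# Toy pages instead of real requests, for illustration only.
fake_pages = {"A": ["B", "C"], "B": ["A"], "C": ["D"], "D": []}
artists, links = crawl_similar(["A"], lambda name: fake_pages.get(name, []))
print(sorted(artists), links)
```

Passing the fetcher in as a parameter keeps the walk itself testable without touching the network.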

All in all, after re-collecting everything three times, I realized that I had gathered all I could and that it was time to draw.
Separately, I would like to ask the folks at Yandex not to be angry about my parasitic requests, there were not that many of them. Guys, it was all for the sake of science =)

How I drew it


I wanted to finally abandon manual drawing in favor of Gephi, so that I could just move sliders and get something beautiful and clear, but it did not work out: it draws small graphs of about a thousand vertices, and silently refuses to draw ones ten times larger. No errors, no messages; it works for my colleagues on Windows, while I get a blank sheet =( I made a note to come back to it (and I recommend it to you), and went off to draw by the usual means.

I dug up a fairly old comparison of Python graph libraries and chose igraph with the FR (Fruchterman-Reingold) layout algorithm, for the quality of the final image and its satisfactory performance.
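
To give a feel for what the FR layout does, here is a small pure-Python sketch of the idea: every pair of vertices repels, linked vertices attract, and the step size shrinks over time. It is a simplified illustration and not igraph's implementation; the function name `fr_layout` and all parameter values are assumptions made for the example. In python-igraph itself the layout is available as `Graph.layout_fruchterman_reingold()`.

```python
import math
import random

def fr_layout(nodes, edges, iterations=100, size=1.0, seed=0):
    """Simplified Fruchterman-Reingold layout inside a size x size square."""
    nodes = list(nodes)
    rng = random.Random(seed)
    pos = {v: [rng.uniform(0, size), rng.uniform(0, size)] for v in nodes}
    if len(nodes) < 2:
        return pos
    k = size / math.sqrt(len(nodes))  # "ideal" distance between vertices
    temperature = size / 10           # maximum step length, cooled each round
    for _ in range(iterations):
        disp = {v: [0.0, 0.0] for v in nodes}
        # Repulsion between every pair of vertices: force k^2 / distance.
        for i, v in enumerate(nodes):
            for u in nodes[i + 1:]:
                dx = pos[v][0] - pos[u][0]
                dy = pos[v][1] - pos[u][1]
                dist = max(math.hypot(dx, dy), 1e-9)
                push = k * k / dist
                disp[v][0] += dx / dist * push
                disp[v][1] += dy / dist * push
                disp[u][0] -= dx / dist * push
                disp[u][1] -= dy / dist * push
        # Attraction along every edge: force distance^2 / k.
        for a, b in edges:
            dx = pos[a][0] - pos[b][0]
            dy = pos[a][1] - pos[b][1]
            dist = max(math.hypot(dx, dy), 1e-9)
            pull = dist * dist / k
            disp[a][0] -= dx / dist * pull
            disp[a][1] -= dy / dist * pull
            disp[b][0] += dx / dist * pull
            disp[b][1] += dy / dist * pull
        # Move each vertex by at most `temperature`, staying inside the square.
        for v in nodes:
            dx, dy = disp[v]
            length = max(math.hypot(dx, dy), 1e-9)
            step = min(length, temperature)
            pos[v][0] = min(size, max(0.0, pos[v][0] + dx / length * step))
            pos[v][1] = min(size, max(0.0, pos[v][1] + dy / length * step))
        temperature *= 0.95
    return pos

# Toy graph for illustration only.
toy_nodes = ["Elvis Presley", "Deep Purple", "Black Sabbath", "Whitesnake"]
toy_links = [("Whitesnake", "Deep Purple"), ("Deep Purple", "Black Sabbath")]
positions = fr_layout(toy_nodes, toy_links)
print(positions)
```

The pairwise repulsion loop is quadratic in the number of vertices, which is one reason a compiled library is the practical choice for graphs with tens of thousands of vertices.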

What came out of it


A few preliminary notes that may not be obvious:


All graphs are built without isolated vertices, that is, artists that nobody links to and that link to nobody themselves. They only create noise and add no information. I could also have cut off the low-degree vertices to avoid cluttering the image, but it seemed to me that the final graph could do without that.
Some graphs are shown in two versions, main and full. They differ in that the main version includes only the artists listed in the Yandex.Music index of some genre (call them primary artists).
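
The two notes above can be sketched in code. The function names and the toy data are made up for the example, and the sketch assumes that in the main version a link is kept only when both of its ends are in the index, which the text does not spell out.

```python
def full_graph(vertices, edges):
    """Full version: every collected artist except the isolated ones."""
    linked = {v for src, dst in edges for v in (src, dst)}
    return [v for v in vertices if v in linked], list(edges)

def main_graph(edges, indexed):
    """Main version: only artists present in a genre index.

    Assumption: a link survives only if both of its ends are indexed.
    """
    kept = [(src, dst) for src, dst in edges if src in indexed and dst in indexed]
    linked = {v for src, dst in kept for v in (src, dst)}
    return sorted(linked), kept

# Toy data for illustration only: "D" is isolated, "C" is not in the index.
toy_vertices = ["A", "B", "C", "D"]
toy_edges = [("A", "B"), ("B", "C")]
toy_index = {"A", "B", "D"}

full_vertices, full_edges = full_graph(toy_vertices, toy_edges)
main_vertices, main_edges = main_graph(toy_edges, toy_index)
print(full_vertices, main_vertices)
```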

Metal


Let's start with the subgenres. The graphs are small, so I built the full versions right away.

[Images: folk metal, progressive metal, nu metal, epic metal, extreme metal, classic metal]

The metal subgenres had only just appeared at the time of collection; I think they have grown considerably since.
And this is what the overall graph with all subgenres looks like, in two versions: main and full.

[Images: main metal artists, all metal artists]

Rock



[Images: Ukrainian rock, new wave, post rock, Russian rock, rock'n'roll, progressive rock]

Notice how far progressive rock and rock'n'roll are ahead of their peers. There are far more rock artists than metal artists overall (25 thousand versus 8.5 thousand); perhaps this reflects the nature of the music or the relative youth of metal.

The overall rock graph with all subgenres, also in two versions.

[Images: main rock artists, all rock artists]

The main graph clearly shows a small but proud cluster of Russian rock, far away from the rest.

Crossing rock + metal


Finally, let's cross the two genres. From here on, all graphs are the full versions.

the intersection of rock and metal
Rock artists are marked in red, metal artists in blue, and artists that belong to both genres at once in purple. In this form it is easy to see how close the two genres are in principle. From a distance it looks interesting, of course, but I wanted more clarity.
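
The color rule can be written down as a small function. This is a sketch: the name `genre_color`, the gray fallback for artists in neither genre, and the toy genre sets are assumptions made for the example.

```python
def genre_color(artist, rock, metal):
    """Pick a vertex color from genre membership."""
    in_rock = artist in rock
    in_metal = artist in metal
    if in_rock and in_metal:
        return "purple"  # listed in both genres
    if in_rock:
        return "red"
    if in_metal:
        return "blue"
    return "gray"  # assumption: fallback for artists outside both genres

# Toy genre memberships for illustration only, not the real index.
toy_rock = {"Artist A", "Artist B"}
toy_metal = {"Artist B", "Artist C"}
colors = {name: genre_color(name, toy_rock, toy_metal)
          for name in ["Artist A", "Artist B", "Artist C", "Artist D"]}
print(colors)
```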

After scaling each vertex in proportion to its in-degree, I finally got the picture that I will be studying for a long time in search of interesting things.
the intersection of rock and metal

Nothing can be seen in this thumbnail, but it is clickable, and behind it is a 10x10 kilopixel map with labels for vertices with a weight of 30 or more. The same map is available in other resolutions (10k, 20k, 32k). At the end of the article there are links to more detailed versions of this map and to the sources (in case you want your own version with different colors, sizes, and so on).
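
Sizing vertices by in-degree and labeling only the heavy ones can be sketched like this. The function name `vertex_styles`, the minimum size, and the scale factor are assumptions; only the threshold of 30 comes from the text, and the first two weights are taken from the tops in this article. With python-igraph, values like these would typically be passed to the plot as the `vertex_size` and `vertex_label` attributes.

```python
def vertex_styles(in_degree, min_size=2.0, scale=0.5, label_from=30):
    """Map each artist to a drawing size and an optional label.

    The size grows linearly with the in-degree; `min_size` is an assumed
    floor so that vertices with no incoming links remain visible.
    """
    return {
        artist: {
            "size": min_size + scale * degree,
            # Label only the heavy vertices to keep the map readable.
            "label": artist if degree >= label_from else "",
        }
        for artist, degree in in_degree.items()
    }

weights = {
    "Elvis Presley": 185,
    "Black Sabbath": 78,
    "Some Small Band": 3,  # hypothetical low-degree artist
}
styles = vertex_styles(weights)
print(styles)
```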

Interesting


The heart of rock - Elvis is invincible!

heart of rock


An isthmus between rock and metal. Purple, as a reminder, marks the artists that belong to both genres.

rock and metal - transition


The heart of metal. Unlike in rock, there are purple vertices here, and they carry decent weight.

metal heart


Russian rock is quite isolated and has almost no links to metal, not even to Russian metal.

Russian rock

A curious cluster of rock artists: few in number, very dense, and heavy.

interesting rock cluster

A patch of Russian metal, very far from everyone else.

Russian metal

Tops


It was interesting to look at the tops by in-degree.
Rock
Artist                 Degree
Elvis Presley          185
Deep Purple            133
Paul McCartney         97
Eric Clapton           93
Whitesnake             73
ChayF                  73
David Bowie            72
The Rolling Stones     70
The Ventures           67
B2                     67

Metal
Artist                 Degree
Black Sabbath          78
Edguy                  62
Sonata Arctica         60
In Flames              58
Therion                58
Judas Priest           57
Glenn Hughes           56
Eluveitie              55
Of Mice & Men          54
Doro Pesch             54

Interestingly, artists link not only to colleagues in their own style. Here is the top by incoming links for artists outside rock and metal.
Other genres
Artist                 Degree
Dr                     114
Bob Dylan              87
Ryan Tedder            61
Farrell                58
John Frusciante        48
Rihanna                47
Frank Sinatra          47
Lana Del Rey           46
Ray Charles            44
NOFX                   43

Note the very first artist, Dr. His page is very sparse, yet his weight is far from small (no joke: almost as much as Deep Purple, and that is among rock and metal artists), and before the spring update it was truly huge. Maybe he was added as a similar artist when nobody more suitable was found and the number of links had to be brought up to nine? Only the folks at Yandex can say.

Conclusion


I hope you learned something interesting from the article. Personally, I made myself a big list of artists to check out, and I hope to find new favorites.
I also ask you to keep in mind that everything presented here is the output of Yandex.Music recommendations, so the graph may differ greatly from what foreign aggregators would give.
I would be glad to receive well-reasoned criticism and feedback on this work.

→ Source code
→ Versions of the final graph

Afterword


  1. Be careful when running the scripts on your local machine. Yandex may ban you for scraping, and rendering the images can eat all your memory, especially if you have less than 16 GB.
  2. Likewise, consider your laptop's performance before opening the 32 kilopixel map.
  3. Cairo, which is under the hood of igraph's plotting, crashes with a core dump when generating large images in some versions, and in the latest ones hits a limit of 32k pixels. If you need more, generate a .ps file and convert it with third-party tools.
  4. As far as I can tell, the question of how many letters "l" belong in the Russian name of the metal genre is a very sore one, so I side with Yandex.Music, since all the data comes from it. Please do not start a flame war over this =)

Source: https://habr.com/ru/post/337216/

