
Keyword Graph

In early May of this year, I was talking with a colleague about a practical application of sets when the question came up of how to build links between the objects of a single site. The site was a catalog of Windows and Linux software analogs, and one of its distinguishing features was selecting programs by section, facet-style (it looks like facets, but inside, as far as I understood from the discussion, everything is implemented on sets; the site was built by someone else, so I will need to discuss this with him as well). Frankly, I was somewhat surprised by the task and ... said that it was rather trivial: if the tables are related many-to-many when the database is designed, everything can be solved with a single query. We talked and went our separate ways, but the idea lodged in my subconscious and kept nagging: "you could do the same thing, and better."

And by May 10, I had “matured”: I decided to build an undirected graph of all the connections between keywords in the Media Repository project. On the one hand, the task is very similar to the one described above; on the other, there are enough keywords to build something meaningful; and on the third, there are not so many of them that building the graph would take very long.

No sooner said than done. To start with, I implemented the “related keywords” functionality, and then I hit a snag. A consultation with an SQL specialist gave a disappointing answer: the problem cannot be solved by SQL alone on the existing DBMS (MySQL 5). I had to do everything on the “client side”, in a script that works with the database.
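A “related keywords” lookup of this kind can be sketched as a self-join on a many-to-many link table. The sketch below is only an illustration under assumptions: the table `item_keyword`, its columns, and the sample rows are hypothetical, and it uses Python with an in-memory SQLite database rather than the PHP and MySQL 5 of the actual project.

```python
import sqlite3

# Hypothetical schema: a link table connecting media items to keywords.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item_keyword (item_id INTEGER, keyword TEXT);
    INSERT INTO item_keyword VALUES
        (1, 'linux'), (1, 'kernel'),
        (2, 'linux'), (2, 'shell'),
        (3, 'kernel'), (3, 'linux');
""")

def related_keywords(conn, keyword):
    """Return (other_keyword, shared_item_count) pairs, most shared first."""
    return conn.execute(
        """
        SELECT b.keyword, COUNT(*) AS shared
        FROM item_keyword AS a
        JOIN item_keyword AS b
          ON a.item_id = b.item_id AND a.keyword <> b.keyword
        WHERE a.keyword = ?
        GROUP BY b.keyword
        ORDER BY shared DESC, b.keyword
        """,
        (keyword,),
    ).fetchall()

print(related_keywords(conn, "linux"))
```

This only answers the question for one keyword at a time; assembling the full graph is the part that was moved into the script.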

I had worked with the Graphviz graph-drawing toolkit before, so I decided to use it for visualization this time too. Using PHP and SQL, I prepared a .dot file for Graphviz and laid out the graph with fdp (neato produced a very “crowded” picture, and dot's output was too large).
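Preparing the .dot file on the client side might look roughly like the following. This is a minimal sketch in Python, not the original PHP script: the function name `keyword_graph_dot` and the input format (a list of `(item_id, keyword)` pairs, for example fetched from a link table) are assumptions made for illustration. Two keywords get an edge if they appear on at least one common item.

```python
from collections import defaultdict
from itertools import combinations

def keyword_graph_dot(pairs):
    """Build an undirected keyword graph in DOT format from (item_id, keyword) pairs."""
    # Group keywords by the item they are attached to.
    by_item = defaultdict(set)
    for item_id, keyword in pairs:
        by_item[item_id].add(keyword)

    # Sorting gives each pair a canonical order, so a--b and b--a collapse
    # into one undirected edge. Keywords that never co-occur are omitted.
    edges = set()
    for keywords in by_item.values():
        for a, b in combinations(sorted(keywords), 2):
            edges.add((a, b))

    def quote(name):
        # Escape backslashes and double quotes inside a DOT quoted identifier.
        return '"' + name.replace("\\", "\\\\").replace('"', '\\"') + '"'

    lines = ["graph keywords {"]
    for a, b in sorted(edges):
        lines.append(f"    {quote(a)} -- {quote(b)};")
    lines.append("}")
    return "\n".join(lines)

sample = [(1, "linux"), (1, "kernel"), (2, "kernel"), (2, "linux"), (2, "shell")]
print(keyword_graph_dot(sample))
```

If the output is saved as `keywords.dot` (a hypothetical file name), a command such as `fdp -Tpng keywords.dot -o keywords.png` renders it with the fdp layout; substituting `neato` or `dot` for `fdp` allows the layouts to be compared.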
The result is below:


The graph clearly shows several “crystallization centers” of keywords, but to be honest, I was surprised by its shape: I had expected it to be more “blurry” :)

Almost two months later, I decided to rebuild the graph using the same method, so that its “development” could be seen.

Result of rebuilding:


As it turned out, the development of the graph is quite entertaining :)

Source: https://habr.com/ru/post/38532/

