
Tag Algorithms

Tags are an integral part of most modern sites and an indirect sign that a site belongs to the notorious Web 2.0.

In this article I want to talk about methods and algorithms for tagging information.

Organizing tags involves several weak points and bottlenecks, namely:

Unfortunately, the author does not know of a universal algorithm that would easily solve all of these problems. Now, on to the algorithms themselves.

The usual many-to-many relationship


There is a huge table of tags, and there are huge tables of tagged information. They are connected through a third table, which ends up very large. So, if we have 50,000 articles and 10,000 tags, and each article is associated with 4 tags on average, the link table has 200,000 rows.
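As a minimal sketch of the three-table layout described above, assuming SQLite and hypothetical table names (`articles`, `tags`, `article_tags`) that do not come from the original article, the schema and two typical queries might look like this:

```python
import sqlite3


def build_schema(conn):
    # Hypothetical schema: one table per entity plus a link table.
    conn.executescript("""
        CREATE TABLE articles (
            id    INTEGER PRIMARY KEY,
            title TEXT NOT NULL
        );
        CREATE TABLE tags (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL UNIQUE
        );
        CREATE TABLE article_tags (
            article_id INTEGER NOT NULL REFERENCES articles(id),
            tag_id     INTEGER NOT NULL REFERENCES tags(id),
            PRIMARY KEY (article_id, tag_id)
        );
        -- Lookups by tag need their own index; the primary key
        -- only covers lookups that start from the article.
        CREATE INDEX idx_article_tags_tag ON article_tags(tag_id);
    """)


def tag_article(conn, article_id, tag_names):
    # Create missing tags, then add one link row per (article, tag) pair.
    for name in tag_names:
        conn.execute("INSERT OR IGNORE INTO tags (name) VALUES (?)", (name,))
        (tag_id,) = conn.execute(
            "SELECT id FROM tags WHERE name = ?", (name,)
        ).fetchone()
        conn.execute(
            "INSERT OR IGNORE INTO article_tags (article_id, tag_id) VALUES (?, ?)",
            (article_id, tag_id),
        )


def articles_by_tag(conn, tag_name):
    # Selecting by tag costs two joins through the link table.
    rows = conn.execute("""
        SELECT a.title
        FROM articles a
        JOIN article_tags lnk ON lnk.article_id = a.id
        JOIN tags t ON t.id = lnk.tag_id
        WHERE t.name = ?
        ORDER BY a.id
    """, (tag_name,))
    return [row[0] for row in rows]


def tag_counts(conn):
    # Usage frequency per tag, e.g. as input for a tag cloud.
    return conn.execute("""
        SELECT t.name, COUNT(*) AS uses
        FROM tags t
        JOIN article_tags lnk ON lnk.tag_id = t.id
        GROUP BY t.id
        ORDER BY uses DESC, t.name
    """).fetchall()


conn = sqlite3.connect(":memory:")
build_schema(conn)
conn.executemany(
    "INSERT INTO articles (id, title) VALUES (?, ?)",
    [(1, "Intro to indexes"), (2, "Tag clouds")],
)
tag_article(conn, 1, ["sql", "performance"])
tag_article(conn, 2, ["sql", "web"])
print(articles_by_tag(conn, "sql"))
print(tag_counts(conn))
```

With 4 tags per article the link table grows by 4 rows per article, which is where the 200,000 rows in the example above come from.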
Pros:
Cons:

Using full-text search


The algorithm itself is described in my article "Full-text search and its capabilities".

Now for how this applies to tags specifically. The field with the full-text index holds the tags themselves, exactly as they were written. Objects are selected exclusively by this field, and an object's association with its tags is derived from the same field. This means that if a tag is in Russian, the link to it must contain Russian letters. This causes problems, because the letters may be URL-encoded, and the result depends on the character encoding. That is, the same tag has to be decoded differently depending on the encoding of the page. You can, of course, transliterate the Russian words into Latin characters and store them in the field alongside the Russian words. The tag is then displayed in Russian, while the link to it is in Latin and the search is also done in Latin. A poor way out, but a way out.
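To illustrate the encoding problem, here is a small sketch using Python's standard `urllib.parse`. The tag "тег" and the transliteration table `TRANSLIT` are illustrative assumptions, not something taken from the original article, and the table is a simplified mapping rather than any official standard.

```python
from urllib.parse import quote, unquote

tag = "тег"

# The same tag yields different links depending on the page encoding.
utf8_link = quote(tag, encoding="utf-8")
cp1251_link = quote(tag, encoding="cp1251")
print(utf8_link)
print(cp1251_link)

# Decoding only round-trips if the same encoding is assumed on the way back.
assert unquote(cp1251_link, encoding="cp1251") == tag
assert unquote(cp1251_link, encoding="utf-8") != tag

# Hypothetical, simplified Cyrillic-to-Latin table for building link-safe tags.
TRANSLIT = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e", "ё": "yo",
    "ж": "zh", "з": "z", "и": "i", "й": "y", "к": "k", "л": "l", "м": "m",
    "н": "n", "о": "o", "п": "p", "р": "r", "с": "s", "т": "t", "у": "u",
    "ф": "f", "х": "h", "ц": "ts", "ч": "ch", "ш": "sh", "щ": "sch",
    "ъ": "", "ы": "y", "ь": "", "э": "e", "ю": "yu", "я": "ya",
}


def transliterate(text):
    # Lowercase first, then map each Cyrillic letter; other characters pass through.
    return "".join(TRANSLIT.get(ch, ch) for ch in text.lower())


# Store both forms: the original for display, the Latin one for links and search.
indexed_field = f"{tag} {transliterate(tag)}"
print(indexed_field)
```

Note that a table like this is lossy: here "е" and "э" both map to "e", so distinct Russian tags can collide in their Latin form, which is one reason the approach is only a workaround.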

Pros:
Cons:
Alternatively, the two methods can be combined: searching goes through the full-text index, while the tags themselves and their usage frequency are kept in a separate table. Or some variation on the same theme. This solves the problems with the drop-down list and the tag cloud, but creates difficulties when displaying, adding and creating tags.
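A rough sketch of the combined approach, assuming SQLite, single-word tags, and hypothetical names (`tags_text`, `tag_stats`, `set_tags`): the article row keeps the tag text that a full-text index would cover, while a separate table keeps each tag with its usage counter. The full-text index itself is left out here; only the bookkeeping is shown.

```python
import sqlite3


def create_tables(conn):
    conn.executescript("""
        CREATE TABLE articles (
            id        INTEGER PRIMARY KEY,
            title     TEXT NOT NULL,
            tags_text TEXT NOT NULL DEFAULT ''  -- the field a full-text index would cover
        );
        CREATE TABLE tag_stats (
            name TEXT PRIMARY KEY,
            uses INTEGER NOT NULL
        );
    """)


def set_tags(conn, article_id, new_tags):
    # Every change to an article's tags must also adjust the counters;
    # this is the extra work the combined scheme introduces.
    row = conn.execute(
        "SELECT tags_text FROM articles WHERE id = ?", (article_id,)
    ).fetchone()
    if row is None:
        raise ValueError(f"no article with id {article_id}")
    old_tags = set(row[0].split())
    new_tags = set(new_tags)
    for name in new_tags - old_tags:
        conn.execute(
            "INSERT OR IGNORE INTO tag_stats (name, uses) VALUES (?, 0)", (name,)
        )
        conn.execute("UPDATE tag_stats SET uses = uses + 1 WHERE name = ?", (name,))
    for name in old_tags - new_tags:
        conn.execute("UPDATE tag_stats SET uses = uses - 1 WHERE name = ?", (name,))
    conn.execute("DELETE FROM tag_stats WHERE uses <= 0")
    conn.execute(
        "UPDATE articles SET tags_text = ? WHERE id = ?",
        (" ".join(sorted(new_tags)), article_id),
    )


def tag_cloud(conn):
    # The cloud reads only the small statistics table.
    return conn.execute(
        "SELECT name, uses FROM tag_stats ORDER BY uses DESC, name"
    ).fetchall()


def suggest(conn, prefix):
    # Drop-down suggestions: tags starting with the prefix, most used first.
    rows = conn.execute(
        "SELECT name FROM tag_stats WHERE substr(name, 1, ?) = ? "
        "ORDER BY uses DESC, name",
        (len(prefix), prefix),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
create_tables(conn)
conn.executemany(
    "INSERT INTO articles (id, title) VALUES (?, ?)",
    [(1, "First post"), (2, "Second post")],
)
set_tags(conn, 1, ["python", "sql"])
set_tags(conn, 2, ["python"])
print(tag_cloud(conn))
print(suggest(conn, "py"))
```

The tag cloud and the suggestion list never touch the articles table, but the tag text and the counters can drift apart if any code path updates one without the other.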

If anyone knows other ways to organize tags, I would be interested to hear about them. Constructive criticism is welcome.

Source: https://habr.com/ru/post/40320/

