
A tool for monitoring the behavior of robots on your site

Greetings

Today I would like to tell you about my project, which I started back in 2008. Since then, much has changed, both in the data storage architecture and in the information processing algorithms.



This is a service for SEO specialists and/or ordinary webmasters. BotHunter is a passive, real-time monitoring system for the user agents visiting your site. Interface examples are shown below and in the DEMO account on the system's site (functionality is limited in demo mode).
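How BotHunter does this internally is not published, but the general idea of passive monitoring is simple: read the web server's access logs and tally the user agents found there, without sending anything to the crawlers themselves. Below is a minimal sketch of that idea; the combined log format, the access.log path, and the short KNOWN_BOTS list are all assumptions made for illustration.

```python
import re
from collections import Counter

# Combined log format: ip - - [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<bytes>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Substrings identifying well-known crawlers (deliberately not exhaustive).
KNOWN_BOTS = ("Googlebot", "YandexBot", "bingbot", "Baiduspider")

def bot_hits(log_path):
    """Count hits per known crawler from an access log (passive: reads logs only)."""
    hits = Counter()
    with open(log_path, encoding="utf-8", errors="replace") as log:
        for line in log:
            m = LOG_LINE.match(line)
            if not m:
                continue
            agent = m.group("agent")
            for bot in KNOWN_BOTS:
                if bot in agent:
                    hits[bot] += 1
                    break
    return hits

if __name__ == "__main__":
    for bot, count in bot_hits("access.log").most_common():
        print(f"{bot}: {count}")
```

A real system would also verify crawler identity (for example, by reverse DNS lookup of the requesting IP), since the User-Agent header is trivially spoofed.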










Backstory



Given my appetites and the amount of data to analyze, I wrote this service for myself. For me, a "graphic answer" is the most understandable reply to any question. Frequently asked questions that BotHunter answers:









Isn't this reinventing the wheel?



Right away, I would like to stop those who are ready to ask: "Why? Don't Yandex.Webmaster and Google Webmaster Tools already exist?"

Yes, these services are useful and well known, but they will not answer the following questions:



1. Are there pages on my site that bots know about, but that are missing from sitemap.xml? (A sketch of this check follows the list.)

2. Are there pages on my site that a bot has visited, but that have never received any traffic (I want a list)?

3. What share of URLs do crawlers visit constantly, even though those URLs are not in the search index?

4. Are there pages on my site with the same weight in bytes (again on the topic of duplicates)?

5. After a search index update (or algorithm change) on such-and-such a date, how many pages of the site do bots no longer visit? And how many of them are no longer traffic entry points from organic search?

6. And so on.

The list of interesting questions could be continued, and each of us will have a list of their own...
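For instance, question 1 from the list reduces to a set difference: the URLs a crawler has requested (taken from the access logs) minus the URLs listed in sitemap.xml. Here is a rough sketch, not BotHunter's actual code; the file names and the example.com domain are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

def sitemap_urls(path):
    """Extract the set of <loc> URLs from a sitemap.xml file."""
    tree = ET.parse(path)
    return {loc.text.strip() for loc in tree.iter(SITEMAP_NS + "loc") if loc.text}

def crawled_paths(log_path, bot="Googlebot"):
    """Paths requested by the given bot, pulled from a combined-format access log."""
    line_re = re.compile(
        r'"(?:GET|HEAD) (?P<path>\S+) HTTP/[^"]*".*"(?P<agent>[^"]*)"$'
    )
    paths = set()
    with open(log_path, encoding="utf-8", errors="replace") as log:
        for line in log:
            m = line_re.search(line)
            if m and bot in m.group("agent"):
                paths.add(m.group("path"))
    return paths

# Question 1: URLs the bot knows about that are absent from sitemap.xml.
site = "https://example.com"  # hypothetical site
known_to_bot = {site + p for p in crawled_paths("access.log")}
missing = known_to_bot - sitemap_urls("sitemap.xml")
for url in sorted(missing):
    print(url)
```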







What are the advantages of the service











In addition to simple and clear reports, BotHunter checks the integrity of the robots.txt and sitemap.xml files for each of your sites daily. sitemap.xml is a separate story: the file is tested for validity and for compliance with the Sitemap protocol. The system keeps a daily log of all checks and of every report generation.
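The Sitemap protocol itself is public (sitemaps.org), so the kind of compliance checks meant here are easy to illustrate. A minimal sketch, not BotHunter's implementation: it verifies that the file is well-formed XML, that the root is <urlset> in the protocol namespace, that every <url> entry carries a <loc>, and that the file stays under the protocol's limit of 50,000 URLs.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def check_sitemap(path):
    """Return a list of Sitemap-protocol violations found in a sitemap.xml file."""
    try:
        tree = ET.parse(path)  # well-formed XML?
    except ET.ParseError as err:
        return [f"not well-formed XML: {err}"]
    problems = []
    root = tree.getroot()
    if root.tag != f"{{{SITEMAP_NS}}}urlset":
        problems.append(
            f"root element is {root.tag}, expected <urlset> in the {SITEMAP_NS} namespace"
        )
    urls = root.findall(f"{{{SITEMAP_NS}}}url")
    if len(urls) > 50_000:
        problems.append(
            f"{len(urls)} <url> entries; the protocol allows at most 50,000 per file"
        )
    for i, url in enumerate(urls, start=1):
        loc = url.find(f"{{{SITEMAP_NS}}}loc")
        if loc is None or not (loc.text or "").strip():
            problems.append(f"<url> entry #{i} has no <loc>")
    return problems

if __name__ == "__main__":
    for problem in check_sitemap("sitemap.xml") or ["OK"]:
        print(problem)
```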



What's planned







P.S. Briefly, about the technical specs:





The main objective of this post is to get your advice.

What other data would you like to receive and in what form?

What ideas would you suggest?



Thank you in advance for your constructive criticism...

Source: https://habr.com/ru/post/180849/


