
Disable site indexing by search bots using .htaccess

During active development, many people keep a copy of the site on a separate domain for experiments or redesigns (so as not to touch the live site).
This raises a problem: how do you hide that domain from search engines while keeping it a fully working copy of the site?

The easiest way to do this without touching the site's code is with .htaccess.
Create a .htaccess file and add the following to it:

SetEnvIfNoCase User-Agent "^Yandex" search_bot
SetEnvIfNoCase User-Agent "^Yahoo" search_bot
SetEnvIfNoCase User-Agent "^igdeSpyder" search_bot
SetEnvIfNoCase User-Agent "^Robot" search_bot
SetEnvIfNoCase User-Agent "^Googlebot" search_bot
SetEnvIfNoCase User-Agent "^msnbot" search_bot
SetEnvIfNoCase User-Agent "^Aport" search_bot
SetEnvIfNoCase User-Agent "^Mail" search_bot
SetEnvIfNoCase User-Agent "^bot" search_bot
SetEnvIfNoCase User-Agent "^spider" search_bot
SetEnvIfNoCase User-Agent "^php" search_bot
SetEnvIfNoCase User-Agent "^Parser" search_bot

Order Allow,Deny
Allow from all
Deny from env=search_bot
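The Order/Allow/Deny directives above are Apache 2.2 syntax. On Apache 2.4 (assuming mod_authz_core is enabled, which it is by default), the same rule would be written with Require directives instead — a sketch:

```apache
# Apache 2.4 equivalent: allow everyone except requests
# where SetEnvIfNoCase has set the search_bot variable.
<RequireAll>
    Require all granted
    Require not env search_bot
</RequireAll>
```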

To check that this works, you can use a PHP script like the following:

<?php
// If the form has not been submitted yet, show it.
if (empty($_POST)) {
?>
<form method="post">
    User-Agent: <input type="text" name="useragent" value="Googlebot">
    <input type="submit" value="Check">
</form>
<?php
} else {
    // Request the site with the submitted User-Agent and print the HTTP
    // status code: a blocked bot should receive 403, a normal browser 200.
    // Replace the URL with your own development domain.
    $ch = curl_init('https://staging.example.com/');
    curl_setopt($ch, CURLOPT_USERAGENT, $_POST['useragent']);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_NOBODY, true);
    curl_exec($ch);
    echo 'HTTP status: ' . curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);
}

;)

Source: https://habr.com/ru/post/20632/

