Simple way to protect from classic HTTP DDoS

This solution lets you identify any bots except those that fully mimic a browser.

How it works


A bot requests a page, for example habrahabr.ru/search. The bot does not load the pictures, scripts, CSS, etc. that come with the page, so the only thing that shows up in the log is the request to /search/.
When a real person opens habrahabr.ru/search in a browser, the log gets the request to /search/ plus a lot of requests for pictures, scripts, CSS, etc.
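The idea can be sketched in a few lines of PHP before any nginx or MySQL is involved. This is only an illustration under stated assumptions: `find_suspects()` and its parameters are hypothetical names, and the inputs are plain arrays of IP strings rather than real logs.

```php
<?php
// Illustrative sketch: find_suspects() and its inputs are hypothetical names.

/**
 * @param string[] $dynamicHits IPs, one entry per request to a dynamic page
 * @param string[] $trapHits    IPs, one entry per request for a static "trap" file
 * @param int      $threshold   dynamic requests tolerated without any trap hit
 * @return string[] IPs that exceeded the threshold and never loaded the trap
 */
function find_suspects(array $dynamicHits, array $trapHits, int $threshold): array
{
    $seenTrap = array_flip($trapHits);            // IP => index, used as a set
    $counts   = array_count_values($dynamicHits); // IP => number of dynamic requests

    $suspects = [];
    foreach ($counts as $ip => $count) {
        if ($count > $threshold && !isset($seenTrap[$ip])) {
            $suspects[] = $ip;
        }
    }
    return $suspects;
}
```

An address that requested dynamic pages more than `$threshold` times without ever fetching the static file is treated as a bot. An address that fetched the static file even once is left alone.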

Setup


MySQL


/etc/my.cnf
 [mysqld]
 # allow LOAD DATA
 local-infile=1
 # for in-memory (MEMORY) tables:
 max_heap_table_size=1024M
 tmp_table_size=1024M

As root:
 UPDATE `mysql`.`user` SET `File_priv` = 'Y' WHERE `user`.`Host` = 'localhost' AND `user`.`User` = '__';
 FLUSH PRIVILEGES;

sysctl


See the detailed, commented sysctl.conf (Linux).

RAM disk


A RAM disk is needed to speed up work with the nginx logs.
Add to /etc/fstab:
 tmpfs /var/log/ram_disk tmpfs size=1024m 0 0 

Then:
 mkdir /var/log/ram_disk
 mount -t tmpfs -o size=1024m tmpfs /var/log/ram_disk

Algorithm



1. Choosing the trap


Pick any inconspicuous static file on the site (picture, CSS, JS, etc.) that is loaded with every dynamic page, for example habrahabr.ru/styles/fontello/css/habr.css
This file has to be made uncacheable, i.e. a random parameter is added to its URL, for example <?php echo '/styles/fontello/css/habr.css?' . rand(0, 99999999); ?>.
For reference: by default, Opera keeps images in the local cache for 1 hour and CSS/JS for 5 minutes.

2. Editing the nginx config


 # log format: client IP, timestamp, response status (http context)
 log_format ddos_log '$remote_addr\t$msec\t$status';

 # the trap
 location = /styles/1347283218/highlight.css {
     access_log /var/log/ram_disk/hook_access.log ddos_log;
 }

 # static files are not logged
 location ~* ^.+\.(class|htc|bmp|cur|jpg|jpeg|gif|png|svg|xls|doc|xhtml|js|css|mp3|ogg|mpe?g|avi|flv|zip|gz|bz2?|rar|ico|txt|jar|swf)$ {
     access_log off;
 }

 # dynamic pages
 location / {
     access_log /var/log/ram_disk/dynamic_access.log ddos_log;
 }
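Each line written in the `ddos_log` format carries three tab-separated fields: `$remote_addr`, `$msec` and `$status`. The sketch below shows how one such line maps onto the `remote_addr`, `time_local` and `status` columns. `parse_ddos_log_line()` is a hypothetical helper, not part of the article's script (which loads the files with LOAD DATA instead), and it assumes IPv4 addresses and 64-bit PHP.

```php
<?php
// parse_ddos_log_line() is a hypothetical helper used only for illustration.

/**
 * Split one "$remote_addr\t$msec\t$status" line into the three columns
 * used by the log tables. Returns null for malformed lines.
 */
function parse_ddos_log_line(string $line): ?array
{
    $parts = explode("\t", rtrim($line, "\r\n"));
    if (count($parts) !== 3) {
        return null;
    }
    [$addr, $msec, $status] = $parts;

    $ip = ip2long($addr);              // IPv4 only; false for anything else
    if ($ip === false) {
        return null;
    }
    return [
        'remote_addr' => $ip,          // same number MySQL's INET_ATON() yields
        'time_local'  => (int) $msec,  // drops the millisecond fraction
        'status'      => (int) $status,
    ];
}
```

Storing the address as a number rather than a string is what makes it comparable with the `bigint` column in the `white` and `black` tables.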


3. Create tables for logs


ENGINE=MEMORY is used for speed.
 CREATE TABLE `dinamic_log` (
   `inc` bigint(20) NOT NULL AUTO_INCREMENT,
   `remote_addr` varchar(20) NOT NULL DEFAULT '0',
   `time_local` int(20) NOT NULL DEFAULT '0',
   `status` int(4) NOT NULL DEFAULT '0',
   PRIMARY KEY (`inc`),
   KEY `remote_addr` (`remote_addr`),
   KEY `time_local` (`time_local`)
 ) ENGINE=MEMORY AUTO_INCREMENT=1 DEFAULT CHARSET=latin1;

 CREATE TABLE `hook_log` (
   `inc` bigint(20) NOT NULL AUTO_INCREMENT,
   `remote_addr` varchar(20) NOT NULL DEFAULT '0',
   `time_local` int(20) NOT NULL DEFAULT '0',
   `status` int(4) NOT NULL DEFAULT '0',
   PRIMARY KEY (`inc`),
   KEY `remote_addr` (`remote_addr`),
   KEY `time_local` (`time_local`)
 ) ENGINE=MEMORY AUTO_INCREMENT=1 DEFAULT CHARSET=latin1;


Table holding the IPs of search-engine bots (whitelist):
 CREATE TABLE `white` (
   `remote_addr` bigint(20) NOT NULL,
   PRIMARY KEY (`remote_addr`)
 ) ENGINE=MyISAM DEFAULT CHARSET=latin1;

Table of blocked IPs (blacklist):
 CREATE TABLE `black` (
   `remote_addr` bigint(20) NOT NULL,
   `time_local` int(20) NOT NULL DEFAULT '0',
   PRIMARY KEY (`remote_addr`),
   KEY `time_local` (`time_local`)
 ) ENGINE=MyISAM DEFAULT CHARSET=latin1;


4. The main script


The script is written in PHP for ease of understanding, because almost everyone knows the language. Error handling has been left out for the same reason.

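 <?php
 // NOTE: the mysql_* functions used here were removed in PHP 7, and the script
 // expects a MySQL connection to be open already (mysql_connect / mysql_select_db).

 // path to the log of dynamic requests
 $dinamic_log = $argv[1];
 // path to the trap log
 $hook_log = $argv[2];
 // number of requests to dynamic pages, without a trap hit, after which an IP is banned
 $r_stop = (int) $argv[3];
 // how long log rows are kept in the tables (in seconds)
 $load_time = (int) $argv[4];
 // pause between iterations (in seconds)
 $wait_sec = (int) $argv[5];

 function load_log($log, $table) {
     $tmp = '/var/log/ram_disk/tmp_ddos_file';
     // copy the log to a temporary file
     copy($log, $tmp);
     // truncate the log
     file_put_contents($log, "", LOCK_EX);
     // load the data into the table, converting the IP to a number
     mysql_query('LOAD DATA CONCURRENT INFILE "'.$tmp.'" IGNORE INTO TABLE '.$table.' FIELDS TERMINATED BY \'\t\' (`remote_addr`, `time_local`, `status`) SET `remote_addr` = INET_ATON(`remote_addr`)');
     // remove the temporary file
     unlink($tmp);
 }

 // main loop
 while (true) {
     // load the log of dynamic requests
     load_log($dinamic_log, 'dinamic_log');
     // load the trap log
     load_log($hook_log, 'hook_log');

     // find the bots. nginx logs $status, so only 200 and 304 responses are counted.
     $res = mysql_query('SELECT dinamic_log.remote_addr FROM `dinamic_log` WHERE (`status` = 200 OR `status` = 304) AND `remote_addr` NOT IN (SELECT `remote_addr` FROM `hook_log`) AND `remote_addr` NOT IN (SELECT `remote_addr` FROM `white`) GROUP BY `remote_addr` HAVING count(inc) > '.$r_stop);
     while ($row = mysql_fetch_array($res)) {
         $ip = long2ip((int) $row['remote_addr']);
         // remember the IP and the time of the ban (the unban script relies on time_local)
         mysql_query('INSERT IGNORE INTO black(`remote_addr`, `time_local`) VALUES ('.$row['remote_addr'].', UNIX_TIMESTAMP())');
         // block the IP
         switch (PHP_OS) {
             case "FreeBSD":
                 system('/sbin/route add -host '.$ip.' 127.0.0.1 -blackhole');
                 break;
             case "Linux":
                 system('/sbin/ip route add blackhole '.$ip);
                 break;
         }
     }

     // purge old rows from both log tables
     foreach (array('dinamic_log', 'hook_log') as $table) {
         mysql_query('DELETE FROM `'.$table.'` WHERE `time_local` < (UNIX_TIMESTAMP() - '.$load_time.')');
     }
     // wait
     sleep($wait_sec);
 }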

Run:
 php ddoshook.php /var/log/ram_disk/dynamic_access.log /var/log/ram_disk/hook_access.log 5 300 3 


5. Unbanning


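 <?php
 // how long an IP stays banned (in seconds)
 $block_time = (int) $argv[1];

 $res = mysql_query('SELECT `remote_addr` FROM black WHERE time_local < (UNIX_TIMESTAMP() - '.$block_time.')');
 while ($row = mysql_fetch_array($res)) {
     $ip = long2ip((int) $row['remote_addr']);
     // unblock the IP
     switch (PHP_OS) {
         case "FreeBSD":
             system('/sbin/route delete '.$ip);
             break;
         case "Linux":
             system('/sbin/ip route delete '.$ip);
             break;
     }
     // forget the ban so the route is not deleted again on the next run
     mysql_query('DELETE FROM black WHERE `remote_addr` = '.$row['remote_addr']);
 }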

Add it to cron:
 * * * * * /usr/bin/php unban.php 86400 


That's it: bots get banned, people get through.

In the following releases:


upd

How do you find the IPs of search-engine bots?


  1. Find the AS of the search engine you need, for example here: bgp.potaroo.net/cidr/autnums.html
  2. Get the IP prefixes announced by that AS: stat.ripe.net/data/announced-prefixes/data.json?resource=AS15169

The AS and IP lists need to be kept up to date.
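A rough sketch of turning such a response into numeric address ranges is shown below. It is an illustration under stated assumptions: the JSON layout (`data.prefixes[].prefix`) is assumed rather than taken from the article, `extract_ipv4_prefixes()` and `cidr_to_range()` are hypothetical helpers, and the code handles IPv4 only on 64-bit PHP.

```php
<?php
// Hypothetical helpers; the JSON layout (data.prefixes[].prefix) is an assumption.

/** Extract IPv4 prefixes such as "8.8.8.0/24" from the raw JSON response. */
function extract_ipv4_prefixes(string $json): array
{
    $decoded  = json_decode($json, true);
    $prefixes = [];
    foreach ($decoded['data']['prefixes'] ?? [] as $entry) {
        $prefix = $entry['prefix'] ?? '';
        // skip IPv6 prefixes and entries without a mask
        if (strpos($prefix, ':') === false && strpos($prefix, '/') !== false) {
            $prefixes[] = $prefix;
        }
    }
    return $prefixes;
}

/** Convert an IPv4 CIDR prefix to [first, last] as integers (64-bit PHP). */
function cidr_to_range(string $cidr): array
{
    [$base, $bits] = explode('/', $cidr);
    $size  = 1 << (32 - (int) $bits);        // number of addresses in the prefix
    $first = ip2long($base) & ~($size - 1);  // clear the host bits
    return [$first, $first + $size - 1];
}
```

The `white` table described earlier stores one address per row, so ranges like these would either have to be expanded into single addresses or stored in a table with start and end columns. Which of the two fits better depends on how many prefixes the AS announces.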

upd 2

OS: FreeBSD 8.3
CPU: E5-2620 2.00GHz

Test 1


rows (dinamic_log): 100,000 (100,000 HTTP requests to dynamic pages in 3 seconds)
rows (hook_log): 1,000 (1,000 legitimate requests from users in 3 seconds)

# php /root/scripts/php/imgtest/ddos_hook.php /tmp/d20.log /tmp/h20.log 5 300 3
LOAD DATA time elapsed: 0.29 sec.
LOAD DATA time elapsed: 0.003 sec.
select time elapsed: 0.017 sec.
rows (ban): 1800
full cicle time elapsed: 0.313 sec.

Test 2


rows (dinamic_log): 1,000,000 (1 million HTTP requests to dynamic pages in 3 seconds)
rows (hook_log): 10,000 (10,000 legitimate requests from users in 3 seconds)

# php /root/scripts/php/imgtest/ddos_hook.php /tmp/d2.log /tmp/h2.log 5 300 3
LOAD DATA time elapsed: 2.878 sec.
LOAD DATA time elapsed: 0.023 sec.
select time elapsed: 0.501 sec.
rows (ban): 12402
full cicle time elapsed: 3.54 sec.

PS
The solution described in this article is experimental. Use it at your own risk.

Source: https://habr.com/ru/post/151420/