
Express analysis of suspicious activity in web server logs

Most modern hosting providers offer SSH access in addition to FTP access to the file system (either by default or on request to technical support). Being able to work with site files in a terminal over SSH saves the webmaster a lot of time: an operation that takes dozens of minutes over FTP is done on the command line in a couple of seconds. In addition, many operations can only be performed on the command line over SSH.



The webmaster does not have to master every tool of the Unix operating system. To start, it is enough to learn the basic commands and add a few useful tricks for working on the command line over SSH: quickly searching for files, changing their attributes, copying, deleting, and processing text data.



I will skip the description of the protocol and of connecting to a hosting account via SSH; there are many video tutorials and articles on this topic online. I will only note that you need PuTTY (Windows), Terminal (Mac OS X) or a similar client, plus the SSH access details for the hosting account: host, port, username and password (the username and password are often the same as for cPanel, ISPmanager or another hosting control panel).


So what useful things can you do on the command line? You can quickly search for a substring in a text file, and sort and filter text data. For example, you can analyze web server logs to identify suspicious requests to the site or to understand how the site was hacked.



Suppose you notice suspicious activity on the site (pages began to open slowly, access to the admin panel was lost, spam is being sent from the site, and so on). The first thing to do is to check the site files for malicious code with specialized scanners. While the site is being scanned, you can run an express analysis of the web server logs with the find and grep commands to determine whether there were requests to suspicious scripts, brute-force attempts (password guessing) or calls to hacker scripts. How to do this is described below.



To analyze the web server logs, the logs must be enabled and accessible in the user directory. If they are disabled by default, enable them in the hosting control panel and, if such a setting exists, set the maximum possible retention period (rotation). If there are no logs but you need to analyze the last few days, you can try requesting them from the hosting provider's technical support. On most shared hosting plans, the logs are in the logs directory, one or two levels above the public_html (www) directory. So we will assume that the logs exist on the hosting and that the path to them is known.



Connect via SSH and go to the directory with the web server logs; on shared hosting they are usually kept for the last 5-7 days. If you list the files in the directory, you will most likely see access_log for today, as well as access_log.1.gz, access_log.2.gz, ...: these are the archived logs for previous days.



You can start analyzing the log with requests that were made using the POST method:



grep 'POST /' access_log 


or

 cat access_log | grep 'POST /' 


The output can be saved to a new text file for further analysis:



 grep 'POST /' access_log > post_today.txt 


How do you do the same for a gzip-archived log? There is the zcat command for this (similar to cat, but it prints the decompressed contents of the archive).



 zcat access_log.1.gz | grep 'POST /' > post_today.txt 


To analyze suspicious activity, it is advisable to sample all available logs. Therefore, the examples below use the find command, which finds all the archived files and runs the corresponding command (for example, zcat) on each of them.



How can you detect hacking attempts or probes for vulnerable scripts?

For example, you can find all requests to non-existent .php scripts in all available logs.



 grep 'php HTTP.* 404' access_log
 find . -name '*.gz' -exec zcat {} \; | grep 'php HTTP.* 404'


(Instead of -exec, you can use xargs to call zcat.)
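The xargs variant mentioned above could look like the following sketch. The temporary directory and the sample log lines are made up for illustration; the -print0 and -0 options (available in GNU and BSD find and xargs) keep file names with spaces intact.

```shell
# Build a throwaway directory with one gzip-archived sample log (made-up data).
workdir=$(mktemp -d)
printf '%s\n' \
  '192.0.2.10 - - [10/Jul/2015:10:00:01 +0300] "GET /shell.php HTTP/1.1" 404 208 "-" "curl/7.40"' \
  '192.0.2.11 - - [10/Jul/2015:10:00:02 +0300] "GET /index.php HTTP/1.1" 200 5120 "-" "Mozilla/5.0"' \
  > "$workdir/access_log.1"
gzip "$workdir/access_log.1"   # produces access_log.1.gz

# Same search as with -exec, but zcat is started by xargs,
# which passes many archive names to a single zcat call.
matches=$(find "$workdir" -name '*.gz' -print0 | xargs -0 zcat | grep 'php HTTP.* 404')
printf '%s\n' "$matches"

# Clean up the sample data.
rm -rf "$workdir"
```

In a real logs directory, replace "$workdir" with . as in the commands above.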



You can also search for all failed requests to PHP scripts (those to which access was denied).



 find . -name '*.gz' -exec zcat {} \; | grep 'php HTTP.* 403' 


Here we look for requests that contain both the php extension and status 403.



Next, we will count the successful requests to scripts across all available logs, sort them by the number of requests and print the TOP-50 most popular ones. We will build the sample in three steps: first we search all the archived access_log.*.gz files and write the results to a file, then we append the matches from the current access_log, and finally we use that file for sorting.



 find . -name '*.gz' -exec zcat {} \; | grep 'php HTTP.* 200' > php.txt
 grep 'php HTTP.* 200' access_log >> php.txt
 cut -d '"' -f2 php.txt | cut -d ' ' -f2 | cut -d '?' -f1 | sort | uniq -c | sort -n | tail -50


For a WordPress site, the result might look like this:

(The WordPress examples are given solely for illustration. The approach and commands described are not limited to this CMS: they can be used to analyze the web server logs of sites running on any PHP framework or content management system (CMS), as well as on plain PHP scripts.)



     1 /wp-admin/edit.php
     1 /wp-admin/index.php
     1 /wp-admin/update-core.php
     1 /wp-admin/upload.php
     2 /wp-admin/users.php
     3 /wp-admin/plugins.php
     4 /wp-includes/x3dhbbjdu.php
     4 /wp-admin/profile.php
     4 /wp-admin/widgets.php
    38 /wp-admin/async-upload.php
    58 /wp-admin/post-new.php
  1635 /wp-admin/admin-ajax.php
  6732 /xmlrpc.php
 14652 /wp-login.php




The result shows that wp-login.php received more than 14,000 hits, which is not normal. Apparently, the site was (or still is) under a brute-force attack aimed at guessing the admin panel password.



A large number of requests to xmlrpc.php may also indicate suspicious activity. For example, a site with the XML-RPC pingback vulnerability can be used to attack (DDoS) other WordPress sites.



The successful requests to /wp-includes/x3dhbbjdu.php look even more suspicious, since there is no such file in standard WordPress. On inspection, it turned out to be a hacker shell.



Thus, in just a few seconds, you can get statistics on calls to scripts, identify anomalies, and even find some hacker scripts without scanning the site.
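The three-step sample can also be wrapped in a small shell function, so that different patterns can be tried without temporary files. This is only a sketch: the function name top_paths, the temporary directory and the sample log lines are invented for illustration, and the logs are assumed to be named as above (access_log plus *.gz archives).

```shell
# top_paths DIR PATTERN [N]: count the request paths matching PATTERN in the
# current log and in all archived logs under DIR, then print the N most frequent.
top_paths() {
  dir=$1; pattern=$2; limit=${3:-50}
  {
    cat "$dir/access_log"
    find "$dir" -name '*.gz' -exec zcat {} \;
  } | grep -e "$pattern" \
    | cut -d '"' -f2 | cut -d ' ' -f2 | cut -d '?' -f1 \
    | sort | uniq -c | sort -n | tail -n "$limit"
}

# Made-up sample data: one current log and one archived log.
workdir=$(mktemp -d)
printf '%s\n' \
  '192.0.2.10 - - [10/Jul/2015:10:00:01 +0300] "POST /wp-login.php HTTP/1.1" 200 3500 "-" "Mozilla/5.0"' \
  '192.0.2.10 - - [10/Jul/2015:10:00:02 +0300] "POST /wp-login.php HTTP/1.1" 200 3500 "-" "Mozilla/5.0"' \
  '192.0.2.11 - - [10/Jul/2015:10:00:03 +0300] "GET /index.php HTTP/1.1" 200 5120 "-" "Mozilla/5.0"' \
  > "$workdir/access_log"
printf '%s\n' \
  '192.0.2.12 - - [09/Jul/2015:09:00:00 +0300] "POST /wp-login.php HTTP/1.1" 200 3500 "-" "Mozilla/5.0"' \
  > "$workdir/access_log.1"
gzip "$workdir/access_log.1"

# Top 10 successfully served .php scripts across both logs.
report=$(top_paths "$workdir" 'php HTTP.* 200' 10)
printf '%s\n' "$report"

rm -rf "$workdir"
```

In the logs directory itself, the call would be, for example, top_paths . 'php HTTP.* 404' to repeat the same count for another status.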



Now let's see whether there have been attempts to hack the site, for example probes for vulnerable scripts or requests to hacker shells. Find all requests for files with the .php extension that returned status 404 Not Found:



 find . -name '*.gz' -exec zcat {} \; | grep 'php HTTP.* 404' > php_404.txt
 grep 'php HTTP.* 404' access_log >> php_404.txt
 cut -d '"' -f2 php_404.txt | cut -d ' ' -f2 | cut -d '?' -f1 | sort | uniq -c | sort -n | tail -50


This time the result could be:



 1 /info.php
 1 /license.php
 1 /media/market.php
 1 /setup.php
 1 /shell.php
 1 /wp-admin/license.php
 1 /wp-content/218.php
 1 /wp-content/lib.php
 1 /wp-content/plugins/dzs-videogallery/ajax.php
 1 /wp-content/plugins/formcraft/file-upload/server/php/upload.php
 1 /wp-content/plugins/inboundio-marketing/admin/partials/csv_uploader.php
 1 /wp-content/plugins/reflex-gallery/admin/scripts/FileUploader/php.php
 1 /wp-content/plugins/revslider/temp/update_extract/revslider/configs.php
 1 /wp-content/plugins/ultimate-product-catalogue/product-sheets/wp-links-ompt.php
 1 /wp-content/plugins/wp-symposium/server/php/fjlCFrorWUFEWB.php
 1 /wp-content/plugins/wpshop/includes/ajax.php
 1 /wp-content/setup.php
 1 /wp-content/src.php
 1 /wp-content/themes/NativeChurch/download/download.php
 1 /wp-content/topnews/license.php
 1 /wp-content/uploads/license.php
 1 /wp-content/uploads/shwso.php
 1 /wp-content/uploads/wp-admin-cache.php
 1 /wp-content/uploads/wp-cache.php
 1 /wp-content/uploads/wp-cmd.php
 1 /wp-content/uploads/wp_config.php
 1 /wp-content/wp-admin.php
 1 /wp-update.php
 1 /wso2.php
 2 /wp-content/plugins/dzs-zoomsounds/ajax.php
 2 /wp-content/plugins/hello.php
 2 /wp-content/plugins/simple-ads-manager/sam-ajax-admin.php
 3 /wp-content/plugins/dzs-zoomsounds/admin/upload.php
 4 /2010/wp-login.php
 4 /2011/wp-login.php
 4 /2012/wp-login.php
 4 /wp-content/plugins/wp-symposium/server/php/index.php


As the result shows, there were such requests. Someone tried their luck at reaching a hacker shell in the directory of the presumably vulnerable Revolution Slider component (/wp-content/plugins/revslider/temp/update_extract/revslider/configs.php), the WSO shell in the site root, and a number of other hacker and vulnerable scripts. Fortunately, without success.

Using the same find / cat / zcat / grep commands, you can get the list of IP addresses from which these requests were made, along with the date and time of each request. But this is of little practical use.
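If those details are needed after all, a sketch like the one below pulls them out. It assumes the common "combined" access log format, in which whitespace-separated field 1 is the client IP, field 4 the timestamp, field 7 the request path and field 9 the status code; check your server's log format before relying on these positions. The sample lines are made up. Comparing the status field directly also avoids a false match that a pattern such as 'php HTTP.* 404' can produce when the response size happens to be 404 bytes.

```shell
# Made-up sample log: one real 404 and one 200 response that is 404 bytes long.
workdir=$(mktemp -d)
printf '%s\n' \
  '203.0.113.5 - - [10/Jul/2015:03:12:45 +0300] "GET /wso2.php HTTP/1.1" 404 208 "-" "Mozilla/5.0"' \
  '192.0.2.11 - - [10/Jul/2015:10:00:03 +0300] "GET /index.php HTTP/1.1" 200 404 "-" "Mozilla/5.0"' \
  > "$workdir/access_log"

# Print IP, timestamp (without the leading "[") and path for .php requests
# whose status field is exactly 404.
hits=$(awk '$9 == 404 && $7 ~ /\.php(\?|$)/ { print $1, substr($4, 2), $7 }' "$workdir/access_log")
printf '%s\n' "$hits"

rm -rf "$workdir"
```

For the archived logs, the same awk program can read the output of find . -name '*.gz' -exec zcat {} \; through a pipe instead of a file name.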



Sampling all successful POST requests is more useful, as it often helps to find hacker scripts.



 find . -name '*.gz' -exec zcat {} \; | grep 'POST /.* 200' > post.txt
 grep 'POST /.* 200' access_log >> post.txt
 cut -d '"' -f2 post.txt | cut -d ' ' -f2 | cut -d '?' -f1 | sort | uniq -c | sort -n | tail -50


The result might look like this:



    2 /contacts/
    3 /wp-includes/x3dhbbjdu.php
    7 /
    8 /wp-admin/admin.php
   38 /wp-admin/async-upload.php
  394 /wp-cron.php
 1626 /wp-admin/admin-ajax.php
 1680 /wp-login.php/
 6731 /xmlrpc.php
 9042 /wp-login.php


Here you can see many requests to wp-login.php and xmlrpc.php, as well as 3 successful POST requests to the script /wp-includes/x3dhbbjdu.php, which should not exist in WordPress; most likely it is a hacker shell.



Sometimes it is useful to look at a sample of all POST requests that returned 403 Forbidden.



 find . -name '*.gz' -exec zcat {} \; | grep 'POST /.* 403' > post_403.txt
 grep 'POST /.* 403' access_log >> post_403.txt
 cut -d '"' -f2 post_403.txt | cut -d ' ' -f2 | cut -d '?' -f1 | sort | uniq -c | sort -n | tail -50


In my case, the result looked like this. It is not much, although these could be attempts to exploit XML-RPC pingback.



  8 /xmlrpc.php 


Finally, you can select the TOP-50 most popular requests to the site today:



 cut -d '"' -f2 access_log | cut -d ' ' -f2 | cut -d '?' -f1 | sort | uniq -c | sort -n | tail -50 


We get:



    6 /wp-admin/images/wordpress-logo.svg
    6 /wp-admin/plugins.php
    7 /wp-admin/post-new.php
    8 /wp-admin/async-upload.php
    9 /sitemap.xml
   10 /wp-admin/users.php
   13 /feed/
   13 /wp-admin/
   20 /wp-admin/post.php
   22 /wp-admin/load-styles.php
   38 /favicon.ico
   52 /wp-admin/load-scripts.php
   58 /wp-cron.php
   71 /wp-admin/admin.php
  330 /wp-admin/admin-ajax.php
 1198 /
 2447 /wp-login.php


The number of requests to /wp-login.php in access_log confirms that the brute-force attack on the site is still going on (someone is trying to guess a password). You should therefore restrict access to wp-admin by IP or with server authentication, and if the WordPress site has no user registration, you can also restrict access to wp-login.php.
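Before restricting access, it may help to see which addresses generate the login attempts. The sketch below counts POST requests to /wp-login.php per client IP, assuming the IP is the first space-separated field of each log line (as in the common combined format). The temporary directory and the sample data are invented for illustration.

```shell
# Made-up sample log: three login POSTs from one address, one from another,
# and one ordinary GET of the login page.
workdir=$(mktemp -d)
printf '%s\n' \
  '203.0.113.5 - - [10/Jul/2015:03:12:45 +0300] "POST /wp-login.php HTTP/1.1" 200 3500 "-" "Mozilla/5.0"' \
  '203.0.113.5 - - [10/Jul/2015:03:12:46 +0300] "POST /wp-login.php HTTP/1.1" 200 3500 "-" "Mozilla/5.0"' \
  '203.0.113.5 - - [10/Jul/2015:03:12:47 +0300] "POST /wp-login.php HTTP/1.1" 200 3500 "-" "Mozilla/5.0"' \
  '192.0.2.11 - - [10/Jul/2015:10:00:03 +0300] "POST /wp-login.php HTTP/1.1" 302 0 "-" "Mozilla/5.0"' \
  '192.0.2.11 - - [10/Jul/2015:10:00:01 +0300] "GET /wp-login.php HTTP/1.1" 200 3500 "-" "Mozilla/5.0"' \
  > "$workdir/access_log"

# Count login POSTs per IP; the most active addresses end up at the bottom.
attackers=$(grep 'POST /wp-login.php' "$workdir/access_log" \
  | cut -d ' ' -f1 | sort | uniq -c | sort -n | tail -n 20)
printf '%s\n' "$attackers"

rm -rf "$workdir"
```

A handful of addresses with very high counts suggests a simple attack that an IP restriction can stop; many addresses with a few requests each point to a distributed one, where server authentication is the more practical option.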



Thus, without any specialized applications or additional tools, you can quickly analyze web server logs and find suspicious requests and their parameters (IP address, User-Agent, Referer, date and time). All you need is an SSH connection and basic command-line skills.

Source: https://habr.com/ru/post/261779/


