
Refining a Squid log parser to correctly display resources visited over HTTPS

Hello! I have received, and still receive, many letters with questions about Squid set up according to my article. The most frequent question is about viewing Squid logs with a log parser. The problem is that Squid 3.5.8 configured for transparent HTTPS proxying logs resources visited over HTTPS not as domain names but as IP addresses with ports (for example, 164.16.43.56:443). As a result, visitor statistics show these bare IP addresses instead of human-readable names, and it is rather hard to collect statistics from such data. I contacted the Squid developers about this but never received a clear answer. The only thing I found out is that logging works properly in newer versions of Squid, but transparent proxying on those versions did not work properly for me. So the question arose of how to resolve the IP addresses in the parser's output.

Personally, I use the Screen Squid parser, and that is where I decided to make the necessary changes. Since I also need this kind of resolving when working in the terminal, I decided to write the whole resolving step as a Bash script and call it from PHP in Screen Squid when needed.

So, for all of this we need:

  1. The Screen Squid parser itself (I will not reproduce the installation instructions; everything is on the official site).
  2. grep
  3. sed
  4. nslookup
  5. whois
  6. Steady hands

The Bash script itself is the following:
    #!/bin/bash
    # Input parameter: the IP address to look up
    IP="$1"

    # Look up the IP with nslookup, using grep and sed
    # to filter out unneeded information
    hostname=$(nslookup "$IP" | grep -m 1 "name" | sed 's|.*= ||' | sed -r 's/ Auth.+//' | sed 's/^[ \t]*//;s/[ \t]*$//')

    # If nslookup returned nothing,
    # look up the IP with whois instead, likewise
    # filtering the output with grep and sed
    if [[ "$hostname" == '' ]]; then
        hostname=$(whois "$IP" | grep -m 1 "owner\|OrgName\|orgname\|NetName\|netname\|origin" | sed 's|.*: ||' | sed -r 's/. Auth.+//' | sed 's/^[ \t]*//;s/[ \t]*$//')
    fi

    # Print the result
    echo "$hostname"
    exit 0

The script is already commented, so there is little to add. We first look up the IP address with nslookup, filtering the command's output through grep and sed to strip unneeded information; if that returns nothing, we fall back to whois. To avoid writing a pile of separate lines, grep's alternation (the "\|" sequence) is used to match several patterns at once. Save the script in any convenient place and make it executable. Suppose it is saved in /usr/bin as gethost.sh.
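To see what the grep and sed stages do without touching the network, the same kind of filter can be run on canned text. This is only a sketch: the function name extract_owner and the sample lines are made up for illustration and are not real whois output for any address, and the step that strips the " Auth..." suffix is left out for brevity.

```shell
#!/bin/bash
# Hypothetical helper (not part of gethost.sh): applies the whois-style
# filters to text read from stdin instead of live whois output.
extract_owner() {
    # Take the first line matching any of the alternatives, drop
    # everything up to the last ": ", then trim surrounding whitespace.
    grep -m 1 "owner\|OrgName\|orgname\|NetName\|netname\|origin" \
        | sed 's|.*: ||' \
        | sed 's/^[ \t]*//;s/[ \t]*$//'
}

# Made-up sample resembling whois output (illustrative only).
sample='inetnum:        192.0.2.0 - 192.0.2.255
netname:        EXAMPLE-NET
descr:          Example network'

printf '%s\n' "$sample" | extract_owner   # prints EXAMPLE-NET
```

Only the first matching line is used because of -m 1, so the order of the fields in the real whois reply decides which name ends up in the report.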

The script can be used simply from the terminal:

 gethost.sh ip_address 

Next, I will show how to hook this script into Screen Squid. Suppose it is installed in /var/www/html. That folder has a reports subfolder containing the file reports.php, and that is the file to change. In it, find these lines:

    $result=mysql_query($queryOneIpaddressTraffic) or die (mysql_error());
    $numrow=1;
    $totalmb=0;
    while ($line = mysql_fetch_array($result,MYSQL_NUM)) {
        echo "<tr>";
        echo "<td>".$numrow."</td>";
        if($enableUseiconv==1)
            $line[0]=iconv("CP1251","UTF-8",urldecode($line[0]));
        echo "<td><a href='http://".$line[0]."' target=blank>".$line[0]."</a></td>";

And instead of the last line, insert the following:

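    // Check whether this is an HTTPS entry (it contains ':').
    // If not, it is plain HTTP, so print it unchanged.
    $dv=strpos($line[0], ":");
    if ($dv < 1) {
        echo "<td><a href='http://".$line[0]."' target=blank>".$line[0]."</a></td>";
    } else {
        // The entry contains ':', so it is HTTPS
        // and needs to be resolved.
        // Take the IP address, i.e. everything before ':'
        $str1=strpos($line[0], ":");
        $row1=substr($line[0], 0, $str1);
        $ipaddress = ltrim($row1);
        // Pass the IP address to gethost.sh (escaped for the shell)
        $hostname = trim((string) shell_exec('/usr/bin/gethost.sh ' . escapeshellarg($ipaddress)));
        // Print the resolved name instead of the IP address
        echo "<td><a href='https://".$ipaddress."' target=blank>".$hostname."</a></td>";
    }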

The code was written in a hurry, but it works. It takes effect when the report “Traffic of users of the IP address” is opened; for the most part that is the only report I need. If you wish, you can add similar code to any other report.

The code itself is quite simple. First it determines which kind of resource is being shown in the current table row, HTTP or HTTPS. If it is HTTPS (detected by the presence of the ":" character), we separate the IP address from the port, pass the IP address to the gethost.sh script, take the script's output as the information about that IP address, and display it.
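For use in the terminal, roughly the same branch can be sketched in Bash. This is an illustration, not part of Screen Squid: the function name link_for and the sample entries are hypothetical, and the call to the resolver is left out so the sketch runs offline.

```shell
#!/bin/bash
# Hypothetical helper: roughly mirrors the PHP branch described above.
# Entries containing ':' are treated as HTTPS "ip:port" pairs;
# everything else is treated as a plain HTTP host name.
link_for() {
    local entry="$1"
    if [[ "$entry" == *:* ]]; then
        # Keep everything before the first ':' (the IP address)
        local ip="${entry%%:*}"
        echo "https://$ip"
    else
        echo "http://$entry"
    fi
}

link_for "192.0.2.10:443"   # prints https://192.0.2.10
link_for "example.com"      # prints http://example.com
```

In a real pipeline the extracted address would then be handed to gethost.sh, in the same way the PHP code does.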

I did consider writing the resolved names into the database right away, but resolving this way while the database is being filled takes long enough for many cups of coffee, so I dropped the idea.

Oh yes, I almost forgot: the script must be on the same server as the Screen Squid parser. Just mentioning it.

If you have suggestions for improving, refining, or reworking this script, I will be glad to hear them.
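One way to cut down the waiting, should anyone want to experiment with it, is to resolve each distinct address only once. The sketch below is just that, a sketch: the function name unique_ips and the sample entries are hypothetical, and it assumes the input is a plain list of "ip:port" entries, one per line.

```shell
#!/bin/bash
# Hypothetical helper: reads "ip:port" entries on stdin and prints each
# distinct IP address once, so that every address is looked up only once.
unique_ips() {
    # Strip the port (everything from the first ':'), then deduplicate
    sed 's/:.*//' | sort -u
}

printf '%s\n' "192.0.2.10:443" "192.0.2.10:443" "198.51.100.7:443" | unique_ips
```

The resulting list could then be fed to the resolver in a loop, for example with while read -r ip; do gethost.sh "$ip"; done.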

Additions:

I have since done it a little differently, in a way that seems more informative to me, as user kbool rightly pointed out. You can fetch the SSL certificate of the host directly from PHP and read the information of interest from it. Below is the code to insert into reports.php instead of the code above:
    // Check whether this is an HTTPS entry (it contains ':').
    // If not, it is plain HTTP, so print it unchanged.
    $dv=strpos($line[0], ":");
    if ($dv < 1) {
        echo "<td><a href='http://".$line[0]."' target=blank>".$line[0]."</a></td>";
    } else {
        // The entry contains ':', so it is HTTPS
        // and needs to be resolved.
        // Take the IP address, i.e. everything before ':'
        $str1=strpos($line[0], ":");
        $row1=substr($line[0], 0, $str1);
        $ipaddress = ltrim($row1);
        // Fall back to the bare IP address if no certificate can be read
        $hostname = $ipaddress;
        // Get information about the IP address from its SSL certificate
        ///////////////////////////////////////////////////////////
        $options = array(
            "ssl" => array(
                "capture_peer_cert" => true,
                "capture_peer_cert_chain" => false,
                "verify_peer" => false,
                "verify_peer_name" => false,
                "allow_self_signed" => false
            )
        );
        $get = stream_context_create($options);
        $read = stream_socket_client("ssl://".$ipaddress.":443", $errno, $errstr, 30, STREAM_CLIENT_CONNECT, $get);
        if ($read !== false) {
            $cert = stream_context_get_params($read);
            $certinfo = openssl_x509_parse($cert['options']['ssl']['peer_certificate']);
            $certinfo = $certinfo['name'];
            // Take everything after "CN=" as the host name
            $CN=strpos($certinfo,"CN=")+3;
            $CN_end=strlen($certinfo);
            $hostname = substr($certinfo, $CN, $CN_end);
            fclose($read);
        }
        ////////////////////////////////////////////////////////////
        // Print the certificate name instead of the IP address
        echo "<td><a href='https://".$ipaddress."' target=blank>".$hostname."</a></td>";
    }
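The CN extraction at the end of that snippet is plain string handling, so it can be tried separately in Bash. The sketch below assumes the subject string has the slash-separated form that the certificate's name field is expected to have; the function name cn_from_subject and the sample subject are hypothetical. Unlike the PHP code, it cuts the value at the next "/" in case CN is not the last field.

```shell
#!/bin/bash
# Hypothetical helper: pulls the CN value out of a subject string such as
# "/C=US/O=Example Org/CN=www.example.com".
cn_from_subject() {
    local subject="$1"
    # No CN field: print nothing and signal failure
    [[ "$subject" == *CN=* ]] || return 1
    # Drop everything up to and including the first "CN="
    local rest="${subject#*CN=}"
    # Cut at the next '/', if there is one
    echo "${rest%%/*}"
}

cn_from_subject "/C=US/O=Example Org/CN=www.example.com"   # prints www.example.com
```

Returning a non-zero status when there is no CN field lets the caller fall back to showing the bare IP address.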

Source: https://habr.com/ru/post/307686/

