
Humansnotinvited: solving a captcha in bash

Greetings, dear reader!

Many of you have encountered a captcha, an automated Turing test that separates real people from bots. Recently, reCAPTCHA by Google Inc. has become very popular: it asks you to select the images that contain certain objects, for example, cars. A website appeared relatively recently, humansnotinvited.com, that does exactly the opposite: it separates bots from people.

The site greets visitors with a captcha made up of blurred squares:


After an unsuccessful attempt, you will see the message:

You're a human. You are not invited.

That said, several Reddit users still managed to get through the captcha.

Getting started


If you look closely, the captcha on the site has a very limited number of topics (what needs to be found in the picture), as well as a small database of images. So a picture can probably be correlated with a particular topic: the more often a picture appears together with a topic, the more likely it is that it shows the object of interest. It would be convenient to associate image file names with a topic. Unfortunately, the file names are different each time, so the name alone cannot tell us whether we have seen a picture before. Instead, we will save the pictures and compare them. There is no need to compare the pictures themselves; comparing their hash sums is enough, since identical pictures have identical hash sums.
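
The hash comparison idea can be tried offline. The following is a minimal sketch, assuming GNU coreutils (`md5sum`, `mktemp`); the file names and contents are made up for illustration and are not real captcha images.

```shell
#!/bin/bash
# Sketch: files with different names but identical bytes share one MD5 sum.
workdir=$(mktemp -d)

# Hypothetical stand-ins for downloaded captcha images.
printf 'same image bytes' > "$workdir/first.jpg"
printf 'same image bytes' > "$workdir/second.jpg"
printf 'other image bytes' > "$workdir/third.jpg"

# md5sum prints "<32 hex characters>  <file name>"; keep only the hash.
hash_first=$(md5sum "$workdir/first.jpg" | cut -c 1-32)
hash_second=$(md5sum "$workdir/second.jpg" | cut -c 1-32)
hash_third=$(md5sum "$workdir/third.jpg" | cut -c 1-32)

if [ "$hash_first" = "$hash_second" ]; then
    echo "first.jpg and second.jpg are the same picture"
fi
if [ "$hash_first" != "$hash_third" ]; then
    echo "first.jpg and third.jpg are different pictures"
fi

rm -r "$workdir"
```

The file name plays no role in the result, which is why the changing names on the site do not get in the way.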

First, the script downloads the page with the captcha:

 wget http://www.humansnotinvited.com/ &>/dev/null 

Determines the topic of the captcha:

 TYPE=`grep "value=\"[a-zA-Z0-9 ]\+\"" index.html -o | grep "\"[a-zA-Z0-9 ]\+\"" -o | sed "s/\"//g"` 
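
To see what this pipeline does, it can be run on a single line of markup. This is only a sketch: the `<input>` element below is an assumed shape of the page source, chosen so that it contains a `value="..."` attribute with the topic. The real page may look different.

```shell
#!/bin/bash
# Hypothetical fragment of index.html; the real markup is an assumption.
sample_html='<input type="hidden" name="category" value="spiders">'

# Step 1 keeps value="spiders", step 2 keeps "spiders", step 3 drops the quotes.
TYPE=$(printf '%s\n' "$sample_html" \
    | grep -o 'value="[a-zA-Z0-9 ]\+"' \
    | grep -o '"[a-zA-Z0-9 ]\+"' \
    | sed 's/"//g')

echo "$TYPE"
```

The character class includes a space, so topics made of several words are also captured.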

Gets the captcha image URLs:

 CAPTCHAS=`grep "img src=\".*?alt=\"\"" index.html -Po | grep "captcha/image.php.*?&id=[0-9]" -Po` 
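
The same two-step extraction can be checked on sample markup. In this sketch the `<img>` tags are an assumed shape inferred from the regular expressions above; the tokens are borrowed from the POST request example in this article. It requires GNU grep with `-P` support.

```shell
#!/bin/bash
# Hypothetical fragment of index.html with two captcha images.
sample_html='<img src="captcha/image.php?image_name=$1$adig2JKH$m9NKp.98MT8N5A8c.SEaw0&id=4" alt="">
<img src="captcha/image.php?image_name=$1$R8hAbOML$SERl/oIWWTGimhb5ywioG0&id=7" alt="">'

# The first grep cuts out each <img ...> tag body, the second keeps only the relative URL.
CAPTCHAS=$(printf '%s\n' "$sample_html" \
    | grep -Po 'img src=".*?alt=""' \
    | grep -Po 'captcha/image.php.*?&id=[0-9]')

# Print the URLs one per line with a running counter.
count=0
while IFS= read -r url; do
    count=$((count + 1))
    echo "$count: $url"
done <<< "$CAPTCHAS"
```

Note that `&id=[0-9]` matches a single digit, which is enough as long as the ids stay below 10.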

Downloads the captcha images, giving each a unique number, and immediately prints the topic paired with the image's md5sum:

 j=0
 for i in $CAPTCHAS
 do
     wget http://www.humansnotinvited.com/$i -O $j.jpg &>/dev/null
     WHERE=`md5sum $j.jpg | cut -c 1-32`
     echo $TYPE";"$WHERE
     let "j=j+1"
 done

Removes unnecessary files:

 rm index.html
 rm *.jpg

This is repeated ~1000 times to build a table with enough records that the number of repetitions of each entry shows whether a picture should be selected for a given topic.

All code
 #!/bin/bash
 for i in `seq 0 1000`
 do
     wget http://www.humansnotinvited.com/ &>/dev/null
     TYPE=`grep "value=\"[a-zA-Z0-9 ]\+\"" index.html -o | grep "\"[a-zA-Z0-9 ]\+\"" -o | sed "s/\"//g"`
     CAPTCHAS=`grep "img src=\".*?alt=\"\"" index.html -Po | grep "captcha/image.php.*?&id=[0-9]" -Po`
     j=0
     for i in $CAPTCHAS
     do
         wget http://www.humansnotinvited.com/$i -O $j.jpg &>/dev/null
         WHERE=`md5sum $j.jpg | cut -c 1-32`
         echo $TYPE";"$WHERE
         let "j=j+1"
     done
     rm index.html
     rm *.jpg
 done


We run the script and write its output to the file full; this table is needed later to solve the captcha:

 ./createTable.bash > full 

Solve the captcha


If you count how many times each record occurs in the resulting table, the counts fall into two groups of close values separated by a significant gap. The boundary runs through this gap: records below it belong to the set of incorrect answers, records above it to the set of correct answers. For ~1000 runs, the boundary is at 80-100 repetitions.
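
One way to look for the gap is to count the repetitions with `sort` and `uniq -c`. The sketch below uses a tiny made-up table in the same `topic;md5` format that createTable.bash prints; the hashes and the `THRESHOLD` value are assumptions for this toy data only.

```shell
#!/bin/bash
# Hypothetical miniature version of the "full" table.
table=$(mktemp)
{
    for n in 1 2 3 4 5; do echo "spiders;aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"; done
    for n in 1 2 3 4; do echo "spiders;bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb"; done
    echo "spiders;cccccccccccccccccccccccccccccccc"
    echo "cars;aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
} > "$table"

# Show how many times each record occurs, most frequent first.
sort "$table" | uniq -c | sort -rn

# Assumed boundary for this toy data; the article uses about 100 for ~1000 runs.
THRESHOLD=3

# Keep the records seen more than THRESHOLD times and strip the count column.
correct=$(sort "$table" | uniq -c \
    | awk -v t="$THRESHOLD" '{ count = $1; sub(/^ *[0-9]+ /, ""); if (count + 0 > t + 0) print }')

echo "Records above the boundary:"
echo "$correct"

rm "$table"
```

On the real table, the first command alone is enough to spot the gap by eye before a threshold is chosen.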

The answer is submitted as a POST request to www.humansnotinvited.com/ajax/sendCaptcha.php, and the server returns JSON.

POST request example
capthcaData[0][id]:4
capthcaData[0][token]:$1$adig2JKH$m9NKp.98MT8N5A8c.SEaw0
capthcaData[0][active]:false
capthcaData[1][id]:7
capthcaData[1][token]:$1$R8hAbOML$SERl/oIWWTGimhb5ywioG0
capthcaData[1][active]:false
capthcaData[2][id]:1
capthcaData[2][token]:$1$5M/tB252$iQm9NjRu1qNKcSC2wF/u4.
capthcaData[2][active]:true
capthcaData[3][id]:3
capthcaData[3][token]:$1$kn.4h2yQ$7nmRt19MKrtv/3sytU1Tj1
capthcaData[3][active]:false
capthcaData[4][id]:2
capthcaData[4][token]:$1$hv4Ku.BF$CDyWe7tHQXA1gt4ru7j.11
capthcaData[4][active]:true
capthcaData[5][id]:5
capthcaData[5][token]:$1$TzUr8bR9$vfbdKyNuebod8hRmNRxN51
capthcaData[5][active]:false
capthcaData[6][id]:8
capthcaData[6][token]:$1$QLT/VbgI$lNYNOnSiXyk905WbB9zPH1
capthcaData[6][active]:false
capthcaData[7][id]:9
capthcaData[7][token]:$1$1A3.rD88$lxrUf.VdCFyEJnNzir1wz1
capthcaData[7][active]:true
capthcaData[8][id]:4
capthcaData[8][token]:$1$Pf5m5yjV$PypMZUid/smpNx/qBKMRv1
capthcaData[8][active]:false
category:spiders


Solving the captcha is very similar to building the table. This time, though, the md5sum is computed not to add a record to the table but to look the image up in it. Along the way, the script prints the fields of the POST request. TOKEN is the image name, WHERE is the image's md5sum, and ID is its sequence number.

 j=0
 for i in $CAPTCHAS
 do
     wget http://www.humansnotinvited.com/$i -O $j.jpg &>/dev/null
     ID=`echo $i | grep "id=[0-9]\+" -o | grep "[0-9]\+" -o`
     TOKEN=`echo $i | grep "image_name=.*?&id" -Po | sed "s/image_name=//; s/&id//"`
     WHERE=`md5sum $j.jpg | cut -c 1-32`
     echo "&capthcaData[$j][id]=$ID"
     echo "&capthcaData[$j][token]=$TOKEN"
     if [[ $( grep "$TYPE;$WHERE" full -c ) -gt 100 ]]
     then
         echo "&capthcaData[$j][active]=true"
     else
         echo "&capthcaData[$j][active]=false"
     fi
     let "j=j+1"
 done
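
The ID and TOKEN extraction can be checked on a single URL without touching the site. In this sketch the URL shape is inferred from the regular expressions in the script, and the token is taken from the POST request example; it requires GNU grep with `-P` support.

```shell
#!/bin/bash
# Hypothetical relative URL of one captcha image (single quotes keep the $ signs literal).
url='captcha/image.php?image_name=$1$5M/tB252$iQm9NjRu1qNKcSC2wF/u4.&id=1'

# ID: cut out "id=1", then keep only the digits.
ID=$(printf '%s\n' "$url" | grep -o 'id=[0-9]\+' | grep -o '[0-9]\+')

# TOKEN: cut out "image_name=...&id", then strip the two markers around the token.
TOKEN=$(printf '%s\n' "$url" \
    | grep -Po 'image_name=.*?&id' \
    | sed 's/image_name=//; s/&id//')

echo "ID=$ID"
echo "TOKEN=$TOKEN"
```

Using `printf '%s\n'` with a quoted variable instead of an unquoted `echo` keeps the token intact even if it contains characters the shell would otherwise expand.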

All code
 #!/usr/bin/bash
 rm *jpg
 rm index*
 wget http://www.humansnotinvited.com/ &>/dev/null
 TYPE=`grep "value=\"[a-zA-Z0-9 ]\+\"" index.html -o | grep "\"[a-zA-Z0-9 ]\+\"" -o | sed "s/\"//g"`
 CAPTCHAS=`grep "img src=\".*?alt=\"\"" index.html -Po | grep "captcha/image.php.*?&id=[0-9]" -Po`
 j=0
 for i in $CAPTCHAS
 do
     wget http://www.humansnotinvited.com/$i -O $j.jpg &>/dev/null
     ID=`echo $i | grep "id=[0-9]\+" -o | grep "[0-9]\+" -o`
     TOKEN=`echo $i | grep "image_name=.*?&id" -Po | sed "s/image_name=//; s/&id//"`
     WHERE=`md5sum $j.jpg | cut -c 1-32`
     echo "&capthcaData[$j][id]=$ID"
     echo "&capthcaData[$j][token]=$TOKEN"
     if [[ $( grep "$TYPE;$WHERE" full -c ) -gt 100 ]]
     then
         echo "&capthcaData[$j][active]=true"
     else
         echo "&capthcaData[$j][active]=false"
     fi
     let "j=j+1"
 done
 echo "&category="$TYPE


Run:

 wget http://www.humansnotinvited.com/ajax/sendCaptcha.php --post-data "$(./gotcha.bash | tr -d "\n" | cut -c2-)"
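
The `tr` and `cut` part of this command turns the script's line-by-line output into a single request body. A minimal sketch of that step follows; `fake_output` is a hypothetical stand-in for ./gotcha.bash with made-up values in the same line format.

```shell
#!/bin/bash
# Hypothetical stand-in for ./gotcha.bash: one "&key=value" pair per line.
fake_output() {
    echo '&capthcaData[0][id]=4'
    echo '&capthcaData[0][token]=abc'
    echo '&capthcaData[0][active]=false'
    echo '&category=spiders'
}

# tr glues the lines together, cut drops the leading "&".
POST_DATA=$(fake_output | tr -d '\n' | cut -c2-)

echo "$POST_DATA"
```

The result is passed to wget as one quoted argument, so a topic that contains a space does not split the request body into several words.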

Who passed the captcha?


In response, the site returns a list of the IP addresses that have successfully passed the captcha. At the time of writing, the captcha had been passed from 3022 unique IP addresses.

Country           Number of IPs
Russia            1072
USA               450
Japan             307
Ukraine           162
Great Britain     104
France            98
Other countries   829

The geographical distribution of the bots that successfully passed the captcha. The country for each IP was looked up on dev.maxmind.com. Some IPs were attributed only to Europe (8 IPs) and to Asia / Pacific Region (1 IP); they were assigned to Denmark and Papua New Guinea, respectively. Hong Kong (6 IPs) was counted as China.



Growth of the total number of IPs for some countries relative to their position in the list:



Useful links:

Source: https://habr.com/ru/post/374997/

