
Why Yandex dropped site confirmation via txt file

This story is about a funny set of circumstances and one tiny bug that, until recently, existed in the Yandex.Webmaster service. The chronology and some details given here are slightly modified to keep the narration concise, but the essence remains unchanged.



It all started when I noticed a strange quirk in the WordPress CMS. The first time I requested www.domain.com/non_existent_file.ext, my site returned the header “404 Not Found”; when I repeated the request, it returned “200 OK”. At first I assumed my own edits to the engine and the various crutches bolted onto it were to blame. But while diagnosing, at the stage of disabling plugins one by one, it turned out that the cause was the “W3 Total Cache” plugin. Without digging into the details, and thinking “I'll sort it out some other time,” I turned it back on and forgot about it.

A couple of months later I decided to add this site to Yandex.Webmaster. The service offered several ways to confirm ownership of a site. At the time they were:

- html file
- meta tag
- txt file
- through whois
- through dns


Since I had an SSH session to the server open at the time, the simplest option seemed to be the "txt file" method, whose instructions said that to get approved you need to:

1. Create a txt file named yandex_59306eb68da05077.txt with arbitrary content (it may be empty).

2. Upload it to the root directory of your site.

3. Make sure that the uploaded file opens at www.domain.com/yandex_59306eb68da05077.txt.

4. Click on the “Check” button.


The touch command took no time at all, and after I clicked the Check button, my site was successfully placed in the My Sites section.

I am an inveterate log reader, and during my next look through the logs I came across lines showing how the Yandex bot checked for the presence of this file. It looked like this:

... "GET /yandex_0250d52d00c8a904.txt HTTP/1.1" 404 1362 "-" "... YandexWebmaster/2.0; +http://yandex.com/bots)"
... "GET /yandex_59306eb68da05077.txt HTTP/1.1" 200 0 "-" "... YandexWebmaster/2.0; +http://yandex.com/bots)"
... "GET /yandex_11c01dd326a98199.txt HTTP/1.1" 404 1362 "-" "... YandexWebmaster/2.0; +http://yandex.com/bots)"


I got curious: what if the server always answers “200 OK”? I decided to “torture” the robot a little. When the server returned codes other than 200 and 404, the service reported something like: “For a given page (or page received after redirection), the server returns the status code http 502 (code 200 was expected).” If the bot received 200 for every request, it reported that as well, and the check did not pass.
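The behavior observed so far can be summarized as a small decision rule. The Python sketch below is only a guess at that logic, reconstructed from the logs and the error message; the function name `check_ownership` and the verdict strings are hypothetical, and nothing here is Yandex's actual implementation.

```python
from typing import Mapping


def check_ownership(statuses: Mapping[str, int], real_name: str) -> str:
    """Return a verdict for one verification attempt.

    `statuses` maps every file name the bot requested (the real
    verification file plus the random decoys) to the HTTP status received.
    This is an assumed model of the check, not the real one.
    """
    # Any status other than 200 or 404 aborts the check with an error.
    for code in statuses.values():
        if code not in (200, 404):
            return f"error: server returned {code} (200 was expected)"

    # Decoy names must not exist: a site answering 200 to everything fails.
    for name, code in statuses.items():
        if name != real_name and code != 404:
            return "failed: server answers 200 for nonexistent files"

    # Only the real verification file is allowed to answer 200.
    if statuses.get(real_name) == 200:
        return "confirmed"
    return "failed: file not found"
```

Under this model, a caching layer that turns a repeated 404 into a 200 for the real file name, while the fresh decoy names still answer 404, would produce the verdict "confirmed" even though the file was never uploaded.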

In the course of these experiments I managed, quite by chance, to get the site confirmed without the file being present at all. I did not expect this and started figuring out what had happened.

The sequence of requests in the log turned out to be this:

... "GET /yandex_2dd0e3403151c956.txt HTTP/1.1" 404 1362 "-" "... YandexWebmaster/2.0; +http://yandex.com/bots)"
... "GET /yandex_c220d5d90c8a331.txt HTTP/1.1" 404 1362 "-" "... YandexWebmaster/2.0; +http://yandex.com/bots)"
... "GET /yandex_d43c048a7be5a791.txt HTTP/1.1" 404 1362 "-" "... YandexWebmaster/2.0; +http://yandex.com/bots)"
... "GET /yandex_d22193589eac5880.txt HTTP/1.1" 404 1362 "-" "... YandexWebmaster/2.0; +http://yandex.com/bots)"
... "GET /yandex_c220d5d90c8a331.txt HTTP/1.1" 200 0 "-" "... YandexWebmaster/2.0; +http://yandex.com/bots)"
... "GET /yandex_6a5ec74b714c7856.txt HTTP/1.1" 404 1362 "-" "... YandexWebmaster/2.0; +http://yandex.com/bots)"
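To spot this pattern in an access log without reading it by eye, one could scan for paths whose status changed from 404 to 200. The following Python sketch assumes log lines shaped like the excerpts in this article (a quoted GET request followed by the status code); the function name `find_flips` and the shortened sample lines are illustrative only.

```python
import re
from typing import Dict, Iterable, List

# Quoted GET request followed by a three-digit status code.
# Whitespace around the slash in "HTTP/1.1" is tolerated.
REQUEST_RE = re.compile(r'"GET (\S+) HTTP\s*/\s*[\d.]+" (\d{3})')


def find_flips(lines: Iterable[str]) -> List[str]:
    """Return paths that answered 404 and then 200 on the next request."""
    last_status: Dict[str, int] = {}
    flipped: List[str] = []
    for line in lines:
        match = REQUEST_RE.search(line)
        if not match:
            continue  # not a GET request line
        path, status = match.group(1), int(match.group(2))
        if last_status.get(path) == 404 and status == 200 and path not in flipped:
            flipped.append(path)
        last_status[path] = status
    return flipped


sample = [
    '... "GET /yandex_c220d5d90c8a331.txt HTTP/1.1" 404 1362 "-" "..."',
    '... "GET /yandex_d43c048a7be5a791.txt HTTP/1.1" 404 1362 "-" "..."',
    '... "GET /yandex_c220d5d90c8a331.txt HTTP/1.1" 200 0 "-" "..."',
]
print(find_flips(sample))  # ['/yandex_c220d5d90c8a331.txt']
```

The sketch only remembers the most recent status per path, which is enough to catch a nonexistent file that suddenly starts answering 200.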


After repeating the trick to be sure, I remembered the WordPress quirk described above. I wondered whether this happens elsewhere, and how often.

I wrote a simple PHP checker that sent two requests for the same nonexistent file name and recorded the sites where the trick worked. The “Alexa Top 1,000,000” was chosen as the test list.
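The original checker was written in PHP and is not reproduced here. As a rough illustration of the idea, here is a minimal Python sketch; it assumes that a "successful result" means 404 on the first request and 200 on the second, and the names `http_status` and `serves_cached_404_as_200` are hypothetical.

```python
import secrets
import urllib.error
import urllib.request
from typing import Callable


def http_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code of a GET request to `url`."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx responses; the code is what we want.
        return err.code


def serves_cached_404_as_200(
    domain: str, fetch: Callable[[str], int] = http_status
) -> bool:
    """Request the same nonexistent file twice and compare the statuses."""
    # A random name that almost certainly does not exist on the site.
    url = f"http://{domain}/{secrets.token_hex(8)}.txt"
    first = fetch(url)
    second = fetch(url)
    # Assumed success criterion: 404 first, then 200 for the same URL.
    return first == 404 and second == 200
```

The `fetch` parameter exists so the comparison logic can be exercised with a stub instead of real network traffic. Connection failures (for example `urllib.error.URLError`) are not handled in this sketch and would need to be caught when iterating over a large domain list.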

The result was not spectacular, but there was one: about 1,500 domains in various zones turned up. Looking through the resulting list, it became clear that the “W3 Total Cache” plugin alone was not to blame, since sites with Quick Cache, Max Cache, and other plugins installed also passed the “test”. They had only one thing in common: most of them used WordPress. As it turned out, there was one more dependency: file-based caching had to be enabled. In W3TC this option is called "Disk (Enhanced)". Unfortunately, I am not a big fan of digging through code, so the reason for this behavior remains unknown to me.

I should also note that on some of the sites no trace of WordPress was found. Perhaps it was skillfully disguised, or a similar bug arises somewhere else.

Then I sent a report to the Yandex Bug Bounty program, with the following content:

Hello.
For some site configurations, the “txt-file” confirmation method may allow an attacker to confirm ownership of a site.

As far as I can tell, this is connected with various caching mechanisms.

For example, with the popular CMS WordPress combined with caching plugins such as W3TC, Quick Cache, Max Cache, and the like, the server returns the 404 header only the first time a nonexistent file is requested; the second time, the answer is 200 OK. It is worth noting that the WP + caching plugin combination is not vulnerable by itself: there is some further dependency, but finding it requires studying the engine code.

This is what the Yandex bot's first check for the .txt file looks like:

... "GET /yandex_2dd0e3403151c956.txt HTTP/1.1" 404 136 "-" "... YandexWebmaster/2.0; +http://yandex.com/bots)"
... "GET /yandex_c220d5d90c8a331.txt HTTP/1.1" 404 136 "-" "... YandexWebmaster/2.0; +http://yandex.com/bots)"
... "GET /yandex_d43c048a7be5a791.txt HTTP/1.1" 404 136 "-" "... YandexWebmaster/2.0; +http://yandex.com/bots)"

And the second check looks like this:

... "GET /yandex_d22193589eac5880.txt HTTP/1.1" 404 136 "-" "... YandexWebmaster/2.0; +http://yandex.com/bots)"
... "GET /yandex_c220d5d90c8a331.txt HTTP/1.1" 200 0 "-" "... YandexWebmaster/2.0; +http://yandex.com/bots)"
... "GET /yandex_6a5ec74b714c7856.txt HTTP/1.1" 404 136 "-" "... YandexWebmaster/2.0; +http://yandex.com/bots)"

Several sites with similar behavior and WordPress installed:

htmldoc.ru
laminortv.ru
www.comediatv.ru

And in these cases:

www.3dnews.ru
rutv.ru
tvkultura.ru
marker.ru

Some other caching mechanism is at work, but the behavior is identical.

This report is also worth noting: hackerone.com/reports/477. At that time, the Yandex bot's “txt-file” check would most likely have succeeded there. Who knows how many more sites contain similar "functionality"?


This was my first experience with the Yandex program, and I was pleasantly surprised by how promptly the company's employees responded. They answered me the same day and started working on my report, and the next day they awarded me 41,337 rubles (approximately $700 at the time). The only thing I can find fault with is that, in my opinion, the number 313373 would have looked nicer. By and large, though, this combination of circumstances was of little use to attackers. It was unsuitable for targeted attacks because of the large number of dependencies, and it is just as hard to imagine extracting any material benefit from it, except perhaps by selling XML limits. So I am pleased with Yandex's award, since at the time of sending I didn't expect anything special.

As a moral bounty, a few days after I sent the report, the fate of the txt file was sealed (RIP):
...

...


PS



I hope that he did it all :)

Source: https://habr.com/ru/post/276739/

