
How I wrote a bot for AliExpress auctions

You may know that AliExpress runs auctions called Gaga Deals. These days they are a dumping ground, but they once had very interesting offers. The idea was this:



There were very attractive lots at a 90% discount: that year's top smartphones for 200-300 dollars, the previous year's for 100 dollars. Only 5 units of each item were up for grabs. I tried to win by hand and, of course, got nothing. So I decided to write a bot ...

First of all, I got a feel for how the whole thing works by doing it manually. I recorded all my attempts to win with Fiddler, then dug through its logs. While digging, I decided to write a bot. I started on a console application that sent requests over sockets and parsed the responses. The attractive lots were drawn first, and every attempt to win them by hand failed, but I did not get upset: I analyzed the traffic and programmed the steps I had worked out from it. The lots nobody wanted went through without trouble. Then I got one step further and ran into the captcha. The console application idea died on the spot.

It occurred to me to write a Chrome extension instead. The goal was to get me to the captcha quickly, put the cursor in the input field, and then click onward. No sooner said than done. Such an extension needs only 2 files: manifest.json and content.js. In the manifest we list content.js as an injected script and that's it: it now runs on every page. Inside it we select the necessary elements with document.querySelector and click them programmatically.

The bot, or rather the Helper, as it is better called at this stage, worked great: it ordered unwanted goods without problems, but the ones I wanted still sold out. Too slow.
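The two-file extension described above could look roughly like this. It is a minimal sketch, not the code of the real extension: the manifest keys (`manifest_version`, `content_scripts`, `matches`, `js`) are the standard Chrome ones, but the match pattern, the extension name, the CSS selectors and the helper `clickFirst` are placeholders invented for illustration, since the article does not give the page markup.

```javascript
// What manifest.json would contain, shown here as a JS object.
// Manifest version 2 was the current one in 2013; the match pattern is an assumption.
const manifest = {
  manifest_version: 2,
  name: "Auction Helper",
  version: "0.1",
  content_scripts: [
    { matches: ["*://*.aliexpress.com/*"], js: ["content.js"] }
  ]
};

// content.js: click the first element that matches any of the selectors.
// Returns the selector that was clicked, or null if nothing matched.
function clickFirst(doc, selectors) {
  for (const selector of selectors) {
    const element = doc.querySelector(selector);
    if (element) {
      element.click();
      return selector;
    }
  }
  return null;
}

// Hypothetical selectors for the "next step" buttons.
const STEP_SELECTORS = ["#buy-now-button", "#confirm-order-button"];

// In the browser the content script runs on every matching page.
if (typeof document !== "undefined") {
  clickFirst(document, STEP_SELECTORS);
}
```

Passing the document in as a parameter keeps the helper easy to try outside the browser with a stand-in object.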
Part of the problem was that the countdown on the site lagged behind real time, by perhaps 5 minutes per hour. On top of that, if you refreshed 2 pages at the same moment, their counters could load up to 2 seconds apart.

So the page had to be refreshed constantly while keeping track of the current counter. I opened several tabs and started writing a bot that exchanged information through the extension's localStorage, but inactive tabs gave me persistent problems: sometimes the script did not start, sometimes querySelector found nothing, sometimes the selected text element was empty and there was nothing to parse. The active page, meanwhile, always worked.
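The exchange of counters between tabs could be sketched as follows. This is an illustration under assumptions, not the original code: the key prefix `counter:`, the record shape and both function names are made up, and `storage` is any object with the standard Web Storage methods (`setItem`, `getItem`, `key`, `length`), such as `localStorage`.

```javascript
// Hypothetical naming scheme: one storage key per tab.
const KEY_PREFIX = "counter:";

// Each tab publishes the countdown it currently sees, with the time it was read.
function publishCounter(storage, tabId, remainingSeconds, now) {
  const record = { remainingSeconds: remainingSeconds, readAt: now };
  storage.setItem(KEY_PREFIX + tabId, JSON.stringify(record));
}

// Any tab can collect the readings published by all tabs.
function readCounters(storage) {
  const readings = [];
  for (let i = 0; i < storage.length; i++) {
    const key = storage.key(i);
    if (key !== null && key.startsWith(KEY_PREFIX)) {
      const record = JSON.parse(storage.getItem(key));
      readings.push({
        tabId: key.slice(KEY_PREFIX.length),
        remainingSeconds: record.remainingSeconds,
        readAt: record.readAt
      });
    }
  }
  return readings;
}
```

In a tab this would be called as `publishCounter(localStorage, "tab-1", 125, Date.now())`, where `"tab-1"` is again a made-up identifier.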

The cause seemed to be the browser's optimization of invisible tabs, so all pages had to stay in view. I put 9 frames right inside the main page, each constantly reloading a product page with a counter, but now I had no direct access to the frames. Strangely, all the pages were on the same domain and the script ran in the extension's context, yet every attempt to read frames[i].contentDocument was refused. It turned out that on the main page the Chinese developers set document.domain = "aliexpress.com", the second-level domain. I don't know why they need it; there are no other frames on the page. I tried to set it back, with no luck. It turns out document.domain can be shortened to a parent domain but not lengthened again, even when the page was loaded from the very domain you are trying to assign. I had to set the domain to the second level everywhere, and then I got direct access to the frames.

I parsed the time in each frame, picked the best one and left it alone until some other frame loaded with a better time, reloading the remaining frames in the meantime. It worked: the best counter was always ready for the click. Later I reduced the number of frames to 4.
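Choosing the best counter among the frames might be sketched like this. Several things here are assumptions rather than facts from the article: the countdown format (`HH:MM:SS` or `MM:SS`), the reading shape, all function names, and above all the definition of "best". The sketch assumes the site's counter only ever lags, so the frame that predicts the earliest end of the auction is taken as the most accurate one.

```javascript
// Both the parent page and every frame must set the same value
// before frames[i].contentDocument becomes accessible.
function relaxDomain(doc) {
  doc.domain = "aliexpress.com";
}

// Parse a countdown such as "01:02:03" or "02:03" into seconds (format assumed).
function parseCountdown(text) {
  const parts = text.trim().split(":").map(Number);
  if (parts.length < 2 || parts.length > 3 || parts.some(Number.isNaN)) {
    return null;
  }
  return parts.reduce((total, part) => total * 60 + part, 0);
}

// End of the auction as predicted by one frame, in milliseconds.
function estimatedEnd(reading) {
  return reading.readAt + reading.remainingSeconds * 1000;
}

// Assumed rule: the earliest predicted end is the least lagging counter.
function pickBest(readings) {
  let best = null;
  for (const reading of readings) {
    if (best === null || estimatedEnd(reading) < estimatedEnd(best)) {
      best = reading;
    }
  }
  return best;
}

// Every frame except the current best gets reloaded in search of a better time.
function framesToReload(readings) {
  const best = pickBest(readings);
  return readings.filter((r) => r !== best).map((r) => r.frameIndex);
}
```

Comparing predicted end times rather than raw counters matters because the frames are read at different moments; two counters showing the same number a second apart are not equally good.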

I debugged the extension on unwanted lots. When zero hour came, it led me safely to the captcha; I typed it in and pressed Enter, and the order was then placed automatically. Everything was ready, and I waited for the next draw.

During the next draw a surprise awaited me in the form of dialog boxes for choosing a color and/or configuration. I had not seen these before and my bot stopped on them. I quickly added the appropriate steps and ...

Everything sold out, every single time. Too slow, so what to do? Chinese and Indian captcha-solving services work no faster than I do. A decent recognition program can't be bolted onto an extension. I was already thinking of going back to the console application, and asked around at work whether anyone knew of modules or libraries for captcha recognition. I started looking at how the captcha is loaded and where its URL comes from. One peculiarity caught my eye: the captcha picture is loaded from a different domain, “checktoken1.alibaba.com”, with the session identifier inserted into the URL and nothing more, and each time the picture is refreshed the digits are different.

And then it dawned on me:
the captcha can be solved in advance.
Apparently it works like this: when a request arrives at the captcha server, the server generates a text and a picture for it. It stores the latest pairing of text and session id in a database, without checking that such a session exists at all or that the page with the captcha was ever opened. After the form is submitted, the text entered by the user is checked against the text from the database.
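The guessed behaviour can be modelled in a few lines. This is a toy model of the hypothesis above, not AliExpress code: the class and method names are invented, the "picture" is represented by its text, and the digits come from a deterministic stand-in instead of a random generator. The 15-minute lifetime is the value found experimentally later in the article.

```javascript
class GuessedCaptchaServer {
  constructor(ttlMs = 15 * 60 * 1000) {
    this.ttlMs = ttlMs;
    this.answers = new Map(); // session id -> { text, issuedAt }
    this.issued = 0;
  }

  // Image request: generate a text and remember it for the session id.
  // Note what is missing: no check that the session exists
  // or that the captcha form was ever opened.
  requestImage(sessionId, now) {
    const text = String(1000 + ((this.issued++ * 7919) % 9000));
    this.answers.set(sessionId, { text: text, issuedAt: now });
    return text; // stands in for the picture showing this text
  }

  // Form submission: compare the entered text with the latest stored one.
  checkAnswer(sessionId, enteredText, now) {
    const entry = this.answers.get(sessionId);
    if (!entry || now - entry.issuedAt > this.ttlMs) {
      return false;
    }
    return entry.text === enteredText;
  }
}

// Solve the captcha early, submit the form minutes later.
const captchaServer = new GuessedCaptchaServer();
const solvedEarly = captchaServer.requestImage("my-session-id", 0);
const acceptedLater = captchaServer.checkAnswer("my-session-id", solvedEarly, 5 * 60 * 1000);
```

The model also shows the catch: any new image request for the same session overwrites the stored text, so the early answer stays valid only while the order form itself does not fetch a fresh picture.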


I checked my guess like this: in the hosts file I blocked the domain checktoken1.alibaba.com, mapped checktoken1.alibaba.co to the required IP, and loaded the image in another tab (this was the part I feared most: with a fake domain the picture might not load at all, and I would have had to use a second computer). I refreshed the form with the captcha, entered the old value, and the form went through: the captcha was valid. I found experimentally that the session resets after 15 minutes.
I finished off the page with frames, added a text field and the picture from the other domain, and entered the captcha about 5 minutes before the draw. The extension worked like clockwork, provided of course that the lot was not a fake and no other evil robots were around.

Strictly speaking, I could now have gone back to the console application, which would run much faster without a browser, but mother laziness told me: why write it again when it works?

In the fall of 2013 the AliExpress developers changed something about the captcha: pre-solved captchas were no longer accepted, and my bot stopped working. In Helper mode, with the captcha solved by hand, the order is placed too slowly.

Results

Among the bigger wins: a Nikon D5100 for $246 in May 2013, a Nokia Lumia 800 for $136 in March 2013, and a BlackBerry Bold 9900 for $136. I never managed to win any of the top offers: either they are fakes, or bots meaner than mine are still grazing there. 2-4 months later I checked who had bought those top items: the buyers were only from Russia, Belarus and Ukraine.

Extension source code on GitHub

Source: https://habr.com/ru/post/228209/

