
How I fought theft... with PHP


When we pay daily for services, that is a purchase of services.
When we pay daily for nothing (sometimes without even knowing it), that is theft.

Good afternoon, readers of Habr!

How it all began


I wanted there to be less stealing, so I decided to fight it! But doing this by hand was tedious, slow and ineffective, and so the thought came to automate the job somehow.

Which kind of “theft” do I mean? The kind where, while browsing the Internet, we tap a "watch video" button, some page loads, the video for some reason does not play, and we leave and move on, while in fact we have "voluntarily" signed up for a service that delivers something no one has ever seen, for a nominal fee of 30 rubles a day charged to our mobile account. People call this wap-click or mobile subscriptions, and mobile operators invent a variety of prettier names. After all, they cannot put “theft via a video button” on their list of services.

A little more detail is here. And here is a story about a good way to "earn."

There are many documented cases of not-quite-voluntary subscriptions, for example. The undocumented ones are far more numerous.

There are also people fighting this:


What and why was automated


Searching for and blocking ads in the Google AdSense publisher panel.
The goal is to make blocking more effective and to free up the time spent on manual cleaning.

The essence of the problem and the existing solutions
For many years (the earliest mention of this kind of thing that I found dates to the summer of 2014), publishers have been manually catching streams of “deaths of Yakubovich”, “stone risers”, “watch video, watch video” and other such filth (beginning, continued). This process was hardly ever automated [1], and automating it seemed almost impossible.

[1] There are (or at least there were) two solutions, but both have fairly serious requirements that not everyone can meet.
These are the solutions:

  1. AdSense Cleaner. Requires a lot of additional software.
  2. AdsAutomation. A script that controls the Google Chrome browser (as I understand it, built on ZennoPoster). Requires a separate PC. At the moment the project has been deleted from GitHub.

If you build software to replace the person who blocks ads, it has to be built with a number of requirements in mind.

All in all, PHP (with cURL) is just what is needed. You can drop it straight onto your site and run it without extra computers or other complications.

And one refinement to the requirements.
Since the solution was meant to run automatically in PHP, and therefore to be launched from cron, user settings and temporary data have to be stored on disk (not in cookies). Only the key for accessing the control panel is stored in a cookie. For those who cannot configure cron but can keep one tab open on a PC, tablet or smartphone, there will be an option to run periodically on a JavaScript timer.
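
The storage rule above (settings on disk, only an access key in the cookie) could be sketched roughly like this. This is only an illustration under my own assumptions: the function names, the JSON file format and the storage directory are hypothetical and are not taken from the actual tool.

```php
<?php
// Hypothetical helpers: settings live in one JSON file per access key.

function settingsPath(string $dir, string $accessKey): string
{
    // Hash the key so the raw cookie value never becomes a file name.
    return rtrim($dir, '/') . '/' . hash('sha256', $accessKey) . '.json';
}

function saveSettings(string $dir, string $accessKey, array $settings): void
{
    if (!is_dir($dir)) {
        mkdir($dir, 0700, true);
    }
    // LOCK_EX keeps a cron run and a browser-triggered run from writing at once.
    file_put_contents(settingsPath($dir, $accessKey), json_encode($settings), LOCK_EX);
}

function loadSettings(string $dir, string $accessKey): array
{
    $path = settingsPath($dir, $accessKey);
    if (!is_file($path)) {
        return []; // unknown key: no settings
    }
    $data = json_decode(file_get_contents($path), true);
    return is_array($data) ? $data : [];
}
```

A cron job would call `loadSettings()` with a key it knows, while the control panel would read the same key from the cookie; neither needs the settings themselves to travel through the browser.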

What preceded the start, or the Google API


AdSense does have an API; I had once glimpsed it out of the corner of my eye without digging in. Now it was time to look properly. It offers many features, but it turned out that neither here nor here is anything documented about an API for the ad review center (ARC). Want to look at the ads running on your site? By all means, but manually.

Start


The Google AdSense interface is built on AngularDart; everything looks nice and is fairly complex under the hood.

First of all, I opened the “Network” tab of Google Chrome’s developer tools to “eavesdrop” on how this clever interface talks to the server. There were a lot of requests; the most interesting for me were in the “XHR and Fetch” section, where I found something that looked entirely decipherable, given some thought. For example, one of the POST requests:

String to be passed.
{"method":"searchArcApprovals","params":"{\"1\":\"ca-pub-8958890276790964\",\"2\":{\"1\":0,\"2\":1,\"3\":0,\"4\":{\"1\":{\"1\":\"AClZvXKL6S3HChRty5YBa81BLWDBQkb3FYDsifZ9V/mBTKbOGlj3gMWVpzTtXggA1880Le9NyVZIicNm/4pz724e/MO8fyLfjOReF205cyjLV9C8OCCeKe7VvZHyvyKpXh8x9smTQ0n8qIIqzuIXle5UK0hD4VBkZDvy//qoSPRCr94UtWYqqi//Rot22LJ2JFNjWEGb4n1YQbAw0cKWPR3LAugPBajInWXEFGWJRTnmY2TkI5VzUzIkcXpJ/bkajn3c8GnecCfFNvNhGLS10VXdRwiykngG3xfoMTRhQOR5GXbm4kwdIhzQUM/d6xP0Xda3FOIZGGk9bymneg+9oDY+rMFiRfDFCb66g50t9J9r++oHXjek09Ci1rqC7LOw2pvkqp3hjG6RyVmsiT/eWGq+OsfjE7CgRk43QIRMSa+jlZBQhARUPlpUXzyZyoTiIPTRZ5ND/4MnIMqaUWSRoDGffiE/XkHJPEkNZtLX2XR5gZ3x5/K+ejU/fqxfZIjI6A3kueJybNA46wSLbmflhDCGDJEE2aeYemLFGqNzFG43B80LzU3yuwgZhrLu/jaMvBJozi0nq+gXEz6r+8tic4fvsQ9lWDA+IXzXw6MKzamgfWV0ORGDW0+966KIY6IkjtIlNRKGyp3pSAd2Po+br4Dl4WNwSkMdmuV60wOrkb5BpnKZKIhDtpjWF7q6ly3FFhwo8Ktdq5ddVJ8ijJ9Y9tQhs2O0idA9N0yV86khV1IQ72OgbMv15qAswnbqF9WCo3qpfJNjJqMCHBRTohPCxhRp0cWz2thszZTmDDADPxU46sclnurd/JxHFO7lJZVdrsFB4vdLIx9kObV3bP1gOpU66kdcmom2tiedknugj7s0jLcgf1EfXnp+SUUAQyoqwS+kdhhQtGqSXgI2TopsuaLVzj+EtAuPwWeLvtI9CFPSe4o2x+gjCRPl8wVvWKV5FIrZavUVOAHZIL4nKyJjHxZi3jPfVnAia/hq1gW6XKoCg1eWGg/cAWZY4mZYQ6W4XnC0MY0uMC6fhPQdXnIS5iLZNhan80jbr/leBr4fO22+tXc6oZpZsDkXd0r3ilBJFPS2I/zAhotuzZgNA+nF2N86pyiSrdeEYFDhKWKadcKAVc3BMxxlrqZYcAXnlus9GW7R9F/ImXQ/fjRfSjVRUaJuQ0EnFejNAwdGcS6STYMa1G0wnNMAKcZ52xcHgil1SZ6N9BQ7A27z6eViOxw0LHBqNJIRZwQml2KjPd5b00D9XvohDr6jBqYXLGS/HMVvpGDJZLDI2LRlmkqBqx7YEgDZqvspeoMLHIJP22SkQDnaJtsOLGVBSi20ZD5nRyjAgS6MmcgFCvfJVWjCIL1RPHqmUU90eK4WXve0ayH9cJnpbtWrkXYCibhVPCMmYowMROw7rI4bPir0\"}}}}","xsrf":"ABOvogKvrE9fIqAKh0w02RIsB4OJ4hsB_g:1535467885347"} 

The request immediately reveals three things: the publisher ID; under key "2", a set of parameters whose meaning can be worked out experimentally; and the XSRF token.
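
One detail worth seeing in code is that "params" is itself a JSON string inside the outer JSON, so it has to be decoded twice. A minimal sketch follows; `decodeRpcRequest` is a hypothetical helper name, and the sample body is a shortened stand-in for the captured request with placeholder ID and tokens.

```php
<?php
// Decode the outer envelope, then the "params" string, which is JSON again.
function decodeRpcRequest(string $body): array
{
    $outer = json_decode($body, true);
    if (!is_array($outer) || !isset($outer['method'], $outer['params'])) {
        throw new InvalidArgumentException('Not a recognised request body');
    }
    return [
        'method' => $outer['method'],
        'params' => json_decode($outer['params'], true),
        'xsrf'   => $outer['xsrf'] ?? null,
    ];
}

// Shortened stand-in for the captured request (placeholder ID and tokens).
$body = '{"method":"searchArcApprovals","params":"{\"1\":\"ca-pub-0000000000000000\",'
      . '\"2\":{\"1\":0,\"2\":1,\"3\":0,\"4\":{\"1\":{\"1\":\"TOKEN\"}}}}","xsrf":"XSRF:1535467885347"}';

$request = decodeRpcRequest($body);
echo $request['method'], "\n";       // method name
echo $request['params']['1'], "\n";  // publisher ID
```

With the parameters available as a plain array, changing one value at a time and replaying the request is an easy way to find out what each numbered key does.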

The response contains detailed information about the ad, though not all of it, and not even the ad itself (here and below, the base64-encoded images are truncated).

A wall of text several pages long.
 {"result":{"1":[{"1":0,"3":0,"4":{"1":"AClZvXJ2t4wiEZ/VZ0i54m0Qtqpi2DTqkI1kaPMTRi4LnsQn0iR5K1xBlFpS1xmJV7ko4a6qx5RcTkp7CzVjwoy5UDSWZ5jOCPLGRcoQdDt+wOk46bdr0yA\u003d"},"5":{"1":82,"2":0,"3":0,"4":"\u003cdiv id\u003d\"ad-parent-id-6A2DE3D206234468F53C743C0EEACD67A59E6C5B62C0371F770419826258CB1AD9591F60\"\u003e\u003c/div\u003e","5":"\u003cdiv id\u003d\"ad-parent-id-6A2DE3D206234468F53C743C0EEACD67A59E6C5B62C0371F770419826258CB1AD9591F60\"\u003e\u003c/div\u003e","6":"\u003cdiv\u003e\u041c\u043d\u043e\u0433\u043e\u0444\u043e\u0440\u043c\u0430\u0442\u043d\u044b\u0435\u003cspan id\u003d'multi-format-tooltip'\u003e\u003c/span\u003e\u003c/div\u003e\u003ca class\u003d'arc-url-link-ellipsis' target\u003d'_blank' href\u003d'https://www.dns-shop.ru/actions/c09a061b-a048-11e8-9547-00155d03330d/' title\u003d'https://www.dns-shop.ru/actions/c09a061b-a048-11e8-9547-00155d03330d/'\u003ehttps://www.dns-shop.ru/actions/c09a061b-a048-11e8-9547-00155d03330d/\u003c/a\u003e","7":"\u003cdiv class\u003d'arc-one-by-one-legend'\u003e\u0422\u0438\u043f \u043e\u0431\u044a\u044f\u0432\u043b\u0435\u043d\u0438\u044f\u003c/div\u003e\u003cdiv class\u003d'arc-one-by-one-data'\u003e\u041c\u043d\u043e\u0433\u043e\u0444\u043e\u0440\u043c\u0430\u0442\u043d\u044b\u0435\u003cspan id\u003d'multi-format-tooltip'\u003e\u003c/span\u003e\u003c/div\u003e\u003cdiv class\u003d'arc-one-by-one-legend'\u003e\u0426\u0435\u043b\u0435\u0432\u043e\u0439 URL\u003c/div\u003e\u003cdiv class\u003d'arc-one-by-one-data'\u003e\u003ca class\u003d'arc-url-link-ellipsis' target\u003d'_blank' href\u003d'https://www.dns-shop.ru/actions/c09a061b-a048-11e8-9547-00155d03330d/' title\u003d'https://www.dns-shop.ru/actions/c09a061b-a048-11e8-9547-00155d03330d/'\u003ehttps://www.dns-shop.ru/actions/c09a061b-a048-11e8-9547-00155d03330d/\u003c/a\u003e\u003c/div\u003e\u003cdiv class\u003d'arc-one-by-one-legend'\u003e\u0414\u043e\u043c\u0435\u043d\u044b \u0438\u0437\u0434\u0430\u0442\u0435\u043b\u0435\u0439\u003c/div\u003e\u003cdiv 
class\u003d'arc-one-by-one-data'\u003e4aynikam.ru\u003c/div\u003e\u003cdiv class\u003d'arc-one-by-one-data'\u003eandroidphone.su\u003c/div\u003e\u003cdiv class\u003d'arc-one-by-one-data'\u003eandroidphones.ru\u003c/div\u003e\u003cdiv class\u003d'arc-one-by-one-data'\u003efull-repair.com\u003c/div\u003e\u003cdiv class\u003d'arc-one-by-one-data'\u003ehowgadget.com\u003c/div\u003e\u003cdiv class\u003d'arc-one-by-one-legend'\u003e\u041e\u0431\u043d\u0430\u0440\u0443\u0436\u0435\u043d\u043d\u044b\u0439 \u0440\u0435\u043a\u043b\u0430\u043c\u043e\u0434\u0430\u0442\u0435\u043b\u044c\u003cspan id\u003d'adx-advertiser-tooltip'\u003e\u003c/span\u003e\u003c/div\u003e\u003cdiv class\u003d'arc-one-by-one-data'\u003eDNS Shop\u003c/div\u003e","8":"\u003cdiv\u003e\u003cspan class\u003d'arc-impression-score high'\u003e\u0412\u042b\u0421\u041e\u041a\u041e\u0415\u003c/span\u003e \u0447\u0438\u0441\u043b\u043e \u043f\u043e\u043a\u0430\u0437\u043e\u0432\u003c/div\u003e","9":{"1":"\u003ca href\u003d\"https://www.dns-shop.ru/actions/c09a061b-a048-11e8-9547-00155d03330d/\" target\u003d\"_blank\"\u003e\u003cimg onerror\u003d\"this.src\u003d'data:image/gif;base64,RA7'\" src\u003d\"https://www.google.com/webpagethumbnail?c\u003d58\u0026s\u003d400:400\u0026r\u003d4\u0026d\u003dhttps://www.dns-shop.ru/actions/c09a061b-a048-11e8-9547-00155d03330d/\u0026a\u003dAIYkKU9ZGGjFTOWtm771MQwgDYxqtlBLCw\" border\u003d0 alt\u003d\"\"\u003e\u003c/a\u003e","2":"\u003ca href\u003d\"https://www.dns-shop.ru/actions/c09a061b-a048-11e8-9547-00155d03330d/\" target\u003d\"_blank\"\u003e\u003cimg onerror\u003d\"this.src\u003d'data:image/gif;base64,R0AA7'\" src\u003d\"https://www.google.com/webpagethumbnail?c\u003d58\u0026s\u003d400:400\u0026r\u003d3\u0026d\u003dhttps://www.dns-shop.ru/actions/c09a061b-a048-11e8-9547-00155d03330d/\u0026a\u003dAIYkKU_CQ2K6v5f11Nk1RXtc87FtmG2B1w\" border\u003d0 alt\u003d\"\"\u003e\u003c/a\u003e","3":"\u003ca 
href\u003d\"https://www.dns-shop.ru/actions/c09a061b-a048-11e8-9547-00155d03330d/\" target\u003d\"_blank\"\u003e\u003cimg onerror\u003d\"this.src\u003d'data:image/gif;base64,R0lAA7'\" src\u003d\"https://www.google.com/webpagethumbnail?c\u003d58\u0026s\u003d400:400\u0026r\u003d6\u0026d\u003dhttps://www.dns-shop.ru/actions/c09a061b-a048-11e8-9547-00155d03330d/\u0026a\u003dAIYkKU_My0a48LAsW-ZKpQX-ATXkMoPEVg\" border\u003d0 alt\u003d\"\"\u003e\u003c/a\u003e"},"10":"https://adwords-displayads.googleusercontent.com/da/b/preview.js?client\u003dasfe-arc-external-preview\u0026obfuscatedCustomerId\u003d5240877441\u0026creativeId\u003d288930210411\u0026htmlParentId\u003dad-parent-id-6A2DE3D206234468F53C743C0EEACD67A59E6C5B62C0371F770419826258CB1AD9591F60\u0026sig\u003dACiVB_yMUjLwDjRO2T-0VAaVuRPt8uLHGQ","13":"https://adwords-displayads.googleusercontent.com/da/b/preview.js?client\u003dasfe-arc-external-preview\u0026obfuscatedCustomerId\u003d5240877441\u0026creativeId\u003d288930210411\u0026htmlParentId\u003dad-parent-id-6A2DE3D206234468F53C743C0EEACD67A59E6C5B62C0371F770419826258CB1AD9591F60\u0026showVariations\u003dtrue\u0026sig\u003dACiVB_yMUjLwDjRO2T-0VAaVuRPt8uLHGQ","14":"https://www.dns-shop.ru/actions/c09a061b-a048-11e8-9547-00155d03330d/","15":"","17":"","18":"DNS Shop","20":"adv-5594449542310820","21":["site1.ru","site2.com","site3.com","site4.ru"]},"6":{"5":"-6668648012302470727","7":["DNS"],"9":0},"7":1,"9":{"3":[{"1":{"1":"AClZvXLE9HJbFYq9TrAsXFgV4YkXsQt9lXp1xWjSB5aT5bFBpe4VNgo\u003d"},"2":"\u0418\u043d\u0442\u0435\u0440\u043d\u0435\u0442 \u0438 \u0442\u0435\u043b\u0435\u043a\u043e\u043c\u043c\u0443\u043d\u0438\u043a\u0430\u0446\u0438\u0438","3":"\u0422\u043e\u0432\u0430\u0440\u044b \u0438 \u0443\u0441\u043b\u0443\u0433\u0438, \u0441\u0432\u044f\u0437\u0430\u043d\u043d\u044b\u0435 \u0441 \u0442\u0435\u043b\u0435\u043a\u043e\u043c\u043c\u0443\u043d\u0438\u043a\u0430\u0446\u0438\u044f\u043c\u0438, \u0432 \u0442\u043e\u043c \u0447\u0438\u0441\u043b\u0435 
\u043a\u0430\u0431\u0435\u043b\u044c\u043d\u043e\u0435 \u0438 \u0441\u043f\u0443\u0442\u043d\u0438\u043a\u043e\u0432\u043e\u0435 \u043e\u0431\u0441\u043b\u0443\u0436\u0438\u0432\u0430\u043d\u0438\u0435 \u0438 \u0434\u043e\u0441\u0442\u0443\u043f \u0432 \u0418\u043d\u0442\u0435\u0440\u043d\u0435\u0442."},{"1":{"1":"AClZvXKrUJJ3kKBen2scP56BynOtGhf160i1F1LLmtBj3b/oh2dUFg8\u003d"},"2":"\u041c\u043e\u0431\u0438\u043b\u044c\u043d\u044b\u0435 \u0442\u0435\u043b\u0435\u0444\u043e\u043d\u044b","3":"\u041c\u043e\u0431\u0438\u043b\u044c\u043d\u044b\u0435 \u0438 \u0441\u043e\u0442\u043e\u0432\u044b\u0435 \u0442\u0435\u043b\u0435\u0444\u043e\u043d\u044b, \u0430 \u0442\u0430\u043a\u0436\u0435 \u0441\u043e\u043f\u0443\u0442\u0441\u0442\u0432\u0443\u044e\u0449\u0430\u044f \u0438\u043d\u0444\u043e\u0440\u043c\u0430\u0446\u0438\u044f, \u043d\u0430\u043f\u0440\u0438\u043c\u0435\u0440 \u0442\u0435\u0445\u043d\u0438\u0447\u0435\u0441\u043a\u0438\u0435 \u0445\u0430\u0440\u0430\u043a\u0442\u0435\u0440\u0438\u0441\u0442\u0438\u043a\u0438 \u0438 \u0441\u0440\u0430\u0432\u043d\u0438\u0442\u0435\u043b\u044c\u043d\u044b\u0439 \u0430\u043d\u0430\u043b\u0438\u0437 \u0442\u043e\u0432\u0430\u0440\u043e\u0432. 
\u0412 \u044d\u0442\u0443 \u043a\u0430\u0442\u0435\u0433\u043e\u0440\u0438\u044e \u043d\u0435 \u0432\u0445\u043e\u0434\u044f\u0442 \u0430\u043a\u0441\u0435\u0441\u0441\u0443\u0430\u0440\u044b \u0434\u043b\u044f \u043c\u043e\u0431\u0438\u043b\u044c\u043d\u044b\u0445 \u0442\u0435\u043b\u0435\u0444\u043e\u043d\u043e\u0432."},{"1":{"1":"AClZvXL4W+khZ4O9SJiu97cTbTs2+0Wecf1IVNju8ffd4ysIT9PJ7XY\u003d"},"2":"\u041c\u043e\u0431\u0438\u043b\u044c\u043d\u044b\u0435 \u0442\u0435\u043b\u0435\u0444\u043e\u043d\u044b \u0438 \u0430\u043a\u0441\u0435\u0441\u0441\u0443\u0430\u0440\u044b \u0434\u043b\u044f \u043d\u0438\u0445","3":"\u041c\u043e\u0431\u0438\u043b\u044c\u043d\u044b\u0435 \u0442\u0435\u043b\u0435\u0444\u043e\u043d\u044b, \u0430 \u0442\u0430\u043a\u0436\u0435 \u0441\u043e\u043f\u0443\u0442\u0441\u0442\u0432\u0443\u044e\u0449\u0438\u0435 \u0430\u043a\u0441\u0435\u0441\u0441\u0443\u0430\u0440\u044b \u0438 \u0430\u043f\u043f\u0430\u0440\u0430\u0442\u043d\u043e\u0435 \u043e\u0431\u0435\u0441\u043f\u0435\u0447\u0435\u043d\u0438\u0435, \u043d\u0430\u043f\u0440\u0438\u043c\u0435\u0440 \u0447\u0435\u0445\u043b\u044b, \u043c\u043e\u043d\u043e\u043f\u043e\u0434\u044b \u0434\u043b\u044f \u0441\u0435\u043b\u0444\u0438, \u0437\u0430\u0449\u0438\u0442\u043d\u044b\u0435 \u044d\u043a\u0440\u0430\u043d\u044b \u0438 \u0437\u0430\u0440\u044f\u0434\u043d\u044b\u0435 \u0443\u0441\u0442\u0440\u043e\u0439\u0441\u0442\u0432\u0430."},{"1":{"1":"AClZvXLQ3gPoVwjQbokDpB3+nni4xURwH5+YlnwkqjYtUowjhiKvk8Q\u003d"},"2":"\u041f\u041a \u0438 \u0431\u044b\u0442\u043e\u0432\u0430\u044f \u044d\u043b\u0435\u043a\u0442\u0440\u043e\u043d\u0438\u043a\u0430","3":"\u0422\u043e\u0432\u0430\u0440\u044b, \u0443\u0441\u043b\u0443\u0433\u0438 \u0438 \u0438\u043d\u0444\u043e\u0440\u043c\u0430\u0446\u0438\u044f, \u0441\u0432\u044f\u0437\u0430\u043d\u043d\u044b\u0435 \u0441 \u043a\u043e\u043c\u043f\u044c\u044e\u0442\u0435\u0440\u0430\u043c\u0438 \u0438 \u0431\u044b\u0442\u043e\u0432\u043e\u0439 
\u044d\u043b\u0435\u043a\u0442\u0440\u043e\u043d\u0438\u043a\u043e\u0439."},{"1":{"1":"AClZvXLKYOGgOROaa32IUxU15jP89AtTM4dV24WKS+daMhqJMTNmeSY\u003d"},"2":"\u0422\u0435\u043b\u0435\u0444\u043e\u043d\u0438\u044f","3":"\u0422\u043e\u0432\u0430\u0440\u044b, \u0443\u0441\u043b\u0443\u0433\u0438, \u0430 \u0442\u0430\u043a\u0436\u0435 \u0438\u043d\u0444\u043e\u0440\u043c\u0430\u0446\u0438\u043e\u043d\u043d\u044b\u0435 \u0438 \u0434\u0440\u0443\u0433\u0438\u0435 \u0440\u0435\u0441\u0443\u0440\u0441\u044b, \u0441\u0432\u044f\u0437\u0430\u043d\u043d\u044b\u0435 \u0441 \u0442\u0435\u043b\u0435\u0444\u043e\u043d\u0438\u0435\u0439 \u0438 \u0433\u043e\u043b\u043e\u0441\u043e\u0432\u043e\u0439 \u0441\u0432\u044f\u0437\u044c\u044e."}]},"10":{"1":"AClZvXLdGOShgJo+BM3apOUAFzQkE41z1/hiZhIY8eUlC7p7xXPm82P3dq7yXhbEI+tN/YHgdH4P"}}],"2":0.0,"3":"60609","4":1,"5":"","6":"ClD3Z2nP2P/////1/ff99fXV98nMyMrJz8rH9fHV883Hx8bMz83Oz8vOzv8A/v/+9f33/fX11ffJzMjKyc/Kx/Xx1fPNx8fGzM/Nzs/Lzs7//hABIWxUk293Pm+qOQAAAAAnMJaYSAFQAFoLCS8wxxaTatL1EAJgp7737gY\u003d","7":"3639","9":0},"xsrf":"ABOvogKaRsVZECZZJU-gDWrOqoP0CSqf7Q:1535467886413"} 

After json_decode, it looks like this:

Object from the JSON string (careful, 175 lines).
 object (stdClass) # 19 (2) {
   ["result"] =>
   object (stdClass) # 18 (8) {
     ["1"] =>
     array (1) {
       [0] =>
       object (stdClass) # 1 (8) {
         ["1"] =>
         int (0)
         ["3"] =>
         int (0)
         ["4"] =>
         object (stdClass) # 2 (1) {
           ["1"] =>
           string (120) "AClZvXJ2t4wiEZ/VZ0i54m0Qtqpi2DTqkI1kaPMTRi4LnsQn0iR5K1xBlFpS1xmJV7ko4a6qx5RcTkp7CzVjwoy5UDSWZ5jOCPLGRcoQdDt+wOk46bdr0yA="
         }
         ["5"] =>
         object (stdClass) # 3 (17) {
           ["1"] =>
           int (82)
           ["2"] =>
           int (0)
           ["3"] =>
           int (0)
           ["4"] =>
           string (102) "<div id="ad-parent-id-6A2DE3D206234468F53C743C0EEACD67A59E6C5B62C0371F770419826258CB1AD9591F60"></div>"
           ["5"] =>
           string (102) "<div id="ad-parent-id-6A2DE3D206234468F53C743C0EEACD67A59E6C5B62C0371F770419826258CB1AD9591F60"></div>"
           ["6"] =>
           string (355) "<div>Multi-format<span id='multi-format-tooltip'></span></div><a class='arc-url-link-ellipsis' target='_blank' href='https://www.dns-shop.ru/actions/c09a061b-a048-11e8-9547-00155d03330d/' title='https://www.dns-shop.ru/actions/c09a061b-a048-11e8-9547-00155d03330d/'>https://www.dns-shop.ru/actions/c09a061b-a048-11e8-9547-00155d03330d/</a>"
           ["7"] =>
           string (1066) "<div class='arc-one-by-one-legend'>Ad type</div><div class='arc-one-by-one-data'>Multi-format<span id='multi-format-tooltip'></span></div><div class='arc-one-by-one-legend'>Destination URL</div><div class='arc-one-by-one-data'><a class='arc-url-link-ellipsis' target='_blank' href='https://www.dns-shop.ru/actions/c09a061b-a048-11e8-9547-00155d03330d/' title='https://www.dns-shop.ru/actions/c09a061b-a048-11e8-9547-00155d03330d/'>https://www.dns-shop.ru/actions/c09a061b-a048-11e8-9547-00155d03330d/</a></div><div class='arc-one-by-one-legend'>Publisher domains</div><div class='arc-one-by-one-data'>4aynikam.ru</div><div class='arc-one-by-one-data'>androidphone.su</div><div class='arc-one-by-one-data'>androidphones.ru</div><div class='arc-one-by-one-data'>full-repair.com</div><div class='arc-one-by-one-data'>howgadget.com</div><div class='arc-one-by-one-legend'>Detected advertiser<span id='adx-advertiser-tooltip'></span></div><div class='arc-one-by-one-data'>DNS Shop</div>"
           ["8"] =>
           string (98) "<div><span class='arc-impression-score high'>HIGH</span> number of impressions</div>"
           ["9"] =>
           object (stdClass) # 4 (3) {
             ["1"] =>
             string (4191) "<a href="https://www.dns-shop.ru/actions/c09a061b-a048-11e8-9547-00155d03330d/" target="_blank"> <img onerror =" this.src = 'data: image / gif; base64, RCw "border = 0 alt =" "> </a>"
             ["2"] =>
             string (4191) "<a href="https://www.dns-shop.ru/actions/c09a061b-a048-11e8-9547-00155d03330d/" target="_blank"> <img onerror =" this.src = 'data: image / gif; base64, R1w "border = 0 alt =" "> </a>"
             ["3"] =>
             string (4191) "<a href="https://www.dns-shop.ru/actions/c09a061b-a048-11e8-9547-00155d03330d/" target="_blank"> <img onerror =" this.src = 'data: image / gif; base64, Rg "border = 0 alt =" "> </a>"
           }
           ["10"] =>
           string (291) "https://adwords-displayads.googleusercontent.com/da/b/preview.js?client=asfe-arc-external-preview&obfuscatedCustomerId=5240877441&creativeId=288930210411&htmlParentId=ad-parent-id-6A2DE3D206234468F53C743C0EEACD67A59E6C5B62C0371F770419826258CB1AD9591F60&sig=ACiVB_yMUjLwDjRO2T-0VAaVuRPt8uLHGQ"
           ["13"] =>
           string (311) "https://adwords-displayads.googleusercontent.com/da/b/preview.js?client=asfe-arc-external-preview&obfuscatedCustomerId=5240877441&creativeId=288930210411&htmlParentId=ad-parent-id-6A2DE3D206234468F53C743C0EEACD67A59E6C5B62C0371F770419826258CB1AD9591F60&showVariations=true&sig=ACiVB_yMUjLwDjRO2T-0VAaVuRPt8uLHGQ"
           ["14"] =>
           string (69) "https://www.dns-shop.ru/actions/c09a061b-a048-11e8-9547-00155d03330d/"
           ["15"] =>
           string (0) ""
           ["17"] =>
           string (0) ""
           ["18"] =>
           string (8) "DNS Shop"
           ["20"] =>
           string (20) "adv-5594449542310820"
           ["21"] =>
           array (4) {
             [0] =>
             string (8) "site1.ru"
             [1] =>
             string (9) "site2.com"
             [2] =>
             string (9) "site3.com"
             [3] =>
             string (8) "site4.ru"
           }
         }
         ["6"] =>
         object (stdClass) # 5 (3) {
           ["5"] =>
           string (20) "-6668648012302470727"
           ["7"] =>
           array (1) {
             [0] =>
             string (3) "DNS"
           }
           ["9"] =>
           int (0)
         }
         ["7"] =>
         int (1)
         ["9"] =>
         object (stdClass) # 16 (1) {
           ["3"] =>
           array (5) {
             [0] =>
             object (stdClass) # 7 (3) {
               ["1"] =>
               object (stdClass) # 6 (1) {
                 ["1"] =>
                 string (56) "AClZvXLE9HJbFYq9TrAsXFgV4YkXsQt9lXp1xWjSB5aT5bFBpe4VNgo="
               }
               ["2"] =>
               string (52) "Internet and telecommunications"
               ["3"] =>
               string (217) "Products and services related to telecommunications, including cable and satellite services and Internet access."
             }
             [1] =>
             object (stdClass) # 9 (3) {
               ["1"] =>
               object (stdClass) # 8 (1) {
                 ["1"] =>
                 string (56) "AClZvXKrUJJ3kKBen2scP56BynOtGhf160i1F1LLmtBj3b/oh2dUFg8="
               }
               ["2"] =>
               string (35) "Mobile phones"
               ["3"] =>
               string (359) "Mobile and cell phones, as well as related information, such as technical specifications and a comparative analysis of products. This category does not include accessories for mobile phones."
             }
             [2] =>
             object (stdClass) # 11 (3) {
               ["1"] =>
               object (stdClass) # 10 (1) {
                 ["1"] =>
                 string (56) "AClZvXL4W+khZ4O9SJiu97cTbTs2+0Wecf1IVNju8ffd4ysIT9PJ7XY="
               }
               ["2"] =>
               string (73) "Mobile phones and accessories for them"
               ["3"] =>
               string (283) "Mobile phones, as well as related accessories and hardware, such as covers, monopods for selfies, protective screens and chargers."
             }
             [3] =>
             object (stdClass) # 13 (3) {
               ["1"] =>
               object (stdClass) # 12 (1) {
                 ["1"] =>
                 string (56) "AClZvXLQ3gPoVwjQbokDpB3+nni4xURwH5+YlnwkqjYtUowjhiKvk8Q="
               }
               ["2"] =>
               string (45) "PC and consumer electronics"
               ["3"] =>
               string (142) "Products, services and information related to computers and consumer electronics."
             }
             [4] =>
             object (stdClass) # 15 (3) {
               ["1"] =>
               object (stdClass) # 14 (1) {
                 ["1"] =>
                 string (56) "AClZvXLKYOGgOROaa32IUxU15jP89AtTM4dV24WKS+daMhqJMTNmeSY="
               }
               ["2"] =>
               string (18) "Telephony"
               ["3"] =>
               string (181) "Goods, services, as well as information and other resources related to telephony and voice communications."
             }
           }
         }
         ["10"] =>
         object (stdClass) # 17 (1) {
           ["1"] =>
           string (76) "AClZvXLdGOShgJo+BM3apOUAFzQkE41z1/hiZhIY8eUlC7p7xXPm82P3dq7yXhbEI+tN/YHgdH4P"
         }
       }
     }
     ["2"] =>
     float (0)
     ["3"] =>
     string (5) "60609"
     ["4"] =>
     int (1)
     ["5"] =>
     string (0) ""
     ["6"] =>
     string (168) "ClD3Z2nP2P/////1/ff99fXV98nMyMrJz8rH9fHV883Hx8bMz83Oz8vOzv8A/v/+9f33/fX11ffJzMjKyc/Kx/Xx1fPNx8fGzM/Nzs/Lzs7//hABIWxUk293Pm+qOQAAAAAnMJaYSAFQAFoLCS8wxxaTatL1EAJgp7737gY="
     ["7"] =>
     string (4) "3639"
     ["9"] =>
     int (0)
   }
   ["xsrf"] =>
   string (48) "ABOvogKaRsVZECZZJU-gDWrOqoP0CSqf7Q:1535467886413"
 }


This was an example response containing just one ad. Given the need, it can be figured out.
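
To show what "figuring it out" can look like, here is a rough sketch that pulls a few fields out of such a response. The meaning of the keys is inferred only from the sample above ("14" looks like the destination URL, "18" like the detected advertiser, "21" like the publisher's sites), so treat the mapping as an assumption; `summarizeAds` is a hypothetical name.

```php
<?php
// Walk result."1" (the list of ads) and pick a few fields from each entry's "5" object.
function summarizeAds(string $json): array
{
    $response = json_decode($json);
    $ads = [];
    foreach ($response->result->{'1'} ?? [] as $entry) {
        $details = $entry->{'5'} ?? null;
        if ($details === null) {
            continue;
        }
        $ads[] = [
            'url'        => $details->{'14'} ?? '', // assumed: destination URL
            'advertiser' => $details->{'18'} ?? '', // assumed: detected advertiser
            'domains'    => $details->{'21'} ?? [], // assumed: sites showing the ad
        ];
    }
    return $ads;
}

// Tiny stand-in for the real response, keeping only the fields used here.
$sample = '{"result":{"1":[{"5":{"14":"https://www.dns-shop.ru/","18":"DNS Shop",'
        . '"21":["site1.ru","site2.com"]}}]},"xsrf":"X:1"}';
print_r(summarizeAds($sample));
```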
Besides, the methods of the other requests have quite human-readable names. A few examples:


Theoretically possible. Into battle?


Okay, deciphering their communication is possible (in theory), but all of it will be useless and will stay theory unless we manage to log in to Google.

Authorization, or how to log in to Google with PHP + cURL


Developer tools again: log out and watch the data exchange. I do not remember the details, because I could not make sense of anything there. There is a huge amount of JS, and it seems that some calculations are done right on the client and the results are sent to the server. In short, it is nearly impossible for a non-human to log in.

Let's think further. A pile of JS. And what if JS is turned off? Surely Google users without JS are still able to log in? So let's try without JS. The login window already looks much simpler. As before, we first enter the login, then the password on the next page. Most importantly, the HTML is much simpler too: an ordinary "form" tag with ordinary "input" fields, though not without a heap of hidden security or system fields. But hidden fields are not a problem: whatever arrived with the page is passed on to the next script. And that is how logging in to Google worked out. What about two-step verification? More on that later. First we need to make sure that ads can actually be pulled out for inspection, otherwise none of this makes sense.
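
The "pass on whatever arrived" idea for hidden fields can be sketched as follows. This is not the actual login code: the form markup, field names and action URL in the example are made up, and `collectFormFields` is a hypothetical helper built on PHP's DOM extension.

```php
<?php
// Collect the action URL and every named <input> of the first form on a page,
// so the hidden fields can be sent back unchanged with the next request.
function collectFormFields(string $html): array
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true); // real pages are rarely valid HTML
    $doc->loadHTML($html);
    libxml_clear_errors();

    $form = $doc->getElementsByTagName('form')->item(0);
    if ($form === null) {
        return ['action' => null, 'fields' => []];
    }
    $fields = [];
    foreach ($form->getElementsByTagName('input') as $input) {
        $name = $input->getAttribute('name');
        if ($name !== '') {
            $fields[$name] = $input->getAttribute('value');
        }
    }
    return ['action' => $form->getAttribute('action'), 'fields' => $fields];
}

// Made-up page: one hidden field plus the visible login field.
$page = '<form action="/signin/next" method="post">'
      . '<input type="hidden" name="tok" value="abc123">'
      . '<input type="email" name="Email" value="">'
      . '<input type="submit" value="Next"></form>';

$form = collectFormFields($page);
$form['fields']['Email'] = 'user@example.com';  // fill in the visible field
echo http_build_query($form['fields']), "\n";   // body for the next POST
```

The same helper would be applied again to the password page, since each step only needs to echo back the fields it received plus the one value the user would have typed.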

Is the theoretical possible in practice?


Logged in to Google, so it is time to test the theory of deciphering the communication protocols in practice. I had to tinker with experiments and observations: carefully watch and record which user actions lead to which requests, identify the constant and the changing elements of a request, and match the long incomprehensible values received from the server against the equally long ones sent back in the next request. It was a dense forest that gradually became clearer and more transparent.

What had to work for it to be clear that continuing made sense?

  1. Enter the ad review center.
  2. Get a list of ads.
  3. Get a specific ad (a text ad, to begin with).

Entering the ad review center is the simplest part: roughly speaking, you just follow a link. It worked.

Details
We simply follow the link and get a response (which we do not use in this case). We also need to request and save a token for later requests.
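
In PHP with cURL this step might look roughly like the sketch below. The URL pattern is the one shown in this article for the old ad review center; the function names and the single shared cookie file are my own assumptions, and the request only makes sense after the login has already filled that cookie file.

```php
<?php
// Build the address for the old ad review center from the publisher ID.
function buildOldArcUrl(string $pubId): string
{
    $query = http_build_query([
        'pid'      => $pubId,
        'authuser' => 0,
        'tpid'     => $pubId,
        'ov'       => 3,
        'hl'       => 'en',
    ], '', '&');
    return 'https://www.google.com/adsense/gwt-properties?' . $query;
}

// Send a POST with an empty body, reusing the cookies saved during login.
function postEmpty(string $url, string $cookieFile): string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => '',          // no payload
        CURLOPT_RETURNTRANSFER => true,        // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_COOKIEFILE     => $cookieFile, // read cookies from here
        CURLOPT_COOKIEJAR      => $cookieFile, // and write updated ones back
    ]);
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('cURL error: ' . curl_error($ch));
    }
    return $body;
}

echo buildOldArcUrl('pub-8958890276790964'), "\n";
```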

At the time of writing, AdSense has two ad review centers. I will call them, for convenience, the old one and the new one.

For the old ad review center.

POST request with no payload:

https://www.google.com/adsense/gwt-properties?pid=pub-8958890276790964&authuser=0&tpid=pub-8958890276790964&ov=3&hl=en

Answer:

 <meta name="gwt:property" content="usePropertyService=true">
 <meta name="gwt:property" content="applicationType=ASFE3">
 <meta name="gwt:property" content="syn.token=ABOvogJ1yQyL9pgHcGYM-J3OLj_9VSh31w:1535115071772">
 <meta name="gwt:property" content="syn.token.pb=ABOvogKJ6-xmsNWK4Mbe_H5bT1xXhyj8SQ:1535115071772">
 <meta name="gwt:property" content="syn.login=XXXXXX@gmail.com">
 <meta name="gwt:property" content="syn.csi.backendUrl=">
 <meta name="gwt:property" content="syn.helpCenterUrl=//support.google.com/adsense/">
 <meta name="gwt:property" content="syn.helpHost=//support.google.com">
 <meta name="gwt:property" content="syn.helpCenterUri=/adsense">
 <meta name="gwt:property" content="syn.newHelpHost=https://clients6.google.com">
 <meta name="gwt:property" content="syn.newHelpCenterUri=/adsense">
 <meta name="gwt:property" content="syn.helpCenterGaiaAuthDisabled=false">
 <meta name="gwt:property" content="syn.billing3BaseUri=https://bpui0.google.com">
 <meta name="gwt:property" content="syn.contextPath=/adsense">
 <meta name="gwt:property" content="syn.userLanguage=en-US">
 <meta name="gwt:property" content="syn.bruschettaContextPath=/adsense/new">
 <meta name="gwt:property" content="userProfileImageUrl=https://lh5.googleusercontent.com/-v7nuoAI4eEQ/AAAAAAAAAAI/AAAAAAAAAAA/AT3-yjmKyg8/s96/photo.jpg">
 <meta name="gwt:property" content="userDisplayName=" ">
 <meta name="gwt:property" content="userSettingsUrl=https://www.google.com/settings">
 <meta name="gwt:property" content="googlePlusProfileUrl=https://plus.google.com/me">
 <meta name="gwt:property" content="googlePrivacyUrl=http://www.google.com/intl/en_US/policies/privacy/">
 <meta name="gwt:property" content="syn.features=562,465,612,604,616,618">
 <meta name="gwt:property" content="analyticsHomePageUrl=https://www.google.com/analytics/web/">
 <meta name="gwt:property" content="disableDebugIds=true">
 <meta name="gwt:property" content="syn.pubControlsCapabilitiesLoadTimeout=5000">
 <meta name="gwt:property" content="pid=pub-8958890276790964">
 <meta name="gwt:property" content="tpid=pub-8958890276790964">
 <meta name="gwt:property" content="syn.asfeGtmCampaignId=GTM-K7WZ">

We need the fourth line, namely “syn.token.pb”. We save this value for building the subsequent requests.
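
Pulling that value out of the response is a one-regex job. A minimal sketch, assuming the response always has the `<meta name="gwt:property" content="name=value">` shape shown above; `extractGwtProperty` is a hypothetical helper name and the sample tokens are placeholders.

```php
<?php
// Return the value of one gwt:property meta tag, or null if it is absent.
function extractGwtProperty(string $html, string $name): ?string
{
    $pattern = '/<meta\s+name="gwt:property"\s+content="'
             . preg_quote($name, '/') . '=([^"]*)"/';
    return preg_match($pattern, $html, $m) === 1 ? $m[1] : null;
}

// Shortened stand-in for the response (placeholder token values).
$html = '<meta name="gwt:property" content="syn.token=AAA:1"> '
      . '<meta name="gwt:property" content="syn.token.pb=BBB:2">';

echo extractGwtProperty($html, 'syn.token.pb'), "\n";
```

Matching on the property name rather than on the line position avoids depending on the tag order, which Google is free to change.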

For the new ad review center.

POST request with no payload:

https://www.google.com/ads-publisher-controls/acx/5/darc/loader?onearcClient=adsense&pc=ca-pub-8958890276790964&tpid=pub-8958890276790964&hl=en

Answer:

 (function() {function loadAsyncOrDefer() {var scriptElement = document.createElement('script'); scriptElement.src = 'https:\/\/ssl.gstatic.com\/ads-publisher-controls\/onearc_20180822-12_RC00\/darc\/arc_app.dart.js';scriptElement.type = 'application\/javascript';scriptElement.defer = true;scriptElement.nonce = window['acxCspNonce'];scriptElement = document.head.appendChild(scriptElement); if ('_resourceTimingBuffer' in window) {_resourceTimingBuffer.add(scriptElement.src);}};loadAsyncOrDefer();})();window['__darc_app_params'] = {'onearcClient': 'ADSENSE','hl': 'ru','pc': 'ca-pub-8958890276790964','tpid': 'pub-8958890276790964',};window['__app_metadata'] = {'token': 'ABiMD8TT9vzK99SFB7iaI0ssBySxT9jjrQ:1535116725529','pre': '\/ads-publisher-controls\/acx','scs': 'https:\/\/ssl.gstatic.com\/ads-publisher-controls\/onearc_20180822-12_RC00','oacf': '\x7b\x221\x22:\x5b5,25,22,8,27,32,43,44,45,48,49,5,25,22,8,27,32,43,44,45,48,49,29,46\x5d\x7d','hats': 'ibhswcm2x2iztju5i6jbbzlkma',}; 

The sequence we need is here:

'token': 'ABiMD8TT9vzK99SFB7iaI0ssBySxT9jjrQ:1535116725529'
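
Since this response is JavaScript rather than JSON, a regular expression is the simplest way to get the token. A minimal sketch, assuming `'token': '...'` occurs only once in the loader script, as it does in the sample above; `extractAppToken` is a hypothetical name.

```php
<?php
// Find the first 'token': '...' pair in the loader script.
function extractAppToken(string $js): ?string
{
    return preg_match("/'token':\s*'([^']+)'/", $js, $m) === 1 ? $m[1] : null;
}

// Shortened stand-in for the loader response.
$js = "window['__app_metadata'] = {'token': 'ABiMD8TT9vzK99SFB7iaI0ssBySxT9jjrQ:1535116725529',"
    . "'pre': '\\/ads-publisher-controls\\/acx',};";

echo extractAppToken($js), "\n";
```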


Getting the list is a more interesting task, because a pile of settings has to be passed to say what we want to receive (ad type, reviewed / new / blocked, number of ads, and so on), plus an XSRF token with every request. It worked. The response was a large amount of data that even contained thumbnails of the sites the ads led to and, of course, links to the ads themselves.

Details
These are drafts of my attempts to work out which parameter is responsible for what. I tidied them up a little, cut out all the swearing and emoticons, and posted them here. First comes the sequence for the old ad review center, then the one for the new.

From here on I will refer to requests by their method names (only for the ad review center; for the old one the method is specified in the request body) and by their JSON string, since those are the "carriers" of information. Everything else (address, headers, tokens, other parameters) is a "wrapper"; it is not essential, and I will cover it later.

For the old ad review center (the “params” JSON variable of the request):

 {
   "1": "ca-pub-8958890276790964",
   "2": {
     "1": 0, //
     "2": 32, //
     "3": 0, // 0 - , 1 -
     "4": {
       "1": {"1": "-    "} //
     },
     "5": {
       "1": "video", //
       "2": 1, //
       "3": 1, //
       "6": 7, //
       "16": [0], // 0 - ; 1 - ; 2 - .
       "17": 0 //
     }
   },
   "3": "-3945261286198141534" //
 }

With the decoding done, we can form a request and get an answer.

For the old Ad Review Center you first need to get a token, which means making one more request before requesting the ads themselves:

 {"method":"getWebPropertyMetricsToken","params":"{\"1\":\"ca-pub-8958890276790964\"}","xsrf":"ABOvogKJ6-xmsNWK4Mbe_H5bT1xXhyj8SQ:1535115071772"} 

Answer:

 {"result":{"1":{"1":{"1":"AClZvXKte+4mEwsFB7kw20LrbWQ6jOMxmK8j4At4Vxqc7w+5dDDYWIx2k1ldCvvGbAT59UClLSkQty6zyZZQSmgxKvpKhq22bKRfGy8ywt0B5L8WE53vo+YtI8ixM8Xe0RPixTjPtOLQA8sCZod+hvHxqU5Depi3I9XUV6JMn8uCOg67m+5oe5TT1L0OytnUBDIsjAaQ+kcldN23yGoppKKCs2Zf5XI6i7nk5QHehS8wvsDlugvkKSU3fUo3J+ZHJvoUXyCGLP3lP9Gh+6fOMir/SLrOJx8udRbtjTJhLsvXTXUN2QbjcEfFFAIaWfgMr5euHtYwYYWuMoI5ofZTc9L8sCY5pA0Q/CWyZ6QLH85XI70vxH6cBZtsnfrPLRh18cxSxFgzXuAwPHW8+CueCznqiHcY7gOhxQc2YWmSgwMIP9Cpgr089dWoB58wulcK0g+EqnTJiQdI9MMUj4zzLpu5DYja5ftP7lF3jeCSuKT9q70B9OqMDvlGlruZd2hhHe3k5S+LoyWo/4WZDUTvWpCMmnPzCP3R4OIQnrhS0s5ffOVxjyNHrXJXtrNhppap3BY4iByIn1cowMfVFfx3hNep0JW59db9fVuXKaSy/mqHZKC1ToRM/UyCoSZ9ZjY/Ot091ptURLRYoTFal5TBbMKISgxn5UCz4vSoxVe1fC64dwXHatSzCCg9AjJOpKR4p/9smxOaKg73pmMHsEY98I6TJhvaeJ9o6lcHsG8PZnB6xNS4ZJHBtN1baHkrCHOfqaepMVyRCF2kPNhr9SgujjTTbiKGMUO3UVamOQQ5/EckTgFMr0PIda7PPw7op8qFEhxZmkoo9KgERcYLGHxzGePjfo0IiNbf7k50lgDipwk5ag3CI0tw3CtDicQn6isHwKOmlfSctrEGv/Fjlmcgjhl1sTAL/rTWxDCABKN7/OhdysBAOq0j6viFgzjM8WI0ZuYPIVIm19CQ+YGcOx77oiyxev+3sAj7uSJoYFslmgiZV4jrF5P+b+U/5fknRf2Ho8plAUh4AHweXMeaPFYZAYooe6jC79EzgizqXvx1H/HrKKQcaXdDZ1ivoOM/7DtzJbawzO7ALUnHkqR1ZYmw3+3E/pmsDXedYgzERWYWvJltS+P46iWYOS43SUVw+whDWZnjJOwVOFFLDWcg4ykfzNmbq4B/vUibrV1dCiRpTIXSP92xk1I8MCfQGiptqo5MiKttqJ9Orj7nrGXEDz5pJBTTem919nz5rNIjI/sus3GZ+G4rBE+9i1sJN0jxszvpRD2AKsl1KSOrPCuOBhpNbD2HnFgQd+EUw8CpH2MLZlrZ8l3cqzDVc5aeCQ1eiUKlONlZpIxZi5wE5HyKZRxC8ljtX5xe+Fpg8R8/yDarvAkjeb0yKzN/e893nEVz3CmF68pphNp71kjJtvwBS2JtSWhFc81Ys51GEw\u003d\u003d"}}},"xsrf":"ABOvogJLbcTkcBxU_TCJddIrW4L-mVwPcw:1535115072920"} 

This huge token ("1":"AClZ...") is what we need in order to request ads.

Requesting ads:

 {"method":"searchArcApprovals","params":"{"1":"ca-pub-8958890276790964","2":{"1":0,"2":24,"3":0,"4":{"1":{"1":"AClZvXKte+4mEwsFB7kw20LrbWQ6jOMxmK8j4At4Vxqc7w+5dDDYWIx2k1ldCvvGbAT59UClLSkQty6zyZZQSmgxKvpKhq22bKRfGy8ywt0B5L8WE53vo+YtI8ixM8Xe0RPixTjPtOLQA8sCZod+hvHxqU5Depi3I9XUV6JMn8uCOg67m+5oe5TT1L0OytnUBDIsjAaQ+kcldN23yGoppKKCs2Zf5XI6i7nk5QHehS8wvsDlugvkKSU3fUo3J+ZHJvoUXyCGLP3lP9Gh+6fOMir/SLrOJx8udRbtjTJhLsvXTXUN2QbjcEfFFAIaWfgMr5euHtYwYYWuMoI5ofZTc9L8sCY5pA0Q/CWyZ6QLH85XI70vxH6cBZtsnfrPLRh18cxSxFgzXuAwPHW8+CueCznqiHcY7gOhxQc2YWmSgwMIP9Cpgr089dWoB58wulcK0g+EqnTJiQdI9MMUj4zzLpu5DYja5ftP7lF3jeCSuKT9q70B9OqMDvlGlruZd2hhHe3k5S+LoyWo/4WZDUTvWpCMmnPzCP3R4OIQnrhS0s5ffOVxjyNHrXJXtrNhppap3BY4iByIn1cowMfVFfx3hNep0JW59db9fVuXKaSy/mqHZKC1ToRM/UyCoSZ9ZjY/Ot091ptURLRYoTFal5TBbMKISgxn5UCz4vSoxVe1fC64dwXHatSzCCg9AjJOpKR4p/9smxOaKg73pmMHsEY98I6TJhvaeJ9o6lcHsG8PZnB6xNS4ZJHBtN1baHkrCHOfqaepMVyRCF2kPNhr9SgujjTTbiKGMUO3UVamOQQ5/EckTgFMr0PIda7PPw7op8qFEhxZmkoo9KgERcYLGHxzGePjfo0IiNbf7k50lgDipwk5ag3CI0tw3CtDicQn6isHwKOmlfSctrEGv/Fjlmcgjhl1sTAL/rTWxDCABKN7/OhdysBAOq0j6viFgzjM8WI0ZuYPIVIm19CQ+YGcOx77oiyxev+3sAj7uSJoYFslmgiZV4jrF5P+b+U/5fknRf2Ho8plAUh4AHweXMeaPFYZAYooe6jC79EzgizqXvx1H/HrKKQcaXdDZ1ivoOM/7DtzJbawzO7ALUnHkqR1ZYmw3+3E/pmsDXedYgzERWYWvJltS+P46iWYOS43SUVw+whDWZnjJOwVOFFLDWcg4ykfzNmbq4B/vUibrV1dCiRpTIXSP92xk1I8MCfQGiptqo5MiKttqJ9Orj7nrGXEDz5pJBTTem919nz5rNIjI/sus3GZ+G4rBE+9i1sJN0jxszvpRD2AKsl1KSOrPCuOBhpNbD2HnFgQd+EUw8CpH2MLZlrZ8l3cqzDVc5aeCQ1eiUKlONlZpIxZi5wE5HyKZRxC8ljtX5xe+Fpg8R8/yDarvAkjeb0yKzN/e893nEVz3CmF68pphNp71kjJtvwBS2JtSWhFc81Ys51GEw\u003d\u003d"}}},"3":""}","xsrf":"ABOvogI3FCm29t4pdIded8L-Q98R0Voy-Q:1535121289188"} 

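Translating section 2 of the "params" variable into plain language:
Google, please give me, starting from ad number 0 ("1":0), 24 ads ("2":24), with parameter "3" set to 0 ("3":0), using the token AClZvX....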


A number of parameters can be omitted; they then take their default values:

  • ad type: all;
  • period: all available;
  • predictable blocking: no;
  • show only unverified: no.
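
The two-step sequence for the old interface (get the long token, then put it into the ad request) can be sketched as follows. The function names are hypothetical, the token path follows the sample answer above, and the meaning of "3" is not modelled; it is simply passed as 0 like in the sample request.

```php
<?php
// Extract the long token from a getWebPropertyMetricsToken answer.
// The path result -> 1 -> 1 -> 1 follows the sample answer shown above.
function metrics_token_from_answer(string $answerJson): ?string
{
    $answer = json_decode($answerJson, true);
    return $answer['result']['1']['1']['1'] ?? null;
}

// Build the "params" string for searchArcApprovals.
// $first and $count map to "1" and "2" of section 2; "3" is copied as 0.
function search_arc_params(string $pubId, string $token, int $first = 0, int $count = 24): string
{
    $params = [
        '1' => $pubId,
        '2' => [
            '1' => $first,
            '2' => $count,
            '3' => 0,
            '4' => ['1' => ['1' => $token]],
        ],
        '3' => '',
    ];
    // Keys start at 1, so json_encode emits JSON objects rather than arrays.
    return json_encode($params, JSON_UNESCAPED_SLASHES);
}

$answer = '{"result":{"1":{"1":{"1":"AClZ-sample-token"}}},"xsrf":"sample"}';
$token  = metrics_token_from_answer($answer);
$params = search_arc_params('ca-pub-8958890276790964', $token);
```

The resulting string is what would be passed as the $params argument of a call such as creative_review('searchArcApprovals', $params).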

The response is dozens or hundreds of kilobytes, depending on the number of ads requested. The heaviest part is the graphics embedded in the text (data:image/gif;base64,...). If there are no unverified ads, the answer is short:

 {"result":{"4":1,"5":"","8":"0","9":0},"xsrf":"ABOvogLWqmyC7KH1zfvmPxk-Y69-Jzj5XQ:1535115074392"} 

If there were ads, they would be contained here: result -> {5}.

For the new Ad Review Center:

 {
   "1":"ca-pub-8958890276790964",
   "2":{
     "1":10,   // sequence number of the first requested ad
     "2":7,    // number of ads requested
     "3":11,
     "5":{
       "6":3,
       "7":3534,
       "14":"en",
       "16":[0],
       "18":"dfd.com",
       "24":"video"
     },
     "7":""
   },
   "3":"-2876348936240321457",
   "5":true
 }

No preliminary requests are needed; you can request ads right away.
SearchApprovals (this is the method name):

 {"1":"ca-pub-8958890276790964","2":{"2":100,"3":11,"5":{"16":[0]},"7":""},"5":true} 

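In plain language: Google, please give me 100 ads ("2":100), with parameter "3" set to 11 ("3":11), with filter "16" set to [0] ("5":{"16":[0]}) and an empty "7" ("7":"").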


Optional parameters and defaults:

  • sequence number of the first requested ad: 0;
  • period: all available.
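
A minimal sketch of assembling that SearchApprovals body in PHP. The function name is hypothetical; only the number of ads is parameterised, and every other value is copied from the sample request above without interpreting it.

```php
<?php
// Build the SearchApprovals request body for the new interface.
// Only the number of ads is a parameter; the rest mirrors the sample request.
function search_approvals_body(string $pubId, int $count): string
{
    $body = [
        '1' => $pubId,
        '2' => [
            '2' => $count,
            '3' => 11,
            '5' => ['16' => [0]],
            '7' => '',
        ],
        '5' => true,
    ];
    // Keys do not start at 0, so these arrays are encoded as JSON objects;
    // the inner [0] is a real list and stays a JSON array.
    return json_encode($body);
}

$body = search_approvals_body('ca-pub-8958890276790964', 100);
```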

The response is almost the same as with the old Ad Review Center. It differs in a single word, the name of the data container: "result" in the old one, "default" in the new.


Getting a particular ad is simple: take the link from the previous answer and download the ad. There is no protection; access is open to everyone.

Details
The link to the ad is found in the previous answer, the one with many kilobytes of text returned for the ad request.

To avoid too much incomprehensible code, here is the answer to a request for a single ad (and even that has been mercilessly trimmed; it was 10 times longer, and only what matters right now is left):

 {"result":{"1":[{"1":0,"3":0,"4":{"1":"AClZvXJ2t4wiEZ/VZ0i54m0Qtqpi2DTqkI1kaPMTRi4LnsQn0iR5K1xBlFpS1xmJV7ko4a6qx5RcTkp7CzVjwoy5UDSWZ5jOCPLGRcoQdDt+wOk46bdr0yA\u003d"},"5":{"1":82,"2":0,"3":0,"4":"\u00GQ","13":"https://adwords-displayads.googleusercontent.com/da/b/preview.js?client\u003dasfe-arc-external-preview\u0026obfuscatedCustomerId\u003d5240877441\u0026creativeId\u003d288930210411\u0026htmlParentId\u003dad-parent-id-6A2DE3D206234468F53C743C0EEACD67A59E6C5B62C0371F770419826258CB1AD9591F60\u0026showVariations\u003dtrue\u0026sig\u003dACiVB_yMUjLwDjRO2T-0VAaVuRPt8uLHGQ","14":"https://www.dns-shop.ru/actions/c09a061b-a048-11e8-9547-00155d03330d/","15":"","17":"","18":"","20":"adv-5594449542310820","21":["domain1.com","domain2.com"]},"6":{"5":"-6668648012302470727","9":0},"7":1,"9":{"3":[{"1":{"1":"AClZvXLE9HJbFYq9TrAsXFgV4YkXsQt9lXp1xWjSB5aT5bFBpe4VNgo\u003d"},"2":"\u041/YHgdH4P"}}],"2":0.0,"3":"59917","4":1,"5":"","6":"ClD3Z2nP2P/////1/ff9oPjm7gU\u003d","7":"5751","9":0},"xsrf":"ABOvogJJJuNM1d0i22yN48ibBAY8vpvC_A:1535125743731"} 

From the {13} parameter, you can pull out the link to the ad:

https://adwords-displayads.googleusercontent.com/da/b/preview.js?client=asfe-arc-external-preview&obfuscatedCustomerId=5240877441&creativeId=288930210411&htmlParentId=ad-parent-id-6A2DE3D206234468F53C743C0EEACD67A59E6C5B62C0371F770419826258CB1AD9591F60&showVariations=true&sig=ACiVB_yMUjLwDjRO2T-0VAaVuRPt8uLHGQ .

For some time (days, maybe weeks) the link stays alive and anyone can fetch the ad through it. The download is approximately 100 to 150 kilobytes, and at the very bottom (and not only there) you can find excerpts of the ad text.

Another important parameter is the internal identifier of the ad, which we will use for management (blocking / unblocking the ad, blocking / unblocking the AdWords account that serves this ad, requesting ad statistics, setting the "checked" mark, sending a violation report). It is stored here:
result -> {1} -> {4} -> {1}.

It looks like this:

AClZvXJ2t4wiEZ/VZ0i54m0Qtqpi2DTqkI1kaPMTRi4LnsQn0iR5K1xBlFpS1xmJV7ko4a6qx5RcTkp7CzVjwoy5UDSWZ5jOCPLGRcoQdDt+wOk46bdr0yA=

Its length is 120 characters (with rare exceptions).
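
Reading the preview link and the identifier from a decoded answer might look like this. It is a sketch under assumptions: the function name is hypothetical, and treating field "14" as the destination URL is a guess based on the sample answer above.

```php
<?php
// Pull the preview link and the internal ad ID out of an answer.
// Handles both containers: "result" (old) and "default" (new).
function ads_from_answer(string $answerJson): array
{
    $answer = json_decode($answerJson, true);
    $data   = $answer['result'] ?? $answer['default'] ?? [];
    $ads    = [];
    foreach ($data['1'] ?? [] as $item) {
        $ads[] = [
            'id'      => $item['4']['1'] ?? null,   // result -> {1} -> {4} -> {1}
            'preview' => $item['5']['13'] ?? null,  // link to the ad itself
            'target'  => $item['5']['14'] ?? null,  // assumed: destination URL
        ];
    }
    return $ads;
}

$sample = '{"result":{"1":[{"4":{"1":"AClZ-sample-id\u003d"},'
        . '"5":{"13":"https://example.com/preview.js?creativeId\u003d1",'
        . '"14":"https://example.com/landing/"}}]}}';
$ads = ads_from_answer($sample);
```

An answer without ads, like the short one shown earlier, simply yields an empty array.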

There is a lot of information in this data stream:

  • Ad type.
  • Destination URL.
  • Domains on which the ad is served.
  • Information about the advertiser (name and, if available, identifier).
  • A qualitative estimate of the number of impressions, for example, "high".
  • Three thumbnails of the landing page in the form of data:image.
  • The category the ad belongs to, for example, "Telephony".

The result was achieved: the process can be automated. Next I put the functions in order, because the working prototype was terrible; I had only wanted to find out as quickly as possible whether automation was possible at all. The first version was offered to people, and the polishing and bug fixing began. The first problem was that users with two-step verification could not log in.

Two-step authorization


If you check how the login looks with JS turned off, you can see a lot of verification options: a password via SMS, one-time passwords from a printed list, through the application...
Automating every option so that everyone is comfortable would drive you crazy.

Developer Rescue


When I looked at the two-step mechanism in Chrome without JS, I saw a link for choosing another method, and that is what I latched onto. Whichever method is the default, there is always an option to go to the selection and choose SMS. This was a real salvation. Of course, I had to check which method was selected by default and, if it was the "wrong" one, "press" the switch button and select "one-time password by SMS".

For the verification itself, I only had to save the intermediate data from the form (the same pile of hidden fields) and show a form for entering the one-time password. That was it: "two-step" users could log in too.
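
Collecting the "pile of hidden fields" from a form could be done roughly like this. The helper and the field names (state, csrf, code) are invented for illustration; the real login form uses its own names, which are not shown in this article.

```php
<?php
// Hypothetical helper: collect the hidden fields of the first form on a page
// so they can be sent back together with the one-time password.
function hidden_form_fields(string $html): array
{
    $dom = new DOMDocument();
    // Real pages are rarely valid HTML, so parser warnings are silenced.
    @$dom->loadHTML($html);
    $fields = [];
    $form = $dom->getElementsByTagName('form')->item(0);
    if ($form === null) {
        return $fields;
    }
    foreach ($form->getElementsByTagName('input') as $input) {
        if (strtolower($input->getAttribute('type')) === 'hidden') {
            $fields[$input->getAttribute('name')] = $input->getAttribute('value');
        }
    }
    return $fields;
}

$html = '<form action="/challenge"><input type="hidden" name="state" value="abc">'
      . '<input type="hidden" name="csrf" value="xyz"><input type="text" name="code"></form>';
$fields = hidden_form_fields($html);
$fields['code'] = '123456'; // the one-time password typed in by the user
```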

Completion of the creation process


The main task was completed - anyone could install and use an automated solution to periodically filter ads on their sites.

Then came improvements, additions, and fixes for shortcomings identified both by users and by myself, as well as refinement of the interface (what was intuitive for the author turned out to be incomprehensible to almost everyone else).
Various functions and filters for finding unwanted ads were also completed or added. One example is automatic detection of a mishmash of Cyrillic and Latin letters. Normal advertisers do not compose ads like this, but a single Latin character does sometimes slip into a Russian word by mistake (popular mistakes of this kind are taken into account in the filter).
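
The idea behind the mixed-alphabet check can be illustrated with Unicode script classes. This is a simplified sketch, not the project's filter: the function name is hypothetical, and the tolerance for popular mistakes mentioned above is not modelled.

```php
<?php
// Return the words that contain both Cyrillic and Latin letters.
function mixed_script_words(string $text): array
{
    $mixed = [];
    // Split on anything that is not a letter.
    $words = preg_split('/[^\p{L}]+/u', $text, -1, PREG_SPLIT_NO_EMPTY);
    foreach ($words as $word) {
        $hasCyrillic = preg_match('/\p{Cyrillic}/u', $word) === 1;
        $hasLatin    = preg_match('/\p{Latin}/u', $word) === 1;
        if ($hasCyrillic && $hasLatin) {
            $mixed[] = $word;
        }
    }
    return $mixed;
}

// The "e" in the first word is deliberately Latin.
$suspicious = 'Смотр' . 'e' . 'ть видео онлайн';
$found = mixed_script_words($suspicious);
```

A text written entirely in one alphabet, or one that mixes alphabets only between words (a Russian sentence with the word "Xiaomi"), returns an empty list.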

Add-ons included for convenience:

  • List of blocked advertisers.
  • List of blocked domains.
  • Income plate.
  • AdSense links.

The list of blocked advertisers can be viewed and edited, and it is more convenient (though not prettier) than in the standard interface! It also allows unblocking in bulk, which the standard AdSense lacks.

The list of blocked domains works like the previous list.

Ability to work with AdX (via AdManager, where AdX recently moved).

There are many improvements; the most interesting ones, in my opinion, are listed above.


Functions for sending a request and receiving a result


Earlier I wrote about requests in the form of JSON strings and promised to give more details later.

When all this was being built, the new Ad Review Center did not exist yet, so everything was done for the old one, and we will start with it.

Communicating with the old Ad Review Center


By observation, I found out that the main exchange of requests goes to one and the same address:

  https://www.google.com/adsense/gp/creativeReview?ov=3&pid=pub-8958890276790964&authuser=0&tpid=pub-8958890276790964(&hl=ru) 


The part in parentheses is not always present; it is simply a parameter that sets the response language, and it works in almost all Google products. It matters because I use English everywhere, and the software recognizes some values in the response expecting them to be in English.

In addition to the address, there is a standard form for the transmitted POST data (shown in the developer tools under "Request Payload"): a JSON string with the method, params and xsrf variables:

 {"method":"getArcSettings","params":"{\"1\":[\"ca-pub-8958890276790964\"]}","xsrf":"ABOvogJlvXKkBQUbPYEsM04recgCsukFMg:1535467881599"} 

method: the name of the method being called.
params: a JSON string whose format depends on the method.
xsrf: the XSRF token for this request. How to get the first one is described above; each answer brings a new XSRF token for the next request.

The answer also comes as a JSON string, made up of result (the requested information) and xsrf:

 {"result":{"1":[{"1":"ca-pub-8958890276790964","2":{"1":"ca-pub-8958890276790964","2":0},"3":{"1":"ca-pub-8958890276790964","2":0}}]},"xsrf":"ABOvogIH7wJjD8t1xmuu8WbGplQowqjjJA:1535467883406"} 

PHP code of the function
 function creative_review($method, $params)
 {
     $xsrftoken = file_get_contents($GLOBALS['xsrftoken_file']);

     // Assemble the JSON request string
     $creativeReview = new stdClass();
     $creativeReview->method = $method;
     $creativeReview->params = $params;
     $creativeReview->xsrf = $xsrftoken;
     $creativeReview_post_request = json_encode($creativeReview);
     unset($creativeReview);

     $result = curl_post($GLOBALS['creative_review_req_string'], $creativeReview_post_request, $GLOBALS['arc_tab_req_string'], $GLOBALS['myheaders']);
     $result = json_decode($result); // decode the result string

     // Save the fresh XSRF token for the next request
     if (isset($result->xsrf)) file_put_contents($GLOBALS['xsrftoken_file'], $result->xsrf);

     return $result;
 }

The POST request function is curl_post($url, $postfields, $referer, $myheaders).

The headers in $myheaders:

 accept-language:en-US;q=1,en;q=0.4
 content-type:application/javascript; charset=UTF-8
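
The body of curl_post() is not shown in this article. A plausible shape, built on PHP's cURL extension, is sketched below purely as an assumption; the project's own version may differ, and the $cookieFile parameter with its default path is hypothetical.

```php
<?php
// Assumed shape of curl_post(); not the project's actual code.
// $cookieFile is a hypothetical path where the login session is kept.
function curl_post(string $url, string $postfields, string $referer, array $myheaders,
                   string $cookieFile = '/tmp/adsense_cookies.txt')
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => $postfields,  // raw JSON string, sent as the body
        CURLOPT_REFERER        => $referer,
        CURLOPT_HTTPHEADER     => $myheaders,   // list of "name:value" strings
        CURLOPT_RETURNTRANSFER => true,         // return the body instead of printing it
        CURLOPT_COOKIEFILE     => $cookieFile,  // send the cookies of the logged-in session
        CURLOPT_COOKIEJAR      => $cookieFile,  // and store any updated ones
    ]);
    return curl_exec($ch); // string on success, false on failure
}
```

Keeping the cookies in a file is one way to let consecutive requests share the authorized session.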


$GLOBALS['creative_review_req_string']:

 https://www.google.com/adsense/gp/creativeReview?ov=3&pid=pub-8958890276790964&authuser=0&tpid=pub-8958890276790964&hl=en 


Here the language parameter is always set (hl=en), for the reason given above.

$GLOBALS['arc_tab_req_string']:

 https://www.google.com/adsense/new/u/0/pub-8958890276790964/main/allowAndBlockAds?webPropertyCode=ca-pub-8958890276790964&tab=arcTab 

This address is sent as the referer.

Communicating with the new Ad Review Center


Here the request address is more complicated: it changes. Only the initial part is constant. The scheme is as follows:

common part + method + '?' + GET parameters + '&rpcTrackingId=' + <the path, method and preceding GET parameters repeated in URL encoding> + ':' + <sequence number of the request with the same method, counted until the user refreshes the page>.

https://www.google.com/ads-publisher-controls/acx/5/proto/creativereview/GetArcSettings?hl=ru&pc=ca-pub-8958890276790964&onearcClient=adsense&rpcTrackingId=%2Fads-publisher-controls%2Facx%2F5%2Fproto%2Fcreativereview%2FGetArcSettings%3Fhl%3Dru%26pc%3Dca-pub-8958890276790964%26onearcClient%3Dadsense%3A1
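
As a worked example, the address above can be reproduced from its parts. The function name new_arc_url is hypothetical; the path and parameters are taken from the sample address.

```php
<?php
// Build the request URL for a method of the new interface.
// $sequence is the per-method request counter described above.
function new_arc_url(string $method, array $query, int $sequence): string
{
    $path  = '/ads-publisher-controls/acx/5/proto/creativereview/' . $method;
    $pairs = [];
    foreach ($query as $name => $value) {
        $pairs[] = $name . '=' . $value;
    }
    // rpcTrackingId repeats the path and parameters, plus ":" and the counter;
    // http_build_query() takes care of the URL encoding.
    $query['rpcTrackingId'] = $path . '?' . implode('&', $pairs) . ':' . $sequence;
    return 'https://www.google.com' . $path . '?' . http_build_query($query);
}

$url = new_arc_url('GetArcSettings', [
    'hl'           => 'ru',
    'pc'           => 'ca-pub-8958890276790964',
    'onearcClient' => 'adsense',
], 1);
```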

The XSRF token is passed here in the 'x-framework-xsrf-token' header and is reusable, so it does not come back in the answers and does not need to be constantly renewed.

PHP code of the function
 function creative_review_new($method, $params)
 {
     if (!isset($GLOBALS['xsrftoken_new'])) $GLOBALS['xsrftoken_new'] = file_get_contents($GLOBALS['temp_folder'] . 'xsrftoken_new.txt');
     $myheaders = $GLOBALS['myheaders_new'];
     $myheaders[] = 'x-framework-xsrf-token:' . $GLOBALS['xsrftoken_new'];

     $query['pc'] = 'ca-' . $GLOBALS['pub_id'];
     $query['onearcClient'] = 'adsense';
     $query['hl'] = 'en_US';
     foreach ($query as $index => $value) $rpc[] = $index . '=' . $value;

     // Per-method request counter for rpcTrackingId.
     // $method_count is not defined in the listing; deriving the key from the method name is an assumption.
     $method_count = $method . '_count';
     if (!isset($GLOBALS[$method_count])) {
         $GLOBALS[$method_count] = 1;
     } else {
         $GLOBALS[$method_count]++;
     }
     $append = ':' . $GLOBALS[$method_count];

     $query['rpcTrackingId'] = $GLOBALS['creative_review_new_string'] . $method . '?' . implode('&', $rpc) . $append;
     $query = http_build_query($query);
     $url = 'https://www.google.com' . $GLOBALS['creative_review_new_string'] . $method . '?' . $query;

     $result = curl_post($url, $params, $GLOBALS['new_arc_tab_req_string'], $myheaders);
     if (mb_strpos($result, 'Error 400 (Not Found)', 0, 'UTF-8') !== false) {
         return '-32000 XSRF token validation';
     }

     // The JSON starts on the second line of the answer
     $list = explode("\n", $result, 2);
     $result = $list[1];
     $result = json_decode($result); // decode the result string

     // Renew the tokens
     if (@$result->default->{5}) file_put_contents($GLOBALS['temp_folder'] . 'some_digi_token.txt', $result->default->{5});
     if (@$result->default->{6}) file_put_contents($GLOBALS['temp_folder'] . 'some_long_token.txt', $result->default->{6});

     // The listing is cut off at this point; the ending mirrors creative_review()
     return $result;
 }

The POST request function is curl_post($url, $postfields, $referer, $myheaders).

$myheaders (the content type changes from javascript to json):

 accept-language:en-US;q=1,en;q=0.4
 content-type:application/json; charset=UTF-8


$GLOBALS['creative_review_new_string']:

 /ads-publisher-controls/acx/5/proto/creativereview/ 

$GLOBALS['new_arc_tab_req_string']:

 https://www.google.com/adsense/new/u/0/pub-8958890276790964/arc/ca-pub-8958890276790964 
This address is sent as the referer.



Function for requesting and managing the list of domains


It is almost the same as the function for communicating with the old Ad Review Center; only the request address differs.

PHP code of the function
 function blocking_controls($method, $params)
 {
     $xsrftoken = file_get_contents($GLOBALS['xsrftoken_file']);

     // Assemble the JSON request string
     $creativeReview = new stdClass();
     $creativeReview->method = $method;
     $creativeReview->params = $params;
     $creativeReview->xsrf = $xsrftoken;
     $creativeReview_post_request = json_encode($creativeReview);
     unset($creativeReview);

     $result = curl_post($GLOBALS['blocking_controls_req_string'], $creativeReview_post_request, $GLOBALS['arc_tab_req_string'], $GLOBALS['myheaders']);
     $result = json_decode($result); // decode the result string

     // Save the fresh XSRF token for the next request
     if (isset($result->xsrf)) file_put_contents($GLOBALS['xsrftoken_file'], $result->xsrf);

     return $result;
 }

$GLOBALS['blocking_controls_req_string']:
 https://www.google.com/adsense/gp/blockingControls?ov=3&pid=pub-8958890276790964&authuser=0&tpid=pub-8958890276790964 

XSRF tokens are saved to a file on disk so that requests to block / unblock ads or AdWords accounts, and other actions, work straight from the control panel without having to request a new token.

Handling Replies


The data comes either as JSON strings (the answers received by the three functions above) or as JS code (the requested ads), in which a number of characters are "encrypted" as hexadecimal escapes (\x followed by a two-digit character code).

An excerpt from the ad whose link is given above:
target\x3d_blank title\x3d\x22\x22\x3e\x3cspan\x3eBuy Xiaomi Redmi S2 and get Redmi 5\x3cbr\x3e as a gift. From August 24 to 26.\x3cbr\x3eLearn more on the site.

For JSON, PHP has a built-in function (json_decode) that returns either an object or an array, whichever you ask for.
For the backslash escapes, I found a small function somewhere online that brings the data into human-readable form.

PHP code of the function
 function hex_repl($html)
 {
     // Walk all byte values from 0xff down to 0x00 and replace each \xNN escape
     $i = 255;
     while ($i >= 0) {
         $hex = sprintf('%02x', $i); // always two hex digits, e.g. "0a"
         $html = str_ireplace('\x' . $hex, chr($i), $html);
         $i--;
     }
     return $html;
 }

Result of decoding:
target=_blank title=""><span>Buy Xiaomi Redmi S2 and get Redmi 5<br> as a gift. From August 24 to 26.<br>Learn more on the site.
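
The \x escapes can also be decoded in a single pass with a regular expression. The sketch below is an alternative to the loop-based function, not the project's code, and hex_unescape is a hypothetical name.

```php
<?php
// Decode \xNN escapes in one pass. Text produced by one replacement is
// never scanned again, so an escaped backslash cannot trigger a second decode.
function hex_unescape(string $js): string
{
    return preg_replace_callback(
        '/\\\\x([0-9a-fA-F]{2})/',
        static function (array $m): string {
            return chr(hexdec($m[1]));
        },
        $js
    );
}

// Single quotes keep the backslashes literal, as they arrive in the JS code.
$raw     = 'target\x3d_blank title\x3d\x22\x22\x3e\x3cspan\x3eBuy';
$decoded = hex_unescape($raw);
```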

Ad recognition


Text ads. I started with them. They were the most important and, as it turned out, also much the easiest. There are only two kinds: the old ones with a single heading (almost extinct) and the new ones with two headings.

The ad already arrives as HTML code, but besides the ad itself the response contains a lot of data we do not need: JavaScript code (I did not even dig into what it does).

Recognition ultimately resulted in the following steps:


The headings, the text and the link carry specific classes, and the recognizer "clings" to them.

Which class is which, and the text ad processing function
rhtitleline1: heading 1;
rhtitleline2: heading 2;
rhtitle: heading (old ads with a single heading);
rhbody: ad text;
rhurl: displayed URL.

 function text_ad($html)
 {
     // Keep only what follows </head>
     $list = explode('</head>', $html);
     $ad_html = array_pop($list);
     unset($list, $html);

     $dom = new DOMDocument('1.0', 'UTF-8');
     @$dom->loadHTML($ad_html);
     unset($ad_html);

     $ad = array(); // stays empty if no known class is found
     foreach ($dom->getElementsByTagName('a') as $a_node) {
         $class = $a_node->getAttribute('class');
         if (stripos($class, 'rhtitleline1') !== false) { $ad['header1'] = $a_node->textContent; continue; }
         if (stripos($class, 'rhtitleline2') !== false) { $ad['header2'] = $a_node->textContent; continue; }
         if (stripos($class, 'rhbody') !== false) { $ad['body'] = $a_node->textContent; continue; }
         // Old ads (with just 1 header) support
         if (stripos($class, 'rhtitle ') !== false || stripos($class, 'rhtitle"') !== false) { $ad['header1'] = $a_node->textContent; continue; }
         if (stripos($class, 'rhurl ') !== false || stripos($class, 'rhurl"') !== false) { $ad['displayUrl'] = $a_node->textContent; continue; }
     }

     // All recognized parts joined into one string
     $fulltext = implode(' ', $ad);
     $ad['fulltext'] = $fulltext;

     if (!isset($GLOBALS['set_gl']['utf8_off'])) foreach ($ad as $index => $value) $ad[$index] = utf8_decode($value);
     return $ad;
 }

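$fulltext is all the recognized parts of the ad joined into a single string.

utf8_decode is applied because DOMDocument::loadHTML assumes ISO-8859-1 when the markup declares no encoding, so UTF-8 text comes back double-encoded. The conversion can be switched off with the utf8_off setting.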
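
An alternative way to deal with the encoding, shown here only as a sketch and not as the project's approach, is to tell the parser up front that the fragment is UTF-8 by prepending an XML encoding declaration. The helper name is hypothetical; it also matches class names exactly instead of by substring.

```php
<?php
// Declare the encoding up front so textContent comes back as UTF-8
// and no utf8_decode() pass is needed afterwards.
function first_link_text_by_class(string $html, string $class): ?string
{
    $dom = new DOMDocument();
    @$dom->loadHTML('<?xml encoding="utf-8" ?>' . $html);
    foreach ($dom->getElementsByTagName('a') as $a) {
        // Compare whole class names, so "rhtitle" does not match "rhtitleline1".
        $classes = preg_split('/\s+/', trim($a->getAttribute('class')));
        if (in_array($class, $classes, true)) {
            return $a->textContent;
        }
    }
    return null;
}

$html  = '<div><a class="rhtitleline1 big" href="#">Купить Xiaomi</a>'
       . '<a class="rhbody" href="#">Подробнее на сайте</a></div>';
$title = first_link_text_by_class($html, 'rhtitleline1');
$body  = first_link_text_by_class($html, 'rhbody');
```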


Graphic ads. Only the destination URL is checked. There is no image recognition, and pictures are not saved for inspection either (if desired, they can be viewed in the Ad Review Center). I see no reason to reinvent the wheel here (most likely a crooked one that nobody needs).

Multimedia ads. This general name hides a number of different ad kinds:


For multi-format ads, 3 recognition functions were created, depending on the ad type.
For media ads, 2 functions were created.

For HTML5 ads, 3 functions were created.

Filtering


Once the ads are recognized, the unwanted ones are identified by various criteria (all filters are enabled, some are customizable):


Work report


Based on the filtering results, a report on the work done is generated.
It is built as a list of ads for each filter in its own column, plus a column for "good" ads. The following information is included in the report:


The appearance is modeled on the old Ad Review Center (the only one that existed when the design was created).




When viewed on a mobile device, each column takes up the full width of the screen, and buttons for choosing which column to view appear.

A little about security


You can restrict access to the control panel (managing it only from one place) or open it "worldwide" so that you can manage it from anywhere.

The first case is safe: nobody gets in unless they sit down at your PC. In the second case, the address where the software is installed must be kept secret, and a password for entering the control panel can be set as well. To keep your secret address from "leaking" when you follow links to third-party sites (from the ads), the following is done:


Automation result


24 hours a day, 7 days a week, all newly appeared ads in the Ad Review Center are scanned at intervals of two to three minutes. The objectionable ones (according to the criteria set by the user) are sent to the "blocked" section. I never counted exactly, but out of 100 blocked ads roughly 90 to 95 are blocked for good reason. Out of a hundred ads that the software considers "clean", on average fewer than one is "bad".

What do I call "bad ads"? Everything that leads to mobile subscriptions; everything that offers to "download", just "download" or "download file" with no specifics at all; everything that offers to "watch the video", again with no details; everything that leads somewhere other than what the heading and ad text indicate; and any mention of casinos in countries where they are prohibited by law.

As a result, I spend practically no time searching for and blocking ads, and the ads for casinos and assorted smut served through my sites have decreased roughly tenfold (unfortunately, the problem is not solved completely, and I keep thinking about it).

Theft in the form of unwitting subscriptions has decreased as well, even without a MegaFon card!

And where does the MegaFon card come in?

But not all visitors to our sites have a "MegaFon card" or its equivalents from other mobile operators.
Therefore, gentlemen, protect visitors to your resources from unwanted charges yourself!

The open-source project is on GitHub.

Source: https://habr.com/ru/post/422379/