Foreword
Many people use Instagram these days (insta from here on): some keep photo albums there, some sell, some buy, and I just laze around. I always wondered how my friends, classmates, and colleagues were doing, and insta helped with that. Whenever I wanted to find out what was new, I opened it, scrolled through the feed, and saw everything that interested me ... BUT! For some reason I always had to like every post (I can't explain why, that's just how it is). Now imagine not logging in for a week: you sit there liking a week's worth of posts, and when you follow 200+ accounts, it is pure hell.
Taking action
As a result, like any normal person, I got too lazy to like everything and gave up. Everything seemed fine: I stopped spending a lot of time on useless likes, but my conscience gnawed at me. I understood that my followers were worse off without my royal like, that they felt sad, and blah blah ... So I decided to write something simple and lightweight that would solve the problem of hurt feelings and maybe help someone else. I had heard a lot from friends about Python and about using Selenium to test applications or as a kind of crawler. I decided to use Python and Selenium together with PhantomJS. All of it was new to me, because I had never worked with these technologies before.
Why Selenium and PhantomJS?
It's quite simple. Instagram's client side is written in React, so data can only be pulled from a page after it has been rendered. Selenium exists to automate actions in a browser, and PhantomJS lets you do that without any display, so I decided to use them. Looking ahead: I later dropped PhantomJS because it is rather slow, and Chrome gained a headless option that lets it run as a "headless" browser.
Why Python?
I had heard and read a lot that this language is great for working with big data, from which I concluded that it is convenient for working with any data at all (parsing, sorting, comparing, formatting, and so on). I had also read somewhere that it is quick and easy to write your own mini-library in it, which is exactly what the bot needs to be as versatile as possible. After weighing everything, I settled on Python 3 (by that point, part of the project had already been written to run on both Python 2 and Python 3).
Library development for the bot
It would be pointless to describe the whole process, so let's focus on the most interesting parts:
Authorization
A bot repeats a large number of identical actions that require being logged in, so something had to be worked out for authorization. Logging in through the form every time looks very suspicious, so I decided to try saving cookies and using them to authenticate.
With Instagram this turned out to be simple (Mail.ru, by contrast, gave me a wild headache):

    import os
    import pickle
    import tempfile
    import time

    import selenium.common.exceptions as excp

    # Note: the find_element_by_* helpers used below are the Selenium 3 API;
    # Selenium 4 uses find_element(By.NAME, ...) and friends instead.


    def auth_with_cookies(browser, logger, login, cookie_path=tempfile.gettempdir()):
        """
        Authenticate to instagram.com with cookies
        :param browser: WebDriver
        :param logger:
        :param login:
        :param cookie_path:
        :return:
        """
        logger.save_screen_shot(browser, 'login.png')
        try:
            logger.log('Trying to auth with cookies.')
            with open(os.path.join(cookie_path, login + '.pkl'), "rb") as cookie_file:
                cookies = pickle.load(cookie_file)
            for cookie in cookies:
                browser.add_cookie(cookie)
            browser.refresh()
            if check_if_user_authenticated(browser):
                logger.log("Successful authorization with cookies.")
                return True
        except Exception:
            # No usable cookie file, or the browser rejected a stored cookie
            pass
        logger.log("Unsuccessful authorization with cookies.")
        return False


    def auth_with_credentials(browser, logger, login, password, cookie_path=tempfile.gettempdir()):
        logger.log('Trying to auth with credentials.')
        login_field = browser.find_element_by_name("username")
        login_field.clear()
        logger.log("--->AuthWithCreds: filling username.")
        login_field.send_keys(login)
        password_field = browser.find_element_by_name("password")
        password_field.clear()
        logger.log("--->AuthWithCreds: filling password.")
        password_field.send_keys(password)
        submit = browser.find_element_by_css_selector("form button")
        logger.log("--->AuthWithCreds: submitting login form.")
        submit.submit()
        time.sleep(3)  # give the page time to log in and set the session cookie
        logger.log("--->AuthWithCreds: saving cookies.")
        with open(os.path.join(cookie_path, login + '.pkl'), "wb") as cookie_file:
            pickle.dump([browser.get_cookie('sessionid')], cookie_file)
        if check_if_user_authenticated(browser):
            logger.log("Successful authorization with credentials.")
            return True
        logger.log("Unsuccessful authorization with credentials.")
        return False


    def check_if_user_authenticated(browser):
        # The profile icon in the navigation bar is only rendered for logged-in users
        try:
            browser.find_element_by_css_selector(".coreSpriteDesktopNavProfile")
            return True
        except excp.NoSuchElementException:
            return False
If cookie authorization fails, we log in with the login and password, save the cookie, and use it from then on: the standard scheme.
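The fallback scheme can be sketched without a browser at all. The version below is a minimal illustration, not the library's code: `try_cookies` and `try_credentials` are hypothetical callables standing in for the Selenium steps, and, unlike the listing above, this sketch saves cookies only after a successful form login.

```python
import os
import pickle
import tempfile


def cookie_file(login, cookie_path=None):
    """Per-account cookie file, e.g. <tmp>/<login>.pkl."""
    return os.path.join(cookie_path or tempfile.gettempdir(), login + '.pkl')


def save_cookies(cookies, login, cookie_path=None):
    with open(cookie_file(login, cookie_path), 'wb') as fh:
        pickle.dump(cookies, fh)


def load_cookies(login, cookie_path=None):
    """Return the saved cookie list, or [] if nothing usable is stored."""
    try:
        with open(cookie_file(login, cookie_path), 'rb') as fh:
            return pickle.load(fh)
    except (OSError, EOFError, pickle.UnpicklingError):
        return []


def authenticate(login, try_cookies, try_credentials, cookie_path=None):
    """Try stored cookies first, fall back to the login form.

    try_cookies(cookies) -> bool and try_credentials() -> list of cookies
    (empty on failure) are hypothetical wrappers around the browser.
    Returns 'cookies', 'credentials', or None.
    """
    cookies = load_cookies(login, cookie_path)
    if cookies and try_cookies(cookies):
        return 'cookies'
    fresh = try_credentials()
    if fresh:
        # Remember the session so the next run can skip the form
        save_cookies(fresh, login, cookie_path)
        return 'credentials'
    return None
```

On the first run there is no cookie file, so the form login is used and the session is stored; later runs go through the cookie branch until the session expires.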
Liking the news feed
Since I wrote this primarily for myself, what I cared about was my news feed always being fully liked. At first everything was simple: the bot scrolled down to the last processed post, collecting the posts' web elements into an array, then reversed direction and liked everything on the way back up, walking through the web elements stored in that array. I was happy that everything worked exactly the way I wanted, but after about two months "the moon was in Capricorn" and my bot simply stopped working. I checked everything I could, on different web drivers; visually nothing had changed, yet nothing worked. All in all, I spent about three days hunting for the problem.
It turned out to be very simple. Previously, when the bot walked back through the scrolled posts, it took their objects from the array, scrolled to each post (imitating a person's actions), found the like button, pressed it, and moved on. Now Instagram had decided to keep only about 9 posts in the HTML markup: the 5th in that structure is the one active for the user, plus the previous 4 and the next 4; all the others were simply removed from the HTML. I had to solve this by collecting the links of the posts that need to be liked into an array, and then, while scrolling up (bluntly upwards), looking up the current post in that array and liking it if it is there.
The same madness, in code:

    for post in progress:
        # Re-query the DOM each time: only the posts near the viewport exist in it
        real_time_posts = br.find_elements_by_tag_name('article')
        post_link = post.get('pl')
        filtered_posts = [p for p in real_time_posts if self._get_feed_post_link(p) == post_link]
        if filtered_posts:
            real_post = filtered_posts.pop()
VICTORY!
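The link-matching idea can be simulated without a browser. The sketch below is an illustration under assumptions, not the library's implementation: the feed is modelled as a plain list of post links, and `visible_window` is a hypothetical stand-in for the roughly 9 posts Instagram keeps in the markup (the active post plus 4 on each side).

```python
def visible_window(feed, active_index, radius=4):
    """Links still present in the markup around the active post."""
    start = max(0, active_index - radius)
    return feed[start:active_index + radius + 1]


def like_on_the_way_up(feed, links_to_like, radius=4):
    """Scroll from the bottom of the feed to the top, liking collected links.

    feed: post links ordered top to bottom.
    links_to_like: links collected during the downward scroll.
    Returns (liked links in the order they were liked, links never found).
    """
    pending = set(links_to_like)
    liked = []
    for active_index in range(len(feed) - 1, -1, -1):
        # Re-read what is currently rendered instead of keeping old elements
        for link in visible_window(feed, active_index, radius):
            if link in pending:
                pending.discard(link)
                liked.append(link)
    return liked, pending
```

Because posts are identified by link rather than by a stored element reference, it no longer matters that elements scrolled out of view are removed from the page.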
Action limits
To avoid attracting too much attention, the bot needs some restrictions. To adhere to them, the counters of performed actions have to be saved somewhere. For storing all internal information I chose SQLite: fast, convenient, local. Right in the library I wrote a small module for working with the database and added migrations in the same place, for subsequent releases. Every like / follow is stored in the database along with the hour in which it was made; the likes / follows for the day and for the current hour are then counted, and based on these numbers the bot decides whether it may like or follow anyone else. The limits are currently hard-coded in the library; they still need to be made configurable.
A side branch during development
While the library for the bot was being written, the question of numbers was stuck in my head. I was curious how many likes, views, and comments a user has, per post or in total. To satisfy that curiosity, I wrote a small class in the library that collected all the statistics available without authorization through Instagram's private API and presented them to the user:
    +-- https://instagram.com/al_kricha/ --------------------------+
    | counter                      | value                         |
    +------------------------------+-------------------------------+
    | followed                     | 402                           |
    | posts                        | 397                           |
    | comments                     | 1602                          |
    | likes                        | 20429                         |
    | following                    | 211                           |
    | video views                  | 6138                          |
    |                                                              |
    +--------- https://github.com/aLkRicha/insta_browser ----------+
    +--------------------------------------------------------------+
    |                       top liked posts                        |
    +--------------------------------------------------------------+
    | https://instagram.com/p/BVIUvMkj1RV/ - 139 likes             |
    | https://instagram.com/p/BTzJ38-DkUT/ - 132 likes             |
    | https://instagram.com/p/BI8rgr-gXKg/ - 129 likes             |
    | https://instagram.com/p/BW-I6o6DBjm/ - 119 likes             |
    | https://instagram.com/p/BM4_XSoFhck/ - 118 likes             |
    | https://instagram.com/p/BJVm3KIA-Vj/ - 117 likes             |
    | https://instagram.com/p/BIhuQaCgRxI/ - 113 likes             |
    | https://instagram.com/p/BM6XgB2l_r7/ - 112 likes             |
    | https://instagram.com/p/BMHiRNUlHvh/ - 112 likes             |
    | https://instagram.com/p/BLmMEwjlElP/ - 111 likes             |
    +--------------------------------------------------------------+
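How the per-post numbers might be rolled up into such a summary can be sketched as follows. This is only an illustration: the shape of the post records (`link`, `likes`, `comments`, `video_views`) is an assumption made for the example, and the article does not describe what the private API actually returns.

```python
def summarize(posts, top=10):
    """Aggregate per-post counters into totals and a top-liked list.

    posts: list of dicts with the hypothetical keys
    'link', 'likes', 'comments', 'video_views'.
    """
    totals = {
        'posts': len(posts),
        'likes': sum(p.get('likes', 0) for p in posts),
        'comments': sum(p.get('comments', 0) for p in posts),
        'video views': sum(p.get('video_views', 0) for p in posts),
    }
    # Most liked first, cut to the requested number of entries
    ranked = sorted(posts, key=lambda p: p.get('likes', 0), reverse=True)[:top]
    top_liked = [(p['link'], p.get('likes', 0)) for p in ranked]
    return totals, top_liked
```

Account-level counters such as followed and following are not derived from posts, so they would have to come from the profile data itself.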
With this data in hand, a friend (txwkx) and I decided to visualize it and created instameter.me, a small service where you can see a "summary" of any public Instagram account.
What can the bot do?
At the moment the bot cannot do as much as it should, but it does perform the key actions:
- Like the news feed up to the last unliked post
- Like a specified number of posts by tag
- Like a specified number of posts by location
- Auto-follow people from location / tag posts when the setting is enabled; not everyone in a row, only those who could potentially become followers
- Collect user statistics
- Store hourly statistics on the actions performed
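The hourly bookkeeping from the last item, together with the limit check described under Action limits, could look roughly like this. It is a sketch under stated assumptions: the table layout, the function names, and the numbers in `LIMITS` are all hypothetical; the article only says that actions are stored per hour in SQLite and that the limits are hard-coded.

```python
import sqlite3
import time

# Hypothetical limits, chosen only for the example
LIMITS = {'like': {'hour': 30, 'day': 300}, 'follow': {'hour': 10, 'day': 100}}


def open_db(path=':memory:'):
    db = sqlite3.connect(path)
    db.execute(
        'CREATE TABLE IF NOT EXISTS activity ('
        ' action TEXT NOT NULL, day TEXT NOT NULL, hour INTEGER NOT NULL,'
        ' amount INTEGER NOT NULL DEFAULT 0,'
        ' PRIMARY KEY (action, day, hour))'
    )
    return db


def _slot(now=None):
    """Local (day, hour) bucket for a timestamp; now=None means right now."""
    t = time.localtime(now)
    return time.strftime('%Y-%m-%d', t), t.tm_hour


def record(db, action, now=None):
    """Add one action to the counter of the current hour."""
    day, hour = _slot(now)
    with db:  # commit both statements together
        db.execute('INSERT OR IGNORE INTO activity VALUES (?, ?, ?, 0)',
                   (action, day, hour))
        db.execute('UPDATE activity SET amount = amount + 1'
                   ' WHERE action = ? AND day = ? AND hour = ?',
                   (action, day, hour))


def allowed(db, action, now=None, limits=LIMITS):
    """True while both the hourly and the daily limit leave room."""
    day, hour = _slot(now)
    per_hour = db.execute(
        'SELECT COALESCE(SUM(amount), 0) FROM activity'
        ' WHERE action = ? AND day = ? AND hour = ?',
        (action, day, hour)).fetchone()[0]
    per_day = db.execute(
        'SELECT COALESCE(SUM(amount), 0) FROM activity'
        ' WHERE action = ? AND day = ?',
        (action, day)).fetchone()[0]
    return per_hour < limits[action]['hour'] and per_day < limits[action]['day']
```

Passing `limits` as a parameter is one simple way to move toward the configurable limits mentioned earlier: the bot would call `allowed(db, 'like')` before each like and `record(db, 'like')` after it.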
What would I like to add in the future?
- Write more or less meaningful comments
- Unfollow unneeded accounts
- Like a few posts of a newly followed person
- Rewrite the news feed traversal
- Compare multiple accounts
Conclusion
There is still a lot to do, optimize, and rewrite. A tool can always be put to good use for purposes other than the original one. Laziness really is the engine of progress. I hope my bot will help someone in either work or a hobby. The repository with the PyPI package may help a beginner in automation, and the repository with examples may be useful for SMM specialists. Thank you all for your attention.
Links
- insta_browser - my mini-library, the heart of the bot
- insta_bot - examples repository, the bot itself (this is the form in which I use it)
- instameter - a project for collecting statistics on an Instagram account