📜 ⬆️ ⬇️

Crossposting posts from Instagram to VK public in Python



Foreword


I decided to enter a new sales market, especially the target audience of my online store, which does not have Instagram accounts, has long been interested in the appearance of a duplicate in VK. The idea is good, but the posts on the page are hundreds, and ctrl + c ctrl + v didn’t feel like working manually, plus the further prospects of the ape’s work were not impressive.

Sure that the Internet is full of free solutions, I began to google. Naturally, the first pages of search results are full of paid services, with rather extensive functionals. But for me, just for everything, I had to transfer all posts from the Instagram page to VK public and further replenish it synchronously.
')
Not finding anything suitable could be badly searched , it was decided to strangle the toad to write the script myself. Chose the python language. Simple, convenient, without unnecessary bells and whistles, and the speed in this matter is not important.

Documentation on api Instagram and VK is quite detailed and the task does not seem complicated. After freeing myself a couple of evenings, I set to work. First of all, it was necessary to get tokens both on Instagram and VK. There were no problems with this, both were received in a couple of minutes.

Then the first pitfall was waiting for me ...

Crossposting first 20 posts


To my surprise, I discovered that after changes in Instagram policy, it was possible to get a json dictionary (links to photos, post descriptions, publication dates ...) only to the last 20 posts on the page. It suited me for the second task - updating the public with new posts from time to time. Because new posts do not appear so often, and 20 posts are quite convenient. It was decided to first take on this puzzle.

Get the VK session and declare the necessary variables:

session = vk.Session( access_token='123abc') #  123abc   vk_api = vk.API(session, v='5.85') groupID = '12345678' #id  vk upload_url = vk_api.photos.getWallUploadServer(group_id=groupID)['upload_url'] #  vk   data = [] photo_id = 0 attachments = [] direct = 'C:/Users/jo/PycharmProjects/repost/foto' #     d = date(2019, 3, 21) # ,      data_parsing = int(time.mktime(d.timetuple())) #    unix  

Next, write the main function. First we get an array of data, with which we will work:

 answer = get( 'https://api.instagram.com/v1/users/[id_inst]/media/recent?access_token=[access_token]', verify=True).json() 

In the query you need to insert your data instead of square brackets. Find the page id by username, for example, here .

If we get something back, go ahead:

 if answer: for x in answer['data']: #     json  global photo_id #     photo_id = 0 global attachments attachments = [] global data data = [] date_create = x['created_time'] # date_create    Unix if int(date_create) > data_parsing: n = 0 #     if not os.path.isdir(direct + '/' + date_create): #     os.makedirs(direct + '/' + date_create) if 'carousel_media' in x: #      for a in x['carousel_media']: li = list(a.keys()) #      if li.count('videos') == 0: #    ,      req.urlretrieve(a['images']['standard_resolution']['url'], "foto/" + date_create + "/" + str(n) + ".jpg") #       if x['caption'] is not None: #   text = str(x['caption']['text']) else: text = ' ' n = n + 1 post_foto(date_create, text) else: #      req.urlretrieve(x['images']['standard_resolution']['url'], "foto/" + date_create + "/0.jpg") if x['caption'] is not None: #   text = str(x['caption']['text']) else: text = ' ' post_foto(date_create, text) 

I repent
I know that it’s bad to work with global variables this way, but the size of the script makes it possible not to delve into the complexity

So, if the date of publication of the post is more than the date we set, create a folder whose name (attention tautology, especially impressionable skip) the date of its publication. Next come checks for the number of photos and videos. Surely you can attach it, I just do not need it. We load photos in the created folder. We take the description to the post by the key caption and go to the post_foto function:

 def post_foto(date_create, text): quantity_foto = len([name for name in os.listdir(direct + "/" + date_create) if os.path.isfile(os.path.join(direct + "/" + date_create, name))]) #     append_attach(quantity_foto, date_create) params = {'attachments': attachments, 'message': text, 'owner_id': '-' + groupID, 'from_group': '1'} vk_api.wall.post(**params) #    VK    

We determine the number of photos in the folder, upload them to the VK server, add them to the post parameters and publish it in the public. Adding to the parameters is done using the append_attach function:

 def append_attach(x, directory): #     for t in range(x): upload_foto(t, directory) global photo_id photo_id = data[t][0]['id'] global attachments attachments.append('photo' + str(data[t][0]['owner_id']) + '_' + str(photo_id)) #  id     

And directly upload photos to the server VK is performed by the upload_foto function:

 def upload_foto(num_foto, directory): #     request = requests.post(upload_url, files={'photo': open("foto/" + directory + "/" + str(num_foto) + ".jpg", "rb")}) params = {'server': request.json()['server'], 'photo': request.json()['photo'], 'hash': request.json()['hash'], 'group_id': groupID} global data data.append(vk_api.photos.saveWallPhoto(**params)) 

With the second task figured out. The script can be executed either on its own or on a schedule (for example, in cron , once every 15 minutes). And now how to transfer all the other hundreds of posts?

Transfer entire page


Part of the script is already ready with us, the one that was responsible for the publications in VK. It remains to find a way to deflate all the photos and descriptions to them. I did not bother with parsing the source codes of Instagram pages and took a ready-made solution. In fact, I am sure that there are many such programs, I used the first one that came free ( 4K Stogram ). An intuitive interface allows you to quickly cope with the download of all photos from the page. The menu also has an export of all descriptions to posts. We need the " * .txt " format. It remains only to decompose all the photos into folders (one post - one folder) and match the description of the posts from the textbook according to the regular expression.

The following code will help us lay out the photo in folders:

 i = 0 for top, dirs, files in os.walk(os.getcwd()+"\\res\\"): for nm in files: if re.findall(r'\d\d\.\d\d\.\d\d', nm): old_file = os.path.join(top, nm) frq = re.findall(r'\d\d\d\d-\d\d-\d\d \d\d\.\d\d\.\d\d', str(nm)) frq = str(frq[0]) if not os.path.exists(os.getcwd()+"\\res\\" + frq): i = 0 os.makedirs(os.getcwd() + "\\res\\" + frq) new_file = os.path.join(os.getcwd() + "\\res\\" + frq, str(i)+'.jpg') os.rename(old_file, new_file) i = i+1 else: new_file = os.path.join(os.getcwd() + "\\res\\" + frq, str(i)+'.jpg') os.rename(old_file, new_file) i = i+1 

Here the key point is the time of publication, which combines several photos into one folder.

Well, then everything is simple. Open each folder, upload all photos to the server, attach a description and publish. Do not forget about the restriction of VK: no more than 50 posts per day:

 f = open(os.getcwd() + "\input_opis\\gusi.txt", "rt", errors="ignore", encoding='utf-8') #     data = f.read() #  ,      result = re.findall(r'"([^\"]*)"', data, re.S) new_x = [el for el, _ in groupby(result)] #     dlina = new_x.__len__() i = 1 f.close() for top, dirs, files in os.walk(os.getcwd() + "\\res\\"): #   for nm in dirs: attachments = [] photo_id = 0 data = [] DIR = 'C:/Users/jo/PycharmProjects/repost/res/' + nm #     quantity_foto = len([name for name in os.listdir(DIR) if os.path.isfile(os.path.join(DIR, name))]) post_foto(quantity_foto, nm) attachments.reverse() #   params = {'attachments': attachments, 'message': new_x[dlina - i], 'owner_id': '-' + groupID, 'from_group': '1'} vk_api.wall.post(**params) i = i + 1 

Perhaps I didn’t go the easiest and quickest way, but the result was achieved. Thank you all for your attention, I am ready to answer all the questions and critical remarks on my script.

All 3 scripts entirely:

Crossposting first 20 posts
 from requests import get import urllib.request as req import vk import requests import os import time from datetime import date session = vk.Session( access_token='123abc') vk_api = vk.API(session, v='5.85') groupID = '12345678' #id  vk upload_url = vk_api.photos.getWallUploadServer(group_id=groupID)['upload_url'] #  vk   data = [] photo_id = 0 attachments = [] direct = 'C:/Users/jo/PycharmProjects/repost/foto' #     d = date(2019, 3, 21) # ,      data_parsing = int(time.mktime(d.timetuple())) #    unix  def upload_foto(num_foto, directory): #     request = requests.post(upload_url, files={'photo': open("foto/" + directory + "/" + str(num_foto) + ".jpg", "rb")}) params = {'server': request.json()['server'], 'photo': request.json()['photo'], 'hash': request.json()['hash'], 'group_id': groupID} global data data.append(vk_api.photos.saveWallPhoto(**params)) def append_attach(x, directory): #             for t in range(x): upload_foto(t, directory) global photo_id photo_id = data[t][0]['id'] global attachments attachments.append('photo' + str(data[t][0]['owner_id']) + '_' + str(photo_id)) #  id     def post_foto(date_create, text): quantity_foto = len([name for name in os.listdir(direct + "/" + date_create) if os.path.isfile(os.path.join(direct + "/" + date_create, name))]) #     append_attach(quantity_foto, date_create) params = {'attachments': attachments, 'message': text, 'owner_id': '-' + groupID, 'from_group': '1'} vk_api.wall.post(**params) #    VK    def get_all(): answer = get( 'https://api.instagram.com/v1/users/12345678/media/recent?access_token=123abc', verify=True).json() #     instagram if answer: for x in answer['data']: #     json  global photo_id #     photo_id = 0 global attachments attachments = [] global data data = [] date_create = x['created_time'] # date_create    Unix if int(date_create) > data_parsing: n = 0 #     if not os.path.isdir(direct + '/' + date_create): #     os.makedirs(direct + '/' + date_create) if 'carousel_media' in x: #      for a in x['carousel_media']: li = list(a.keys()) #      if li.count('videos') == 0: #    ,      req.urlretrieve(a['images']['standard_resolution']['url'], "foto/" + date_create + "/" + str(n) + ".jpg") #       if x['caption'] is not None: #   text = str(x['caption']['text']) else: text = ' ' n = n + 1 post_foto(date_create, text) else: #      req.urlretrieve(x['images']['standard_resolution']['url'], "foto/" + date_create + "/0.jpg") if x['caption'] is not None: #   text = str(x['caption']['text']) else: text = ' ' post_foto(date_create, text) get_all() 


Sort photos by folders
 import re import os i = 0 for top, dirs, files in os.walk(os.getcwd()+"\\res\\"): for nm in files: if re.findall(r'\d\d\.\d\d\.\d\d', nm): old_file = os.path.join(top, nm) frq = re.findall(r'\d\d\d\d-\d\d-\d\d \d\d\.\d\d\.\d\d', str(nm)) frq = str(frq[0]) if not os.path.exists(os.getcwd()+"\\res\\" + frq): i = 0 os.makedirs(os.getcwd() + "\\res\\" + frq) new_file = os.path.join(os.getcwd() + "\\res\\" + frq, str(i)+'.jpg') os.rename(old_file, new_file) i = i+1 else: new_file = os.path.join(os.getcwd() + "\\res\\" + frq, str(i)+'.jpg') os.rename(old_file, new_file) i = i+1 


Posting in VK of all publications
 import os import vk import requests import re from itertools import groupby session = vk.Session( access_token='123abc') vk_api = vk.API(session, v='5.85') groupID = '12345678' upload_url = vk_api.photos.getWallUploadServer(group_id=groupID)['upload_url'] data = [] photo_id = 0 attachments = [] def upload_foto(num_foto, direc): request = requests.post(upload_url, files={'photo': open('res/' + direc + "/" + str(num_foto) + ".jpg", "rb")}) params = {'server': request.json()['server'], 'photo': request.json()['photo'], 'hash': request.json()['hash'], 'group_id': groupID} global data data.append(vk_api.photos.saveWallPhoto(**params)) def post_foto(x, direc): for i in range(x): upload_foto(i, direc) global photo_id photo_id = data[i][0]['id'] global attachments attachments.append('photo' + str(data[i][0]['owner_id']) + '_' + str(photo_id)) f = open(os.getcwd() + "\input_opis\\gusi.txt", "rt", errors="ignore", encoding='utf-8') #     data = f.read() #  ,      result = re.findall(r'"([^\"]*)"', data, re.S) new_x = [el for el, _ in groupby(result)] #     dlina = new_x.__len__() i = 1 f.close() for top, dirs, files in os.walk(os.getcwd() + "\\res\\"): #   for nm in dirs: attachments = [] photo_id = 0 data = [] DIR = 'C:/Users/jo/PycharmProjects/repost/res/' + nm #     quantity_foto = len([name for name in os.listdir(DIR) if os.path.isfile(os.path.join(DIR, name))]) post_foto(quantity_foto, nm) attachments.reverse() #   params = {'attachments': attachments, 'message': new_x[dlina - i], 'owner_id': '-' + groupID, 'from_group': '1'} vk_api.wall.post(**params) i = i + 1 

Source: https://habr.com/ru/post/445408/


All Articles