
Analyzing VK Friendship Links with Python

Recently an article appeared on Habr about visualizing VKontakte friendships with Wolfram Mathematica. I liked the idea and naturally wanted to build the same graph with Python and d3. Here is what came of it.

Attention! The article contains only the parts of the code that describe the most important steps, and the project's code base is likely to change more than once. Those interested can find the source code on GitHub.

We divide the task into elements:

What we need for this:

Creation and authorization of the application


To access the VKontakte API, we need to create a Standalone application; after that we can use the API methods described later. On the application creation page, select the Standalone application type. We will be asked to enter a confirmation code sent to our mobile phone, after which we are taken to the application management page. The Settings tab shows the application ID, which we will use to obtain an access_token.
Next, we need to authorize our application. This process consists of 3 stages.

User authentication on VK website

To do this, create a url, as shown below:

https://oauth.vk.com/authorize?client_id=ID&scope=friends,offline&redirect_uri=https://oauth.vk.com/blank.html&display=page&v=5.21&response_type=token 

Quoting vk.com/dev/auth_mobile :
APP_ID - your application identifier;
PERMISSIONS - the requested application access rights;
DISPLAY - the appearance of the authorization window, supported: page, popup and mobile.
REDIRECT_URI - the address to which the access_token will be transmitted.
API_VERSION is the version of the API you are using.

In our case, PERMISSIONS requests access to friends and access to the API at any time from a third-party server (a perpetual token). If the URL is formed correctly, we will be prompted to enter a login and password.
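The URL above can also be assembled programmatically. The following is only a sketch: the function name build_auth_url and the application ID 1234567 are placeholders invented for illustration, while the parameter names and values are the ones from the URL shown earlier.

```python
from urllib.parse import urlencode


def build_auth_url(app_id, permissions=("friends", "offline"),
                   redirect_uri="https://oauth.vk.com/blank.html",
                   display="page", api_version="5.21"):
    """Assemble the authorization URL from the parameters quoted above."""
    query = urlencode({
        "client_id": app_id,
        "scope": ",".join(permissions),
        "redirect_uri": redirect_uri,
        "display": display,
        "v": api_version,
        "response_type": "token",
    }, safe=",:/")  # keep commas, colons and slashes readable, as in the article
    return "https://oauth.vk.com/authorize?" + query


# 1234567 is a placeholder, not a real application ID.
auth_url = build_auth_url(1234567)
print(auth_url)
```

Opening the printed address in a browser should lead to the login prompt described above.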

Allowing access to your data

Next, allow the application to access the necessary information:

Getting access_token

After the application is authorized, the client is redirected to REDIRECT_URI. The information we need is contained in the resulting link.

 https://oauth.vk.com/blank.html#access_token=ACCESS_TOKEN&expires_in=0&user_id=USER_ID 
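The token and the user id sit in the fragment part of that link (everything after the # sign), so they can be copied by hand or extracted with the standard library. A minimal sketch, in which parse_redirect and the sample values abc123 and 42 are made up for illustration:

```python
from urllib.parse import urlsplit, parse_qs


def parse_redirect(url):
    """Extract access_token, expires_in and user_id from the redirect URL."""
    fragment = urlsplit(url).fragment      # the part after '#'
    params = parse_qs(fragment)            # each value is a list of strings
    return {
        "access_token": params["access_token"][0],
        "expires_in": int(params["expires_in"][0]),
        "user_id": int(params["user_id"][0]),
    }


# Sample link with invented values in place of ACCESS_TOKEN and USER_ID.
example = "https://oauth.vk.com/blank.html#access_token=abc123&expires_in=0&user_id=42"
credentials = parse_redirect(example)
print(credentials)
```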

We edit the settings.py file, inserting the received access_token and user_id there. Now we can make requests to the VKontakte API.
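The methods shown later in this article call a helper named request_url, whose body is not included here. The sketch below is an assumption about what such a helper might look like, not the project's actual code: it is written as a standalone function that takes the token explicitly, and it follows the general shape of a VK API call (method name, parameters, API version, access token).

```python
API_URL = "https://api.vk.com/method"


def request_url(method_name, parameters, access_token, api_version="5.21"):
    """Build the address of a single API call (hypothetical helper)."""
    return "%s/%s?%s&v=%s&access_token=%s" % (
        API_URL, method_name, parameters, api_version, access_token)


# 'TOKEN' stands in for the access_token stored in settings.py.
url = request_url("users.get", "user_ids=1&fields=photo", "TOKEN")
print(url)
```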

Data acquisition


To begin with, let's look at the methods that we will use for this purpose.

Since we need at least some information about the user id from which the graph will be built, we need users.get. It accepts a single id or several, a list of fields whose values we want, and the grammatical case in which the first and last name should be declined. My base_info() method takes a list of ids and returns information about the users, including their photos.

    def base_info(self, ids):
        """read https://vk.com/dev/users.get"""
        r = requests.get(self.request_url(
            'users.get',
            'user_ids=%s&fields=photo' % ','.join(map(str, ids)))).json()
        if 'error' in r:
            raise VkException('Error message: %s. Error code: %s' % (
                r['error']['error_msg'], r['error']['error_code']))
        r = r['response']
        # check that the id from settings.py is not deactivated
        if 'deactivated' in r[0]:
            raise VkException("User deactivated")
        return r

This check can matter for those who want to pass the ids from friends.getMutual to it, thus generating a huge number of requests. More on that later.
Now we need information about the user's friends, and the friends.get method will help us here. Of all the parameters listed in its documentation, we use user_id, which is stored in our settings.py, and fields. The additional fields are the friends' ids, first names, last names and photos. After all, I want the nodes to show thumbnails of their photos.

    def friends(self, id):
        """read https://vk.com/dev/friends.get"""
        r = requests.get(self.request_url(
            'friends.get',
            'user_id=%s&fields=uid,first_name,last_name,photo' % id)).json()['response']
        # self.count_friends = r['count']
        return {item['id']: item for item in r['items']}

Now comes the most interesting part.
The friends.getMutual method returns the list of ids of the friends shared by two users. This is convenient because we receive only ids, and we already have the more detailed information thanks to friends.get. But nobody forbids you from making an extra hundred or two requests with users.get. The schemes are shown a little further down.
Now let's decide how to use friends.getMutual. If the user has N friends, we need N requests to get the list of mutual friends for each friend. In addition, we need to add delays so that we stay within the permitted number of requests per second.
Suppose the id we scan has 25 friends.
52 requests in total is too many, so let's remember that users.get can accept a list of ids:
25 friends then take 28 requests, but as stated above, we already have this information thanks to friends.get.
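The two totals can be reproduced with simple arithmetic. The breakdown below is one reading that matches the figures 52 and 28, not a description taken from the original diagrams: one users.get for the scanned id, one friends.get, one friends.getMutual per friend, and then either one users.get per friend or a single users.get with the whole id list. The function names are made up for this sketch.

```python
def requests_one_by_one(n_friends):
    """users.get + friends.get + getMutual per friend + users.get per friend."""
    return 1 + 1 + n_friends + n_friends


def requests_with_id_list(n_friends):
    """Same as above, but one users.get call receives the whole id list."""
    return 1 + 1 + n_friends + 1


print(requests_one_by_one(25))    # 52
print(requests_with_id_list(25))  # 28
```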

This is where the execute method comes in handy: it runs a sequence of methods in a single request. It takes a single parameter, code, which can contain up to 25 API method calls.
As a result, the VKScript code will look something like this:

    return {
        "id": API.friends.getMutual({"source_uid": source, "target_uid": target}), // * 25
        ...
    };

Someone will surely write in to show how to shorten this code so that API.friends.getMutual is not spelled out every time.
Now we just need to send the friends' ids in batches of 25. In our example, the scheme will look like this:

But we could instead pass each friend to friends.getMutual in a for loop and then fetch the more detailed information through users.get.
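To put numbers on the difference: with at most 25 calls per execute request, the number of friends.getMutual requests shrinks from one per friend to one per batch of 25. A small sketch, where execute_calls is a name invented for illustration:

```python
import math


def execute_calls(n_friends, batch_size=25):
    """Number of execute requests needed when each one carries up to 25 calls."""
    return math.ceil(n_friends / batch_size)


print(execute_calls(25))   # 1 request instead of 25
print(execute_calls(321))  # 13 requests instead of 321
```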
Next, we build a human-friendly structure in which the friend's id and the ids of your mutual friends are replaced with the information from friends.get. As a result, we get something like:

 [({ }, [{ }, {   }]),({ }, None)] 

Each dictionary holds the id, first name, last name and photo; each list holds the dictionaries of mutual friends, or None if there are no mutual friends. Each friend and the corresponding list are paired in a tuple.
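For illustration, here is what such a structure could look like with concrete values, together with a small function that consumes it. All records are made up, and mutual_counts is a hypothetical helper that is not part of the project.

```python
# Made-up records with the fields described above (id, name, photo).
alice = {"id": 1, "first_name": "Alice", "last_name": "A", "photo": "a.jpg"}
bob = {"id": 2, "first_name": "Bob", "last_name": "B", "photo": "b.jpg"}
carol = {"id": 3, "first_name": "Carol", "last_name": "C", "photo": "c.jpg"}

# A list of (friend, mutual friends or None) tuples.
common = [
    (alice, [bob]),
    (bob, [alice]),
    (carol, None),   # no friends in common with carol
]


def mutual_counts(common):
    """Map each friend's id to the number of mutual friends."""
    return {friend["id"]: len(mutual) if mutual else 0
            for friend, mutual in common}


counts = mutual_counts(common)
print(counts)
```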

    def common_friends(self):
        """
        read https://vk.com/dev/friends.getMutual and read https://vk.com/dev/execute
        """
        def parts(lst, n=25):
            """split the list into chunks of n items (25 by default)"""
            return [lst[i:i + n] for i in range(0, len(lst), n)]

        result = []
        for i in parts(list(self.all_friends.keys())):
            # build the code parameter (for execute)
            code = 'return {'
            for id in i:
                code = '%s%s' % (code, '"%s": API.friends.getMutual({"source_uid":%s, "target_uid":%s}),' % (id, self.my_id, id))
            code = '%s%s' % (code, '};')
            for key, val in requests.get(self.request_url('execute', 'code=%s' % code)).json()['response'].items():
                if int(key) in self.all_friends:
                    # pair the friend's record with the records of the mutual friends, or None
                    result.append((self.all_friends[int(key)],
                                   [self.all_friends[int(j)] for j in val] if val else None))
        return result

So, if you want to see your list of friends along with the friends you have in common with each of them, run:

 python main.py 

Graph visualization


I chose d3, specifically the Curved Links example. For it we need to generate JSON that looks something like this:

    {
      "nodes": [
        {"name": "Myriel", "group": 1, "photo": "path"},
        {"name": "Napoleon", "group": 1, "photo": "path"},
        {"name": "Mlle.Baptistine", "group": 1, "photo": "path"}
      ],
      "links": [
        {"source": 1, "target": 0, "value": 1},
        {"source": 2, "target": 0, "value": 8}
      ]
    }
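A conversion from the list of (friend, mutual friends) tuples to this format could be sketched as follows. This is not the project's 2d3.py but an illustration under assumptions: to_d3_graph is an invented name, every link gets the value 1, all nodes go into group 1, and each friendship is emitted once even though it appears in both friends' lists.

```python
import json


def to_d3_graph(common):
    """Convert [(friend, mutual_friends_or_None), ...] into d3 nodes and links."""
    index = {}   # friend id -> position in the nodes list
    nodes = []
    for friend, _ in common:
        index[friend["id"]] = len(nodes)
        nodes.append({
            "name": "%s %s" % (friend["first_name"], friend["last_name"]),
            "group": 1,
            "photo": friend["photo"],
        })

    seen = set()  # undirected pairs that already have a link
    links = []
    for friend, mutual in common:
        for other in mutual or []:
            if other["id"] not in index:
                continue
            pair = tuple(sorted((index[friend["id"]], index[other["id"]])))
            if pair[0] != pair[1] and pair not in seen:
                seen.add(pair)
                links.append({"source": pair[0], "target": pair[1], "value": 1})
    return {"nodes": nodes, "links": links}


# Made-up sample data: a knows b, b knows a and c, c knows b.
a = {"id": 10, "first_name": "A", "last_name": "One", "photo": "a.jpg"}
b = {"id": 20, "first_name": "B", "last_name": "Two", "photo": "b.jpg"}
c = {"id": 30, "first_name": "C", "last_name": "Three", "photo": "c.jpg"}
graph = to_d3_graph([(a, [b]), (b, [a, c]), (c, [b])])
print(json.dumps(graph, indent=2))
```

The resulting dictionary can be written to a file with json.dump and loaded by the d3 page.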

After a slight modification of index.html, the friends' photos become the nodes.

If you want to immediately visualize the graph:

 python 2d3.py 

The file miserables.json appears in the web folder. Do not forget to open index.html in Mozilla Firefox, or run python -m http.server 8000 and open the page in Chrome.

The visualization slows down with a large number of friends, so I am thinking about using WebGL in the future.

This is the graph of the friendships of one of my friends. Connections are everything.

Of course, I wondered whose approach is faster.

The article that inspired me says:
On my 333 friends, it took 119 seconds.

At the time of writing, Himura had 321 friends on VK. It took me 9 seconds (for the entire program to run, not just friends.getMutual).

Finally


All the necessary information about the methods used can be found in the detailed VKontakte documentation, but I did discover a couple of errors: error code 15 was not described ('error_msg': 'Access denied: user deactivated', 'error_code': 15), although you can guess what it means, and the documentation for the friends.get method had uid instead of user_id. After 2 days:


As mentioned at the outset, the project can be found on GitHub. I will be glad if someone else likes it and I get lots of tasty pull requests...

UPD (05/27/2014):
At the suggestion of WTFRU7, I added the ability to use stored procedures. To do this, follow the link.
Create a stored procedure named getMutual. Copy the contents of execute_getMutual.js into the form and save it. Do not forget to download the newer version of the project. The final version of our scheme looks like this:

UPD (06/16/2014):
We get a perpetual token.
UPD (07.11.2014):
Added schema explanation.
UPD (11/14/2014):
Continuation

Source: https://habr.com/ru/post/221251/

