This post is a short report on reverse engineering and analyzing how vk.com, the most popular social network in the CIS, works. Most of the analysis was done from the security side (although the social network is, of course, also very attractive as a high-load project). Along the way I picked up a few interesting design ideas and simply enjoyed the process. The post turned out a bit muddled, so it only covers the points I found interesting.
Contents
Overview
Architecture
  php 5.2 / 5.3
  Load periods (news feed auto-loading turned off)
  Message Wrappers
  Different code for mobile and full versions
Security
  Features
    Authorization
    Anti-CSRF tokens
    Iframe ban
    POST disabled on content servers
  Feature-bugs
    Find out age through search
  Bugs
    XSS
    Uploading unnamed documents
    Loading photos from closed albums via ajax (?)
    Anti-CSRF tokens are not everywhere
Miscellanea
Overview
First of all, you should read a couple of old, but interesting articles:
Turn on Firebug and look at the site purely as a web developer, and it is immediately clear that things are somewhat messy inside (we will return to this at the end of the article): there are no dispatchers (perhaps to reduce the load), styles and js files are scattered around, and much more. Overall, though, it is a straightforward web application that uses nodejs in the busy places and “standard” (I would even say “well-worn”) development technologies everywhere else, apart from their undisclosed database and a tuned balancer configuration.
Architecture
php 5.2 / 5.3
Depending on which backend a request lands on, the application reports different php versions (5.2 / 5.3). That is, php is most likely being upgraded to pick up bug fixes in the interpreter itself rather than for the new features of php 5.3. By the way, suhosin is installed.
Load periods (news feed auto-loading turned off)
Most likely there is a monitor that watches the load and adjusts the scripts' configuration dynamically. For example, towards evening automatic loading of the news feed is turned off, which, with several million users, can reduce the load significantly.
Message Wrappers
Attaching different kinds of content to messages is implemented quite primitively, though maybe that is the right way to do it. When a video is attached, the request that sends the message looks like this:
act=a_send&al=1&chas=XXXXXXXXXXXXXX&from=box&media=video%3A-22558194_163667075&message=&title=&to_ids=6254003
That is, the request just names the attachment type (video, audio, map, etc.) and an internal reference to the item. I poked at this for a long time, trying to slip something else in there, but nothing worked.
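To make the structure of that request easier to see, here is a minimal Python sketch that parses the body quoted above and splits the media field into its parts. Reading the two numbers as an owner id and an item id is my assumption, and parse_media is a hypothetical helper, not anything from the site's code.

```python
from urllib.parse import parse_qs

# Request body exactly as quoted in the post (the chas value is masked there too).
BODY = ("act=a_send&al=1&chas=XXXXXXXXXXXXXX&from=box"
        "&media=video%3A-22558194_163667075&message=&title=&to_ids=6254003")


def parse_media(body: str) -> dict:
    """Split the `media` field into attachment type and the two numeric ids.

    Assumption: the part after the colon is "<owner id>_<item id>".
    """
    fields = parse_qs(body, keep_blank_values=True)  # also decodes %3A to ":"
    media = fields["media"][0]                       # "video:-22558194_163667075"
    kind, _, ref = media.partition(":")
    owner_id, _, item_id = ref.partition("_")
    return {"type": kind, "owner_id": int(owner_id), "item_id": int(item_id)}


print(parse_media(BODY))
```

The attachment is therefore nothing more than a typed pointer; the server has to check on its side that the sender is allowed to reference that item.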
Different code for mobile and full versions
This is probably obvious, but it is not good. It is most likely caused by poor abstraction in the code, which makes it impossible to simply take a method and call it from another interface (for example, the mobile one). There is evidence for this, such as differing translations: the status “single” is rendered in English as “Single” in the full version and as “Not married” in the mobile version (I filed a ticket about this). Those are only templates, but other details also point indirectly to separate code (perhaps the code is simply duplicated in a cut-down form).
Security
Features
Authorization
This is the first feature worth noting. Authorization works as follows: the form's action points over https to login.vk.com, which sets a cookie for itself (https://login.vk.com) and one for vk.com (remixsid). If something is wrong with the cookie on the “working” domain (it was modified, the last IP does not match, etc.), the user is automatically redirected to login.vk.com. If the password really was entered in this browser, there will be a persistent cookie there (set when the password was originally entered), and it is used to log in automatically. If not (for example, the cookies were stolen and used from another IP), access to the account is denied. By the way, when I found the XSS I got excited too early: stealing cookies gives nothing (unless, of course, the XSS is on login.vk.com). A record of the most recent active sessions is also kept: 6 of them, which are considered “alive”.
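The decision logic described above can be sketched roughly as follows. This is only my reading of the observed behavior, not the real implementation: the in-memory stores, the cookie values and the check_request function are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Session:
    user_id: int
    last_ip: str


# Hypothetical in-memory stores standing in for the real backends.
SESSIONS: Dict[str, Session] = {"sid-1": Session(user_id=42, last_ip="10.0.0.1")}
PERSISTENT: Dict[str, int] = {"login-1": 42}  # cookie on login.vk.com -> user id


def check_request(remixsid: Optional[str], login_cookie: Optional[str], ip: str) -> str:
    """Return "ok", "relogin" or "denied" for one incoming request."""
    session = SESSIONS.get(remixsid or "")
    if session is not None and session.last_ip == ip:
        return "ok"
    # Something is wrong with the working-domain cookie: the site redirects to
    # the login domain, where only the persistent cookie can restore access.
    if PERSISTENT.get(login_cookie or "") is not None:
        return "relogin"
    return "denied"


print(check_request("sid-1", None, "10.0.0.1"))       # same browser, same IP
print(check_request("sid-1", None, "10.9.9.9"))       # stolen remixsid, other IP
print(check_request("sid-1", "login-1", "10.9.9.9"))  # real browser, new IP
```

The point of the split is that a cookie stolen from vk.com alone is not enough: the persistent cookie lives on a different domain that an XSS on the main site cannot read.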
Anti-CSRF tokens
I think (and I am surely not alone) that tokens are the best protection against CSRF. And how nicely they are implemented here... The first part is standard: for any action a code is generated for the user that cannot be predicted. The peculiarity is that the tokens depend on the parameters of the action. In this case these are from_id, to_id, action_id, and sometimes a few more. A hash is computed over all of them (most likely some fast 72-bit hash function), probably with a salt (static?) mixed in (although recently from_id is no longer passed explicitly), and the result is added to the form. That is, every action with different parameters gets a different token.
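Here is a minimal sketch of such parameter-bound tokens. I do not know which hash function or salt the site really uses, so HMAC-SHA256 truncated to 18 hex characters (72 bits) is my stand-in, and SECRET_SALT, action_token and verify are hypothetical names.

```python
import hashlib
import hmac

SECRET_SALT = b"server-side-secret"  # assumption: a static salt known only to the server


def action_token(from_id: int, to_id: int, action_id: str, *extra: str) -> str:
    """Derive a token bound to the user and to the parameters of one action."""
    message = "|".join([str(from_id), str(to_id), action_id, *extra]).encode()
    digest = hmac.new(SECRET_SALT, message, hashlib.sha256).hexdigest()
    return digest[:18]  # 18 hex characters = 72 bits


def verify(token: str, from_id: int, to_id: int, action_id: str, *extra: str) -> bool:
    """Recompute the token from the request parameters and compare in constant time."""
    expected = action_token(from_id, to_id, action_id, *extra)
    return hmac.compare_digest(token, expected)


token = action_token(1001, 6254003, "a_send")
print(verify(token, 1001, 6254003, "a_send"))  # same action, same parameters
print(verify(token, 1001, 7777777, "a_send"))  # same action, different recipient
```

With this design a leaked token is only good for the exact action it was issued for, and the server does not need to store tokens at all, since it can recompute them on every request.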
Iframe ban
The title says it all. The site cannot be displayed in an iframe, which is correct. There are various ways to prohibit displaying a site in an iframe.
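I did not record which of those ways the site uses, so the following is just a sketch of one common approach: response headers that tell the browser not to render the page inside a frame. The forbid_framing middleware and demo_app are hypothetical; the X-Frame-Options header and the CSP frame-ancestors directive are standard.

```python
def forbid_framing(app):
    """WSGI middleware that adds anti-framing headers to every response."""
    def wrapped(environ, start_response):
        def patched_start(status, headers, exc_info=None):
            headers = list(headers) + [
                ("X-Frame-Options", "DENY"),
                ("Content-Security-Policy", "frame-ancestors 'none'"),
            ]
            return start_response(status, headers, exc_info)
        return app(environ, patched_start)
    return wrapped


def demo_app(environ, start_response):
    """Tiny placeholder application used only to exercise the middleware."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


application = forbid_framing(demo_app)
```

The other classic option is a frame-busting script in the page itself, but headers are harder to bypass because the browser enforces them before any page code runs.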
POST disabled on content servers
For various reasons it is right to limit the methods a server accepts. POST is disabled on all content servers.
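As an illustration of the idea (not of the site's actual server configuration, which I have not seen), a read-only content server can reject everything except a short allow-list of methods. The read_only middleware, static_app and the choice of GET and HEAD as the allowed methods are my assumptions.

```python
ALLOWED_METHODS = ("GET", "HEAD")  # assumption: all a static content server needs


def read_only(app):
    """WSGI middleware that answers 405 to any method outside the allow-list."""
    def wrapped(environ, start_response):
        if environ.get("REQUEST_METHOD") not in ALLOWED_METHODS:
            start_response("405 Method Not Allowed",
                           [("Allow", ", ".join(ALLOWED_METHODS)),
                            ("Content-Length", "0")])
            return [b""]
        return app(environ, start_response)
    return wrapped


def static_app(environ, start_response):
    """Placeholder for whatever actually serves the files."""
    start_response("200 OK", [("Content-Type", "application/octet-stream")])
    return [b"file contents"]


application = read_only(static_app)
```

In practice this kind of restriction usually lives in the web server or balancer configuration rather than in application code; the sketch only shows the behavior.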
Feature-bugs
Find out age through search
A little trick. If a person has filled in their age but set it not to be displayed, you can still find it out: find the person through search and then narrow down the age with the age filter. This is probably an ancient feature-bug.
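Why this leaks the value is easy to show offline. The sketch below runs against a local stand-in for the search filter, not against the site; using binary search for the narrowing is my choice, and infer_age, fake_search and the 14..80 range are hypothetical.

```python
def infer_age(matches, low=14, high=80):
    """Narrow down a hidden value using only yes/no answers of a range filter.

    `matches(a, b)` must return True when the person appears in results
    filtered to ages a..b inclusive.
    """
    if not matches(low, high):
        return None
    while low < high:
        mid = (low + high) // 2
        if matches(low, mid):
            high = mid
        else:
            low = mid + 1
    return low


HIDDEN_AGE = 27  # value the profile owner chose not to display


def fake_search(age_from, age_to):
    """Local stand-in for a search whose filter ignores the privacy setting."""
    return age_from <= HIDDEN_AGE <= age_to


print(infer_age(fake_search))
```

The fix is equally simple: a filter on a field must respect the same privacy setting as the display of that field.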
Bugs
XSS
It probably all started with me wanting to find an XSS, just for myself. It was found in the main module of the site, search. The vector was: vk.com/search?c%5Bage_from%5D=alert(String.fromCharCode(88, 83, 83, 32, 101, 120, 97, 109, 112, 108, 101, 33))&c%5Bname%5D=1&c%5Bsection%5D=people
Alert screen
Source
The age value for the search was substituted unfiltered as an argument of a js function. The only restriction I ran into was that quotes could not be used (they were converted), which is easily bypassed with string functions. It was fixed within a day or two. They have no bounty program; the reply was simply “fixed, please check”. PS: a sniffer was injected successfully, but it gave nothing because of the first point about the authorization system.
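Two small illustrations in Python. The first decodes the character codes from the vector to show how a string can be built without quotes; the second is a sketch of the kind of check that closes the hole, accepting the parameter only if it is a plain number. safe_age is a hypothetical helper of mine, and I do not know how the fix was actually made.

```python
import re

# Character codes from the vector quoted in the post.
CODES = [88, 83, 83, 32, 101, 120, 97, 109, 112, 108, 101, 33]


def from_char_code(codes):
    """Python equivalent of JavaScript's String.fromCharCode for these codes."""
    return "".join(chr(c) for c in codes)


def safe_age(raw):
    """Return the age as an int, or None if the value is not 1 to 3 ASCII digits."""
    if re.fullmatch(r"[0-9]{1,3}", raw):
        return int(raw)
    return None


print(from_char_code(CODES))       # XSS example!
print(safe_age("25"))              # 25
print(safe_age("alert(1)"))        # None
```

Converting quotes alone is not enough when a value lands in a script context; validating that a numeric parameter really is a number is.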
Uploading unnamed documents
I did not have enough time to investigate this bug and the next one, but the article keeps getting postponed as it is, so here they are as is. When a document is uploaded, the script looks at the file extension and either accepts the file or rejects it (naturally, I tried to upload an htaccess file to remap file handlers by extension). There was a bug with .model files: the script let them through, but requesting them always returns a 404.
Loading photos from closed albums via ajax (?)
By crafting an ajax request for the desired document, you can get the full-size photo (for example, one noticed in the news feed) even though the privacy settings close it to you personally. Maybe it is worth digging further.
Anti-CSRF tokens are not everywhere
That really is the case. But the places I found were not critical.
Miscellanea
Files are not deleted, even after deletion
For some reason this worries many users. If you read the articles listed at the beginning, you will know there are two explanations for this: fragmentation of data on the hard drives and the problem of controlling data integrity. The second one really is hard: a picture may be deleted in one place and still be used somewhere else, and scanning the entire database for uses of a document is not easy. For pictures that is fine, since they are hotlinked. Documents, however, are only served through a script, and yet even they remain available after deletion, although a check could have been added. The technical support answer was: “the document will be available by reference, even if it is deleted.” And that's it.
https is not enabled by default
Coming back to the Kaspersky letter: I advise you to change your bookmarks from http to https right now, if you have not done so already. MITM is now so simple that any rooted android smartphone is enough. Install droidsheep, tap a couple of buttons, and you can sniff cookies (and not only cookies) and automatically open the site with them.
Garbage
Trite, but vk.com/config.php exists :) In general, such things are kept outside htdocs.
And a few small things. They have started cleaning the database. For example, the limit on messages in a conversation with one user is large, but old messages are already being cleared. Various other findings remained as scraps in my notebook.
Maybe I have misunderstood how the resource works somewhere, and you have a clearer picture of some of the points described above or have studied something yourself. If so, I will be glad to hear about it and discuss it in the comments.