From the day it launched, Yandex.Disk has let third-party developers use it in their applications and programs. This is possible because we chose WebDAV as the protocol for Disk's desktop clients.
The protocol determines how programs and the server communicate with each other, so almost everything depends on this choice: how the clients are structured and what file-handling capabilities they have.

Today we want to talk about the reasons that led us to choose WebDAV as the protocol for Yandex.Disk clients.
Thanks to the API built on top of it, ABBYY FineScanner, Handy Backup 7, ES Explorer and the unofficial Yandex.Disk client for Linux already work with our service.
Before choosing a protocol, we defined the requirements that mattered most to us:
- Speed;
- An open license;
- The ability to implement everything we needed: authentication, file operations, concurrent access to files, and resuming both downloads from the server and uploads to the server;
- Wide support: it should work on the target operating systems (primarily Windows, Mac OS X and Linux) out of the box or with minimal modifications.
We were even ready to develop our own protocol if none of the existing ones suited us. Changing the protocol after launch would cost many man-hours, so we had to explore the options and choose the one that best met our requirements.
FTP. This protocol for remote file access is time-tested. But it was designed without information security in mind, which was a significant drawback for us. In addition, it does not support many of the operations we need, such as transferring metadata along with a file's contents. And it requires special applications to connect.
BitTorrent. Since we were talking about synchronizing several devices at once, it would be very useful to transfer data directly between them without putting load on the servers, but this would require double the development work on the client. In addition, there would be problems working through NATs and firewalls, which would greatly reduce the benefits of this protocol.
Amazon S3. This storage service uses its own HTTP-based protocol. We considered using the S3 API but rejected the idea because it lacks the familiar way of working with directories and requires special applications for access.
WebDAV. Based on HTTP and XML and easily extensible, it supports almost everything we need in its specifications. The software preinstalled in all target operating systems works with it quite well. In addition, the Yandex.Disk desktop client development team, which had been working on the Yandex XMPP server, already had experience with open XML-based protocols at that time.
The main reason we did not want to create our own protocol was that only our own applications would be able to work with it, and we wanted openness.
In the end, of all the options discussed, we chose WebDAV. The only thing the protocol lacks is a way to notify the client about changes on the server, a very important feature for synchronization. But since the protocol is extensible, this did not become a problem.
After choosing the protocol, we began work on a prototype of Yandex.Disk. We wrote our WebDAV server in Erlang. As the web server framework we chose mochiweb, a fairly lightweight library that our developers knew well. It was also used in the well-known article about connecting a million users to one server, "A million user comet application". We also considered the Yaws web server, which is comparable to Apache: a full-fledged web server that can serve static files, run CGI scripts and process pages with server-side scripts. But we did not need any of that. If we were starting the project now, we would choose Cowboy, as it provides more ways to diagnose connection problems.
After studying the WebDAV protocol, we started on listing the files and directories on the server. For the prototype's storage we used a MySQL database, which held the metadata, and an ordinary file system, which held the file contents. Scaling and high reliability were not required at this stage.
The schema was fairly simple, since this was a prototype. As usual with file systems, the question of path limits came up. Since the protocol does not specify a maximum length for a resource path, we decided to limit each path component to 255 characters and leave the nesting depth unlimited. The file storage table looked roughly like this:
id | number, auto-increment; unique resource identifier |
uid | user, resource owner |
path | string of up to 255 characters; resource name |
type | resource type: file or directory |
parent | number; id of the parent resource |
depth | number; resource nesting level, used to optimize select queries |
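As an illustration only, the table above can be sketched in Python with the standard `sqlite3` module standing in for the MySQL database used in the prototype. The table name `resources`, the helper functions, the per-user root row and the uniqueness constraint are assumptions made for this sketch, not details from the original server. The text does not say how `depth` was used in queries, so here it is simply stored.

```python
import sqlite3

MAX_COMPONENT_LEN = 255  # per-component limit described in the text


def open_db():
    """Create an in-memory database with a hypothetical `resources` table."""
    db = sqlite3.connect(":memory:")
    db.execute("""
        CREATE TABLE resources (
            id     INTEGER PRIMARY KEY AUTOINCREMENT,
            uid    INTEGER NOT NULL,
            path   TEXT NOT NULL CHECK (length(path) <= 255),
            type   TEXT NOT NULL CHECK (type IN ('file', 'dir')),
            parent INTEGER REFERENCES resources(id),
            depth  INTEGER NOT NULL,
            UNIQUE (uid, parent, path)
        )""")
    return db


def add_root(db, uid):
    """Insert the root directory of a user (assumed: empty name, depth 0)."""
    cur = db.execute(
        "INSERT INTO resources (uid, path, type, parent, depth) "
        "VALUES (?, '', 'dir', NULL, 0)", (uid,))
    return cur.lastrowid


def add_resource(db, uid, parent_id, name, rtype):
    """Insert one path component under an existing parent directory."""
    if not name or "/" in name or len(name) > MAX_COMPONENT_LEN:
        raise ValueError("invalid path component")
    row = db.execute(
        "SELECT depth FROM resources WHERE id = ? AND uid = ?",
        (parent_id, uid)).fetchone()
    if row is None:
        raise LookupError("parent not found")
    cur = db.execute(
        "INSERT INTO resources (uid, path, type, parent, depth) "
        "VALUES (?, ?, ?, ?, ?)", (uid, name, rtype, parent_id, row[0] + 1))
    return cur.lastrowid


def list_children(db, uid, parent_id):
    """Return (name, type) pairs for the direct children of a directory."""
    rows = db.execute(
        "SELECT path, type FROM resources WHERE uid = ? AND parent = ? "
        "ORDER BY path", (uid, parent_id))
    return rows.fetchall()
```

Because each row stores only one path component plus a reference to its parent, the 255-character limit applies per component while nesting stays unbounded, which matches the decision described above.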
One of the first non-trivial tasks was listing an empty root directory. The difficulty is that the PROPFIND method, besides listing, also reads resource properties. We had to parse the request correctly, work out what we could return and what we could not, and form a correct response. The first client we used was gvfs, built into Ubuntu. Having debugged our server against it, we decided to test a connection from Windows 7 and found that it did not work with us. Studying how it behaved with other servers showed that the clients built into Windows do not handle the "DAV:" namespace if it is declared as the default one, without a prefix. Other standard clients turned out to be more tolerant and readily accepted the output shaped specifically for Windows clients. Fortunately, this was the only incompatibility we found.
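To make the namespace issue concrete, here is a minimal sketch in Python (not the Erlang code of the actual server) of reading a PROPFIND request body and building a 207 Multi-Status body in which "DAV:" is always bound to an explicit prefix. The function names and the `D` prefix are choices made for this example, and the response reports only the `resourcetype` property.

```python
import xml.etree.ElementTree as ET

DAV_NS = "DAV:"
# Bind "DAV:" to an explicit "D" prefix so the serializer never
# emits it as a default (prefix-less) namespace declaration.
ET.register_namespace("D", DAV_NS)


def dav(tag):
    """Return the fully qualified ElementTree name of a DAV: element."""
    return f"{{{DAV_NS}}}{tag}"


def requested_props(body):
    """Return 'allprop', 'propname', or a list of requested property tags."""
    if not body.strip():
        return "allprop"  # an empty PROPFIND body is treated as allprop
    root = ET.fromstring(body)
    if root.find(dav("propname")) is not None:
        return "propname"
    prop = root.find(dav("prop"))
    if prop is not None:
        return [child.tag for child in prop]
    return "allprop"


def multistatus(resources):
    """Build a Multi-Status body from (href, is_collection) pairs."""
    root = ET.Element(dav("multistatus"))
    for href, is_collection in resources:
        response = ET.SubElement(root, dav("response"))
        ET.SubElement(response, dav("href")).text = href
        propstat = ET.SubElement(response, dav("propstat"))
        prop = ET.SubElement(propstat, dav("prop"))
        rtype = ET.SubElement(prop, dav("resourcetype"))
        if is_collection:
            ET.SubElement(rtype, dav("collection"))
        ET.SubElement(propstat, dav("status")).text = "HTTP/1.1 200 OK"
    return ET.tostring(root, encoding="unicode")
```

For an empty root, `multistatus([("/", True)])` yields a single response element describing the root collection itself, with every element written as `D:...` and the namespace declared as `xmlns:D="DAV:"`.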
When the work on listing was finished, we implemented the trivial operations of creating directories and deleting resources.
Next we had to learn how to upload files, and that operation turned out to be less simple. If the topic interests you, we will explain why in the next post.