A 1.5 MB network file system on Habr

Not everyone may know this, but Habr is not far behind the market leaders in offering data storage on its servers to all registered users. It is not as big as Google's or Yandex's, only megabytes, but it is enough to store a dozen draft articles and other data tied to the site, which scripts can then use together with site materials without cross-domain access. We assume that 1 MB of Unicode characters is enough for all our needs. What does the system offer?

Available types and volumes:

1) As draft articles (visible only to the author). Each draft stores at least 100 thousand characters. The number of drafts is effectively unlimited.

2) As notes on users (visible only to the author, but nothing protects against vulnerabilities, leaks, or deletion when the user closes the account). One note stores up to 65535 characters. You can have as many notes as there are users (the system-wide limit on the number of notes was not checked). You cannot write a note on yourself.

3) As personal messages to other users. (You cannot write to yourself.)

4) As a public message on a personal page, up to 65 K (besides being public, it also lives on a separate third-level domain).

5) (A bit publicly (c)) as habrastorage files. If you do not tell anyone their names, you can store a lot of data rather compactly and almost unnoticed. But it is unknown when, if ever, they are removed, so every version stays open for anyone to read indefinitely.

With such rich capabilities, all that remains is to write drivers that expose them through something like the browser's localStorage API. Optimization pragmas (hints about how the data will be used later) will be needed to distribute the data efficiently.

None of these methods inconveniences other readers, apart from drafts published by mistake and personal messages sent to others without warning. Even a public post on a personal page can be hidden with an HTML tag so that readers do not see it.

How to use?

In total, storing 1 MB of characters takes 16 notes or 10 (or fewer) drafts. Notes are the most error-proof option, so we will not consider the other methods, even though a draft can hold a larger block of information.
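The arithmetic behind these figures can be checked with a short sketch. It assumes that "1 MB" here means 1,000,000 characters (the reading under which the numbers in the text come out); the function and constant names are ours, for illustration only.

```javascript
// Assumption: "1 MB" is taken as 1,000,000 characters.
const TOTAL_CHARS = 1000000;
const NOTE_CHARS = 65535;   // per-note limit stated in the text
const DRAFT_CHARS = 100000; // per-draft size the author confirmed works

// How many containers of a given capacity are needed for `total` characters.
function containersNeeded(total, capacity) {
  return Math.ceil(total / capacity);
}

const notesNeeded = containersNeeded(TOTAL_CHARS, NOTE_CHARS);
const draftsNeeded = containersNeeded(TOTAL_CHARS, DRAFT_CHARS);
console.log(notesNeeded, draftsNeeded); // 16 10
```

With a binary megabyte (1,048,576 characters) the same calculation gives 17 notes, because 16 notes of 65535 characters fall 16 characters short.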

Then again, large blocks of information are not always needed. Most often the blocks are small, and here notes win too, because a note page is rendered with a reduced sidebar, while a draft (in reading mode) comes with the full one. The result is less page-design overhead and less wasted traffic.

It is not very convenient that the list of notes shows each note in full rather than its first N characters. That can be handy for dumping the notes or for any other reason to read everything, but it is inconvenient for viewing index information that could be stored at the beginning of each note. Since there is no such feature, we will set aside 1-2 notes for the index (two in order to duplicate the index, as real disk systems do) and will almost always read only individual notes, knowing their user names in advance.

Next we need to choose 16 user names and, perhaps, register a personal read-only account for the index file, to be sure it will not be deleted. Since the package needs to read small but up-to-date pieces of data, several accounts in our file system should be allocated to small files. Not every account will hold a large note. It is better to have several times more account names than to write everything into one long string and be forced to reread unchanged data along with it. So the file system needs several times more than 16 account names, for example 100.

The pair of duplicate indexes will hold a table that maps files to account names.
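A minimal sketch of such an index follows. The JSON layout, the function names, and the file and account names are all hypothetical; the text only says that the index is a file-to-account table kept in two copies.

```javascript
// Hypothetical index layout: file name -> account name whose note holds it.
function buildIndex(fileNames, accountNames) {
  if (fileNames.length > accountNames.length) {
    throw new Error("not enough account names for the files");
  }
  const index = {};
  fileNames.forEach((file, i) => { index[file] = accountNames[i]; });
  return index;
}

// The same string is written into both index notes (primary and duplicate).
function serializeIndex(index) {
  return JSON.stringify(index);
}

// Read the index, falling back to the duplicate when the primary copy
// is missing or is not valid JSON.
function readIndex(primaryText, duplicateText) {
  for (const text of [primaryText, duplicateText]) {
    if (typeof text !== "string") continue;
    try {
      const parsed = JSON.parse(text);
      if (parsed !== null && typeof parsed === "object") return parsed;
    } catch (e) {
      // Damaged copy: try the next one.
    }
  }
  throw new Error("both index copies are unreadable");
}

const fileIndex = buildIndex(["settings", "draft-1"], ["aa00", "aa01"]);
const storedIndex = serializeIndex(fileIndex);
const recoveredIndex = readIndex("<<damaged>>", storedIndex);
console.log(recoveredIndex["draft-1"]); // aa01
```

Keeping the second copy means that one deleted account or one damaged note does not lose the whole table.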

Note that using someone else's name does not burden the account holders. For them the notes do not exist: they are the personal information of the user who wrote them. Formally the notes are about those account holders, but in fact their content is arbitrary. Users can damage files by deleting their accounts. The administration can do the same by deleting a user account or by publishing the notes.

For reliability, and also to support the same system for users without accounts, you need a backup server for storing notes. Maintaining a system of 20-100 text files will greatly simplify working with this megabyte of information. In effect it will be highly structured information like the browser's local storage, but available over the web. It can be worked with on the same principles: by key, through a single-level structure (hash, dictionary). The strings in it can be serialized structures.
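The single-level dictionary with serialized values could look roughly like the sketch below. The class name NoteStorage, its methods, and the Map used as a backend are assumptions made for illustration: a real driver would replace the Map with Ajax requests to the site (and would therefore be asynchronous), and this is not an existing Habr API.

```javascript
// Maximum characters in one note, as stated in the text.
const MAX_NOTE_CHARS = 65535;

// Hypothetical driver: one key = one note, addressed by the account name
// found in the index. `backend` stands in for the Ajax layer; here it is
// a plain Map so the sketch runs without a network.
class NoteStorage {
  constructor(index, backend) {
    this.index = index;     // key -> account name
    this.backend = backend; // account name -> note text
  }

  // Look up the account for a key, ignoring inherited object properties.
  accountFor(key) {
    return Object.prototype.hasOwnProperty.call(this.index, key)
      ? this.index[key]
      : undefined;
  }

  setItem(key, value) {
    const account = this.accountFor(key);
    if (account === undefined) throw new Error("no account for key: " + key);
    const text = JSON.stringify(value); // values are serialized structures
    if (text.length > MAX_NOTE_CHARS) {
      throw new Error("value does not fit into one note");
    }
    this.backend.set(account, text);
  }

  getItem(key) {
    const account = this.accountFor(key);
    if (account === undefined || !this.backend.has(account)) return null;
    return JSON.parse(this.backend.get(account));
  }

  removeItem(key) {
    const account = this.accountFor(key);
    if (account !== undefined) this.backend.delete(account);
  }
}

const noteStorage = new NoteStorage({ settings: "aa00" }, new Map());
noteStorage.setItem("settings", { theme: "dark", fontSize: 14 });
console.log(noteStorage.getItem("settings").fontSize); // 14
console.log(noteStorage.getItem("missing"));           // null
```

Like localStorage, the sketch returns null for unknown keys; unlike localStorage, it accepts structures and serializes them itself.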

The entire file system will work on JS and via Ajax requests.

To be continued. For now, here are some suggestions for the owners of this free hosting; maybe they will have time to implement something by the time of release.
1. It is very desirable for the list of notes to show only the beginning of each text (100-300 characters), so that these beginnings can serve as indexes.
2. Have a list of 100 guaranteed non-removable users with computed names like aa00..aa99.
3. Offer a mode for reading the list and the notes without page design: it saves 18 KB on each request!
4. Let authors delete their own pictures, so that pictures can be used as public data: for example, to write articles in picture format and publish corrections and comments to them. A script that unpacks the article texts and corrections from the binary data of the pictures would then be enough. Since only script users will be able to read them, the format will become popular among developers, and the share of phone reviews among pictures will decrease.
5. Do not duplicate the data in a hidden input, so as not to make the note page heavier.
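The computed names from suggestion 2 are easy to generate on the client. A minimal sketch, where the function name and the range check are our own assumptions:

```javascript
// Build names from a prefix and a zero-padded two-digit number.
function computedAccountNames(prefix, count) {
  if (count < 0 || count > 100) {
    throw new RangeError("count must be between 0 and 100");
  }
  const names = [];
  for (let i = 0; i < count; i++) {
    names.push(prefix + String(i).padStart(2, "0"));
  }
  return names;
}

const reservedNames = computedAccountNames("aa", 100);
console.log(reservedNames[0], reservedNames[99], reservedNames.length); // aa00 aa99 100
```

Because the names are computed rather than stored, a driver would not need to download any list of accounts before addressing a note.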

While the negotiations and approvals are under way, we will have time to write file system drivers that use at least the existing site capabilities described above. Since the drivers will be cross-domain, nothing prevents the storage from working in the same format on another domain, even if the site's own capacity is not used.

This is a crowdsourcing project (like this one), and the code will be published on GitHub. As practical tasks, readers are asked the following questions:
1. What is the maximum size of a single message in the personal message system?
2. What is the maximum size of an article that a draft supports? (I checked that 100K is stored.)
3. What is the maximum amount of private data you can store in total?

Source: https://habr.com/ru/post/194560/
