
Release of KPHP and engines

Speaking at conferences, we have often mentioned our wish to release KittenPHP under an open license, following the tradition established by large IT companies such as Google and Facebook.

The release was postponed several times because we feared we would not have enough time and energy to work with the open-source community. The long-awaited day has finally come: the code of KPHP and of several other tools used inside the project has been published.

Below is a more detailed account of the internal structure of VKontakte and of the tools that are now available to the open-source community.



The source code is published under GNU licenses (GPL and LGPL). These licenses are close to us in spirit, since we often relied on GNU-licensed tools while building these libraries.

KPHP

The VKontakte source code is written in a PHP-like language called KittenPHP, or KPHP for short. A translator of the same name converts this code into C++. The generated C++ code is then compiled automatically with gcc, producing a binary that is ready to run. This binary is a web server that accepts HTTP requests and generates pages.
To speed up development, KPHP compiles the project's files separately and then links them. Subsequent builds process only the modified files or, in the case of large files, only the modified parts of them.
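
One common way to decide which files need recompiling is to compare content hashes with those recorded by the previous build. The Python sketch below illustrates only that general idea; it is not taken from the KPHP sources, and the names `content_hash` and `files_to_recompile` are hypothetical.

```python
import hashlib


def content_hash(source):
    """Return a stable fingerprint of a file's text."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


def files_to_recompile(sources, cache):
    """Return the names of files whose text changed since the last build.

    `sources` maps file name -> current text, `cache` maps file name -> hash
    recorded by the previous build. The cache is updated in place.
    """
    changed = []
    for name in sorted(sources):
        digest = content_hash(sources[name])
        if cache.get(name) != digest:
            changed.append(name)
            cache[name] = digest
    return changed
```

On the first build every file is reported; on later builds only the files whose text has changed are reported, and the rest can be reused from the previous build.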

KPHP is a minimalist language designed to run very fast without sacrificing the convenience and speed of development. For this reason KPHP does not support every PHP feature. In particular, it has no OOP, apart from a few objects of the standard library. eval and related features, such as regular expressions with the 'e' modifier, are not supported either (the preg_replace_callback function is suggested instead). The functions that work with individual array elements (first, end, next, prev, current, reset, key) are also unsupported; the getValueByPos and getKeyByPos functions are implemented as replacements.
Dropping support for this much functionality made KPHP extremely fast compared with other web development tools.
As an example, we compared it with HipHop VM, developed at Facebook, and obtained the following results:
Test                KPHP    HHVM    PHP
simple              0.000   0.007   0.137
simplecall          0.000   0.004   0.174
simpleucall         0.007   0.008   0.178
simpleudcall        0.007   0.009   0.181
mandel              0.010   0.066   0.392
mandel2             0.011   0.074   0.355
ackermann (7)       0.001   0.011   0.189
ary (50,000)        0.003   0.008   0.024
ary2 (50,000)       0.003   0.010   0.022
ary3 (2000)         0.011   0.077   0.191
fibo (30)           0.003   0.019   0.481
hash1 (50,000)      0.018   0.034   0.044
hash2 (500)         0.011   0.021   0.039
heapsort (20000)    0.012   0.040   0.101
matrix (20)         0.007   0.021   0.121
nestedloop (12)     0.000   0.012   0.235
sieve (30)          0.013   0.016   0.114
strcat (200000)     0.002   0.005   0.014
Total               0.119   0.442   2.992


The test code is available at the link:
gist.github.com/anonymous/9391146#file-bench-php

From a developer's point of view, KPHP is compatible enough with PHP that you can test freshly written code in plain PHP and compile it only before final testing and deployment. To support functions that are implemented in KPHP but not in regular PHP, a special library that extends PHP was added: github.com/vk-com/kphp-kdb/tree/master/vkext

KittenPHP is also a good static analyzer for PHP code that points out potential errors. For example, when VKontakte was migrated to it a year ago, more than 20 serious bugs were found.

Together with the compiler, we have published under an open license a set of engines that complement KPHP well but can also be used independently of it. We first announced the release of these libraries to the open-source community at Highload 2010, so we apologize for the rather long wait.

PMemcached (“Persistent Memcached”)

A reliable key-value store that keeps data without any time limit. Over the memcached protocol the engine behaves identically to Memcache, except that all data survives a restart.
In addition to these main functions, when the corresponding option is enabled in the pmemcached configuration, a single request can fetch the whole group of records whose keys start with the prefix given in the request.
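
The two properties described above, data that survives a restart and retrieval by key prefix, can be illustrated with a small Python sketch. It is a conceptual model only: the class `PersistentKV`, its methods and the log-file format are hypothetical and say nothing about how pmemcached is actually implemented or queried.

```python
import json
import os


class PersistentKV:
    """Toy key-value store: an in-memory dict backed by an append-only log."""

    def __init__(self, path):
        self.path = path
        self.data = {}
        # Replay the log so that earlier writes are visible after a restart.
        if os.path.exists(path):
            with open(path, "r", encoding="utf-8") as log:
                for line in log:
                    key, value = json.loads(line)
                    self.data[key] = value

    def set(self, key, value):
        # Write to disk first, then update the in-memory copy.
        with open(self.path, "a", encoding="utf-8") as log:
            log.write(json.dumps([key, value]) + "\n")
        self.data[key] = value

    def get(self, key):
        return self.data.get(key)

    def get_by_prefix(self, prefix):
        """Return every record whose key starts with `prefix`."""
        return {k: v for k, v in sorted(self.data.items()) if k.startswith(prefix)}
```

Creating a second `PersistentKV` object on the same file plays the role of a restart: it sees everything that was written before.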

Lists

This engine stores and returns lists of data.
One instance of the engine can hold many lists. Each list has an identifier (int) that is used to work with it.
A list may contain an unlimited number of items. Each item has its own identifier (int), a value (int) and flags (int), and can carry up to 256 characters of arbitrary text.
Besides returning whole lists, the engine can return sublists filtered by flags and sorted by value.
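
The data model described above can be sketched in Python as follows. This is an illustrative model only, assuming that "filtering by flags" means selecting items that have all the requested flag bits set; the names `Entry`, `ListStore` and `sublist` are hypothetical and do not correspond to the engine's real interface.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    entry_id: int
    value: int
    flags: int
    text: str = ""  # short free-form text attached to the item


class ListStore:
    def __init__(self):
        self.lists = {}  # list_id -> {entry_id: Entry}

    def put(self, list_id, entry):
        self.lists.setdefault(list_id, {})[entry.entry_id] = entry

    def sublist(self, list_id, required_flags=0, by_value=False):
        """Return item ids that have all `required_flags` bits set.

        Items are ordered by value when `by_value` is true, otherwise by id.
        """
        entries = self.lists.get(list_id, {}).values()
        selected = [e for e in entries if (e.flags & required_flags) == required_flags]
        if by_value:
            selected.sort(key=lambda e: e.value)
        else:
            selected.sort(key=lambda e: e.entry_id)
        return [e.entry_id for e in selected]
```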

Documentation: github.com/vk-com/kphp-kdb/blob/master/docs/ru/KittenDB_Lists.wiki

Lists-x

A modification of the Lists engine in which list keys and record identifiers consist not of a single number (int) but of several numbers (int); how many is set in advance in the engine configuration. For example, it lets you create lists whose key is formed from a user ID and the ID of a post on that user's wall.

Documentation: github.com/vk-com/kphp-kdb/blob/master/docs/ru/KittenDB_Lists-X.wiki

Search

Designed for searching data on the site. Any text can be indexed in the engine under a specific identifier and later found by the words it contains. Search results consist of the identifiers that were specified during indexing.
Search supports arbitrary parameters for filtering by criteria and special parameters for different sort orders. The engine also supports complex groupings and intersections.
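
Indexing text under an identifier and finding it again by its words is classically done with an inverted index. The Python sketch below shows that general technique under simplifying assumptions (lowercased words, all query words must match); it is not the engine's actual algorithm, and `SearchIndex` is a hypothetical name.

```python
import re


class SearchIndex:
    def __init__(self):
        self.postings = {}  # word -> set of document ids containing it

    def add(self, doc_id, text):
        for word in re.findall(r"\w+", text.lower()):
            self.postings.setdefault(word, set()).add(doc_id)

    def search(self, query):
        """Return sorted ids of documents that contain every query word."""
        words = re.findall(r"\w+", query.lower())
        if not words:
            return []
        result = set(self.postings.get(words[0], set()))
        for word in words[1:]:
            result &= self.postings.get(word, set())
        return sorted(result)
```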

Documentation: github.com/vk-com/kphp-kdb/blob/master/docs/ru/KittenDB_Search.wiki

Storage

The engine is designed to store user data: photos, video, audio and documents. By keeping many pieces of content in a single file and indexing their offsets in memory, Storage handles this more efficiently than the classic approach of keeping each item as a separate file in the file system.
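
The idea of one container file plus an in-memory index of offsets can be sketched in a few lines of Python. This is a simplified illustration, not the Storage engine's file format: the class `BlobStore` is hypothetical, and a real system would also have to persist or rebuild the index.

```python
import os


class BlobStore:
    def __init__(self, path):
        self.path = path
        self.index = {}  # blob_id -> (offset, length), kept in memory
        open(path, "ab").close()  # create the container file if it is missing

    def put(self, blob_id, payload):
        # The new blob starts where the container file currently ends.
        offset = os.path.getsize(self.path)
        with open(self.path, "ab") as container:
            container.write(payload)
        self.index[blob_id] = (offset, len(payload))

    def get(self, blob_id):
        offset, length = self.index[blob_id]
        with open(self.path, "rb") as container:
            container.seek(offset)
            return container.read(length)
```

Reading a blob then needs one index lookup in memory and one seek in an already known file, instead of a file-system lookup for every object.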

Documentation: github.com/vk-com/kphp-kdb/blob/master/docs/ru/KittenDB_Storage.wiki

Texts

The Texts engine stores various sets of text data. It was originally developed for VK's private messaging system, but was later reused for walls and comments.
In addition to storing texts, the engine supports various groupings of text lists and full-text search. Thanks to it, a user can instantly search through their entire personal correspondence, no matter how large it is.
The engine also has a built-in HTTP server that implements long polling so that clients can receive updates. Later, however, a separate queue engine was created for this purpose; it is described below.

Documentation: github.com/vk-com/kphp-kdb/blob/master/docs/ru/KittenDB_Texts.wiki

Hints

Hints solves two important tasks:
1) It searches for a user's objects by word prefixes, which is used for quick search on the site.
2) It maintains ratings of objects, so that lists of objects can be ordered by how interesting they are to the user. The VKontakte friends list, for example, works this way.
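
Both tasks can be combined in a small Python sketch: objects are matched by word prefix and the matches are ordered by a per-user rating. The class `Hints` below, its methods and the rating rule (a simple sum of weights) are assumptions made for illustration, not the engine's real interface or formula.

```python
class Hints:
    def __init__(self):
        self.names = {}   # object_id -> display name
        self.rating = {}  # object_id -> accumulated interest of this user

    def add(self, object_id, name):
        self.names[object_id] = name

    def bump(self, object_id, weight=1.0):
        """Record that the user interacted with the object."""
        self.rating[object_id] = self.rating.get(object_id, 0.0) + weight

    def suggest(self, prefix, limit=10):
        """Return ids of objects having a word that starts with `prefix`,
        most highly rated first (ties broken by name)."""
        prefix = prefix.lower()
        matches = [
            oid for oid, name in self.names.items()
            if any(word.startswith(prefix) for word in name.lower().split())
        ]
        matches.sort(key=lambda oid: (-self.rating.get(oid, 0.0), self.names[oid]))
        return matches[:limit]
```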

Documentation: github.com/vk-com/kphp-kdb/blob/master/docs/ru/KittenDB_Hints.wiki

Queue

Queue organizes real-time communication between the client and server sides. The client connects to the Queue server assigned to it and receives updates from it, and the server can send an event to the client at any time. Because a client can subscribe to channels when it connects to a Queue, the engine can be used for one-to-many delivery. For example, when a user opens the news page, the client subscribes in Queue to the events of all of the user's friends, groups and subscriptions. When someone from that list publishes a post, the post is also written to the corresponding channel, and every subscribed user's client is notified and can then display the received data.
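
The channel model described above is a publish/subscribe scheme. The Python sketch below models it in memory, with `poll` standing in for the moment a client collects its pending updates; the class `QueueHub` and the channel names are hypothetical, and the real engine delivers events over the network rather than through method calls.

```python
class QueueHub:
    def __init__(self):
        self.subscribers = {}  # channel -> set of client ids
        self.inbox = {}        # client id -> list of pending (channel, event)

    def subscribe(self, client_id, channels):
        for channel in channels:
            self.subscribers.setdefault(channel, set()).add(client_id)

    def publish(self, channel, event):
        """Deliver one event to every client subscribed to the channel."""
        for client_id in self.subscribers.get(channel, ()):
            self.inbox.setdefault(client_id, []).append((channel, event))

    def poll(self, client_id):
        """Return and clear the events waiting for this client."""
        return self.inbox.pop(client_id, [])
```

A single `publish` call fans out to all subscribers of the channel, which is the one-to-many behaviour used for the news page.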

Documentation: github.com/vk-com/kphp-kdb/blob/master/docs/ru/KittenDB_Queue.wiki

Besides these, the repository contains a number of other tools that are less versatile but no less interesting; documentation for them is available as well.

Conclusion

By publishing this work, we are repaying our debt to the open-source community, a debt that many of us owe.

We hope these tools will now help projects that are under development today, just as MySQL, Memcache, nginx and PHP helped us create VKontakte.

The source code of the engines and KPHP is available in the GitHub repository: github.com/vk-com/kphp-kdb
Detailed documentation is available at: github.com/vk-com/kphp-kdb/tree/master/docs/en

Source: https://habr.com/ru/post/214877/

