Developing video hosting in Erlang

We present Maxim Lapshin's talk from the Application Developer Days conference. We have put together the video and audio, the presentation slides, and a transcript of the talk. The transcript took tremendous effort, but it is clearly worth it: the forty-minute talk can be “heard” several times faster by reading.

Stas Fomin (a one-man ~~steamboat~~ locomotive :)) combined the video and slides into a single clip and also prepared the transcript.

Abstract

Maxim Lapshin (erlyvideo), a developer of scalable web services, talked about developing a video streaming server in Erlang. This is the open-source project ErlyVideo: a growing, reliable, scalable and free server for broadcasting any video, from security cameras to video conferencing. The technology is of particular interest, because it was the choice of such a little-known language as Erlang that ensured high reliability, scalability and speed of development.
Erlang is a reliable object language for creating network services. Its concepts of processes and data immutability make it the only platform that offers both garbage collection and a fixed, known time of death for an object. The semantics of the language are among the simplest of the widely used languages.

These features make Erlang an excellent choice for serving stateful clients: a video streaming server (erlyvideo), the most widespread Jabber server (ejabberd), poker servers (OpenPoker), and so on. The talk examines why Erlang is so convenient for this.

Video

Podcast

Link to the podcast.

Presentation

Link to PDF with slides.

Transcript

The transcript of the video was prepared by Stas Fomin.

What is streaming?

Good afternoon. My name is Maxim Lapshin, I am the author of ErlyVideo, a video streaming server written in Erlang, and today I would like to tell you what this product is and why I built it.

So, what is streaming in general?

Slide:
YouTube is not streaming

What is YouTube? A huge number of videos are stored there, but it is not a streaming project; there is no streaming. There are just ten-second clips that are handed to you by nginx, or some other web server.

Each browser comes in, fetches some video and leaves. That has nothing to do with streaming video. Where, then, is streaming used, and what is it all about?

Custom TV

Slide:
User uploads video files
Makes a playlist
The playlist starts playing when other users request it
If nobody needs it, the video is not played

Let's look at the custom television task that I had recently. The idea is that a user uploads his files, almost like on YouTube, and then builds a playlist from them, which he wants to be shown as a TV program, as on ordinary television.

After that, other users come at some point, not necessarily at a set time. They come whenever they want, they see his interesting playlist, they like his program, and they want to watch what he has put together. And then, when all the users have left and lost interest, all the resources that were loaded have to be released.

Why not do all this on plain nginx?

It seems like a proven technology, and it works for that same YouTube, but no, here it does not work.

What makes a streamer?

Slide:
What makes a streamer?
Unpacks video and audio from file containers
Packs them into a transport container
Sends frames synchronously with real time

The problem is that the available files have to be turned into a video stream, because all users must be shown the same thing at the same moment.

For this task you need exactly a streamer, that is, a streaming video server that will do this.

It takes the necessary files out of the containers in which they lie on your disk.
It packs them into a transport container that can deliver the video to the users who came to watch it.
And it is very important to understand that it sends frames synchronously with real time.

So if you connect, then in half an hour of real time you receive exactly half an hour of video from the file. If you simply downloaded the file, you could get it much faster.
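Sending frames synchronously with real time can be sketched in a few lines of Erlang. This is only an illustration and not ErlyVideo's code: the module name `pacer`, the `{TimestampMs, Payload}` frame shape and the `Send` callback are assumptions made for the example.

```erlang
-module(pacer).
-export([delay_ms/3, stream/2]).

%% Milliseconds to wait before sending a frame whose media timestamp is
%% FrameTs, given that streaming started at StartMs and the clock reads NowMs.
%% A frame that is already late is sent immediately (delay 0).
delay_ms(FrameTs, StartMs, NowMs) ->
    max(0, FrameTs - (NowMs - StartMs)).

%% Frames is a list of {TimestampMs, Payload} (hypothetical shape);
%% Send is a fun(TimestampMs, Payload) that delivers one frame.
stream(Frames, Send) ->
    Start = erlang:monotonic_time(millisecond),
    lists:foreach(
      fun({Ts, Payload}) ->
              Now = erlang:monotonic_time(millisecond),
              timer:sleep(delay_ms(Ts, Start, Now)),
              Send(Ts, Payload)
      end, Frames).
```

With this pacing, a file that holds half an hour of video takes half an hour to send, no matter how fast the disk or the network is.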

A digression on codecs

A small digression, so that everyone understands what codecs and containers are and there is no confusion. A codec is the format in which the raw data captured from the sensor of a video camera, or from a microphone, is encoded.

A container is what the already encoded data is packed into. For example, if you have come across H.264 or AAC, these are video and audio codecs, respectively. And MP4... there is no such codec; it is a container into which absolutely any video and absolutely any audio can be put.

Slide:
Codec - format for presenting compressed audio and video data
Container - the packaging format of one or more audio and video streams in a file or stream
H.264 / AAC - the best codecs
MP4 - the smallest file container

Stages of custom TV

Slide:
Stages of custom TV
Load the playlist
Unpack the file
Pack frames into a transport container (RTMP, MPEG-TS, ...)
Clean up everything when clients leave
Allow updating the code without disconnecting clients

So what does a server that shows custom television do? It loads playlists, unpacks files, repacks them and cleans everything up. And a very important thing, which I did not mention separately: in this task it is very important to update the code without disconnecting clients, because there may be a thousand... many thousands of clients who came to watch an interesting video. If we roll out an update at that moment, all these thousands of clients will come back to us and reconnect.

Besides being a plain inconvenience for users, this is also very, very expensive, because our traffic will be completely clogged.

Traditional solutions

What do people usually use for such tasks? The traditional solutions are built with traditional tools, Java and C++, and there are products written in them that stream video. For example, the free Red5 and the paid Wowza, both written in Java, or rtmpd, written in C++.

Parsing MP3 in Java

What is the problem...? Well, I have brought an example: this is a small piece of code that parses MP3 in Java, about one hundredth of the file.

This is what any Java server looks like. You can look closely: this is a tiny part of the file, and it is very difficult to understand.

    if (id3v1 instanceof ID3V1_1Tag) {
      try {
        // Add the track property
        graph.add(mp3Resource,
            processor.resolveIdentifier(IdentifierProcessor.TRCK),
            factory.createLiteral("" + ((ID3V1_1Tag) id3v1).getAlbumTrack()));
      } catch (GraphException graphException) {
        throw new ParserException(
            "Unable to add track number to id3v1 resource.", graphException);
      } catch (GraphElementFactoryException graphElementFactoryException) {
        throw new ParserException(
            /* .... 600 more .... */
            graphElementFactoryException);
      }
    }

Parsing MP3 in Erlang

This is all you need to write in Erlang to decode MP3. That's it. Five lines. Everything is already unpacked and can be sent to users.

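    %% layer/1, version/1, bitrate/2 and framelength/1 are helpers defined elsewhere in the module.
    decode(<<2#11111111111:11, VsnBits:2, LayerBits:2, _:1, BitRate:4, _/binary>> = Packet) ->
        Layer = layer(LayerBits),
        Version = version(VsnBits),
        %% A size in a binary pattern cannot be a function call, so compute it first.
        FrameLength = framelength(bitrate({Version, Layer}, BitRate)),
        <<Frame:FrameLength/binary, Rest/binary>> = Packet,
        {ok, Frame, Rest}.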
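The function above relies on helpers that are not shown in the talk. For readers who want something to run, here is a minimal self-contained sketch of the same bit-syntax technique, under stated assumptions: it handles only MPEG-1 Layer III headers, the module name `mp3_sketch` and its helper functions are invented for illustration, and the frame length formula (144 * bitrate / sample rate + padding) is the usual one for that layer rather than something taken from ErlyVideo.

```erlang
-module(mp3_sketch).
-export([decode/1]).

%% Header layout: 11 sync bits, 2 version bits (2#11 = MPEG-1),
%% 2 layer bits (2#01 = Layer III), 1 protection bit, 4 bitrate index bits,
%% 2 sample rate index bits, 1 padding bit, then 9 bits we ignore here.
decode(<<2#11111111111:11, 2#11:2, 2#01:2, _Protection:1,
         BitrateIdx:4, RateIdx:2, Padding:1, _:9, _/binary>> = Packet)
  when BitrateIdx >= 1, BitrateIdx =< 14, RateIdx =< 2 ->
    Length = 144 * bitrate(BitrateIdx) div sample_rate(RateIdx) + Padding,
    case Packet of
        <<Frame:Length/binary, Rest/binary>> -> {ok, Frame, Rest};
        _ -> {more, Length - byte_size(Packet)}   % not enough bytes yet
    end;
decode(_) ->
    {error, unsupported}.

%% Bitrate in bits per second for MPEG-1 Layer III, by header index 1..14.
bitrate(Idx) ->
    1000 * element(Idx, {32, 40, 48, 56, 64, 80, 96, 112,
                         128, 160, 192, 224, 256, 320}).

%% Sample rate in Hz for MPEG-1, by header index.
sample_rate(0) -> 44100;
sample_rate(1) -> 48000;
sample_rate(2) -> 32000.
```

The header fields are matched and extracted directly in the function head, which is what keeps the decoder so short compared with the Java fragment.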

So what do we get? Right from the very start, when unpacking a file, things go wrong both in Java and in C++: there is a lot of code and a lot of logic that we have to write ourselves.

But all this... all this syntactic sugar becomes completely unimportant when thousands of clients come to us.

Then we get problems of a completely different nature, regardless of whether the code is convenient or inconvenient to write.

What are the problems? The same as always: memory management, so that nothing leaks and we do not catch segfaults, and control over the resources of the clients who came to us, where we must remember who needs what and when in order to free memory effectively.

With C++ we have another problem; Java at least protects the code somewhat by not allowing direct work with memory. In C++, a bug in one place, especially in a multithreaded application, can destroy the entire application, and you may never be able to debug it. You can be sure that any C++ program, especially a multithreaded one, contains a bug that you have not found yet and do not even suspect is there.

Another problem is that once you have thousands of clients, organizing input-output becomes complicated. It is not enough to just use threads and write to a socket; you need complex, event-based libraries that rely on various complicated mechanisms.

Slide:
Problems of classic solutions with thousands of clients
Memory management: leaks or premature release
Control over client resources
Chaotic destruction of the system after a failure in one place
I/O when serving thousands of clients

What happens? The Red5 server falls over under a hundred users. Ah, unfortunately it did not come out in red here.

With just a hundred users the server falls over and stops serving clients. Why? Because it is poorly written: when it was developed, nobody took the input-output issue into account, and now it simply stops serving.

With Wowza we have other problems, which my clients ran into: Wowza leaks. Even though Java has a garbage collector, some reference remains somewhere, resources are not released, the server swells, and it all looks scary.

How does this happen? For example, besides video our streaming server serves some other message delivery channel. A user logs in, the object created for him is registered in the list of some channel, a reference to it is stored and keeps the object alive, the user disconnects, and we forget to remove the reference. That's it.

His data stays forever; we never get the signal that it should be dropped. And so it stays until the server restarts.

Slide:
epoll/kqueue are difficult for long-lived connections because of memory management.

As for I/O, the epoll/kqueue mechanisms, which libraries such as libevent are built on, are the only way to serve thousands of sockets, and they become very, very complex once you start adding complicated business logic, because effective memory management in an event model is, in my opinion, incredibly difficult.

So this is what you get with a C++ server: you are guaranteed to start your working day by cleaning out the core dumps that accumulated on the file system overnight, and you are lucky if the hard disk was big enough.

Roots of the problems

So where are the roots of these problems, which are common to the traditional solutions? First of all, it is shared memory.

Here, unfortunately, the picture is again not visible.

Memory is shared between all objects in the system. Anyone can go anywhere and take a reference to anything, and in the end all objects reference each other crosswise, which makes it much harder to control the memory: who holds what, and who needs what.

We must understand that these problems do not bother us when we write a website in PHP. They do not bother us because the application that serves a web request lives for one second at most. After one second everything it used can be destroyed, because it is no longer needed; there is already a new request, a new connection.

Slide:
Web approach → “let it leak, we will be done soon” → does not work.

Here that will not happen: clients stay connected for hours, days and even longer. And your code has to work effectively, without leaking, for weeks.

Erlang solves these problems radically

Erlang turned out to be a platform that, surprisingly, solves these problems radically and almost completely closes the issues I was talking about.

About 90% of that is due to its concept of processes.

Slide:
Processes
Parallel threads
Isolated memory area
Exchange by sending messages
Immutable variables
No data outside processes

Processes in Erlang are something like threads in conventional systems. They are lightweight, they take up much less space, and most importantly, they are completely isolated.

Each process in Erlang is a box from which nothing leaks out. And we know for sure that all the memory in the system is guaranteed to belong to some process. There can be no data outside a process. That is, if there is a gigabyte chunk of memory somewhere, we always know which process owns it, and we can kill that process to free the data.

Q: But binaries are passed between processes by reference?

Well, those are implementation subtleties, but in fact these binaries can always be tracked. They are accounted for.

Q: What is tracked, and how?

We know which binaries each process refers to and how large they are.

So what we get is this: all the data in the system is stored exclusively inside enumerable processes. You can walk through all the processes in the system, find out who has devoured all the memory, and put a stop to this abuse of the system.
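Walking through all processes to find the biggest memory consumers might look roughly like this. It is a minimal sketch: the module name `memtop` is made up for the example, while `erlang:processes/0` and `erlang:process_info/2` are standard BIFs.

```erlang
-module(memtop).
-export([top/1]).

%% Returns the N processes that use the most memory,
%% as a list of {Bytes, Pid} sorted from largest to smallest.
top(N) ->
    Usage = [{Bytes, Pid}
             || Pid <- erlang:processes(),
                %% process_info/2 returns 'undefined' for a process that has
                %% just exited; the pattern below silently skips those.
                {memory, Bytes} <- [erlang:process_info(Pid, memory)]],
    lists:sublist(lists:reverse(lists:sort(Usage)), N).
```

Once the offender is found, `exit(Pid, kill)` terminates it and its memory is released.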

Slide:
All data is stored inside enumerable objects.

The next feature of the process approach to organizing data and threads of execution is that the errors that occur are hard ??? process. If there is an error that we did not handle and did not want to intercept, because we decided to let it propagate and do not care about its fate, it is a fatal error and our process terminates. Most importantly, this is very similar to the release, the destruction of an object, because its timing is known: once the error has occurred, the process has ceased to exist. Not in an hour, not in two days, but right now. And it is important to understand that other processes which asked for it... the neighboring processes that want to monitor its state will be informed that their neighbor has died.

Slide:
Error handling
Errors can be caught
If an error is not caught, the process terminates
Neighbors find out about it through messages
Guaranteed cleanup of resources

As a result, we can monitor the state of processes. For example, we start a separate process that serves the connection with a user and begin to monitor it. If an error occurs, most likely a bug in our own code, the process watching it finds out: “aha, the process serving the socket has died”. So there is no point in serving this user any further, there is nothing left to serve him with, and all the processes that were created to serve him have to be shut down in a cascade.
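The way a neighbor learns about a death can be shown with a monitor. This is a minimal sketch: the module name `watcher` and the exit reason `socket_closed` are invented for the example, and the "socket process" here is just a fun that exits immediately.

```erlang
-module(watcher).
-export([run/0]).

run() ->
    %% Start a stand-in for the process serving a socket and monitor it
    %% in one atomic step. It dies right away with an unhandled exit.
    {Pid, Ref} = spawn_monitor(fun() -> exit(socket_closed) end),
    receive
        %% The runtime delivers this message as soon as the process is gone.
        {'DOWN', Ref, process, Pid, Reason} ->
            {neighbour_died, Reason}
    after 1000 ->
        timeout
    end.
```

A watcher that receives the `'DOWN'` message can then stop every other process that was created for the same user.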

Accordingly, the platform that comes with this language includes a system of supervisors: ready-made mechanisms, a very polished set of programs with practically no bugs in them (at least I am not aware of any found recently) that work stably and let you restart your processes.

Why do you need this? For example, one of the most important processes in your system is the daemon, the process that listens on a socket. It is bound to the socket and accepts connections. You need to be sure that it is guaranteed to keep working, otherwise your whole system may fall apart (???). And if it has gone down for good, there is no point in tormenting your server.
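A supervisor that keeps such a listener alive could be sketched as follows. The module name `listener_sup`, the child id `listener` and the `crash` message are assumptions for the example, and the "listener" is a plain receive loop rather than a real socket acceptor; only the `supervisor` behaviour itself is the standard OTP API.

```erlang
-module(listener_sup).
-behaviour(supervisor).
-export([start_link/0, init/1, start_listener/0]).

start_link() ->
    supervisor:start_link(?MODULE, []).

init([]) ->
    %% Restart only the failed child; give up if it crashes
    %% more than 5 times within 10 seconds.
    SupFlags = #{strategy => one_for_one, intensity => 5, period => 10},
    Child = #{id => listener,
              start => {?MODULE, start_listener, []},
              restart => permanent},
    {ok, {SupFlags, [Child]}}.

%% Called by the supervisor; the child must be linked to it.
start_listener() ->
    {ok, spawn_link(fun listener_loop/0)}.

%% Stand-in for the accept loop of a listening socket.
listener_loop() ->
    receive
        crash -> exit(simulated_failure);
        _Other -> listener_loop()
    end.
```

The `intensity` and `period` settings express the last point of the paragraph above: if the listener keeps dying, the supervisor stops restarting it and terminates itself instead of tormenting the server.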

Slide:
Process tracking
Connections
Supervisors
appmon

Unfortunately I cannot show it from my laptop, since I do not have an adapter, but I would like to show such a thing as the application monitor, appmon. It is again a mechanism that ships with the platform, and it lets you see graphically a list of all your processes as a tree. This is a very useful thing: you can see that a user comes to you, you create an object for him, he requests some resources, which just crawled (???) into some processes... The user leaves and the processes remain. This is in effect a process leak, and with appmon it is all very clearly visible.

But, unfortunately, I cannot show it to you. :(

Erlang has real hot code updates

Erlang has what is probably the only real hot code update among all existing platforms. It looks like this: clients are not disconnected, they continue to work; in the case of a video streamer they keep receiving video, in the case of online games the connection is not lost, and yet they are already being served by the new code.

Slide:
Without disconnecting customers!

I do not know of any other systems that allow this.
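The mechanism behind this is that a fully qualified call (`Module:Function(...)`) always enters the newest loaded version of a module, while the process keeps its state. A minimal sketch of the pattern, with a hypothetical module `hot_loop` and made-up message names:

```erlang
-module(hot_loop).
-export([start/0, loop/1, version/0]).

%% Bump this number in the next version of the module.
version() -> 1.

start() ->
    spawn(fun() -> loop(0) end).

%% Frames is the state of the process: how many frames it has handled.
loop(Frames) ->
    receive
        {frame, _Data} ->
            %% Local call: keeps running the version it is already in.
            loop(Frames + 1);
        upgrade ->
            %% Fully qualified call: continues in the newest loaded version
            %% of this module, carrying the state across.
            ?MODULE:loop(Frames);
        {stats, From} ->
            From ! {stats, ?MODULE:version(), Frames},
            loop(Frames)
    end.
```

After a new version of the module has been compiled and loaded (for example with `c(hot_loop).` in the shell), sending `upgrade` moves the running process onto the new code without dropping its counter, which is the analogue of not disconnecting the client.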

What are the results of using Erlang?

And what was the result after I decided to use Erlang to build my own server? The result is our video streaming server Erlyvideo, which is now among the top two in its field in terms of implemented features, development speed, stability and efficiency.

Slide:
Erlyvideo:
Multiprotocol server
Keeps thousands of clients on one server.
Existing plugin infrastructure

For example, it quite routinely serves thousands of clients from a single server; it is now in production at BD (???) and it works.

It turned out to be very effective and simple, thanks to the dynamic typing of the language. The language is naturally dynamic: we cannot know what a process is, so all communication between processes comes down to exchanging messages. That is why we can speak of this language as dynamically typed.

As a result, we got a very convenient infrastructure for plugins, which is also being developed very actively.

And this is a very painful topic for any product: how to build a plugin system correctly. It is very unclear where to put the points at which a plugin can be attached.

As a result, Erlyvideo solves the stated task perfectly: the server can run for weeks without restarts and without any memory leaks; there is no problem with that. Mine, for example, run for months and do not swell, keeping memory at the same level.

Conclusions

Slide:
Tasks for streaming video have specificity that distinguishes them from the web:
Efficient and high-level tools are needed at the same time.
Erlang fits perfectly into this niche
Practical use has shown the effectiveness of the choice

So, what conclusions can be drawn from using Erlang for this task?

Streaming video on the Internet has its own specifics, which set it far apart from the web. And unfortunately, many solutions that are proven, reliable and well understood, and for which it seems easy to find programmers, carry in their very structure the roots of the problems that Erlang suppresses from the outset.

You will not have these problems, simply because of the way the code is organized.

That is why Erlang fits the task of serving stateful clients very well. And practical use has shown that this choice is extremely effective.

Where Erlang is applicable

Clearly, streaming video is a rather narrow niche; there are only about five such products on the market, and that is probably enough. There is no point in writing yet another video streaming server in Erlang.

Slide:
Video streaming (erlyvideo)
Jabber server (ejabberd)
Bank processing (Privat Bank)
Online Games (Online Poker)

However, Erlang has other niches. For example, the very best Jabber server, ejabberd, is written in Erlang. It is so good that Yandex, for example, decided to use it despite a terrible antipathy to Erlang. Right, Yasha? Grisha (bobuk) does not like Erlang much, does he? He swore a lot and spat, but they had no choice left, and that says a lot about the product. Also, it is reliably known that banks build their processing systems on it. The details are, of course, not disclosed, but I know that Privat Bank, as well as a number of other companies, is moving its long-lived processing systems to Erlang because it turns out to be convenient for them.

And of course, online games. After our success with promoting a top VKontakte game, people regularly come to me and say, "we need to make it work well and conveniently."

I look at their problems and understand that they should most likely implement their game in Erlang. There is not much business logic, but it is very specific, and with the usual technologies, Rails or Java, they run into all the problems I have described.

And there is even an online poker implementation in Erlang.

Questions?

So, I think that is all from me; that is how the talk turned out. If you have any questions, I will be happy to answer them.

Slide:
Max Lapshin
max@maxidoors.ru
erlyvideo.org

The question-and-answer session did not fit within the size limit of a Habr post, so look for it on the conference website via the “transcript” link.

Source: https://habr.com/ru/post/114560/
