
All against one, or how we built our CDN

Among high-load systems there is a big difference between systems that are loaded in terms of requests per second (RPS) and systems that are loaded in terms of generated traffic (measured in gigabits per second). At ivi.ru we have both kinds of load. In this article I want to tell you how we generate hundreds of gigabits per second without hurting anyone.



Bicycle dreams


In the summer of 2011 a terrible thing happened: RuNet users enjoyed watching free movies so much that, for a while, the growth of the load outpaced the growth of our capacity. All of that traffic went through the channels of a single provider from a single Moscow data center. It got to the point where some Moscow subscribers were receiving their movie via Amsterdam. Clearly, under such conditions it is rather problematic to compete for the title of the best online cinema in Russia. The decision was made to use a CDN (Content Delivery Network), and the wheels started turning (with my modest participation).

When we want to deliver several megabits per second of video traffic to a user, we face two fundamental problems (there are more problems, of course, but they are not of such a fundamental nature):

1. We need server capacity to serve those megabits to the user.
2. We need communication channels that will carry those megabits to the user.

TCP being what it is, the longer the round-trip time (RTT) of a packet and the higher the packet loss, the lower the data transfer rate. At the same time, long-distance trunk channels are the most expensive, which means they are the most congested; as a consequence, the probability of packet loss grows with distance. So, with our HTTP (which runs on top of TCP), we need to keep the servers close to the subscribers.
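As a rough illustration of this point (my own back-of-the-envelope sketch, not a figure from the article), the well-known Mathis approximation bounds TCP throughput at roughly MSS / (RTT · √p), where p is the packet loss rate. The numbers below are purely hypothetical, but they show why a nearby server wins:

```python
import math

def tcp_throughput_mbps(mss_bytes: int, rtt_ms: float, loss_rate: float) -> float:
    """Rough upper bound on TCP throughput (Mathis approximation):
    throughput ~ MSS / (RTT * sqrt(p)). Illustrative only."""
    rtt_s = rtt_ms / 1000.0
    bytes_per_s = mss_bytes / (rtt_s * math.sqrt(loss_rate))
    return bytes_per_s * 8 / 1_000_000  # convert to Mbit/s

# A nearby regional node vs. a distant data center (hypothetical numbers):
print(tcp_throughput_mbps(1460, rtt_ms=10, loss_rate=0.0001))   # ~117 Mbit/s
print(tcp_throughput_mbps(1460, rtt_ms=100, loss_rate=0.001))   # ~3.7 Mbit/s
```

With ten times the RTT and ten times the loss, the achievable per-connection rate drops by more than an order of magnitude, which is exactly why video has to be served from close by.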

You can keep all of this in-house, or you can pay someone else and use their CDN. In the latter case you pay only for actual consumption, and all the unpleasant issues, such as a shortage of servers or congested channels, become somebody else's headache. Moreover, such an operator has many clients, which means economies of scale both in purchasing servers and in purchasing channels, i.e. a lower cost. But how good is it in practice?

On the other hand, building your own CDN has three major drawbacks:

1. You have to learn how to build it.
2. You have to work with many suppliers (instead of one or, at most, two CDN operators).
3. You have to support, develop, and operate it yourself.

But is it really scary?

Bicycle factory


Most commercial CDN operators offer free (or reasonably priced) trials, and we took advantage of that. The international CDN operators surprised us with the fact that they could send a subscriber from, say, Novosibirsk to a server located somewhere in South America. And, of course, they had no server capacity in Russia at the time (and we do not show our cinema abroad). To be fair, some of them are now deploying their nodes in our country. The South America story, by the way, I heard from a specialist at another, domestic CDN operator. In networks spoiled by Western capitalism, balancing pursues a different goal: they assume that communication channels are plentiful and that the limiting factor is server capacity, so they balance toward whichever server is least loaded...

In the end, both the foreign and the domestic CDNs showed a level of quality (by which we mean the effective download speed of content by the end user) comparable to what we could have achieved ourselves at that point. Price-wise, however, things turned out far less rosy than in theory. Now you understand why neither company names nor measurement figures appear here.

By the way, just this year the study "State of the Union: Ecommerce Page Speed & Web Performance" was released, reporting that using a CDN can actually slow down page loading. Here, as they say, it matched our experience. I plan to write separately about why this happens.

Well, okay, back to our CDN. It became obvious that the network had to be our own; the main question was how to build it. And here we got lucky. First, it turned out that the server architecture at our central site was already split into edge and origin servers, and the edge servers, as it turned out, lent themselves very well to being moved out to external sites.
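To make the edge/origin split concrete, here is a minimal sketch of the idea (not the actual ivi.ru code; the URL, paths, and function names are hypothetical): an edge node serves content from its local cache when it can and pulls from the origin over plain HTTP when it cannot.

```python
import os
import urllib.request

ORIGIN_BASE = "https://origin.example.com"   # hypothetical origin URL
CACHE_DIR = "/var/cache/edge"                # hypothetical local cache path

def serve(path: str) -> bytes:
    """Return content for `path`, preferring the local edge cache."""
    cached = os.path.join(CACHE_DIR, path.lstrip("/"))
    if os.path.exists(cached):                       # cache hit: serve locally
        with open(cached, "rb") as f:
            return f.read()
    # Cache miss: fetch from the origin and keep a copy for the next request.
    with urllib.request.urlopen(ORIGIN_BASE + path) as resp:
        data = resp.read()
    os.makedirs(os.path.dirname(cached), exist_ok=True)
    with open(cached, "wb") as f:
        f.write(data)
    return data
```

Because an edge server in this scheme needs nothing from the origin except ordinary HTTP, moving it to a remote site is mostly a matter of shipping hardware and pointing it at the origin.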

Secondly, it turned out that providers know ivi.ru as a resource, and most of them want to work with us to localize traffic. In a significant number of cases the operators contacted us themselves and offered cooperation, which certainly helped in building new nodes. In several regions, representatives of local providers told me directly: "For us, the quality with which ivi.ru plays is an important competitive advantage."

Thirdly, it turned out that we need none of the additional functionality offered by CDN providers. No on-the-fly transcoding (all content is pre-encoded in every required variant). No exotic protocols like RTMP (everything we have runs over HTTP, and even the fashionable HLS is just HTTP). No fat channel with QoS to Moscow (a "regular" Internet connection of 50-100 Mbit/s is enough for managing a node and updating its cache, and even if that link goes down, subscriber service does not stop). No request dispatcher and no "patented balancing algorithms" (I will not go into that now and will leave it for another article).
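To illustrate the point that HLS is just HTTP (this is a generic illustration of the protocol, not ivi.ru code; the playlist URL is hypothetical): a client downloads an .m3u8 playlist and then fetches the listed media segments with ordinary GET requests, all of which any HTTP cache can serve.

```python
import urllib.request
from urllib.parse import urljoin

PLAYLIST_URL = "https://edge.example.com/video/stream.m3u8"  # hypothetical URL

def fetch_hls_segments(playlist_url: str):
    """Download an HLS playlist and its media segments over plain HTTP."""
    with urllib.request.urlopen(playlist_url) as resp:
        playlist = resp.read().decode("utf-8")
    for line in playlist.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):   # skip HLS tags and comments
            continue
        segment_url = urljoin(playlist_url, line)
        with urllib.request.urlopen(segment_url) as seg:
            yield segment_url, seg.read()      # plain, cacheable GET requests

for url, data in fetch_hls_segments(PLAYLIST_URL):
    print(url, len(data), "bytes")
```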

As a result, we were able in a very short time to deploy our own CDN throughout the country.



As a result of this work, ivi.ru nodes are now present in 23 cities across Russia. To be honest, after 20 I lost interest in keeping count; new nodes appear all the time. Deploying a new node takes one working day. Node sizes range from one to eight servers. Multi-server nodes, of course, also have network equipment: Cisco 3750-X or 4500-X series switches. On some nodes (the small and medium ones) the servers are connected by an aggregated 4 × 1 GbE link; on large nodes the servers are connected via 10GbE interfaces.
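As a back-of-the-envelope check (my own arithmetic with hypothetical server counts, not figures from the article), those link configurations imply roughly the following per-node egress ceilings:

```python
def node_capacity_gbps(servers: int, links_per_server: int, link_gbps: float) -> float:
    """Theoretical aggregate egress of a node, ignoring protocol overhead."""
    return servers * links_per_server * link_gbps

# Hypothetical examples based on the configurations mentioned above:
print(node_capacity_gbps(servers=2, links_per_server=4, link_gbps=1))   # small node:  8 Gbit/s
print(node_capacity_gbps(servers=8, links_per_server=1, link_gbps=10))  # large node: 80 Gbit/s
```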

In some cities we have several nodes, although we are now trying to get rid of that. For efficiency it is better for us to have one large cluster than several small ones: small caches all store the same popular content, whereas if you combine those same servers into a single cluster, the amount of unique cached content becomes much larger.

More than half of the traffic generated by ivi.ru now comes from nodes outside Moscow. The daily fluctuation is interesting (the graph shows the ratio of traffic generated in the regions to the total, Moscow time):




It is clearly visible that at night the load on the CDN is minimal. There are several reasons for this, but the main one is that at night people watch unpopular content, the kind that is not cached (and not worth caching). The CDN load peaks when people east of Moscow have already woken up while Muscovites are still asleep :)

Traffic localization on a node ranges from 40% (on small single-server nodes) to 90% (on the largest nodes). Naturally, this could not fail to affect the quality of service for users. Here is a beautiful chart:



I will not show fresh data: it has long been fluctuating within statistical error, and such a beautiful "step" is no longer visible there.

This article was conceived as an overview of our CDN. The next aspect I plan to cover is load balancing between cities and towns. What other aspects of our CDN would you find interesting?

Source: https://habr.com/ru/post/236065/

