
Introduction to the Content Delivery Network

Contents: What is a CDN? How it came about. Why is it needed? Who needs it, and who does not? Entry threshold and cost. Basic technology.

CDN is short for content delivery network. Most often it is a set of servers with specialized software that speeds up the delivery (“serving”) of content to the end user. The servers are placed around the world so that response time for site visitors is minimal. “Content” most often means video and static website elements (those that require no server-side code execution or database queries, such as css / js), but quite unexpected things can count as “content” too - for example, games on Steam (which uses a CDN to distribute games), operating system updates, etc.



A bit of history

The rapid growth of the Internet in the mid-90s meant that the servers of those years could not handle the load on their own (how much can a powerful dual-processor Pentium Pro-based server at 266 MHz with 128 MB of memory really serve?). The limit on server performance and the ever-growing demand for more of it gave rise to now-forgotten terms: “server farm”, “hierarchical caching” ... Trendy newspeak ages surprisingly badly: phrases like “server farm” or “information superhighway” are now associated with warm vacuum tubes and CRT monitors, not with progress. While various solutions were being developed and deployed, one important distinction was noticed: there are two types of content - static and dynamic.

Dynamic content is generated by the server at the moment it receives the request, most often with the active involvement of a database. If the bottom of a page says “page was generated in 0.333 seconds”, that is an example of dynamic content.

Static content sits on the server ready-made: whoever sends the request, the server returns the same thing (subject to any ACLs). What matters is that the content does not change from request to request.

Static and dynamic content load a server in different ways. Serving dynamic content needs CPU, IO (for the database) and some memory. Serving static content hardly needs CPU at all, IO matters only for files that are not cached, and the main requirement is network speed. You can serve static content from the same servers that serve dynamic content, but the two roles get in each other's way. It gets especially hard when IO from static files starts to interfere with IO from dynamic requests, and the interrupt (IRQ) load gets in the way of executing dynamic scripts.

An even more important detail is that dynamic content usually implies “state” (the session and its associated data), while static content does not. Static content can be scaled horizontally without complex two-way synchronization with a central server. With dynamic content this does not work: you need either a shared database or synchronization and locking mechanisms.

Medium and large companies began serving static and dynamic content from different servers located in different parts of the world, reducing the load on the dynamic sites by moving static content off to easily scalable servers. From there it was a small step to outsourcing the serving of static content, and companies began to appear that made it the basis (or at least a major part) of their business.

The main thing


Note that a CDN solves a problem even more important than making life easier for application servers.
All modern CDNs place copies of content on different servers around the world and direct each client to the server closest to it. The result is lower latency, that is, a shorter delay between request and response. If a page contains many images (even small ones), then the sooner the client receives them, the sooner the visitor sees the page. And if we leave aside the poor souls on dialup / gprs, the time it takes to show the page is determined almost entirely by network delay. At distances of hundreds of kilometers (~ 10 ms delay) this is not significant. But at intercontinental distances, delays of hundreds of milliseconds (up to 500-600!) start to play a decisive role. And if the content is served from a server a few kilometers from the user, a miracle happens! Australia sees data from a US site within a few milliseconds, China from a Russian site, France from a Brazilian site. With no ocean cables involved.
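To get a feel for these numbers, here is a rough back-of-the-envelope sketch. It assumes that latency dominates over bandwidth, that the html page costs one round trip, and that the remaining resources are fetched in batches over a fixed number of parallel connections (six here, an assumption, as are the function name and the 60-resource page). Real browsers, handshakes and HTTP/2 multiplexing make the picture more complicated.

```python
import math

def page_load_estimate_ms(rtt_ms, num_resources, parallel_connections=6):
    """Rough time to fetch a page when latency dominates over bandwidth.

    One round trip for the html page, then the resources are fetched in
    batches of `parallel_connections`, one round trip per batch.
    Connection setup (TCP/TLS handshakes) is deliberately ignored.
    """
    batches = math.ceil(num_resources / parallel_connections)
    return rtt_ms * (1 + batches)

# A page with 60 small resources: nearby server vs. another continent.
near = page_load_estimate_ms(rtt_ms=10, num_resources=60)
far = page_load_estimate_ms(rtt_ms=500, num_resources=60)
print(near, far)
```

Under these assumptions the same page takes about a tenth of a second from a nearby server and several seconds from another continent, with no change in bandwidth at all.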

This also works on a smaller scale: for example, Yandex at one time used a CDN to noticeably speed up its mail service in regions of Russia that are a long, long way from Moscow over fiber.

Faster content delivery became the killer feature of the CDN, and everything else (load reduction, balancing, etc.) became secondary. Important, but not critical. In the end, any load problem can be solved with money. But no amount of money will make a signal travel from Perm to San Francisco in tens of milliseconds without local points of presence.

Although cost savings are not the killer feature, they matter too. In some situations a CDN lets you save significantly on traffic. Transferring files to another continent once, keeping them there on a local server and serving them over local links is cheaper than pushing the same traffic across the Atlantic ten thousand times. Most often people start thinking about savings only at the moment they become critical (video hosting first of all).

However, servers around the world, a system for synchronizing content and directing clients to nearby servers, and so on - none of this is free. Most often a CDN charges more than regular uplink traffic, although in some regions CDN traffic may turn out cheaper than uplink traffic (but that rather suggests the Internet in that region is not in great shape).

How does it work in practice?


From the visitor's side: he opens example.com and receives an html page. In this html page all css, js, images and video point to cdn.example.com, and the content is loaded from there. When the client's browser accesses that address, the magic of BGP routes the request to the nearest point of presence. The magic is this: the visitor's provider receives several announcements for the IP network containing cdn.example.com, from different networks (each with a point of presence), and the provider's router chooses the closest one. As a result, the request goes to the nearest server, which responds to it, and the response travels back along the same short route.
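The router's choice can be illustrated with a deliberately simplified sketch. Real BGP route selection looks at several attributes in order (local preference and others); the sketch below keeps only one of them, the length of the AS path, as a stand-in for “closest”. The point-of-presence names and AS numbers are hypothetical (the numbers are taken from the range reserved for documentation).

```python
# Hypothetical announcements of the same CDN prefix, as seen by the
# visitor's provider. Each one arrived through a different network.
announcements = [
    {"pop": "Frankfurt", "as_path": [64496, 64497, 64500]},
    {"pop": "Moscow", "as_path": [64498, 64500]},
    {"pop": "Singapore", "as_path": [64499, 64501, 64502, 64500]},
]

def choose_route(routes):
    """Pick the route with the shortest AS path (ties: first one seen).

    This is a simplification: a real router compares other attributes
    before and after AS path length.
    """
    return min(routes, key=lambda route: len(route["as_path"]))

best = choose_route(announcements)
print(best["pop"])
```

The visitor's browser knows nothing about any of this: it sends packets to one address, and the network decides which server is behind it.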

For the site owner, there are two options:
  1. Static files are uploaded to object storage via ftp, scp or another convenient method. The object storage is assigned (in the control panel) a DNS name (your own or one issued by the provider, depending on the technology), and this name is used in the html page.
  2. The site owner specifies an 'origin' for the domain. Then, on a client's request, the CDN goes to the site it is connected to, downloads the files to itself and serves them to the user's browser.
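The second option (“origin pull”) can be sketched roughly as follows. This is an illustration of the idea only, not any provider's implementation: the `EdgeNode` class, the in-memory dictionary standing in for the origin site, and the file name are all assumptions.

```python
class EdgeNode:
    """A point-of-presence server that pulls files from the origin on demand."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin  # callable: path -> bytes
        self._cache = {}                 # path -> cached body
        self.origin_requests = 0         # how many times we went to the origin

    def get(self, path):
        # On a cache miss, download the file from the origin and keep a copy.
        if path not in self._cache:
            self.origin_requests += 1
            self._cache[path] = self._fetch(path)
        # Every later request is served from the local copy.
        return self._cache[path]

# A stand-in for the real site behind the CDN (hypothetical content).
ORIGIN_FILES = {"/style.css": b"body { margin: 0 }"}

def fetch_from_origin(path):
    return ORIGIN_FILES[path]

edge = EdgeNode(fetch_from_origin)
first = edge.get("/style.css")   # goes to the origin
second = edge.get("/style.css")  # served from the edge cache
print(edge.origin_requests)
```

Only the first visitor in a region pays for the long trip to the origin; everyone after that gets the local copy.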

Magically, the data is available to the client much faster than the main html-page.

By the way, the page itself can be static too. Pages on imtqy.com, for example, work on this principle: it is a pure CDN, and everything on it is served as static content.

Who needs a CDN?


Those for whom it is important to serve static content quickly to many visitors who are far from the company's servers (the situation is even more acute for companies whose visitors are scattered over a large area, so that even moving the servers closer does not help - most visitors will still be far away).

Those who serve a very large volume of files, where the cost of CDN traffic is lower than the cost of uplink traffic (at large sites traffic is usually priced differently: local is cheaper, “global” is more expensive).

At a certain bandwidth, moving static content to a CDN is more cost-effective than upgrading network equipment. Static content usually takes up a significant share of the bandwidth, and instead of upgrading from 1G to 10G, or from 10G to 40G, it is much cheaper to move 80% of the traffic to a CDN and stay on reasonably priced servers.

Differences


If everything is clear about CDNs themselves, what about their providers? There are many companies, and they differ in price, services and quality.
Here are the main factors to settle for yourself when choosing a provider:

1. The number of points of presence (PoP)
The more points, the better, but ... why would you need points of presence in China if the site is Russian? Or a large number of points of presence in Australia when entering the US market? When comparing CDNs, count the points of presence in the countries and regions that matter to you. Assurances about a large number of points of presence and good connectivity are not enough: to make an informed choice you need to see the list of points of presence and compare it with the site's potential audience.

Points of presence are not all equal either: connectivity and peering agreements with local providers are very important. Unfortunately, connectivity is rather hard for an outsider to assess (you need to understand the balance of power in the local provider market), but when comparing offers you should ask each candidate for its list of peers at the most important points of presence.

2. Caching policy
To deliver content quickly from a local server, the content has to get onto that server (and stay there). There are many caching schemes; here are the most obvious:

Alongside the caching policy comes the retention policy: when exactly is an object removed from the server at a point of presence? By timeout, when the number of hits drops below a certain value, “never”, after a fixed time? And who pays for keeping the copy?
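As an illustration of the simplest of these options, removal by timeout, here is a minimal sketch. The `EdgeCache` class, its method names and the one-hour lifetime are assumptions made for the example; the clock is passed in explicitly so the behavior is easy to follow.

```python
class EdgeCache:
    """Cache on a point-of-presence server with a fixed object lifetime."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._items = {}  # path -> (body, time the copy was stored)

    def put(self, path, body, now):
        self._items[path] = (body, now)

    def get(self, path, now):
        entry = self._items.get(path)
        if entry is None:
            return None  # never cached here
        body, stored_at = entry
        if now - stored_at >= self.ttl:
            # The copy has outlived its lifetime: drop it and report a miss.
            del self._items[path]
            return None
        return body

cache = EdgeCache(ttl_seconds=3600)
cache.put("/logo.png", b"PNG", now=0)
fresh = cache.get("/logo.png", now=1800)    # still within the hour
expired = cache.get("/logo.png", now=3600)  # lifetime is over
print(fresh, expired)
```

A policy based on hit counts would differ only in the condition inside `get`; which policy a provider actually uses is exactly what you need to ask.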

3. SLA
Yes, yes, the legendary and boundless Service Level Agreement. Before you admire the long row of nines, ask: is this SLA for the CDN “as a whole” or for every point of presence? If the server in the location most important to you breaks down and content is served “from the neighboring country”, will that count as downtime under the SLA? And, most importantly, what does the provider face for violating the SLA? Will you get back pennies from the monthly payment, or are there substantial penalties?
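To see what a row of nines actually promises, it helps to convert it into permitted downtime. The sketch below assumes a 30-day month and treats availability as a simple share of time; the function name is made up for the example, and a real SLA may define and measure availability differently.

```python
def allowed_downtime_minutes(availability_percent, days=30):
    """Downtime that still fits within the given availability figure."""
    total_minutes = days * 24 * 60
    return total_minutes * (1 - availability_percent / 100)

for sla in (99.0, 99.9, 99.99):
    print(sla, round(allowed_downtime_minutes(sla), 1))
```

Under these assumptions 99.9% allows roughly 43 minutes of downtime per month, and each extra nine cuts that by a factor of ten.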

By the way, even though the sales manager will resist, it would be great if they showed you failure statistics for the preceding period. Failures will happen, and they happen to everyone (hint: if you are told that a provider has never had an outage, they are either very young or very arrogant) - the whole question is their duration and frequency.

4. Value added services
A CDN may provide additional services. Examples (incomplete list):


It is very important to check support for the protocols and file types you need. Find out whether the provider supports streaming of flash and media files (RTMP, RTSP) if you plan to deliver such content.

The provider may be very good at everything else, but if it does not support the technologies you need, you are unlikely to like it.

5. Technical nuances
Redirection technology: either anycast at the DNS level, or redirection via redirects. Anycast, for obvious reasons, is faster.

Redirection accuracy: unfortunately, the provider itself cannot evaluate this metric objectively, even though it is very important: what share of the target audience lands on the nearest server. “Nearest” usually means lowest expected delay (nobody cares about physical distance, everybody cares about packet travel time - for example, when the interconnect between two networks is overloaded and packets crawl, it is better to go a little further but faster).

6. Accounting
How exactly does the provider charge? Per megabyte or per megabit per second? Is there a minimum commit (“if you used less than the amount stipulated in the contract, you pay the difference up to the minimum”)? What happens on overcommit (when the limit is exceeded): is the service cut off, or are you charged more? Is there a minimum contract period? Is there a contract at all (between the site owner and the CDN provider), or is it automated self-service, on-demand provisioning, that is, “put money in the account and got a control panel”?
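The effect of a minimum commit is easy to see in a small sketch. It assumes the simplest possible scheme: payment per gigabyte, a committed monthly volume that is billed even if unused, and overage charged at the same rate. The function name, the price and the volumes are hypothetical; real contracts may price overage differently or cut the service off.

```python
def monthly_bill(used_gb, price_per_gb, commit_gb):
    """Bill under a minimum commit: you pay for at least `commit_gb`."""
    billable_gb = max(used_gb, commit_gb)
    return billable_gb * price_per_gb

# Hypothetical contract: 1000 GB commit at $0.10 per GB.
quiet_month = monthly_bill(used_gb=500, price_per_gb=0.10, commit_gb=1000)
busy_month = monthly_bill(used_gb=1500, price_per_gb=0.10, commit_gb=1000)
print(quiet_month, busy_month)
```

In the quiet month half of the bill pays for traffic that was never used, which is why the size of the commit deserves as much attention as the price per unit.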

From what volume does it make sense to think about a CDN?


To repeat the idea: if you need to serve customers quickly, traffic volume does not matter - what matters is having points of presence closer to the target audience.

If there is no significant need for low latency and the CDN is used to take load off the servers, then the traffic volume at which it makes sense to start thinking about a CDN is a few terabytes per month.

The main question: how much does it cost?


The price depends heavily on the specifics of the CDN, how “cool” the provider is, and how far the CDN is adapted to particular needs. Market prices range from $1 to $140 per megabit of bandwidth, or $0.03-$0.3 per GB of traffic. The actual price very often depends on value-added services and CDN capabilities. Traffic in the USA and Europe is usually the cheapest, followed by Asia / Australia, and the most expensive traffic is outside these regions.
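Prices per megabit and per gigabyte are hard to compare by eye, so here is a rough conversion sketch. It assumes the per-megabit price is a monthly price, a 30-day month, decimal units (1 GB = 10^9 bytes), and a channel loaded at a constant share of its capacity; the function names and the 30% utilization figure are illustrative assumptions.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # 30-day month

def gb_per_month_at(mbps):
    """Volume transferred in a month at a constant rate of `mbps` megabits/s."""
    bytes_per_second = mbps * 1_000_000 / 8
    return bytes_per_second * SECONDS_PER_MONTH / 1_000_000_000

def per_gb_price_from_per_mbps(price_per_mbps, utilization=1.0):
    """Effective price per GB when paying per megabit of bandwidth.

    `utilization` is the average share of the paid bandwidth actually used.
    """
    return price_per_mbps / (gb_per_month_at(1) * utilization)

print(gb_per_month_at(1))
print(per_gb_price_from_per_mbps(10, utilization=0.3))
```

A fully loaded megabit carries 324 GB in such a month, so the less evenly the channel is loaded, the more expensive each gigabyte becomes under per-megabit billing.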

Market Overview


All companies fall into two categories: those working at published public rates and those working on the basis of individual agreements. The latter are extremely hard to compare, since their terms can vary greatly. However, “private” does not mean “small”: private companies often have very large clients with huge bandwidth, hundreds of terabits, and they simply do not bother with a dozen or so gigabits.

Here is a list of popular CDNs (so as not to offend anyone, the list is in random order):

Public CDNs:

Private CDNs:


Additional Information


The article was written with the support of our colleagues from UCDN, who are too modest to include themselves in the list above.

Source: https://habr.com/ru/post/236511/

