
How hard can it be to build a CDN


Slow sites annoy users. When the main content is photos and the site is slow, it is doubly annoying. And no matter how much we optimize our service, one factor always remains: the quality of the connection between the user and our data center. A CDN helps us solve this problem.

We are Kolesa Krysha Market, the developer of the largest and most visited classified ad sites in Kazakhstan, and the photos in ads are a critical part of our business.

The Kazakhstan Internet landscape has its own specifics: the country has several large Internet providers that compete with each other quite fiercely. Besides network access, they also provide colocation services and, aiming at monopolization, are extremely reluctant to set up peering with one another. At the same time, the country is large, and traffic between cities often takes very unexpected and far from optimal routes.

Under these conditions, we need to deliver 1.5 Gbit/s of photos of cars, real estate and consumer goods to users as quickly as possible.

We looked for a public CDN that would fit our needs and found only Akamai, which is present in Almaty but offered no details on cost or on plans to expand to the rest of Kazakhstan. We decided to build our own.

The first idea was to determine the user's geographical location from their IP address and serve data from the nearest server. However, this option was quickly rejected: we remembered cases where traffic to a neighboring village travels 1000 km, and in such cases the speed can be even lower than without a CDN.

For the same reasons, we did not use any other kind of geolocation. One of our admins suggested “pinging the servers from the browser,” which became the starting point for the current scheme.

We built our CDN on a combination of OpenResty and Lua, plus some JavaScript. This required no changes to the site code (managers and developers are happy: they can keep shipping features instead of doing infrastructure tasks :)) and only minor tweaks to the mobile applications.

OpenResty is a great Nginx-based platform (Nginx bundled with LuaJIT and a set of Lua modules) from Chinese developers, which has been covered on Habr many times. We used it as a reverse proxy.

Lua is a simple, powerful, embeddable language that has also received plenty of attention on Habr.

When the user first visits the site (or launches the mobile application), we determine the host from which that user receives data fastest. On the site, a small piece of JavaScript is embedded in the server's response for this purpose (in the mobile applications, this logic had to be implemented separately). The script, in turn, loads one invisible image from each of the CDN hosts and measures how long each image takes to arrive. Based on the measurements, a cookie with the name of the fastest host is set for the user on the main domain.

function getFastestHost() {
    // Keep a reference to the host list: every callback below has its own `arguments`
    var hosts = arguments,
        fastest = hosts[0],
        fastestDuration = 600000,
        timing = [],
        // Report the chosen host to the main domain via /set.gif
        track = function (host) {
            var tracker = new Image();
            tracker.src = "/set.gif?cdn=" + host;
        };

    for (var i = 0; i < hosts.length; i++) {
        (function (host) {
            var image = new Image(),
                timeStart = (new Date()).getTime();

            image.onload = function () {
                var duration = (new Date()).getTime() - timeStart;
                if (duration < fastestDuration) {
                    fastestDuration = duration;
                    fastest = host;
                }
                timing[timing.length] = duration;
                // Report only after every host has either answered or failed
                if (timing.length == hosts.length) {
                    track(fastest);
                }
            };

            image.onerror = function () {
                // -1 marks a host that failed to respond
                timing[timing.length] = -1;
                if (timing.length == hosts.length) {
                    track(fastest);
                }
            };

            image.src = host + "/empty.gif";
        }(hosts[i]));
    }
}
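The selection rule inside the callbacks can also be read in isolation. The sketch below is not part of the original code: `pickFastestHost` and the `.example` host names are hypothetical, and it only assumes the same convention as above, where a failed load is recorded as -1 and the first host is the fallback.

```javascript
// Hypothetical helper, not from the original article.
// `timings` maps host -> load time in ms; -1 marks a failed load.
// `fallbackHost` is returned when no host answered at all.
function pickFastestHost(timings, fallbackHost) {
    var fastest = fallbackHost,
        fastestDuration = Infinity;

    Object.keys(timings).forEach(function (host) {
        var duration = timings[host];
        // Skip failed hosts, keep the smallest successful duration
        if (duration >= 0 && duration < fastestDuration) {
            fastestDuration = duration;
            fastest = host;
        }
    });

    return fastest;
}

// Example with made-up measurements: the second host wins,
// the third one is ignored because it failed.
var chosen = pickFastestHost(
    { "//cdn-a.example": 120, "//cdn-b.example": 45, "//cdn-c.example": -1 },
    "//cdn-a.example"
);
```

Keeping the choice in a pure function like this makes it easy to check the failure cases without loading real images.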

On subsequent requests, OpenResty runs Lua code that checks for the cookie, validates it and, if all is well, replaces the host in image URLs with the one taken from the cookie.

init_by_lua_block {
    -- read the list of active CDN hosts from a file, one host per line
    function getCdnHosts(file)
        local hosts = {}
        for line in io.lines(file) do
            table.insert(hosts, line)
        end
        return hosts
    end

    -- split a ";"-separated string of "//host" entries and append them to table t
    function stringToTable(t, s)
        local it, err = ngx.re.gmatch(s, "(//[^;]+);?")
        if not it then
            return t
        end
        while true do
            local m, err = it()
            if not m then
                break
            end
            table.insert(t, m[1])
        end
        return t
    end

    -- check whether a value is present in a table
    function valueExists(tbl, value)
        for k, v in pairs(tbl) do
            if value == v then
                return true
            end
        end
        return false
    end
}

server {
    server_name kolesa.kz;

    # project name in the cdn
    set $cdn_project kl;
    # domain for the cdn cookie
    set $cookie_host .kolesa.kz;
    # file with the list of active cdn hosts
    set $cdn_hosts_file "/etc/nginx/cdn/cdn.data.active";
    # hosts to be replaced
    set $replace_hosts "//photos-a-kl.kcdn.kz;//photos-b-kl.kcdn.kz";

    # rewrite responses for every uri
    location / {
        proxy_set_header Host kolesa.kz;
        proxy_pass http://kolesa;

        header_filter_by_lua_block {
            -- the body is rewritten below, so the original length no longer applies
            ngx.header.content_length = nil
        }

        body_filter_by_lua_block {
            local allCdnHosts = getCdnHosts(ngx.var["cdn_hosts_file"])
            local replaceHosts = stringToTable({}, ngx.var["replace_hosts"])
            local cdnHost = ngx.var["cookie_" .. ngx.var["cdn_project"] .. "_cdn_host"]
            local replaceEof = ngx.arg[2]

            if cdnHost ~= nil and valueExists(allCdnHosts, cdnHost) == true then
                -- the cookie is set and names an active host: substitute it
                for k, v in pairs(replaceHosts) do
                    local newStr, n, err = ngx.re.gsub(ngx.arg[1], v, cdnHost)
                    if n and n > 0 then
                        ngx.arg[1] = newStr
                        replaceEof = false
                    end
                end
            else
                -- no valid cookie: inject the script that picks the fastest host
                local scriptStr = "<script src='/cdn.js' type='text/javascript'></script>" ..
                    "<script type='text/javascript'>" ..
                    "(function(){" ..
                    "getFastestHost('" .. table.concat(allCdnHosts, "', '") .. "')" ..
                    "}())" ..
                    "</script>"
                local newStr, n, err = ngx.re.gsub(ngx.arg[1], "(</body>)", scriptStr .. "$1", "i")
                if n and n > 0 then
                    ngx.arg[1] = newStr
                    replaceEof = false
                end
            end

            ngx.arg[2] = replaceEof
        }
    }
}
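For readers more at home in JavaScript than in Lua, the cookie branch of the body filter can be sketched as a plain function. This is only an illustration of the idea, not the production code: `rewriteImageHosts` and `//cdn-almaty.example` are hypothetical names, and unlike the Lua version it replaces host names literally rather than as regular expressions.

```javascript
// Hypothetical sketch of the "valid cookie" branch, not the original Lua.
// html         - a chunk of the response body
// cookieHost   - value of the CDN cookie, or undefined when it is absent
// activeHosts  - hosts currently listed as available
// replaceHosts - default photo hosts that appear in the page markup
function rewriteImageHosts(html, cookieHost, activeHosts, replaceHosts) {
    // Never trust the cookie blindly: it must name a known, active host
    if (!cookieHost || activeHosts.indexOf(cookieHost) === -1) {
        return html;
    }
    return replaceHosts.reduce(function (result, host) {
        // split/join replaces every literal occurrence of the host
        return result.split(host).join(cookieHost);
    }, html);
}

var page = '<img src="//photos-a-kl.kcdn.kz/1.jpg">' +
           '<img src="//photos-b-kl.kcdn.kz/2.jpg">';
var active = ["//cdn-almaty.example", "//cdn-astana.example"];
var defaults = ["//photos-a-kl.kcdn.kz", "//photos-b-kl.kcdn.kz"];

var rewritten = rewriteImageHosts(page, "//cdn-almaty.example", active, defaults);
var untouched = rewriteImageHosts(page, "//evil.example", active, defaults);
```

The check against the active list matters for two reasons: it stops a forged cookie from pointing images at an arbitrary host, and it lets a host that has been taken out of service stop receiving traffic even from users who still carry its cookie.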

The list of available hosts is kept in a file that is generated from the results of availability checks, which are run from the frontend that serves the site's HTML. This lets us take out of service any host that has become unavailable for some reason.
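The article does not show how the file is produced, so the following is only a guess at the shape of that step. It assumes a checker that yields a list of `{ host, ok }` results (a hypothetical structure) and relies on the one fact the Lua code does confirm: the file holds one host per line.

```javascript
// Hypothetical sketch: turn health-check results into the contents
// of the active hosts file (one host per line, as getCdnHosts expects).
function buildActiveHostsFile(probeResults) {
    return probeResults
        .filter(function (result) { return result.ok; })   // drop failed hosts
        .map(function (result) { return result.host; })
        .join("\n");
}

// Made-up check results for illustration
var fileContents = buildActiveHostsFile([
    { host: "//cdn-a.example", ok: true },
    { host: "//cdn-b.example", ok: false },
    { host: "//cdn-c.example", ok: true }
]);
```

Whatever produces the file, the filter reads it while serving requests, so it is worth replacing it in one step rather than rewriting it in place.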

We currently have 5 CDN hosts: three in Almaty and one each in Astana and Shymkent. Each host is served by two Supermicro servers (for fault tolerance). Each server runs OpenResty plus a 120 GB Memcached for caching photos.

As a result, we reduced traffic to the main data center (from 1.2 Gbit/s to 400 Mbit/s) and increased the total traffic from us to users (from 1.2 Gbit/s to 1.5 Gbit/s). Photos stopped being slow for users of particular Internet providers (which often happened before the CDN was introduced), and in general our customers became happier.
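To put these figures in proportion, here is a small calculation. It is only arithmetic on the numbers quoted above; `offloadStats` is a hypothetical helper, and reading the second ratio as the main data center's share of delivery is an assumption about how the traffic was measured.

```javascript
// Hypothetical helper: all arguments are in Mbit/s.
function offloadStats(originBefore, originAfter, totalAfter) {
    return {
        // How much of the main data center's traffic went away
        originReduction: (originBefore - originAfter) / originBefore,
        // Main data center traffic relative to everything delivered to users
        originShareOfTotal: originAfter / totalAfter
    };
}

var stats = offloadStats(1200, 400, 1500);
// originReduction    = 800 / 1200, about 67%
// originShareOfTotal = 400 / 1500, about 27%
```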

In the near future, we plan to install servers in the data centers of mobile operators, since the problem is even more relevant for mobile Internet users.

Source: https://habr.com/ru/post/328844/

