Have you ever wondered about questions like these?
- How does the world feel about loading libraries from a CDN?
- How many successful sites run on WordPress?
- Which scripts do developers most often load from the Google CDN?
- How popular is jQuery?
I have. And I did not just wonder: I did a little research.
I also wrote a small extension for Chromium that will perhaps make life better,
or break the Internet.
Results inside.
Conclusions for the lazy, or TL;DR
- About 10% of the 300,000 most popular sites use WordPress.
- Popular sites that use jQuery are moving to loading the library from a CDN. Every year there are more of these sensible people.
- The most popular versions of jQuery in the world: 1.7.x, 1.8.x, 1.9.1, 1.10.2.
- jQuery 1.7.x leads by a wide margin: roughly every fourth jQuery include is version 1.7.1 or 1.7.2.
- Google, jQuery and Cloudflare are the most popular CDNs.
- 89% of the sites that load anything from the Google CDN load jQuery from it.
How it all began, or the prelude
I got to wondering:
why don't browsers ship popular JS libraries in their distributions? After all,
a CDN is very good: one URL per resource, caching, and so on. But it is even better not to download static files at all and to have them in the browser from the start.
In response to this injustice of fate,
I built this prototype extension, which is designed to speed up the Internet.
But you can't just put forward a couple of hypotheses, knock together a prototype, and then rest on your laurels: the brain demands evidence, facts and some fun (
yes, that is how I think of interesting research, although there was little fun in preparing the data).
Why investigate?
I started with a few ideas:
- All static files from a CDN can be painlessly placed in the browser, because they are not modified and are effectively permanent.
- If a great many people load static files from the browser without sending requests to CDN servers, everyone benefits.
- If you store all common static files (read: JS libraries) locally and assume that sites are written by good programmers who do not modify minified files such as jquery-1.7.2.min.js, then such files are permanent too, and points 1 and 2 apply to them.
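The third idea depends on recognising a self-hosted file as a known library release from its name alone. Here is a minimal Python sketch of such a check, assuming file names follow the `name-version.min.js` convention. The function name and the regular expression are illustrative assumptions and are not taken from the extension.

```python
import re
from urllib.parse import urlsplit

# Hypothetical pattern: "<library>-<major.minor[.patch]>.min.js".
_MINIFIED = re.compile(
    r"^(?P<name>[a-z][a-z0-9.-]*?)-(?P<version>\d+(?:\.\d+){1,2})\.min\.js$"
)


def parse_minified_name(url):
    """Return (library, version) if the URL's file name looks like a
    versioned minified library, otherwise None."""
    path = urlsplit(url).path                    # drops "?ver=..." and "#..."
    filename = path.rsplit("/", 1)[-1].lower()   # last path segment
    match = _MINIFIED.match(filename)
    if match is None:
        return None
    return match.group("name"), match.group("version")
```

A file name match says nothing about whether the contents were modified. Comparing a hash of the file against the official release would be the stricter check.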
These ideas needed confirmation. While implementing the extension, I ran into additional questions:
- Is it true that jQuery is the most popular script?
- What percentage of sites load scripts from a CDN?
- What versions of jQuery do people use?
- Do enough self-hosted minified libraries match the expected file-name pattern?
What are we researching?
Initially, I wanted to use the
Common Crawl dataset. But this beast weighs
81 TB, and given the time and money its analysis would take, I left the beast alone.
A little later, I came across a
wonderful article in which the author studied the Internet on exactly the topic I needed.
I did not find the answers I was after in the article, but I did find the right tools!
Study
To get the answers I needed, I used
httparchive. It is a data set produced by a crawler that visits the sites in the top 300,000 of the
Alexa service. In other words, it is a huge collection of the most popular sites on the Internet.
I downloaded the most recent dump: the crawl results for
March 1, 2014.
Below I give the results of the study and the queries I used to obtain them.
You can compare my results with the results
obtained a year earlier.
Number of sites loading jQuery from CDN
    SELECT "jquery" AS name,
           count(distinct(pageid)) AS count,
           (100*count(distinct(pageid))/290835) AS percent
    FROM requests
    WHERE pageid <= 14802750 AND pageid >= 14489007
      AND url LIKE "%//ajax.googleapis.com/ajax/libs/jquery/%";
Name | Count | % of all sites |
---|---|---|
jquery | 59977 | 20.6223 |
Every year, more sites load jQuery from one CDN or another. Progress does not stand still, and people are realising how good this approach is.
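The query counts distinct pages, not requests, so a page that requests the library twice is still counted once. Below is a small Python sketch of the same counting over an in-memory list of `(pageid, url)` pairs. The sample rows and helper names are made up for illustration. Only the URL marker and the two totals come from the query and the table above.

```python
GOOGLE_JQUERY = "//ajax.googleapis.com/ajax/libs/jquery/"


def pages_matching(requests, marker):
    """Return the set of page ids with at least one request URL containing marker."""
    return {pageid for pageid, url in requests if marker in url}


def percent(part, total):
    """Share of `part` in `total`, in percent."""
    return 100 * part / total


# Made-up sample rows in the shape (pageid, url).
sample = [
    (1, "http://ajax.googleapis.com/ajax/libs/jquery/1.7.2/jquery.min.js"),
    (1, "https://ajax.googleapis.com/ajax/libs/jquery/1.7.2/jquery.min.js"),
    (2, "http://example.com/js/app.js"),
    (3, "//ajax.googleapis.com/ajax/libs/jquery/1.10.2/jquery.min.js"),
]

print(sorted(pages_matching(sample, GOOGLE_JQUERY)))  # page 1 is counted once
print(round(percent(59977, 290835), 4))               # totals from the table
```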
Popularity of different versions of jQuery from Google CDN
Here I modified the original query. My goal is to examine the share of each jQuery version among all sites that load jQuery at all. Other authors' articles have small problems that obscure the result:
- Some sites use "short format" versions, for example //ajax.googleapis.com/ajax/libs/jquery/1/jquery.min.js. Today, this format corresponds to jquery-1.9.1. I account for this in the totals.
- WordPress appends a ?ver=wpversion parameter to static files, which affects grouping by URL.
- When studying how often each version occurs, it makes no difference which protocol is used, http or https.
    SELECT SUBSTRING(
               url
               FROM POSITION("/libs/jquery/" IN url) + 13
               FOR LOCATE("/jquery", url, POSITION("/libs/jquery/" IN url) + 13)
                   - (POSITION("/libs/jquery/" IN url) + 13)
           ) AS version,
           count(distinct(pageid)) AS count,
           (100*count(distinct(pageid))/59977) AS percent
    FROM requests
    WHERE pageid >= 14489007 AND pageid <= 14802750
      AND url LIKE "%//ajax.googleapis.com/ajax/libs/jquery/%.min.js"
    GROUP BY version
    ORDER BY count DESC;
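The SUBSTRING and LOCATE arithmetic in the query cuts out the path segment between `/libs/jquery/` and the next `/jquery`. The same idea may be easier to follow outside SQL. This is a sketch in Python, not a line-by-line port of the query. The function name is an assumption, and the mapping of the short form `1` to 1.9.1 simply restates the note in the list above.

```python
import re

# Captures the version segment of a Google CDN jQuery URL,
# e.g. ".../ajax/libs/jquery/1.7.2/jquery.min.js" -> "1.7.2".
_VERSION = re.compile(r"//ajax\.googleapis\.com/ajax/libs/jquery/([^/]+)/jquery")

# From the article: the short form "1" resolved to 1.9.1 at the time of the study.
SHORT_FORMS = {"1": "1.9.1"}


def jquery_version(url):
    """Return the jQuery version encoded in a Google CDN URL, or None."""
    match = _VERSION.search(url)   # search() makes the http/https prefix irrelevant
    if match is None:
        return None
    version = match.group(1)
    return SHORT_FORMS.get(version, version)
```

Because only the path segment is inspected, a trailing `?ver=...` parameter does not affect the extracted version.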
Version | Number of sites | % |
---|---|---|
1.7.2 | 8938 | 14.9024 |
1.7.1 | 6842 | 11.4077 |
1.8.3 | 5670 | 9.4536 |
1.9.1 | 5533 | 9.2252 |
1.10.2 | 5244 | 8.7434 |
1.8.2 | 3832 | 6.3891 |
1.4.2 | 3673 | 6.1240 |
1.3.2 | 2519 | 4.1999 |
1.5.2 | 2297 | 3.8298 |
1.6.4 | 1987 | 3.3129 |
1.4.4 | 1985 | 3.3096 |
1.6.2 | 1644 | 2.7411 |
1.6.1 | 1395 | 2.3259 |
1.5.1 | 1160 | 1.9341 |
1.9.0 | 964 | 1.6073 |
1.8.1 | 880 | 1.4672 |
1.10.1 | 868 | 1.4472 |
1.8.0 | 803 | 1.3388 |
2.0.3 | 508 | 0.8470 |
1.2.6 | 449 | 0.7486 |
1.7.0 | 403 | 0.6719 |
1.4.1 | 382 | 0.6369 |
1.11.0 | 363 | 0.6052 |
1.4.3 | 357 | 0.5952 |
2.0.0 | 246 | 0.4102 |
1.6.0 | 204 | 0.3401 |
1.6.3 | 193 | 0.3218 |
1.3.1 | 112 | 0.1867 |
1.5.0 | 104 | 0.1734 |
1.4.0 | 83 | 0.1384 |
1.10.0 | 79 | 0.1317 |
2.0.2 | 74 | 0.1234 |
2.1.0 | 68 | 0.1134 |
1.3.0 | 42 | 0.0700 |
2.0.1 | 19 | 0.0317 |
1.2.3 | 13 | 0.0217 |
An interesting trend in the jQuery world: version
1.7.x has led year after year by a
wide margin.
Most popular CDNs distributing JS libraries
Parameter | Number | % of all sites |
---|---|---|
Total number of sites using a CDN | 78160 | 26.8743 |
    SELECT "Google" AS name, count(distinct(pageid)) AS count, (100*count(distinct(pageid))/78160) AS percent
    FROM requests
    WHERE pageid >= 14489007 AND pageid <= 14802750
      AND url LIKE "%//ajax.googleapis.com/ajax/libs/%"
    UNION
    SELECT "Yandex" AS name, count(distinct(pageid)) AS count, (100*count(distinct(pageid))/78160) AS percent
    FROM requests
    WHERE pageid >= 14489007 AND pageid <= 14802750
      AND url LIKE "%//yandex.st/%"
    UNION
    SELECT "Microsoft" AS name, count(distinct(pageid)) AS count, (100*count(distinct(pageid))/78160) AS percent
    FROM requests
    WHERE pageid >= 14489007 AND pageid <= 14802750
      AND url LIKE "%//ajax.aspnetcdn.com/ajax/%"
    UNION
    SELECT "JsDelivr" AS name, count(distinct(pageid)) AS count, (100*count(distinct(pageid))/78160) AS percent
    FROM requests
    WHERE pageid >= 14489007 AND pageid <= 14802750
      AND url LIKE "%//cdn.jsdelivr.net/%"
    UNION
    SELECT "Cloudflare" AS name, count(distinct(pageid)) AS count, (100*count(distinct(pageid))/78160) AS percent
    FROM requests
    WHERE pageid >= 14489007 AND pageid <= 14802750
      AND url LIKE "%//cdnjs.cloudflare.com/ajax/libs/%"
    UNION
    SELECT "jQuery" AS name, count(distinct(pageid)) AS count, (100*count(distinct(pageid))/78160) AS percent
    FROM requests
    WHERE pageid >= 14489007 AND pageid <= 14802750
      AND url LIKE "%//code.jquery.com/%"
    GROUP BY name
    ORDER BY count DESC;
CDN | Count | Percent |
---|---|---|
Google | 67671 | 86.5801 |
jQuery | 9222 | 11.7989 |
Cloudflare | 3996 | 5.1126 |
Yandex | 2379 | 3.0438 |
Microsoft | 1300 | 1.6633 |
Jsdelivr | 324 | 0.4145 |
As we can see, the lion's share of sites load their resources from the
Google CDN.
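The per-CDN counting can also be sketched outside the database. The Python below reuses the URL markers from the LIKE patterns in the query above. The function names and the `(pageid, url)` input shape are assumptions made for illustration, and a URL is attributed to the first marker it contains.

```python
# URL markers copied from the LIKE patterns in the query above.
CDN_MARKERS = {
    "Google": "//ajax.googleapis.com/ajax/libs/",
    "Yandex": "//yandex.st/",
    "Microsoft": "//ajax.aspnetcdn.com/ajax/",
    "JsDelivr": "//cdn.jsdelivr.net/",
    "Cloudflare": "//cdnjs.cloudflare.com/ajax/libs/",
    "jQuery": "//code.jquery.com/",
}


def cdn_of(url):
    """Return the name of the listed CDN serving the URL, or None."""
    for name, marker in CDN_MARKERS.items():
        if marker in url:
            return name
    return None


def sites_per_cdn(requests):
    """Map each CDN name to the set of page ids that load something from it."""
    pages = {name: set() for name in CDN_MARKERS}
    for pageid, url in requests:
        name = cdn_of(url)
        if name is not None:
            pages[name].add(pageid)
    return pages
```

One site can load files from several CDNs, so the per-CDN sets overlap. That is why the percentages in the table add up to more than 100%.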
Let's now look at what exactly is loaded from the Google CDN. It is interesting, though the result is predictable.
Profile of scripts loaded from Google CDN
    SELECT "jquery" AS name, count(distinct(pageid)) AS count, (100*count(distinct(pageid))/67198) AS percent
    FROM requests
    WHERE pageid <= 14802750 AND pageid >= 14489007
      AND url LIKE "%//ajax.googleapis.com/ajax/libs/jquery/%"
    UNION
    SELECT "jquerymobile" AS name, count(distinct(pageid)) AS count, (100*count(distinct(pageid))/67198) AS percent
    FROM requests
    WHERE pageid <= 14802750 AND pageid >= 14489007
      AND url LIKE "%//ajax.googleapis.com/ajax/libs/jquerymobile/%"
    UNION
    SELECT "angularjs" AS name, count(distinct(pageid)) AS count, (100*count(distinct(pageid))/67198) AS percent
    FROM requests
    WHERE pageid <= 14802750 AND pageid >= 14489007
      AND url LIKE "%//ajax.googleapis.com/ajax/libs/angularjs/%"
    UNION
    SELECT "chrome-frame" AS name, count(distinct(pageid)) AS count, (100*count(distinct(pageid))/67198) AS percent
    FROM requests
    WHERE pageid <= 14802750 AND pageid >= 14489007
      AND url LIKE "%//ajax.googleapis.com/ajax/libs/chrome-frame/%"
    UNION
    SELECT "dojo" AS name, count(distinct(pageid)) AS count, (100*count(distinct(pageid))/67198) AS percent
    FROM requests
    WHERE pageid <= 14802750 AND pageid >= 14489007
      AND url LIKE "%//ajax.googleapis.com/ajax/libs/dojo/%"
    UNION
    SELECT "ext-core" AS name, count(distinct(pageid)) AS count, (100*count(distinct(pageid))/67198) AS percent
    FROM requests
    WHERE pageid <= 14802750 AND pageid >= 14489007
      AND url LIKE "%//ajax.googleapis.com/ajax/libs/ext-core/%"
    UNION
    SELECT "jqueryui" AS name, count(distinct(pageid)) AS count, (100*count(distinct(pageid))/67198) AS percent
    FROM requests
    WHERE pageid <= 14802750 AND pageid >= 14489007
      AND url LIKE "%//ajax.googleapis.com/ajax/libs/jqueryui/%"
    UNION
    SELECT "mootools" AS name, count(distinct(pageid)) AS count, (100*count(distinct(pageid))/67198) AS percent
    FROM requests
    WHERE pageid <= 14802750 AND pageid >= 14489007
      AND url LIKE "%//ajax.googleapis.com/ajax/libs/mootools/%"
    UNION
    SELECT "prototype" AS name, count(distinct(pageid)) AS count, (100*count(distinct(pageid))/67198) AS percent
    FROM requests
    WHERE pageid <= 14802750 AND pageid >= 14489007
      AND url LIKE "%//ajax.googleapis.com/ajax/libs/prototype/%"
    UNION
    SELECT "scriptaculous" AS name, count(distinct(pageid)) AS count, (100*count(distinct(pageid))/67198) AS percent
    FROM requests
    WHERE pageid <= 14802750 AND pageid >= 14489007
      AND url LIKE "%//ajax.googleapis.com/ajax/libs/scriptaculous/%"
    UNION
    SELECT "swfobject" AS name, count(distinct(pageid)) AS count, (100*count(distinct(pageid))/67198) AS percent
    FROM requests
    WHERE pageid <= 14802750 AND pageid >= 14489007
      AND url LIKE "%//ajax.googleapis.com/ajax/libs/swfobject/%"
    UNION
    SELECT "webfontloader" AS name, count(distinct(pageid)) AS count, (100*count(distinct(pageid))/67198) AS percent
    FROM requests
    WHERE pageid <= 14802750 AND pageid >= 14489007
      AND url LIKE "%//ajax.googleapis.com/ajax/libs/webfont/%"
    ORDER BY count DESC;
Script | Count | Percent |
---|---|---|
jquery | 59977 | 89.2541 |
jqueryui | 12437 | 18.5080 |
webfontloader | 4624 | 6.8812 |
swfobject | 2347 | 3.4927 |
prototype | 993 | 1.4777 |
scriptaculous | 787 | 1.1712 |
mootools | 445 | 0.6622 |
angularjs | 353 | 0.5253 |
dojo | 186 | 0.2768 |
chrome-frame | 75 | 0.1116 |
ext-core | 16 | 0.0238 |
jquerymobile | 1 | 0.0015 |
jQuery really is the most popular script. It beats the other libraries
by an order of magnitude!
Notice the intriguing result?
jQuery Mobile is loaded from the Google CDN on only one site! This is not a mistake, I checked three times :)
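The query for this profile names every library by hand. An alternative sketch, which is not the method used in the article, groups requests by the path segment that follows `/ajax/libs/`, so no list of libraries is needed. The function name and the `(pageid, url)` input shape are assumptions. Note that the directory for the web font loader is `webfont`, as in the query.

```python
import re
from collections import defaultdict

# Captures the directory right after "/ajax/libs/", which names the library.
_LIBRARY = re.compile(r"//ajax\.googleapis\.com/ajax/libs/([^/]+)/")


def library_profile(requests):
    """Return {library: number of distinct pages} for Google CDN requests."""
    pages = defaultdict(set)
    for pageid, url in requests:
        match = _LIBRARY.search(url)
        if match:
            pages[match.group(1)].add(pageid)
    return {name: len(ids) for name, ids in pages.items()}
```

A page that loads both jquery and jqueryui is counted under both names, which is why the percentages in the table exceed 100% in total.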
Approximate WordPress Impact
While analyzing the data, I noticed a recurring pattern that adds noise to the results: a puzzling parameter in requests for static files,
?ver=xxx.
As it turned out, this is mostly WordPress's doing! It appends a version parameter to static files.
There are a few other characteristic patterns as well: some sites add
cache busting to all resources, including static files from a CDN.
Let's go back to WordPress. I found some interesting patterns that allow a simple heuristic for estimating how widespread WordPress is:
- WordPress uses the jquery-migrate plugin. This plugin is fairly rare elsewhere and restores deprecated jQuery features from older versions in version 1.9+.
- As mentioned above, WordPress appends a version parameter to resources.
Using this knowledge, we obtain the following.
    SELECT count(distinct(pageid)) AS count,
           (100*count(distinct(pageid))/290835) AS percent
    FROM requests
    WHERE pageid >= 14489007 AND pageid <= 14802750
      AND (url LIKE "%jquery-migrate%.js\\?ver=%"
           OR url LIKE "%jquery-migrate%.js\\?v=%");
Number of sites | % of the total |
---|---|
29819 | 10.2529 |
As you can see,
more than 10% of the most visited sites in the world use WordPress.
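The heuristic can be expressed as a single pattern over the URLs a page requests. A minimal Python sketch follows. The function name is hypothetical, and the regular expression mirrors the two LIKE patterns from the query (jquery-migrate followed by `?ver=` or `?v=`).

```python
import re

# jquery-migrate script with a "?ver=" or "?v=" version parameter,
# mirroring the LIKE patterns of the query.
_WP_HINT = re.compile(r"jquery-migrate.*\.js\?v(?:er)?=")


def looks_like_wordpress(urls):
    """Heuristic only: True if any requested URL matches the jquery-migrate pattern."""
    return any(_WP_HINT.search(url) for url in urls)
```

Like the query, this is an estimate: a non-WordPress site that loads jquery-migrate with a version parameter is counted, and a WordPress site that strips the parameter is missed.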
P.S. No sites were harmed during this study. The extension, however, can
break something. If you decide to use it anyway and run into such behavior, send me
a private message.
P.P.S. If you have interesting questions, ask them
in the comments. I will update the article and add the answers.