
The purpose of the study was to obtain an up-to-date list of all active domains in the .RU zone. As of 01.01.2016,
5040277 names were registered in it. We decided to visit each name with a crawler and analyze the results.
The server responses were as follows:

Full table with response codes:
| rc | cnt | % |
| 200 | 2670175 | 53.0 |
| IPFAIL | 826869 | 16.4 |
| TIMEOUT | 486924 | 9.7 |
| 301 | 444719 | 8.8 |
| 404 | 191831 | 3.8 |
| 302 | 176133 | 3.5 |
| 403 | 108624 | 2.2 |
| 503 | 43330 | 0.9 |
| CHARSETFAIL | 32606 | 0.6 |
| 500 | 19603 | 0.4 |
| 401 | 6847 | 0.1 |
| 303 | 5919 | 0.1 |
| 429 | 5501 | 0.1 |
| 502 | 5340 | 0.1 |
| 402 | 4232 | 0.1 |
| 0 | 2954 | 0.1 |
| NONHTML | 1796 | 0.0 |
| 423 | 1688 | 0.0 |
| 400 | 1654 | 0.0 |
| 409 | 1125 | 0.0 |
| 307 | 1014 | 0.0 |
| 521 | 273 | 0.0 |
| 999 | 203 | 0.0 |
| 410 | 191 | 0.0 |
| 523 | 150 | 0.0 |
| 504 | 138 | 0.0 |
| 509 | 98 | 0.0 |
| 508 | 93 | 0.0 |
| 204 | 46 | 0.0 |
| 520 | 45 | 0.0 |
| 434 | 32 | 0.0 |
| CLEX | 32 | 0.0 |
| 406 | 20 | 0.0 |
| 501 | 14 | 0.0 |
| 479 | 8 | 0.0 |
| 407 | 8 | 0.0 |
| 418 | 7 | 0.0 |
| 405 | 7 | 0.0 |
| 451 | 4 | 0.0 |
| 435 | 4 | 0.0 |
| 304 | 4 | 0.0 |
| 201 | 3 | 0.0 |
| 300 | 2 | 0.0 |
| 456 | 2 | 0.0 |
| 3 | 1 | 0.0 |
| 507 | 1 | 0.0 |
| 101 | 1 | 0.0 |
| 126 | 1 | 0.0 |
| 422 | 1 | 0.0 |
| 557 | 1 | 0.0 |
| 412 | 1 | 0.0 |
| 413 | 1 | 0.0 |
| 420 | 1 | 0.0 |
| Total : | 5040277 | 100.0 |
IPFAIL - the domain could not be resolved (not delegated, name servers are not specified, etc.).
TIMEOUT - an IP address was obtained, but the server returned nothing and the connection timed out.
CHARSETFAIL - the content encoding could not be detected.
NONHTML - sites whose web servers did not interpret the scripts but served them as plain text, along with database connection details and other delights.
CLEX - crawler exceptions caused by a response size > 10 MB.
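The labels above can be produced by a small classification step. What follows is only a minimal sketch, not the crawler used in the study: `classify`, `resolve` and `fetch` are hypothetical names, and the two callables are passed in so the logic can be tried without touching the network. CHARSETFAIL and NONHTML need content inspection and are left out here.

```python
import socket

# CLEX threshold taken from the text: responses larger than 10 MB.
MAX_BODY = 10 * 1024 * 1024


def classify(domain, resolve, fetch):
    """Return a response label for one domain.

    resolve(domain) -> ip string, raising socket.gaierror on DNS failure.
    fetch(ip, domain) -> (status_code, body_bytes), raising socket.timeout
    when the server does not answer in time.
    Both callables are hypothetical hooks, not part of the original crawler.
    """
    try:
        ip = resolve(domain)
    except socket.gaierror:
        return "IPFAIL"      # the name could not be resolved
    try:
        status, body = fetch(ip, domain)
    except socket.timeout:
        return "TIMEOUT"     # an IP was obtained, but nothing came back
    if len(body) > MAX_BODY:
        return "CLEX"        # response too large for the crawler
    return str(status)       # otherwise the label is the HTTP status code
```

In a real run `resolve` could be `socket.gethostbyname`, which raises `socket.gaierror` when a name does not resolve.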
301 redirect (permanent):
bulk - a zoo of satellite-site networks, alternative site addresses, and so on.
| target | cnt | % |
| http://www.domain | 215289 | 48.4 |
| bulk | 144275 | 32.4 |
| http://domain/page | 76417 | 17.2 |
| https://domain | 7617 | 1.7 |
| https://www.domain | 1121 | 0.3 |
| Total : | 444719 | 100.0 |
302 redirect (temporary):
bulk - the same site networks, error pages, installers of various CMSs, etc.
| target | cnt | % |
| bulk | 135464 | 76.9 |
| http://domain/page | 22658 | 12.9 |
| http://www.domain | 10660 | 6.1 |
| https://domain | 7168 | 4.1 |
| https://www.domain | 183 | 0.1 |
| Total : | 176133 | 100.0 |
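The rows of the two redirect tables can be derived from the Location header. The sketch below is one possible way to do it, under assumptions that the article does not state: `redirect_bucket` is a hypothetical name, only a redirect to the site root counts as a www or https target, and the page row is taken to mean a non-root path on the same bare domain over http. Everything else falls into bulk.

```python
from urllib.parse import urljoin, urlsplit


def redirect_bucket(domain, location):
    """Map a Location header value to one of the rows of the redirect tables."""
    domain = domain.lower()
    # Resolve relative values such as "/index.php" against the requested URL.
    parts = urlsplit(urljoin(f"http://{domain}/", location))
    host = (parts.hostname or "").lower()
    is_root = parts.path in ("", "/") and not parts.query

    if parts.scheme in ("http", "https") and is_root:
        if host == "www." + domain:
            return f"{parts.scheme}://www.domain"
        if host == domain and parts.scheme == "https":
            return "https://domain"
    if parts.scheme == "http" and host == domain and not is_root:
        return "http://domain/page"
    return "bulk"  # other hosts, loops, anything that fits no row


print(redirect_bucket("example.ru", "http://www.example.ru/"))
print(redirect_bucket("example.ru", "/index.php"))
```

The first call falls into the http://www.domain row and the second into http://domain/page.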
We also check redirects made through meta refresh, but this time there is nothing interesting there. It is the most popular way to send a user to a bundle of exploits.
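A meta refresh redirect lives in the page body rather than in the HTTP headers, so it has to be pulled out of the HTML. Here is a minimal sketch with the standard library parser; `MetaRefreshFinder` and `meta_refresh_target` are hypothetical names, and the sketch only handles the common `content="N; url=..."` form.

```python
from html.parser import HTMLParser


class MetaRefreshFinder(HTMLParser):
    """Remember the target of the first <meta http-equiv="refresh"> tag."""

    def __init__(self):
        super().__init__()
        self.target = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.target is not None:
            return
        a = {name: (value or "") for name, value in attrs}
        if a.get("http-equiv", "").lower() != "refresh":
            return
        # content looks like "0; url=http://example.com/"
        _, _, rest = a.get("content", "").partition(";")
        key, _, url = rest.strip().partition("=")
        if key.strip().lower() == "url" and url:
            self.target = url.strip().strip("'\"")


def meta_refresh_target(html):
    """Return the meta refresh URL found in html, or None if there is none."""
    finder = MetaRefreshFinder()
    finder.feed(html)
    return finder.target
```

A page that only reloads itself (`content="5"` with no `url=` part) yields `None` here.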
All 2670175 domains that returned 200 OK run on 192213 IP addresses, top 10:
Here we meet some really interesting guys: 180983 domains on IP 109.206.190.54 (6.77% of all active ones) are mirrors of www.homes.ru (matched not only by IP, of course). They beat even the parking services by a huge margin. They work on a grand scale.
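Grouping the active domains by IP address is a plain counting task. The sketch below assumes the crawl results are available as a mapping from domain to resolved IP; `top_ips` is a hypothetical name and the sample data is made up (the addresses come from the documentation range 192.0.2.0/24).

```python
from collections import Counter


def top_ips(domain_to_ip, n=10):
    """Return (ip, domain_count, percent_of_all_domains) for the n busiest IPs."""
    counts = Counter(domain_to_ip.values())
    total = len(domain_to_ip)
    return [(ip, cnt, round(100 * cnt / total, 2))
            for ip, cnt in counts.most_common(n)]


# Made-up sample, only to show the shape of the result.
sample = {
    "a.ru": "192.0.2.1",
    "b.ru": "192.0.2.1",
    "c.ru": "192.0.2.2",
    "d.ru": "192.0.2.1",
}
print(top_ips(sample, n=1))  # [('192.0.2.1', 3, 75.0)]
```

Sharing an IP alone does not prove that sites are mirrors, which is why the comparison in the text was not made by IP only.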
A few average values for the content of RuNet home pages:
| Average title length | 47 |
| Average keywords length | 220 |
| Average words per page | 515 |
| Average page weight (in octets) | 42320 |
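Per-page values of this kind can be collected with the standard library HTML parser and then averaged over all pages. The article does not say how the figures were measured, so the sketch below rests on assumptions: lengths are counted in characters, keywords means the content of the `<meta name="keywords">` tag, words are counted in visible text outside `<title>`, `<script>` and `<style>`, and weight is the size of the raw response body. `PageStats` and `page_stats` are hypothetical names.

```python
import re
from html.parser import HTMLParser


class PageStats(HTMLParser):
    """Collect the title, meta keywords and a word count from one page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.keywords = ""
        self.words = 0
        self._in_title = False
        self._skip = 0  # nesting depth inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag in ("script", "style"):
            self._skip += 1
        elif tag == "meta":
            a = {name: (value or "") for name, value in attrs}
            if a.get("name", "").lower() == "keywords":
                self.keywords = a.get("content", "")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif not self._skip:
            self.words += len(re.findall(r"\w+", data))


def page_stats(raw, encoding="utf-8"):
    """Return the four per-page values for a raw response body (bytes)."""
    parser = PageStats()
    parser.feed(raw.decode(encoding, errors="replace"))
    return {
        "title_len": len(parser.title.strip()),
        "keywords_len": len(parser.keywords),
        "words": parser.words,
        "weight": len(raw),  # octets
    }
```

Averaging each key of the returned dictionaries over all pages that answered 200 OK would give a table of the same shape as the one above.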
The word 'habrahabr' occurs in the text on 262 domains.
Links from the home pages to Habr user profiles
List of domains that returned 200 OK: dataoperator.ru/ru_domains_200_ok.zip