
Two-faced REQUEST_URI, or in search of a correct HTTP/1.1 server

Do you know how %{REQUEST_URI} in Apache mod_rewrite differs from $_SERVER["REQUEST_URI"] in PHP?

Can you write, in .htaccess at the Apache level, a correct 301 redirect from a domain with the www prefix, or to it?

For the last question I still cannot offer a solution. The reason lies in the HTTP/1.1 protocol, which I had to study in more detail while “reinventing the wheel” (writing the core of a site).
It all comes down to the "Host:" request header. Under certain conditions it can contain anything, and HTTP/1.1 requires the server to ignore it completely. Most developers nevertheless rely on the value of this field, for example for SEO optimizations. Looking ahead, I will say that an additional proxy (for example, nginx) solves this problem.

To illustrate the incorrect behavior of servers, I decided to go through the websites of the companies listed on Habr. I checked a dozen sites by hand and then discovered that some sites respond "correctly" to erroneous requests. After that I wrote a small testing utility, which let me increase both the number of request templates and the number of sites checked.

What does REQUEST_URI hide in HTTP/1.1?



Theory



HTTP/1.0


I will begin with the HTTP/1.0 protocol, which is described in RFC 1945 (www.w3.org/Protocols/rfc1945/rfc1945) and dated May 1996. To get the desired page, it was enough to connect to the server and send a single line:
  GET /path/to/resource.html HTTP/1.0

When accessing a proxy server, the full address had to be used instead of the absolute path:
  GET http://domain.name/path/to/resource.html HTTP/1.0

All of this is described in Section 5.1.2, Request-URI.

The arrival of "Host:"


So that one server could serve several domain names at once, the protocol's creators added the “Host:” request header, which was meant to contain the domain being accessed. Although this header is not part of the HTTP/1.0 standard, some servers and clients came to support it. For example, wget sends requests over HTTP/1.0 but adds “Host:”.

HTTP/1.1


In June 1999 (fourteen years ago) the HTTP/1.1 protocol appeared, described in RFC 2616 (www.w3.org/Protocols/rfc2616/rfc2616.html). Section 14.23 of the new protocol requires every request to include a “Host” header field:
A client MUST include a Host header field in all HTTP/1.1 request messages. If the requested URI does not include an Internet host name for the service being requested, then the Host header field MUST be given with an empty value.


In addition, significant changes were made to the Request-URI in the request line (section 5.1.2). As in the previous protocol, the full address is required for requests to proxy servers ("The absoluteURI form is REQUIRED when the request is being made to a proxy."). But every server must accept such requests, even though clients will send them only to proxies:
To allow for transition to absoluteURIs in all requests in future versions of HTTP, all HTTP/1.1 servers MUST accept the absoluteURI form in requests, even though HTTP/1.1 clients will only generate them in requests to proxies.


Note that a transition to full addresses was anticipated (absoluteURI, for example http://www.w3.org/pub/WWW/TheProject.html), so clients are not obliged to use only absolute paths (abs_path, for example /pub/WWW/TheProject.html). Moreover, servers are explicitly required to handle client requests with an absoluteURI, so I dismiss right away the objection that such a client request is incorrect: "the client is always right."

Host in HTTP/1.1


The changes to the Request-URI may seem harmless, but Section 5.2 contains one important requirement: “If Request-URI is an absoluteURI, the host is part of the Request-URI. Any Host header field value in the request MUST be ignored.” That is, the request
  GET http://domain.name/path/to/resource.html HTTP/1.1
  Host: any_text_here

must be interpreted exactly like the request
  GET /path/to/resource.html HTTP/1.1
  Host: domain.name


Does your server ignore “Host:” when the request uses an absoluteURI?
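The rule from Section 5.2 can be sketched in a few lines of JavaScript. This is only an illustration under stated assumptions: effectiveHost is a hypothetical helper name, the request line is assumed to have exactly three space-separated parts, and a port, if present, is returned together with the host.

```javascript
// Hypothetical helper (not from any library): picks the host a server should
// use for a request, following the rule quoted from RFC 2616, section 5.2.
function effectiveHost(requestLine, hostHeader) {
  // Request-Line = Method SP Request-URI SP HTTP-Version
  var parts = requestLine.split(' ');
  if (parts.length !== 3) {
    throw new Error('Malformed request line: ' + requestLine);
  }
  var requestUri = parts[1];

  // absoluteURI form, "scheme://host[:port]/path": the host is taken from the
  // Request-URI and the Host header is ignored, whatever it contains.
  var absolute = requestUri.match(/^[A-Za-z][A-Za-z0-9+.-]*:\/\/([^\/?#]*)/);
  if (absolute) {
    return absolute[1].toLowerCase();
  }

  // abs_path form: the host comes from the Host header.
  return (hostHeader || '').toLowerCase();
}

console.log(effectiveHost('GET http://domain.name/path/to/resource.html HTTP/1.1', 'anything'));
// domain.name
console.log(effectiveHost('GET /path/to/resource.html HTTP/1.1', 'domain.name'));
// domain.name
```

Both calls yield the same host, which is exactly the equivalence the two example requests are supposed to have.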

%{REQUEST_URI} and $_SERVER["REQUEST_URI"]


The documentation for mod_rewrite says the following:
THE_REQUEST
The full HTTP request line sent by the browser to the server (e.g., "GET /index.html HTTP/1.1"). This does not include any additional headers sent by the browser. This value has not been unescaped (decoded), unlike most other variables below.

REQUEST_URI
The path component of the requested URI, such as "/index.html". This notably excludes the query string which is available as its own variable named QUERY_STRING.

That is, %{REQUEST_URI} will always contain an absolute path and never a full address.

Try to solve the standard SEO task of adding “www” to a bare domain using mod_rewrite when the user sends the following request:
  GET http://domain.name/path/to/resource.html HTTP/1.1
  Host: www.domain.name
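To make the difficulty concrete, here is a minimal sketch of the decision a server would have to make for this request. It is written in JavaScript rather than mod_rewrite syntax, and wwwRedirectTarget and its canonicalHost parameter are hypothetical names introduced only for illustration. A rule that looks at “Host:” alone sees www.domain.name and does nothing, while the host that actually counts is the one inside the Request-URI.

```javascript
// Hypothetical helper: returns the Location for a 301 redirect to the
// canonical "www" host, or null when no redirect is needed.
// canonicalHost is an assumed configuration value, e.g. 'www.domain.name'.
function wwwRedirectTarget(requestUri, hostHeader, canonicalHost) {
  var host;
  var pathAndQuery;
  var absolute = requestUri.match(/^http:\/\/([^\/?#]*)(.*)$/i);
  if (absolute) {
    // Full address: the host is part of the Request-URI, Host is ignored.
    host = absolute[1];
    pathAndQuery = absolute[2];
  } else {
    // Absolute path: the host comes from the Host header.
    host = hostHeader || '';
    pathAndQuery = requestUri;
  }
  if (pathAndQuery.charAt(0) !== '/') {
    pathAndQuery = '/' + pathAndQuery;
  }
  if (host.toLowerCase() === canonicalHost) {
    return null;
  }
  return 'http://' + canonicalHost + pathAndQuery;
}

console.log(wwwRedirectTarget('http://domain.name/path/to/resource.html',
                              'www.domain.name', 'www.domain.name'));
// http://www.domain.name/path/to/resource.html
console.log(wwwRedirectTarget('/path/to/resource.html',
                              'www.domain.name', 'www.domain.name'));
// null
```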


At the beginning of the article I asked how %{REQUEST_URI} in Apache mod_rewrite differs from $_SERVER["REQUEST_URI"] in PHP, so here is a quote from the PHP documentation:
REQUEST_URI
The URI which was given in order to access this page; for instance, '/index.html'.

Maybe this is configurable somewhere, but my PHP/5.3.13 returns an absoluteURI when the request uses the full address.
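Application code that receives such a value may want to normalize it before routing. The sketch below shows one way to do that, in JavaScript for consistency with the test script later in the article; splitRequestUri is a hypothetical helper, and the regular expression is a simplification rather than a full URI parser.

```javascript
// Hypothetical helper: splits a REQUEST_URI-like value, which may be either
// an absolute path or a full address, into its path and query string.
function splitRequestUri(raw) {
  // Drop a leading "scheme://host[:port]" if the value is a full address.
  var rest = raw.replace(/^[A-Za-z][A-Za-z0-9+.-]*:\/\/[^\/?#]*/, '');
  var q = rest.indexOf('?');
  var path = q === -1 ? rest : rest.slice(0, q);
  var query = q === -1 ? '' : rest.slice(q + 1);
  return { path: path || '/', query: query };
}

console.log(JSON.stringify(splitRequestUri('http://domain.name/index.html?a=1')));
// {"path":"/index.html","query":"a=1"}
console.log(JSON.stringify(splitRequestUri('/index.html')));
// {"path":"/index.html","query":""}
```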

Practice


Let's now look at what happens when requests are sent to real servers. I took the site addresses from the Habr companies page (the list there changes; I took it at the end of last week). I sketched a small Node.js script in which the http_check function sends a single request and full_http_check generates several requests to one server from specific templates.

script code
var net = require('net');

// Returns an empty result record, or the column titles when `title` is true.
var default_result = function(title) {
    if (title) {
        return {'title': 'title', 'step': 'step', 'host': 'host', 'request': 'request',
                'header': 'header', 'full_response': 'full_response', 'response': 'response',
                'server': 'server', 'length': 'length', 'location': 'location', 'error': 'error'};
    } else {
        return {'title': '', 'step': '', 'host': '', 'request': '', 'header': '',
                'full_response': '', 'response': '', 'server': '', 'length': '',
                'location': '', 'error': ''};
    }
};

// Formats one result record as a tab-separated line.
var format_result = function(result) {
    return '' + result['title'].toString() + '\t' + result['step'] + '\t' + result['host'] + '\t' +
        result['request'].toString() + '\t' + result['header'].toString() + '\t' +
        result['response'].toString() + '\t' + result['server'].toString() + '\t' +
        result['length'].toString() + '\t' + result['error'].toString() + '\t' +
        result['location'].toString() + '\t' + result['full_response'].toString();
};

// Sends a single request line (plus an optional "Host:" header) to port 80 of `host`
// and prints the status line and a few response headers.
var http_check = function(title, step, host, req, host_hdr) {
    var host_header = host_hdr || '';
    var result = default_result(false);
    result['title'] = title;
    result['step'] = step;
    result['host'] = host;
    result['request'] = req;
    result['header'] = host_header;
    var dat = '';
    var client = net.connect({port: 80, host: host}, function() { // 'connect' listener
        client.on('data', function (data) {
            dat = dat + data;
            var lines = dat.toString().split('\r\n');
            result['full_response'] = JSON.stringify(dat.toString().split('\r\n\r\n')[0]);
            result['response'] = lines[0] || false;
            if (lines[0].substring(0, 5) == 'HTTP/') {
                var i = 1;
                // Stop at the blank line, or at the end of the data received so far.
                while (i < lines.length && lines[i] != '') {
                    var header = lines[i].match(/^([^:]+:)\s(.+)$/);
                    // Skip lines that are not (yet) complete "Name: value" headers.
                    if (header) {
                        if (header[1] == 'Location:') {
                            result['location'] = header[2];
                        } else if (header[1] == 'Server:') {
                            result['server'] = header[2];
                        } else if (header[1] == 'Content-Length:') {
                            result['length'] = header[2];
                        }
                    }
                    i++;
                }
                // The headers are complete, the body is not needed.
                if (dat.indexOf('\r\n\r\n') >= 0) {
                    client.end();
                    client.destroy();
                }
            } else {
                // No status line at all.
                client.end();
                client.destroy();
            }
        });
        client.on('end', function () {
            console.log('client disconnected');
        });
        client.on('error', function (error) {
            console.log('ERROR: ' + error.toString());
        });
        client.on('timeout', function () {
            console.log('Timeout');
        });
        client.on('close', function (had_error) {
            result['error'] = result['error'] || had_error || '';
            console.log(format_result(result));
        });
        client.write(req + '\r\n');
        host_hdr && client.write('Host: ' + host_hdr + '\r\n');
        client.write('\r\n');
    });
};

// Runs all request templates against one URL.
var full_http_check = function(title, url) {
    var parts = url.match(/^http:\/\/([^\/]+)(.+)$/);
    // 1
    // GET /path/to/resource.html HTTP/1.1
    // Host: domain.name
    http_check(title, '01', parts[1], 'GET ' + parts[2] + ' HTTP/1.1', parts[1]);
    // 2
    // GET http://domain.name/path/to/resource.html HTTP/1.1
    // Host: domain.name
    http_check(title, '02', parts[1], 'GET http://' + parts[1] + parts[2] + ' HTTP/1.1', parts[1]);
    // 3
    // GET /path/to/resource.html HTTP/1.0
    http_check(title, '03', parts[1], 'GET ' + parts[2] + ' HTTP/1.0', '');
    // 4
    // GET /path/to/resource.html HTTP/1.0
    // Host: domain.name
    http_check(title, '04', parts[1], 'GET ' + parts[2] + ' HTTP/1.0', parts[1]);
    // 5
    // GET http://domain.name/path/to/resource.html HTTP/1.0
    http_check(title, '05', parts[1], 'GET http://' + parts[1] + parts[2] + ' HTTP/1.0', '');
    // 6
    // GET http://domain.name/path/to/resource.html HTTP/1.0
    // Host: domain.name
    http_check(title, '06', parts[1], 'GET http://' + parts[1] + parts[2] + ' HTTP/1.0', parts[1]);
    // 7
    // GET http://domain.name/path/to/resource.html HTTP/1.1
    // Host: void.domain.name
    http_check(title, '07', parts[1], 'GET http://' + parts[1] + parts[2] + ' HTTP/1.1', 'void.' + parts[1]);
    // 8
    // GET http://domain.name/path/to/resource.html HTTP/1.1
    // Host: local.fake
    http_check(title, '08', parts[1], 'GET http://' + parts[1] + parts[2] + ' HTTP/1.1', 'local.fake');
    // 9
    // GET http://domain.name/path/to/resource.html HTTP/1.1
    // Host: l-IjFN=fiG(w+J2p:#.{92!m`d^?
    http_check(title, '09', parts[1], 'GET http://' + parts[1] + parts[2] + ' HTTP/1.1', 'l-IjFN=fiG(w+J2p:#.{92!m`d^?');
    // 10
    // GET http://fake.domain.name/path/to/resource.html HTTP/1.1
    // Host: domain.name
    http_check(title, '10', parts[1], 'GET http://fake.' + parts[1] + parts[2] + ' HTTP/1.1', parts[1]);
    // 11
    // GET http://local.fake/path/to/resource.html HTTP/1.1
    // Host: domain.name
    http_check(title, '11', parts[1], 'GET http://local.fake' + parts[2] + ' HTTP/1.1', parts[1]);
    // 12
    // GET http://l-IjFN=fiG(w+J2p:#.{92!m`d^?/path/to/resource.html HTTP/1.1
    // Host: domain.name
    http_check(title, '12', parts[1], 'GET http://l-IjFN=fiG(w+J2p:#.{92!m`d^?' + parts[2] + ' HTTP/1.1', parts[1]);
    // 13
    // GET http://local.fake/path/to/resource.html HTTP/1.1
    // Host: void.domain.name
    http_check(title, '13', parts[1], 'GET http://local.fake' + parts[2] + ' HTTP/1.1', 'void.' + parts[1]);
    // 14
    // GET habr://domain.name/path/to/resource.html HTTP/1.1
    // Host: domain.name
    http_check(title, '14', parts[1], 'GET habr://' + parts[1] + parts[2] + ' HTTP/1.1', parts[1]);
    // 15
    // GET habr://void.domain.name/path/to/resource.html HTTP/1.1
    // Host: domain.name
    http_check(title, '15', parts[1], 'GET habr://void.' + parts[1] + parts[2] + ' HTTP/1.1', parts[1]);
    // 16
    // GET habr://local.fake/path/to/resource.html HTTP/1.1
    // Host: domain.name
    http_check(title, '16', parts[1], 'GET habr://local.fake' + parts[2] + ' HTTP/1.1', parts[1]);
    // 17
    // GET habr://l-IjFN=fiG(w+J2p:#.{92!m`d^?/path/to/resource.html HTTP/1.1
    // Host: domain.name
    http_check(title, '17', parts[1], 'GET habr://l-IjFN=fiG(w+J2p:#.{92!m`d^?' + parts[2] + ' HTTP/1.1', parts[1]);
    // 18
    // GET habr://l-IjFN=fiG(w+J2p:#.{92!m`d^?/path/to/resource.html HTTP/1.1
    // Host: local.fake
    http_check(title, '18', parts[1], 'GET habr://l-IjFN=fiG(w+J2p:#.{92!m`d^?' + parts[2] + ' HTTP/1.1', 'local.fake');
};

// Header row of the tab-separated output.
console.log(format_result(default_result(true)));

/*
http_check('IBM Fake', 'www.ibm.com', 'GET ttp://com/midmarket/ru/ru/ HTTP/1.1', 'ibm');
full_http_check('IBM', 'http://www.ibm.com/midmarket/ru/ru/');
*/

full_http_check('Yandex', 'http://company.yandex.ru/about/main/');
full_http_check('JetBrains', 'http://www.jetbrains.com/products.html');
full_http_check('Box Overview', 'http://7del.net/texts/galaxy-note.html');
full_http_check('KolibriOS Project Team', 'http://kolibrios.org/en/download.htm');
full_http_check('Opera Software ASA', 'http://www.opera.com/about');
full_http_check('Apps4All', 'http://apps4all.ru/news/apple/apple-ios-7-beta.html');
full_http_check('Nordavind', 'http://nordavind.ru/node/207');
full_http_check('Mail.Ru Group', 'http://corp.mail.ru/about/');
full_http_check('Microsoft', 'http://windows.microsoft.com/ru-RU/windows/home');
full_http_check('Zfort Group', 'http://www.zfort.com.ua/company/about/');
full_http_check('IBM', 'http://www.ibm.com/contact/ru/ru/');
full_http_check('UIDG', 'http://uidesign.ru/about/');
full_http_check('Intel', 'http://www.intel.ru/content/www/ru/ru/company-overview/company-overview.html');
full_http_check('Rusonyx', 'http://www.rusonyx.ru/company/reasons/');
full_http_check('Mosigra', 'http://www.mosigra.ru/page/about/');
full_http_check('DevConf', 'http://devconf.ru/about/');
full_http_check('e-Legion Ltd.', 'http://www.e-legion.ru/contacts/');
full_http_check('Badoo', 'http://corp.badoo.com/company/');
full_http_check('VimpelCom (Beeline)', 'http://mobile.beeline.ru/msk/setup/index.wbp');
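The part of the script that reads the status line and the response headers can be tried without any network access. The following is a minimal sketch of the same idea as a pure function; parseResponseHead is a hypothetical name, header names are lowercased for convenience, and repeated headers simply overwrite each other, which is a simplification.

```javascript
// Hypothetical helper: extracts the status line and the headers from the
// beginning of a raw HTTP response. Works on partial data as well.
function parseResponseHead(raw) {
  var headEnd = raw.indexOf('\r\n\r\n');
  var head = headEnd === -1 ? raw : raw.slice(0, headEnd);
  var lines = head.split('\r\n');
  var headers = {};
  for (var i = 1; i < lines.length; i++) {
    var colon = lines[i].indexOf(':');
    if (colon <= 0) {
      continue; // not a complete "Name: value" line
    }
    var name = lines[i].slice(0, colon).toLowerCase();
    headers[name] = lines[i].slice(colon + 1).trim();
  }
  return { statusLine: lines[0], headers: headers };
}

var sample = 'HTTP/1.1 301 Moved Permanently\r\n' +
             'Server: nginx\r\n' +
             'Location: http://www.domain.name/\r\n' +
             '\r\n' +
             '<html>...</html>';
var parsed = parseResponseHead(sample);
console.log(parsed.statusLine);       // HTTP/1.1 301 Moved Permanently
console.log(parsed.headers.server);   // nginx
console.log(parsed.headers.location); // http://www.domain.name/
```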


Now let's take a closer look at each of the templates and at how the sites responded.

Request 1


The most common form of an HTTP/1.1 request: an absolute path and a correct Host header. Any server should answer it correctly, that is, we expect “HTTP/1.1 200 OK”.
  GET /path/to/resource.html HTTP/1.1
  Host: domain.name


All servers returned HTTP/1.1 200 OK. Below is a table of the “Server” response header values:
Company | Header "Server:"
Apps4All | nginx/1.0.15
Badoo | nginx
Box Overview | nginx/1.2.1
DevConf | nginx/1.0.15
e-Legion Ltd. | nginx/1.0.5
IBM | IBM_HTTP_Server
Intel | Microsoft-IIS/7.5
JetBrains | nginx
KolibriOS Project Team | lighttpd/1.4.32
Mail.Ru Group | nginx/1.2.5
Microsoft | Microsoft-IIS/7.5
Opera Software ASA | nginx
Rusonyx | nginx
UIDG | Apache
Zfort Group | nginx/1.4.1
VimpelCom (Beeline) | Microsoft-IIS/7.5
Mosigra | nginx/1.4.1
Nordavind | nginx/1.0.4
Yandex | nginx/1.2.1


Request 2


A variant of the first request, but with the full address instead of the absolute path.
  GET http://domain.name/path/to/resource.html HTTP/1.1
  Host: domain.name


The servers were again unanimous in their response to this request. Every server can parse "easy" requests.

Request 3


An HTTP/1.0 request with an absolute path and no “Host:”. It should get "HTTP/1.0 200 OK".
  GET /path/to/resource.html HTTP/1.0


On the third request the servers "fell over": there was not a single “HTTP/1.0 200 OK” response.
Company | Server response
Apps4All | HTTP/1.1 301 Moved Permanently
Badoo | HTTP/1.1 302 Moved Temporarily
Box Overview | HTTP/1.1 200 OK
DevConf | HTTP/1.1 404 Not Found
e-Legion Ltd. | HTTP/1.1 301 Moved Permanently
IBM | HTTP/1.1 200 OK
Intel | HTTP/1.0 400 Bad Request
JetBrains | HTTP/1.1 301 Moved Permanently
KolibriOS Project Team | HTTP/1.0 404 Not Found
Mail.Ru Group | HTTP/1.1 200 OK
Microsoft | HTTP/1.1 200 OK
Opera Software ASA | HTTP/1.1 404 Not Found
Rusonyx | HTTP/1.1 301 Moved Permanently
UIDG | HTTP/1.1 404 Not Found
Zfort Group | HTTP/1.1 404 Not Found
VimpelCom (Beeline) | HTTP/1.1 302 Redirect
Mosigra | HTTP/1.1 404 Not Found
Nordavind | HTTP/1.1 200 OK
Yandex | HTTP/1.1 404 Not Found


Request 4


The previous request with “Host:” added. It differs from the first request only in the protocol version.
  GET /path/to/resource.html HTTP/1.0
  Host: domain.name


“Host:” had a very positive effect on the servers: every one of them answered “200 OK”, but only Intel and KolibriOS Project Team replied over HTTP/1.0.

Request 5


An HTTP/1.0 request with the full address and no “Host:”. It would be great to read “HTTP/1.0 200 OK”.
  GET http://domain.name/path/to/resource.html HTTP/1.0


The picture fully matches the results of the previous request, except that e-Legion Ltd. returned HTTP/1.1 500 INTERNAL SERVER ERROR.

Request 6


The previous request with “Host:” added. It differs from the second request only in the protocol version.
  GET http://domain.name/path/to/resource.html HTTP/1.0
  Host: domain.name


The results fully match the fourth request, that is, “Host:” fixed the internal error on the e-Legion Ltd. server.

Request 7


A variant of the second request with the full address, but with a nonexistent subdomain in “Host:”. The request is perfectly valid, so the server must respond with "HTTP/1.1 200 OK".
  GET http://domain.name/path/to/resource.html HTTP/1.1
  Host: void.domain.name


Request 8


Now we specify a nonexistent domain as “Host:”. Nothing in the request itself has changed, but some servers may not like it.
  GET http://domain.name/path/to/resource.html HTTP/1.1
  Host: local.fake


Request 9


The “Host:” header must be ignored completely, so we put in arbitrary text that many passwords would envy. According to the standard, we expect “HTTP/1.1 200 OK”.
  GET http://domain.name/path/to/resource.html HTTP/1.1
  Host: l-IjFN=fiG(w+J2p:#.{92!m`d^?


The servers answered requests 7-9 identically:

Company | Server response | Header "Server:"
Apps4All | HTTP/1.1 200 OK | nginx/1.0.15
Badoo | HTTP/1.1 200 OK | nginx
Box Overview | HTTP/1.1 200 OK | nginx/1.2.1
DevConf | HTTP/1.1 500 Internal Server Error | nginx/1.0.15
e-Legion Ltd. | HTTP/1.1 500 INTERNAL SERVER ERROR | nginx/1.0.5
IBM | HTTP/1.1 200 OK | IBM_HTTP_Server
Intel | HTTP/1.0 400 Bad Request | AkamaiGHost
JetBrains | HTTP/1.1 200 OK | nginx
KolibriOS Project Team | HTTP/1.1 200 OK | lighttpd/1.4.32
Mail.Ru Group | HTTP/1.1 200 OK | nginx/1.2.5
Microsoft | HTTP/1.1 200 OK | Microsoft-IIS/7.5
Opera Software ASA | HTTP/1.1 200 OK | nginx
Rusonyx | HTTP/1.1 200 OK | nginx
UIDG | HTTP/1.1 200 OK | Apache
Zfort Group | HTTP/1.1 200 OK | nginx/1.4.1
VimpelCom (Beeline) | HTTP/1.1 200 OK | Microsoft-IIS/7.5
Mosigra | HTTP/1.1 200 OK | nginx/1.4.1
Nordavind | HTTP/1.1 200 OK | nginx/1.0.4
Yandex | HTTP/1.1 200 OK | nginx/1.2.1


Request 10


The first of the incorrect requests. We send a correct “Host:” but add a nonexistent subdomain to the full address.
  GET http://fake.domain.name/path/to/resource.html HTTP/1.1
  Host: domain.name


Since the requests now contain errors, the results should not be alarming.
Company | Server response
Apps4All | HTTP/1.1 301 Moved Permanently
Badoo | HTTP/1.1 301 Moved Permanently
Box Overview | HTTP/1.1 200 OK
DevConf | HTTP/1.1 404 Not Found
e-Legion Ltd. | HTTP/1.1 301 Moved Permanently
IBM | HTTP/1.1 200 OK
Intel | HTTP/1.1 200 OK
JetBrains | HTTP/1.1 301 Moved Permanently
KolibriOS Project Team | HTTP/1.1 404 Not Found
Mail.Ru Group | HTTP/1.1 200 OK
Microsoft | HTTP/1.1 200 OK
Opera Software ASA | HTTP/1.1 404 Not Found
Rusonyx | HTTP/1.1 301 Moved Permanently
UIDG | HTTP/1.1 404 Not Found
Zfort Group | HTTP/1.1 404 Not Found
VimpelCom (Beeline) | HTTP/1.1 302 Redirect
Mosigra | HTTP/1.1 301 Moved Permanently
Nordavind | HTTP/1.1 200 OK
Yandex | HTTP/1.1 404 Not Found


Roughly a third of the servers (7 of 19) took the trouble to suggest the correct path with a redirect. Unfortunately, many of them simply redirect to the main page.
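The proportions for request 10 can be checked with a short tally. The status codes below are copied from the request 10 results; tallyByClass is a hypothetical helper introduced only for this illustration.

```javascript
// Status codes returned for request 10, copied from the results above.
var request10 = {
  'Apps4All': 301, 'Badoo': 301, 'Box Overview': 200, 'DevConf': 404,
  'e-Legion Ltd.': 301, 'IBM': 200, 'Intel': 200, 'JetBrains': 301,
  'KolibriOS Project Team': 404, 'Mail.Ru Group': 200, 'Microsoft': 200,
  'Opera Software ASA': 404, 'Rusonyx': 301, 'UIDG': 404, 'Zfort Group': 404,
  'VimpelCom (Beeline)': 302, 'Mosigra': 301, 'Nordavind': 200, 'Yandex': 404
};

// Hypothetical helper: counts responses per status class (2xx, 3xx, 4xx, ...).
function tallyByClass(results) {
  var tally = {};
  Object.keys(results).forEach(function (name) {
    var cls = Math.floor(results[name] / 100) + 'xx';
    tally[cls] = (tally[cls] || 0) + 1;
  });
  return tally;
}

console.log(JSON.stringify(tallyByClass(request10)));
// {"3xx":7,"2xx":6,"4xx":6}
```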

Request 11


Now we will try to send a nonexistent domain.
  GET http://local.fake/path/to/resource.html HTTP/1.1
  Host: domain.name


Here the results fully match the previous request, except that Mosigra returned “HTTP/1.1 404 Not Found” instead of “HTTP/1.1 301 Moved Permanently”.

Request 12


Will arbitrary text work as a domain at all?
  GET http://l-IjFN=fiG(w+J2p:#.{92!m`d^?/path/to/resource.html HTTP/1.1
  Host: domain.name


The answer "HTTP/1.1 200 OK" came from Intel and Opera Software ASA. IBM and Mosigra returned HTTP/1.1 404 Not Found. All the rest answered 400 Bad Request, some of them with no headers at all (an option that is possible in HTTP/1.0).

Request 13


A copy of the eleventh request, but with a subdomain in “Host:” as well. It hardly makes sense to check other incorrect combinations.
  GET http://local.fake/path/to/resource.html HTTP/1.1
  Host: void.domain.name


The results were also a copy of request 11, except that Intel gave up and returned “HTTP/1.0 400 Bad Request”.

Request 14


The second request, but with a nonexistent protocol in the full address. This should already be an error.
  GET habr://domain.name/path/to/resource.html HTTP/1.1
  Host: domain.name


It turned out that quite a few sites accept the habr protocol:

Company | Server response
Apps4All | HTTP/1.1 200 OK
Badoo | HTTP/1.1 200 OK
Box Overview | HTTP/1.1 200 OK
DevConf | HTTP/1.1 200 OK
e-Legion Ltd. | HTTP/1.1 200 OK
IBM | HTTP/1.1 200 OK
Intel | HTTP/1.0 400 Bad Request
JetBrains | HTTP/1.1 200 OK
KolibriOS Project Team | HTTP/1.1 301 Moved Permanently
Mail.Ru Group | HTTP/1.1 200 OK
Microsoft | HTTP/1.1 400 Bad Request
Opera Software ASA | HTTP/1.1 400 BAD_REQUEST
Rusonyx | HTTP/1.1 200 OK
UIDG | HTTP/1.1 200 OK
Zfort Group | HTTP/1.1 200 OK
VimpelCom (Beeline) | HTTP/1.1 400 Bad Request
Mosigra | HTTP/1.1 400 BAD_REQUEST
Nordavind | HTTP/1.1 200 OK
Yandex | HTTP/1.1 200 OK
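For comparison, the strict behavior expected here (an error for an unknown scheme) could be sketched as below. This is an assumption about what a strict server might do, not a description of any of the tested servers; statusForRequestUri is a hypothetical name, and only the http scheme is accepted in this sketch.

```javascript
// Hypothetical helper: returns 400 when the Request-URI is a full address
// with a scheme other than http, and null when the request can be processed.
function statusForRequestUri(requestUri) {
  var m = requestUri.match(/^([A-Za-z][A-Za-z0-9+.-]*):\/\//);
  if (!m) {
    return null; // abs_path form, there is no scheme to check
  }
  return m[1].toLowerCase() === 'http' ? null : 400;
}

console.log(statusForRequestUri('habr://domain.name/path/to/resource.html')); // 400
console.log(statusForRequestUri('http://domain.name/path/to/resource.html')); // null
console.log(statusForRequestUri('/path/to/resource.html'));                   // null
```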


Request 15


Let's try to finally break the servers' resistance and send the previous request with a nonexistent subdomain.
  GET habr://void.domain.name/path/to/resource.html HTTP/1.1
  Host: domain.name


The results are similar to the tenth request, but there are changes too:

Company | Request 10 | Request 15
Apps4All | HTTP/1.1 301 Moved Permanently | HTTP/1.1 301 Moved Permanently
Badoo | HTTP/1.1 301 Moved Permanently | HTTP/1.1 301 Moved Permanently
Box Overview | HTTP/1.1 200 OK | HTTP/1.1 200 OK
DevConf | HTTP/1.1 404 Not Found | HTTP/1.1 404 Not Found
e-Legion Ltd. | HTTP/1.1 301 Moved Permanently | HTTP/1.1 301 Moved Permanently
IBM | HTTP/1.1 200 OK | HTTP/1.1 200 OK
Intel | HTTP/1.1 200 OK | HTTP/1.0 400 Bad Request
JetBrains | HTTP/1.1 301 Moved Permanently | HTTP/1.1 301 Moved Permanently
KolibriOS Project Team | HTTP/1.1 404 Not Found | HTTP/1.1 301 Moved Permanently
Mail.Ru Group | HTTP/1.1 200 OK | HTTP/1.1 200 OK
Microsoft | HTTP/1.1 200 OK | HTTP/1.1 400 Bad Request
Opera Software ASA | HTTP/1.1 404 Not Found | HTTP/1.1 400 BAD_REQUEST
Rusonyx | HTTP/1.1 301 Moved Permanently | HTTP/1.1 301 Moved Permanently
UIDG | HTTP/1.1 404 Not Found | HTTP/1.1 404 Not Found
Zfort Group | HTTP/1.1 404 Not Found | HTTP/1.1 404 Not Found
VimpelCom (Beeline) | HTTP/1.1 302 Redirect | HTTP/1.1 400 Bad Request
Mosigra | HTTP/1.1 301 Moved Permanently | HTTP/1.1 400 BAD_REQUEST
Nordavind | HTTP/1.1 200 OK | HTTP/1.1 200 OK
Yandex | HTTP/1.1 404 Not Found | HTTP/1.1 404 Not Found


Request 16


Let's try to use an arbitrary domain.
  GET habr://local.fake/path/to/resource.html HTTP/1.1
  Host: domain.name


The results matched the previous query.

Request 17


And for the third time we will try to replace the domain with arbitrary text.
  GET habr://l-IjFN=fiG(w+J2p:#.{92!m`d^?/path/to/resource.html HTTP/1.1
  Host: domain.name


No server gave a positive response any more. Compared to request 12, the following sites changed:

Company | Request 12 | Request 17
Intel | HTTP/1.1 200 OK | HTTP/1.0 400 Bad Request
KolibriOS Project Team | HTTP/1.1 400 Bad Request | HTTP/1.1 301 Moved Permanently
Opera Software ASA | HTTP/1.1 200 OK | HTTP/1.1 400 BAD_REQUEST
Mosigra | HTTP/1.1 404 Not Found | HTTP/1.1 400 BAD_REQUEST


Request 18


And now let's also get rid of the correct “Host:” header.
  GET habr://l-IjFN=fiG(w+J2p:#.{92!m`d^?/path/to/resource.html HTTP/1.1
  Host: local.fake


Only one change from the previous result: the KolibriOS Project Team server started returning “HTTP/1.1 404 Not Found” instead of “HTTP/1.1 301 Moved Permanently”.

Request N


Write if you would like to see any other request variants tried. Or you can run them yourself.

Conclusion


Let's try to sum up. Almost all of the servers examined responded correctly to valid HTTP/1.1 requests. The exceptions were DevConf, e-Legion Ltd. and Intel. The first two use nginx, so the problem most likely lies in its configuration. Intel uses AkamaiGHost, which is either misconfigured or does not support HTTP/1.1. I suspect that one reason the tests were passed correctly is nginx (14 out of 19 servers used it). Thanks to differing version strings, a chain of nginx/1.0.10 and nginx/1.4.1 was discovered at UIDG.

Think it's all simple? Try configuring Apache, with SEO in mind, so that it correctly handles requests with an erroneous “Host:” and relies only on the full address in the request line.

What is the practical significance of these "wrong" correct requests? I doubt that any vulnerability can be found this way. But is it really possible that in almost fifteen years nobody has learned to build a correct HTTP/1.1 server?

PS Remember the differences between %{REQUEST_URI} in Apache mod_rewrite and $_SERVER["REQUEST_URI"] in PHP.

UPD1:

Request 19


On the advice of AEP, I took the second request but appended a zero byte and some string to the host. The question was how well the server would ignore a host containing a zero byte.
  GET http://domain.name/path/to/resource.html HTTP/1.1
  Host: domain.name{zero byte}fake_and_void

I added the following template to the script:
 http_check(title, '19', parts[1], 'GET http://' + parts[1] + parts[2] + ' HTTP/1.1', parts[1] + '\0fake_and_void_text'); 


All servers returned “HTTP/1.1 400 Bad Request”, except IBM, Opera Software ASA and Mosigra.
When I tried adding a zero byte to the request itself, everyone except IBM and Opera Software reported error 400.
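A server that answers 400 here presumably performs some check for control characters before it looks at the meaning of the header. The sketch below shows such a check under that assumption; hasControlBytes is a hypothetical helper, and allowing the horizontal tab is a choice made for this illustration.

```javascript
// Hypothetical helper: reports whether a header value contains control
// characters such as the zero byte. The horizontal tab (0x09) is allowed.
function hasControlBytes(value) {
  for (var i = 0; i < value.length; i++) {
    var code = value.charCodeAt(i);
    if ((code < 0x20 && code !== 0x09) || code === 0x7f) {
      return true;
    }
  }
  return false;
}

console.log(hasControlBytes('domain.name\0fake_and_void_text')); // true
console.log(hasControlBytes('domain.name'));                     // false
```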

Source: https://habr.com/ru/post/183668/

