
Parsing HTTP Range according to the standard.

In one of my projects I needed to parse the HTTP Range header in order to support downloading files in parts. The web is full of examples, but I could not find a single complete implementation of RFC 2616. One snippet ignored that a request may contain several ranges, another that the standard allows a range to extend past the end of the document, and a third did not distinguish syntactically invalid requests from unsatisfiable ones, as the standard recommends. So I decided to write my own implementation and share it with everyone. Details and a sample implementation in PHP follow.

As the standard states, a range request consists of two parts: the range unit and a list of range specifications. The only range unit defined in RFC 2616 is bytes. Note also that a single Range header may contain several ranges at once, separated by commas.
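The two-part structure can be sketched as a small splitting step. This is only an illustration: the function name split_range_header is hypothetical, and restricting the unit to ASCII letters is a simplifying assumption rather than the full grammar from the standard.

```php
<?php
// Hypothetical helper, not part of the article's implementation.
// Splits a Range header value into its unit and its list of range specs.
function split_range_header(string $range_header): ?array
{
    // Expect "<unit>=<spec>[,<spec>...]"; anything else is not a range request.
    // Assumption: the unit consists of ASCII letters only.
    if (!preg_match('/^([a-z]+)=(.+)$/i', trim($range_header), $match)) {
        return null;
    }

    // Empty list elements ("0-1,,2-3") are allowed and are simply skipped.
    $specs = preg_split('/\s*,\s*/', trim($match[2]), -1, PREG_SPLIT_NO_EMPTY);

    return ['unit' => strtolower($match[1]), 'specs' => $specs];
}
```

For the header value bytes=0-499, 500-999,,-200 this sketch yields the unit bytes and three specs: 0-499, 500-999 and -200. Each spec still has to be validated and resolved against the document size.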

A client can specify a byte range in two ways.

The first gives the first and last byte positions within the document body. Positions are counted from zero. The last position MUST be greater than or equal to the first position in the request. Otherwise the range is syntactically invalid and, according to the standard, the Range header MUST be ignored. If the last position is missing, or its value is greater than or equal to the document size, the last position is taken to be the current document size in bytes minus 1. The standard does not treat this as an error, because it lets the client request part of a document without knowing its size in advance.
For example, for a 10-byte document, bytes=1-9 requests 9 bytes, from the second byte through the last byte of the document body.
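The rules for this first form can be sketched in a few lines. The function name resolve_byte_range and its return convention (a pair, null, or false) are assumptions made for this illustration only, and the nullable type hint assumes PHP 7.1 or later.

```php
<?php
// Hypothetical helper: resolves a "first-last" or "first-" spec against a
// document of $size bytes. Pass null as $last when the last position is absent.
// Returns [first, last], null when the range starts past the end of the
// document, or false when last < first (syntactically invalid spec).
function resolve_byte_range(int $first, ?int $last, int $size)
{
    if ($last !== null && $last < $first) {
        return false; // invalid: the whole Range header must be ignored
    }
    if ($first >= $size) {
        return null; // this spec selects no bytes
    }

    // A missing or oversized last position is clamped to the final byte.
    $last = ($last === null) ? $size - 1 : min($last, $size - 1);

    return [$first, $last];
}
```

With a 10-byte document, resolve_byte_range(1, 9, 10) gives [1, 9], and an oversized request such as resolve_byte_range(5, 500, 10) is clamped to [5, 9] instead of being rejected.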

The second selects the last N bytes of the document body. If the document is smaller than the N given in the request, the entire document is selected. For example, bytes=-2 requests the last 2 bytes.
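The suffix form reduces to one subtraction with a clamp. A minimal sketch, assuming a hypothetical helper named resolve_suffix_range that returns null when the spec selects nothing:

```php
<?php
// Hypothetical helper: resolves a "-N" suffix spec against a document of
// $size bytes. Returns [first, last], or null when no bytes are selected.
function resolve_suffix_range(int $suffix_length, int $size): ?array
{
    // A zero-length suffix, or an empty document, selects nothing.
    if ($suffix_length === 0 || $size === 0) {
        return null;
    }

    // If N exceeds the document size, the range starts at byte 0.
    $first = $size - min($size, $suffix_length);

    return [$first, $size - 1];
}
```

For a 10-byte document, resolve_suffix_range(2, 10) gives [8, 9], while resolve_suffix_range(50, 10) selects the whole document as [0, 9].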

After all the ranges have been processed, the server SHOULD determine whether at least one range selects at least one byte. If none does, the server SHOULD respond with 416 (Requested range not satisfiable); otherwise with 206 (Partial Content).
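The choice of status code can be sketched as follows. The function name range_response_status is hypothetical. Its input follows the convention documented for the implementation in this article: false for a syntactically invalid header, otherwise an array of the ranges that select at least one byte. Returning 200 for the invalid case reflects that an ignored Range header leaves an ordinary full-document response.

```php
<?php
// Hypothetical helper: picks the HTTP status for a parsed Range request.
// $ranges is false for a syntactically invalid header, otherwise an array
// holding only the ranges that select at least one byte.
function range_response_status($ranges): int
{
    if ($ranges === false) {
        return 200; // header ignored: serve the whole document
    }

    // No satisfiable range at all: 416, otherwise partial content.
    return count($ranges) > 0 ? 206 : 416;
}
```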

An implementation is considered “conditionally compliant” if it satisfies all the MUST requirements but not all the SHOULD requirements. Thus, full handling of a Range request is achieved only when every requirement of the standard is met.

An example implementation in PHP:

    <?php
    namespace HTTP;

    /*
     * Copyright (c) 2012, aignospam@gmail.com
     * http://www.opensource.org/licenses/bsd-license.php
     */

    /**
     * Parse HTTP Range header
     * http://tools.ietf.org/html/rfc2616#section-14.35
     * return array of Range on success
     *        false on syntactically invalid byte-range-spec
     *        empty array on unsatisfiable bytes-range-set
     * @param int $entity_body_length
     * @param string $range_header
     * @return array|bool
     */
    function parse_range_request($entity_body_length, $range_header)
    {
        $range_list = array();

        if ($entity_body_length == 0) {
            return $range_list; // mark unsatisfiable
        }

        // The only range unit defined by HTTP/1.1 is "bytes". HTTP/1.1
        // implementations MAY ignore ranges specified using other units.
        // Range unit "bytes" is case-insensitive
        if (preg_match('/^bytes=([^;]+)/i', $range_header, $match)) {
            $range_set = $match[1];
        } else {
            return false;
        }

        // Wherever this construct is used, null elements are allowed, but do
        // not contribute to the count of elements present. That is,
        // "(element), , (element) " is permitted, but counts as only two elements.
        $range_spec_list = preg_split('/,/', $range_set, -1, PREG_SPLIT_NO_EMPTY);

        foreach ($range_spec_list as $range_spec) {
            $range_spec = trim($range_spec);

            // A whitespace-only element is a null element: skip it
            if ($range_spec === '') {
                continue;
            }

            if (preg_match('/^(\d+)\-$/', $range_spec, $match)) {
                $first_byte_pos = (int) $match[1];
                // A first-byte-pos at or past the end of the body selects nothing
                if ($first_byte_pos >= $entity_body_length) {
                    continue;
                }
                $first_pos = $first_byte_pos;
                $last_pos = $entity_body_length - 1;
            } elseif (preg_match('/^(\d+)\-(\d+)$/', $range_spec, $match)) {
                $first_byte_pos = (int) $match[1];
                $last_byte_pos = (int) $match[2];
                // If the last-byte-pos value is present, it MUST be greater than or
                // equal to the first-byte-pos in that byte-range-spec
                if ($last_byte_pos < $first_byte_pos) {
                    return false;
                }
                // A first-byte-pos at or past the end of the body selects nothing
                if ($first_byte_pos >= $entity_body_length) {
                    continue;
                }
                $first_pos = $first_byte_pos;
                $last_pos = min($entity_body_length - 1, $last_byte_pos);
            } elseif (preg_match('/^\-(\d+)$/', $range_spec, $match)) {
                $suffix_length = (int) $match[1];
                if ($suffix_length == 0) {
                    continue;
                }
                $first_pos = $entity_body_length - min($entity_body_length, $suffix_length);
                $last_pos = $entity_body_length - 1;
            } else {
                return false;
            }

            $range_list[] = new Range($first_pos, $last_pos);
        }

        return $range_list;
    }

    class Range
    {
        private $_first_pos;
        private $_last_pos;

        public function __construct($first_pos, $last_pos)
        {
            $this->_first_pos = $first_pos;
            $this->_last_pos = $last_pos;
        }

        public function get_first_pos()
        {
            return $this->_first_pos;
        }

        public function get_last_pos()
        {
            return $this->_last_pos;
        }
    }
    ?>

Source: https://habr.com/ru/post/138504/

