
mod_rewrite: the complicated explained simply

What is it?


mod_rewrite is an Apache web server module for URL rewriting. It works from rules that can be defined either in the server configuration (httpd.conf) or in .htaccess files placed directly in your site's directory structure. The rules are written as PCRE regular expressions.

Hello world


The simplest example. Suppose you want no one to know that your site is written in PHP, and you decide to mask the file extensions. You could, of course, add the appropriate directive to the Apache configuration so that all files with the extension ".msl" ("My Super Language") are processed by the PHP interpreter. But there is an easier way.
Create a .htaccess file in the root of your site with the following content:
RewriteEngine On
RewriteBase /
RewriteRule ^(.*)\.msl$ $1.php [QSA,L]


The first directive enables the mod_rewrite engine in the current folder and in all its subfolders. The second tells mod_rewrite that the current folder in the file system corresponds to the site root. The third is the URL rewriting rule itself.

You can read it like this:
If, right after the start of the string ("^"), there is any number of arbitrary characters ("(.*)", where the parentheses tell the engine to remember what those characters were), followed by a dot ("\.", escaped because an unescaped dot means "any character"), then the characters "msl", and then the end of the string ("$"), replace the original URL as follows: take the first substring remembered in parentheses ("$1"), append ".php" to it, append any query-string parameters the request carried ("[QSA]"), and stop here without applying any further rules, even if there are some ("[L]").

That's all. Now you can safely change every link ending in ".php" to ".msl" and announce on your blog that you have invented a new scripting language. When Apache meets a link to "index.msl", mod_rewrite converts it on the fly into "index.php" and the right script is invoked.

What else can mod_rewrite do?


Oh, this module can do a lot. Personally, I am waiting for someone sufficiently advanced in magic and PCRE to write Battleship in mod_rewrite.

Until that happens, I will show a few more uses for this wonderful module.

Suppose you decide to write a blog engine. Each user will be able to create his own blog, choose a name for it, write to his blog and read other people's posts.

Primary data filtering

Suppose that user blog addresses look like "/blogs/ABC/", and the script that shows the feed of a particular blog is called "viewblog.php".
A simple mod_rewrite rule lets us weed out invalid blog names that attackers might try to use:
RewriteRule blogs/([a-z0-9_-]+)([\/]{0,1})$ viewblog.php?blogname=$1 [L]
RewriteRule viewblog.php - [F]


In the square brackets, following PCRE syntax, we define a character class that includes digits, lowercase Latin letters, the minus sign, and the underscore. Any address containing other characters will not match this rule and will result in a 404 error. The [L] flag is required so that the mod_rewrite engine, having successfully performed the rewrite, does not go on to the second rule. This flag is similar to the break statement inside a loop.

The second rule does not rewrite the address at all (the "-" symbol); instead it forbids direct access to the viewblog.php script (the "[F]" flag), which keeps the villains from passing something malicious in the parameters.
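
One caveat: in .htaccess (per-directory) context, Apache may run the rewritten request through the ruleset again, so [L] only ends the current pass, and the rewritten viewblog.php request can itself reach the [F] rule. Below is a minimal sketch of one common workaround, assuming the rules live in the .htaccess file at the site root. It checks %{THE_REQUEST}, the request line exactly as the client sent it. On Apache 2.4 the [END] flag may be another option.

```apache
RewriteEngine On
RewriteBase /

# Forbid viewblog.php only when the client asked for it directly.
# THE_REQUEST is the raw request line, e.g. "GET /viewblog.php?x=1 HTTP/1.1",
# and internal rewrites do not change it.
RewriteCond %{THE_REQUEST} ^[A-Z]+\s+/viewblog\.php[?\s]
RewriteRule ^viewblog\.php$ - [F]

# Clean URLs are rewritten internally; the condition above does not match them.
RewriteRule ^blogs/([a-z0-9_-]+)/?$ viewblog.php?blogname=$1 [L]
```

With this variant a request for /blogs/abc/ still reaches the script, while a request typed as /viewblog.php?blogname=abc is refused.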

By the way:
It is a good idea to start your rules with the line
RewriteRule (^|/)\.htaccess$ - [F]
This prevents access to .htaccess files in case the hosting service is badly configured.


Using it for file-system caching

Suppose your project grows. The hosting can no longer cope with the load: hundreds of bloggers, tens of thousands of views of their blogs, and comments on top of that...

Here mod_rewrite can come to the rescue, if for some reason you do not want to move to your own server.

First, modify your viewblog.php script so that when accessing it, it not only renders the generated page to the browser, but also writes it to the file system at /blogs/ABC.html

The easiest way to do this is with the output buffering functions. Suppose the source code of the viewblog.php script looks like this:
<?php
$blogname = $_GET['blogname'];
if ( !Blogs::exists($blogname) )
    die("No blog!");

Blogs::display($blogname);
?>


Apply output buffering and write the output to a file:
<?php
$blogname = $_GET['blogname'];
if ( !Blogs::exists($blogname) )
    die("No blog!");

// Capture the rendered page while still sending it to the browser.
ob_start();
Blogs::display($blogname);
$content = ob_get_contents();
ob_end_flush();

// Save the page as the cache file that the rewrite rules will look for.
$f = fopen(_YOUR_SITE_ROOT . "/blogs/" . $blogname . ".html", "w");
if ($f) {
    fwrite($f, $content);
    fclose($f);
}
?>


Now all that remains is to adjust the rules in .htaccess slightly to get a complete content caching system:

RewriteRule blogs\/([a-z0-9_-]+)([\/]{0,1})$ blogs\/$1\.html

RewriteCond %{REQUEST_FILENAME} blogs\/([a-z0-9_-]+)\.html$
RewriteCond %{REQUEST_FILENAME} !-s
RewriteRule (.*) viewblog.php?blogname=%1


In the first line, we convert URLs of the form blogs/ABC/ into blogs/ABC.html, thereby pointing Apache at the page cache file we generated.
The next three lines form one big rule. If the request is for blogs/ABC.html and no such file exists in the file system (or the file is empty, which is what the -s test checks), the request is passed to the viewblog.php script.

All that is left is to arrange timely cleaning of the cache, and the problem is solved.

Other uses


Personally, I use mod_rewrite in much the same way as in the last example to generate image previews and store them in the file system.
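
As a rough sketch of how that could look, the caching rule can be adapted to previews. The previews/ folder, the .jpg naming, and the makepreview.php script are hypothetical names chosen for illustration; the script is assumed to both output the image and save it under previews/.

```apache
RewriteEngine On
RewriteBase /

# Hypothetical layout: previews are cached as previews/NAME.jpg and
# generated on demand by makepreview.php (both names are assumptions).
RewriteCond %{REQUEST_FILENAME} previews/([a-z0-9_-]+)\.jpg$
# Rewrite only when the cached preview is missing or empty.
RewriteCond %{REQUEST_FILENAME} !-s
RewriteRule ^previews/ makepreview.php?image=%1 [L]
```

Once the file exists, Apache serves it directly and the script is no longer involved.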

With mod_rewrite it is very easy to map subdomains to folders. For example, forum.localhost.localdomain can physically live in localhost.localdomain/forum, which is often more convenient for the application developer.
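
A minimal sketch of such a mapping, placed in the .htaccess file at the site root. It assumes that forum.localhost.localdomain is served from the same document root as localhost.localdomain (for example through a ServerAlias), which is a setup detail the rules cannot provide themselves.

```apache
RewriteEngine On

# Apply only to requests addressed to the forum subdomain
# (an optional port number in the Host header is tolerated).
RewriteCond %{HTTP_HOST} ^forum\.localhost\.localdomain(:\d+)?$ [NC]
# Skip requests that already point into /forum/ to avoid a rewrite loop.
RewriteCond %{REQUEST_URI} !^/forum/
# Serve the request from the forum folder.
RewriteRule ^(.*)$ /forum/$1 [L]
```

A request for forum.localhost.localdomain/index.php would then be answered by the file forum/index.php inside the main site's folder.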

mod_rewrite is indispensable for restricting file downloads on a file-hosting service or in a digital goods store (you will need to use symbolic links) and for blocking hotlinking (by checking the referrer).
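
The referrer check for hotlinking might look roughly like this. The domain example.com is a placeholder for your own site, and the list of image extensions is an assumption; adjust both to your setup.

```apache
RewriteEngine On

# Only consider requests that carry a Referer header; requests without one
# (direct visits, some proxies) are let through.
RewriteCond %{HTTP_REFERER} !^$
# Let through requests referred by the site's own pages (placeholder domain).
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?example\.com(/|$) [NC]
# Any other request for an image gets 403 Forbidden.
RewriteRule \.(gif|jpe?g|png)$ - [F,NC]
```

Keep in mind that the Referer header is supplied by the client and can be forged or omitted, so this discourages casual hotlinking rather than enforcing access control.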

In general, this is Voodoo :)
Damn interesting Voodoo, in which you discover new facets and applications every day.

What else to read?


httpd.apache.org/docs/2.2/mod/mod_rewrite.html
www.codenet.ru/webmast/php/mod_rewrite.php
www.egoroff.spb.ru/portfolio/apache/mod_rewrite.html
regexp.ru

Source: https://habr.com/ru/post/83597/

