
Transformation and translation of websites into other languages on the fly using Nginx





In my first post, I described using Apache Traffic Server as a caching reverse proxy. In the comments I was asked: why not nginx? Since ATS still had no convenient way to transform site content, I decided to explore what Nginx can do. Solving the problem meant going deep into the maze of documentation, and here is what came of it...

Like last time, we will transform the site example.com into example.ru. I will not cover installing and setting up Nginx (there are plenty of articles about that); instead I will describe the specific settings that proved useful.
Global configuration - nginx.conf
worker_processes 4;  # match the number of processors (cat /proc/cpuinfo | grep processor)

http {
    # DNS resolver for the reverse proxy; 127.0.0.1 assumes a DNS server on the local machine
    resolver 127.0.0.1;

    include options.conf;
    include mime.types;

    # the original site is served as iso-8859-1; the result is delivered as UTF-8
    charset utf-8;
    override_charset on;
    source_charset iso-8859-1;
    charset_map iso-8859-1 utf-8 { }
    # see also: charset_map iso-8859-1 _ { }

    # storage for the proxy cache
    proxy_cache_path /usr/local/nginx/proxy_temp/ levels=1:2 keys_zone=cache-zone:10m inactive=10m max_size=1000M;

    # permissions for files saved by proxy_store
    proxy_store_access user:rw group:rw all:r;
    # proxy_cache and proxy_store are both used by the virtual server below

    # virtual servers
    include example.conf;
}

Setting up a virtual server - example.conf
server {
    listen 80;
    server_name example.ru;

    access_log logs/example.ru.access.log main;
    error_log  logs/example.ru.error.log;

    index index.html;
    root  /usr/local/nginx/html/example.ru;

    # pages that break behind the reverse proxy (404 and the like)
    # are redirected to the original site
    rewrite ^/(broken_page.*) http://www.example.com/$1 permanent;

    # everything whose URI starts with /img/ is handled here
    location ^~ /img/ {
        # local directory holding the replaced images
        root /usr/local/nginx/html/example.ru;
        # serve the local file if it exists, otherwise go to @static
        try_files $uri $uri/ @static;
    }

    # images that are not found locally are fetched from the original CDN
    location @static {
        # the original images live at http://super-cdn.com/img
        proxy_pass http://super-cdn.com$uri;
        proxy_store /usr/local/nginx/html/${uri};
        # expires max: let browsers cache the images for as long as possible
        expires max;
        # separate logs for debugging this location
        access_log logs/2.access.log;
        error_log  logs/2.error.log debug;
    }

    # everything else is proxied to the original site
    location / {
        # transformation rules
        include example-transform.conf;
        # the original site
        proxy_pass http://www.example.com;
        proxy_redirect off;
        # cache responses in the zone declared in nginx.conf
        proxy_cache cache-zone;
        proxy_cache_min_uses 2;
        proxy_cache_valid 200 1h;
        # pass the client details on to the origin
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}


Recall that the images of the original site are located at:
  http://super-cdn.com/img


However, after starting the configuration above, we see that Nginx left all links of the form
  http://super-cdn.com/img/*
unproxied. Clearly this is because they point to a different domain. It is equally clear that in this situation we cannot substitute translated images, because the user fetches them directly from the CDN, bypassing us. So some transformation magic is required!

Open example-transform.conf
 sub_filter 'http://super-cdn.com/img/' 'http://example.ru/img/';
 sub_filter_once off;
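A slightly fuller variant of this file might look like the sketch below. It rests on two assumptions that the config above does not state: that the origin may send compressed responses (sub_filter cannot match strings inside a gzipped body, so the Accept-Encoding request header is cleared), and that CSS files should be rewritten as well (text/html is always processed; other MIME types have to be listed in sub_filter_types).

```nginx
# example-transform.conf (sketch; included from "location /")

# Assumption: ask the origin for uncompressed responses so the filter sees plain text.
proxy_set_header Accept-Encoding "";

# Assumption: rewrite CSS too; text/html is handled by default.
sub_filter_types text/css;

# The rule from the article: point CDN image links at our own /img/ location.
sub_filter 'http://super-cdn.com/img/' 'http://example.ru/img/';

# Replace every occurrence, not just the first one.
sub_filter_once off;
```

Because the file is included at the same level as the other proxy_set_header directives in "location /", the extra header line does not cancel them.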


The substitution module is not built by default! Nginx has to be configured with this parameter:
  ./configure --with-http_sub_module 


Compile, run... Bingo! Fanfare!
Everything works, everything flies, and images can be replaced on the fly.

We happily add the first rule for replacing page content and... discover the shocking fact that http_sub_module allows only a single replacement rule!

Oh, Igor Sysoev, why did you bury this monstrous fact in the documentation page?
sysoev.ru/nginx/docs/http/ngx_http_sub_module.html
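
For readers on a newer nginx: since version 1.9.4 the ngx_http_sub_module documentation says that several sub_filter directives may be specified on the same configuration level, so the limitation described here applies to older releases. A minimal sketch, in which the second rule and its strings are purely hypothetical:

```nginx
# Requires nginx 1.9.4 or later; older versions accept only one sub_filter per level.
sub_filter 'http://super-cdn.com/img/' 'http://example.ru/img/';

# Hypothetical second rule, for illustration only.
sub_filter 'Welcome' 'Dobro pozhalovat';

sub_filter_once off;
```

Even then, sub_filter matches fixed strings only, so regular-expression replacements remain out of its reach.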

Oh, if only we had known about this ending at the very beginning! But... stop! Emotions aside, because Russians and Chinese are brothers forever! A simple Chinese guy, Weibin Yao, has already solved our problem and created a substitutions module, which is available at
code.google.com/p/substitutions4nginx

Install the module according to the instructions, open our transformation configuration and feel free to write:

 subs_filter 'http://super-cdn.com/img/' 'http://example.ru/img/' g;
 subs_filter '<title[^>]*>(.*?)</title>' '<title>        . !</title>' oir;
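
The flags at the end of each rule are worth spelling out. According to the module's README, g replaces every match (this is the default), o replaces only the first match, i makes matching case-insensitive, and r treats the pattern as a regular expression rather than a fixed string. Below is a sketch of how the rules might sit in example-transform.conf; the subs_filter_types line and the replacement title text are assumptions for illustration, not part of the original config:

```nginx
# Assumption: also process CSS; text/html is processed by default.
subs_filter_types text/css;

# Fixed-string match, replace every occurrence (g).
subs_filter 'http://super-cdn.com/img/' 'http://example.ru/img/' g;

# Regex (r), case-insensitive (i), first match only (o).
# "Translated title" is a placeholder, not the author's text.
subs_filter '<title[^>]*>(.*?)</title>' '<title>Translated title</title>' oir;
```

Unlike sub_filter, each rule carries its own flags, so a global rewrite and a one-off regex replacement can live side by side.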

Instead of an epilogue
1. The config was adapted from a working one, but for obvious reasons it was not tested live. If you run into difficulties, I will try to help.

2. Nginx is running here in a non-standard, little-studied mode. There is at least one known bug.

3. The substitutions4nginx module is not yet battle-tested: one bug was found right away and was quickly fixed. Apparently it needs a crowd of testers.

4. I found the same bug in both Apache Traffic Server and Nginx: tags with links (a href) that contain line breaks are not transformed. I suspect the problem comes from the PCRE library.

P.S. Whoever can tell me why source lang="bash" highlighting does not work in the second block gets a treat!

P.P.S. The highlighting problem was tracked down by yeah: the parser breaks on the sequence:
  ://
I am leaving the block without highlighting so as not to introduce unnecessary errors into the config.

Source: https://habr.com/ru/post/114845/

