Hello, dear Habr readers!
This is a continuation of the introductory article on personalizing the Internet. Below is a brief description of the technology on which the company's Internet personalization products are based.
Avvea has developed a technology for rewriting the content of dynamic web pages. The technology itself is not new and is known as a reverse proxy. Examples of high-quality reverse proxy servers in the business field are the F5 and Juniper products. Each of these companies has been developing its reverse proxy servers for more than a decade, and the products are aimed at supporting a limited number of complex applications for corporate clients.
An example of an amateur-level reverse proxy is the free Glype project. There are many such servers; the most popular among them are the so-called anonymizers.
Let us consider some technical features. The main task of a reverse proxy server is to create a virtual layer between the browser interface and the client program code. The better this problem is solved, the better the resulting reverse proxy server.

Our technology involves the interception and rewriting of all interface properties and methods of all active elements of a web page (HTML, JavaScript, Adobe Flash, Java and others). Thus, a kind of “virtual browser” is created inside a real browser.
While HTML is simple, and even free anonymizers do a good job of rewriting static HTML, things are much more complicated with JavaScript, for example. Until now, there has been no single approach to solving this problem. Our development has changed the situation.
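To make the "simple" case concrete, here is a minimal sketch of static HTML rewriting of the kind anonymizers perform: every href and src attribute is redirected through the proxy. It is only an illustration and not our implementation. The proxy address PROXY_PREFIX and the function name rewrite_html are assumptions made up for this example, and a regular expression is used instead of a real HTML parser.

```python
import re
from urllib.parse import urljoin, quote

# Hypothetical proxy endpoint, invented for this example.
PROXY_PREFIX = "https://proxy.example/fetch?url="

# Matches href="..." or src="..." with single or double quotes.
ATTR_RE = re.compile(r'\b(href|src)\s*=\s*(["\'])(.*?)\2', re.IGNORECASE)


def rewrite_html(html: str, base_url: str) -> str:
    """Point every href/src attribute at the proxy instead of the origin."""

    def replace(match: re.Match) -> str:
        attr, quote_char, url = match.group(1), match.group(2), match.group(3)
        absolute = urljoin(base_url, url)  # resolve relative links first
        proxied = PROXY_PREFIX + quote(absolute, safe="")
        return f"{attr}={quote_char}{proxied}{quote_char}"

    return ATTR_RE.sub(replace, html)


if __name__ == "__main__":
    page = '<a href="/about">About</a> <img src="logo.png">'
    print(rewrite_html(page, "http://example.com/index.html"))
```

A substitution like this works only because the URLs sit in plain sight in the markup. In JavaScript the same URL may be assembled at run time, which is why text replacement is not enough there.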
We will show the main idea of the new approach using the JavaScript rewriting machine of our reverse proxy server as an example. Schematically, the machine consists of three parts: a lexer, a parser and a patcher, each of which is an independent element of the system.

The lexer is the foundation of the machine; it is responsible for recognizing the elements of the language in a stream of characters. The parser, built on top of the lexer, breaks the incoming stream into its components: variables, functions, operations, constructs of the language itself, and so on. The patcher is then applied to the interface properties and methods that were found.
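The division of labour between the lexer and the patcher can be illustrated with a deliberately tiny sketch. It is not our production code. The token classes, the set of intercepted names INTERCEPTED and the prefix __vb (standing in for the "virtual browser" object) are all assumptions invented for the example. The lexer here handles only a small subset of JavaScript: it does not recognize comments or regular expression literals, and it does not tell declarations from uses.

```python
import re
from typing import Iterator, Tuple

# A toy lexer for a small JavaScript subset: strings, numbers,
# identifiers, whitespace and single-character punctuation.
TOKEN_RE = re.compile(r"""
    (?P<string>"(?:\\.|[^"\\])*"|'(?:\\.|[^'\\])*')
  | (?P<number>\d+(?:\.\d+)?)
  | (?P<ident>[A-Za-z_$][A-Za-z0-9_$]*)
  | (?P<space>\s+)
  | (?P<punct>.)
""", re.VERBOSE | re.DOTALL)

# Hypothetical names: browser globals to intercept and the object
# that stands in for them.
INTERCEPTED = {"location", "document", "window"}
VIRTUAL_ROOT = "__vb"


def lex(source: str) -> Iterator[Tuple[str, str]]:
    """Yield (kind, text) pairs covering the whole input."""
    for match in TOKEN_RE.finditer(source):
        yield match.lastgroup, match.group()


def patch(source: str) -> str:
    """Redirect intercepted globals to the virtual object."""
    out = []
    prev = None  # text of the last non-whitespace token
    for kind, text in lex(source):
        # Patch only free-standing identifiers, not members like a.location.
        if kind == "ident" and text in INTERCEPTED and prev != ".":
            out.append(f"{VIRTUAL_ROOT}.{text}")
        else:
            out.append(text)
        if kind != "space":
            prev = text
    return "".join(out)


if __name__ == "__main__":
    print(patch('var u = location.href; alert("location");'))
```

Because the lexer classifies the string literal "location" as a string, the patcher leaves it alone and rewrites only the real reference. A plain text search and replace could not make that distinction.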
A few words about how it works.
Input streams are processed expression by expression. An expression here is an elementary construction of user program code that is indivisible from the point of view of the parser. The parser uses a specialized two-stack disassembly/assembly method: one stack holds operands, the other holds operations with their priorities. One cycle of filling the stacks corresponds to one expression. As a result, the system is single-pass and streaming, which means that the code is passed on to the browser as it is downloaded. That is, expressions encountered at the beginning are processed as soon as they are completely in the patcher's buffer, without waiting for the entire code to load. All this has had a positive effect on the performance and "lightness" of the system. In combination with intelligent caching of the patched code, you can sometimes observe that a website rewritten by the reverse proxy server loads faster than when accessed directly.
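The general idea of a two-stack pass combined with streaming can be sketched as follows. This is a textbook operator-precedence scheme written for illustration and is not our specialized method. The priority table, the names reassemble and StreamingRewriter, and the use of ";" as the expression boundary are all simplifying assumptions. Only binary operators, identifiers, integers and parentheses are supported.

```python
import re
from typing import List

# Assumed priorities for a handful of binary operators.
PRIORITY = {"=": 1, "+": 2, "-": 2, "*": 3, "/": 3}
RIGHT_ASSOC = {"="}
TOKEN_RE = re.compile(r"\s*([A-Za-z_$][\w$]*|\d+|[=+\-*/()])")


def reassemble(expr: str) -> str:
    """Take one expression apart on two stacks and put it back together."""
    operands: List[str] = []
    operators: List[str] = []

    def reduce_top() -> None:
        # Combine the top operator with its two operands.
        op = operators.pop()
        right = operands.pop()
        left = operands.pop()
        operands.append(f"({left} {op} {right})")

    expr = expr.strip()
    pos = 0
    while pos < len(expr):
        match = TOKEN_RE.match(expr, pos)
        if match is None:
            raise ValueError(f"unexpected input: {expr[pos:]!r}")
        tok = match.group(1)
        pos = match.end()
        if tok == "(":
            operators.append(tok)
        elif tok == ")":
            while operators[-1] != "(":
                reduce_top()
            operators.pop()  # discard the matching "("
        elif tok in PRIORITY:
            # Reduce while the stacked operator binds at least as tightly.
            while (operators and operators[-1] != "("
                   and (PRIORITY[operators[-1]] > PRIORITY[tok]
                        or (PRIORITY[operators[-1]] == PRIORITY[tok]
                            and tok not in RIGHT_ASSOC))):
                reduce_top()
            operators.append(tok)
        else:
            operands.append(tok)
    while operators:
        reduce_top()
    return operands.pop()


class StreamingRewriter:
    """Emits each expression as soon as its closing ';' has arrived."""

    def __init__(self) -> None:
        self.buffer = ""

    def feed(self, chunk: str) -> List[str]:
        self.buffer += chunk
        ready = []
        while ";" in self.buffer:
            expr, self.buffer = self.buffer.split(";", 1)
            if expr.strip():
                ready.append(reassemble(expr) + ";")
        return ready


if __name__ == "__main__":
    rewriter = StreamingRewriter()
    for chunk in ["a = b + ", "c * 2; x = ", "1 - 2 - 3;"]:
        print(rewriter.feed(chunk))
```

The explicit parentheses in the output only make the recovered structure visible. In a real rewriter, the assembly step is where the patcher would substitute the intercepted properties and methods. Note that the first chunk produces no output: the expression is still incomplete, so it stays in the buffer until the rest arrives.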
Thus, for the first time, a reverse proxy server has been created that works correctly with real Internet sites, and not just with a limited set of browser applications. This means that the technology can already be used successfully.
In the following articles we will describe in more detail how to apply our technology to specific problems.