Jinja2 Extensions Guide

Jinja2 is a Python library for rendering templates. It is the de facto standard for web applications written with Flask and a fairly popular alternative to the built-in Django template system. Although it is tightly coupled to the language, Jinja2 positions itself as a tool for designers and HTML coders: it simplifies markup work, separates it from development, and tries to shield non-developers from Python as much as possible. Markup, however, is not its only possible application; for example, in my work I use Jinja2 templates to generate SQL queries.

Jinja2 is extensible, and many of its features (for example, internationalization and loop controls) are implemented as extensions. However, the documentation on writing extensions seems to me somewhat incomplete: from an example of a simple (but carefully commented) extension it jumps straight to an API description of several Jinja2 classes, which is rather hard to read from start to finish. In this article I will try to fill this gap and give the reader a complete and clear picture of how Jinja2 works, how its extensions are structured, and how to use extensions to modify the different stages of template processing.

At a high level, Jinja2 compiles each template into executable Python code that takes a context as input and returns a string, the rendered template. The whole process looks like this.

Loading. You can store templates in the file system, in a folder inside your Python package, in memory, or simply generate them on the fly. First of all, Jinja2 determines which of these methods applies and loads the template source into memory.
Tokenization. The lexical analyzer (lexer) splits the template source into the simplest entities, tokens. An example of a token is the opening tag {% .
Parsing. The parser analyzes the token stream and extracts syntactic constructs. An example of a syntactic construct is {{ variable }}, which substitutes the value of a variable (it consists of three tokens: the opening {{ , the variable name, and the closing }} ).
Optimization. At this stage constant expressions are evaluated. For example, the {{ 1 + 2 }} construct is turned into {{ 3 }} .
Generation. The syntactic constructs, which until now have been stored as an abstract syntax tree (AST), are converted into Python code.
Compilation. The resulting Python code is compiled by the built-in compile function. The resulting object can be run with the built-in exec function, which is what a template does when it is rendered.
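The stages listed above can be observed from the outside through a few public methods of the environment. The following is only a rough sketch: the template text and the variable names are made up for illustration, and the exact shape of the generated Python code differs between Jinja2 versions.

```python
from jinja2 import Environment

env = Environment()
source = "Hello, {{ name }}! {{ 1 + 2 }}"

# Tokenization: Environment.lex yields raw (lineno, type, value) tuples.
tokens = list(env.lex(source))

# Parsing: Environment.parse returns the root node of the AST.
ast = env.parse(source)

# Generation: with raw=True, compile returns the generated Python source
# as a string instead of a compiled code object.
python_source = env.compile(source, raw=True)

# Compilation: without raw=True we get a code object ready for exec.
code = env.compile(source)

# The whole pipeline at once, followed by rendering.
rendered = env.from_string(source).render(name="World")
print(rendered)
```

Printing python_source is a convenient way to see what a template (or an extension) really turns into.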

To create a Jinja2 extension, you define a class that inherits from jinja2.ext.Extension. To activate an extension, list it in the extensions argument when creating the environment, or add it afterwards with the add_extension method.
A brief illustration instead of a thousand words:

    from jinja2 import Environment
    from jinja2.ext import Extension


    class MyFirstExtension(Extension):
        pass


    class MySecondExtension(Extension):
        pass


    environment = Environment(extensions=[MyFirstExtension])
    environment.add_extension(MySecondExtension)

    print(environment.extensions)
    # Prints something like:
    # {'__main__.MySecondExtension': <__main__.MySecondExtension object at 0x0000000002FF1780>,
    #  '__main__.MyFirstExtension': <__main__.MyFirstExtension object at 0x0000000002FE9BA8>}

It remains to teach them to do something! To do this, we have, by and large, only three methods that can be overridden:

preprocess ;
filter_stream (whatever that means);
parse .

Well, let's start in order.

The simplest way to control how template sources are loaded is to implement your own loader. This is trivial: inherit from jinja2.loaders.BaseLoader, override the get_source(environment, template_name) method, and you are done. Sometimes this even makes sense. For instance, if one day you manage to replace a whole folder of templates with one elegant function that generates them, then for backward compatibility with other parts of the program you may want to write a loader that pretends the templates are still there (and then run a sweet git rm).
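A minimal sketch of such a loader might look as follows. The GeneratedLoader class and the "greeting/" naming scheme are hypothetical and exist only for this illustration; the one real requirement is that get_source returns a (source, filename, uptodate) tuple or raises TemplateNotFound.

```python
from jinja2 import Environment, TemplateNotFound
from jinja2.loaders import BaseLoader


class GeneratedLoader(BaseLoader):
    """Pretends that templates named "greeting/<who>" exist somewhere."""

    def get_source(self, environment, template):
        prefix = "greeting/"
        if not template.startswith(prefix):
            raise TemplateNotFound(template)
        who = template[len(prefix):]
        # Generate the template source on the fly.
        source = "Hello, " + who + "! You are {{ role }}."
        # No real file behind the template, and it never goes stale.
        return source, None, lambda: True


env = Environment(loader=GeneratedLoader())
rendered = env.get_template("greeting/Alice").render(role="admin")
print(rendered)
```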

However, this is off topic: where are the extensions? Obviously, I can always inherit from whatever I want and change whatever I see fit! Surprisingly, the extension API also provides, just in case, a way to manipulate template source code directly.

Thus, the Extension class contains the preprocess method, which is called for each template after loading and before tokenization. The signature looks like this:

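    def preprocess(self, source, name, filename=None):
        """
        Parameters:
            source (str): the source code of the template
            name (str): the name of the template
            filename (str or None): the file name of the template (if any)
        Returns:
            str: the new source code of the template
        """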

In this method you can do anything. Technically, you could even implement here the compilation of your own template language into Jinja2 templates. But why? More likely, the ability to modify the source directly will be useful as an auxiliary tool when writing non-trivial extensions. No knowledge of the Jinja2 API or of its implementation details is required here, so we will not dwell on this stage any longer and will move on to tokenization.
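As a small illustration of preprocess, here is a sketch of an extension that rewrites a made-up shorthand, ${ expression }, into the regular {{ expression }} form before the lexer ever sees it. The DollarSyntaxExtension name and the shorthand itself are assumptions invented for this example, not something Jinja2 provides.

```python
import re

from jinja2 import Environment
from jinja2.ext import Extension


class DollarSyntaxExtension(Extension):
    # Hypothetical shorthand: ${ expr } stands for {{ expr }}.
    _pattern = re.compile(r"\$\{(.+?)\}")

    def preprocess(self, source, name, filename=None):
        # Plain text in, plain text out: no Jinja2 API is involved.
        return self._pattern.sub(r"{{ \1 }}", source)


env = Environment(extensions=[DollarSyntaxExtension])
rendered = env.from_string("Hello, ${ user }!").render(user="World")
print(rendered)
```

A regular expression is enough for a toy like this; anything resembling a real template language would need a proper parser.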

The filter_stream method is of much more interest to us. It attracts us both with the rich customization possibilities it opens up and with its mysterious name. The signature looks like this:

    def filter_stream(self, stream):
        """
        Parameters:
            stream (jinja2.lexer.TokenStream): the token stream produced by the lexer
        Returns:
            jinja2.lexer.TokenStream: the token stream to hand over to the parser
        """

In general, the interaction between the lexical and syntactic analyzers in Jinja2 is arranged as follows. The lexical analyzer (jinja2.lexer.Lexer) produces a generator that yields all tokens (jinja2.lexer.Token) one after another, and wraps this generator in a jinja2.lexer.TokenStream object, which buffers the stream and provides a number of methods convenient for parsing (for example, the ability to look at the current token without pulling it out of the stream). Extensions, in turn, can influence this stream: not only filter it (as the method name suggests), but also enrich it.

Tokens in Jinja2 are very simple objects. In essence, these are tuples of three named fields:

lineno - line number with token;
type - token type;
value - the string value of the token.

The various constants for the type field are defined in jinja2/lexer.py :

 TOKEN_ADD TOKEN_NE TOKEN_VARIABLE_BEGIN TOKEN_ASSIGN TOKEN_PIPE TOKEN_VARIABLE_END TOKEN_COLON TOKEN_POW TOKEN_RAW_BEGIN TOKEN_COMMA TOKEN_RBRACE TOKEN_RAW_END TOKEN_DIV TOKEN_RBRACKET TOKEN_COMMENT_BEGIN TOKEN_DOT TOKEN_RPAREN TOKEN_COMMENT_END TOKEN_EQ TOKEN_SEMICOLON TOKEN_COMMENT TOKEN_FLOORDIV TOKEN_SUB TOKEN_LINESTATEMENT_BEGIN TOKEN_GT TOKEN_TILDE TOKEN_LINESTATEMENT_END TOKEN_GTEQ TOKEN_WHITESPACE TOKEN_LINECOMMENT_BEGIN TOKEN_LBRACE TOKEN_FLOAT TOKEN_LINECOMMENT_END TOKEN_LBRACKET TOKEN_INTEGER TOKEN_LINECOMMENT TOKEN_LPAREN TOKEN_NAME TOKEN_DATA TOKEN_LT TOKEN_STRING TOKEN_INITIAL TOKEN_LTEQ TOKEN_OPERATOR TOKEN_EOF TOKEN_MOD TOKEN_BLOCK_BEGIN TOKEN_MUL TOKEN_BLOCK_END
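To get a feel for what these tokens look like, you can ask the environment's lexer for a token stream directly. This is only an exploratory sketch with a made-up template; note that, at least in the Jinja2 versions I am aware of, the value of an integer token is already converted to an int rather than kept as a string.

```python
from jinja2 import Environment, lexer

env = Environment()

# Lexer.tokenize returns a TokenStream; iterating over it stops at EOF.
stream = env.lexer.tokenize("{{ answer + 42 }}")
tokens = [(token.type, token.value) for token in stream]

for token_type, value in tokens:
    print(token_type, repr(value))

# The type of a token is one of the TOKEN_* constants.
first_type = tokens[0][0]
```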

A typical extension manipulating tokens should look something like this:

    from jinja2 import lexer
    from jinja2.ext import Extension


    class TokensModifyingExtension(Extension):

        def filter_stream(self, stream):
            generator = self._generator(stream)
            return lexer.TokenStream(generator, stream.name, stream.filename)

        def _generator(self, stream):
            for token in stream:
                # Inspect the token and decide what to do with it here.
                # Extra tokens can be yielded before the original one.
                yield token
                # Or after it, here.
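Before the full-size example, here is perhaps the smallest concrete instance of this skeleton I can think of: an extension that upper-cases all plain template text and leaves everything inside tags and expressions alone. The UppercaseDataExtension name is invented for this sketch.

```python
from jinja2 import Environment, lexer
from jinja2.ext import Extension


class UppercaseDataExtension(Extension):

    def filter_stream(self, stream):
        return lexer.TokenStream(
            self._generator(stream), stream.name, stream.filename)

    def _generator(self, stream):
        for token in stream:
            if token.type == lexer.TOKEN_DATA:
                # Tokens are immutable, so build a replacement token.
                token = lexer.Token(
                    token.lineno, token.type, token.value.upper())
            yield token


env = Environment(extensions=[UppercaseDataExtension])
rendered = env.from_string("Hello, {{ name }}!").render(name="world")
print(rendered)
```

Only the data tokens are touched, so the value substituted for name stays in lower case.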

As an example, let's write an extension that changes how variables are rendered. Suppose you want some of your objects to be rendered in Jinja2 templates differently from what the str function returns for them. Let our objects optionally define a __jinja__(self) method to be used in templates. The easiest way to achieve this is to add a custom filter that calls the __jinja__ method and to insert a call to it automatically into every construct of the form {{ <expression> }}. The complete extension code looks like this:

    from jinja2 import Environment
    from jinja2 import lexer
    from jinja2.ext import Extension


    class VariablesCustomRenderingExtension(Extension):

        # The filter itself. Objects that do not define __jinja__
        # are returned unchanged and rendered as usual.
        @staticmethod
        def _jinja_or_str(obj):
            try:
                return obj.__jinja__()
            except AttributeError:
                return obj

        def __init__(self, environment):
            super(VariablesCustomRenderingExtension, self).__init__(environment)

            # Register the filter. setdefault keeps an existing filter
            # with the same name if the user has already defined one.
            self._filter_name = "jinja_or_str"
            environment.filters.setdefault(self._filter_name, self._jinja_or_str)

        def filter_stream(self, stream):
            generator = self._generator(stream)
            return lexer.TokenStream(generator, stream.name, stream.filename)

        def _generator(self, stream):
            # Walk the token stream and turn every construct of the form
            # {{ <expression> }} into {{ (<expression>)|jinja_or_str }}
            for token in stream:
                if token.type == lexer.TOKEN_VARIABLE_END:
                    # The {{ <expression> }} construct is about to close:
                    # emit `)|jinja_or_str` before the closing token.
                    yield lexer.Token(token.lineno, lexer.TOKEN_RPAREN, ")")
                    yield lexer.Token(token.lineno, lexer.TOKEN_PIPE, "|")
                    yield lexer.Token(
                        token.lineno, lexer.TOKEN_NAME, self._filter_name)

                yield token

                if token.type == lexer.TOKEN_VARIABLE_BEGIN:
                    # The {{ <expression> }} construct has just opened:
                    # emit `(` right after the opening token.
                    yield lexer.Token(token.lineno, lexer.TOKEN_LPAREN, "(")

Usage example:

    class Kohai(object):
        def __jinja__(self):
            return "senpai rendered me!"


    if __name__ == "__main__":
        env = Environment(extensions=[VariablesCustomRenderingExtension])
        template = env.from_string("""Kohai says: {{ kohai }}""")
        print(template.render(kohai=Kohai()))
        # Prints "Kohai says: senpai rendered me!".

The full code is available on GitHub.

The last and most interesting method of the Extension class available for overriding is parse .

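    def parse(self, parser):
        """
        Parameters:
            parser (jinja2.parser.Parser): the parser of the template being processed
        Returns:
            jinja2.nodes.Stmt or List[jinja2.nodes.Stmt]: the AST node or nodes
            corresponding to the parsed tag
        """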

It works in conjunction with the tags attribute, which can be defined in the extension class. This attribute must contain the set of tags whose processing is entrusted to your extension, for example:

    class RepeatNTimesExtension(Extension):
        tags = {"repeat"}

Accordingly, the parse method will be called when the syntax analysis reaches the construction with the beginning of the corresponding tag:

    some text and then {% repeat ...
                          ^

At that moment, the parser.stream.current attribute, which points to the token currently being processed, contains Token(lineno, TOKEN_NAME, "repeat").
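The following sketch shows this entry point in isolation. The {% lineno %} tag and the LineNumberExtension class are invented for the illustration; the extension simply consumes the tag name token and outputs the line number stored in it. It uses the Output and Const node classes, which are ordinary built-in Jinja2 nodes.

```python
from jinja2 import Environment, nodes
from jinja2.ext import Extension


class LineNumberExtension(Extension):
    tags = {"lineno"}

    def parse(self, parser):
        # On entry, parser.stream.current is the name token "lineno".
        # next() returns it and advances the stream to the closing %}.
        token = next(parser.stream)
        # Output the line number of the tag as a constant.
        return nodes.Output(
            [nodes.Const(token.lineno)], lineno=token.lineno)


env = Environment(extensions=[LineNumberExtension])
rendered = env.from_string("first line\n{% lineno %}").render()
print(rendered)
```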

Next, inside the parse method, we need to parse our custom tag and return the result: one or more syntax tree nodes. Jinja2 does not let you define your own node types, so you have to make do with the built-in ones; fortunately, there is an (almost) universal CallBlock node, which I will discuss below.

In the meantime, for the cases where the logic of an existing node type such as For suits us, here is a set of recipes that you may want to use inside the parse method.

lineno = next(parser.stream).lineno
Usually the first line of the parse code. The next call advances the parser to the token after the tag name and returns the current one. We keep only its line number; we will need to pass it when creating nodes so that, in case of errors, the traceback correctly points to their source, our custom tag. (Details on creating nodes follow a little further below.)
parser.stream.expect(token_description)
Returns the current token and moves to the next one if the current token matches the description; otherwise fails with an error. Here the description is either a token type or a string of the form "type:value". So parser.stream.expect("integer") will try to read an integer and return it, or fail; parser.stream.expect("name:in") is used when parsing the for tag to make sure that the in keyword comes next in the code, and to skip it.
parser.stream.skip_if(token_description)
Returns True and moves to the next token if the current token matches the description; otherwise returns False. A typical use is parsing optional constructs. For example, from the same for parsing code:
```
if parser.stream.skip_if('name:if'):
    test = self.parse_expression()
```
(Yes, in Jinja2 the for loop has an optional if suffix.)
expr_node = parser.parse_expression()
Tries to parse an expression and return the corresponding AST node, or fails. It should be used for parsing tag parameters. In the example above, for uses this call to parse the filter condition; it also uses it after expect("name:in") to find out what the loop will iterate over.
target_node = parser.parse_assign_target(extra_end_rules=[])
Tries to parse an lvalue, that is, an expression that can be assigned to, or fails. Typical examples are a variable name, several variable names separated by commas, or an indexed expression. Because Python allows trailing commas at the end of tuples (for example, for a, b, c, in []: pass), this method accepts additional stop conditions (for example, when parsing the list of loop variables, for calls it with extra_end_rules=["name:in"] so that in is not accidentally recognized as another variable).
body_nodes = parser.parse_statements(end_tokens=[], drop_needle=True)
Parses the body of the tag. It assumes that parser.stream.current already points to %} (otherwise it fails) and parses the template until it reaches the end of the file or a token matching one of the descriptions in end_tokens. So the if tag calls this method with end_tokens=["name:elif", "name:else", "name:endif"]. The parameter drop_needle=True indicates that this final token should be discarded after parsing, which is convenient when the body of your tag can end in only one way.
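The token-level recipes above (expect and skip_if) can be tried out without writing an extension at all, directly on a token stream. This is just an exploratory sketch on a hand-written for tag; inside a real parse method the same calls are made on parser.stream.

```python
from jinja2 import Environment

env = Environment()
stream = env.lexer.tokenize("{% for user in users if user.active %}")

stream.expect("block_begin")           # the opening {%
tag = stream.expect("name:for")        # a name token with the value "for"
target = stream.expect("name")         # any name token
stream.expect("name:in")               # make sure `in` follows, and skip it
iterable = stream.expect("name")

# Optional parts: skip_if only advances when the description matches.
has_else = stream.skip_if("name:else")
has_filter = stream.skip_if("name:if")

print(tag.value, target.value, iterable.value, has_else, has_filter)
print(stream.current)                  # the first token of the condition
```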

Having parsed everything you need, you will want to create one or more tree nodes to return as the result of parsing. Here is what you need to know about creating Jinja2 nodes:

All node classes are defined in jinja2.nodes and inherit from jinja2.nodes.Node. This list cannot be extended.
Only nodes that inherit from jinja2.nodes.Stmt can be returned directly from parse. Others may sometimes work, but they may also break everything. So you can choose from the following classes:
```
 Assign ExprStmt Include AssignBlock Extends Macro Block FilterBlock Output Break For Scope CallBlock FromImport ScopedEvalContextModifier Continue If EvalContextModifier Import 
```

Each class that inherits from Node defines a fields attribute listing its fields. You can create a node either by specifying all of its fields or by specifying none of them (in that case they are initialized to None, and their values can be set later). In addition, when creating any node you can pass lineno as a keyword argument; use this to get meaningful tracebacks in case of errors.
Examples:

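    from jinja2.nodes import *

    # Nodes that describe constants are created like this:
    template_name = Const("lib/stuff.j2")

    # Create a node specifying all fields (lineno is optional and defaults to None):
    inc_node = Include(template_name, False, False, lineno=0)

    # Create an empty node and fill in the fields afterwards:
    inc_node = Include(lineno=0)
    inc_node.template = template_name
    inc_node.with_context = False
    inc_node.ignore_missing = False

    # Be careful with this approach. Where Jinja2 expects a list in a field
    # (for example, the body of an If node) and finds None instead,
    # things are likely to break.

    # lineno can also be set after creation:
    inc_node = Include()
    inc_node.lineno = 0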

Note that fields cannot be passed as keyword arguments: the call

 Include(template=template_name, with_context=False, ignore_missing=False)

will not work.

Many node fields are themselves nodes. For instance, Include does not work with the plain string "lib/stuff.j2" in its template field, only with nodes.Const("lib/stuff.j2"). If you are not sure what type a particular field has, find the code that parses the corresponding node in jinja2/parser.py; it is easy to figure out (at least after reading this article... it should be).
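These rules are easy to check interactively. A short sketch, using only the Include and Const classes discussed above (the line number 7 is arbitrary):

```python
from jinja2 import nodes

# Every node class lists its fields, in constructor order.
include_fields = nodes.Include.fields
print(include_fields)

# All fields positionally, lineno as a keyword argument.
node = nodes.Include(nodes.Const("lib/stuff.j2"), False, False, lineno=7)
print(node.template.value, node.lineno)

# Passing fields as keyword arguments is rejected with a TypeError.
try:
    nodes.Include(
        template=nodes.Const("lib/stuff.j2"),
        with_context=False,
        ignore_missing=False,
    )
    keyword_error = None
except TypeError as exc:
    keyword_error = exc
print(repr(keyword_error))
```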

As an example that applies all of this knowledge, let's look at a simple extension that adds the {% repeat N times %}...{% endrepeat %} construct as syntactic sugar for {% for _ in range(N) %}...{% endfor %}:

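    from jinja2 import nodes
    from jinja2.ext import Extension


    class RepeatNTimesExtension(Extension):
        # The set of tags for which parse will be called: only repeat.
        # The closing tag, endrepeat, is handled inside parse itself.
        tags = {"repeat"}

        def parse(self, parser):
            lineno = next(parser.stream).lineno

            # The loop variable. "store" is the assignment context
            # (as opposed to "load", which is used when a value is read).
            index = nodes.Name("_", "store", lineno=lineno)

            # Read the parameter N. It can be an arbitrary expression.
            how_many_times = parser.parse_expression()

            # The iterable: a node that Jinja2 will turn into
            # the call `range(N)`.
            iterable = nodes.Call(
                nodes.Name("range", "load"), [how_many_times], [], None, None)

            # Make sure the expression is followed by the word times.
            # If anything else follows, this fails with an error.
            parser.stream.expect("name:times")

            # Parse the tag body up to {% endrepeat %}.
            body = parser.parse_statements(["name:endrepeat"], drop_needle=True)

            # Return a For node. The remaining fields are the else body,
            # the filter condition and the recursive flag.
            return nodes.For(index, iterable, body, [], None, False, lineno=lineno)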

Usage example:

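    from jinja2 import Environment

    if __name__ == "__main__":
        env = Environment(extensions=[RepeatNTimesExtension])
        # The literal words in this template are illustrative stand-ins.
        template = env.from_string(u"""
            {%- repeat 3 times -%}
                {% if not loop.first and not loop.last %}, {% endif -%}
                {% if loop.last %} and once again {% endif -%}
                hooray
            {%- endrepeat -%}
        """)
        print(template.render())
        # Prints "hooray, hooray and once again hooray".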

The full code is available on GitHub.

Since the intricacies of the Jinja2 architecture make it impossible to add new classes of syntax tree nodes, we need some kind of universal node in which we can do any processing we like. Such a node exists, and it is CallBlock.

Let's first recall how the {% call %} tag works on its own. An example from the official documentation:

    {% macro dump_users(users) -%}
        <ul>
        {%- for user in users %}
            <li><p>{{ user.username|e }}</p>{{ caller(user) }}</li>
        {%- endfor %}
        </ul>
    {%- endmacro %}

    {% call(user) dump_users(list_of_user) %}
        <dl>
            <dl>Realname</dl>
            <dd>{{ user.realname|e }}</dd>
            <dl>Description</dl>
            <dd>{{ user.description }}</dd>
        </dl>
    {% endcall %}

The following happens:

A temporary macro named caller is created. Its body is the content between {% call... %} and {% endcall %}. The macro may take arguments (in the example above, a single argument, user) or take none (when the simplified construct {% call something(...) %} is used).
The macro specified after call(...) is called. It has access to the caller macro and may use it (or may not).
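The two steps above can be seen in a stripped-down form by rendering a tiny template from Python. The shout macro and its contents are made up for this sketch; only the {% call %} mechanics matter here.

```python
from jinja2 import Environment

env = Environment()
template = env.from_string(
    "{% macro shout(items) %}"
    "{% for item in items %}[{{ caller(item) }}]{% endfor %}"
    "{% endmacro %}"
    # The body of the call block becomes the temporary `caller` macro,
    # which takes one argument, `item`.
    "{% call(item) shout(['a', 'b']) %}{{ item|upper }}!{% endcall %}"
)
rendered = template.render()
print(rendered)
```

The shout macro decides how many times and with which arguments to invoke caller; the call block only supplies the body.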

However, a macro in Jinja2 is nothing more than a function that returns a string. Therefore, the CallBlock node can just as well be fed a function defined somewhere in the depths of our extension.

A typical extension that uses CallBlock to process text looks something like this:

    from jinja2 import nodes
    from jinja2.ext import Extension


    class ReplaceTabsWithSpacesExtension(Extension):
        tags = {"replacetabs"}

        def parse(self, parser):
            lineno = next(parser.stream).lineno

            # Parse the body up to the closing tag, as before:
            body = parser.parse_statements(
                ["name:endreplacetabs"], drop_needle=True)

            # And here is the trick!
            return nodes.CallBlock(
                self.call_method("_process", [nodes.Const("    ")]),
                [], [], body, lineno=lineno)

        def _process(self, replacement, caller):
            text = caller()
            return text.replace("\t", replacement)

How does it work?

call_method is a special method of the Extension class that wraps a call to a method of the extension into a Jinja2 node. The result can be passed wherever Jinja2 expects an expression, and in particular wherever it expects specifically a function call, as in CallBlock.
When the CallBlock node returned from our parse method is rendered, it calls the ReplaceTabsWithSpacesExtension._process method. It receives first the arguments given to call_method (in our case a single argument, a string of four spaces), then caller, which is just a Jinja2 macro and can simply be called to obtain a string.
If the caller macro needs to be called with arguments, they should be listed in the fields of the CallBlock node (the ones that hold empty lists in our example).

See the usage example on Github .
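For convenience, here is a self-contained sketch of how such an extension might be used. The class is repeated from above in a slightly condensed form so that the snippet runs on its own; the template text is made up for the example.

```python
from jinja2 import Environment, nodes
from jinja2.ext import Extension


class ReplaceTabsWithSpacesExtension(Extension):
    tags = {"replacetabs"}

    def parse(self, parser):
        lineno = next(parser.stream).lineno
        body = parser.parse_statements(
            ["name:endreplacetabs"], drop_needle=True)
        return nodes.CallBlock(
            self.call_method("_process", [nodes.Const("    ")]),
            [], [], body, lineno=lineno)

    def _process(self, replacement, caller):
        # caller() renders the tag body and returns it as a string.
        return caller().replace("\t", replacement)


env = Environment(extensions=[ReplaceTabsWithSpacesExtension])
template = env.from_string(
    "{% replacetabs %}a\tb{{ sep }}c{% endreplacetabs %}|\t|")
rendered = template.render(sep="\t")
print(repr(rendered))
```

Tabs are replaced both in the literal text and in the substituted value, because the processing happens on the rendered body; the tab outside the tag is left untouched.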

And finally, a slightly more complicated example of an extension that uses CallBlock together with one more thing we covered today: an indentation fixer. It is well known that it is almost impossible to write any non-trivial Jinja2 template so that both the template source and the rendered result look good in terms of indentation. Let's try to add a tag that corrects this misfortune.

    import re

    from jinja2 import lexer, nodes
    from jinja2.ext import Extension


    # Extra attributes cannot be set on a token directly, because
    # lexer.Token declares __slots__ = (). Fortunately, Jinja2 is happy
    # to work with a subclass of lexer.Token.
    class RichToken(lexer.Token):
        pass


    class AutoindentExtension(Extension):
        tags = {"autoindent"}

        # Only spaces count as indentation here; tabs are not handled.
        _indent_regex = re.compile(r"^ *")
        _whitespace_regex = re.compile(r"^\s*$")

        def _generator(self, stream):
            # Track the indentation of the last line of template text seen
            # so far and attach it to every token.
            last_line = ""
            last_indent = 0
            for token in stream:
                if token.type == lexer.TOKEN_DATA:
                    # Plain template text.
                    last_line += token.value
                    if "\n" in last_line:
                        _, last_line = last_line.rsplit("\n", 1)
                    last_indent = self._indent(last_line)
                # Enrich the token.
                token = RichToken(*token)
                token.last_indent = last_indent
                yield token

        def filter_stream(self, stream):
            return lexer.TokenStream(
                self._generator(stream), stream.name, stream.filename)

        def parse(self, parser):
            # By the time we reach the autoindent tag, its token already
            # carries the indentation. Read it before advancing the stream.
            last_indent = nodes.Const(parser.stream.current.last_indent)
            lineno = next(parser.stream).lineno
            body = parser.parse_statements(["name:endautoindent"], drop_needle=True)
            # The rest is already familiar :)
            return nodes.CallBlock(
                self.call_method("_autoindent", [last_indent]),
                [], [], body, lineno=lineno)

        def _autoindent(self, last_indent, caller):
            text = caller()
            # Shift all lines after the first one to the left so that the
            # smallest indentation among them becomes last_indent.
            # Whitespace-only lines are ignored when computing it.
            lines = text.split("\n")
            if len(lines) < 2:
                return text
            first_line, tail_lines = lines[0], lines[1:]
            indents = [
                self._indent(line) for line in tail_lines
                if not self._whitespace_regex.match(line)
            ]
            if not indents:
                # Nothing but blank lines: min() would fail on an empty list.
                return text
            min_indent = min(indents)
            if min_indent <= last_indent:
                return text
            dindent = min_indent - last_indent
            tail = "\n".join(line[dindent:] for line in tail_lines)
            return "\n".join((first_line, tail))

        def _indent(self, string):
            return len(self._indent_regex.match(string).group())

Usage example:

    from jinja2 import Environment

    if __name__ == "__main__":
        env = Environment(extensions=[AutoindentExtension])
        template = env.from_string(u"""
            {%- autoindent %}
                {% if True %}
                    What is true, is true.
                {% endif %}
                {% if not False %}
                    But what is false, is not true.
                {% endif %}
            {% endautoindent -%}
        """)
        print(template.render())
        # Prints the text with the extra indentation removed.

The full code is available on GitHub.


Source: https://habr.com/ru/post/340254/
