
Beautiful PDF printing from Django

This article is the result of several years of experiments, so it will be long. But perhaps it will save someone the many months of stepping on the rakes described here.
Strictly speaking, this is not even about Django, but about printing regulated documents from Python using template engines.
For those too lazy to read further, I will say right away: the problem has not been fully solved. But a more or less working version has taken shape.

1. Task




2. Limitations



The first step is choosing the output format. After weighing it from several points of view (cross-platform support, guaranteed results, convertibility), the choice fell on PDF.
Next: the input formats and how to convert them.

3. Soft forms


ODF

This means the Open Document Format: ODS, ODT and the others.
Everything is quite simple here:

Where the data goes: either we add user-defined fields to the document, or we insert {{django}} {{tags_django}} directly into the text. In the first case, filling those fields from Python afterwards is probably possible, but I cannot even imagine how (or rather, every approach I can imagine looks extremely convoluted). So we simply place the tags as text.
Filling in the fields is then trivial: we feed the template to the Django template engine (digging into the template with Python libraries is left to the Gentoo crowd :-). And to avoid unzipping and zipping the document on every request, documents are saved as *.fodX (Flat X), a single uncompressed XML file. The template is fed in as XML.
Getting the PDF, with no alternatives, goes through LibreOffice: a LibreOffice daemon (libreofficed, found somewhere among the Ubuntu folks), unoconv, or a hand-rolled launch of LO in daemon mode. All of these options are roughly equivalent.
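A minimal sketch of the last two steps, assuming Django is installed and a `soffice` binary is on the PATH. The function names `render_flat_odf`, `soffice_pdf_command` and `convert_to_pdf` are hypothetical, and the conversion is a plain one-shot headless call rather than any of the daemon setups mentioned above, so it is simpler but slower per document.

```python
import subprocess
from pathlib import Path


def render_flat_odf(template_xml, context):
    """Render a flat ODF (.fodt/.fods) template string with Django's engine."""
    # Imported lazily so the conversion helpers below work without Django.
    import django
    from django.conf import settings
    from django.template import Context, Engine

    if not settings.configured:  # standalone use, outside a Django project
        settings.configure()
        django.setup()
    # Autoescaping stays on, so '&' and '<' in values keep the XML well-formed.
    return Engine().from_string(template_xml).render(Context(context))


def soffice_pdf_command(source, outdir):
    """Build a one-shot headless LibreOffice conversion command."""
    return ["soffice", "--headless", "--convert-to", "pdf",
            "--outdir", str(outdir), str(source)]


def convert_to_pdf(source, outdir, timeout=120):
    """Run the conversion; LibreOffice names the output <stem>.pdf in outdir."""
    subprocess.run(soffice_pdf_command(source, outdir), check=True, timeout=timeout)
    return Path(outdir) / (Path(source).stem + ".pdf")
```

In use, the rendered XML would be written to a temporary `.fodt` file and that path passed to `convert_to_pdf`.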
Advantages


Disadvantages


Summary

Suitable as a last-resort fallback. But only as a last resort.
HTML

Here, editing templates (by hand) and the template engine (Django's) are straightforward. Only one small but central question remains: how to get the PDF? Quickly, with good quality, and with page breaks where they are needed. This is where most of the experiments took place.
Numerous experiments with pure-Python HTML renderers (such as PISA and its ancestors, successors and forks) led to one important (IMHO) conclusion: to get a guaranteed result, use a ready-made HTML engine. Of which, as we all know, there are 4 (decent ones). Of those, as many as 2 can be used on Linux: Gecko and WebKit. Gecko can most likely be called from Python, but a) that requires a running X server (as with LibreOffice) and b) I did not find a [half-]finished recipe.
That leaves WebKit:
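As one illustration of the WebKit route, here is a sketch that assumes the `wkhtmltopdf` command-line tool (a WebKit-based converter, not named in the text above) is installed. The helper names `join_pages` and `wkhtmltopdf_command` are hypothetical; page breaks are forced with the CSS `page-break-after` property.

```python
import subprocess

# An empty block that forces the next content onto a new printed page.
PAGE_BREAK = '<div style="page-break-after: always"></div>'


def join_pages(pages):
    """Wrap already-rendered HTML fragments into one document, one per page."""
    body = PAGE_BREAK.join(pages)
    return ("<!DOCTYPE html><html><head><meta charset='utf-8'></head>"
            f"<body>{body}</body></html>")


def wkhtmltopdf_command(html_path, pdf_path):
    """Build the converter command (assumes wkhtmltopdf is on the PATH)."""
    return ["wkhtmltopdf", "--quiet", "--page-size", "A4", html_path, pdf_path]


def html_file_to_pdf(html_path, pdf_path, timeout=60):
    """Run the conversion and raise if the tool exits with an error."""
    subprocess.run(wkhtmltopdf_command(html_path, pdf_path),
                   check=True, timeout=timeout)
```

The fragments passed to `join_pages` would typically be the output of the Django template engine, one fragment per printed page.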

Advantages


Disadvantages


Summary

The main option for “soft” documents. Still, it would be worth finding a high-quality pure-Python HTML renderer: without Flash, JS and other frills, but with solid CSS handling.
Maybe

For the future, TeX, LaTeX, LyX and DocBook formats are under consideration, but so far they offer no advantages (especially for “almost soft” forms, such as that same 21001).

4. Hard forms


Here things are much sadder. Especially because a visual editor is highly desirable at this point.
On top of that, the vast majority (if not all) of “hard” RF forms use “squares”: the text is split into letters, and each letter goes into its own square (example).
Let's skip the first things that come to mind (like “overlay the text on a TIFF”) and go straight to the finalists.
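The “squares” layout can be prepared on the Python side before any template is rendered. A minimal sketch, assuming a hypothetical helper `split_into_cells` and a fixed number of squares per field:

```python
def split_into_cells(text, cell_count, filler=""):
    """Split text into one character per square, padded to cell_count."""
    chars = list(text)
    if len(chars) > cell_count:
        raise ValueError(f"{text!r} does not fit into {cell_count} squares")
    # Unused squares at the end are filled with the filler value.
    return chars + [filler] * (cell_count - len(chars))
```

A template can then loop over the resulting list, or each element can be mapped to its own form field (for example `name_0`, `name_1`, which are made-up names).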
RML

A development by ReportLab (yes, python-reportlab is theirs): ordinary XML that lets you work wonders with PDF. Since the well-known python-trml2pdf is already RIP (as its developer honestly wrote to me), I had to take this trml2pdf and finish it up a bit, because it does not support many interesting RML features, and my religion forbids me from buying (let alone cracking) the commercial rml2pdf.
Advantages


Disadvantages

Summary

A fallback option for exact forms (especially simple ones).
PDF forms

Everything is very simple here: the source is a PDF, and the final result is a PDF.
  1. Take the original PDF form in your left hand
  2. Take XFDF (plain XML), rendered by the built-in Django template engine, in your right
  3. Merge them (populate) into a new PDF, flattened
  4. Hand it to the user

There is only one problem: step 3.
To date, no native and correctly working Python API for PDF forms has been found (poppler can already do something, but a lot of work remains there), so the only acceptable option is iText. Whether through pdftk or your own reinvented wheel is a matter of taste.
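The four steps can be sketched roughly as follows, assuming the pdftk route. The XFDF here is built with the standard library's ElementTree instead of a Django template, purely to keep the example self-contained; the function names and the field names in the usage are hypothetical.

```python
import subprocess
import xml.etree.ElementTree as ET

XFDF_NS = "http://ns.adobe.com/xfdf/"
ET.register_namespace("", XFDF_NS)  # serialize XFDF as the default namespace


def build_xfdf(values):
    """Return XFDF bytes with one <field name=...><value> per form field."""
    root = ET.Element(f"{{{XFDF_NS}}}xfdf")
    fields = ET.SubElement(root, f"{{{XFDF_NS}}}fields")
    for name, value in values.items():
        field = ET.SubElement(fields, f"{{{XFDF_NS}}}field", {"name": name})
        ET.SubElement(field, f"{{{XFDF_NS}}}value").text = str(value)
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


def pdftk_fill_command(form_pdf, xfdf_path, output_pdf):
    """Build the pdftk call that populates the form and flattens the result."""
    return ["pdftk", form_pdf, "fill_form", xfdf_path,
            "output", output_pdf, "flatten"]


def fill_form(form_pdf, xfdf_path, output_pdf, timeout=60):
    """Run pdftk (assumed to be on the PATH) and raise on failure."""
    subprocess.run(pdftk_fill_command(form_pdf, xfdf_path, output_pdf),
                   check=True, timeout=timeout)
```

The bytes returned by `build_xfdf` would be written to a temporary file, passed to `fill_form` together with the blank form, and the flattened output sent to the user.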
Advantages


Disadvantages


Summary

The main option for exact printed forms.

5. General summary


What has taken shape so far:


P.S. You can see how it all works here: without ODF and RML, though provision has been made for the latter.

Source: https://habr.com/ru/post/148612/

