Non-obvious features of Rotativa for generating PDFs in an ASP.NET MVC application

Many developers face the task of creating PDF reports for web applications; it is quite a natural request. I would like to share my experience of solving it with the Rotativa library. In my opinion it is one of the most convenient libraries of its kind, but while using it I ran into several non-obvious points that I want to describe.

To be completely honest, I want to share the pitfalls I ran into while integrating this library, which is without doubt fast and very convenient.

In this article I will not touch on the question of choosing a library. Everyone may have their own reasons for using one or another. I chose Rotativa because it had everything needed to meet the customer's requirements at minimal setup cost. Besides it, I tried three or four other options.

Formulation of the problem

The web application runs on ASP.NET MVC, .NET version 4.6. Other details do not matter in this context, with the exception of deployment: the application is expected to be deployed on Azure. This is important because some other libraries (for example HiQPdf) cannot be installed in certain Azure configurations; this is documented.

I need one link to open a static HTML report and a second link to open a PDF version of the same report. The report itself is just a set of tables, fields and charts shown to the user. Both versions include a menu for navigating between report sections, tables, and some visual styling (colors, text size, borders).

Rotativa library application

In my opinion, Rotativa is about as easy to apply as it gets.

You already have a ready-made HTML report in the form of a template and an ASP.NET MVC controller action, such as this:

    [HttpGet]
    public async Task<ActionResult> Index(int param1, string param2)
    {
        var model = await service.GetReportDataAsync(param1, param2);
        return View(model);
    }

Install the Rotativa NuGet package.
Add a new controller action for the PDF report:

    [HttpGet]
    public async Task<ActionResult> Pdf(int param1, string param2)
    {
        var model = await service.GetReportDataAsync(param1, param2);
        return new ViewAsPdf("Index", model);
    }

In essence, from this point on the action returns a PDF file containing all the data from the original HTML report.

I did not describe the routing here, but it is assumed that you have configured the routes so that both actions are called correctly.

Interestingly, the library itself is essentially a wrapper around the well-known console utility wkhtmltopdf. Performance is excellent, and it can be deployed on Azure and will work there. But it has some peculiarities, which we will talk about.

Page Number

It is logical to assume that the customer will print the PDF and will want to see page numbers. Thanks to the creators of Rotativa, this is very simple.

According to the Rotativa documentation, you can use the CustomSwitches property to specify arguments that will be passed to the wkhtmltopdf utility, and online advice is generous with examples. The following call adds a page number to the bottom of each page:

    return new ViewAsPdf("Index", model)
    {
        PageMargins = new Rotativa.Options.Margins(10, 10, 10, 10),
        PageSize = Rotativa.Options.Size.A4,
        PageOrientation = Rotativa.Options.Orientation.Portrait,
        // The string must be closed before the object initializer ends.
        CustomSwitches = "--page-offset 0 --footer-center [page] --footer-font-size 8"
    };

It works great. The page number itself is inserted through the [page] placeholder; such placeholders are replaced with concrete values when the PDF is generated.

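In addition to [page] there are others. The full list:
[page] Replaced by the number of the page currently being printed
[frompage] Replaced by the number of the first page to be printed
[topage] Replaced by the number of the last page to be printed
[webpage] Replaced by the URL of the page being printed
[section] Replaced by the name of the current section
[subsection] Replaced by the name of the current subsection
[date] Replaced by the current date in system local format
[isodate] Replaced by the current date in ISO 8601 extended format
[time] Replaced by the current time in system local format
[title] Replaced by the title of the current page object
[doctitle] Replaced by the title of the output document
[sitepage] Replaced by the number of the page in the current site being converted
[sitepages] Replaced by the number of pages in the current site being converted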
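
To make the substitution concrete, here is a minimal JavaScript sketch of what happens to a footer template. It only mimics the behaviour described above: the function name fillFooter and the sample values are hypothetical, and wkhtmltopdf performs the real substitution internally.

```javascript
// Illustration only: mimics how placeholders such as [page] are filled in.
// fillFooter and the sample values are hypothetical, not part of wkhtmltopdf.
function fillFooter(template, values) {
  return template.replace(/\[(\w+)\]/g, function (match, name) {
    // Placeholders without a known value are left untouched in this sketch.
    return Object.prototype.hasOwnProperty.call(values, name)
      ? String(values[name])
      : match;
  });
}

var footer = fillFooter('Page [page] of [topage]', { page: 3, topage: 12 });
// footer === 'Page 3 of 12'
```

A template like this could be passed to --footer-center in the same way as the bare [page] in the call above.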

Table of contents

Large multi-page reports need a table of contents and page navigation in the PDF. This is very convenient, and simply vital when the report exceeds a hundred pages.

The wkhtmltopdf manual contains a complete list of all parameters, among which is --toc. Given this parameter, the utility essentially collects all the <h1>, <h2>, ... <h6> tags in the document and generates a table of contents from them. Accordingly, your HTML template must use these heading tags properly.
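
As a rough illustration of the information such a table of contents is built from, the sketch below collects headings from an HTML string. It is not how wkhtmltopdf is implemented; collectHeadings and the sample markup are hypothetical.

```javascript
// Illustration only: extracts <h1>..<h6> headings from an HTML string.
// collectHeadings is a hypothetical helper, not part of wkhtmltopdf.
function collectHeadings(html) {
  var pattern = /<h([1-6])[^>]*>([\s\S]*?)<\/h\1>/gi;
  var headings = [];
  var match;
  while ((match = pattern.exec(html)) !== null) {
    headings.push({
      level: Number(match[1]),
      // Strip nested tags so that only the visible text remains.
      text: match[2].replace(/<[^>]+>/g, '').trim()
    });
  }
  return headings;
}

var outline = collectHeadings(
  '<h1>Summary</h1><p>...</p><h2>Sales <em>2018</em></h2><h2>Costs</h2>'
);
// outline: [ { level: 1, text: 'Summary' },
//            { level: 2, text: 'Sales 2018' },
//            { level: 2, text: 'Costs' } ]
```

Running something like this over a template is a quick way to see whether the heading levels form the outline you expect before generating the PDF.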

In reality, though, adding --toc has no effect at all, as if the parameter were not there, while other parameters do work. Thanks to a post on some forum, I found that this parameter must be passed without hyphens: toc. In that case the table of contents is indeed added as the very first page. Clicking a line in it takes you to the corresponding page of the document, and the page numbers are correct.

I have not yet looked into how to customize the styles of the table of contents.
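
The difference in spelling is easy to miss, so here is a small sketch that assembles such an argument string. The helper buildSwitches and its option names are hypothetical, and placing toc after the other switches is my assumption rather than something the article states.

```javascript
// Hypothetical helper that assembles a string for CustomSwitches.
function buildSwitches(options) {
  var parts = [];
  if (options.footerCenter) {
    parts.push('--footer-center', options.footerCenter);
  }
  if (options.footerFontSize) {
    parts.push('--footer-font-size', String(options.footerFontSize));
  }
  if (options.toc) {
    // "toc" is written without leading dashes, unlike the other switches.
    parts.push('toc');
  }
  return parts.join(' ');
}

var switches = buildSwitches({ footerCenter: '[page]', footerFontSize: 8, toc: true });
// switches === '--footer-center [page] --footer-font-size 8 toc'
```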

JavaScript Execution

The next point I encountered was the need to add charts to the report. My HTML page contains JS code that draws charts using the dc.js library. Here is an example:

    function initChart() {
        renderChart(@Html.Raw(Json.Encode(Model.Chart_1_Data)), 'chartDiv_1');
    }

    function renderChart(data, chartElementId) {
        var colors = ['#03a9f4', '#67daff', '#8bc34a'];
        var barHeight = 45;
        var clientHeight = data.length * barHeight + 50;
        // Take the width from the chart's own container element.
        var clientWidth = document.getElementById(chartElementId).offsetWidth;
        var chart = dc.rowChart('#' + chartElementId);
        var ndx = crossfilter(data);
        var dimension = ndx.dimension(d => d.name);
        var group = dimension.group().reduceSum(d => d.value);

        chart
            .width(clientWidth)
            .height(clientHeight)
            .margins({ top: 16, right: 16, bottom: 16, left: 16 })
            .ordinalColors(colors)
            .dimension(dimension)
            .group(group)
            .xAxis()
            .scale(d3.scaleLinear().domain([0, 2]).range([1, 3]).nice());

        chart.render();
    }

The HTML contains the corresponding element:

    <div id="chartDiv_1" class="dc-chart"></div>

For this code to work, you must import the appropriate libraries: dc.js, d3.js, crossfilter.js. Calling the initChart function creates a chart and inserts the resulting svg into the specified element of the tree.

But the PDF contains no trace of the charts, nor any other trace of JavaScript having run before the PDF was rendered. This is quite easy to check: just add elementary code that creates a simple <div> element with text, purely to test whether JavaScript was invoked.

Experiments showed that the location of the JS code matters a great deal to wkhtmltopdf. JS code placed at the end of the <html> or, say, at the end of the <body> is not executed. It seems that the utility simply does not notice it, or does not expect to find it there.

Code inside the <head>, however, is executed. So I arrived at a scheme in which the JavaScript code is placed after the style declarations inside the <head> and is invoked by the usual construction:
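
A check of this kind can be as small as the sketch below. The function name addJsMarker is hypothetical, and the document object is passed in as a parameter only so that the function can also be exercised outside a browser. It deliberately uses only old-style syntax.

```javascript
// Appends a visible marker so the PDF shows whether scripts ran at all.
// addJsMarker is a hypothetical name; doc is the page's document object.
function addJsMarker(doc) {
  var marker = doc.createElement('div');
  marker.appendChild(doc.createTextNode('JavaScript was executed'));
  doc.body.appendChild(marker);
  return marker;
}

// In the report page this would be called as: addJsMarker(document);
```

If the text shows up in the HTML version but not in the PDF, the script was not executed by the converter.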

    <body onload="initChart()">

In this case, the code will be executed as expected.

JavaScript Limitations

But there were still no charts in the output PDF. I began to suspect that, not being a full-fledged browser, the rendering and script engine behind the PDF conversion is far from perfect and does not understand newer language features. Again by experiment, I found that arrow functions are not recognized. And when the interpreter meets something it does not know, it simply stops executing.

Replacing arrow functions of the form x => x.value with the more classical function(x) { return x.value; } helped: all the code was executed and the resulting chart made it into the PDF file.
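
As a small self-contained illustration of that rewrite, the sketch below uses only function expressions and a plain loop. The helper sumBy and the sample rows are hypothetical and stand in for the name/value accessors used in the chart code.

```javascript
// Arrow form that stopped the script:  d => d.name  and  d => d.value
// Classical form that the converter's engine accepted:
var getName = function (d) { return d.name; };
var getValue = function (d) { return d.value; };

// Hypothetical helper: sums values per key using only old-style syntax.
function sumBy(rows, keyOf, amountOf) {
  var totals = {};
  for (var i = 0; i < rows.length; i++) {
    var key = keyOf(rows[i]);
    totals[key] = (totals[key] || 0) + amountOf(rows[i]);
  }
  return totals;
}

var totals = sumBy(
  [{ name: 'A', value: 1 }, { name: 'B', value: 2 }, { name: 'A', value: 3 }],
  getName,
  getValue
);
// totals: { A: 4, B: 2 }
```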

Chart Width

Experiments showed that the width of the chart's parent element must be set explicitly. For this I defined the dc-chart style, which contains the chart width in pixels. Otherwise the chart in the PDF is very small, even though in the HTML version it occupies the full width. A width given in percent works only for HTML.
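
The fix described here lives in CSS. As a complementary, purely hypothetical safeguard on the script side, one could fall back to a fixed pixel width when the measured width is missing or zero. This assumes the small chart is caused by the measured width, which the article does not confirm; resolveChartWidth and the numbers are illustrative only.

```javascript
// Hypothetical fallback: use the measured width when there is one,
// otherwise a fixed pixel width chosen for the PDF page.
function resolveChartWidth(element, fallbackPx) {
  var measured = element && element.offsetWidth;
  return measured > 0 ? measured : fallbackPx;
}

var pdfWidth = resolveChartWidth({ offsetWidth: 0 }, 700);
// pdfWidth === 700
```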

Inline JavaScript / CSS

Finally, I would like to point out that many HTML-to-PDF conversion libraries accept a baseUrl parameter. This is the URL against which the converter resolves relative paths to imported CSS styles, JavaScript files or fonts. I cannot say exactly how this works in Rotativa, but I took a different approach.

To speed up the initial load of the report and remove the very source of problems with script files or styles during conversion, I embed the necessary JS and CSS directly into the HTML template.

To do this, create the appropriate bundles:

    public class BundleConfig
    {
        public static void RegisterBundles(BundleCollection bundles)
        {
            bundles.Add(new StyleBundle("~/Styles/report-html")
                .Include("~/Styles/report-common.css")
                .Include("~/Styles/report-html.css")
            );

            bundles.Add(new StyleBundle("~/Styles/report-pdf")
                .Include("~/Styles/report-common.css")
                .Include("~/Styles/report-pdf.css")
            );

            bundles.Add(new ScriptBundle("~/Scripts/charts")
                .Include("~/Scripts/d3/d3.js")
                .Include("~/Scripts/crossfilter/crossfilter.js")
                .Include("~/Scripts/dc/dc.js")
            );
        }
    }

Add a call that registers these bundles in Global.asax.cs:

    protected void Application_Start()
    {
        ...
        BundleConfig.RegisterBundles(BundleTable.Bundles);
    }

And add the appropriate method to embed the code into the page. It must be placed in the same namespace as Global.asax.cs so that the method can be called from the HTML template:

    using System.Linq;
    using System.Web;
    using System.Web.Mvc;
    using System.Web.Optimization;

    public static class HtmlHelperExtensions
    {
        // Renders the whole style bundle inline inside a <style> tag.
        public static IHtmlString InlineStyles(this HtmlHelper htmlHelper, string bundleVirtualPath)
        {
            string bundleContent = LoadBundleContent(htmlHelper.ViewContext.HttpContext, bundleVirtualPath);
            string htmlTag = $"<style type=\"text/css\">{bundleContent}</style>";
            return new HtmlString(htmlTag);
        }

        // Renders the whole script bundle inline inside a <script> tag.
        public static IHtmlString InlineScripts(this HtmlHelper htmlHelper, string bundleVirtualPath)
        {
            string bundleContent = LoadBundleContent(htmlHelper.ViewContext.HttpContext, bundleVirtualPath);
            string htmlTag = $"<script type=\"text/javascript\">{bundleContent}</script>";
            return new HtmlString(htmlTag);
        }

        // Finds the registered bundle by its virtual path and returns its combined content.
        private static string LoadBundleContent(HttpContextBase httpContext, string bundleVirtualPath)
        {
            var bundleContext = new BundleContext(httpContext, BundleTable.Bundles, bundleVirtualPath);
            var bundle = BundleTable.Bundles.Single(b => b.Path == bundleVirtualPath);
            var bundleResponse = bundle.GenerateBundleResponse(bundleContext);
            return bundleResponse.Content;
        }
    }

Well, the final touch is a call from the template:

    @Html.InlineStyles("~/Styles/report-pdf")
    @Html.InlineScripts("~/Scripts/charts")

As a result, all the necessary CSS and JavaScript ends up directly in the HTML, while during development you can still work with individual files.

Most likely, many will immediately think about the inefficiency of this approach in terms of browser caching. But I had two specific goals:

the PDF converter should not have to request styles or scripts from elsewhere, with the user waiting while it does;
the first load of both the PDF and the HTML report should take as little time as possible, without waiting for several additional requests. In the context of my project, this is important.

Page breaks

Structuring a report into sections may come with a requirement to start each new section on a new page. In this case a simple CSS approach works well:

    .page-break-before {
        page-break-before: always;
    }

    .no-page-break-inside {
        page-break-before: auto;
        page-break-inside: avoid;
    }

The wkhtmltopdf utility honors these rules and starts a new page where required. The first class, page-break-before, tells the utility to always start a new page at this element. The second class, no-page-break-inside, should be applied to elements that you want to keep whole on one page wherever possible, for example successive blocks of structured information or tables. If two blocks fit on the page, they are placed together. If a third no longer fits, it moves to the next page. If a block is larger than a page, splitting it is unavoidable. All of this works sensibly and conveniently.
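
The placement rule can be pictured with a deliberately simplified model. This is not wkhtmltopdf's actual layout algorithm; the function paginate, the block heights and the page height are all hypothetical.

```javascript
// Simplified model of "page-break-inside: avoid": blocks are placed in order,
// and a block that does not fit into the remaining space starts a new page.
// paginate, the heights and the page height are hypothetical.
function paginate(blockHeights, pageHeight) {
  var pages = [[]];
  var used = 0;
  for (var i = 0; i < blockHeights.length; i++) {
    var height = blockHeights[i];
    if (used > 0 && used + height > pageHeight) {
      pages.push([]);
      used = 0;
    }
    pages[pages.length - 1].push(i);
    // A block taller than a page has to be split anyway; the model simply
    // treats the page it starts on as full.
    used = Math.min(pageHeight, used + height);
  }
  return pages;
}

var layout = paginate([400, 400, 400], 1000);
// layout: [[0, 1], [2]] (two blocks share the first page, the third moves on)
```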

Flex Behavior in wkhtmltopdf

Well, the last feature I noticed concerns flexbox styles. We are all used to them, and almost all layout is done with flex. However, wkhtmltopdf lags somewhat behind here. Horizontal flex layouts do not work (at least they did not in my case). I read online that the flex styles should be duplicated as follows:

    display: -webkit-flex;
    display: flex;
    flex-direction: row;
    -webkit-flex-direction: row;
    -webkit-box-pack: justify; /* wkhtmltopdf uses this one */
    -webkit-justify-content: space-between;
    justify-content: space-between;

Unfortunately, this did not produce the expected layout in the PDF. I had to redo the layout of some elements so that blocks were placed horizontally as required. If anyone has successfully used flex with wkhtmltopdf, please share your experience. That would be quite helpful.

Source: https://habr.com/ru/post/425511/