
Documentation Development with DocBook



It so happened that in our projects, maintaining the technical documentation falls entirely on the developers, following the principle: if you changed the project code, update the documentation. The documentation itself was a collection of Word documents stored alongside the source code under VCS. This approach had been in place for a long time, but a couple of years ago we decided to look into maintaining project documentation with something other than MS Office.


There were several reasons for this:


All these problems compounded one another and turned the already unloved process of updating documentation into an unbearably harsh punishment. It could happen that after a few hours of writing, with a sense of duty fulfilled, you tried to commit your changes to SVN and got the sad news that someone had been faster than you, or that you had simply forgotten to update before starting work. Either way, the smoke break had to be postponed. Besides the text, you had to watch the formatting styles, which "broke" with enviable regularity (for example, list numbering restarted from the beginning where it should have continued). Not all of these breakages were easy to fix; at times it seemed that Word had a life of its own and did not care about your formatting wishes.

Thus, our alternative to MS Word had to satisfy the following criteria:
  1. A text-based storage format, so that documents work well with VCS.
  2. Support for rich document design and styling.
  3. The ability to split the final document into fragments for reuse.
  4. The ability to publish the final document in various formats.


After a long search, we realized that few solutions met our requirements: DITA and DocBook. DITA immediately seemed too "powerful" and hard to adopt, so we settled on DocBook. Generally speaking, the search was very gradual: more than a day passed between realizing that "we can't go on living like this" and the full transition to DocBook, and we ran many experiments with what we had at the time. First, we tried storing documents in WordML format, which partly solved the problem of merging changes: a merge no longer always ended in a conflict, but resolving conflicts in the markup by hand was very uncomfortable. We also tried splitting the documents into fragments to reduce the chance of conflicting changes and to try reusing them. That idea was not very successful. So gradually, through trial and error, we decided to switch to DocBook completely, since in our opinion it should have eliminated all our problems.

What is DocBook?


In case anyone does not know, DocBook is a standard for describing a document, and it does nothing beyond standardizing content. Moreover, the standard is quite old, and many people for some reason consider it obsolete.

Writing a document in DocBook format is very similar to working with HTML, only with its own set of tags and its own rules for using them.
<book>
  <chapter>
    <title>First Chapter</title>
    <para>Hello world!</para>
  </chapter>
</book>


This example describes a one-chapter book whose chapter has the title "First Chapter" and contains a paragraph with the text "Hello world!". A complete list of tags, with examples of their use, can be found on the project website www.docbook.org. I would add that the set of tags for describing content is very (very, very) large, but in our daily work we use about 20.
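
The criterion of splitting a document into reusable fragments can be illustrated with a small sketch. The source does not say which mechanism the team used; XInclude is assumed here, together with DocBook 5 namespaces and an XInclude-aware parser. The file names chapter-intro.xml and chapter-install.xml and the book title are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- book.xml: root document that assembles chapters kept in separate files -->
<!-- Assumption: DocBook 5 namespace; file names below are hypothetical -->
<book xmlns="http://docbook.org/ns/docbook"
      xmlns:xi="http://www.w3.org/2001/XInclude"
      version="5.0">
  <title>Project Documentation</title>

  <!-- Each chapter lives in its own file, so two authors rarely edit the same file -->
  <xi:include href="chapter-intro.xml"/>
  <xi:include href="chapter-install.xml"/>
</book>
```

With this layout, the same chapter file can be included in several books, and the inclusions have to be resolved by the parser or the XSLT processor before the styles are applied.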

Converting a DocBook Document



To turn a DocBook document into a readable or printable format, you need a transformer (or even a pipeline of several transformers run one after another) that produces the final document from the document's content and, usually, a set of styles.



As a rule, DocBook-xsl is used for the transformation (although there are more exotic ways). Out of the box it already supports several output formats: html, xsl-fo, manpages, etc. If a different presentation format is required, the chain of conversions can be continued. For example, to obtain a PDF, the following scheme is usually used: the XSLT processor applies the DocBook-xsl styles to the document to produce xsl-fo, and an fo processor then renders the xsl-fo into PDF.



This is where the most interesting part begins. The default styles in DocBook-xsl produce a decent-looking document, but they usually still need customization.

The developers of docbook-xsl styles took care of this possibility and implemented special mechanisms for this:



Most often, to control how the document is built, we write our own root XSL stylesheet, the so-called "driver", in which all the other transformation parameters are fine-tuned. Since each output format in DocBook-xsl is represented by its own set of templates, a separate "driver" has to be written for each of them. For example, we use two output formats (xsl-fo and htmlhelp) and accordingly have two "drivers" and two sets of overridden styles.
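
A "driver" of this kind might look roughly like the sketch below. It is not the author's actual stylesheet: the file name fo-driver.xsl and the chosen values are hypothetical, and the import URI assumes the stock docbook-xsl distribution is reachable at its canonical address or through an XML catalog. The parameters and the attribute set used here are standard docbook-xsl names.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- fo-driver.xsl: hypothetical customization layer for the xsl-fo output -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <!-- Import the stock FO styles; adjust the path to your docbook-xsl install -->
  <xsl:import href="http://docbook.sourceforge.net/release/xsl/current/fo/docbook.xsl"/>

  <!-- Parameters declared here take precedence over the imported defaults -->
  <xsl:param name="paper.type">A4</xsl:param>
  <xsl:param name="section.autolabel" select="1"/>
  <xsl:param name="toc.section.depth" select="2"/>

  <!-- Attribute sets adjust formatting properties of specific elements -->
  <xsl:attribute-set name="section.title.level1.properties">
    <xsl:attribute name="font-size">16pt</xsl:attribute>
  </xsl:attribute-set>

</xsl:stylesheet>
```

Because xsl:import gives the importing stylesheet higher precedence, anything declared in the driver wins over the stock definition. A driver for htmlhelp would follow the same pattern but import the htmlhelp styles instead of the fo ones.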

Choosing an XSLT and FO processor


To work with DocBook-xsl, you need an XSLT processor that supports XSLT 1.0. (There is a docbook-xsl implementation for version 2.0, but I don't know how stable it is.) There are now many working solutions for a variety of platforms, so this should not be a problem. In our projects we use Saxon, albeit an old version, Saxon 9.1.0.8J, because support for EXSLT extensions (needed for document profiling) was completely removed from the latest free versions, and we were not sure that the Saxon extension for syntax highlighting that ships with the styles would work in a newer version.

To produce documents from xsl-fo, you need an fo processor. Things are a little worse here: of the working processors, I have personally tried two, FOP (open source) and XEP (RenderX XEP Engine, commercial). There are a few other working fo processors, but I have not tried them and cannot say anything about them.

The main advantage of FOP is that it is free, but there is a downside: out of the box it does not support Russian. When we first tried it, we never managed to get it to work with Cyrillic. Oddly enough, there are many articles on the Internet about this, but all of them were either very old (suggesting that FOP be rebuilt with the necessary fonts) or contained errors that prevented getting the desired result. In the end, everything turned out to be very simple, but by then we had already chosen XEP. XEP works fine with Cyrillic right after installation and in principle needs no additional configuration, but it costs $400, and that is for the desktop version. I won't try to judge the difference in rendering quality, but you can compare for yourself (the example includes files built by both fo processors).
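
The text does not say which fix finally worked for FOP. One approach available in reasonably recent FOP versions, shown here only as an assumption, is to let FOP pick up the fonts installed on the system through its configuration file, so that a Cyrillic-capable font named in the styles can be found. The directory path below is hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- fop.xconf: hypothetical FOP configuration that registers system fonts -->
<fop version="1.0">
  <renderers>
    <renderer mime="application/pdf">
      <fonts>
        <!-- Scan the operating system's font folders -->
        <auto-detect/>
        <!-- Optionally add a project-specific font folder (hypothetical path) -->
        <directory recursive="true">fonts</directory>
      </fonts>
    </renderer>
  </renderers>
</fop>
```

The configuration only makes fonts available; the styles still have to name a font that actually contains Cyrillic glyphs.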

Customizing the style


To customize styles well, you need some knowledge of the XSL transformation language as well as the markup language of the output document. Unfortunately, our team had no such expertise when we moved to DocBook, so the setup took us quite a while, especially for the FO format. Although there are many websites on this subject (in my opinion, "DocBook XSL: The Complete Guide" is especially valuable), it is very hard to get the full picture right away. So I decided to follow the principle "better to see once than hear a hundred times" and prepared an example style for xsl-fo (about the same as the one we use in our projects), along with the source text of this article and a configured FOP.

The one point I want to dwell on, which can be confusing at first, is setting up the fonts and the document language. By default, the xsl-fo styles use fonts that do not support Cyrillic, and if you do not override these parameters, or make a mistake in them (you must make sure the fo processor is configured to work with the specified fonts), you will most likely get an unreadable document out of the fo processor. The document language affects the generated text for the names of book elements (Chapter, Book, etc.). In principle, setting just these parameters is already enough to get a "correct" document. You will most likely also want to change the look of the title page. This can be done with a template specially prepared in docbook-xsl. To do this, you define your own version of the file "/fo/titlepage.templates.xml" and use the XSLT processor and the template "/fo/titlepage.templates.xsl" to generate a customized style to include in the "driver". In some cases the built-in configuration mechanisms are not enough, and then you have to explicitly override the original docbook-xsl styles in the driver.
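
The font and language settings described above might be expressed in the xsl-fo "driver" roughly as follows. This is a sketch under assumptions: the DejaVu font names are only an example of fonts with Cyrillic coverage and must match what your fo processor can actually find, and the import path depends on your docbook-xsl installation. The parameter names themselves are standard docbook-xsl parameters.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Fragment of a hypothetical xsl-fo driver: fonts and generated-text language -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <xsl:import href="http://docbook.sourceforge.net/release/xsl/current/fo/docbook.xsl"/>

  <!-- Fonts with Cyrillic glyphs (assumed to be installed and visible to the fo processor) -->
  <xsl:param name="body.font.family">DejaVu Serif</xsl:param>
  <xsl:param name="title.font.family">DejaVu Sans</xsl:param>
  <xsl:param name="monospace.font.family">DejaVu Sans Mono</xsl:param>

  <!-- Language for generated text such as "Chapter" when the document sets none -->
  <xsl:param name="l10n.gentext.default.language">ru</xsl:param>

</xsl:stylesheet>
```

The language can also be set in the document itself through the lang attribute on the root element, in which case the default-language parameter is not consulted.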

Conclusion



Completing the move to DocBook took us quite a long time. First, the existing documentation had to be converted. We tried various utilities such as AntiWord, but because of the large number of artifacts we decided to do it by hand (the artifacts came both from formatting errors in the source documents and from quirks of the conversion scripts). We also spent a lot of time developing our own styles, looking for an editing environment (we eventually settled on Notepad++), and configuring it. The task seemed simple, but we kept running into problems while carrying it out. Unfortunately, there is not much information on DocBook, and in the Russian-speaking segment there is practically none. But in the end we were satisfied.

More than a year has passed since our team moved to keeping technical documentation in DocBook, and we no longer see any other option for ourselves. Everything we wanted to achieve by moving to DocBook, we achieved:



Naturally, besides the advantages, there are also disadvantages:



The purpose of this article is to show readers who write technical documentation in office programs that more suitable tools exist, and to give those who have been eyeing DocBook or DITA for a while some impetus and tips for the transition. After all, the hardest part is getting started! It would also be very interesting to hear what approaches other teams take and about their experience implementing them.

Bibliography:


Example:

Source: https://habr.com/ru/post/212881/

