When creating chthonic monsters, document them

Anyone with even a remote connection to programming would put their signature under this catchphrase, borrowed from a wonderful picture by Vladimir Filonov. The only question is how. How exactly should something be documented?

The following text has several goals:

Give a brief overview of (read: grumble a little about) the poor state of the documentation tooling available for the chthonic monsters of the C/C++ world;
Offer my own alternative solution (free, no SMS, no registration: the project is non-commercial and available on GitHub under the MIT license);
Encourage the community to discuss it and share ideas;
Invite readers to join the development of the project on GitHub.

Let me say up front that although the project was created primarily as an alternative to, or rather a complement to, Doxygen for C and C++ APIs, it is equally suitable for other languages. This makes it possible to build documentation portals for diverse libraries: the libraries themselves may be written in different languages, while the documentation keeps a unified look and behavior.

Motivation

By and large, there are exactly two approaches to documenting APIs and libraries, C++ or otherwise.

The first is to write everything by hand.

It doesn't even matter where: in Help & Manual, RoboHelp, Word or some other editor. Although this traditional approach is familiar to everyone and still widely used, I am deeply convinced that it is fundamentally wrong. The trouble is that it produces documentation that is perpetually out of date and lags behind the thing it documents. Keeping the documentation, which is written separately and often by different people, consistent with a library API that is constantly evolving (only dead or frozen products do not evolve!) is a huge task, only slightly easier than writing the documentation in the first place.

The second, the "right" approach, is to generate the documentation automatically from the source code.

A specially trained parser walks through the source code, picks out specially formatted comments containing documentation, and builds a tree of the public API declarations. The documentation is then generated in the desired format; I suspect that, like me, most people are primarily interested in HTML and PDF. The main advantage of this approach is guaranteed consistency between the declarations in the API source code and those in the final documentation. Even if the source code contains no meaningful comments with actual "documentation" at all, we still end up with an excellent snapshot of the library's API, with the ability to "jump" between declarations, type descriptions, and so on.
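The extraction step can be illustrated with a deliberately naive sketch. It is not how Doxygen works internally (a real tool uses a proper parser); the regular expression, the `extract_docs` helper and the sample source are all assumptions made up for this illustration.

```python
import re
from typing import List, Tuple

# Matches a Doxygen-style block comment (/*! ... */) followed by the first
# non-blank line after it, which this sketch treats as the declaration.
DOC_COMMENT = re.compile(r"/\*!(.*?)\*/\s*([^\n]+)", re.DOTALL)


def extract_docs(source: str) -> List[Tuple[str, str]]:
    """Return (declaration, documentation) pairs found in the source text."""
    pairs = []
    for match in DOC_COMMENT.finditer(source):
        doc = " ".join(match.group(1).split())  # collapse all whitespace
        declaration = match.group(2).strip()
        pairs.append((declaration, doc))
    return pairs


# Hypothetical C source used only for this example.
SAMPLE = """
/*! \\brief Adds two integers. */
int add(int a, int b);

// An ordinary comment: ignored by the extractor.
int undocumented(void);

/*! \\brief A 2D point. */
struct Point { int x; int y; };
"""

docs = extract_docs(SAMPLE)
for declaration, doc in docs:
    print(declaration, "<-", doc)
```

Only declarations preceded by a `/*! ... */` comment are picked up here. A real generator also records undocumented declarations, which is what makes the "snapshot of the API" possible even without comments.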

So, with your permission, I will concentrate on the "right" approach with automatic generation. What options do we have here? Alas, for documenting C/C++ there are currently depressingly few tools in actual use: Doxygen and QDoc. And even with these two, not everything goes smoothly.

Doxygen is the first truly successful project for extracting comments from C++ code and turning them into HTML documentation with hyperlinks, inheritance diagrams, call graphs, and so on. Unlike its direct ancestor, the pioneering Doc++, which never became widespread, Doxygen is now the de facto standard for documenting C/C++ code. And all of this would be great if not for two "buts":

The standard HTML generated by Doxygen is, to put it mildly, not burdened with elegance.

Of course, there is room for subjectivity here. I fully admit that there are less picky people in the world who are completely satisfied with Doxygen's output (I would venture to guess, however, that there are no professional designers among them). But even if Doxygen's default HTML suits someone visually (seriously, does anyone actually like it aesthetically? Write in the comments!), you very often want to change and customize something that goes beyond tweaking CSS: for example, put declarations in <pre> and arrange indentation and spaces according to the coding style adopted in that particular library. This brings us to the second, more fundamental problem with Doxygen:

Over its long life, Doxygen has never grown real, modular customization.

Yes, there is a Doxyfile with a pile of variables, and it is possible to change the HTML headers and CSS, but architecturally everything is hard-wired into a monolithic C++ core! That applies both to the front-end, namely the source parsers, and to the back-end: the HTML, PDF, RTF and other generators (among which, thank heaven, there is also XML).

QDoc produces much, much prettier HTML by default than Doxygen. Unfortunately, as soon as you need something other than the defaults, QDoc suffers from the same innate rigidity as Doxygen (which stems, of course, from the same hard-wiring of both the parser and the generator into a monolithic C++ core). On top of that rigidity, QDoc, unlike Doxygen, has only one input parser: for the Qt dialect of C++ (with Q_OBJECT, Q_PROPERTY, foreach, etc. strictly interpreted as keywords). And on top of all that, which is simply beyond the pale, it cannot generate PDF!

Alternative

The proposal is to replace a single tool with a pipeline. Instead of

 Doxygen -> (HTML, PDF, ...)

... we will use the following pipeline:

 Doxygen -> (XML) -> Doxyrest -> (reStructuredText) -> Sphinx -> (HTML, PDF, ...)

What do we keep from the old setup?

Developers are already used to documenting C/C++ code with Doxygen comments:

 /*!
     \brief This is a brief documentation for class Foo.

     This is a detailed documentation for class Foo.
     ...
 */

 class Foo
 {
     // ...
 };

Why invent a new syntax? We will keep writing documentation the same way as before!

Doxygen can extract documentation from the sources and put it, together with the tree of declarations, into an XML database. Excellent! This will be our front-end.
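To give a feel for what consuming that XML database looks like, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The XML fragment is hand-written to imitate the general shape of Doxygen's output (`compounddef`, `compoundname`, `briefdescription`, `detaileddescription`). Real Doxygen files contain far more detail, and the `read_compounds` helper is an assumption for this example, not part of any tool.

```python
import xml.etree.ElementTree as ET

# Hand-written fragment imitating the shape of Doxygen's XML output.
SAMPLE_XML = """
<doxygen>
  <compounddef id="classFoo" kind="class">
    <compoundname>Foo</compoundname>
    <briefdescription>
      <para>This is a brief documentation for class Foo.</para>
    </briefdescription>
    <detaileddescription>
      <para>This is a detailed documentation for class Foo.</para>
    </detaileddescription>
  </compounddef>
</doxygen>
"""


def text_of(element):
    """Join all text inside an element, collapsing whitespace."""
    if element is None:
        return ""
    return " ".join("".join(element.itertext()).split())


def read_compounds(xml_text):
    """Collect kind, name and descriptions of every compound in the XML."""
    root = ET.fromstring(xml_text.strip())
    compounds = []
    for compound in root.iter("compounddef"):
        compounds.append({
            "kind": compound.get("kind"),
            "name": text_of(compound.find("compoundname")),
            "brief": text_of(compound.find("briefdescription")),
            "details": text_of(compound.find("detaileddescription")),
        })
    return compounds


compounds = read_compounds(SAMPLE_XML)
print(compounds)
```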

Answering the question of what to use as the back-end is even easier: Sphinx, of course. Sphinx has deservedly become hugely popular as a tool for writing technical documentation. It produces very attractive HTML with support for full-blown themes (not just CSS!), can glue everything into a single HTML page, and can generate documentation in PDF, EPUB and many other formats, all out of the box! But most importantly, it is fully customizable with Python scripts, which can be used both for tuning the appearance and for extending the input language (for Sphinx, that is reStructuredText), namely to add your own directives and then use them in the documentation.

All that remains is to make Doxygen and Sphinx friends.

Building a bridge

I should note that I am not the first to try to build a bridge between Doxygen and Sphinx. The breathe project, written in Python as a Sphinx extension, has gained some recognition. At the moment the project is not being developed very actively and, alas, is not suitable for serious tasks out of the box. Architecturally, it works as follows: it parses Doxygen's XML output and creates the nodes of the reStructuredText tree directly in memory.

I decided to take a slightly different route. Doxyrest (that is the name of our bridge) parses Doxygen's .xml files and then hands the parsed XML, together with a set of template files, to a template engine (template processor). The template engine generates reStructuredText files, and these .rst files are then passed to the Sphinx back-end to produce the final documentation in the required format.

The key feature is, of course, the use of a template engine. It lets you fully customize the structure of the documentation: change the order and grouping of the documented items (classes, functions, properties, etc.), customize the style of declarations (where and how to use indentation, spaces, line breaks, etc.), use logic of arbitrary complexity to decide whether or not to include a particular item in the documentation, and so on. And all of this without recompilation, just by editing the input templates!
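The kind of customization described above (ordering, grouping, filtering, declaration style) can be sketched as a small function that plays the role of a template. This is only an illustration in Python, not Doxyrest's Lua templates. The item dictionaries, the `internal` flag and the `render_rst` function are hypothetical names invented for the example.

```python
# Hypothetical parsed items; in a real pipeline they would come from the XML.
ITEMS = [
    {"kind": "function", "name": "open",
     "decl": "int open(const char* path);", "internal": False},
    {"kind": "class", "name": "File",
     "decl": "class File;", "internal": False},
    {"kind": "function", "name": "debugDump",
     "decl": "void debugDump();", "internal": True},
]


def render_rst(items, title,
               sections=(("class", "Classes"), ("function", "Functions"))):
    """Render items as reStructuredText, grouped and ordered by `sections`."""
    lines = [title, "=" * len(title), ""]
    for kind, heading in sections:
        # Filtering logic lives here: skip items marked as internal.
        group = sorted(
            (item for item in items
             if item["kind"] == kind and not item["internal"]),
            key=lambda item: item["name"],
        )
        if not group:
            continue
        lines += [heading, "-" * len(heading), ""]
        for item in group:
            # Declaration style is decided by the "template", not a parser.
            lines += [".. code-block:: cpp", "", "    " + item["decl"], ""]
    return "\n".join(lines)


rst = render_rst(ITEMS, "Library Reference")
print(rst)
```

Changing the order of `sections`, the filter condition, or the way a declaration is laid out changes the generated .rst without touching whatever produced the items.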

More importantly, the template approach allows Doxyrest to be used for the vast majority of other languages, and in particular for various DSLs, for which nobody will ever build specialized documentation systems. Doxygen cannot parse your language? Take the language's compiler, add generation of Doxygen-like XML from the AST it already builds, then adjust the templates of the output .rst files so that declarations in the documentation use the correct syntax, and that's it! Your language can now be documented with Doxygen comments and get beautiful Sphinx documentation as output.
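As a rough illustration of "generating Doxygen-like XML from an existing AST", the sketch below serializes a toy AST with `xml.etree.ElementTree`. The AST layout, the `ast_to_doxygen_like_xml` function and the choice of elements are assumptions for this example. The element names follow Doxygen's XML vocabulary, but which elements and attributes Doxyrest actually requires is not shown here and should be checked against the project.

```python
import xml.etree.ElementTree as ET

# A toy AST for a hypothetical DSL: just documented functions in a module.
AST = [
    {"kind": "function", "name": "connect", "doc": "Opens a connection."},
    {"kind": "function", "name": "close", "doc": "Closes the connection."},
]


def ast_to_doxygen_like_xml(module_name, ast):
    """Serialize the toy AST into a Doxygen-like XML string."""
    root = ET.Element("doxygen")
    compound = ET.SubElement(
        root, "compounddef", id=module_name, kind="namespace")
    ET.SubElement(compound, "compoundname").text = module_name
    section = ET.SubElement(compound, "sectiondef", kind="func")
    for node in ast:
        member = ET.SubElement(
            section, "memberdef",
            kind=node["kind"], id=module_name + "_" + node["name"])
        ET.SubElement(member, "name").text = node["name"]
        brief = ET.SubElement(member, "briefdescription")
        ET.SubElement(brief, "para").text = node["doc"]
    return ET.tostring(root, encoding="unicode")


xml_text = ast_to_doxygen_like_xml("net", AST)
print(xml_text)
```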

At the moment, Lua is used as the template language (simply because I already had a ready-made, debugged Lua string template library), but in principle nothing prevents adding support for other template languages.

Templates look and work like this:

  Title
  =====

  %{
  if false then
  }
  This text will be excluded..
  %{
  end -- if

  for i = 1, 3 do
  }
  * List item $i
  %{
  end -- for
  }

The output will be:

  Title
  =====

  * List item 1
  * List item 2
  * List item 3
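One way to picture what a string template engine does with such a template: literal text turns into "emit" calls, the code between `%{` and `}` is kept as control flow, and `$i` is replaced with the value of the variable. The Python function below is a hand-made translation of the template above in that spirit. It is a conceptual sketch only and says nothing about how Doxyrest's Lua template library is implemented (exact whitespace handling, for instance, is an assumption).

```python
def render_template():
    """Hand-translated equivalent of the template shown above."""
    out = []
    emit = out.append

    emit("Title\n=====\n\n")                    # literal text is emitted as-is
    if False:                                   # %{ if false then }
        emit("This text will be excluded..\n")  # never reached
    for i in range(1, 4):                       # %{ for i = 1, 3 do }
        emit("* List item %d\n" % i)            # $i becomes the value of i
    return "".join(out)


result = render_template()
print(result)
```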

Usage examples

Seeing once is better than hearing a hundred times. So, instead of a conclusion, I decided simply to provide links to the results of applying Doxyrest to different languages:

Jancy Standard Library Reference (Jancy language)
Jancy C API Reference (C language)
IO Ninja API Reference (Jancy language)
AXL Library Reference (C++ language)

Although the substantive part of the documentation at the links above (the actual descriptions of classes, functions, etc.) is incomplete, it should be enough to demonstrate that the approach works.

Project page on GitHub: http://github.com/vovkos/doxyrest

The project is released under one of the most permissive licenses in the world, the MIT License. Take a look, try it out, join the development. And I will be happy to answer any questions in the comments.

Source: https://habr.com/ru/post/318564/