
- How do you keep the API help up to date?
- How can I organize and store localized versions?
- Do you check text for invalid characters and valid markup?
- How do you organize checking (proofreading) of topics?
I often hear these and other questions from technical writers at conferences. For a small amount of documentation, it is enough to review the documents manually and update, replace, or correct whatever is needed. But what if the volume of documentation has grown?
Our documentation has grown to more than 154,000 documents for the .NET product line alone, of which about 140,000 are API reference topics. Each major release (that is, twice a year) adds about 8,000 to 10,000 topics. In this article I will tell you how we cope with such volumes.
I will not name any publicly available tools here, because everything we use is in-house applications and services that are deeply integrated into our infrastructure and hardly applicable outside of it. So in this post I will share technical solutions, not tools.
The secret to success is simple:
Store it conveniently
We store all documents in MS SQL Server and built an interface (a CMS) for convenient access to the documents and for editing, checking, and previewing them.
What we got:
- Topics are records in the database, and we attach a lot of useful service information to them:
- the name of the topic's author and the name of the person who last edited the topic.
- creation date, last edit date, revision history.
- various statuses: whether checked by the proofreader, approved by the developer, needs improvement, etc.
- The list of topics can be displayed as a table with all its advantages:
- sorting - you can sort the documents in the desired order, for example, by the creation date.
- grouping - you can group documents, for example, by status, by authorship, etc.

- filtering - you can show only the topics that need attention by filtering out all the others.

- Flexible representation of documents in the database. Here are some of the nicest perks:
- Localization. A database makes it convenient to store and access localized documentation. To control the localization process, add statuses to topics: translated, not translated, verified, etc. To be honest, we do not localize our documentation.
- API structure. A database makes it easy to model a class diagram, an inheritance hierarchy, etc. This information can be used to generate related documents.
- Single sourcing. If the same content (a picture, sample code, text) is used in several places, it can be stored as a separate entity and referenced wherever it is needed. With a database this becomes simple.
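To make the storage idea more concrete, here is a minimal sketch using Python's built-in sqlite3 module instead of MS SQL Server. The table names, column names, statuses, and sample rows are all assumptions made up for illustration; they are not our real schema. The sketch shows topics as records with service information, filtering and grouping by status, and one shared snippet referenced from two topics.

```python
import sqlite3

# In-memory database; the schema below is hypothetical, not the real CMS schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE snippets (
    id   INTEGER PRIMARY KEY,
    body TEXT NOT NULL
);
CREATE TABLE topics (
    id         INTEGER PRIMARY KEY,
    title      TEXT NOT NULL,
    author     TEXT NOT NULL,
    status     TEXT NOT NULL,
    snippet_id INTEGER REFERENCES snippets(id)  -- shared content, stored once
);
""")

# One shared example, referenced from two topics (single sourcing).
conn.execute("INSERT INTO snippets (id, body) VALUES (1, '// shared example code')")
conn.executemany(
    "INSERT INTO topics (title, author, status, snippet_id) VALUES (?, ?, ?, ?)",
    [
        ("Topic A", "alice", "needs improvement", 1),
        ("Topic B", "bob", "proofread", 1),
        ("Topic C", "alice", "empty", None),
    ],
)


def topics_with_status(status):
    """Filtering: return titles of topics that have the given status."""
    rows = conn.execute(
        "SELECT title FROM topics WHERE status = ? ORDER BY title", (status,)
    )
    return [title for (title,) in rows]


def count_by_status():
    """Grouping: return a mapping of status to the number of topics."""
    rows = conn.execute("SELECT status, COUNT(*) FROM topics GROUP BY status")
    return dict(rows.fetchall())


def topics_using_snippet(snippet_id):
    """Single sourcing: return titles of topics that reference a snippet."""
    rows = conn.execute(
        "SELECT title FROM topics WHERE snippet_id = ? ORDER BY title",
        (snippet_id,),
    )
    return [title for (title,) in rows]
```

Editing the single row in the snippets table would update the example in every topic that references it, which is the whole point of storing shared content as a separate entity.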
Automate it!
Autogenerate documents from built libraries.
There are wonderful tools that convert documentation comments in the code into ready-made topics: JSDoc, JavaDoc, Doxygen, Sandcastle, and thousands of others.
Our API is described by technical writers in the database, not by developers in the code. Therefore, we do not need to create ready-made topics from comments in the source code; we need to create empty topics in the database.
This task is performed by a special tool, the synchronizer. It works like this:
- takes the built DLLs and uses reflection to extract the signatures of all namespaces, classes, etc.
- compares these signatures with those in the database.
- adds what is missing and removes what is no longer needed: for example, if a class has a new method, the synchronizer adds an empty topic for this method to the database with the appropriate statuses.
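The compare-and-update step can be sketched as a set difference. The code below is only an illustration in Python, not our synchronizer: it assumes the signatures have already been extracted from the assemblies and are passed in as plain strings, and the function names, the topic layout, and the "empty" status value are hypothetical.

```python
def plan_sync(assembly_signatures, database_signatures):
    """Compare signatures from the built libraries with those in the database.

    Returns which topics must be added and which must be removed.
    """
    built = set(assembly_signatures)
    stored = set(database_signatures)
    return {
        "add": sorted(built - stored),     # new API elements without a topic
        "remove": sorted(stored - built),  # topics for API elements that are gone
    }


def apply_sync(topics, plan):
    """Apply the plan to a dict that maps a signature to its topic record."""
    for signature in plan["add"]:
        # A new API element gets an empty topic for the writer to fill in.
        topics[signature] = {"text": "", "status": "empty"}
    for signature in plan["remove"]:
        topics.pop(signature, None)
    return topics
```

For example, if the database holds topics for `Grid` and `Grid.OldMethod()` and the new build exposes `Grid` and `Grid.NewMethod()` (made-up signatures), the plan adds an empty topic for `Grid.NewMethod()`, removes the topic for `Grid.OldMethod()`, and leaves the existing `Grid` topic untouched.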
In the CMS, the technical writer filters out all topics except the empty ones and describes the newly added classes, methods, properties, etc.
Automatically fill in content where possible.
The synchronizer creates an empty topic for the new API element and fills in all the related information. Take, for example, this document:
ASPxGridView.StartRowEditing Event.
In yellow I highlighted the information that the technical writer fills in directly for this topic. The section with the example code is highlighted separately (in orange): the writer only provides a link to the example in the appropriate field, and the entire contents of the example are pulled into the document.

The rest is automatically generated:
- The namespace of the current class and the library that contains this class are set automatically.
- The declaration syntax in C# and VB.NET is composed automatically from the service information.
- Additional information about the event is also pulled in automatically.
- In addition, a table with the public properties of the class that contains the event data (event args) is inserted automatically.
- As I wrote above, for the example it is enough to provide a link; the entire contents of the example are pulled in automatically. By the way, the same example can be referenced from another topic.
- Links to the corresponding class, class members, and namespace are generated automatically. The technical writer may add a few more links at their discretion.
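As an illustration of composing declaration syntax from service information, here is a small Python sketch for the event case. It assumes the stored metadata contains at least the event name and the handler type; the `EventInfo` class and both functions are hypothetical, and `SampleEventHandler` in the usage note is a placeholder rather than a real type name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EventInfo:
    """Service information about an event (hypothetical layout)."""
    name: str
    handler_type: str


def csharp_declaration(event):
    """Compose the C# declaration of a public event."""
    return f"public event {event.handler_type} {event.name};"


def vb_declaration(event):
    """Compose the VB.NET declaration of a public event."""
    return f"Public Event {event.name} As {event.handler_type}"
```

For `EventInfo("StartRowEditing", "SampleEventHandler")` this produces `public event SampleEventHandler StartRowEditing;` for C# and `Public Event StartRowEditing As SampleEventHandler` for VB.NET. A real generator would also have to handle modifiers, generics, and other member kinds.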
Some topics, such as those containing a list of class members, are generated entirely automatically. Here is the list of members of the ASPxGridView class. Imagine what it would take to maintain this list manually.
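A member list of this kind can be produced from the same records. The sketch below is a simplified Python illustration that renders plain text and assumes each member is stored as a (kind, name) pair; the function name, the input format, and the output layout are assumptions, not our real generator.

```python
from collections import defaultdict


def build_member_list(class_name, members):
    """Render a plain-text member list grouped by member kind.

    `members` is an iterable of (kind, name) pairs, for example
    ("Methods", "Sort"). Kinds and names are sorted alphabetically.
    """
    by_kind = defaultdict(list)
    for kind, name in members:
        by_kind[kind].append(name)

    lines = [f"{class_name} Members"]
    for kind in sorted(by_kind):
        lines.append(f"{kind}:")
        lines.extend(f"  {name}" for name in sorted(by_kind[kind]))
    return "\n".join(lines)
```

Because the list is rebuilt from the database on every build, a member added by the synchronizer appears in it without anyone editing the list topic by hand.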
Testing, continuous integration and code review
We write documents in an XML-like format. In essence, documentation is also a kind of code. You can make mistakes in it: leave a tag unclosed, enter invalid characters, etc.
Users receive documentation in more human-readable formats (HTML on the site, CHM, PDF, MSH), which means the documentation must be built from source. Fixing errors that have accumulated over the entire release preparation period is slow and expensive, so the documentation should be built and tested continuously.
We took the logical approach.
- We wrote tests for the documentation. Why not? You can automatically check the syntax in topic headers, find broken links, verify that all tags are closed, and detect bad words or non-ASCII characters in the text (a Cyrillic "С" instead of a Latin "C"). The tests run on the CI server.

- The CI server also runs a daily build of the documentation installation. If the build fails, we look at the build log, take action, and restart the build.
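Checks like these are easy to sketch. The Python example below is an illustration only, built on the standard xml.etree.ElementTree and re modules: it assumes topics are well-formed XML fragments and that links look like `<link target="..."/>`, which is a made-up element, not our real markup.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical link markup: <link target="TOPIC_ID"/>
LINK_PATTERN = re.compile(r'<link\s+target="([^"]+)"')


def is_well_formed(xml_text):
    """Return True if every tag in the topic is properly closed and nested."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


def non_ascii_characters(text):
    """Return (position, character) pairs for every non-ASCII character."""
    return [(index, char) for index, char in enumerate(text) if ord(char) > 127]


def broken_links(xml_text, known_topic_ids):
    """Return link targets that do not match any known topic ID."""
    targets = set(LINK_PATTERN.findall(xml_text))
    return sorted(target for target in targets if target not in known_topic_ids)
```

Each function can be wrapped in a test that runs over all topics on the CI server. For instance, `non_ascii_characters` catches a Cyrillic "С" typed instead of a Latin "C", which looks identical on screen but breaks search and code samples.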
Content review (our counterpart of code review) is, in simple terms, proofreading and checking. The check is both grammatical and factual.
- Grammar. We write documentation in English, and since we technical writers are not native English speakers, our texts are checked by proofreaders whose mother tongue is English. The proofreaders check the documents in the same CMS in which the technical writers create them.
- Facts. The CMS can preview a topic as an HTML page (exactly as it appears on the site). A link to this page can be sent to a developer, who can read the document and suggest improvements.
Conclusion
I will be happy to answer your questions in the comments to this post. I would also be glad to talk a little about the various organizational and technical issues related to writing documentation and to interacting with developers and users.