In my previous articles I tried to show separate fragments of the Document Generator. As the discussions made clear, some of those fragments already exist in various implementations and are not interesting to discuss on their own. Indeed, why discuss separate building blocks when you cannot see the whole building? So in this article I will try to show the building as a whole rather than its individual bricks. I will describe my vision of how a document generator should be implemented, based on personal experience gained at one of the largest banks in Russia. I started from practice and implemented a generator for MS Word and Excel; this is what came out of it.
1. The generator should be able to generate documents from templates. The resulting documents should be as ready for use as possible and contain a minimum of places that require manual editing after generation. Templates should be developed and maintained by the business itself; IT's part is to ensure that the information required to fill in the templates is available from the databases. As will become clear below, that information consists of field values and answers to questions.
2. The generator should automatically build a UI form from the template, so that information missing from the database can be entered manually. A field value entered once is then substituted in every relevant place of the document(s), agreeing grammatically with the surrounding words (correct case and so on). Manually entered values must be stored in a database and, if necessary, copied to the main database, which preserves the single-entry principle. Some fields should simply be shown on the form as read-only, for visual inspection.
Some fields should be hidden: these are calculated fields and fields that are filled with dashes at a certain stage of preparing a document or contract. The values of dashed fields are determined later, for example while the terms of a contract are being negotiated. Fields are placed on the UI form in a specific order on specific tabs, and each has an appropriate control type and input mask. When the user issues the Generate command, the field values are validated, and in case of inconsistencies the user receives an error or warning message.
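The field behaviour described above can be pictured with a small Python sketch. It is only an illustration under assumptions: the `Field` class, its attribute names, and the use of regular expressions as input masks are hypothetical and are not taken from the actual generator.

```python
import re
from dataclasses import dataclass


@dataclass
class Field:
    """Hypothetical field description, as a template developer might set it."""
    name: str
    tab: str               # UI tab the field belongs to
    order: int             # position of the field within the form
    mask: str = r".+"      # input mask, expressed here as a regular expression
    mode: str = "edit"     # "edit", "readonly" or "hidden"


def build_form(fields):
    """Group visible fields by tab, keeping the order set by the developer."""
    form = {}
    for field in sorted(fields, key=lambda f: f.order):
        if field.mode != "hidden":
            form.setdefault(field.tab, []).append(field.name)
    return form


def validate(fields, values):
    """Check editable fields before generation and return a list of messages."""
    errors = []
    for field in fields:
        if field.mode != "edit":
            continue  # read-only and hidden fields are not typed in by the user
        value = values.get(field.name, "")
        if not value:
            errors.append(f"{field.name}: value is required")
        elif not re.fullmatch(field.mask, value):
            errors.append(f"{field.name}: value {value!r} does not match the input mask")
    return errors
```

Here `build_form` plays the role of the automatically built form, and `validate` stands in for the check performed when the Generate command is issued.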
Template developers, using a special development environment, set the order of the fields and their location on the tabs of the UI form, select input and output formats, mark the fields to be filled with dashes, and so on. IT's role in these processes can be purely consultative. Practice shows that good documentation, training examples, a trained employee, and a friendly development environment together make it possible to free IT almost completely from preparing templates, data entry forms, and questionnaires. This means that the business provides itself with the documents it needs, frees IT from routine, eliminates intermediaries between requirements and implementation, and works on a schedule that suits the business itself. And, of course, the business stops paying for document generation to third-party developers and incompetent contractors who are unfamiliar with the business of the particular organization and its practice.
Developing templates, beyond understanding the requirements for the final document, is hard work that demands attention and diligence. Marking up the templates of a future document is not work for young, ambitious, but impatient IT people; it calls for women's hands, women's diligence, and women's ability to cope with monotonous operations better than men. Supporting and modifying templates is likewise more women's work than men's.
I once tried to mark up a document for the business myself, and to be honest it brought me no joy, although it did give me some good ideas for improving the template preparation process. Since then I have had great respect for the women who, without the benefit of those improvements, patiently and without complaint carried that difficult load.
3. The number of templates should be kept to a minimum, but the templates themselves should not be huge. How many templates there should be, and how large, is for the template developers, that is, the business, to decide. There are no exact rules that determine template size, but there can be recommendations. One of them: the larger the template, the more computing resources the Generator needs to generate documents from it. So there is no need to pack the whole world into one template.
In practice, the size of a template is determined by how many different documents can be generated from it, how similar they are to each other, whether they belong to the same subject area, and so on. Templates of 100 pages or more are not uncommon.
When I first saw such huge templates, I doubted, for all my confidence in the competence of the business, that this was the right choice. But to point out the business's mistakes you have to work out what was done wrong and offer a better option. That requires immersion in business practice, which in turn requires competencies from an IT specialist that have nothing to do with his current skills. So, as the proverb goes, you do not bring your own rules into someone else's monastery. We must trust people: if they say 100 pages, then 100 pages it is.
Clearly, template developers do not think in terms of performance and do not count the cost of computing resources. They care about the result. Nevertheless, they do want documents to be generated as quickly as possible.
Therefore, template markup should let template developers control generation speed, and they do, of course, use such capabilities. The idea behind reducing template size is that a template can be large yet consist of pieces that are loaded into memory only as needed. That is, the markup tells the template compiler how to cut the template into pieces, and at generation time only those pieces are loaded from which the particular document is assembled. From the template developer's point of view the template may be large; from the Generator's point of view it is a small base part plus specific pieces that are inserted back into their original places when needed. Working with templates this way reduces generation time and, accordingly, the computing resources consumed. With this approach, responsibility for generation speed lies with the template developer (who is a business employee).
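A minimal Python sketch of this idea follows. It assumes, purely for illustration, that the compiled base part keeps a `{{piece:NAME}}` marker where each piece was cut out; the marker syntax, `PieceStore`, and `assemble` are hypothetical names, not the generator's real format.

```python
import re

# Hypothetical marker left in the base part where a piece was cut out.
PLACEHOLDER = re.compile(r"\{\{piece:(\w+)\}\}")


class PieceStore:
    """Stands in for the storage that holds the compiled template pieces."""

    def __init__(self, pieces):
        self._pieces = pieces
        self.loaded = []  # names of pieces that were actually loaded

    def load(self, name):
        self.loaded.append(name)
        return self._pieces[name]


def assemble(base, store, wanted):
    """Put only the wanted pieces back into their original places."""

    def expand(match):
        name = match.group(1)
        # Pieces that this document does not need are never loaded.
        return store.load(name) if name in wanted else ""

    return PLACEHOLDER.sub(expand, base)
```

With a base part such as `"Contract.{{piece:pledge}}{{piece:guarantee}} End."`, a call to `assemble(base, store, {"guarantee"})` touches only the `guarantee` piece, which is where the savings in time and memory would come from.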
4. The generator should be able to include clauses, paragraphs, and parts of other documents that are not initially present in the template. The template only marks the place where such material can be inserted. That is, the generator must be able to synthesize a document from separate paragraphs and fragments of other documents. The main difficulty here is fitting heterogeneous external text to the main document. For example, individual contract clauses can be stored in a library and inserted into the main document depending on the terms of the contract. On insertion, the generator has to align fonts and indents, determine the clause numbers, and compute references to other clauses, both of the current document and of another document (for example, the general contract).
All this requires, on the one hand, a certain standardization of how library items are maintained and, on the other hand, a set of markup elements that lets the template creator fine-tune non-standard text fragments by hand, since you can hardly foresee from which sources the material (texts, tables, graphics) to be inserted into the generated document will come. The library of items is maintained and supported by the business. Changing the text of an item changes the text of all documents that include it. In the course of this work it becomes clear which parts of one template always coincide with parts of another. Such parts can be moved from the templates into the library (or into a file folder that holds parts common to different documents), so that when changes come, and today they are not rare, the text is edited in one place rather than repeatedly in different templates. To find the desired fragment quickly, the library must be equipped with a search engine.
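The numbering and cross-reference part of point 4 can be illustrated with a short Python sketch. Everything in it is an assumption made for the example: clauses are modelled as `(key, text)` pairs, the pair `("@library", "")` marks the insertion point, and `{ref:key}` is a hypothetical way to refer to another clause. Font and indent alignment is left out.

```python
import re


def build_document(body, library, selected):
    """Insert the selected library clauses, then number clauses and resolve references."""
    clauses = []
    for key, text in body:
        if key == "@library":
            # The template only marks the place; the content comes from the library.
            clauses.extend((k, library[k]) for k in selected)
        else:
            clauses.append((key, text))

    # Numbers are known only after insertion, so references are resolved last.
    numbers = {key: n for n, (key, _) in enumerate(clauses, start=1)}

    def resolve(text):
        return re.sub(r"\{ref:(\w+)\}", lambda m: str(numbers[m.group(1)]), text)

    return [f"{numbers[key]}. {resolve(text)}" for key, text in clauses]
```

Because the library text lives in one place, editing `library["penalty"]` would change every document built afterwards that selects that clause, which matches the single point of editing described above.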
5. The text formatting available in the finished document should be sufficient for the needs of the business. This means bold, italic, underlined, and strikethrough text, as well as highlighting in a defined set of colors (each highlight color can carry its own meaning). Support for displaying revisions is very important, so that both the original and the revised text can be seen in the document. The user of the document should be able to switch between the final view and the revision view (MS Word). The generator should be able to create the document's table of contents automatically and to create footnotes.
The template developer should also have at hand a sufficient set of special characters that the user of the document interprets unambiguously: all kinds of check marks, checked and unchecked boxes, a pointing hand, and so on. The generator should be able to insert images from other documents or from files (for example, a QR code), create links to websites, and place the individual characters of a word (for example, a code word) into the cells of a table, usually a single-row one. And, of course, there should be a set of markup tags for creating tables of any complexity, with a variety of headers and rows. These are complemented by markup tags that delete (compress) empty rows or columns of a table if they were not filled with data during generation.
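The compression of empty rows and columns can be sketched in a few lines of Python. This is an illustration under assumptions: the table is a rectangular list of rows, and a cell counts as empty when it is `None` or contains only whitespace; `compress_table` is a hypothetical name.

```python
def compress_table(rows, header_rows=1):
    """Drop data rows and columns that received no data during generation."""

    def is_empty(cell):
        return cell is None or str(cell).strip() == ""

    header, data = rows[:header_rows], rows[header_rows:]
    # Remove data rows in which every cell is empty.
    data = [row for row in data if not all(is_empty(cell) for cell in row)]
    # Keep a column only if at least one remaining data row has a value in it.
    width = len(rows[0]) if rows else 0
    keep = [i for i in range(width) if any(not is_empty(row[i]) for row in data)]
    return [[row[i] for i in keep] for row in header + data]
```

Header cells are deliberately ignored when deciding whether a column is empty, since a header is filled in even when no data arrives for that column.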
6. The generator should be able to return the result to the calling program or save the finished document to a file folder. The generator is implemented as several web service methods, one of which generates a document or a batch of documents from a template. The web service can be called from any program or DBMS. If the result is saved to a file, it is very important to have a flexible subsystem, configurable by the business, that defines the specification of the folder and of the document's file name; the name can be "intelligent" and include the date, the client's name, and so on. In the usual practice of working with client documents, a folder is created that holds all the documents related to a transaction, including those not prepared by the generator, and users find it convenient to work with documents through the file tree.
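As a rough Python illustration of such a folder and file name specification, the sketch below fills two format strings from the document's field values. The specification syntax (`str.format` placeholders), the function `target_path`, and the field names are assumptions made for this example only.

```python
import re
from pathlib import PurePosixPath


def target_path(folder_spec, name_spec, values):
    """Build the folder and file name of a document from business-defined specs."""

    def clean(value):
        # Replace characters that are not allowed in file names on common systems.
        return re.sub(r'[\\/:*?"<>|]', "_", str(value)).strip()

    safe = {key: clean(value) for key, value in values.items()}
    return PurePosixPath(folder_spec.format(**safe)) / name_spec.format(**safe)
```

For example, the specs `"deals/{client}/{deal_id}"` and `"{date}_{doc_type}_{client}.docx"` place all documents of one transaction in one folder, which fits the working practice described above.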
When a batch of documents is generated from one template and the documents differ only in their field values, generation time can be reduced significantly: all preparatory operations on the template are performed once, and the whole batch is then generated from the resulting intermediate form of the template. For example, a set of guarantee documents may have the same terms for all guarantors, so the documents differ only in the guarantors' own data: full name, address, position, and so on. In this case it pays to generate the batch from one template and gain time by reusing the intermediate form. The generator should support this functionality as well.
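The gain from an intermediate form can be shown with a toy Python example. It assumes a simplified template in which fields are written as `{name}`; the real preparatory work on a Word template is far heavier, but the structure is the same: parse once, render many times. The function names are hypothetical.

```python
import re

FIELD = re.compile(r"\{(\w+)\}")


def compile_template(text):
    """Parse the template once into (is_field, content) pairs, the intermediate form."""
    parts = FIELD.split(text)
    # With one capturing group, re.split alternates: literal, field, literal, ...
    return [(index % 2 == 1, part) for index, part in enumerate(parts)]


def render(compiled, values):
    """Fill the intermediate form with the field values of one document."""
    return "".join(values[part] if is_field else part for is_field, part in compiled)


def generate_batch(text, records):
    """Prepare the template once, then generate one document per record."""
    compiled = compile_template(text)
    return [render(compiled, record) for record in records]
```

For ten guarantors, `compile_template` runs once and `render` runs ten times, so the cost of preparing the template is not paid again for each document.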
7. Imagine a document, for example a questionnaire, with a section for information about the client's parents. And suppose the client has no parents and never had any. In one place of the questionnaire he ticks the box saying he has no parents, and in another place the information about the parents stays blank. When printed, this form uses extra paper. Indeed, if the client has no parents, the document should contain no section about parents at all. Moments like this, which cause cognitive dissonance, are a dime a dozen in documents. The energy and labor spent on producing the paper are wasted, and so is the time of the person who has to look through a questionnaire with empty fields. This should not happen. It could be excused in the pre-computer era, but now that document generators exist, this waste should stop. All the IT specialist has to do is obtain, in advance, the answer to the question of whether the client has parents and, depending on the answer, either request information about them or exclude the section about them from the document entirely. That is, before the fields of a document are actually filled in, a questionnaire must be completed, and the answers serve as arguments both for the function that returns the list of fields required for the document and for the function that generates the document from a specific template.
The answers to the questions "cut out" of the polymorphic template only those parts that correspond to them, just as Michelangelo carved his beautiful creations out of marble. Compiling a questionnaire is a separate topic, which I will not discuss in detail here so as not to add noise to the presentation of the document generator. I will only say that by marking up a template you can obtain its questionnaire automatically, provided the markup tags unambiguously determine the answer to a question and therefore the question itself. That is, if an answer tag (a unique answer code) belongs to exactly one question, the questionnaire is compiled automatically, and the order of the questions can be borrowed from existing questionnaires, which is not a complicated operation either. Questions and their answers can be static, defined at design time, or dynamic, obtained from data in the database. A question can allow a single answer or multiple answers. Answers can affect not only the set of fields shown on the UI form and the final form of the document, but also the subsequent questions. For example, if you had parents, you can be asked whether they are residents or not. But if there never were any and you are from the Andromeda nebula, the question about residency is meaningless.
So the employee doing the markup must master at least two concepts: placing fields in the template and placing answer tags. Since ordinary people comfortably hold five or six concepts on one topic in their heads, the task of markup does not so far seem too complicated to me. An employee who has mastered these two basic notions of markup can do roughly what a programmer can do after writing "Hello world" in a new programming language. That is not much, of course, but it is already something.
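The two functions mentioned in point 7, one returning the fields required for the given answers and one generating the document, can be sketched in Python as follows. The markup used here is invented for the example: `[[if:CODE]]...[[end]]` wraps a part that is kept only when the answer code `CODE` was given, `{name}` marks a field, and nested blocks are not supported.

```python
import re

# Hypothetical markup: an answer tag wraps the part of the template it controls.
BLOCK = re.compile(r"\[\[if:(\w+)\]\](.*?)\[\[end\]\]", re.S)
FIELD = re.compile(r"\{(\w+)\}")


def cut(template, answers):
    """Keep only the parts whose answer code is among the given answers."""
    return BLOCK.sub(lambda m: m.group(2) if m.group(1) in answers else "", template)


def required_fields(template, answers):
    """List, in order of appearance, the fields needed for these answers."""
    names = []
    for name in FIELD.findall(cut(template, answers)):
        if name not in names:
            names.append(name)
    return names


def generate(template, answers, values):
    """Cut the template according to the answers, then substitute field values."""
    return FIELD.sub(lambda m: values[m.group(1)], cut(template, answers))
```

With the template `"Client: {client}.[[if:has_parents]] Parents: {mother}, {father}.[[end]]"`, a client who answers that he has no parents is asked only for `client`, and the section about parents never reaches the document.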
8. Let us be honest. "Why so many words, so much noise?" A back-office worker who is not too lazy can supply himself with templates. Why would he need a generator when he has his own templates for all typical occasions? If something changes, he will make himself a new template and, with the help of the Replace command, quickly produce the document he needs, for example a standard contract. But here is a situation in which this approach does not work. Or rather, it works, but only after a fashion. Suppose an initial version of a guarantee contract has been drawn up and sent to the client and then to the lawyers for approval. After revisions, some time later, a new document emerges from that initial one, with many clauses of the original version corrected, supplemented, and so on. Now, on the basis of this new version, 10 or more guarantee documents have to be produced, in which the full names, addresses, positions, and other details of the guarantors are substituted while the rest of the document stays the same as in the new version.
Here the back-office employee has no template, because you cannot stock up on templates for every occasion or foresee every variant of a guarantee. This is where the generator can really show its advantage over the usual manual way of preparing documents. To do so, the generator must produce an initial guarantee contract that, after the client and the lawyers have made their changes, can itself act as a template. Preparing 10 or more guarantee documents for different guarantors then takes a few seconds of computer time and is free of the errors and typos that any person can make at any moment. That is, not only is document preparation time saved, but quality improves as well, since the computer does not make such mistakes.
To provide this functionality, the generator should, while generating the initial document, create hidden markup on the fly; it exists in the form of bookmarks and is not visible to people working with the text of the document. Each bookmark corresponds uniquely to a field name, so any value can be substituted at the place marked by the bookmark; only the method of substitution differs from that of the initial generation. Unfortunately, in MS Word bookmarks do not always survive changes to a document; it depends on how one piece of text was replaced with another. To cope with disappearing bookmarks, one has to create additional bookmarks from which the position of the lost ones can be calculated. This mechanism works quite well; it is a pity that Microsoft does nothing to fix the problem in a standard way. Well, maybe someday, when the wind changes or the climate does, this problem will be solved.
9. Templates that contain hidden markup are rigid compared with the original templates. Fortunately, the degree of rigidity can be controlled. From a rigid template you can obtain a hybrid one (this is how puns are born) by substituting parts of the original template into parts of the rigid one. A document usually has such (hypervariable) places where the markup that was in the original template before generation can be put back. Naturally, such places should be marked in the template. As a result, even after edits by the client and the lawyers, a template that would otherwise contain only bookmarks remains quite flexible.
For the parts that have come back from the original (or any other) template, a questionnaire can be obtained on the fly if answer tags are found in those parts. The hybrid template provides a degree of flexibility that in many cases meets the requirements of the business.
So, I have looked at some of the individual "bricks" you need in order to build the "building" called a Document Generator. I tried to draw your attention to the point that templates should be created by the business itself, with minimal IT participation. I hope the discussion of hybrid templates was useful too.
Clearly, in many cases a simpler "typewriter" for preparing documents is enough. In organizations where there are no questions about parents and the documents are monotonous and static, you can use a slingshot instead of a gun. But for organizations that take an individual approach to clients and whose contract preparation is not yet sufficiently automated, this article will help define the requirements for the developers of a software product whose functionality today should, and can, match the description given here.