
Tormenting MS Word from our application

Sooner or later, every application developer faces the task of exporting data from their application to another one. It came up for me once again: I needed to generate letters for a mailing (real mail, the kind a postman carries). The letters had to be saved in Word format. The task looks trivial, but there are some subtleties. The Internet has plenty of examples of driving Word from third-party applications through COM calls, but most of them are either "Hello world!"-level or tailored to one specific task. I did not find anything that matched my case, so here is yet another reinvented bicycle.

Task Description


There is a database containing information about subscribers. Paper letters must be sent to the subscribers. The texts of the letters (templates) are prepared by people very far from IT (lawyers, marketers, and other parasites) who nevertheless know how to use Word to some degree (sometimes even very well). That is, it is quite possible to explain to them how to insert a keyword into the text, but any more complex requirement causes cognitive dissonance.

The second point is that some letters need to be checked manually and, if necessary, edited (UPD) before printing, and all of them must end up in a single file (this is dictated by how they are passed on afterwards). That is, at the place where they are generated they are only prepared (and sometimes printed).

.NET was chosen for historical reasons: the main interface for working with the database is written in it, so it is quite reasonable for the user to start the generation from there. Office macros had to be abandoned for security reasons and because they are complicated to configure.

The head-on solution that turned out to be unsuitable


The task seemed dead simple: take the template, insert it into the output document, replace the keywords, and repeat until the records run out. No such luck. A letter can span several pages, and with this approach Word slows down as the document grows, to the point where generating a mailing of 30 letters can take up to an hour. I had to use my head and think.

What I ended up with


First, we open the template, find the occurrences of the keywords in it, and remember their positions.
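// Keywords that may appear in the template
string[] keyWords = { "FNAME", "LNAME", "DEBT", "MR" };
// Occurrences found in the template: keyword, position, trailing spaces
List<keyWordEntry> keyWordEntries = new List<keyWordEntry>();
// Word collections are numbered from 1, so iterate from 1 to Count inclusive
for (int i = 1; i <= sdoc.Words.Count; i++)
{
    string text = sdoc.Words[i].Text;
    foreach (string keyWord in keyWords)
    {
        if (text.Trim() == keyWord)
        {
            // Keep whatever follows the keyword (usually spaces) so it can be restored later
            keyWordEntries.Add(new keyWordEntry(keyWord, i, text.Remove(0, keyWord.Length)));
        }
    }
}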


This immediately exposes the first quirks of working with Word (first in this text, that is; during the actual investigation they were almost the last to be discovered). Collections of document elements (Words, Paragraphs, etc.) are numbered from one. Word also readily treats the spaces after a word as part of that word, so I had to write logic to preserve them.
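
The snippets here rely on a keyWordEntry type that the article does not show. Below is a minimal sketch of what it might look like. The field names (keyword, position, spacesAfter) are taken from how the other snippets use them; the WordText.SplitTrailing helper is hypothetical and only illustrates how the spaces that Word attaches to a word can be separated from the word itself.

```csharp
using System;

// Hypothetical reconstruction: the article uses keyWordEntry but does not show it.
public class keyWordEntry
{
    public string keyword;      // the keyword as it appears in the template
    public int position;        // 1-based index in Document.Words
    public string spacesAfter;  // whitespace that Word counted as part of the word

    public keyWordEntry(string keyword, int position, string spacesAfter)
    {
        this.keyword = keyword;
        this.position = position;
        this.spacesAfter = spacesAfter;
    }
}

// Hypothetical helper, not part of the original code.
public static class WordText
{
    // Splits the raw text of a Words item into its visible part
    // and the trailing whitespace that follows it.
    public static Tuple<string, string> SplitTrailing(string raw)
    {
        string core = raw.TrimEnd();
        return Tuple.Create(core, raw.Substring(core.Length));
    }
}
```

For example, WordText.SplitTrailing("DEBT  ") yields the pair ("DEBT", "  "), and the second element is what would be stored in spacesAfter.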

We create the output document from the template, which cheaply gives us a document with the required page layout, headers and footers, styles, and so on.
_Document ddoc = word.Documents.Add(ref template, ref oMissing, ref oMissing, ref oMissing);
// Delete the text inherited from the template; page setup and styles remain
ddoc.Range(ref oMissing, ref oMissing).Delete(ref oMissing, ref oMissing);


We fill it with empty paragraphs, one per record returned by the query:
for (int i = 0; i < rowCount; i++)
{
    ddoc.Range(ref oMissing, ref oMissing).InsertParagraphAfter();
}


Then we fill the document from the end to the beginning, which gives an enormous speedup: we address each paragraph by its index instead of searching for the end of the document every time, and pasting into paragraph i does not shift the paragraphs before it, which are still waiting to be filled. The filling itself looks like this (sdoc is the temporary document into which we substitute values, ddoc is the resulting output document):
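for (int i = rowCount; i > 0; i--)
{
    if (i < rowCount)
    {
        // Separate this letter from the next one with a page break
        ddoc.Paragraphs[i].Range.InsertParagraphAfter();
        ddoc.Paragraphs[i + 1].Range.InsertBreak(ref pageBreak);
    }
    // Substitute the values of the current record into the temporary document
    foreach (keyWordEntry ke in keyWordEntries)
    {
        string replaceWith = "";
        switch (ke.keyword)
        {
            // cases for the individual keywords go here
            default:
                replaceWith = ke.keyword + ke.spacesAfter;
                break;
        }
        sdoc.Words[ke.position].Text = replaceWith;
    }
    // Copy the filled-in letter and paste it into its paragraph of the output document
    sdoc.Range(ref oMissing, ref oMissing).Copy();
    ddoc.Paragraphs[i].Range.Paste();
}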
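
The switch in the loop above leaves out the cases for the individual keywords. One way they might be filled in is sketched below, with a plain dictionary standing in for the current database record. Replacements.ResolveReplacement and the sample values are hypothetical and are not taken from the original code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the original code.
public static class Replacements
{
    // Returns the text to assign to Words[position].Text: the record's value
    // for the keyword if there is one, otherwise the keyword itself (which is
    // what the default branch of the switch does). The stored trailing
    // whitespace is appended so that neighbouring words do not run together.
    public static string ResolveReplacement(
        string keyword, string spacesAfter, IDictionary<string, string> record)
    {
        string value;
        if (!record.TryGetValue(keyword, out value))
        {
            value = keyword;
        }
        return value + spacesAfter;
    }
}
```

With a record of { "FNAME": "Ivan" }, the call Replacements.ResolveReplacement("FNAME", " ", record) returns "Ivan ", while an unknown keyword such as "MR" comes back unchanged as "MR ".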


All that is left, basically, is to save the resulting document and shut down the Word process correctly.
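
The article does not show this last step, so the following is only a rough sketch of how it might be done, not the author's code. It assumes a reference to the Microsoft.Office.Interop.Word assembly and a compiler that allows omitting ref and optional arguments on COM calls (C# 4 or later). WordShutdown, SaveAndQuit and outputPath are hypothetical names; word, sdoc and ddoc are the objects from the snippets above.

```csharp
using System.Reflection;
using System.Runtime.InteropServices;
using Microsoft.Office.Interop.Word;

// Hypothetical helper, not part of the original code.
public static class WordShutdown
{
    public static void SaveAndQuit(
        _Application word, _Document sdoc, _Document ddoc, string outputPath)
    {
        object missing = Missing.Value;
        object doNotSave = WdSaveOptions.wdDoNotSaveChanges;
        try
        {
            // Save the assembled letters; the remaining SaveAs arguments are left at their defaults.
            ddoc.SaveAs(outputPath);
            ddoc.Close(ref doNotSave, ref missing, ref missing);
            // The temporary document was modified during substitution, so discard its changes.
            sdoc.Close(ref doNotSave, ref missing, ref missing);
        }
        finally
        {
            // Quit Word even if saving failed, otherwise a hidden WINWORD.EXE stays in memory.
            word.Quit(ref doNotSave, ref missing, ref missing);
            Marshal.ReleaseComObject(word);
        }
    }
}
```

Putting Quit in the finally block is a deliberate choice in this sketch: an exception during saving would otherwise leave an invisible Word process running.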

One more remark in passing: Word treats the characters '.', ',', '*' and all other punctuation as separate words, so if you need to insert, for example, a date, the logic becomes slightly more complicated.

Sample code can be downloaded here.

Source: https://habr.com/ru/post/106425/

