
Generating OfficeOpenXML documents in 5 minutes

An ASP.NET application often needs to generate a report on the server in an OpenXML format.

There are several familiar ways to do this:
  1. “Found it, plugged it in, used it”: search Google for a library that generates docx or xlsx, add it to the project, figure it out, generate. This is the usual way, but a slow one.
  2. “Yuck”: use COM. This is not recommended: it requires Microsoft Office to be installed on the server, is not very thread-safe, does not get along with x64 and is generally old-fashioned.
  3. “B” (for brutal): work out the format yourself, assemble the XML by hand and zip it up.
  4. “The Microsoft way”: this is the method described below.


A short introduction


OfficeOpenXML is the format in which Word and Excel save documents by default: docx and xlsx. Such a file is a zip archive. You can rename it to .zip, open it with an archiver and see what is inside:
[Screenshot: folder view of an OfficeOpenXML package]
Reports in OOXML are well received by users and can be edited with ordinary tools. In a serious application I would not recommend limiting yourself to this format, but I do advise supporting it.
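
Because the package is an ordinary zip archive, its contents can also be listed from code instead of with an archiver. The sketch below is only an illustration: it assumes .NET 4.5 or later (where System.IO.Compression.ZipArchive is available), and the names PackageInspector and ListParts are made up for this example.

```csharp
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Linq;

// Hypothetical helper, not part of the Open XML SDK.
public static class PackageInspector
{
    // Returns the path of every entry stored inside the package,
    // for example "word/document.xml" in a docx file.
    public static List<string> ListParts(Stream package)
    {
        // leaveOpen: true, so the caller stays in charge of the stream.
        using (var zip = new ZipArchive(package, ZipArchiveMode.Read, leaveOpen: true))
        {
            return zip.Entries.Select(entry => entry.FullName).ToList();
        }
    }
}
```

For a file on disk, the stream could come from File.OpenRead with the path of the docx or xlsx file.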

Preparation


We will need OpenXMLSDKTool. Download it from the Microsoft website and install it:
[Screenshot: Setup]

Let's go


Launch the Open XML SDK 2.0 Productivity Tool:
[Screenshot: Productivity Tool]
This tool is very simple and can do two small but important things: generate code from a document and compare documents.
But first things first.

Code generation


Load the document into the tool and click “Reflect Code”:
[Screenshot: Reflect Code]

On the left we see the structure of the document: the same files that are present in the archive, together with a view of their contents.
Nodes in the tree can be selected: on the right you then see the node's contents as XML and the code that generates exactly that piece. In my example, one paragraph from the body of the document is shown. It lives in word/document.xml.
If we select the root of the tree (the document itself), we get the code for the entire document.

Now let's try this code out.
  1. Create a project in Visual Studio. Let it be a simple C# console application
  2. Add a reference to the DocumentFormat.OpenXml assembly:
    [Screenshot: Add Reference]
    I have it in the GAC. If you do not want to put it there, you can reference the file itself. It can be downloaded separately from the same page as OpenXMLSDKTool, via the OpenXMLSDKv2.msi link
  3. Add a reference to WindowsBase
  4. Add a file named "GeneratedClass.cs"
  5. Copy the code from the Reflected Code window into it
  6. Save and close the file, then go to Program.cs
  7. Write the Main method:
    new GeneratedCode.GeneratedClass().CreatePackage(@"D:\Temp\Output.docx");
  8. Run
That's it. The code for generating the document is ready. The generated document will look exactly as it did when you saved it in Word. Quick, isn't it?

What's inside?

What is inside the generated class?
First, there is only one public method:
public void CreatePackage(string filePath) {
    using (WordprocessingDocument package = WordprocessingDocument.Create(filePath, WordprocessingDocumentType.Document)) {
        CreateParts(package);
    }
}

This is where the text that ends up in the document is inserted:
private void GenerateMainDocumentPart1Content(MainDocumentPart mainDocumentPart1) {
    // ...
    Run run2 = new Run() { RsidRunProperties = "00184031" };
    Text text2 = new Text();
    text2.Text = " , , , .";
    // ...
}

As the names of the private methods in the code show, an OpenXml document consists of parts. A separate method is generated for each part.
The most inquisitive readers, smiling wickedly, have of course already inserted a picture into the document.
Pictures are stored right in the generated file as base64, here:
#region Binary Data
//...
#endregion

Finishing touches

Refactoring the images and replacing static content with dynamic content is left to the reader as an exercise.
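
One possible starting point for that exercise, as a minimal sketch rather than the code the tool generates: build the body by hand from the same SDK element classes and pass the text in as a parameter. The DynamicReport class, its Create method and the reportText parameter are hypothetical names; a real report would keep the generated styles and other parts and swap in only the text.

```csharp
using System.IO;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

// Hypothetical example class, assuming a reference to DocumentFormat.OpenXml.
public static class DynamicReport
{
    // Builds a one-paragraph docx in memory with the given text.
    public static byte[] Create(string reportText)
    {
        using (var stream = new MemoryStream())
        {
            using (WordprocessingDocument package =
                WordprocessingDocument.Create(stream, WordprocessingDocumentType.Document))
            {
                MainDocumentPart mainPart = package.AddMainDocumentPart();

                // Document > Body > Paragraph > Run > Text mirrors word/document.xml.
                mainPart.Document = new Document(
                    new Body(
                        new Paragraph(
                            new Run(
                                new Text(reportText)))));
                mainPart.Document.Save();
            }

            // Read the bytes only after the package has been disposed.
            return stream.ToArray();
        }
    }
}
```
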
A method that generates a byte array instead of a file is useful for returning the document to the client from ASP.NET without temporary files:
// requires: using System.IO;
public byte[] CreatePackageAsBytes() {
    using (var mstm = new MemoryStream()) {
        // The package must be disposed before the stream is read, so that everything is written out.
        using (WordprocessingDocument package = WordprocessingDocument.Create(mstm, WordprocessingDocumentType.Document)) {
            CreateParts(package);
        }
        return mstm.ToArray();
    }
}

That's it, the code for generating a report in docx format is ready.
All that remains is to replace the static content with dynamic content (we did not do all this just to serve the same document every time, did we?) and to add a “Download in Word format” link to the page.
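
What the download link points at could look roughly like the following. This is a sketch under assumptions: it targets classic ASP.NET (System.Web), and the DocxDownload class with its Send and ContentDisposition members is a hypothetical helper, not something from the SDK.

```csharp
using System.Web;

// Hypothetical helper for classic ASP.NET (System.Web).
public static class DocxDownload
{
    // MIME type registered for WordprocessingML (.docx) documents.
    public const string ContentType =
        "application/vnd.openxmlformats-officedocument.wordprocessingml.document";

    // Builds the header value that makes the browser offer a file download.
    public static string ContentDisposition(string fileName)
    {
        return "attachment; filename=\"" + fileName + "\"";
    }

    // Writes the document bytes to the response as a downloadable file.
    public static void Send(HttpResponse response, byte[] docx, string fileName)
    {
        response.Clear();
        response.ContentType = ContentType;
        response.AddHeader("Content-Disposition", ContentDisposition(fileName));
        response.BinaryWrite(docx);
    }
}
```

From a page or handler it could be called as DocxDownload.Send(Response, new GeneratedCode.GeneratedClass().CreatePackageAsBytes(), "Report.docx"), after which the response is ended in whatever way the application normally does.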

Document comparison


So, we have generated the code for the document, added a lot of data to it, adjusted it and put it into production. And now we need to change the font and the text in the report. How do we do that? There is a lot of code, and searching through it would take a long time.
It turns out to be very simple: the document comparison feature will help us:
  1. Put the old and the new documents side by side
  2. Open the Open XML Productivity Tool and select "Compare files ...":
    [Screenshot: Compare dialog]
  3. Open the files and click OK. The comparison result appears:
    [Screenshot: Result]

    Click a line with a file name to see what the differences are:
    [Screenshot: Comparison details]

    More Options lets you choose what to ignore when comparing.
    View Part Code shows the code for the part whose XML you are looking at.
    From there, matching up the XML and the code takes no effort.

By the way, this feature is also very convenient when you are just getting to know the OpenXML format: add something to the document and see what has changed. It will help those who chose way “B”, mentioned at the beginning of the article.
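
For a rough first look without the tool, the same idea can be approximated in code by checking which parts of the two packages differ at all. This sketch is not how the Productivity Tool works internally: it only compares the raw bytes of each zip entry, assumes .NET 4.5 or later for ZipArchive, and the PackageDiff and ChangedParts names are invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Linq;

// Hypothetical helper, not part of the Open XML SDK.
public static class PackageDiff
{
    // Reads every entry of a package into memory, keyed by its path in the archive.
    private static Dictionary<string, byte[]> ReadParts(Stream package)
    {
        var parts = new Dictionary<string, byte[]>(StringComparer.Ordinal);
        using (var zip = new ZipArchive(package, ZipArchiveMode.Read, leaveOpen: true))
        {
            foreach (ZipArchiveEntry entry in zip.Entries)
            {
                using (Stream source = entry.Open())
                using (var buffer = new MemoryStream())
                {
                    source.CopyTo(buffer);
                    parts[entry.FullName] = buffer.ToArray();
                }
            }
        }
        return parts;
    }

    // Returns the sorted names of parts that were added, removed or changed.
    public static List<string> ChangedParts(Stream oldPackage, Stream newPackage)
    {
        Dictionary<string, byte[]> oldParts = ReadParts(oldPackage);
        Dictionary<string, byte[]> newParts = ReadParts(newPackage);

        return oldParts.Keys.Union(newParts.Keys)
            .Where(name =>
            {
                byte[] before, after;
                bool inOld = oldParts.TryGetValue(name, out before);
                bool inNew = newParts.TryGetValue(name, out after);
                return !inOld || !inNew || !before.SequenceEqual(after);
            })
            .OrderBy(name => name, StringComparer.Ordinal)
            .ToList();
    }
}
```

The returned names tell you which parts to open in the tool, or which Generate... methods are likely to need changes.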

Conclusions


I believe that using DocumentFormat.OpenXml to generate reports in web applications is the right choice. The handy tools in the SDK save you from wasting time.

Further reading


About the OpenXML SDK: msdn.microsoft.com/en-us/library/bb448854(office.14).aspx
About OpenXML (for anyone not familiar with it): en.wikipedia.org/wiki/Office_Open_XML

Good luck, and thanks for your attention.

Source: https://habr.com/ru/post/109820/

