📜 ⬆️ ⬇️

Excel data loading and unloading using OpenXML

I decided to write an article on how to upload data to an Excel file using a template and read data from Excel.
It all started with the fact that at work, they gave instructions to leave the MS Office for free analogues.
We already had a download system based on the Microsoft.Office library. Interop.Excel ”and many ready-made templates for uploading certain reports.
Therefore, it was necessary to find a free library that works with the office. And to make so that the unloading worked on the same system as before. For I did not want to redo all the templates and structure.
So I came across OpenXML. And I thought, now I will quickly find a solution on the Internet and everything is ready (since not enough time was allocated for this). But I did not find a suitable solution, so I decided to write this article, for those who will have the same problem.
The library itself, can be downloaded free of charge from the site of Micrisoft (I used in the OpenXML project sdk 2.5 “OpenXMLSDKV25.msi”)
here
After downloading “OpenXMLSDKV25.msi”, install and go to the folder
“C: \ Program Files \ Open XML SDK \ V2.5 \ lib” there is a library that we need, we will connect it to the project (it will be described below how).
The project was written in Visual Studio 2010 (Framework 4.0).
Below is an example of a template (made for the test) “C: \ Templates \ template.xlsx”.

image

And an example of the uploaded data (as it will look as a result, after unloading).

image
')
Keywords:
DataField: - Means that our baths will be removed from the DataTable at this place.
DataField: [display field name]
Label: - Means that this place will display data that you need to insert once from the dictionary
Label: [key name in dictionary]
And this is the file from which we will read the data “C: \ Loading \ ReadMePlease.xlsx”.

image

Now we will create in VS2010, the Solution in which there will be 4 projects:
1) OpenXmlPrj is a console project for running a test.
2) Interfaces is a project of the type “Class Library” that will store our data interfaces for uploading.
3) The Framework is a project of the type “Class Library”, here all the work with Excel will be done.
4) Converter is a project of the type “Class Library”, for converting our data into a DataTable (as work happens with a DataTable).
image
Now in the “Framework” project we create two folders and connect a link to the OpenXML and WindowsBase library:
“Create” - to work with data upload.
“Load” - to work with data loading.
“Lib” - in the folder, add the OpenXML library.
In the folder “Create” create 4 classes.
1) Worker - this will be our main processor.
class Create.Worker
using System; using System.Collections.Generic; using System.Diagnostics; using System.Globalization; using System.IO; using System.Linq; using System.Text.RegularExpressions; using DocumentFormat.OpenXml; using DocumentFormat.OpenXml.Packaging; using DocumentFormat.OpenXml.Spreadsheet; namespace Framework.Create { /// <summary> ///  Excel  /// </summary> public class Worker { /// <summary> ///      /// </summary> private const String TemplateFolder = "C:\\Templates\\"; /// <summary> ///    (    ) /// </summary> private const String SheetName = "1"; /// <summary> ///   /// </summary> private const String FileType = ".xlsx"; /// <summary> /// ,     /// </summary> public static String Directory { get { const string excelFilesPath = @"C:\xlsx_repository\"; if (System.IO.Directory.Exists(excelFilesPath) == false) { System.IO.Directory.CreateDirectory(excelFilesPath); } return excelFilesPath; } } public void Export(System.Data.DataTable dataTable, System.Collections.Hashtable hashtable, String templateName) { var filePath = CreateFile(templateName); OpenForRewriteFile(filePath, dataTable, hashtable); OpenFile(filePath); } private String CreateFile(String templateName) { var templateFelePath = String.Format("{0}{1}{2}", TemplateFolder, templateName, FileType); var templateFolderPath = String.Format("{0}{1}", Directory, templateName); if (!File.Exists(String.Format("{0}{1}{2}", TemplateFolder, templateName, FileType))) { throw new Exception(String.Format("     \n\"{0}{1}{2}\"!", TemplateFolder, templateName, FileType)); } //    ( templateName)  ,   ,    var index = (templateFolderPath).LastIndexOf("\\", System.StringComparison.Ordinal); if (index > 0) { var directoryTest = (templateFolderPath).Remove(index, (templateFolderPath).Length - index); if (System.IO.Directory.Exists(directoryTest) == false) { System.IO.Directory.CreateDirectory(directoryTest); } } var newFilePath = String.Format("{0}_{1}{2}", templateFolderPath, Regex.Replace((DateTime.Now.ToString(CultureInfo.InvariantCulture)), @"[^a-z0-9]+", ""), FileType); File.Copy(templateFelePath, newFilePath, true); return newFilePath; } private void OpenForRewriteFile(String filePath, System.Data.DataTable dataTable, System.Collections.Hashtable hashtable) { Row rowTemplate = null; var footer = new List<Footer>(); var firsIndexFlag = false; using (var document = SpreadsheetDocument.Open(filePath, true)) { Sheet sheet; try { sheet = document.WorkbookPart.Workbook.GetFirstChild<Sheets>().Elements<Sheet>().SingleOrDefault(s => s.Name == SheetName); } catch (Exception ex) { throw new Exception(String.Format("        \"{0}\"!\n",SheetName), ex); } if (sheet == null) { throw new Exception(String.Format("    \"{0}\"!\n",SheetName)); } var worksheetPart = (WorksheetPart)document.WorkbookPart.GetPartById(sheet.Id.Value); var sheetData = worksheetPart.Worksheet.GetFirstChild<SheetData>(); var rowsForRemove = new List<Row>(); var fields = new List<Field>(); foreach (var row in worksheetPart.Worksheet.GetFirstChild<SheetData>().Elements<Row>()) { var celsForRemove = new List<Cell>(); foreach (var cell in row.Descendants<Cell>()) { if (cell == null) { continue; } var value = GetCellValue(cell, document.WorkbookPart); if (value.IndexOf("DataField:", StringComparison.Ordinal) != -1) { if (!firsIndexFlag) { firsIndexFlag = true; rowTemplate = row; } fields.Add(new Field(Convert.ToUInt32(Regex.Replace(cell.CellReference.Value, @"[^\d]+", "")) , new string(cell.CellReference.Value.ToCharArray().Where(p => !char.IsDigit(p)).ToArray()) , value.Replace("DataField:", ""))); } if (value.IndexOf("Label:", StringComparison.Ordinal) != -1 && rowTemplate == null) { var labelName = value.Replace("Label:", "").Trim(); if (!hashtable.ContainsKey(labelName)) { throw new Exception(String.Format("   \"{0}\"", labelName)); } cell.CellValue = new CellValue(hashtable[labelName].ToString()); cell.DataType = new EnumValue<CellValues>(CellValues.String); } if (rowTemplate == null || row.RowIndex <= rowTemplate.RowIndex || String.IsNullOrWhiteSpace(value)) { continue; } var item = footer.SingleOrDefault(p => p._Row.RowIndex == row.RowIndex); if (item == null) { footer.Add(new Footer(row, cell, value.IndexOf("Label:", StringComparison.Ordinal) != -1 ? hashtable[value.Replace("Label:", "").Trim()].ToString() : value)); } else { item.AddMoreCell(cell, value.IndexOf("Label:", StringComparison.Ordinal) != -1 ? hashtable[value.Replace("Label:", "").Trim()].ToString() : value); } celsForRemove.Add(cell); } foreach (var cell in celsForRemove) { cell.Remove(); } if (rowTemplate != null && row.RowIndex != rowTemplate.RowIndex) { rowsForRemove.Add(row); } } if (rowTemplate == null || rowTemplate.RowIndex == null || rowTemplate.RowIndex < 0) { throw new Exception("     ,  !"); } foreach (var row in rowsForRemove) { row.Remove(); } var index = rowTemplate.RowIndex; foreach (var row in from System.Data.DataRow item in dataTable.Rows select CreateRow(rowTemplate, index, item, fields)) { sheetData.InsertBefore(row, rowTemplate); index++; } foreach (var newRow in footer.Select(item => CreateLabel(item, (UInt32)dataTable.Rows.Count))) { sheetData.InsertBefore(newRow, rowTemplate); } rowTemplate.Remove(); } } private Row CreateLabel(Footer item, uint count) { var row = item._Row; row.RowIndex = new UInt32Value(item._Row.RowIndex + (count - 1)); foreach (var cell in item.Cells) { cell._Cell.CellReference = new StringValue(cell._Cell.CellReference.Value.Replace(Regex.Replace(cell._Cell.CellReference.Value, @"[^\d]+", ""), row.RowIndex.ToString())); cell._Cell.CellValue = new CellValue(cell.Value); cell._Cell.DataType = new EnumValue<CellValues>(CellValues.String); row.Append(cell._Cell); } return row; } private Row CreateRow(Row rowTemplate, uint index, System.Data.DataRow item, List<Field> fields) { var newRow = (Row)rowTemplate.Clone(); newRow.RowIndex = new UInt32Value(index); foreach (var cell in newRow.Elements<Cell>()) { cell.CellReference = new StringValue(cell.CellReference.Value.Replace(Regex.Replace(cell.CellReference.Value, @"[^\d]+", ""), index.ToString(CultureInfo.InvariantCulture))); foreach (var fil in fields.Where(fil => cell.CellReference == fil.Column + index)) { cell.CellValue = new CellValue(item[fil._Field].ToString()); cell.DataType = new EnumValue<CellValues>(CellValues.String); } } return newRow; } private string GetCellValue(Cell cell, WorkbookPart wbPart) { var value = cell.InnerText; if (cell.DataType == null) { return value; } switch (cell.DataType.Value) { case CellValues.SharedString: var stringTable = wbPart.GetPartsOfType<SharedStringTablePart>().FirstOrDefault(); if (stringTable != null) { value = stringTable.SharedStringTable.ElementAt(int.Parse(value)).InnerText; } break; } return value; } private void OpenFile(string filePath) { if (!File.Exists(filePath)) { throw new Exception(String.Format("    \"{0}\"!", filePath)); } var process = Process.Start(filePath); if (process != null) { process.WaitForExit(); } } } } 


2) Footer - will contain lines and their cells coming after our data.
class footer
 using System; using System.Collections.Generic; using DocumentFormat.OpenXml.Spreadsheet; namespace Framework.Create { public class Footer { /// <summary> ///  /// </summary> public Row _Row { get; private set; } /// <summary> ///    /// </summary> public List<CellForFooter> Cells { get; private set; } public Footer(Row row, Cell cell, String cellValue) { _Row = new Row((Row)row.Clone()) { RowIndex = row.RowIndex }; var _Cell = (Cell)cell.Clone(); _Cell.CellReference = cell.CellReference; Cells = new List<CellForFooter> { new CellForFooter(_Cell, cellValue) }; } public void AddMoreCell(Cell cell, String cellValue) { var _Cell = (Cell)cell.Clone(); _Cell.CellReference = cell.CellReference; Cells.Add(new CellForFooter(_Cell, cellValue)); } } } 


3) CellForFooter - contains the coordinates of the cell and its value, used in Footer.
CellForFooter class
 using System; using DocumentFormat.OpenXml.Spreadsheet; namespace Framework.Create { public class CellForFooter { /// <summary> ///  /// </summary> public Cell _Cell { get; private set; } /// <summary> ///  /// </summary> public String Value { get; private set; } public CellForFooter(Cell cell, String value) { _Cell = cell; Value = value; } } } 


4) Field - will contain the index of the line where the DataField is located, the coordinates of the cells with the DataField and the name of the field whose value is to be output.
Field class
 using System; namespace Framework.Create { public class Field { /// <summary> ///   /// </summary> public uint Row { get; private set; } /// <summary> ///   /// </summary> public String Column { get; private set; } /// <summary> ///  ,   /// </summary> public String _Field { get; private set; } public Field(uint row, String column, String field) { Row = row; Column = column; _Field = field; } } } 


In the folder “Load” create 2 classes.
1) Worker - this will be our main processor.
class Load.Worker
 using System; using System.Collections.Generic; using System.IO; using System.Linq; using System.Text.RegularExpressions; using DocumentFormat.OpenXml.Packaging; using DocumentFormat.OpenXml.Spreadsheet; namespace Framework.Load { /// <summary> ///    Excel /// </summary> public class Worker { /// <summary> ///   (   ) /// </summary> private const String SheetName = "1"; /// <summary> ///      .xlsx /// </summary> /// <param name="path"></param> /// <returns></returns> public System.Data.DataTable ReadFile(String path) { CheckFile(path); return OpenDocumentForRead(path); } private System.Data.DataTable OpenDocumentForRead(string path) { System.Data.DataTable data = null; using (var document = SpreadsheetDocument.Open(path, false)) { Sheet sheet; try { sheet = document.WorkbookPart.Workbook.GetFirstChild<Sheets>().Elements<Sheet>().SingleOrDefault(s => s.Name == SheetName); } catch (Exception ex) { throw new Exception(String.Format("        \"{0}\"!\n", SheetName), ex); } if (sheet == null) { throw new Exception(String.Format("    \"{0}\"!\n", SheetName)); } var relationshipId = sheet.Id.Value; var worksheetPart = (WorksheetPart)document.WorkbookPart.GetPartById(relationshipId); var sheetData = worksheetPart.Worksheet.GetFirstChild<SheetData>(); var firstRow = true; var columsNames = new List<ColumnName>(); foreach (Row row in sheetData.Elements<Row>()) { if (firstRow) { columsNames.AddRange(GetNames(row, document.WorkbookPart)); data = GetTable(columsNames); firstRow = false; continue; } var item = data.NewRow(); foreach (var line in columsNames) { var coordinates = String.Format("{0}{1}", line.Liter, row.RowIndex); var cc = row.Elements<Cell>().SingleOrDefault(p => p.CellReference == coordinates); if (cc == null) { throw new Exception(String.Format("    \"{0}\"!", coordinates)); } item[line.Name.Trim()] = GetVal(cc, document.WorkbookPart); } data.Rows.Add(item); } } return data; } private System.Data.DataTable GetTable(IEnumerable<ColumnName> columsNames) { var teb = new System.Data.DataTable("ExelTable"); foreach (var col in columsNames.Select(columnName => new System.Data.DataColumn { DataType = typeof(String), ColumnName = columnName.Name.Trim() })) { teb.Columns.Add(col); } return teb; } private IEnumerable<ColumnName> GetNames(Row row, WorkbookPart wbPart) { return (from cell in row.Elements<Cell>() where cell != null let text = GetVal(cell, wbPart) where !String.IsNullOrWhiteSpace(text) select new ColumnName(text, Regex.Replace(cell.CellReference.Value, @"[\0-9]", ""))).ToList(); } private string GetVal(Cell cell, WorkbookPart wbPart) { string value = cell.InnerText; if (cell.DataType == null) { return value; } switch (cell.DataType.Value) { case CellValues.SharedString: var stringTable = wbPart.GetPartsOfType<SharedStringTablePart>() .FirstOrDefault(); if (stringTable != null) { value = stringTable.SharedStringTable .ElementAt(int.Parse(value)).InnerText; } break; } return value; } private void CheckFile(String path) { if (String.IsNullOrWhiteSpace(path) || !File.Exists(path)) { throw new Exception(String.Format("  \"{0}\",  !", path)); } } } } 


2) ColumnName - will be the name of the column for the loaded data.
class ColumnName
 using System; namespace Framework.Load { public class ColumnName { /// <summary> ///  ,    /// </summary> public String Name { get; private set; } /// <summary> ///   /// </summary> public String Liter { get; private set; } public ColumnName(string name, string liter) { Name = name; Liter = liter; } } } 


In the “Interfaces” project we will create the interface of our data IDataForTest.
IDataForTest interface
 using System; namespace Interfaces { public interface IDataForTest { String A { get; } String B { get; } String C { get; } } } 


In the “Converter” project, create a class
ConvertToDataTable - to convert our data into a DataTable.
class ConvertToDataTable
 using System; using System.Collections; using System.Collections.Generic; using System.Data; using Interfaces; namespace Converter { public class ConvertToDataTable { public DataTable ExcelTableLines(IEnumerable<IDataForTest> lines) { var dt = CreateTable(); foreach (var line in lines) { var row = dt.NewRow(); row["AAA"] = line.A; row["BBB"] = line.B; row["CCC"] = line.C; dt.Rows.Add(row); } return dt; } public Hashtable ExcelTableHeader(Int32 count) { var head = new Dictionary<String, String> { { "Date", DateTime.Today.Date.ToShortDateString() }, { "Count", count.ToString() } }; return new Hashtable(head); } private DataTable CreateTable() { var dt = new DataTable("ExelTable"); var col = new DataColumn { DataType = typeof(String), ColumnName = "AAA" }; dt.Columns.Add(col); col = new DataColumn { DataType = typeof(String), ColumnName = "BBB" }; dt.Columns.Add(col); col = new DataColumn { DataType = typeof(String), ColumnName = "CCC" }; dt.Columns.Add(col); return dt; } } } 


In the project “OpenXmlPrj”
There will be a class for executing the program.
Program class
 using System; using System.Collections.Generic; using System.Data; namespace OpenXmlPrj { class Program { static void Main(string[] args) { //   var myData = new List<DataForTest> { new DataForTest("a1","b1","c1"), new DataForTest("a2","b2","c2"), new DataForTest("a3","b3","c3"), new DataForTest("a4","b4","c4"), new DataForTest("a5","b5","c5") }; var ex = new Converter.ConvertToDataTable(); //ex.ExcelTableLines(myData) -     DataTable //ex.ExcelTableHeader(myData.Count) -    Label //template -     -  new Framework.Create.Worker().Export(ex.ExcelTableLines(myData), ex.ExcelTableHeader(myData.Count), "template"); Console.WriteLine("Excel File Has Created!\nFor Read Data From Excel, press any key!"); Console.ReadKey(); //"C:\\Loading\\ReadMePlease.xlsx" -   ,      (  DataTable) var dt = new Framework.Load.Worker().ReadFile("C:\\Loading\\ReadMePlease.xlsx"); var myDataFromExcel = new List<DataForTest>(); //  ,    DataTable foreach (DataRow item in dt.Rows) { myDataFromExcel.Add(new DataForTest(item)); } Console.WriteLine("---------- Data ---------------------"); //   foreach (var line in myDataFromExcel) { Console.WriteLine("{0} | {1} | {2}", line.A, line.B, line.C); } Console.WriteLine("Done. Press any key, for exit!"); Console.ReadKey(); } } } 


And the class for our data is “DataForTest”.
class DataForTest
 using System; using System.Data; using Interfaces; namespace OpenXmlPrj { public class DataForTest : IDataForTest { public String A { get; private set; } public String B { get; private set; } public String C { get; private set; } public DataForTest(String a, String b, String c) { A = a; B = b; C = c; } public DataForTest(DataRow item) { A = item["MyFieldA"].ToString(); B = item["MyFieldB"].ToString(); C = item["MyFieldC"].ToString(); } } } 


And the project “OpenXmlPrj” should include links to the following projects: Interfaces, Framework, Converter
Conditions for creating a template:
1. Excel sheet, must necessarily be called “Sheet1” (well, or if you want to rename, then in the code you will need to change the name too).
2. Names after DataField: must strictly coincide with the names of the columns in the DataTable.
3. The template should be saved in “.xlsx” format.
Conditions for the file from which we will read data:
1. Excel sheet, must necessarily be called “Sheet1” (well, or if you want to rename, then in the code you will need to change the name too).
2. The first line should contain the names of the columns by which we will then parse the data.

The link to the source in GitHub.

Source: https://habr.com/ru/post/203522/


All Articles