📜 ⬆️ ⬇️

Simple export to Excel XML

When developing an electronic document management system, it was necessary to implement functions for exporting data in popular formats. In particular, in Microsoft Excel format. Export requirements were fairly simple — export data with a minimum of formatting, i.e. no merged cells, font games, etc. Export Formats XLSX and Excel XML.



In this case I will tell about Excel XML .

So, in any system operating with tabular data, a wound or later there is a need to export data. Export objectives are different:

Based on the goals, a reasonable conclusion was asked that when exporting data, you should try to save or specify the type of data in the respective columns so that the application that opens the file does not try to apply formatting at its discretion. To put it another way, dates must be dates, numbers must be numbers, lines must be strings.
')

Task


Technical requirements formulated trivially:

Making the export mechanism in the form of an autonomous class is an obvious requirement, the fulfillment of which will allow in the future to share the class with other developers and use it in new projects.

Implementing a set of functions in a class for writing cell values ​​and a row is the main requirement, which implies the creation of functions for writing cell values ​​of the specified types and the ability to write a finished row to a file.

Ability to work with an unlimited amount of data - of course, the export class itself cannot be responsible for the recorded volume, but it must provide functions for writing data to disk and freeing up RAM for the next piece of data.

In addition to the requirements described, it was necessary to add service functions:


Implementation


First of all, when creating a class, I check the final file name and request the number of columns and rows. The file must have the correct name, and the folder in which it will be saved must exist. Everything is as usual.
The Excel XML format allows you to save information about the user who created it in the file; therefore, when creating the header, I record the name of the organization, the information about the user and the date the file was created.

public function writeDocumentProperties($organization = null, $user = null) { fwrite($this->file, '<DocumentProperties xmlns="urn:schemas-microsoft-com:office:office">'); if (!is_null($user)) { fwrite($this->file, '<Author>'.$user->description.'</Author>'); fwrite($this->file, '<LastAuthor>'.$user->description.'</LastAuthor>'); } $dt = new Datetime(); $dt_string = $dt->format('Ymd\TH:i:s\Z'); fwrite($this->file, '<Created>'.$dt_string.'</Created>'); fwrite($this->file, '<LastSaved>'.$dt_string.'</LastSaved>'); if (!is_null($organization)) fwrite($this->file, '<Company>'.$organization->name.'</Company>'); fwrite($this->file, '<Version>12.00</Version>'); fwrite($this->file, '</DocumentProperties>'); } 

However, it is in this function that entities of the workflow system are used - organization (organization) and user (user). Replacing these entities with, say, string values ​​is not a problem.

The most interesting in the title is information about styles. In Excel XML, they are implemented very conveniently, so I simply create a table with styles for rows, date / time, and hyperlinks.

 public function writeStyles() { fwrite($this->file, '<Styles>'); //default style fwrite($this->file, '<Style ss:ID="Default" ss:Name="Normal"><Font ss:Color="#000000"/></Style>'); //Datetime style fwrite($this->file, '<Style ss:ID="DateTime"><NumberFormat ss:Format="General Date"/></Style>'); fwrite($this->file, '<Style ss:ID="Date"><NumberFormat ss:Format="Short Date"/></Style>'); fwrite($this->file, '<Style ss:ID="Time"><NumberFormat ss:Format="h:mm:ss"/></Style>'); //Hyperlink style fwrite($this->file, '<Style ss:ID="Hyperlink" ss:Name="Hyperlink"><Font ss:Color="#0000FF" ss:Underline="Single"/></Style>'); //Bold fwrite($this->file, '<Style ss:ID="Bold"><Font ss:Bold="1"/></Style>'); fwrite($this->file, '</Styles>'); } 


The preparatory work is finished, you can go to the data record. Opening a worksheet is just a couple of tags, just at this moment information about the number of columns and rows is used.

 public function openWorksheet() { fwrite($this->file, '<Worksheet ss:Name="Export">'); fwrite($this->file, strtr('<Table ss:ExpandedColumnCount="{col_count}" ss:ExpandedRowCount="{row_count}" x:FullColumns="1" x:FullRows="1" ss:DefaultRowHeight="15">', array('{col_count}'=>$this->colCount, '{row_count}'=>$this->rowCount))); } 

But the recording of rows is a more interesting process. The class should work quickly and process an unlimited amount of data, because there can be thousands of records or even a million! If you want speed, work with memory, if you want an unlimited amount of data, work with a disk. To reconcile the requirements, implemented the resetRow and flushRow functions.
The first one clears the current row, after which it can again be filled with data, and the second one writes the current row to an open file on the disk. Sharing them helps balance the speed and amount of memory used.

 public function resetRow() { $this->currentRow = array(); } public function flushRow() { fwrite($this->file, implode('', $this->currentRow)); unset($this->currentRow); } 

Each cell is recorded by a function corresponding to the data type, namely appendCellxxx, where xxx is the data type. Valid data types are Num, String, Real, DateTime, Date, Time, Link. An example of a function for recording a numeric value:

 public function appendCellNum($value) { $this->currentRow[] = '<Cell><Data ss:Type="Number">'.$value.'</Data></Cell>'; } 

After recording all the data, it remains to close the worksheet and the workbook.

Application


The use of the described class is based on the export of data using the CArrayDataProvider provider. However, assuming that the volume of exported data can be very large, a special iterator CDataProviderIterator is used , which iterates over the returned data by 100 records (you can specify a different number of records).

 public function exportExcelXML($organization, $user, &$filename) { $this->_provider = new CArrayDataProvider(/*query*/); Yii::import('ext.AlxdExportExcelXML.AlxdExportExcelXML'); $export = new AlxdExportExcelXML($filename, count($this->_attributes), $this->_provider->getTotalItemCount() + 1); $export->openWriter(); $export->openWorkbook(); $export->writeDocumentProperties($organization, $user); $export->writeStyles(); $export->openWorksheet(); //title row $export->resetRow(); $export->openRow(true); foreach ($this->_attributes as $code => $format) $export->appendCellString($this->_objectref->getAttributeLabel($code)); $export->closeRow(); $export->flushRow(); //data rows $rows = new CDataProviderIterator($this->_provider, 100); foreach ($rows as $row) { $export->resetRow(); $export->openRow(); foreach ($this->_attributes as $code => $format) { switch ($format->type) { case 'Num': $export->appendCellNum($row[$code]); /*other types*/ default: $export->appendCellString(''); } } $export->closeRow(); $export->flushRow(); } //close all $export->closeWorksheet(); $export->closeWorkbook(); $export->closeWriter(); //zip file $export->zip(); $filename = $export->getZipFullFileName(); } 

In my case, each row is written to disk, which, so far, is quite acceptable, but in the future, it may require changes. For example, it would be reasonable to save not every row, but every ten or even a hundred rows at a time. Then the export rate will increase.

Speed


By the way, I learned from my own experience how important it is to assume the possibility of the existence of large amounts of data during a batch operation, such as export.
Initially, I tried to export data using a CActiveDataProvider , which required about 240 seconds when exporting 1000 records! Changing the query to use CArrayDataProvider reduced the export time of 1000 entries to 0.5 seconds!
Especially for this publication measured export figures.
Exported 1,626 records with 9 attributes that represent information about closed incidents (see ITSM ).

The original view of the exported table

(sorry, the picture disappears after publication)

Result

(sorry, the picture disappears after publication)

Export figures

The size of the final file: 1 312 269
Compressed file size: 141,762
Elapsed time: approximately 0.5 seconds

Anyone interested, can get the source code of my class AlxdExportExcelXML for free. Only you need to remember to correct the function writeDocumentProperties , in order to get rid of the entities of the organization and user document management system, or use your similar entities with appropriate properties.

Source: https://habr.com/ru/post/235973/


All Articles