📜 ⬆️ ⬇️

Simple export to Excel XLSX

In continuation of the topic begun in the previous article , I want to share my experience in data export, in particular, in the XLSX format.



So, who cares how to fill out XLSX without large and complex libraries, please under the cat.

Recently, I faced the task of exporting an unpredictable amount of tabular data in XLSX format. Like any sensible programmer, first of all it is useful to look for ready-made solutions.
Almost immediately stumbled upon the PHPExcel library. A powerful solution, with a bunch of different functions and features. Rummaging a little more found reviews of programmers about it. In particular, the forums have complaints about the speed and failure to work with a large amount of data. Marked the library as one of the solutions and began to look further.
I found some more libraries for working with XLSX, but they were all either forgotten, because they were not updated for 2-3 years, or they were necessarily drafted by third-party libraries, or they used DOM to work with files, which I didn’t really like. Each time, bumping into the next library and studying the mechanisms of its work, I caught myself thinking that all this was “from a cannon on a sparrow.” I do not need such a difficult decision!
I admit honestly, having studied superficially each of the solutions found, did not install and test one. I needed a simpler and more reliable solution like a tank.
')

Task


In general, since I did not find anything suitable, then it is necessary to formulate technical requirements for what is needed. The requirements, as expected, turned out to be trivial:

Separately dwell only on the last paragraph. As you know, XLSX is a regular zip archive that you can unpack and see that it consists of several files and directories. Inversely, it can be packaged and renamed to XLSX. If all changes are correct, then Microsoft Excel will open the file without problems.

Implementation


Initially, I really wanted to create all the files that make up XLSX, code, but, fortunately, I quickly realized the meaninglessness of my idea. And a different, more correct and simple solution was born. Using Microsoft Excel, you need to create an XLSX file in the form in which it is needed in the end, but without data, in other words, a template, and then, using code, just add data!
In this case, the class will have to unpack the template in a separate directory, make changes to /xl/worksheets/sheet1.xml and package the contents of the directory back to XLSX.

The class declaration contains public variables:
$ templateFile - template file name
$ exportDir is the folder into which the template will be unpacked, of course, with the necessary access rights.

The class constructor takes the name of the future file, the number of columns and rows. Then it checks that the file name is correct, the folder for unpacking the template exists and forms the full name of the final folder for unpacking the template.
After creating the class, you can unpack the template and open it on the sheet1.xml record. In fact, I do not just append to the file, but completely overwrite it. Once I took an initial line from it, I made a change in it in the dimension tag, which reflects the size of the exported range, and write it to a file.

public function openWriter() { if (is_dir($this->baseDir)) CFileHelper::removeDirectory($this->baseDir); mkdir($this->baseDir); exec("unzip $this->templateFullFilename -d \"$this->baseDir\""); $this->workSheetHandler = fopen($this->baseDir.'/xl/worksheets/sheet1.xml', 'w+'); fwrite($this->workSheetHandler, '<?xml version="1.0" encoding="UTF-8" standalone="yes"?><worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships"><dimension ref="A1:'.chr(64+$this->colCount).$this->rowCount.'"/><sheetData>'); } 


The resetRow and flushRow functions ensure the speed of work and the ability to work with large amounts of data. They are responsible for clearing the current row in memory and writing the current row to disk.
But the preservation of the values ​​of cells with different types was not such a simple task.

Line entry

It would seem difficult to write a string value to a file. However, XLSX is not so simple. All strings inside XLSX are stored in a separate /xl/sharedStrings.xml file. Cells are not written string values, and their ordinal numbers - indices. A sensible solution in terms of reducing file size.

But this solution is inconvenient from the point of view of the program filling the template. If we fulfill this requirement, then I would have to perform a separate pass through all string values ​​in the data array, exclude duplicate values, save them in sharedStrings.xml, index them, and enter their indices instead of values ​​in the original array. Slow and uncomfortable.

It turns out that you can bypass the requirement and store the string values ​​of the cells directly in the cells. But in this case the recording format will be different:

 public function appendCellString($value) { $this->curCel++; if (!empty($value)) { $value = htmlspecialchars($value, ENT_QUOTES, 'UTF-8'); $value = preg_replace( '/[\x00-\x13]/', '', $value ); $this->currentRow[] = '<cr="'.chr(64+$this->curCel).$this->numRows.'" t="inlineStr"'.($this->isBold ? ' s="7"' : '').'><is><t>'.$value.'</t></is></c>'; $this->numStrings++; } } 


Record number

There were no difficulties with writing integers or fractional numbers. It's simple:

 public function appendCellNum($value) { $this->curCel++; $this->currentRow[] = '<cr="'.chr(64+$this->curCel).$this->numRows.'"><v>'.$value.'</v></c>'; } 


Record date and time

Date and time are stored as the number of seconds past since January 1, 1970, divided by the number of seconds in a day. Moreover, in the calculation made a mistake with the definition of a leap year. In general, without going into details that are easy to find on the network, in order to correctly calculate the date, we had to declare two constants in the class:
ZERO_TIMESTAMP - date offset in Excel format from UNIX_TIMESTAMP
SEC_IN_DAY - seconds in days.
After calculating the date and time value, the integer part of the fraction is the date, the fractional part is the time:

 const ZERO_TIMESTAMP = 2209161600; const SEC_IN_DAY = 86400; public function appendCellDateTime($value) { $this->curCel++; if (empty($value)) $this->appendCellString(''); else { $dt = new DateTime($value); $ts = $dt->getTimestamp() + self::ZERO_TIMESTAMP; $this->currentRow[] = '<cr="'.chr(64+$this->curCel).$this->numRows.'" s="1"><v>'.$ts/self::SEC_IN_DAY.'</v></c>'; } } 

After recording all the data, it remains to close the worksheet and the workbook.

Application


As before, the use of the described class is based on the export of data using the CArrayDataProvider provider. Assuming that the volume of exported data can be very large, a special iterator CDataProviderIterator is used , which iterates over the returned data by 100 records (you can specify a different number of records).

 public function exportXLSX($organization, $user, &$filename) { $this->_provider = new CArrayDataProvider(/*query*/); Yii::import('ext.AlxdExportXLSX.AlxdExportXLSX'); $export = new AlxdExportXLSX($filename, count($this->_attributes), $this->_provider->getTotalItemCount() + 1); $export->openWriter(); $export->resetRow(); $export->openRow(true); foreach ($this->_attributes as $code => $format) $export->appendCellString($this->_objectref->getAttributeLabel($code)); $export->closeRow(); $export->flushRow(); $rows = new CDataProviderIterator($this->_provider, 100); foreach ($rows as $row) { $export->resetRow(); $export->openRow(); foreach ($this->_attributes as $code => $format) { switch ($format->type) { case 'Num': $export->appendCellNum($row[$code]); /*other types*/ default: $export->appendCellString(''); } } $export->closeRow(); $export->flushRow(); } $export->closeWriter(); $export->zip(); $filename = $export->getZipFullFileName(); } 

Anyone interested, can get the source code of my class AlxdExportXLSX completely free of charge.

Source: https://habr.com/ru/post/236107/


All Articles