📜 ⬆️ ⬇️

Implementing a quick import from Excel to PHP

We continue to talk about the technologies used in our email-marketing service Pechkin-mail.ru . One of the key tasks of any service related to customer data is to download this data to the service. For Pechkin, it is very important to quickly and easily download the address databases containing email addresses, first names, last names and other additional data.

What to use as a tool?


We used Microsoft Excel as the base standard used when importing address databases. This is explained simply:


The problem of users when importing address databases was removed. But there is already a problem directly developing.


Our pain as developers


Excel is not an open-source development, but a proprietary solution. The data format, especially in newer versions after 2007 (xlsx), is non-trivial. PHP is used at Pechkin, so we began searching for libraries that will allow us to solve this problem. But here we encountered the problem that a number of libraries do not allow reading xlsx:

')
The PHPExcel library attracted our attention. We used it several years ago in the SMS24X7.ru sms-mailing service . Petya Sokolov ( Petr_Sokolov ), our talented developer, wrote a wrapper for this library, correcting a number of its flaws and bugs.

The library is certainly interesting and developed. But for Pechkin, it became impossible to use it in a couple of years, when both we and our customers grew up - its disastrous demands on resources and the huge time of parsing files. For example, there are frequent cases of loading on the service of address databases> 100,000 lines with a complex structure. And if the file is already 500,000 lines and “weighs” more than 30MB?

And then we let go ...


In the process of searching, we stumbled upon the libxl commercial library, seeing the results of the “handicraft benchmark” at Stackoverflow.

The library is written in C ++, and thanks to the excellent object-oriented extension for PHP from Ilia Alshanetsky , it is easy to learn and integrate (for example, it took about 3 hours to rewrite our current solution from PHPExcel to LibXL). What is very cool, considering that, unfortunately, there is no documentation from the developer, and it is necessary to use the Reflection extension.

The installation process is very simple.

cd /usr/local/src/ wget http://libxl.com/download/libxl.tar.gz tar zxfv libxl.tar.gz cd libxl-3.5.4.1/ ln -s include_c include cd ../ wget https://github.com/iliaal/php_excel/archive/master.zip unzip master.zip cd php_excel-master/ ./configure --with-excel --with-libxl-libdir=../libxl-3.5.4.1/lib64 --with-libxl-incdir=../libxl-3.5.4.1/include_c make make test make install 


As a result of the compilation, you will receive the excel.so file in the / usr / lib / php5 / 20090626 / folder. Now it’s enough to create the file /etc/php5/conf.d/excel.ini with the contents.
 extension=excel.so 


Check whether the module is installed and restart the web server.
 php -m | grep excel service lighttpd restart 


In the code, everything is also very simple. You load the file and read the necessary cells. For example, like this:
 $doc = new ExcelBook(); $doc->loadFile('example.xls'); for($r=$sheet->firstRow();$r<=$sheet->lastRow();$r++){ for($c=$sheet-> firstCol();$c<=$sheet-> lastCol();$c++){ echo ': '.$r.' : '.$c.' : '.$sheet->read($r,$c).'<br />'; } } 


Performance Results


The absence of the need for RAM (in the process of downloading the file and reading it) was pleasantly pleased.


And here is the increase in the speed of downloading the excel file and its reading on various sizes of address bases.


These tests were carried out on xlsx files with N subscribers in one hundred hundred with email. The real address bases are even bigger and more complex and the advantage in speed and memory consumption looks even more significant.

The cost of the library is $ 199 for a developer license, but believe me, it's worth it. Of course, we recommend to anyone who is faced with the problem of importing Excel files into their service.

Source: https://habr.com/ru/post/224207/


All Articles