
Not-quite-honest DOC file generation in PHP

Every task has several solutions, and sometimes, for the sake of speed, you have to pick not the most elegant one but the one that works and meets the goals that were set. So, on one not particularly fine day, the following feature had to be implemented: (almost) every page of the site should have automatically generated copies in DOC and PDF formats, with all tables and images inside the content preserved. With PDF everything is relatively simple (tcpdf is our friend and brother), but DOC caused some confusion. Below is an example of how this problem was solved.

Several solutions came to mind one after another, and the final one grew out of the third option: create a document in MHT format, embed the images in it, and save it with the DOC extension. The file is generated with a simple ready-made library that provides the MhtFileMaker class. The code makes no claim to beauty or generality; moreover, it has problems that did not matter for that site. The main thing is that it works, and it is enough to understand the idea.
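For orientation, here is roughly what such a file looks like inside: an MHT document is a MIME multipart/related message in which the HTML page and every image are separate parts, each labelled with a Content-Location. The sketch below is not the code of the MhtFileMaker library. The helper names buildMht and mhtPart are hypothetical, and whether Word accepts exactly this output has not been checked here.

```php
<?php
// Hypothetical helpers for illustration only; not part of MhtFileMaker.

// Build one MIME part: headers, blank line, base64-encoded body.
function mhtPart(string $boundary, string $location, string $mime, string $bytes): string
{
    return "--$boundary\r\n"
        . "Content-Type: $mime\r\n"
        . "Content-Transfer-Encoding: base64\r\n"
        . "Content-Location: $location\r\n\r\n"
        . chunk_split(base64_encode($bytes), 76, "\r\n")
        . "\r\n";
}

// $images maps the path used in the HTML to [MIME type, raw bytes].
function buildMht(string $html, array $images): string
{
    $boundary = '----=_NextPart_' . md5($html);

    $out  = "MIME-Version: 1.0\r\n";
    $out .= "Content-Type: multipart/related; type=\"text/html\"; boundary=\"$boundary\"\r\n\r\n";

    // The page itself goes first, then every resource it refers to.
    $out .= mhtPart($boundary, 'index.html', 'text/html; charset="utf-8"', $html);
    foreach ($images as $location => [$mime, $bytes]) {
        $out .= mhtPart($boundary, $location, $mime, $bytes);
    }

    // Closing boundary ends the message.
    return $out . "--$boundary--\r\n";
}
```

The result could then be written with file_put_contents("test.doc", buildMht($html, $images)), which is the whole "trick": the content is MHT, only the extension says DOC.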

Example usage of the resulting function:

    $link = "http://m.habrahabr.ru/post/136811/";
    CreateDOC($link, "test.doc");


And here is the source code of the function:

    function CreateDOC($link, $filename)
    {
        // Base URL of the page: the link with its last segment cut off
        $base_link = explode("/", $link);
        unset($base_link[count($base_link) - 1]);
        $base_link[] = "";
        $base_link = implode("/", $base_link);

        // Fetch the page
        $get_text = file_get_contents($link);

        // Object that will assemble the MHT file
        $MhtFileMaker = new MhtFileMaker();

        // Find all images (regular expression suggested by FlexIDK).
        // The pattern has two capture groups; the src value is the second one.
        preg_match_all('@<img(.*)?src="([^"]+)"@ui', $get_text, $matches);

        foreach ($matches[2] as $img) {
            $img_tmp = $img;
            $img_tmp_old = $img;

            // Relative link? Make it absolute!
            if (strpos($img_tmp, "http") === FALSE) {
                $img_tmp = $base_link . $img_tmp;
            }

            // Path of the image without the scheme and host
            $img_array = explode("//", $img_tmp);
            $img_name_only = explode("/", $img_array[1]);
            unset($img_name_only[0]);
            $img_name_only = implode("/", $img_name_only);

            // Replace the old path in the page with the new (local) one
            $get_text = str_replace($img_tmp_old, $img_name_only, $get_text);

            // Add the image to the archive
            $MhtFileMaker->AddFile($img_tmp, $img_name_only, NULL);
        }

        // Add the page itself
        $MhtFileMaker->AddContents("index.html", "text/html", $get_text);

        // Save the file
        $MhtFileMaker->MakeFile($filename);
    }

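The path handling in the function above relies on splitting strings by "/" and "//", which is one of the places where it can misbehave on other sites (for example, with root-relative paths such as /i/b.png). As an illustration only, the same step could be sketched with parse_url. The helper name resolveImage is hypothetical, and the sketch ignores ports, query strings and "../" segments.

```php
<?php
// Hypothetical helper for illustration only.
// Returns [absolute URL to download, name to use inside the archive].
function resolveImage(string $pageUrl, string $src): array
{
    $page   = parse_url($pageUrl);
    $origin = $page['scheme'] . '://' . $page['host'];

    if (preg_match('@^https?://@i', $src)) {
        // Already an absolute URL
        $absolute = $src;
    } elseif (strpos($src, '//') === 0) {
        // Protocol-relative: reuse the scheme of the page
        $absolute = $page['scheme'] . ':' . $src;
    } elseif (strpos($src, '/') === 0) {
        // Root-relative: attach to the site origin
        $absolute = $origin . $src;
    } else {
        // Relative to the directory of the page
        $path = $page['path'] ?? '/';
        $dir  = substr($path, 0, strrpos($path, '/') + 1);
        $absolute = $origin . $dir . $src;
    }

    // Archive name: the URL path without the leading slash
    $local = ltrim((string) parse_url($absolute, PHP_URL_PATH), '/');

    return [$absolute, $local];
}
```

For a page-relative image this yields the same local name as the original string splitting (the URL path without the host), so it could be dropped into the loop without changing the rest of the logic.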
Naturally, a much more general and cleaner class could be written on top of this, but for our purposes this was enough. The main thing is that the solution works, and works fast enough. I hope someone finds it useful.

Update: commenters tested the resulting file. It opens properly only in Microsoft Word 2003 and later; third-party products (OpenOffice and others) have problems with it. The comments also contain links to many other, more correct conversion methods.

Update 2: the source code has been updated. FlexIDK suggested a better regular expression that picks out the image paths without any extra characters.
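To make the regular-expression step easier to check in isolation, here is a small sketch. It is not FlexIDK's pattern: it is a slightly stricter variant with a single capture group, offered as an assumption about what a more defensive version might look like, and the function name extractImageSources is hypothetical. The pattern used in the function has two groups with the path in the second one, and its greedy (.*) can skip images when several img tags share one line; [^>]* keeps each match inside its own tag.

```php
<?php
// Hypothetical helper for illustration only.
// Returns the src values of all <img> tags that use double-quoted attributes.
function extractImageSources(string $html): array
{
    // [^>]*? stays inside one tag; \s requires whitespace right before "src".
    preg_match_all('@<img\b[^>]*?\ssrc="([^"]+)"@ui', $html, $matches);

    // Group 1 is the only capture group and holds the paths.
    return $matches[1] ?? [];
}
```

Single-quoted or unquoted src attributes are not handled; for arbitrary HTML, a DOM parser would be the more reliable route.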

Source: https://habr.com/ru/post/136999/

