
Writing a PHP extension

Today let's look at PHP from a different angle and write an extension for it. Since Habr already has publications on this topic ( here and here ), we will not go into why this can be useful or what it can be used for in practice. This article explains how to build a simple extension under Windows using Visual C++ and under Debian using GCC. I will also try to shed some light on working with PHP arrays inside extensions and compare the performance of an algorithm written in native PHP with the same algorithm written in C.





Compiling under Win32



So let's start with Windows. As you know, the PHP developers use Visual C++ 9 (Visual Studio 2008) to build PHP for Windows. We will therefore use Visual Studio 2008; the free Express edition is also suitable, and later or earlier versions of the studio will probably work as well.



What we need:

First, create a project of type Win32 Console Application and select DLL as the application type. Now we have to configure all the dependencies and paths for the linker:

Find the stdafx.h file in the project and replace its contents with the following:

    #ifndef STDAFX
    #define STDAFX

    #define PHP_COMPILER_ID "VC9"  // PHP itself is built with Visual C++ 9.0
    #include "zend_config.w32.h"
    #include "php.h"

    #endif


If you try to compile the project at this stage, you will get an error saying that main\config.w32.h is missing. You can obtain it either by running the main\configure.bat script or by taking it from the sources of, for example, PHP 5.2. In this case, do not forget to fix all the paths in the file and uncomment the "#define HAVE_SOCKLEN_T" directive. After that the project should compile without errors.



Now let's write a hello world. Add the following to our .cpp file:



    PHP_FUNCTION(test);

    const zend_function_entry test_functions[] = {
        PHP_FE(test, NULL)
        {NULL, NULL, NULL}
    };

    zend_module_entry test_module_entry = {
        STANDARD_MODULE_HEADER,  // #if ZEND_MODULE_API_NO >= 20010901
        "test",                  // extension name
        test_functions,          // functions exported to PHP
        NULL,                    // PHP_MINIT(test), Module Initialization
        NULL,                    // PHP_MSHUTDOWN(test), Module Shutdown
        NULL,                    // PHP_RINIT(test), Request Initialization
        NULL,                    // PHP_RSHUTDOWN(test), Request Shutdown
        NULL,                    // PHP_MINFO(test), Module Info (for phpinfo())
        "0.1",                   // extension version
        STANDARD_MODULE_PROPERTIES
    };

    ZEND_GET_MODULE(test)

    PHP_FUNCTION(test)
    {
        // Return a PHP string; the 1 means the C string is copied
        RETURN_STRING("hello habr", 1);
    }


Now load this module into PHP and run something like this (the function returns the string, so it has to be echoed):

    php -r "echo test();"

This should print “hello habr”.





Compiling under *nix



Under *nix everything turned out to be simpler, as always. I will use Debian as the example; I expect the process to be much the same on other systems.

We will need:

Let's create a directory for our extension somewhere, for example /test. In it we create two empty files:

 config.m4
 test.c


The first one drives the build magic for the extension, and the second is its source code. In config.m4 we write the following:

    PHP_ARG_ENABLE(test, Enable test support)

    if test "$PHP_TEST" = "yes"; then
      AC_DEFINE(HAVE_TEST, 1, [You have test extension])
      PHP_NEW_EXTENSION(test, test.c, $ext_shared)
    fi


At the top of test.c add

    #include "php.h"

After this line, copy in the contents of the .cpp file from the Windows version.

Now we go to the console and run:

    # phpize          // prepare the extension for building
    # ./configure     // generate the makefile
    # make            // build
    # make install    // copy the .so into the PHP extensions directory


That's all. Now you can open php.ini, add your extension there:

 extension = test.so


Then check that it works with the command

    php -r "echo test();"




Argument handling and return values





First, let's look at how to accept arguments:

    char *text;
    int text_length;

    if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "s", &text, &text_length) == FAILURE) {
        return;
    }


The third parameter is the type specification string that describes the expected types ( here you can see all the options); in this case "s" denotes a string, which arrives as a char * together with its int length. Under the same link you can find how to combine types and how to specify the number of arguments. All subsequent parameters are pointers to the variables that will receive the passed values. For a string, both the string itself and its length are passed.

If the arguments passed to your function do not match the specification, an E_WARNING is raised and zend_parse_parameters returns FAILURE; at that point you can return some value, for example an error message.



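You can return both simple and complex types. Let's look at how to build a returned array. To signal that an array will be returned, it must first be initialized (result here stands for the zval being filled; inside a PHP_FUNCTION body the return value is available as return_value):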

 array_init(result); 


To add values to an array, use the function that matches the kind of key and the type of value being added. For example:

    add_next_index_long(result, 42);   // $result[] = 42;
    add_assoc_bool(result, "foo", 1);  // $result['foo'] = true;
    add_next_index_null(result);       // $result[] = NULL;


A full list of these functions can be found here.



If anyone is interested, I can cover working with objects in the next article (mysqli is a classic example of an extension that exposes objects). There is a very good article on this topic.





Performance



To check the performance, I chose a somewhat synthetic example: counting the occurrence of each character in a string. In other words, we need to get a function that takes a string as a parameter, and returns an array in which the number of uses of each character in a given string is indicated. This example will demonstrate working with large strings.



Here is my implementation. Don't judge the code too harshly, I still write far more PHP than C:

    PHP_FUNCTION(calculate_chars)
    {
        char *text;
        int text_length;
        int table[256] = { 0 };  /* occurrences of each byte value */
        char str[2];             /* one-character key plus terminator */
        int i;

        if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "s", &text, &text_length) == FAILURE) {
            return;
        }

        /* return_value is the zval that PHP_FUNCTION provides for the result */
        array_init(return_value);

        for (i = 0; i < text_length; i++) {
            table[(unsigned char)text[i]]++;
        }

        str[1] = '\0';
        for (i = 0; i < 256; i++) {
            if (table[i]) {
                str[0] = (char)i;
                add_assoc_long(return_value, str, table[i]);
            }
        }
    }


This code produces the following result:

    user> php -r "print_r( calculate_chars('example') );"
    Array
    (
        [a] => 1
        [e] => 2
        [l] => 1
        [m] => 1
        [p] => 1
        [x] => 1
    )
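The counting step at the core of calculate_chars does not depend on the Zend API, so it can be tried outside PHP. What follows is a minimal sketch in plain C, not the extension's actual code: the helper name count_bytes and the BYTE_VALUES constant are assumptions made for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BYTE_VALUES 256 /* hypothetical name: number of distinct byte values */

/* Hypothetical helper: sets table[b] to the number of times byte b
 * occurs in the first `length` bytes of `text`. */
void count_bytes(const char *text, size_t length, long table[BYTE_VALUES])
{
    size_t i;

    assert(text != NULL && table != NULL);
    memset(table, 0, BYTE_VALUES * sizeof table[0]);

    for (i = 0; i < length; i++) {
        /* Cast to unsigned char so bytes above 127 never give a negative index. */
        table[(unsigned char)text[i]]++;
    }
}
```

Called as count_bytes("example", 7, table), this leaves table['e'] equal to 2 and table['x'] equal to 1, in line with the print_r output above. The extension then only has to copy the non-zero entries into the PHP array.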


Now let's compare the speed of the extension with the same algorithm in native PHP:

    $length = strlen($text);
    $map = array();
    for ($i = 0; $i < $length; $i++) {
        $char = $text[$i];
        if (isset($map[$char])) {
            $map[$char]++;
        } else {
            $map[$char] = 1;
        }
    }


I will compare the execution time of both solutions using the microtime function. We take a string of 100 characters, a string of 5000 characters, and a string of 69000 characters (I took the book A Message from the Sea by Charles Dickens, I hope he will forgive me for this), and for each of them we run both solutions several thousand times. The results are shown in the table below. Testing was done on my not very powerful home laptop and on a VDS running Debian. And yes, I clearly understand that the results may depend on the configuration, the operating system version, the PHP version, atmospheric pressure and wind direction, but I only wanted to show approximate numbers.
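The measurement idea described here (run the same call many times and take the difference between two clock readings) can be sketched in C as well. This is only an illustration under stated assumptions: the article's script uses PHP's microtime, whereas this sketch uses the standard C clock() function, which reports processor time rather than wall-clock time. The names bench_fn, time_iterations and count_call are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical callback type: one run of the code under test. */
typedef void (*bench_fn)(void *ctx);

/* Calls fn(ctx) `iterations` times and returns the processor time
 * spent, in seconds, as measured by clock(). */
double time_iterations(bench_fn fn, void *ctx, long iterations)
{
    clock_t start, end;
    long i;

    assert(fn != NULL && iterations >= 0);

    start = clock();
    for (i = 0; i < iterations; i++) {
        fn(ctx);
    }
    end = clock();

    return (double)(end - start) / CLOCKS_PER_SEC;
}

/* Example workload: increments a counter so the calls can be observed. */
void count_call(void *ctx)
{
    (*(long *)ctx)++;
}
```

Timing a large number of iterations rather than a single call matters because one call on a short string finishes faster than the clock can resolve.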

The full code of the test script can be downloaded here. The sources and binaries of the extensions themselves can be downloaded here (win) and here (nix).

Test                       | Number of iterations | PHP code / Win32 | PHP code / Debian | PHP extension / Win32 | PHP extension / Debian | Speedup / Win32 | Speedup / Debian
1. 100-character string    | 1,000,000            | 84.7566 sec      | 72.5617 sec       | 8.4750 sec            | 4.4175 sec             | 10 times        | 16.43 times
2. 5000-character string   | 10,000               | 39.1012 sec      | 31.7541 sec       | 0.5001 sec            | 0.134 sec              | 78.19 times     | 236.98 times
3. 69000-character string  | 1,000                | 52.3378 sec      | 44.0647 sec       | 0.4875 sec            | 0.0763 sec             | 107.36 times    | 577.51 times
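The last two columns of the table are the time of the PHP code divided by the time of the extension on the same system. For the first row on Win32 that is 84.7566 / 8.4750, or about 10. As a small sketch of that arithmetic (the function name speedup is an assumption for this example, not something from the test script):

```c
#include <assert.h>

/* Hypothetical helper: how many times faster the candidate is than the
 * baseline, given both running times in seconds. Values below 1 mean
 * the candidate is slower. */
double speedup(double baseline_seconds, double candidate_seconds)
{
    assert(candidate_seconds > 0.0);
    return baseline_seconds / candidate_seconds;
}
```

For example, speedup(52.3378, 0.4875) gives roughly 107.36, the Win32 figure in the third row.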


Conclusions



Comparing the module with the interpreted code, we see that tangible gains appear on large amounts of data, even with a small number of iterations. In other words, it makes little sense to move frequently used but not very resource-intensive algorithms into compiled code, whereas for algorithms that process large amounts of data it can be practical. My measurements also show that the results of the PHP code are comparable on the two systems (I remind you that these were two different machines), while the results of the extension differ a lot. From this I personally conclude that there are compilation details I am not aware of.

However, I strongly doubt that anyone uses a Windows server for PHP projects. I also very much doubt that anyone will rush to rewrite something in C right now; this article is more for fun than a guide to action. I just wanted to show that writing a PHP extension is very simple and can sometimes be very useful.



UPD1. Comparison with count_chars

An interesting question came up in the comments: how does this compare with the performance of the built-in count_chars function?

I increased the number of iterations a hundredfold and ran the same test with this function. On Debian the results are almost equal, while under Windows the picture is interesting: the larger the data, the more my module loses in performance. Let me remind you that the point of the test was not to reinvent the wheel but to take some algorithm that works with large amounts of data.

Test                       | Number of iterations | count_chars / Win32 | count_chars / Debian | extension / Win32 | extension / Debian | Speedup / Win32 | Speedup / Debian
1. 100-character string    | 10,000,000           | 67.5245 sec         | 47.8104 sec          | 81.8185 sec       | 43.8091 sec        | 0.83 times      | 1.09 times
2. 5000-character string   | 1,000,000            | 22.4693 sec         | 12.8959 sec          | 47.2514 sec       | 12.9577 sec        | 0.48 times      | 0.99 times
3. 69000-character string  | 100,000              | 15.0681 sec         | 7.661 sec            | 46.9598 sec       | 7.7387 sec         | 0.32 times      | 0.99 times




Materials

Source: https://habr.com/ru/post/125597/


