
Combining C++ and Python. The Subtleties of Boost.Python. Part One

Boost.Python is an excellent library in every respect and does its job superbly, whether you want to write a C++ module for Python or build Python scripting bindings for a native application written in C++.
The hardest thing about Boost.Python is the sheer number of subtleties. C++ and Python are both languages full of possibilities, so at their junction you have to account for every nuance: whether to pass an object by reference or by value, whether to give Python a copy of an object or the existing instance, whether to convert to a built-in Python type or to a wrapper written in C++, how to expose an object's constructor, how to overload operators, and how to attach methods that do not exist in C++ but are needed in Python.
I do not promise that my examples will cover every subtlety of how these two fundamental languages interact, but I will try to cover as many frequently used cases as possible right away, so that you do not have to dig through the documentation for every little thing and can find all the necessary basics here, or at least get a general idea of them.


Introduction


We assume that you already have a convenient toolchain for building a dynamically linked library in C++, and that a Python interpreter is installed.
You will also need to download the Boost library and build it, following the instructions for your operating system, Windows or Linux.
In short, on Windows it all comes down to two lines at the command prompt. Unpack the downloaded Boost archive anywhere on disk, change to that directory at the command prompt, and run two commands in succession:
    bootstrap
    b2 --build-type=complete stage

To build for x64, add the argument address-model=64.
If you already have the Boost libraries but had no Python installed when you built them, or you have installed a fresh Python interpreter and want to build only Boost.Python, use the additional option --with-python.
That is, the complete line for building only Boost.Python with 64-bit addressing looks like this:
    b2 --build-type=complete address-model=64 --with-python stage

Note that you need the x64 build if the Python you have installed is 64-bit. Modules for it must also be built with 64-bit addressing.
The --with-python option will save you a lot of time if Boost.Python is the only part of Boost you need.
If you have several interpreters installed, I strongly recommend reading the detailed documentation on building Boost.Python.
After the build, the compiled Boost.Python libraries will be in the Boost\stage\lib folder; we will need them very soon.
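
Whether you need address-model=64 depends on the interpreter you are building against, and the interpreter can tell you itself. Below is a minimal sketch that uses only the standard library; the helper name interpreter_bits is made up for this illustration.

```python
import struct
import sys


def interpreter_bits() -> int:
    """Return the pointer width of the running interpreter, in bits."""
    # struct.calcsize("P") is the size of a C pointer (void *) in bytes.
    return struct.calcsize("P") * 8


if __name__ == "__main__":
    bits = interpreter_bits()
    print("Python %d.%d, %d-bit" % (sys.version_info[0], sys.version_info[1], bits))
    if bits == 64:
        print("Build Boost.Python and the module with address-model=64")
    else:
        print("Build Boost.Python and the module without address-model=64")
```

Run it with the same interpreter that will later import the module; if several interpreters are installed, each one may give a different answer.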

Configuring the C++ project


Create a project for a dynamically linked library in C++; I suggest calling it example.
After creating the project, add the Python\include directory and the Boost root to the additional include directories, and add Python\libs and Boost\stage\lib to the library search directories.
On Windows, also add a post-build event that renames $(TargetPath) to a module with the name example.pyd in the project root.
It may also be worth copying the built Boost.Python libraries into the directory of the module being built.
Then, after starting the interpreter in that directory, loading the module comes down to a single command:
    from example import *

Do not forget to build for x64 if you are targeting 64-bit Python.
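
The rename to example.pyd matters because the interpreter only treats files with particular suffixes as extension modules. The sketch below lists those suffixes and looks for a matching file. find_extension_files is a hypothetical helper written for this example, and the assumption is that the module sits in the current directory.

```python
import importlib.machinery
from pathlib import Path


def find_extension_files(module_name, directory="."):
    """Return files in `directory` that this interpreter would load as `module_name`."""
    candidates = (
        Path(directory) / (module_name + suffix)
        for suffix in importlib.machinery.EXTENSION_SUFFIXES
    )
    return [path for path in candidates if path.is_file()]


if __name__ == "__main__":
    # On Windows the list includes ".pyd"; on Linux it includes ".so".
    print("Recognized suffixes:", importlib.machinery.EXTENSION_SUFFIXES)
    print("Found:", find_extension_files("example"))
```

An empty result for "example" is a hint that the post-build step did not produce a file the interpreter recognizes.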

An ordinary class with simple fields


So, let's add three files to our new project at once:
some.h
some.cpp
wrap.cpp
In some.h and some.cpp we describe a wonderful class Some, which we then wrap for Python in the example module in wrap.cpp. To do that, include <boost/python.hpp> in wrap.cpp and use the macro BOOST_PYTHON_MODULE( example ) { ... }. For brevity it also helps to write using namespace boost::python. In outline, our future module looks like this:
    #include <boost/python.hpp>
    ...
    using namespace boost::python;
    ...
    BOOST_PYTHON_MODULE( example )
    {
        ...
    }
    ...


In some.h we declare our wonderful class. Two fields are enough to explain most of the basic mechanisms:
    private:
        int mID;
        string mName;

Suppose the class describes something that has a name and an integer identifier. Oddly enough, this simple class will cause quite a few difficulties, mainly because of the standard string class, method overloads, a const reference, and the static member NOT_AN_IDENTIFIER, which we will of course also introduce:
    public:
        static const int NOT_AN_IDENTIFIER = -1;

This constant serves as the identifier of an object created by the default constructor. We also declare a second constructor that sets both fields:
    Some();
    Some( int some_id, string const& name );

The constructors are implemented in some.cpp. I will not walk through implementations later on, but let's write the constructors together:
    Some::Some()
        : mID(NOT_AN_IDENTIFIER)
    {
    }

    Some::Some( int some_id, string const& name )
        : mID(some_id), mName(name)
    {
    }

As soon as the Some class appears, its Python class wrapper appears in wrap.cpp:
    BOOST_PYTHON_MODULE( example )
    {
        class_<Some>( "Some" )
            .def( init<int,string>( args( "some_id", "name" ) ) )
        ;
    }

There is some shameless sleight of hand at work here: the boost::python::class_ template creates the class description for Python in the given module through the Python C API, which is dreadfully complex and opaque when it comes to describing methods, so all of that is hidden behind a simple .def() call on each line.
Unless told otherwise, the wrapper exposes the default constructor and the copy constructor by default; we will return to this a little later.
You can already build the module, import it in the Python interpreter, and even create an instance of the class, but you cannot yet read its properties or call its methods, because they simply do not exist yet.
Let's fix that and create the "rich" API of our miracle class. Here is the full code of the header file some.h:
    #pragma once
    #include <string>

    using std::string;

    class Some
    {
    public:
        static const int NOT_AN_IDENTIFIER = -1;

        Some();
        Some( int some_id, string const& name );

        int ID() const;
        string const& Name() const;

        void ResetID();
        void ResetID( int some_id );

        void ChangeName( string const& name );

        void SomeChanges( int some_id, string const& name );

    private:
        int mID;
        string mName;
    };


Since the method implementations also turned out to be rather short, here is the code of some.cpp:
    #include "some.h"

    Some::Some()
        : mID(NOT_AN_IDENTIFIER)
    {
    }

    Some::Some( int some_id, string const& name )
        : mID(some_id), mName(name)
    {
    }

    int Some::ID() const
    {
        return mID;
    }

    string const& Some::Name() const
    {
        return mName;
    }

    void Some::ResetID()
    {
        mID = NOT_AN_IDENTIFIER;
    }

    void Some::ResetID( int some_id )
    {
        mID = some_id;
    }

    void Some::ChangeName( string const& name )
    {
        mName = name;
    }

    void Some::SomeChanges( int some_id, string const& name )
    {
        mID = some_id;
        mName = name;
    }

Well, it is time to describe the wrapper in wrap.cpp.
The first method, Some::ID(), wraps without any trouble:
    .def( "ID", &Some::ID )

But the second, which returns a const reference to a string, already shows that things are not so simple:
    .def( "Name", &Some::Name, return_value_policy<copy_const_reference>() )

As you can see, when a C++ method returns a pointer or a reference, you can tell Python how to treat the return value. The reason is that Python's merciless garbage collector (GC) deletes everything left unattended, so you are not allowed to simply declare a method that returns a pointer or reference: without a policy the build fails at compile time, because the GC has to know what to do with the returned value. It would be very sad for the developer if Python started deleting the contents of an object that C++ still owns. return_value_policy has several options for different cases; the ones this series relies on are copy_const_reference and copy_non_const_reference, which hand Python a copy of the referenced object, and reference_existing_object, which is covered in the next part.

Understanding exactly how each return_value_policy works comes with time: experiment, try things, read the documentation, and tinker. For a standard string returned by reference, the policy is almost always copy_const_reference or copy_non_const_reference, depending on whether the returned reference is const. Just remember the rule: a string returned by value is converted at the Python level into an object of the built-in str class, whereas for a string returned by reference you must specify the return_value_policy explicitly.

I overloaded Some::ResetID on purpose, to complicate the task of passing a method pointer to .def(). An overloaded name has no single address, so a static_cast to the exact member-function pointer type selects the overload we mean:
    .def( "ResetID", static_cast< void (Some::*)() >( &Some::ResetID ) )
    .def( "ResetID", static_cast< void (Some::*)(int) >( &Some::ResetID ), args( "some_id" ) )


As you can see, you can specify the name under which the method argument will appear in Python. As you know, argument names matter more in Python than in C++, because callers can pass arguments by name. I recommend specifying argument names for every wrapped method that takes parameters:
    .def( "ChangeName", &Some::ChangeName, args( "name" ) )
    .def( "SomeChanges", &Some::SomeChanges, args( "some_id", "name" ) )
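
Here is a pure-Python analogy of what args(...) buys you, assuming Python 3.8 or later for the positional-only marker. Neither function below comes from the wrapper. They only mimic, as far as I know the behavior, a method wrapped without argument names (callable positionally only) and one wrapped with args( "name" ).

```python
def change_name_unnamed(name, /):
    """Stand-in for a method wrapped without args(...): positional only."""
    return name


def change_name_named(name):
    """Stand-in for a method wrapped with args("name"): keyword call works."""
    return name


def accepts_keyword(func) -> bool:
    """Report whether func can be called as func(name=...)."""
    try:
        func(name="qwe")
    except TypeError:
        return False
    return True


if __name__ == "__main__":
    print("unnamed accepts keyword:", accepts_keyword(change_name_unnamed))
    print("named accepts keyword:", accepts_keyword(change_name_named))
```

With the compiled module, the named form is what allows a call such as s.SomeChanges( some_id=345, name='zxc' ).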


It remains to expose the constant NOT_AN_IDENTIFIER as a static property:
    .add_static_property( "NOT_AN_IDENTIFIER", make_getter( &Some::NOT_AN_IDENTIFIER ) )

Here we use the helper function boost::python::make_getter, which generates a getter function from a class data member.
Here is what our complete wrapper looks like:
    #include <boost/python.hpp>
    #include "some.h"

    using namespace boost::python;

    BOOST_PYTHON_MODULE( example )
    {
        class_<Some>( "Some" )
            .def( init<int,string>( args( "some_id", "name" ) ) )
            .def( "ID", &Some::ID )
            .def( "Name", &Some::Name, return_value_policy<copy_const_reference>() )
            .def( "ResetID", static_cast< void (Some::*)() >( &Some::ResetID ) )
            .def( "ResetID", static_cast< void (Some::*)(int) >( &Some::ResetID ), args( "some_id" ) )
            .def( "ChangeName", &Some::ChangeName, args( "name" ) )
            .def( "SomeChanges", &Some::SomeChanges, args( "some_id", "name" ) )
            .add_static_property( "NOT_AN_IDENTIFIER", make_getter( &Some::NOT_AN_IDENTIFIER ) )
        ;
    }

If you run a simple test script like this (Python 3.x):
    from example import *

    s = Some()
    print( "s = Some(); ID: {ID}, Name: '{Name}'".format(ID=s.ID(),Name=s.Name()) )
    s = Some(123,'asd')
    print( "s = Some(123,'asd'); ID: {ID}, Name: '{Name}'".format(ID=s.ID(),Name=s.Name()) )
    s.ResetID(234); print("s.ResetID(234); ID:",s.ID())
    s.ResetID(); print("s.ResetID(); ID:",s.ID())
    s.ChangeName('qwe'); print("s.ChangeName('qwe'); Name:'%s'" % s.Name())
    s.SomeChanges(345,'zxc')
    print( "s.SomeChanges(345,'zxc'); ID: {ID}, Name: '{Name}'".format(ID=s.ID(),Name=s.Name()) )

We will see the following output:
    s = Some(); ID: -1, Name: ''
    s = Some(123,'asd'); ID: 123, Name: 'asd'
    s.ResetID(234); ID: 234
    s.ResetID(); ID: -1
    s.ChangeName('qwe'); Name:'qwe'
    s.SomeChanges(345,'zxc'); ID: 345, Name: 'zxc'


Pythonizing the class wrapper


So, the class is wrapped with all its methods, yet happiness has not arrived. If you evaluate Some(123, 'asd') at the Python prompt, you will not see the fields or any description of the object, because we have not provided a __repr__ method. Conversion to a string is no better: print(Some(123, 'asd')) is woefully uninformative, because we have not provided a __str__ method either. It is also obvious that accessing properties through C++-style methods makes little sense in Python. C++ has no notion of a property, but Python does, so that is where we can and should create them. But how do we attach methods meant only for Python to a finished C++ class?
Very simply: remember that in Python a method is just a function that takes self, a reference to the class instance, as its first parameter. We write such functions directly in wrap.cpp in C++ and register them as methods in the wrapper:
    ...
    string Some_Str( Some const& );
    string Some_Repr( Some const& );
    ...
    BOOST_PYTHON_MODULE( example )
    {
        class_<Some>( "Some" )
        ...
            .def( "__str__", Some_Str )
            .def( "__repr__", Some_Repr )
        ...

The functions themselves can be written like this, for example:
    #include <sstream>  // needed for std::stringstream
    ...
    string Some_Str( Some const& some )
    {
        std::stringstream output;
        output << "{ ID: " << some.ID() << ", Name: '" << some.Name() << "' }";
        return output.str();
    }

    string Some_Repr( Some const& some )
    {
        return "Some: " + Some_Str( some );
    }
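
The same trick can be shown in pure Python, as a sketch. The class Some below is a stand-in written for this example (it is not the compiled wrapper), and the two free functions are attached to it after the fact, which is roughly what .def( "__str__", Some_Str ) arranges from the C++ side.

```python
class Some:
    """Pure-Python stand-in for the wrapped class; illustration only."""

    def __init__(self, some_id=-1, name=""):
        self._id = some_id
        self._name = name

    def ID(self):
        return self._id

    def Name(self):
        return self._name


def some_str(self):
    # A plain function whose first parameter plays the role of self.
    return "{ ID: %d, Name: '%s' }" % (self.ID(), self.Name())


def some_repr(self):
    return "Some: " + some_str(self)


# Attaching the functions to the class turns them into methods.
Some.__str__ = some_str
Some.__repr__ = some_repr

if __name__ == "__main__":
    print(Some(123, "asd"))
    print(repr(Some(123, "asd")))
```

Python looks up __str__ and __repr__ on the class, so it does not matter whether they were written inside the class body or attached later.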


The identifier and name properties are even simpler, since the class already has set and get methods for them:
    .add_property( "some_id", &Some::ID, static_cast< void (Some::*)(int) >( &Some::ResetID ) )
    .add_property( "name",
        make_function( static_cast< string const& (Some::*)() const >( &Some::Name ),
                       return_value_policy<copy_const_reference>() ),
        &Some::ChangeName )


Two subtle points came up while describing the properties, however:
1. The setter of the some_id property needs an explicit cast to the type of the method that accepts an int, since ResetID has another overload.
2. For the getter of the name property we used boost::python::make_function, which lets us attach a return_value_policy to a method that returns a const reference to string.
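
For readers more at home in Python, here is roughly what the two add_property calls correspond to. This is a pure-Python model written for this example, not something Boost.Python generates; the default argument of ResetID stands in for the two C++ overloads.

```python
class Some:
    """Pure-Python model of the wrapped class; illustration only."""

    NOT_AN_IDENTIFIER = -1

    def __init__(self, some_id=NOT_AN_IDENTIFIER, name=""):
        self._id = some_id
        self._name = name

    def ID(self):
        return self._id

    def ResetID(self, some_id=NOT_AN_IDENTIFIER):
        # One Python method with a default models both C++ overloads.
        self._id = some_id

    def Name(self):
        return self._name

    def ChangeName(self, name):
        self._name = name

    # property(getter, setter) mirrors add_property(name, getter, setter).
    some_id = property(ID, ResetID)
    name = property(Name, ChangeName)


if __name__ == "__main__":
    s = Some()
    s.some_id = 123  # calls ResetID(123)
    s.name = "asd"   # calls ChangeName("asd")
    print(s.some_id, s.name)
```

The setter always receives exactly one value, which is why the wrapper has to pick the ResetID overload that takes an int.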

We run print(Some(123, 'asd')) and plain Some(123, 'asd') at the prompt after from example import * and see something suspiciously similar to the built-in Python dict: { ID: 123, Name: 'asd' }
Why not add a property that converts an instance of Some to a standard dict and back?
Let's write another pair of Pythonic functions and add the as_dict property:
    ...
    dict Some_ToDict( Some const& );
    void Some_FromDict( Some&, dict const& );
    ...
    BOOST_PYTHON_MODULE( example )
    {
        class_<Some>( "Some" )
        ...
            .add_property( "as_dict", Some_ToDict, Some_FromDict )
        ;
        ...
    }
    ...
    dict Some_ToDict( Some const& some )
    {
        dict res;
        res["ID"] = some.ID();
        res["Name"] = some.Name();
        return res;
    }

    void Some_FromDict( Some& some, dict const& src )
    {
        src.has_key( "ID" ) ? some.ResetID( extract<int>( src["ID"] ) ) : some.ResetID();
        some.ChangeName( src.has_key( "Name" ) ? extract<string>( src["Name"] ) : string() );
    }

Here we used the class boost::python::dict to access the standard Python dict at the C++ level.
There are similar classes for str, list, and tuple, named accordingly: boost::python::str, boost::python::list, and boost::python::tuple. In C++ these classes behave like their Python counterparts as far as operators go, except that most operations return a boost::python::object, from which you still have to extract the value with boost::python::extract.
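
The fallback logic in Some_FromDict is easy to miss in the C++ ternaries, so here it is modelled in plain Python. The function names and the tuple return value are inventions for this sketch; only the keys "ID" and "Name" and the default values come from the C++ code above.

```python
NOT_AN_IDENTIFIER = -1


def some_to_dict(some_id, name):
    """Model of Some_ToDict: pack both fields into a dict."""
    return {"ID": some_id, "Name": name}


def some_from_dict(src):
    """Model of Some_FromDict: a missing key falls back to the default value."""
    some_id = src.get("ID", NOT_AN_IDENTIFIER)
    name = src.get("Name", "")
    return some_id, name


if __name__ == "__main__":
    print(some_from_dict(some_to_dict(123, "asd")))
    print(some_from_dict({"Name": "qwe"}))
```

Note the consequence: assigning a dict without an "ID" key resets the identifier instead of leaving it unchanged.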

Wrapping up part one


Part one covered a completely ordinary class with a default constructor and an implicit copy constructor. Despite some subtleties with strings and overloaded methods, the class is quite standard.
Working with Boost.Python is fairly simple: wrapping a function usually comes down to a single line that resembles the corresponding method declaration in Python.
In the next part, we will learn how to wrap classes that are less trivial to construct, create a class from a struct, wrap an enum, and get to know another important policy, return_value_policy<reference_existing_object>, in practice.
In the third part, we will look at type converters that map directly to standard Python types without wrappers, using a byte array as the example. We will also learn to propagate exceptions of a given type from C++ to Python and back.
The topic is quite extensive.

Link to the project


The project for part one, for Windows, is posted here.
The MSVS v11 project is configured to build against Python 3.3 x64. The compiled Boost.Python .dll files of the matching version are included.
But nothing prevents you from building the files some.h, some.cpp, and wrap.cpp with any other build tool against any other version of Python.

Useful links


Boost.Python documentation
Return value policies for references in Boost.Python
Getting Started with Boost on Windows
Getting Started with Boost on *nix
Subtleties of building Boost.Python

Source: https://habr.com/ru/post/168083/

