
Using Python in a multithreaded C++ application and true multithreading in Python

Every more or less knowledgeable Python developer knows about that terrible thing called the GIL: a lock that blocks the entire process for as long as Python is running in any one of its threads. Its idea of thread safety borders on sadism: a single implicit lock in a multithreaded application is deadly, and everything that was built on parallel execution dies in agony, stumbling over the GIL again and again.
It is because of this sad fact that, to this day, C++ programmers mostly embed Python only in single-threaded applications, while Python programmers keep trying to convince everyone that they are doing just fine.
It might seem that a thread spawned from C++ knows nothing about any GIL, so you could call into Python without locking and rejoice. The developer's joy, however, ends with the second thread that reaches into the interpreter's global state without taking the lock.
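The usual cure in a single-interpreter setup is to take the one shared GIL around every call into Python. For reference, here is a minimal sketch of that conventional pattern (not the method this article is about), assuming the main thread has already initialized the interpreter and released the GIL as shown further below; it is safe, but only one thread can execute Python bytecode at any given moment:

#include <Python.h>

void CallPythonFromWorker() // illustrative helper, called from any C++ worker thread
{
    PyGILState_STATE gil = PyGILState_Ensure();          // wait until the shared GIL is ours
    PyRun_SimpleString( "import time; time.sleep(1)" );  // any Python work goes here
    PyGILState_Release( gil );                           // hand the GIL over to the next thread
}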
However, there is a path leading to a brighter future!
This path existed originally in a language like Perl; it is also supported by the C API of the Python language, and I honestly do not understand why such a mechanism has never made it into one of the standard Python modules! The method essentially boils down to using a separate Python sub-interpreter in each thread, with its own GIL state for each one (!!!), without any shamanism or magic, simply by calling a few functions from the standard Python C API!

Honest multithreading in Python


Everything below is based on the new GIL introduced in Python 3.2 and was debugged and run on Python 3.3. For earlier versions, such as Python 2.7, the same API is proposed; the exact behavior of the GIL matters less than the fact that it is taken from within the spawned interpreter.
So let's start. We need a main thread in which we simply initialize everything and launch a number of worker threads that call various Python functions, both pure Python ones and ones written in C++. We will do everything from C++, using the boost::python and boost::thread libraries. If you do not have Boost, or you use plain C instead of C++, that is not a problem: most of the work goes through the C API anyway, and Boost is used only for clarity and ease of development; all of it can be done in pure C using the Python API and the OS threading API.
To start working with Python, you need to initialize the interpreter, enable the GIL mechanism and its multithreading support, and save the state of the main thread:
Py_Initialize();                        // initialize the Python interpreter
PyEval_InitThreads();                   // enable multithreading support and the GIL
mGilState = PyGILState_Ensure();        // acquire the GIL for the main thread
mThreadState = PyEval_SaveThread();     // save the main thread state and release the GIL
// the GIL is now free, so worker threads may call into Python

Of course, initialization also means releasing resources; it is most convenient to create a class with a constructor and destructor, where the destructor restores the state of the thread, releases the GIL, and terminates the interpreter (including the work of the sub-interpreters):
PyEval_RestoreThread( mThreadState );   // restore the main thread state and reacquire the GIL
PyGILState_Release( mGilState );        // release the GIL state
Py_Finalize();                          // shut down the interpreter, finishing all Python work

So far everything is obvious to anyone who has ever worked with the GIL through Python's C API. All the main thread needs to do is act as a dispatcher: it does not hold the GIL and does not prevent the other threads from doing their work. This is what the class looks like:
class PyMainThread // manages interpreter and GIL state for the main thread
{
public:
    PyMainThread() // initialize the interpreter and release the GIL
    {
        Py_Initialize();                        // initialize the Python interpreter
        PyEval_InitThreads();                   // enable multithreading support and the GIL
        mGilState = PyGILState_Ensure();        // acquire the GIL for the main thread
        mThreadState = PyEval_SaveThread();     // save the main thread state and release the GIL
        // the GIL is now free, so worker threads may call into Python
    }
    ~PyMainThread() // restore the thread state and shut down the interpreter
    {
        PyEval_RestoreThread( mThreadState );   // restore the main thread state and reacquire the GIL
        PyGILState_Release( mGilState );        // release the GIL state
        Py_Finalize();                          // shut down the interpreter, finishing all Python work
    }
private:
    PyGILState_STATE mGilState;                 // GIL state of the main thread
    PyThreadState*   mThreadState;              // saved state of the main thread
};

The work in main() (or its analog) then boils down to the following scheme:
PyMainThread main_thread;                   // initialize Python and release the GIL

boost::thread_group group;                  // worker threads, each using Python under its own GIL state
for( int id = 1; id <= THREAD_NUM; ++id )
    group.create_thread( ThreadWork(id) );
group.join_all();
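The constants THREAD_NUM, REPEAT_TIMES and PAUSE_SEC used here and in ThreadWork further below are not part of the listings; assume definitions along these lines (the values are purely illustrative):

const int    THREAD_NUM   = 4;    // number of worker threads to launch
const int    REPEAT_TIMES = 3;    // iterations of the sleep calls in each thread
const double PAUSE_SEC    = 1.0;  // pause length passed to time.sleep and ctimer.wait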

Everything so far is primitive, and the audience is eager for magic... ah yes, I promised there would not be any.

Work in every thread


If we now simply try to call time.sleep(1) in each worker thread, we will crash on the second thread.
We are saved by the magic function Py_NewInterpreter() (!!!). Everything about it would be fine, except that calling it requires holding the GIL (!). That would be scary if not for the fact that the GIL comes and goes, while the sub-interpreter remains. Inside it you can take the GIL as much as you like: the sub-interpreter has exactly one thread of its own, the one in which it was created:
mMainGilState = PyGILState_Ensure();        // acquire the GIL of the main interpreter
mOldThreadState = PyThreadState_Get();      // remember the current thread state
mNewThreadState = Py_NewInterpreter();      // create a new sub-interpreter
PyThreadState_Swap( mNewThreadState );      // switch this thread to the sub-interpreter's state
mSubThreadState = PyEval_SaveThread();      // save the sub-interpreter state and release the GIL
mSubGilState = PyGILState_Ensure();         // acquire the GIL within the sub-interpreter

This, too, is best done in the constructor of a dedicated class, with the destructor containing the corresponding cleanup:
PyGILState_Release( mSubGilState );         // release the sub-interpreter's GIL
PyEval_RestoreThread( mSubThreadState );    // restore the sub-interpreter thread state
Py_EndInterpreter( mNewThreadState );       // shut down the sub-interpreter
PyThreadState_Swap( mOldThreadState );      // switch back to the original thread state
PyGILState_Release( mMainGilState );        // release the main interpreter's GIL

The code for the entire class is shown below:
class PySubThread // manages a Python sub-interpreter for a worker thread
{
public:
    PySubThread() // create a sub-interpreter bound to the current thread
    {
        mMainGilState = PyGILState_Ensure();        // acquire the GIL of the main interpreter
        mOldThreadState = PyThreadState_Get();      // remember the current thread state
        mNewThreadState = Py_NewInterpreter();      // create a new sub-interpreter
        PyThreadState_Swap( mNewThreadState );      // switch this thread to the sub-interpreter's state
        mSubThreadState = PyEval_SaveThread();      // save the sub-interpreter state and release the GIL
        mSubGilState = PyGILState_Ensure();         // acquire the GIL within the sub-interpreter
    }
    ~PySubThread() // shut down the sub-interpreter and restore the previous state
    {
        PyGILState_Release( mSubGilState );         // release the sub-interpreter's GIL
        PyEval_RestoreThread( mSubThreadState );    // restore the sub-interpreter thread state
        Py_EndInterpreter( mNewThreadState );       // shut down the sub-interpreter
        PyThreadState_Swap( mOldThreadState );      // switch back to the original thread state
        PyGILState_Release( mMainGilState );        // release the main interpreter's GIL
    }
private:
    PyGILState_STATE mMainGilState;                 // GIL state of the main Python interpreter
    PyThreadState*   mOldThreadState;               // thread state before the sub-interpreter was created
    PyThreadState*   mNewThreadState;               // thread state of the new sub-interpreter
    PyThreadState*   mSubThreadState;               // saved sub-interpreter state after releasing the GIL
    PyGILState_STATE mSubGilState;                  // GIL state within the sub-interpreter
};

As you can see, the initialization work in each thread is no longer as trivial and primitive as in the main thread. But in return we get complete PROFIT for every thread. True, each individual sub-interpreter has to import its modules again, but we get almost complete isolation of Python data per thread!
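As a quick illustration of that isolation (a hypothetical sketch assuming the PySubThread class above, not code from the project): every sub-interpreter gets its own __main__ module and its own imported modules, so a global defined in one worker thread never shows up in another:

#include <Python.h>

void IsolationDemo() // runs in any worker thread
{
    PySubThread sub_thread;     // fresh sub-interpreter, as in ThreadWork below

    // 'marker' cannot exist yet in this interpreter, no matter what the other
    // threads have already done in theirs
    PyRun_SimpleString(
        "assert 'marker' not in dir()\n"
        "marker = 42\n" );
}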

Test the result


So let's check how right we are. For the full experience we will write our own Python module in C++ and give it an analogue of time.sleep:
#include <boost/python.hpp>
#include <boost/thread.hpp>

using namespace boost::python;
using namespace boost::this_thread;
using namespace boost::posix_time;

void wait( double sec ) // C++ analogue of Python's time.sleep
{
    int msec = static_cast<int>( sec * 1000 ); // convert seconds to milliseconds
    sleep( millisec( msec ) );                 // suspend the current thread
}

BOOST_PYTHON_MODULE( ctimer ) // boost::python declaration of the ctimer module
{
    def( "wait", wait, args("sec") ); // exposed to Python as ctimer.wait(sec)
}

We build the DLL and rename it to ctimer.pyd, since we are on Windows, and put the resulting ctimer module next to the main application's executable. We will use ctimer.wait alongside the standard time.sleep.
We need a functor class to do the work in each separate thread:
class ThreadWork // functor passed to boost::thread
{
public:
    ThreadWork( int id ) // remember the thread id for logging
        : mID( id )
    {
    }
    void operator () ( void ) // body executed in the worker thread
    {
        cout << "Thread#" << mID << " <= START" << endl;
        PySubThread sub_thread; // create a Python sub-interpreter for this thread
        for( int rep = 1; rep <= REPEAT_TIMES; ++rep )
        {
            // call the pure Python sleep
            cout << "Thread#" << mID << " <= Repeat#" << rep
                 << " <= import time; time.sleep(pause)" << endl;
            object time = import( "time" );        // import time
            time.attr( "sleep" )( PAUSE_SEC );     // time.sleep(pause)
            // call our C++ implementation
            cout << "Thread#" << mID << " <= Repeat#" << rep
                 << " <= import ctimer; ctimer.wait(pause)" << endl;
            object ctimer = import( "ctimer" );    // import ctimer
            ctimer.attr( "wait" )( PAUSE_SEC );    // ctimer.wait(pause)
        }
        cout << "Thread#" << mID << " <= END" << endl;
        // the Python sub-interpreter is shut down when sub_thread goes out of scope
    }
private:
    int mID; // thread identifier used in the log output
};

Run the application and enjoy! The threads execute Python modules in parallel; each thread has its own little sub-Python that has taken its own GIL state and runs freely without interfering with the others.
Cheers, comrades!
An MSVS 2012 project with the source code (all of two .cpp files) plus compiled DLL and EXE files for Python 3.3 x64 can be downloaded here (290 KB).

Useful links


Working with the Python Interpreter
API for working with threads, interpreter and GIL
Boost.Python Documentation
Boost.Thread Documentation

Source: https://habr.com/ru/post/167261/

