CXXI: A bridge between the worlds of C# and C++

The Mono runtime has plenty of tools for interacting with code written in non-.NET languages, but it has never had anything sane for interacting with C++ code.

That is about to change thanks to the work of Alex Corrado, Andreia Gaita and Zoltan Varga.

In short, the new technology allows C#/.NET developers to:

Easily and transparently use C++ classes from C# or any other .NET language
Create instances of C++ classes from C#
Call C++ class methods from C# code
Call C++ inline methods from C# code (provided that the library is compiled with the -fkeep-inline-functions flag, or that you compile an additional library containing their implementations)
Inherit from C++ classes in C#
Override virtual methods of C++ classes with methods written in C#
Use instances of such mixed C++/C# classes in both C# code and C++ code

CXXI (translator's note: pronounced "sexy") is the result of two months of work under Google's Summer of Code aimed at improving Mono's interoperability with C++ code.

Alternatives

As a reminder, Mono provides several mechanisms for interacting with code written in non-.NET languages, most of them inherited from the ECMA standard. These mechanisms include:

Two-way Platform Invoke (P/Invoke), which allows managed code (C#) to call functions in unmanaged libraries, and allows those libraries to call back into managed code.
COM Interop, which allows code running in Mono to transparently call unmanaged C or C++ code as long as that code follows a few COM conventions (the conventions are quite simple: a standard vtable layout, implementations of the AddRef, Release and QueryInterface methods, and use of the standard set of types that can be marshalled between Mono and the COM library).
A general call interception mechanism, which lets you intercept a method call on an object and then decide for yourself what to do with it.

But when it comes to using C++ objects from C#, the choices are not very encouraging. For example, suppose you want to use the following C++ class from C#:

    class MessageLogger {
    public:
        MessageLogger (const char *domain);
        void LogMessage (const char *msg);
    };

One way to expose this class to C# code is to wrap it in a COM object. This may work for some high-level objects, but the wrapping process is very tedious and repetitive. You can see what this dull work looks like here.

Another option is to knock together C-callable adapter functions, which can then be invoked via P/Invoke. For the class above, they will look something like this:

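    /* bridge.cpp, compiled into bridge.so */
    /* extern "C" keeps the names undecorated so that P/Invoke can find them */
    extern "C" MessageLogger *Construct_MessageLogger (const char *msg)
    {
        return new MessageLogger (msg);
    }

    extern "C" void LogMessage (MessageLogger *logger, const char *msg)
    {
        logger->LogMessage (msg);
    }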

The C# side looks like this:

    using System;
    using System.Runtime.InteropServices;

    class MessageLogger {
        // Opaque pointer to the native MessageLogger instance
        IntPtr handle;

        [DllImport ("bridge")]
        extern static IntPtr Construct_MessageLogger (string msg);

        public MessageLogger (string msg)
        {
            handle = Construct_MessageLogger (msg);
        }

        [DllImport ("bridge")]
        extern static void LogMessage (IntPtr handle, string msg);

        public void LogMessage (string msg)
        {
            LogMessage (handle, msg);
        }
    }

Spend half an hour writing wrappers like these and you will want to kill the author of the library, the compiler, the creators of C++ and C#, and then utterly destroy this mortal and imperfect world.

Our PhyreEngine# was a set of .NET bindings for the C++ API of Sony's PhyreEngine. Writing that code was quite tedious, so we hacked together a quick-and-dirty code generator.

In addition, the approaches described above do not let you override methods of C++ classes from C#. More precisely, you can do it, but it requires writing a large amount of code by hand, handling many special cases and many callbacks. The bindings very quickly become almost unmaintainable (we ran into this ourselves while writing the PhyreEngine bindings).
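To give a feel for the manual work involved, here is a minimal C++ sketch of the callback approach for a single virtual method. It is an illustration under assumptions, not code from CXXI or PhyreEngine#: LogMessage is assumed to be virtual (it is not in the declaration above), and Report, Last, LogCallback, CallbackLogger, RecordingOverride and the extern "C" entry points are all hypothetical names. A real binding would have to repeat this for every overridable method.

```cpp
#include <cassert>
#include <string>

// Assumption: a variant of MessageLogger whose LogMessage is virtual.
class MessageLogger {
public:
    explicit MessageLogger(const char *domain) : domain_(domain) {}
    virtual ~MessageLogger() {}
    virtual void LogMessage(const char *msg) { last_ = domain_ + ": " + msg; }
    // Non-virtual entry point that dispatches to the virtual LogMessage.
    void Report(const char *msg) { LogMessage(msg); }
    const std::string &Last() const { return last_; }
private:
    std::string domain_;
    std::string last_;
};

// Function pointer type the managed side would supply (hypothetical).
typedef void (*LogCallback)(void *user_data, const char *msg);

// Hand-written subclass that forwards the virtual call to the callback.
class CallbackLogger : public MessageLogger {
public:
    CallbackLogger(const char *domain, LogCallback cb, void *user_data)
        : MessageLogger(domain), cb_(cb), user_data_(user_data) {}
    void LogMessage(const char *msg) override {
        if (cb_) cb_(user_data_, msg);        // "overridden" by foreign code
        else MessageLogger::LogMessage(msg);  // fall back to the C++ base
    }
private:
    LogCallback cb_;
    void *user_data_;
};

// Flat C entry points that P/Invoke could bind to.
extern "C" MessageLogger *CallbackLogger_Create(const char *domain,
                                                LogCallback cb,
                                                void *user_data) {
    assert(domain != nullptr);
    return new CallbackLogger(domain, cb, user_data);
}

extern "C" void MessageLogger_Report(MessageLogger *logger, const char *msg) {
    logger->Report(msg);
}

extern "C" void MessageLogger_Destroy(MessageLogger *logger) {
    delete logger;
}

// Stand-in for a managed override: records the message it receives.
static void RecordingOverride(void *user_data, const char *msg) {
    *static_cast<std::string *>(user_data) = msg;
}
```

Even in this tiny case there is a subclass, a callback type, a user-data pointer and three exported functions for one method, which is the kind of bookkeeping that grows out of control in a real binding.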

These ordeals are what prompted the creation of CXXI.

How it works

Accessing C++ classes is a complex problem. I will briefly describe the C++ implementation details that matter most for CXXI:

Object layout: the binary representation of an object in memory, which may differ from platform to platform.
VTable layout: the list of pointers to virtual method implementations that the compiler uses to find a method's address; it depends on the virtual methods of the class and of its parents.
Decorated names: non-virtual methods do not go into the vtable. Instead, the compiler turns them into ordinary C functions whose names are computed from the method's signature (the argument types and, with some compilers, the return type). The decoration scheme depends on the compiler.

For example, take the following classes:

    class Widget {
    public:
        void SetVisible (bool visible);
        virtual void Layout ();
        virtual void Draw ();
    };

    class Label : public Widget {
    public:
        void SetText (const char *text);
        const char *GetText ();
    };

For these methods the C++ compiler will generate the following names (translator's note: this applies to compilers such as GCC and the Intel C++ Compiler for Linux; Visual Studio will produce something unreadable like ?h@@YAXH@Z; with GCC you can decode the names with the c++filt utility):

__ZN6Widget10SetVisibleEb
__ZN6Widget6LayoutEv
__ZN6Widget4DrawEv
__ZN5Label7SetTextEPKc
__ZN5Label7GetTextEv
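As a small aside, the decoding that c++filt performs can also be done from code. The sketch below assumes a GCC or Clang toolchain, since it relies on abi::__cxa_demangle from the compiler-specific <cxxabi.h> header; Demangle is a hypothetical helper name. Note that the names listed above carry one extra leading underscore, which some platforms (macOS, for example) add to every C symbol, so the sketch works with the single-underscore form.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>   // GCC/Clang-specific: abi::__cxa_demangle
#include <string>

// Decode an Itanium-ABI decorated name; returns "" if it cannot be decoded.
std::string Demangle(const char *mangled) {
    assert(mangled != nullptr);
    int status = 0;
    // Passing null buffer arguments asks the runtime to malloc the result.
    char *text = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || text == nullptr) return std::string();
    std::string result(text);
    std::free(text);   // the caller owns the malloc-allocated buffer
    return result;
}
```

For instance, Demangle("_ZN6Widget10SetVisibleEb") yields "Widget::SetVisible(bool)": the name encodes the class, the method name with its length, and the parameter type (b for bool).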

Take this code:

    Label *l = new Label ();
    l->SetText ("foo");
    l->Draw ();

It will be compiled into something like this (shown as C code):

    Label *l = (Label *) malloc (sizeof (Label));
    _ZN5LabelC1Ev (l);                  // decorated name of the Label constructor
    _ZN5Label7SetTextEPKc (l, "foo");

    // this one calls Draw
    (l->vtable [METHOD_PTR_SIZE*2])();
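The same idea can be written out as a compilable sketch. This is a hand-rolled emulation, not what any particular compiler emits: LabelC, VirtualMethod, kLabelVTable, the slot constants and the counters are hypothetical, and placing Layout in slot 0 and Draw in slot 1 simply assumes declaration order.

```cpp
#include <cassert>
#include <cstdlib>

struct LabelC;   // hypothetical C-style stand-in for the Label class

// One slot per virtual method, in declaration order: Layout, then Draw.
typedef void (*VirtualMethod)(LabelC *self);

struct LabelC {
    const VirtualMethod *vtable;  // hidden pointer a C++ compiler would add
    const char *text;
    int layout_count;             // only here so the effect can be observed
    int draw_count;
};

static void Label_Layout(LabelC *self) { self->layout_count++; }
static void Label_Draw(LabelC *self)   { self->draw_count++; }

static const VirtualMethod kLabelVTable[] = { Label_Layout, Label_Draw };
enum { SLOT_LAYOUT = 0, SLOT_DRAW = 1 };

// Plays the role of the Label constructor in the listing above.
void Label_Construct(LabelC *self) {
    self->vtable = kLabelVTable;
    self->text = "";
    self->layout_count = 0;
    self->draw_count = 0;
}

// Non-virtual method: called directly by name, with no table lookup.
void Label_SetText(LabelC *self, const char *text) { self->text = text; }

// Mirrors: Label *l = new Label (); l->SetText ("foo"); l->Draw ();
LabelC *MakeAndDraw() {
    LabelC *l = static_cast<LabelC *>(std::malloc(sizeof(LabelC)));
    assert(l != nullptr);
    Label_Construct(l);
    Label_SetText(l, "foo");
    l->vtable[SLOT_DRAW](l);      // virtual call: indirect through the table
    return l;
}
```

The caller releases the object with std::free. The contrast between the direct call to Label_SetText and the indexed call through vtable is exactly the information a binder has to reproduce: a symbol name for the former, a slot number for the latter.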

To support this, CXXI needs to know the exact position of each method in the vtable, where and how each method is implemented, and how to reach it by its decorated name.

The diagram below shows how a C++ library becomes available to C# and other .NET languages.

In effect, your C++ code is compiled twice: the C++ compiler generates the unmanaged library, and the CXXI toolchain generates the bindings.

Strictly speaking, all CXXI needs from your C++ code is the header files, and only those you want to wrap for use from C#. So even if all you have is a proprietary library and its header files, CXXI can still generate bindings for it.

The CXXI toolchain produces a regular .NET library (translator's note: this really is a pure .NET library that contains MSIL and nothing else, with no unmanaged code) that you can safely use from C# and other .NET languages. This library exposes C# classes with the following properties:

When you create an instance of a C# class, its constructor creates an instance of the corresponding C++ class.
These classes can serve as base classes for other C# classes, and every method marked as virtual can be overridden in C# code.
Multiple inheritance of C++ classes is supported: the generated C# class implements a set of conversion operators that let you reach the various C++ base classes.
Overridden methods can use the C# "base" keyword to call the methods of the C++ base class.
You can override any virtual method of a class, including in the case of multiple inheritance.
There is also a constructor that accepts an IntPtr, in case you want to use an instance of the C++ class that someone else has already created.

The CXXI pipeline consists of three components, shown in the diagram on the right.

The GCC-XML compiler is used to parse your C++ code and extract the necessary information from it. The generated XML is then processed by the CXXI tools to produce a set of partial C# classes that contain the actual bridges to the C++ classes.

This is then combined with any additional code you would like to add (for example, several overloaded methods to improve the API, implementation of ToString, Async methods, etc).

The output is a .NET assembly that works with the native library.

Note that this assembly does not itself contain a map of the objects' memory layout. Instead, the CXXI binder works the layout out at run time, based on the ABI in use and the corresponding transformation rules. This means you only need to compile the bindings once and can then use them unchanged on different platforms.
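To see why the layout is not something to bake into the assembly, a small C++ probe can report the numbers for whatever compiler and platform it is built with. This only illustrates that the values are ABI-dependent; it is not how CXXI computes them, and PlainPoint, VirtualPoint, LayoutInfo, ProbeLayout and HasHiddenVTablePointer are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical classes used only to probe the compiler's layout decisions.
struct PlainPoint {
    int x;
    int y;
};

struct VirtualPoint {
    int x;
    int y;
    virtual ~VirtualPoint() {}
    virtual void Draw() {}
};

struct LayoutInfo {
    std::size_t plain_size;
    std::size_t virtual_size;
    std::size_t pointer_size;
};

// Collect the sizes as seen by the compiler and ABI this file is built with.
LayoutInfo ProbeLayout() {
    LayoutInfo info;
    info.plain_size = sizeof(PlainPoint);
    info.virtual_size = sizeof(VirtualPoint);
    info.pointer_size = sizeof(void *);
    return info;
}

// True when the polymorphic class is larger than the plain one by at least
// one pointer, which is what a hidden vtable pointer would account for.
bool HasHiddenVTablePointer(const LayoutInfo &info) {
    assert(info.pointer_size > 0);
    return info.virtual_size >= info.plain_size + info.pointer_size;
}
```

The sizes returned by ProbeLayout typically change between 32-bit and 64-bit builds because the hidden pointer and the padding around it change, which is why a binding that hardcoded them would only be valid for one target.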

Examples

The project's code on GitHub includes various tests and a number of examples. One of them is a minimal binding to Qt.

What is left to implement

Unfortunately, the CXXI project is not finished yet, but it is already a good start toward a tangible improvement in interoperability between .NET and C++ code.

Currently, CXXI does all of its work at run time, generating adapters through System.Reflection.Emit as needed, which lets it determine dynamically which ABI the C++ library's compiler used.

We also plan to add support for static compilation, which will make it possible to use this technology when writing C# for the PS3 and iPhone.

CXXI currently supports the GCC ABI and has initial support for the MSVC ABI. We would welcome help with implementing ABI support for other compilers and with completing MSVC support.

Currently, CXXI can only delete objects that it created itself. All other objects are considered to belong to the unmanaged world. Support for operator delete on such objects would also be useful.

We also want to better document the pipeline and the runtime API, as well as improve the binding itself.

From the translator

This approach compares favorably with writing tons of glue code in C++/CLI: all the work is done for you, and the result is cross-platform. It is also worth noting that an article about a similar way of calling C++ class methods appeared on Habr last year, although much of it was done by hand there. According to its author, though, using wrappers generated on the fly turned out to be one and a half times faster than COM Interop (on Microsoft's runtime).
Oh, and one more thing. It is not mentioned in the article, but judging by the test cases on GitHub, you can also access the fields of C++ objects.
How usable is it? In theory, you can take any C++ library right now and generate bindings for it (on Windows you will need to build it under Cygwin). It will work fine as long as the library has no methods that return newly created object instances, since those cannot be deleted at the moment; in Qt, however, QObject has a deleteLater() slot, so there should be no problems there. In practice, the generator crashed when I tried to generate bindings for Irrlicht, and GCC-XML crashed on OGRE, choking on something from std::tr1. Generally speaking, GCC-XML should be dropped in favor of clang, since GCC-XML is updated very, very rarely and, as it turned out, works poorly. Still, the examples include working bindings for some QtGui classes (incomplete: nobody has yet done the QObject infrastructure with all the meta-information and the signals and slots).

Source: https://habr.com/ru/post/142503/
