
Compiling C/C++ code to WebAssembly

WebAssembly is a new binary format that web applications can be compiled to. It is being designed and implemented right now, as you read these lines, and the developers of all major browsers are moving it forward. Everything changes very quickly! In this article we show the current state of the project, with a fairly deep dive into the tools for working with WebAssembly.

For WebAssembly to work, we need two main components: tools that build code into a binary in the WebAssembly format, and browsers that can load and execute that binary. Neither is complete yet, and both depend heavily on the WebAssembly specification being finished, but in general they are separate components and their development proceeds in parallel. This separation is a good thing: it will let compilers produce WebAssembly applications that work in any browser, and let browsers run WebAssembly programs regardless of which compiler produced them. In other words, we get open competition among development tools and among browsers, which will keep moving everything forward and give the end user an excellent choice. The separation also lets the toolchain and browser teams work in parallel and independently.

A new project on the tooling side of WebAssembly, which I want to talk about today, is called Binaryen. Binaryen is a library, written in C++, for supporting WebAssembly in compilers. If you are not working on a WebAssembly compiler yourself, you probably do not need to know anything about Binaryen directly. If you use a WebAssembly compiler, it probably uses Binaryen under the hood; we will look at examples below.

The core of the Binaryen library parses and generates WebAssembly and represents its code as an abstract syntax tree (AST). Several useful utilities are built on top of these capabilities, including asm2wasm, s2wasm and wasm.js, which are discussed below.



For more about Binaryen, see these slides.

Once again, remember that WebAssembly is still being actively designed, which means that Binaryen's input and output formats (.wast, .s) are not final. Binaryen is updated continuously as the WebAssembly specification changes. The changes are becoming less radical over time, but of course no one can guarantee compatibility.

Let's take a look at a few areas where Binaryen might be useful.

Compiling to WebAssembly using Emscripten

Emscripten can compile C and C++ code to asm.js, and Binaryen (via its asm2wasm utility) can compile asm.js to WebAssembly, so together Emscripten and Binaryen give us a complete toolchain for compiling C and C++ code to WebAssembly. You can run asm2wasm on asm.js code yourself, but it is easier to let Emscripten do it for you, like this:

emcc file.cpp -o file.js -s 'BINARYEN="path-to-binaryen"'


Emscripten will compile file.cpp and produce a JavaScript file plus a separate .wast file containing the WebAssembly. Under the hood, Emscripten compiles the code to asm.js, runs asm2wasm on it and saves the result. This is described in more detail on the Emscripten wiki.

But wait, what is the point of compiling something to WebAssembly if browsers do not support it yet? Great question! :) It is true that we cannot yet run this code in any browser, but we can already test things with it. Suppose we want to check that the Emscripten + Binaryen combination has produced a correct binary. How can we do that? We can use wasm.js, which Emscripten embeds in the .js file produced by the emcc command above. wasm.js contains a port of Binaryen to JavaScript, including its interpreter. If you run file.js (in node.js or in a browser), you get the result of executing the WebAssembly. This lets us confirm that the compiled WebAssembly binary actually works correctly. You can look at an example of such a program, and a couple more examples are in the repository for testing purposes.

Of course, we are not yet on solid ground with all these tools. The test environment is odd: C++ code is compiled to WebAssembly and then executed in a WebAssembly interpreter that is itself written in C++ but ported to JavaScript. For now there is no other way to run it all. Still, we have several reasons to trust the results.



All this shows that we already have a working result: we can compile C and C++ code to WebAssembly and even run it, after a fashion.

Note that WebAssembly is just one more new feature, and apart from it everything else in Emscripten still works: Emscripten lets you use libc and syscalls, OpenGL / WebGL code, integration with browsers, integration with node.js and so on. As a result, projects that already use Emscripten will be able to switch to WebAssembly simply by adding a new command-line parameter. That will let C++ projects be compiled to WebAssembly and run in browsers with almost no effort.

Using the new experimental LLVM backend for WebAssembly with Emscripten

We have just seen an important new stage in the development of Emscripten, which gave it the ability to create WebAssembly modules and even test that they work. But the work does not stop there: so far this only uses the current asm.js compiler together with the asm2wasm utility. A new LLVM backend for WebAssembly is being actively written (it is not finished yet), right in the main development branch of LLVM. Although it is not yet ready for real use, over time it will become a very important tool. Binaryen supports its output format.

The LLVM backend for WebAssembly, like most LLVM backends, emits assembly code, in this case in a special .s format. This format is close to WebAssembly but not identical to it: it is more like typical C compiler output (a linear list of instructions, one instruction per line) than WebAssembly's abstract syntax tree. A .s file can be converted to WebAssembly in a fairly trivial way (Binaryen includes the s2wasm utility, which does exactly that; see how simple it is). You can run s2wasm by itself, or let Emscripten do it through the new WASM_BACKEND option, which you can use like this:

 emcc file.cpp -o file.js -s 'BINARYEN="path-to-binaryen"' -s WASM_BACKEND=1


Note that you also need to pass the BINARYEN option, since s2wasm is part of Binaryen. When both options are specified, Emscripten uses the new WebAssembly backend instead of the asm.js compiler. After invoking the backend and getting a .s file from it, Emscripten calls s2wasm to convert that file to WebAssembly. Some examples of programs that can already be built this way can be found in the Emscripten project.

Thus, we have two ways to build a WebAssembly module using Binaryen:
  1. Emscripten + asm.js backend + asm2wasm, which already works today and should be a relatively simple and acceptable option.
  2. Emscripten + new WebAssembly backend + s2wasm, which does not fully work yet but will come to the fore as the WebAssembly backend matures.


The goal at the moment is to make the transition from the first method to the second as painless as possible. Ideally, it should come down to changing one argument on the command line.

Thus, we have a clear plan:

  1. Use Emscripten to generate asm.js code (today)
  2. Move to WebAssembly generation via asm2wasm (already available, but browsers are not yet ready)
  3. Move to WebAssembly generation via the new LLVM backend (as soon as it is ready)


Each step brings new benefits to users (speed!) and causes practically no difficulties for developers.

In conclusion, although this article discusses Binaryen in the context of its use with Emscripten, it is a separate, general-purpose library for WebAssembly. If you have ideas for tools that work with WebAssembly, you can take the Binaryen library and build on it without depending on Emscripten, LLVM or anything else.

Source: https://habr.com/ru/post/273957/

