By now, I think, almost everyone has heard of WebAssembly. If you have not, there is an excellent introductory article about the technology on Habr.

That said, you very often run into comments like “Hooray, now we'll write the frontend in C++!” or “Let's rewrite React in Rust”, and so on and so forth.
An interview with Brendan Eich explains the idea behind WebAssembly and its purpose very well: WASM is not a complete replacement for JS, but a technology for writing resource-critical modules and compiling them into portable bytecode with a linear memory model and static typing. The goal is to improve performance or to simplify porting existing code for web applications that work with multimedia, online games, and other “heavy” things.
If you really want to, you can even implement a GUI this way: for example, the imgui library has been ported to WASM, and there is progress in porting Qt to WASM (one and two).

But the question that comes up most often is a simple one:
“But still, is it possible to work with DOM from WebAssembly?”
So far, the categorical answer is “No, you can't”; the more accurate and correct one is “You can, by going through JavaScript functions”. This article is, in essence, an account of my small investigation into how to do that as conveniently and efficiently as possible.
What is the problem?
Let's look at how page elements are created and how scripts work with them, using the Blink web engine (Chromium) and the V8 JS engine as an example. Almost any DOM element in Blink is a C++ object that inherits from HTMLElement, which inherits from Element, which inherits from ContainerNode, which inherits from Node ... that is not the whole chain, but it does not matter here. For example, when the HTML is parsed and the tree is built, an <img> tag results in the creation of an object of class HTMLImageElement:
class CORE_EXPORT HTMLImageElement final : public HTMLElement, ... {
 public:
  static HTMLImageElement* Create(Document&);
  ...
  unsigned width();
  unsigned height();
  ...
  String AltText() const final;
  ...
  KURL Src() const;
  void SetSrc(const String&);
  void setWidth(unsigned);
  void setHeight(unsigned);
  ...
}
To manage the lifetime of objects and their deletion, older versions of Blink used reference-counting smart pointers; modern versions use a garbage collector called Oilpan.
To make page elements accessible from JavaScript, each object is described in IDL, which specifies the fields and methods of the object that will be available in JavaScript:
[
    ActiveScriptWrappable,
    ConstructorCallWith=Document,
    NamedConstructor=Image(optional unsigned long width, optional unsigned long height)
] interface HTMLImageElement : HTMLElement {
    [CEReactions, Reflect] attribute DOMString alt;
    [CEReactions, Reflect, URL] attribute DOMString src;
    ...
    [CEReactions] attribute unsigned long width;
    [CEReactions] attribute unsigned long height;
    ...
    [CEReactions, Reflect] attribute DOMString name;
    [CEReactions, Reflect, TreatNullAs=EmptyString] attribute DOMString border;
    ...
after which we can work with them from JavaScript code. The V8 JS engine also has its own garbage collector, and Blink's C++ objects are wrapped in special wrappers, which in V8 terminology are called Template Objects. As a result, the lifetime of page objects is tracked by both the Blink and the V8 garbage collectors.
Now let's imagine how WebAssembly modules fit into this picture. At the moment, whatever happens inside WASM is a “dark forest” to the browser. For example, suppose we take an element from the document, pass a pointer to it into the WASM module, store the reference there, and then call removeChild on the element. From Blink's point of view nothing refers to the object any more, so it should be deleted, because the environment does not know that a pointer to the element is still stored inside WASM. What this can lead to is, I think, not hard to guess. And this is just one example.
Working with garbage-collected objects is on the WebAssembly roadmap: a dedicated issue on the topic has been opened on GitHub, and there is a document detailing proposals for implementing all of this.
So, WebAssembly code is completely isolated in its “sandbox”. Today there is no proper way to pass it a pointer to an object in the DOM tree, and likewise no way to call a method on one directly. The only correct way to interact with objects in the DOM tree, or to use any other browser API, is to write JS functions, pass them in the imports field of the WebAssembly module, and call them from WASM code.
helloworld.c:
void showMessage (int num);

int main(int num1) {
    showMessage(num1);
    return num1 + 42;
}
helloworld.js:
var wasmImports = {
    env: {
        showMessage: num => alert(num)
    }
};
In the generated bytecode, everything is simple and transparent: the external function is imported, the argument is put on the stack, the function is called, and the result of the addition is stored on the stack.
(module
  (type $FUNCSIG$vi (func (param i32)))
  (import "env" "showMessage" (func $showMessage (param i32)))
  (table 0 anyfunc)
  (memory $0 1)
  (export "memory" (memory $0))
  (export "main" (func $main))
  (func $main (
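To make the import mechanism concrete without a compiler at hand, here is a minimal sketch that can be run in Node.js or in a browser console. The byte array is my own hand-assembled equivalent of helloworld.c (it is not the output of WasmFiddle), and the import records its argument in an array instead of calling alert so that it also works outside the browser. The names helloBytes, shownMessages and mainResult are made up for this illustration.

```javascript
// Hand-assembled binary for a module equivalent to helloworld.c (an illustration,
// not real compiler output): it imports env.showMessage and exports main.
const helloBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // "\0asm" magic, version 1
  // type section: type 0 = (i32) -> (), type 1 = (i32) -> i32
  0x01, 0x0a, 0x02, 0x60, 0x01, 0x7f, 0x00, 0x60, 0x01, 0x7f, 0x01, 0x7f,
  // import section: "env" "showMessage", a function of type 0
  0x02, 0x13, 0x01, 0x03, 0x65, 0x6e, 0x76,
  0x0b, 0x73, 0x68, 0x6f, 0x77, 0x4d, 0x65, 0x73, 0x73, 0x61, 0x67, 0x65,
  0x00, 0x00,
  // function section: one local function of type 1
  0x03, 0x02, 0x01, 0x01,
  // export section: "main" is function index 1 (index 0 is the import)
  0x07, 0x08, 0x01, 0x04, 0x6d, 0x61, 0x69, 0x6e, 0x00, 0x01,
  // code section: local.get 0; call 0; local.get 0; i32.const 42; i32.add; end
  0x0a, 0x0d, 0x01, 0x0b, 0x00,
  0x20, 0x00, 0x10, 0x00, 0x20, 0x00, 0x41, 0x2a, 0x6a, 0x0b,
]);

// Stand-in for alert(num): remember every value the module shows us.
const shownMessages = [];
const wasmImports = {
  env: {
    showMessage: (num) => { shownMessages.push(num); },
  },
};

// Synchronous compilation is fine for a module this small.
const helloModule = new WebAssembly.Module(helloBytes);
const helloInstance = new WebAssembly.Instance(helloModule, wasmImports);

// Calls the imported JS function with 8, then returns 8 + 42.
const mainResult = helloInstance.exports.main(8);
console.log(shownMessages, mainResult);
```

The only thing that crosses the boundary here is a 32-bit integer, which is exactly why the example is so painless.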
You can check how all of this works in practice with WasmFiddle:
https://wasdk.imtqy.com/WasmFiddle/?l7d05
It would seem that all is well, but as the task gets more complex, problems arise. What if we need to pass not a number but a string out of WASM code (given that WebAssembly supports only 32-bit and 64-bit integers and 32-bit and 64-bit floating-point numbers)? And what if we need to perform many different DOM manipulations and browser API calls, and writing a separate JS function for each case is extremely inconvenient?
This is where Emscripten comes to the rescue.
Emscripten was originally developed as an LLVM backend for compiling to asm.js. In addition to compiling to asm.js and WASM, it also contains “wrappers” that emulate the functionality of various libraries (libc, libcxx, OpenGL, SDL, etc.) on top of the APIs available in the browser, plus its own set of helper functions that make it easier to port applications and to connect WASM and JS code.
The simplest example: as we know, only i32, i64, f32 and f64 values can be arguments and results when calling functions into or out of a WASM module. The module has linear memory that can be viewed from JS as an Int8Array, Int16Array, and so on. So, to get a value of any other type (a string, an array, etc.) from WASM to JS, we can place it in the address space of the WASM module, pass the pointer out, and in JS code pull the necessary bytes from the array and convert them to the required object (for example, strings in JS are stored as UTF-16). To pass data into a WASM module we do the opposite: from the outside, write it into the memory array at a specific address, and only then use that address in C/C++ code. Emscripten has a large set of helper functions for this. Besides getValue() and setValue() (reading and writing values on the WASM application's heap by pointer), there is, for example, Pointer_stringify(), which converts C strings to JavaScript String objects.
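The pointer-passing scheme described above can be sketched in plain JavaScript, without Emscripten. The helpers writeCString and readCString below are hypothetical names of my own; they only illustrate the idea behind helpers like Pointer_stringify() and are not Emscripten's implementation. The memory object is created by hand here, whereas in a real application it would be the memory exported or imported by the module.

```javascript
// A standalone linear memory of one 64 KiB page, standing in for a module's memory.
const linearMemory = new WebAssembly.Memory({ initial: 1 });

// Hypothetical helper: copy a JS string into linear memory as NUL-terminated UTF-8.
// Returns the number of bytes written, including the terminator.
function writeCString(memory, ptr, text) {
  const bytes = new TextEncoder().encode(text); // UTF-16 string -> UTF-8 bytes
  const heap = new Uint8Array(memory.buffer);
  heap.set(bytes, ptr);
  heap[ptr + bytes.length] = 0;
  return bytes.length + 1;
}

// Hypothetical helper: decode the NUL-terminated UTF-8 string that starts at ptr.
function readCString(memory, ptr) {
  const heap = new Uint8Array(memory.buffer);
  let end = ptr;
  while (heap[end] !== 0) end++; // scan for the terminating zero byte
  return new TextDecoder("utf-8").decode(heap.subarray(ptr, end));
}

// "Outside" code writes the string at an address, "inside" code would get only the number.
const stringPtr = 1024;
const bytesWritten = writeCString(linearMemory, stringPtr, "hello world");
const roundTrip = readCString(linearMemory, stringPtr);
console.log(bytesWritten, roundTrip);
```

Note that only the integer stringPtr would actually be passed to or from a WASM function; the string itself never crosses the boundary as a value.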
Another handy feature is the ability to inline JavaScript code directly into C++ code. The compiler does the rest for us.
#include <emscripten.h>

int main() {
    char* s = "hello world";
    EM_ASM({
        alert(Pointer_stringify($0));
    }, s);
    return 0;
}
After compilation, we get a .wasm file with compiled bytecode and a .js file containing the startup code of the wasm module and a huge number of various auxiliary functions.
The JS code inlined with the EM_ASM macro ended up in the .js file in the following form:
var ASM_CONSTS = [function($0) { alert(Pointer_stringify($0)); }];

function _emscripten_asm_const_ii(code, a0) {
    return ASM_CONSTS[code](a0);
}
The bytecode is almost the same as in the previous example, except that the call also pushes onto the stack the number 0 (the index of the function in the ASM_CONSTS array) and a pointer to the string constant (char *) in the address space of the WASM module, which in our case equals 1024. The Pointer_stringify() function in the JavaScript code reads the data from the “heap”, exposed in JS as a Uint8Array, and converts the UTF-8 bytes to a String object.
On closer inspection one thing is puzzling: the string constant (char *) located at address 1024 for some reason contains not only the text “hello world” with a terminating zero byte, but also a copy of the inlined JS code. I cannot explain why this ends up inside the compiled .wasm file, and I would be grateful if someone shared their ideas in the comments.
(import "env" "_emscripten_asm_const_ii" (func $_emscripten_asm_const_ii (param i32 i32) (result i32)))
(data (i32.const 1024) "hello world\00{ alert(Pointer_stringify($0)); }")
(func $_main (
In any case, the obvious conclusion is that calling JavaScript functions from WASM code is not the fastest operation. At the very least, time is spent on the overhead of crossing between WASM code and the JS engine, on type conversion when passing arguments, and more.
Cheerp
While reading articles and studying the documentation of various libraries, I came across the Cheerp compiler, which generates WASM code from C/C++ and, according to a bold claim on the landing page of its official site, provides “no-overhead access to HTML5 DOM”. My inner skeptic, however, said that there is no such thing as magic. To begin with, let's compile the simplest example from the documentation:
#include <cheerp/clientlib.h>
#include <cheerp/client.h>

[[cheerp::genericjs]] void domOutput(const char* str) {
    client::console.log(str);
}

void webMain() {
    domOutput("Hello World");
}
The output is a .wasm file, and a quick look at it shows that there is no data section with the string constant.
(module
  (type $vt_v (func ))
  (func (import "imports" "__Z9domOutputPKc"))
  (table anyfunc (elem $__wasm_nullptr))
  (memory (export "memory") 16 16)
  (global (mut i32) (i32.const 1048576))
  (func $__wasm_nullptr (export "___wasm_nullptr")
    (local i32)
    unreachable
  )
  (func $_Z7webMainv (export "__Z7webMainv")
    (local i32)
    call 0
  )
)
Let's look at the JS:
function f() {
    var a = null;
    a = h();
    console.log(a);
    return;
}
function h() {
    var a = null, d = null;
    a = String();
    d = String.fromCharCode(72); a = a.concat(d);
    d = String.fromCharCode(101); a = a.concat(d);
    d = String.fromCharCode(108); a = a.concat(d);
    d = String.fromCharCode(108); a = a.concat(d);
    d = String.fromCharCode(111); a = a.concat(d);
    d = String.fromCharCode(32); a = a.concat(d);
    d = String.fromCharCode(87); a = a.concat(d);
    d = String.fromCharCode(111); a = a.concat(d);
    d = String.fromCharCode(114); a = a.concat(d);
    d = String.fromCharCode(108); a = a.concat(d);
    d = String.fromCharCode(100); a = a.concat(d);
    return String(a);
}
function _asm_f() {
    f();
}
function __dummy() {
    throw new Error('this should be unreachable');
};
var importObject = {
    imports: {
        __Z9domOutputPKc: _asm_f,
    }
};
instance.exports.i();

Honestly, it is not at all clear why a WASM module needs to be generated in this case, since nothing happens in it except a single call to an external function, and all the logic lives in the JS code. Anyway.
An interesting feature of the compiler is its C++ bindings for all the standard DOM objects (judging by the documentation, produced by auto-generating code from IDL), which let you write C++ code that manipulates the necessary objects “directly”:
#include <cheerp/client.h>
#include <cheerp/clientlib.h>

using namespace client;

int test = 1;

[[cheerp::genericjs]] void domOutput(int a) {
    const char* str1 = "Hello world!";
    const char* str2 = "LOL";
    Element* titleElement=document.getElementById("pagetitle");
    titleElement->set_textContent(a / 2 == 0 ? str1 : str2);
}

void webMain() {
    test++;
    domOutput(test);
    test++;
    domOutput(test);
}
Let's see what we got after compilation ...
function j(b) {
    var a = null, c = null;
    a = "pagetitle";
    a = document.getElementById(a);
    a = a;
    a.textContent;
    if (b + 1 >>> 0 < 3) {
        c = "Hello world!";
        a.textContent = c;
        return;
    } else {
        c = "LOL";
        a.textContent = c;
        return;
    }
}
function e(f, g) {
    var b = 0, c = 0, a = null, t = null;
    a = String();
    b = f[g] | 0;
    if ((b & 255) === 0) {
        return String(a);
    } else {
        c = 0;
    }
    while (1) {
        t = String.fromCharCode(b << 24 >> 24);
        a = a.concat(t);
        c = c + 1 | 0;
        b = f[g + c | 0] | 0;
        if ((b & 255) === 0) {
            break;
        }
    }
    return String(a);
}
var s = new Uint8Array([112, 97, 103, 101, 116, 105, 116, 108, 101, 0]);
var r = new Uint8Array([76, 79, 76, 0]);
var q = new Uint8Array([72, 101, 108, 108, 111, 32, 119, 111, 114, 108, 100, 33, 0]);
function _asm_j(b) {
    j(b);
}
function __dummy() {
    throw new Error('this should be unreachable');
};
var importObject = {
    imports: {
        __Z9domOutputi: _asm_j,
    }
};
...
instance.exports.p();

I think I was wrong about the magic. The string constants ended up in the JavaScript code as Uint8 arrays, and when the script runs they are converted to String character by character through a sequence of String.concat() calls, even though the same strings appear slightly higher up as plain literals in the JavaScript code. The increment of test is done in WASM code. In the function that sets the text content of the DOM element you can find the wonderful “a = a” and a call to the textContent getter whose result is never used. And the check that was meant to test parity came out of the optimizer as the mind-bending “b + 1 >>> 0 < 3” (yes, exactly, with a bit shift by 0 positions, which in JavaScript reinterprets the value as an unsigned 32-bit integer). In fairness to the optimizer, the C++ source writes a / 2 == 0, which is integer division rather than a parity check with %, so the condition holds only for a equal to -1, 0 or 1.
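The odd-looking condition is easier to judge after a quick check. The sketch below is my own illustration, not part of Cheerp's output: it compares C-style truncating integer division, as written in the source (a / 2 == 0), with the expression emitted by the compiler over a range of integers. The function names cDivBy2IsZero and compiledCheck are made up for the example.

```javascript
// C-style signed integer division truncates toward zero,
// so "a / 2 == 0" in the C++ source corresponds to this.
const cDivBy2IsZero = (a) => Math.trunc(a / 2) === 0;

// The condition from the generated JS. It parses as ((b + 1) >>> 0) < 3;
// ">>> 0" reinterprets the sum as an unsigned 32-bit integer,
// so negative sums become very large positive numbers.
const compiledCheck = (b) => b + 1 >>> 0 < 3;

// Collect every input on which the two conditions disagree.
const mismatches = [];
for (let a = -1000; a <= 1000; a++) {
  if (cDivBy2IsZero(a) !== compiledCheck(a)) mismatches.push(a);
}

// Collect the inputs for which the compiled condition is true.
const acceptedValues = [];
for (let a = -5; a <= 5; a++) {
  if (compiledCheck(a)) acceptedValues.push(a);
}

console.log(mismatches.length, acceptedValues);
```

Both forms agree on the whole tested range, and the only accepted values are -1, 0 and 1: a single unsigned comparison replaces a division and a comparison with zero.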
Can this be called “zero-overhead DOM manipulation”? Even allowing for the fact that all the manipulation still goes through JS in exactly the same way (it would have been hard to expect anything else), at best you can talk about “zero overhead” compared to pure JS. The strange dances with a tambourine around strings clearly do not add performance, nor do they add to the joy of debugging all this.
As they say, don't believe advertising. But it is worth noting that the project is still under active development. When I forgot to put the [[cheerp::genericjs]] attribute on the void domOutput(int a) function and compiled with the “wasm” target, the compiler simply crashed with SIGSEGV. I opened an issue about it on the developers' GitHub; the next day they explained what my mistake was, and literally a week later a fix landed in the master branch. It may be worth keeping an eye on Cheerp in the longer term.
Stdweb
Speaking of compilers and libraries built for interaction between WASM, JS and Web APIs, one cannot fail to mention Stdweb for Rust.
It lets you inline JS code into Rust code with support for closures, and it provides wrappers for DOM objects and browser APIs that are as close as possible to what a JS developer is used to seeing:
let button = document().query_selector( "#hide-button" ).unwrap();
button.add_event_listener( move |_: ClickEvent| {
    for anchor in document().query_selector_all( "#main a" ) {
        js!( @{anchor}.style = "display: none;"; );
    }
});
The distribution comes with examples that implement various things in Rust/WASM, the most interesting of which is TodoMVC. It can be run with the cargo-web command
cargo web start --target-webasm-emscripten
As a result, we get a web server on port 8000 serving our application.
After compilation we see the same Emscripten helpers in the .js file, but what is far more interesting (bearing in mind the previous section) is how JS code is called from the WASM module and how objects are handled.
Just as in the second example (compiling C++ with Emscripten), the ASM_CONSTS array is filled with functions of approximately the following form:
var ASM_CONSTS = [
    function($0) { Module.STDWEB.decrement_refcount( $0 ); },
    function($0, $1, $2, $3) {
        $1 = Module.STDWEB.to_js($1);
        $2 = Module.STDWEB.to_js($2);
        $3 = Module.STDWEB.to_js($3);
        Module.STDWEB.from_js($0, (function() {
            var listener = ($1);
            ($2).addEventListener(($3), listener);
            return listener;
        })());
    },
    function($0) { Module.STDWEB.tmp = Module.STDWEB.to_js( $0 ); },
    function($0, $1) {
        $0 = Module.STDWEB.to_js($0);
        $1 = Module.STDWEB.to_js($1);
        ($0).appendChild(($1));
    },
    function($0, $1, $2) {
        $1 = Module.STDWEB.to_js($1);
        $2 = Module.STDWEB.to_js($2);
        Module.STDWEB.from_js($0, (function() {
            try {
                ($1).removeChild(($2));
                return true;
            } catch (exception) {
                if (exception instanceof NotFoundError) {
                    return false;
                } else {
                    throw exception;
                }
            }
        })());
    },
    function($0, $1) {
        $1 = Module.STDWEB.to_js($1);
        Module.STDWEB.from_js($0, (function() { return ($1).classList; })());
    },
    function($0, $1, $2) {
        $1 = Module.STDWEB.to_js($1);
        $2 = Module.STDWEB.to_js($2);
        Module.STDWEB.from_js($0, (function(){ return ($1).querySelector(($2)); })());
    },
    function($0) { return (Module.STDWEB.acquire_js_reference( $0 ) instanceof HTMLElement) | 0; },
    function($0) { return (Module.STDWEB.acquire_js_reference( $0 ) instanceof HTMLInputElement) | 0; },
    function($0) { $0 = Module.STDWEB.to_js($0); ($0).blur(); },
In other words, for example,
let label = document().create_element( "label" );
label.append_child( &document().create_text_node( text ) );
will be implemented with the help of helpers
function($0, $1, $2) {
    $1 = Module.STDWEB.to_js($1);
    $2 = Module.STDWEB.to_js($2);
    Module.STDWEB.from_js($0, (function() { return ($1).createElement(($2)); })());
},
function($0, $1, $2) {
    $1 = Module.STDWEB.to_js($1);
    $2 = Module.STDWEB.to_js($2);
    Module.STDWEB.from_js($0, (function() { return ($1).createTextNode(($2)); })());
},
function($0, $1) {
    $0 = Module.STDWEB.to_js($0);
    $1 = Module.STDWEB.to_js($1);
    ($0).appendChild(($1));
},
and, as you can see, it is not translated into one single JavaScript function: “pointers” to the objects involved are constantly passed back and forth between WASM and JS code. Since WASM code cannot work with JS objects directly, this trick is done in a rather interesting way, and you can look at the implementation in the source code of stdweb.
When a JS/DOM object is passed to WASM, it is added to key-value containers in JS that store the mapping “JS object ←→ unique RefId” in both directions, where the unique RefId is essentially an auto-incrementing number:
Module.STDWEB.acquire_rust_reference = function( reference ) {
    ...
    ref_to_id_map.set( reference, refid );
    ...
    id_to_ref_map[ refid ] = reference;
    id_to_refcount_map[ refid ] = 1;
    ...
};
At the same time, a check is made whether this object has already been passed before (if so, no new record is created and the reference count is incremented instead). An object type identifier is written to the memory of the WASM application (for example, 11 for Object, 12 for Array), followed by the object's RefId. When an object is passed in the opposite direction, it is simply looked up in the map by its unique ID and used.
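The bookkeeping described above can be sketched as a small standalone class. This is a simplified illustration under my own assumptions, not stdweb's actual code: the class name RefTable and its methods acquire, get and release are hypothetical, and a plain object stands in for a DOM element so that the example runs outside the browser.

```javascript
// Simplified sketch of a "JS object <-> integer id" table with reference counts.
class RefTable {
  constructor() {
    this.refToId = new Map();       // object -> id
    this.idToRef = new Map();       // id -> object
    this.idToRefcount = new Map();  // id -> number of outstanding references
    this.lastId = 0;                // ids are simply auto-incremented
  }

  // Return the id for an object, registering it on first use
  // and bumping the reference count on every later use.
  acquire(reference) {
    let id = this.refToId.get(reference);
    if (id !== undefined) {
      this.idToRefcount.set(id, this.idToRefcount.get(id) + 1);
      return id;
    }
    id = ++this.lastId;
    this.refToId.set(reference, id);
    this.idToRef.set(id, reference);
    this.idToRefcount.set(id, 1);
    return id;
  }

  // Look the object up again when the id comes back from the WASM side.
  get(id) {
    return this.idToRef.get(id);
  }

  // Drop one reference; forget the object when the count reaches zero
  // so that the JS garbage collector can reclaim it.
  release(id) {
    if (!this.idToRefcount.has(id)) return;
    const count = this.idToRefcount.get(id) - 1;
    if (count > 0) {
      this.idToRefcount.set(id, count);
      return;
    }
    this.refToId.delete(this.idToRef.get(id));
    this.idToRef.delete(id);
    this.idToRefcount.delete(id);
  }
}

const table = new RefTable();
const fakeElement = { tagName: "DIV" }; // stand-in for a DOM node
const firstId = table.acquire(fakeElement);
const secondId = table.acquire(fakeElement);   // same object, same id, refcount 2
table.release(firstId);
const aliveAfterOneRelease = table.get(firstId) === fakeElement;
table.release(firstId);
const goneAfterTwoReleases = table.get(firstId) === undefined;
console.log(firstId, secondId, aliveAfterOneRelease, goneAfterTwoReleases);
```

The important property is that the WASM side only ever holds the integer, while the table keeps the real object alive for exactly as long as some id for it is outstanding.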
Without benchmarks it is impossible to say exactly how much the work is slowed down by calling a JS function for every little thing from WASM, by type conversion (including string conversion), and by constant lookups of objects in tables. But overall, this approach to interaction between the two “worlds” seems to me much more elegant than the incomprehensible mix of code from the previous examples.
asm-dom
And finally, saving the tastiest for last: asm-dom. This is a virtual DOM library (for more on the Virtual DOM concept, see the article on Habr), inspired by the JavaScript VDOM library Snabbdom and intended for developing SPAs (single-page applications) in C++/WebAssembly.
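For readers who have not met the virtual DOM idea before, here is a deliberately tiny sketch of it in JavaScript. It is not how asm-dom or Snabbdom are implemented: the helper names h and diff and the shape of the patch records are my own assumptions, children are compared purely by position, and a removed attribute is reported as a setAttr with an undefined value. The point is only that the diff works on plain data and yields a short list of changes for the real DOM.

```javascript
// Hypothetical helper: build a virtual node as plain data (no real DOM involved).
function h(tag, attrs = {}, children = []) {
  return { tag, attrs, children };
}

// Compare two virtual trees and return the list of changes a renderer would apply.
// Text nodes are represented as plain strings.
function diff(oldNode, newNode, path = "root") {
  if (oldNode === undefined) return [{ op: "create", path, node: newNode }];
  if (newNode === undefined) return [{ op: "remove", path }];
  if (typeof oldNode === "string" || typeof newNode === "string") {
    return oldNode === newNode ? [] : [{ op: "replace", path, node: newNode }];
  }
  if (oldNode.tag !== newNode.tag) return [{ op: "replace", path, node: newNode }];

  const patches = [];
  // Attributes: report every name whose value differs between the two trees.
  const names = new Set([...Object.keys(oldNode.attrs), ...Object.keys(newNode.attrs)]);
  for (const name of names) {
    if (oldNode.attrs[name] !== newNode.attrs[name]) {
      patches.push({ op: "setAttr", path, name, value: newNode.attrs[name] });
    }
  }
  // Children: compare position by position (no keys, no reordering).
  const length = Math.max(oldNode.children.length, newNode.children.length);
  for (let i = 0; i < length; i++) {
    patches.push(...diff(oldNode.children[i], newNode.children[i], `${path}/${i}`));
  }
  return patches;
}

const beforeTree = h("div", {}, [
  h("span", { style: "font-weight: bold" }, ["This is bold"]),
]);
const afterTree = h("div", {}, [
  h("span", { style: "font-style: italic" }, ["This is now italic type"]),
]);

// Only the changed attribute and the changed text are reported.
const patches = diff(beforeTree, afterTree);
console.log(patches);
```

Each record in the resulting list corresponds to one real DOM operation, which is where the calls across the WASM/JS boundary would come from in a library of this kind.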
The code for describing page elements looks like this:
VNode* newVnode = h("div",
  Data(
    Callbacks {
      {"onclick", [](emscripten::val e) -> bool {
        emscripten::val::global("console").call<void>("log", emscripten::val("another click"));
        return true;
      }}
    }
  ),
  Children {
    h("span",
      Data(
        Attrs {
          {"style", "font-weight: normal; font-style: italic"}
        }
      ),
      std::string("This is now italic type")
    ),
    h(" and this is just normal text", true),
    h("a",
      Data(
        Attrs {
          {"href", "/bar"}
        }
      ),
      std::string("I'll take you places!")
    )
  }
);

patch(
  emscripten::val::global("document").call<emscripten::val>(
    "getElementById",
    std::string("root")
  ),
  newVnode
);
There is also gccx, a converter that generates code like the above from CPX, which in turn is an analog of JSX (familiar to many from ReactJS) and lets you describe components right inside C++ code:
VNode* vnode = (
  <div
    onclick={[](emscripten::val e) -> bool {
      emscripten::val::global("console").call<void>("log", emscripten::val("clicked"));
      return true;
    }}
  >
    <span style="font-weight: bold">This is bold</span>
    and this is just normal text
    <a href="/foo">I'll take you places!</a>
  </div>
);
The conversion of the virtual DOM into the real DOM, like the interaction between WASM code and the Web API in general, happens either through generating HTML and setting the innerHTML property of objects, or in a way similar to the previous example:
var addPtr = function addPtr(node) {
  if (node === null) return 0;
  if (node.asmDomPtr !== undefined) return node.asmDomPtr;
  var ptr = ++lastPtr;
  nodes[ptr] = node;
  node.asmDomPtr = ptr;
  return ptr;
};

exports['default'] = {
  …
  'appendChild': function appendChild(parentPtr, childPtr) {
    nodes[parentPtr].appendChild(nodes[childPtr]);
  },
  'removeAttribute': function removeAttribute(nodePtr, attr) {
    nodes[nodePtr].removeAttribute(attr);
  },
  'setAttribute': function setAttribute(nodePtr, attr, value) {
    ...
The project's GitHub page also links to performance tests comparing it with the JS VDOM library Snabbdom. They show that in some test cases the WASM version loses to JS, in some it overtakes it, and only in one test, run on Firefox, is there a serious speedup. In principle, such results are not surprising: JS calls are still used to update the “real” DOM tree, and in addition, when JS code runs, the “garbage” from deleted objects stays in the heap until the garbage collector kicks in, whereas asm-dom honestly frees objects immediately when needed, which also affects performance.

In README.md the author of the library himself laments that GC/DOM integration is not yet possible in WebAssembly, but he is optimistic about this functionality being implemented. Let's hope that asm-dom will then shine in all its glory.
Useful links:
- Introduction to WASM
- Interview with Brendan Eich about WebAssembly
- Native ImGui in the Browser
- Qt for WebAssembly
- Emscripten Documentation
- Cheerp - the C++ compiler for the Web
- Stdweb: A standard library for the client-side Web
- asm-dom: A minimal WebAssembly virtual DOM to build C++ SPA