Integrating an HTML engine into a native Windows application: choice and architecture
How we moved HTML handling in 1C:Enterprise from Internet Explorer to WebKit
The ability to display HTML in 1C forms appeared in the 1C:Enterprise platform in version 8.0, in 2003. To work with HTML, the platform used the Internet Explorer browser engine (at that time 1C:Enterprise ran only on Windows). The platform used the browser engine for utilitarian purposes. For example, writing a full-fledged Word-like text editing control from scratch, with a choice of colors and fonts, image insertion and so on, is a very difficult task. If you use HTML for this and a browser engine to display it, the task becomes much simpler. A number of other mechanisms (for example, displaying help) and elements (for example, the Scheduler) were also implemented with the engine.
In addition, the ability to display HTML let developers of application solutions use designs that are unusual by the standards of accounting systems, which sometimes brought attractive touches to the interface of business applications.
As time went on, the platform began to support first Linux and then macOS. Internet Explorer was obviously not an option on these operating systems; we used WebKitGTK+ on Linux and a Cocoa-based library on macOS. As a result, the unity of the code base across operating systems (which we try to keep at about 95% for the client code) was broken in this area. Besides, by this time the IE engine had become a source of several problems:
The IE engine is closed source, which means:
It is impossible to customize the engine to the needs of the platform
It is impossible to debug the engine and understand the processes occurring inside it
Bugs and errors cannot be eliminated by updating the version of the engine
The engine is not suitable for modern web programming tasks
Performance problems on low-end machines
So, moving HTML handling in the Windows version of 1C:Enterprise from the IE engine to something else suggested itself. What to choose?
To begin with, we formulated the requirements for the engine:
Support for modern web programming technologies
Open source code, so that the engine can be flexibly customized and the logic of its work understood
High performance on low-end computers
Preferably, only a small number of third-party libraries required for the engine to work
Engine selection
What to choose from? We started, of course, with WebKit, which we had already worked with in the Linux and macOS versions of the platform.
WebKit was developed by Apple in the early 2000s on the basis of the open source engines KHTML and KJS. Safari was built on WebKit, and later so was Chrome (Chrome subsequently switched from WebKit to Blink, which in turn is based on the WebCore code from WebKit).
The WebKit source code is open and licensed under the LGPL. WebKit is written primarily for macOS; there are several ports for Windows:
WebKit AppleWin
This is the port that the WebKit developers suggest building on Windows by default. It was created by Apple employees in the mid-to-late 2000s. It uses the CoreGraphics graphics library, a simplified version of the macOS library ported to Windows. To execute JavaScript, the port uses the JavaScriptCore library, with the same API that is used in the Linux implementation of the platform. This made it the prime candidate.
WebKit WinCairo
This port uses the Cairo library for graphics. For some time Apple actively developed it as an alternative to the main AppleWin port. Its advantage is that it depends less on the macOS-specific libraries that CoreGraphics requires. In addition, the port uses the same graphics library (Cairo) as the WebKitGTK+ engine that we use in the Linux implementation of the platform, which helps keep the behavior of our code uniform.
QtWebKit
Another implementation of the WebKit engine for Windows, this time independent of the developers of the engine itself. Qt is a popular cross-platform library with its own graphics library, QtGui. This port also uses the JavaScriptCore library to handle JavaScript, but it has disadvantages:
A strong dependency on the core Qt components, which third-party software using the port has to ship with it.
A set of interfaces for working with the HTML rendering components that differs from WebKitGTK+, with its own logic for working with them.
WebKitGTK+ for Windows
We already used WebKitGTK+ in the Linux version of the platform. However, using it on Windows was ruled out because of the complexity of the build, the poor documentation of the build process, and the lack of continuous support for this line of development from the WebKitGTK+ developers.
Chromium (Blink)
The only engine outside the WebKit family that we considered as a candidate. It was rejected because the logic of its HTML rendering components differs greatly from WebKitGTK+ and because it uses a different library for working with JavaScript (V8).
What to choose?
After this research, AppleWin and WinCairo made it to the final. To make the final choice, we studied how WebKit works.
WebKit engine structure
WebKit ports usually differ in two things. The first is the implementation for a specific OS, which uses OS-specific components. The second is the graphics library. The figure below describes the differences between the WebKit ports in this sense. For Windows, the WebKit developers wrote ports based on an adapted CoreGraphics library and on the Cairo library.
A simplified model of the engine: the three traditional building blocks of a web page (HTML, JavaScript and CSS) are fed to the engine, which builds and displays the page from them:
The engine itself consists of several components:
WTF (Web Template Framework, not what you might have thought): the engine's own implementations of data structures, as well as support for working with threads
JavaScriptCore: as the name implies, the component for working with the JavaScript language
WebCore: all the work with the DOM, styles, and HTML and XML parsing is implemented here. This is where the main "magic" of the engine happens
Platform: performs technical tasks such as interacting with the network, storing data in a database, decoding images and working with media
WebKit and WebKit2 API: ties all the components together and provides access to them
The relationship between the WebKit components and OS-specific features is shown in the figure below. As you can see, quite a few things have to be implemented separately for each OS, although JavaScriptCore can be used in every port without a separate implementation.
How a web page is formed
A response to a request arrives from the network with the data to load. The loader passes the data to the parser, which, interacting with the JavaScript component, builds the DOM and the style sheets. The resulting data is then passed to the render tree and drawn into a graphics context. The page itself also consists of separate components. WebCore implements the Page class, which gives access to the whole page. A Page has a main frame (MainFrame), and a frame always contains a document. The main frame can contain any number of other frames, also with documents inside. Each frame has its own events as well as its own graphics and JavaScript contexts.
In simplified form, the HTML parser works like this. From the bytes received from the server, the decoder produces a sequence of characters for parsing. The characters are converted into tokens, which usually represent elementary parts of the code together with meta-information about what the text is: part of the language syntax or content. Nodes for building the DOM tree are then formed from the tokens. Finally, the tree builder forms the complete object model of the web page document from the set of nodes.
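The stages described above (bytes, characters, tokens, nodes, tree) can be illustrated with a toy pipeline. This is only a minimal sketch in Python, chosen for brevity; it is not WebKit code. The names Token, Node, decode, tokenize and build_tree are hypothetical, and the tokenizer ignores attributes, comments, scripts and error recovery, all of which a real HTML parser has to handle.

```python
import re
from dataclasses import dataclass, field
from typing import List

# Either a start/end tag (attributes are skipped) or a run of text.
TOKEN_RE = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>|([^<]+)")


@dataclass
class Token:
    kind: str  # "start", "end" or "text"
    data: str  # tag name or text content


@dataclass
class Node:
    name: str
    text: str = ""
    children: List["Node"] = field(default_factory=list)


def decode(raw: bytes, encoding: str = "utf-8") -> str:
    """Stage 1: bytes from the network become characters."""
    return raw.decode(encoding)


def tokenize(chars: str) -> List[Token]:
    """Stage 2: characters become tokens (syntax vs. content)."""
    tokens = []
    for match in TOKEN_RE.finditer(chars):
        closing, tag, text = match.groups()
        if tag:
            tokens.append(Token("end" if closing else "start", tag.lower()))
        elif text and text.strip():
            tokens.append(Token("text", text.strip()))
    return tokens


def build_tree(tokens: List[Token]) -> Node:
    """Stages 3 and 4: tokens become nodes, nodes are linked into a tree."""
    root = Node("#document")
    open_elements = [root]
    for token in tokens:
        if token.kind == "start":
            node = Node(token.data)
            open_elements[-1].children.append(node)
            open_elements.append(node)
        elif token.kind == "end":
            # Close only a matching element; stray end tags are ignored.
            if len(open_elements) > 1 and open_elements[-1].name == token.data:
                open_elements.pop()
        else:
            open_elements[-1].children.append(Node("#text", token.data))
    return root


raw = "<p>Hello <b>мир</b></p>".encode("utf-8")
tokens = tokenize(decode(raw))
document = build_tree(tokens)
```

Here document has a single child p, which contains a text node and a b element with its own text node.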
Final choice
AppleWin
Pros:
Implemented on the graphics library that runs on macOS, the main target platform for WebKit developers.
Cons:
No implementation of the printing mechanism
A large number of dependencies
WinCairo
Pros:
The same graphics library (Cairo) as used in the Linux port of the 1C platform
Cons:
None that are significant for our tasks were found
WinCairo won. For development we took the latest WebKit version available at that time, 605.1.11.
Implementation
Although the engine is quite well covered with unit tests (the engine authors wrote about 30,000 of them for all components), the implementations for non-core OSs (that is, for everything that is not macOS) contain errors and shortcomings. These gaps were discovered gradually as the engine was developed and tested as part of the 1C:Enterprise platform.
Loading HTML code via Drag & Drop
When text was dragged into the window, we found that if it contained non-ASCII characters, garbled characters were inserted into the final document. The error appeared only in the Windows implementation of the engine, because it lay in the code that works with the OS-specific drag-and-drop mechanism. It turned out that the dragged text was not converted to UTF-16 before the insert event was passed to the handler.
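The symptom (ASCII survives, everything else turns into garbage) is typical of bytes being read in the wrong encoding. The sketch below reproduces it under an assumption that the article does not confirm: that the dragged payload arrives as UTF-8 bytes and the buggy path widens each byte to a character. The function names are hypothetical, and Python is used only for brevity.

```python
def widen_bytes_naively(payload: bytes) -> str:
    """Buggy path (assumed): every byte is treated as one character."""
    return payload.decode("latin-1")


def decode_payload(payload: bytes, encoding: str = "utf-8") -> str:
    """Fixed path: decode with the payload's real encoding first."""
    return payload.decode(encoding)


original = "Счёт №42 total"
payload = original.encode("utf-8")

garbled = widen_bytes_naively(payload)  # non-ASCII part is corrupted
restored = decode_payload(payload)      # matches the original text

# What a UTF-16 based engine would receive after correct decoding.
utf16_units = restored.encode("utf-16-le")
```

The ASCII tail of the string ("42 total") is identical in both results, which is why such a bug can go unnoticed until someone drags non-ASCII text.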
Changing the Shift+Enter behavior
In most text editors (including Microsoft Word), this combination inserts a line break. The standard behavior of WebKit is to insert a new paragraph (as if Enter alone had been pressed). We changed the engine to make the behavior more familiar to users.
Implementing the Undo & Redo mechanism
WebKit provides an API for implementing your own undo and redo. The scheme is as follows: when the user performs an action that is discrete from the engine's point of view (for example, starting a new paragraph, applying italics, pasting), WebKit notifies the developer through API methods so that the action can be registered.
Testing the implemented mechanism revealed an unpleasant fact: the engine does not report changes to the structure of tables. We added commands for adding and removing cells and for changing the colSpan attribute; these became parts of compound actions such as adding or deleting a table column or row. Such compound commands are registered on the same undo & redo stack and, together with the commands from the engine, ensure that the mechanism works correctly.
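The idea of compound commands sharing one stack with the engine's own actions can be sketched as follows. This is an illustration in Python, not the platform's code and not the WebKit undo API; Command, CompoundCommand, UndoStack, add_cell and add_column are hypothetical names, and the table is modeled as a plain list of rows.

```python
class Command:
    """One discrete, reversible action."""

    def __init__(self, redo, undo):
        self._redo, self._undo = redo, undo

    def redo(self):
        self._redo()

    def undo(self):
        self._undo()


class CompoundCommand:
    """Several commands that are undone and redone as a single step."""

    def __init__(self, steps):
        self.steps = list(steps)

    def redo(self):
        for step in self.steps:
            step.redo()

    def undo(self):
        for step in reversed(self.steps):  # undo in reverse order
            step.undo()


class UndoStack:
    def __init__(self):
        self._undo, self._redo = [], []

    def register(self, command):
        """Record an action that has already been performed."""
        self._undo.append(command)
        self._redo.clear()

    def undo(self):
        if not self._undo:
            return False
        command = self._undo.pop()
        command.undo()
        self._redo.append(command)
        return True

    def redo(self):
        if not self._redo:
            return False
        command = self._redo.pop()
        command.redo()
        self._undo.append(command)
        return True


def add_cell(row, index, value):
    return Command(redo=lambda: row.insert(index, value),
                   undo=lambda: row.pop(index))


def add_column(table, index, value=""):
    # Adding a column is one cell insertion per row, grouped together.
    return CompoundCommand(add_cell(row, index, value) for row in table)


table = [["a1", "b1"], ["a2", "b2"]]
stack = UndoStack()

command = add_column(table, 1)
command.redo()           # perform the action
stack.register(command)  # then register it, as with engine-reported actions
```

A single stack.undo() now removes the whole column, and stack.redo() puts it back, even though each is made of several cell-level commands.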
Pasting from Excel
Those who have worked with the Windows clipboard and Excel may know two things. First, when Excel copies to the HTML clipboard format, only the cell and row tags are placed in the copied fragment, not the tag of the table itself. Second, no styles from the Excel document are transferred to the cells. Because of this, a colored table pasted into, for example, an editable element in Chrome looks like this:
Original:
In Chrome:
The WebKit developers do not take either of these factors into account. Because the engine code is open, we were able to refine the paste mechanism, and now a table fragment pasted into the HTML document field is close to the original.
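The first half of the problem, a fragment that contains rows or cells but no table tag, can be handled by wrapping the fragment before it is inserted. The sketch below shows only that idea; it is not the code used in the platform, wrap_table_fragment is a hypothetical helper, and restoring cell styles is not covered.

```python
import re

_STARTS_WITH_ROW = re.compile(r"\s*<tr\b", re.IGNORECASE)
_STARTS_WITH_CELL = re.compile(r"\s*<t[dh]\b", re.IGNORECASE)


def wrap_table_fragment(fragment: str) -> str:
    """Give bare rows or cells the enclosing tags they are missing."""
    if _STARTS_WITH_ROW.match(fragment):
        return f"<table>{fragment}</table>"
    if _STARTS_WITH_CELL.match(fragment):
        return f"<table><tr>{fragment}</tr></table>"
    return fragment  # not a table fragment, leave untouched


rows_only = "<tr><td>1</td><td>2</td></tr>"
wrapped = wrap_table_fragment(rows_only)
```

Without the wrapper, an HTML parser drops row and cell tags that appear outside a table, so the pasted content loses its grid structure.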
Generating italic fonts
If Windows does not have an italic variant of a non-standard font, most text editors can synthesize one from the regular variant. WebKit could not do this, and it baffled our developers a couple of times: the text in the HTML code of the document is wrapped in the <i> tag, yet it is still displayed upright. The reason lies in the font selection algorithm of the WebKit engine in the WinCairo port we use: if no italic variant is present, the engine uses the regular one. We replaced this behavior with the use of an italic font generated by the Cairo graphics library.
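The selection logic with a synthesized fallback can be sketched roughly like this. It is an illustration only: FaceChoice, select_face and shear_point are hypothetical names, the 14 degree slant is an assumed value, and a real implementation would apply the shear through the graphics library's font matrix rather than to individual points.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class FaceChoice:
    family: str
    style: str    # face actually loaded: "regular" or "italic"
    shear: float  # horizontal shear applied when drawing, 0.0 means none


def select_face(installed, family, want_italic, slant_degrees=14.0):
    """Pick a real italic face if present, otherwise slant the regular one."""
    if want_italic and (family, "italic") in installed:
        return FaceChoice(family, "italic", 0.0)
    if want_italic:
        # No italic file on disk: synthesize an oblique from the regular face.
        return FaceChoice(family, "regular", math.tan(math.radians(slant_degrees)))
    return FaceChoice(family, "regular", 0.0)


def shear_point(x, y, shear):
    """Shift a glyph point right in proportion to its height above the baseline."""
    return (x + shear * y, y)


installed = {("Arial", "regular"), ("Arial", "italic"), ("CorpFont", "regular")}
real_italic = select_face(installed, "Arial", want_italic=True)
synthetic = select_face(installed, "CorpFont", want_italic=True)
```

Here real_italic uses the installed italic face unchanged, while synthetic keeps the regular face and carries a positive shear factor.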
Errors when decoding images and animations
We also found errors in the engine's handling of graphics. When some PNG images were loaded, the image was distorted or sometimes missing altogether. We could not determine the cause, since the error occurs during image decoding deep inside the libpng library.
We found empirically that the problem disappears when libpng is linked dynamically instead of statically. Incidentally, the current version of the engine links it this way, so we decided to do the same.
Another problem was the engine's handling of animated GIFs. The error was reproduced intermittently when a page with such animations was loaded, and it crashed the program. It was caused by a lack of synchronization when working with the buffer that holds the next frame of the animation. We solved the problem with WebKit's internal synchronization tools.
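The kind of fix described, guarding the shared frame buffer so that a reader never sees a half-updated frame, can be illustrated with a small model. This is a sketch in Python using a standard lock, not WebKit's synchronization primitives; FrameBuffer and run_demo are hypothetical names.

```python
import threading


class FrameBuffer:
    """Holds the next animation frame; all access goes through one lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._index = -1
        self._pixels = None

    def publish(self, index, pixels):
        # Index and pixels change together, so readers never see a mix.
        with self._lock:
            self._index = index
            self._pixels = bytes(pixels)

    def snapshot(self):
        with self._lock:
            return self._index, self._pixels


def run_demo(frame_count=200):
    buffer = FrameBuffer()
    mismatches = []
    done = threading.Event()

    def decoder():
        for i in range(frame_count):
            buffer.publish(i, bytes([i % 256]) * 64)
        done.set()

    def painter():
        while not done.is_set():
            index, pixels = buffer.snapshot()
            if pixels is not None and pixels != bytes([index % 256]) * 64:
                mismatches.append(index)

    threads = [threading.Thread(target=decoder), threading.Thread(target=painter)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return buffer.snapshot()[0], mismatches


last_index, mismatches = run_demo()
```

With the lock in place the painter always observes a frame index that matches its pixel data; removing the lock would allow the two fields to be read between the writer's two assignments.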
Spell checking support
In the build with Internet Explorer, on Windows 8 and later, spell checking could be enabled in an editable HTML field. It was enough to set the “spellcheck” attribute to “true”. WebKit has different solutions for different ports: on Linux it is the Enchant library, on macOS it is the system's own mechanism, familiar to all users of Apple products. For Windows there is no implementation, but an API is provided for plugging in your own. We used the Windows Spell Checking API, available since Windows 8, and implemented a mechanism similar to the one in the build with Internet Explorer. Incidentally, this functionality now also works in formatted documents in native 1C:Enterprise clients. Before version 8.3.14 it was disabled there because of the poor performance of Internet Explorer.
Windows XP support
Some of our clients still work on Windows XP and are not going to upgrade the OS in the near future. Sad for us as developers, but true. So we need to support them. And here an unpleasant surprise awaited us: the WebKit developers had not supported the engine on WinXP for about a year. An attempt to build the engine with the WinXP build toolset failed, because the WebKit developers use libraries that are available only in Windows Vista and later.
What to do? The options were as follows:
Keep the Internet Explorer implementation on WinXP and use WebKit on newer Windows versions
Take an earlier version of the WebKit engine that works on WinXP as the basis and use it on all operating systems
Use a suitable older version of WebKit on WinXP and a fresh engine on newer versions of Windows
Port the current version of the engine to WinXP ourselves and use it everywhere
The question was not an easy one. The first option allowed us to use the latest version of the WebKit engine, but would force us to bring back the old Internet Explorer implementation. With such a solution it would be difficult to ensure error-free operation, and the code itself would become very complicated. The second option gave the same behavior on all Windows versions, but left no room for development: we could not update the engine to fix errors or get new features from later versions. The third option allowed us to use the current version of the engine on newer versions of Windows, but greatly complicated the installation logic and made it hard to ensure the same behavior on all OS versions. The fourth option looked preferable to all the others, but it was impossible to predict how complex it would be, or whether such a solution was feasible at all.
We decided to take the risk and implement the fourth option, the most correct from an architectural point of view (a single engine source code for all versions of Windows). The ported version of WebKit works differently on WinXP and on newer versions of Windows:
We had to abandon the new DirectX (d3d11) in favor of the old DirectX 9 (d3d9) and adapt its header files to the older version of the SDK.
On new versions of Windows, functions from the new SDK are called through an address obtained via GetProcAddress.
On WinXP, thread-local storage is used to pass data between threads in the engine, and on newer versions fiber-local storage is used.
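Resolving a function by name at run time, instead of linking to it, is what lets one binary start on an OS where the function does not exist. The pattern can be tried from Python through ctypes, whose attribute lookup goes through GetProcAddress on Windows and dlsym elsewhere. This is only a sketch: load_system_library, resolve_optional and pick_local_storage are hypothetical helpers, and tying the storage choice to the presence of FlsAlloc is an assumption made for illustration, not something the article states.

```python
import ctypes
import sys


def load_system_library():
    """Open a library that is always present in the current process."""
    if sys.platform == "win32":
        return ctypes.WinDLL("kernel32")
    return ctypes.CDLL(None)  # POSIX: symbols of the running process


def resolve_optional(library, name):
    """Return the exported function, or None if this OS does not provide it."""
    try:
        return getattr(library, name)
    except AttributeError:
        return None


def pick_local_storage(library):
    """Prefer fiber-local storage when its API is exported, else fall back."""
    if resolve_optional(library, "FlsAlloc") is not None:
        return "fiber-local storage"
    return "thread-local storage"


library = load_system_library()
storage = pick_local_storage(library)
```

On a system without FlsAlloc the lookup simply returns None and the code takes the fallback path, instead of failing to load as it would with a static import.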
Summary
So, starting with version 8.3.14 (released at the end of 2018), HTML in the 1C:Enterprise platform is supported at the highest level: HTML5, OpenGL and so on. Both the quantity and the quality of the attractive touches that can be brought into solutions on our platform are now limited only by the developer's imagination. And, of course, by the client's operating system: on WinXP many of the HTML5 goodies cannot work, for obvious reasons.
Now on Windows, applications on the 1C: Enterprise platform will be able to show this:
But when using the HTML "goodies" in application solutions, do not forget common sense. Using HTML is advisable and recommended for specialized tasks (displaying help, guidelines, product descriptions, ...), but not for implementing business logic tasks (input and output of structured information). For those, you should use the standard 1C:Enterprise interface mechanisms, which automatically support access rights, functionality management, adaptation to the device form factor, user settings and many other mechanisms without which a full-fledged business application is almost impossible.