
How ONLYOFFICE reconciled two generations of Microsoft formulas

When we were developing our document editors, we wanted users to be able to work conveniently with any object. One of the obstacles on the way to editing everything correctly in a single window was formulas, or rather their dual nature. Students at technical universities have run into this almost without exception: MS Office editors have both “old” formulas (from the binary .doc era) and “new” formulas (from the XML-based docx format).

In this article we describe how ONLYOFFICE editors solve this problem. The answer is simple: C is for “Conversion”. We convert old formulas into editable new ones, and we are very happy with the idea. Read on to find out why we chose this route and how the conversion works.



Formula dualism


Microsoft editors still let you create formulas in two different formats.

The old format covers formulas created in MS Office before 2007 with the Microsoft Equation add-on. To create such a formula in Word, for example, the user calls a third-party editor through the menu (Insert -> Object -> Microsoft Equation). This command opens the formula editor, which is actually a cut-down version of MathType from Design Science.

So the old formulas are OLE objects. Word simply hands a certain area of the document over to another application, without even knowing what that application does there. Once MS Equation is closed, Word treats the formulas created in it as pictures embedded in the text. They cannot be edited in the text itself: you have to open the formula editor again.

In 2007 the transition to docx began. Along with it, Microsoft introduced its own formula editor, which is far more capable. First, it has more mathematical symbols and templates. Second, it lets Word work with formulas as part of the text rather than as pictures. In other words, the new formula editor is a WYSIWYG editor.

It seemed that life should have become easier for mathematicians and everyone else who needs formulas. But there was a problem. The transition to docx did not happen overnight, and a large body of documents remained in the doc format. Moreover, many people stubbornly keep saving documents as doc. This is not surprising: many still have old computers and old versions of MS Office.

So, users still save files in both docx and doc. In addition, there are a huge number of doc format documents that will never be converted to docx, and a huge number of people who have to deal with these documents and the formulas created in them.

How different document editors solve the problem





ONLYOFFICE approach



Naturally, we wanted to support both types of formulas. At the same time, it seemed logical to take the new formulas as the basis. After all, our main format is docx.

We did not want to build two editors, as Microsoft did: that is resource-intensive and a dead end. Besides, we can already display the old formulas stored in docx perfectly well, so writing a separate editor for them would be very strange. Here is what we decided: let users do whatever they want with the new formulas, and show the old ones as vector images. This does not mean the old formulas cannot be changed. One elegant mouse gesture (also known as a double click) converts an old formula into a new, fully editable one.

How the conversion works



An old formula is stored in a docx document in two forms: as a WMF vector image and as an OLE object (a binary blob containing the old formula). If a document arrives with an old formula, we show it as a picture, exactly as Word saved it. As a result, a file with old formulas opens in ONLYOFFICE editors looking the same as it does in Word.
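To make the two-part storage more concrete, here is a minimal Python sketch that lists the embedded OLE binaries and WMF pictures inside a docx package. It is not ONLYOFFICE code. The part paths (`word/embeddings/*.bin`, `word/media/*.wmf`) are the layout Word typically produces and are an assumption here, and the function names are hypothetical.

```python
import io
import zipfile


def build_sample_docx():
    """Build a tiny in-memory package that mimics the assumed layout."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("word/document.xml", "<w:document/>")
        package.writestr("word/embeddings/oleObject1.bin", b"\x00")
        package.writestr("word/media/image1.wmf", b"\x00")
    return buffer.getvalue()


def find_legacy_formula_parts(docx_bytes):
    """Return (ole_parts, pictures) found in a docx package.

    Assumption: OLE binaries live under word/embeddings/ and their
    preview pictures under word/media/. Not every OLE object is a
    formula, so this only narrows down the candidates.
    """
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as package:
        names = package.namelist()
    ole_parts = sorted(
        n for n in names
        if n.startswith("word/embeddings/") and n.endswith(".bin")
    )
    pictures = sorted(
        n for n in names
        if n.startswith("word/media/") and n.lower().endswith(".wmf")
    )
    return ole_parts, pictures


ole_parts, pictures = find_legacy_formula_parts(build_sample_docx())
print(ole_parts, pictures)
```

A real converter would still have to open each binary and check that it really contains an equation before treating it as an old formula.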

During conversion some formatting may shift, because the old and new formula formats are radically different. This is easy to fix, though, since after conversion every object in the document is editable.

A little secret: the formulas are actually converted before the user even asks. We parse the binary on the server before the file is opened, and show the pictures to preserve the look of the document. When the user needs it, we quickly swap a picture for the already converted formula.
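The "convert early, swap on demand" idea can be sketched roughly as follows. This is only an illustration under assumptions: the `LegacyFormula` class, its fields, and the string stand-ins for the picture and the converted formula are hypothetical and do not reflect the actual ONLYOFFICE data model.

```python
from dataclasses import dataclass


@dataclass
class LegacyFormula:
    """An old formula that carries both of its representations."""

    preview_image: str            # stand-in for the WMF picture saved by Word
    converted: str                # stand-in for the formula converted in advance
    show_converted: bool = False  # the picture is shown until the user asks

    def on_double_click(self):
        # No parsing happens here: conversion was done before the file
        # was opened, so the swap is just a flag flip.
        self.show_converted = True

    def render(self):
        return self.converted if self.show_converted else self.preview_image


formula = LegacyFormula(preview_image="image1.wmf", converted="x^2+y^2=z^2")
print(formula.render())   # the picture is shown first
formula.on_double_click()
print(formula.render())   # now the editable formula is shown
```

Keeping both representations side by side is what lets the document look unchanged on opening while the switch to an editable formula stays instant.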

Here we can claim to be original: no one else converts the old format into the new one at all, not even Word. ONLYOFFICE editors let you correct the formulas from a doc file, then save everything as docx and never think about it again.

To make this conversion possible, we had to learn how to read a closed format for which we had no specification. Essentially, this meant reverse engineering. We spent quite a lot of time with hundreds of small files containing formulas, watching how the binary changed depending on what we added to the formula. It was worth it, though. We couldn't leave users with pictures instead of formulas, could we? :)

Difficulties



The only difficulty in converting old formulas into new ones is that the two formats are not fully compatible. Where a feature has no direct counterpart, we had to work hard to translate the old formula into the closest equivalent.

For example, consider a system of three equations in the old format



In the old format, the equations here are left-aligned (center and right alignment are also available). The new format has no such alignment setting: equations are placed in the center.



To get left alignment in the new formulas, it is enough to put & at the beginning of each equation. After also increasing the minimum spacing between the baselines, we get



In principle, this is an acceptable conversion result. If needed, you can then edit the system a little more: by placing & in the right places in front of the variables, we get a result that is difficult to achieve with the old formulas
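The & alignment marks described above can be illustrated with a short Python sketch. It treats each equation as a plain string, which is a simplification and not the way formulas are actually stored in docx. The helper names `left_align` and `align_at` are hypothetical.

```python
def left_align(equations):
    """Put an alignment mark at the start of every equation."""
    return ["&" + eq for eq in equations]


def align_at(equations, token):
    """Put an alignment mark before the first occurrence of token.

    Equations that do not contain the token are left unchanged.
    """
    aligned = []
    for eq in equations:
        head, sep, tail = eq.partition(token)
        aligned.append(head + "&" + sep + tail if sep else eq)
    return aligned


system = ["x+y=1", "2x-y=0", "x=3"]
print(left_align(system))     # marks at the start of each line
print(align_at(system, "="))  # marks in front of the equals signs
```

The first helper reproduces the left alignment of the old format, while the second shows the kind of finer alignment that becomes possible once the formula is editable.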



Instead of a conclusion



We tried to make working with formulas as convenient as possible for ONLYOFFICE users, and we are quite satisfied with the result. We hope that by converting old-format formulas into the new format we have contributed to the universal transition to docx. Just take your formulas out of doc files and never go back.

We also continue to improve our editors. The main priority in the near future is footnotes. In addition, a large package of updates for all ONLYOFFICE modules will be released very soon. One of the main ones is sharing a docx document with review rights (this appeared in version 3.6 of the editors, but it has now become even more convenient to use). In general, stay with us :)

Source: https://habr.com/ru/post/281134/

