📜 ⬆️ ⬇️

Calculation of formulas in office editors. Part 1


Picture to attract attention. Mechanical calculator

Hello. Today we want to present to your attention an introductory article on formulas in office editors. We will also touch on how we solve the problem of presenting and calculating formulas in the kernel of the editor MoiOffice , and outline a plan for future articles on this topic.

Popular standards


To date, there are two of the most popular standards that are used in editors. On the one hand, it is the MS Excel spreadsheet editor and its ECMA OOXML standard. On the other hand are the LibreOffice / OpenOffice Calc family of editors and the OpenFormula standard. To the credit of the latter, it was developed taking into account that it could be used not only in office applications.

For a regular user, formulas in the table editor are associated with a function call. With all versions of editors, there are about four and a half hundreds of functions. They are conventionally divided into application groups: engineering, for working with databases, dates and times, financial, logical, informational, mathematical and trigonometric, statistical, textual, etc. But regardless of the scope of application, all functions follow the same rules. They can work with certain types of input data or use the result of each other's work.
')

Differences in Spreadsheet Editing Standards


As for the functions, there are quite a few differences in both standards. They may have a different number of parameters. In turn, the parameters can have different types of input values. Even the accuracy of the calculations may differ. But not less, but in fact even much more differences in the very syntax of formulas. Below in the table we will give just a few examples of differences in the English locale for the record (the format for writing formulas in different locales may also differ significantly) :


In the last example, the space in MS Excel syntax between two ranges is significant and this is the binary range-crossing operator.

In the syntax of formulas there are indeed differences, but they are not so large as to prevent the development of a single internal representation for both entries of the formula language. The internal representation is a syntactic parse tree, in which there are no differences between the different syntaxes of formulas. Of course, not all constructions can be unambiguously converted from one record to another and there is a certain level of incompatibility. Therefore, in the case of dramatic differences, one has to choose the side of one record and most often this is MS Excel. We support both formats for opening documents, but the main format is OOXML. Everything that we open is converted into a single representation that does not depend on the format. You can save this view in both formats. We tried to keep the differences to a minimum in order to present both formats in a single model. For example, all the above examples in the table for both formats will have an absolutely identical parse tree in our presentation model.

In addition to the differences between entries, there are problems using tokens within a single entry. We are talking about the operator space in OOXML. Not always the gap in the formula is a harmless empty space. Sometimes this is a cell range intersection operator.


An example of using the space operator when combining a cell with a list of cells and intersecting with a range of cells is used.

A space in the syntax creates some ambiguous situations in the record itself. Here is an example of very similar in appearance, but very different in fact Excel records:

= SUM ( A1 , A2 , A3 )
= SUM ( A1 , A2 , A3 )

In the first case, this is a call to the SUM function with a list of parameters of three elements. In the second case, this is a binary space between two parameters, the first of which is a string identifier, which is regarded as a named range of cells , and the second is a list of parameters from three elements. The second case is a valid entry in terms of the syntax of the OOXML formula language. And the result will be the cumulative result of the intersection between the named range and the parameter list.

It is worth mentioning that this entry makes sense: '= SUM A1 '. For example, SUM is a named range A1: B2 , in which case the result of the given example will be the value from cell A1 .

As a conclusion, we say that we initially set ourselves the task of giving our users a full-fledged functionality for working with formulas. The functionality that will not yield, but somewhere and exceed the products - market leaders. Thus, we will be able to raise the bar for the quality of office products. And healthy competition will have a positive impact on the entire ecosystem of editors in the world.

In the following articles we will tell



Thank you and wish you # DIV / 0! nor #NULL!

Source: https://habr.com/ru/post/268351/


All Articles