Arithmetic of arbitrary precision in Erlang

@rawpixel

Even schoolchildren are aware of the existence of various number systems and the fact that not every finite decimal fraction is a finite fraction in a binary number system. Few people think that because of this fact, operations on float and double are not precise.

If we talk about Erlang, then, like many other languages, it implements the IEEE754 standard for float, while the standard type Integer in Erlang is implemented using arbitrary precision arithmetic. However, I would like to have not only bigint, but also the possibility of operating with rational, complex and floating-point numbers with the necessary accuracy.

The article presents a minimal overview of the theory of coding floating-point numbers and the most vivid examples of the emerging effects. The solution, which provides the necessary accuracy of operations through the transition to a fixed-point view, is designed as an EAPA (Erlang Arbitrary Precision Arithmetic) library designed to meet the needs of financial applications developed at Erlang / Elixir.

Standards, standards, standards ...

To date, the main standard of binary floating point arithmetic is widely used in IEEE754 engineering and programming. It defines four presentation formats:

single-precision 32 bits
with double precision (double-precision) 64 bits
with single extended precision (single-extended precision)> = 43 bits (rarely used)
with double extended precision (double-precision precision)> = 79 bits (usually 80 bits are used)
and four rounding modes:
Rounding tending to the nearest whole.
Rounding to zero.
Rounding aiming at + ∞
Rounding toward -∞

Most modern microprocessors are made with the hardware implementation of the representation of real variables in the IEEE754 format. Presentation formats limit the size limit of a number, and rounding modes affect accuracy. Programmers often cannot change the behavior of hardware and implementations of programming languages. For example, the official Erlang implementation stores float in 3 words on a 64-bit machine and in 4 words on a 32-bit machine.

As mentioned above, the numbers in the IEEE754 format are a finite set, on which an infinite set of real numbers is mapped, so the original number can be represented in the IEEE754 format with an error.

The main mass of numbers when mapping to a finite set has a stable and small relative error. So, for float it is 11.920928955078125e-6%, and for double - 2.2204460492503130808472633361816e-14%. In the life of programmers, most solvable everyday problems allow us to neglect this error, although it should be noted that in simple tasks you can step on a rake, since the absolute error value can reach 10 ³¹ and 10 ²⁹² for float and double, respectively, causing difficulties in calculations.

Effects illustration

From general information to the case. Let's try to reproduce the resulting effects in Erlang.

All the examples below are in the form of ct-tests.

Rounding and loss of accuracy

Let's start with the classics - the addition of two numbers: 0.1 + 0.2 =?

t30000000000000004(_)-> ["0.30000000000000004"] = io_lib:format("~w", [0.1 + 0.2]).

The result of the addition is slightly different from the intuitively expected, and the test passes successfully. Let's try to achieve the right result. Rewrite the test using EAPA:

 t30000000000000004_eapa(_)-> %% prec = 1 symbols after coma X = eapa_int:with_val(1, <<"0.1">>), Y = eapa_int:with_val(1, <<"0.2">>), <<"0.3">> = eapa_int:to_float(1, eapa_int:add(X, Y)).

This test is also successful, showing that the problem has been fixed.
Let's continue the experiments, add a very small amount to 1.0:

 tiny(_)-> X = 1.0, Y = 0.0000000000000000000000001, 1.0 = X + Y.

As you can see, our increase went unnoticed. We are trying to correct the problem, illustrating one of the library's possibilities in the way - automatic scaling:

 tiny_eapa(_)-> X1 = eapa_int:with_val(1, <<"1.0">>), X2 = eapa_int:with_val(25, <<"0.0000000000000000000000001">>), <<"1.0000000000000000000000001">> = eapa_int:to_float(eapa_int:add(X1, X2)).

Bit Overflow

In addition to problems associated with small numbers, overflow is an obvious and significant problem.

 float_overflow(_) -> 1.0 = 9007199254740991.0 - 9007199254740990.0, 1.0 = 9007199254740992.0 - 9007199254740991.0, 0.0 = 9007199254740993.0 - 9007199254740992.0, 2.0 = 9007199254740994.0 - 9007199254740993.0.

As can be seen from the test, at some point the difference ceases to be equal to 1.0, which is obviously a problem. EAPA allows you to solve this problem:

 float_overflow_eapa(_)-> X11 = eapa_int:with_val(1, <<"9007199254740992.0">>), X21 = eapa_int:with_val(1, <<"9007199254740991.0">>), <<"1.0">> = eapa_int:to_float(1, eapa_int:sub(X11, X21)), X12 = eapa_int:with_val(1, <<"9007199254740993.0">>), X22 = eapa_int:with_val(1, <<"9007199254740992.0">>), <<"1.0">> = eapa_int:to_float(1, eapa_int:sub(X12, X22)), X13 = eapa_int:with_val(1, <<"9007199254740994.0">>), X23 = eapa_int:with_val(1, <<"9007199254740993.0">>), <<"1.0">> = eapa_int:to_float(1, eapa_int:sub(X13, X23)).

Dangerous reduction

The following test demonstrates the occurrence of a dangerous reduction. This process is accompanied by a catastrophic decrease in the accuracy of calculations in operations where the resulting value is much less than the input. In our case, the result of subtraction is 1.

Let us show that this problem is present in Erlang:

 reduction(_)-> X = float(87654321098765432), Y = float(87654321098765431), 16.0 = XY. %% has to be 1.0

It turned out 16.0 instead of the expected 1.0. Let's try to correct this situation:

 reduction_eapa(_)-> X = eapa_int:with_val(1, <<"87654321098765432">>), Y = eapa_int:with_val(1, <<"87654321098765431">>), <<"1.0">> = eapa_int:to_float(eapa_int:sub(X, Y)).

Other features of floating point arithmetic in Erlang

Let's start with ignoring negative zero.

 eq(_)-> true = list_to_float("0.0") =:= list_to_float("-0.0").

Just want to say that EAPA retains this behavior:

 eq_eapa(_)-> X = eapa_int:with_val(1, <<"0.0">>), Y = eapa_int:with_val(1, <<"-0.0">>), true = eapa_int:eq(X, Y).

as it is permissible. In Erlang, there is no intelligible syntax and processing of NaN and infinities, which gives rise to a number of features, for example, like this:

 1> math:sqrt(list_to_float("-0.0")). 0.0

The next point is the peculiarity of handling large and small numbers. Let's try to reproduce for small ones:

 2> list_to_float("0."++lists:duplicate(322, $0)++"1"). 1.0e-323 3> list_to_float("0."++lists:duplicate(323, $0)++"1"). 0.0

and for large numbers:

 4> list_to_float("1"++lists:duplicate(308, $0)++".0"). 1.0e308 5> list_to_float("1"++lists:duplicate(309, $0)++".0"). ** exception error: bad argument

Let's give a couple more examples for small numbers:

 6> list_to_float("0."++lists:duplicate(322, $0)++"123456789"). 1.0e-323 7> list_to_float("0."++lists:duplicate(300, $0)++"123456789"). 1.23456789e-301

 8> 0.123456789e-100 * 0.123456789e-100. 1.524157875019052e-202 9> 0.123456789e-200 * 0.123456789e-200. 0.0

The examples above confirm the truth for Erlang projects: money cannot be considered in IEEE754.

EAPA (Erlang Arbitrary-Precision Arithmetic)

EAPA - NIF extension written in Rust. At the moment, the EAPA repository presents the most simple and convenient interface for working with fixed-point numbers. Among the features of eapa_int are the following:

No effects IEEE754 encoding
Support large numbers
Adjustable accuracy up to 126 decimal places. (in current implementation)
Autospeiling
Support for all basic operations on numbers
More or less complete testing, including property based.

eapa_int interface:

with_val/2 - translation of a floating-point number into a fixed representation, which can also be safely used in json, xml.
to_float/2 - to_float/2 fixed-point number to a floating-point number with a given accuracy.
to_float/1 - to_float/1 fixed-point number to a floating-point number.
add/2 - the sum of two numbers
sub/2 - difference
mul/2 - multiplication
divp/2 - division
min/2 - minimum of numbers
max/2 - the maximum of the numbers
eq/2 - check for equality of numbers
lt/2 - check that the number is less
lte/2 - test is less than
gt/2 - check that the number is greater
gte/2 - verification is greater than

EAPA code can be found in the https://github.com/Vonmo/eapa repository

When should i use eapa_int? For example, if your application works with money or you need to conveniently and accurately perform computational operations on numbers of the form 92233720368547758079223372036854775807.92233720368547758079223372036854775807, you can safely use EAPA.

Like any solution, EAPA is a compromise. We obtain the required accuracy by sacrificing memory and computing speed. Performance tests and statistics collected on real systems show that most operations are performed in the range of 3-30 μs. This point also needs to be considered when choosing an interface with a fixed point EAPA.

Conclusion

Of course, it’s not always necessary to solve such problems on Erlang or Elixir, but when a problem arises and a suitable tool is not found, you have to invent a solution.
This article is an attempt to share with the community a tool and experience, in the hope that for someone this library will be useful and help save time.
How do you count money in Erlang?

PS Working with rational and complex numbers, as well as native access to Integer, Float, Complex, Rational types of arbitrary precision will be covered in the following publications. Do not switch!

Materials on the topic:

Source: https://habr.com/ru/post/428111/

All Articles