Overview of rounding implementations in Go

Hi, Habr! My name is Oleg, I'm a PHP-and-not-only-developer on Badoo. It often surprises me how differently in programming languages they approach to compiling a standard library. Go is no exception: the absence of the math.Round () function surprised me. However, having rummaged in these your Internet, I found out the reason. I would like to share this knowledge in my free translation.

Rounding in Go is not easy to do correctly. It would seem that we take float64, discard the fractional part and add one to the resulting value if the fractional part was> = 0.5. There are other options available when searching for “golang round” on Google, many of which can also be found on a private ticket on GitHub , where Round () refused to be included in the standard math package. In this article I would like to study these and other implementations and check them for correctness. We will see that almost all have errors that prevent their use.

Choosing a rounding method

There are several ways of rounding, depending on how the result is applied: rounding to a smaller / larger, rounding to a smaller / larger module, rounding to the nearest whole, rounding to the nearest even, etc. ... Rounding to the nearest whole, in turn, can be done differently depending on what the result should be if the fractional part is 0.5. I will consider rounding to the nearest integer, and 0.5 will be rounded to a larger (modulo) direction.

Requirements for the correct implementation of Round () are as follows:

correctly rounds all finite numbers to the nearest integer;
supports special values (NaN, Inf, -0), returning them unchanged.

I will use the following test cases to check the correctness, each pair contains the initial value and the expected result of the Round () function:

tests := [][2]float64{ {-0.49999999999999994, negZero}, // -0.5+epsilon {-0.5, -1}, {-0.5000000000000001, -1}, // -0.5-epsilon {0, 0}, {0.49999999999999994, 0}, // 0.5-epsilon {0.5, 1}, {0.5000000000000001, 1}, // 0.5+epsilon {1.390671161567e-309, 0}, // denormal {2.2517998136852485e+15, 2.251799813685249e+15}, // 1 bit fraction {4.503599627370497e+15, 4.503599627370497e+15}, // large integer {math.Inf(-1), math.Inf(-1)}, {math.Inf(1), math.Inf(1)}, {math.NaN(), math.NaN()}, {negZero, negZero}, }

This list has the usual numbers, special values and some boundary cases that simple algorithms are difficult to handle. Please note that since we use float, we cannot use the number 0.49999999999999999 as the closest to 0.5, since due to the limited accuracy of the float, this number is exactly 0.5. Instead, I use .49999999999999994.

The implementations proposed in the closed ticket were obviously not checked for such data, often even those that were proposed by famous people did not work. This once again proves how difficult it is to write Round ().

int (f + 0.5)

The first implementation proposed by rsc was as follows:

 return int(f + 0.5)

It does not work correctly with special values, negative numbers, numbers greater than math.MaxInt64 and numbers close to 0.5:

 round(-0.5): got: 0, want -1 round(-0.5000000000000001): got: 0, want -1 round(0.49999999999999994): got: 1, want 0 round(4.503599627370497e+15): got: 4.503599627370498e+15, want 4.503599627370497e+15 round(-Inf): got: -9.223372036854776e+18, want -Inf round(+Inf): got: -9.223372036854776e+18, want +Inf round(NaN): got: -9.223372036854776e+18, want NaN

Floor () or Ceil ()

The second proposed option took into account negative numbers:

 if f < 0 { return math.Ceil(f - 0.5) } return math.Floor(f + 0.5)

however, he continued to work incorrectly in some cases:

 round(-0.49999999999999994): got: -1, want -0 round(0.49999999999999994): got: 1, want 0 round(4.503599627370497e+15): got: 4.503599627370498e+15, want 4.503599627370497e+15

The first two tests do not pass, because the result of the difference n - 0.5 is exactly -1.0, while we expect to get something exactly greater than -1.0. If you look at the implementation of round in Postgres , you can figure out how to solve this problem.

The most interesting thing is that this error is not so rare. Prior to version 6, exactly the same was present in Java . It is good that implementation has since improved.

int and copysign

In the third sentence, minux made another attempt to solve the problem of negative numbers:

 return int(f + math.Copysign(0.5, f))

And this option still breaks the tests:

 round(-0.49999999999999994): got: -1, want -0 round(0.49999999999999994): got: 1, want 0 round(4.503599627370497e+15): got: 4.503599627370498e+15, want 4.503599627370497e+15 round(-Inf): got: -9.223372036854776e+18, want -Inf round(+Inf): got: -9.223372036854776e+18, want +Inf round(NaN): got: -9.223372036854776e+18, want NaN

As you can see, some of the tests began to pass, but others began to fall. An attempt was made to improve this algorithm:

 if math.Abs(f) < 0.5 { return 0 } return int(f + math.Copysign(0.5, f))

However, she failed:

 round(4.503599627370497e+15): got: 4.503599627370498e+15, want 4.503599627370497e+15 round(-Inf): got: -9.223372036854776e+18, want -Inf round(+Inf): got: -9.223372036854776e+18, want +Inf round(NaN): got: -9.223372036854776e+18, want NaN

This option looks better than the others, but it also incorrectly handles special values and large numbers. The first problem can be solved with the help of additional conditions, but the second is not so easy to handle.

We have already considered four options, and in each of them there were flaws. It's time to see how the authors of various packages implement Round ().

Kubernetes

Kubernetes 1.7 contains the following implementation:

 if a < 0 { return int32(a - 0.5) } return int32(a + 0.5)

It breaks down the following tests:

 round(-0.49999999999999994): got: -1, want -0 round(0.49999999999999994): got: 1, want 0 round(4.503599627370497e+15): got: 4.503599627370498e+15, want 4.503599627370497e+15 round(-Inf): got: -9.223372036854776e+18, want -Inf round(+Inf): got: -9.223372036854776e+18, want +Inf round(NaN): got: -9.223372036854776e+18, want NaN

Judging from the fact that the function returns int32, it is not designed to work with large numbers. However, it does not work correctly with numbers that are close to 0.5.

Running implementations on Go

Round () used in Postgres

Earlier, I mentioned that Postgres contains the code for the Round () function on C, which works for all tested values. In CockroachDB we rewrote this code on Go , without comments it looks like this:

 func round(x float64) float64 { if math.IsNaN(x) { return x } if x == 0.0 { return x } roundFn := math.Ceil if math.Signbit(x) { roundFn = math.Floor } xOrig := x x -= math.Copysign(0.5, x) if x == 0 || math.Signbit(x) != math.Signbit(xOrig) { return math.Copysign(0.0, xOrig) } if x == xOrig-math.Copysign(1.0, x) { return xOrig } r := roundFn(x) if r != x { return r } return roundFn(x*0.5) * 2.0 }

Let's see how it works. The first six lines handle special cases. Next, we select roundFn from Ceil and Floor, depending on whether the number is positive or negative. Then the fun begins:

 x -= math.Copysign(0.5, x)

With this code, we move x closer to zero.

 if x == 0 || math.Signbit(x) != math.Signbit(xOrig) { return math.Copysign(0.0, xOrig) }

Next, we check whether x has become exactly zero and whether its sign has changed. This means that the original number <= 0.5, in this case we return a zero with the desired sign.

 if x == xOrig-math.Copysign(1.0, x) { return xOrig }

This check is needed for very large numbers, for which x-0.5 == x-1.0, in these cases we can return the number unchanged.

 r := roundFn(x) if r != x { return r }

Next, we round the number using Floor () or Ceil () and return this value if it differs from x, which can happen only if the fractional part of the input value is not exactly 0.5, since we deducted above 0.5 out of him.

 return roundFn(x*0.5) * 2.0

Now we know that the fractional part is 0.5, so we need to round to the nearest even number (the implementation of Round () in Postgres in this place differs from the above options). The comment in the code describes it better:

Dividing input + 0.5 by 2, taking the floor and multiplying by 2 yields the closest even number. This is not the case, that should be OK because underflow is impossible here: x is an integer.

To preserve the original behavior, this code can be replaced with the following:

 return xOrig + math.Copysign(0.5, xOrig)

github.com/montanaflynn/stats

Another working implementation is contained in the github.com/montanaflynn/stats package. Without comments, it looks like this:

 func round(input float64) { if math.IsNaN(input) { return math.NaN() } sign := 1.0 if input < 0 { sign = -1 input *= -1 } _, decimal := math.Modf(input) var rounded float64 if decimal >= 0.5 { rounded = math.Ceil(input) } else { rounded = math.Floor(input) } return rounded * sign }

The key difference from the previous solutions is the use of the Modf () function, which correctly separates the integer and fractional parts of numbers.

Round () in Go 1.10

A few months after the release of Go 1.8, another ticket appeared with a request to add math.Round to Go. In the comments to it, incorrectly working implementations continued to appear, their number increased to eight. Fortunately, the Go team agreed to add math.Round to Go 1.10! And even a working implementation appeared:

 func Round(x float64) float64 { const ( mask = 0x7FF shift = 64 - 11 - 1 bias = 1023 signMask = 1 << 63 fracMask = (1 << shift) - 1 halfMask = 1 << (shift - 1) one = bias << shift ) bits := math.Float64bits(x) e := uint(bits>>shift) & mask switch { case e < bias: // Round abs(x)<1 including denormals. bits &= signMask // +-0 if e == bias-1 { bits |= one // +-1 } case e < bias+shift: // Round any abs(x)>=1 containing a fractional component [0,1). e -= bias bits += halfMask >> e bits &^= fracMask >> e } return math.Float64frombits(bits) }

For those who are not familiar with the float device (I am among them), this code looks completely incomprehensible. Let's try to figure out what he does:

 bits := math.Float64bits(x) e := uint(bits>>shift) & mask

It seems that we take a bit representation of a number, shift it and apply a mask. According to IEEE 754 specification :

This is what IEEE 754-1985 has been used for.

Considering the above constants, we see that the shift amounts to 64–11–1, which means 64 bits per number, 11 of which are used for the exponent, one for the sign and the 52 remaining bits for the mantissa. This means that the shift used removes the bits of the mantissa, and the mask removes the bits of the sign, leaving us only with the exponent.

 switch { case e < bias:

In the resulting number, the exponent is not recorded as it is, but with the addition of 1023 (this is done in order to record negative indicators for very small numbers), which means that we have to subtract 1023 from e calculated above to get the actual indicator. In other words, if e <bias, then we have a negative exponent, which means that the absolute value of the float must be <1. Indeed, we see further:

 // Round abs(x)<1 including denormals. bits &= signMask // +-0 if e == bias-1 { bits |= one // +-1 }

Here the bit is masked by the sign bit, it is used only to save the correct sign: now we can completely ignore the mantissa. We can do this, because in this case we are only interested in the exponent. Since the base of degree 2 is used, and e <bias, we know that the smallest index, which can be, is -1, and 2 ^ -1 = 0.5. In addition, the mantissa has a value of 1.X. Thus, depending on the indicator, our number is either in the range (0.5, 1) or in the range (0, 0.5). Therefore, in the second case, for correct rounding, we need to add one to the number. Fuh. This is described in more detail in Wikipedia .

Now let's look at the second case:

 case e < bias+shift:

Probably, you think that the condition in this thread should be e> bias to cover all cases with a positive exponent. But instead, only their part is used here. The use of the shift is especially interesting here, because it seems to be incomparable with bias. The first is the number of offset bits, and the second is the numerical offset. But, since floating-point numbers are represented as (1.mantissa) * 2 ^ X, then if X is more than the number of bits in the mantissa, we are guaranteed to get the value without the fractional part. That is, the exponent shifted the decimal point to the right so much that the mantissa finally disappeared. Thus, the expression in this thread ignores floating-point numbers that are already rounded.

 // Round any abs(x)>=1 containing a fractional component [0,1). e -= bias bits += halfMask >> e bits &^= fracMask >> e

The first line here is simple: subtract bias from e and get the real value of the exponent. The second line adds 0.5 to the value. This works because the high bit of the mantissa adds 0.5 to the final amount (see the presentation in the Wikipedia article below). In this case, this amount overflows the 52-bit boundaries of the mantissa, the exponent will be increased by 1. The exponent value cannot overflow to the sign bit, since it cannot be greater than the bias + shift from the example above. In any case, the fractional part is cleared. Thus, if the fractional part was greater than or equal to 0.5, it will be increased by 1, otherwise it will be discarded. Sly and not obvious until we take a deeper look.

Conclusion

In this article, I talked mainly about rounding to a smaller module, but there are many other options . In some cases, it is they who are suitable, and I will leave the reader the opportunity to study them and try to implement them on Go. But I hope that now it has become clear to you how rounding is arranged in Go and how you need to test the implementation of rounding.

I think the Go team made the right decision by adding the Round () function to the standard library. Without this, we would continue to use various incorrect implementations.

I hope that now it has become clear to you that when working with float there are many pitfalls, which even the experts sometimes forget. It is easy to invent or copy a one-line implementation from somewhere, but it is difficult to write a truly correct one. Not surprisingly, correctly working rounding appeared only in the sixth major version of Java (15 years after the release of Java 1.0 before the release of Java 7), and I am glad that Go went this way faster.

Source: https://habr.com/ru/post/345784/

All Articles