C#: how not to "shoot yourself in the foot"

Today we will take a closer look at how it is possible to "shoot yourself in the foot" in C#, and in .NET in general, when working with Boolean values: in which practical cases this can happen, and how to prevent it.

What lines does this console application print?
After building the application in Visual Studio Community 2013 and running it, we get the following result:

Unsafe Mode
01: b1 : True
02: b2 : True
03: b1 == b2: False
04: !b1 == !b2: True
05: b1 && b2 : True
06: b1 & b2 : True
07: b1 ^ b2 : True
08: b1 && b3 : True
09: b1 & b3 : False
10: b1 ^ b3 : True

Safe Mode
11: b1 : True
12: b2 : True
13: b1 == b2: True
14: !b1 == !b2: True
15: b1 && b2 : True
16: b1 & b2 : True
17: b1 ^ b2 : False
18: b1 && b3 : True
19: b1 & b3 : True
20: b1 ^ b3 : False

Assume that each of the Boolean variables b1, b2, b3 holds either the value "true" or some value other than "false" (does that also mean "true"? are these really Boolean variables?). Then the questions are:

Why do the Unsafe Mode and Safe Mode blocks give different results in positions 03 and 13, 07 and 17, 09 and 19, 10 and 20, respectively?
(And why, then, are the values in the other corresponding positions of the Unsafe and Safe blocks the same?)
Why, inside the Unsafe block, are the results in positions 05 and 06 the same, while the results in 08 and 09 differ?
And why exactly do the results in 08 and 09 differ?

Let's try to figure it out:

Probably everyone knows that early programming languages had no dedicated logical (Boolean) data types.
Integer data types were used in their place.
Zero was interpreted as false (False), and any value other than zero as true (True).
Thus, the if statement could be applied directly to an integer operand.
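
A minimal sketch in C of this convention. The helper names is_truthy and show_truthiness are hypothetical and introduced only for illustration; the point is that the if statement accepts any integer, and only zero selects the "false" path.

```c
#include <stdio.h>

/* Hypothetical helper for illustration: reports how an `if`
   statement treats an arbitrary integer operand. */
int is_truthy(int value)
{
    if (value)      /* any nonzero value selects this branch */
        return 1;
    return 0;       /* only zero is treated as false */
}

/* Prints how a few sample integers are interpreted by `if`. */
void show_truthiness(void)
{
    int samples[] = { 0, 1, 2, -1 };
    for (int k = 0; k < 4; k++)
        printf("%d -> %s\n", samples[k],
               is_truthy(samples[k]) ? "true" : "false");
}
```

Calling show_truthiness() from main reports 0 as false and 1, 2 and -1 as true.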

There is a well-known mistake that is easy to make in C/C++: confusing the assignment operator (=) with the equality operator (==).
The following code will always print the string "i == 1":

    int i = 0;
    if (i = 1)
        printf("i == 1");
    else
        printf("i == 0");

This happens because the condition of the if statement, "i = 1", mistakenly uses the assignment operator (=) instead of the equality operator (==).
As a result, the value 1 is written into the variable i, and the "=" operator itself yields the value 1. That integer value becomes the operand of the if statement, where it is interpreted as a logical (Boolean) value, so the first branch (printf("i == 1")) is always executed.

For this reason, comparisons in C/C++ are often written like this:

    int i = 0;
    if (1 == i)
        printf("i == 1");
    else
        printf("i == 0");

instead of the more "intuitive":

    int i = 0;
    if (i == 1)
        printf("i == 1");
    else
        printf("i == 0");

The reason is that the expression "1 == i" cannot be mistyped as "1 = i" without being caught: the compiler will not allow a new value (i) to be assigned to the constant (1).

Apparently, at some point the designers of programming languages decided to add "full-fledged" logical types.
Thus, the type Boolean appeared in Turbo/Borland Pascal and Delphi. Variables of this type could take the values False and True. Moreover, it was documented that the size of the type is 1 byte, and that the ordinal (integer) values returned by the Ord function are 0 and 1 for False and True, respectively.

What about other possible nonzero internal values? The behavior in that case could be undefined, and the documentation and books specified that Boolean values should be tested like this:

    var
      b: Boolean;
    begin
      b := True;
      if b then
        WriteLn('b = True')
      else
        WriteLn('b = False');
    end

but not like this:

    var
      b: Boolean;
    begin
      b := True;
      if b = True then
        WriteLn('b = True')
      else
        WriteLn('b = False');
    end

The variable "b" could hold a nonzero value other than one, and then the result of the comparison "b = True" would be undefined: it could be false if the comparison were performed as a comparison of two integers, bypassing the "normalization" of the values (presumably for performance reasons).
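
The difference between the two tests can be sketched in C. This is an illustration under stated assumptions, not a reproduction of how any Pascal compiler works: the names test_directly, test_by_comparison and TRUE_CODE are hypothetical, and "true" is assumed to be stored as the byte 1, matching the documented Ord(True) = 1.

```c
/* Assumption for illustration: "true" is stored as the byte 1,
   as documented for Pascal's Boolean (Ord(True) = 1). */
#define TRUE_CODE 1

/* Truth test, like `if b then`: any nonzero byte counts as true. */
int test_directly(unsigned char raw)
{
    return raw ? 1 : 0;
}

/* Plain integer comparison, like `if b = True then` performed
   without normalization: only the exact code 1 matches. */
int test_by_comparison(unsigned char raw)
{
    return raw == TRUE_CODE;
}
```

For the raw byte 2, test_directly reports true while test_by_comparison reports false; for the bytes 0 and 1 the two tests agree.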

On the other hand, this indirectly acknowledges that a Boolean variable may hold an internal code other than zero and one, and that a nonzero value is considered "true", although it may not always be handled correctly:

the Boolean variable is implemented as an integer, and an integer can be cast to Boolean (not to mention the possibilities of pointer arithmetic);
this is also confirmed by the following statement: "Casting the variable to a Boolean type is unreliable". That is, we can cast an integer to Boolean, but the result is "unreliable", which in practice means that the result of testing such a value is undefined.

Later, Delphi added the Boolean types ByteBool, WordBool and LongBool, with sizes of 1, 2 and 4 bytes, for compatibility with the Boolean types used by code written in C/C++, COM objects, and other third-party code.
For these types it is specified that, unlike for the Boolean type, any nonzero value is considered "true".
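
Under the "any nonzero value is true" rule, two "true" values may have different bit patterns, so bitwise operations and direct comparisons on the raw bytes can disagree with the logical result. The sketch below shows this in C with hypothetical helper names; the byte values 1 and 2 mentioned afterwards are chosen only for illustration and are not taken from the program discussed above.

```c
/* Collapses any nonzero byte to 1 and leaves 0 unchanged. */
unsigned char normalize(unsigned char raw)
{
    return raw != 0;
}

/* Operations applied directly to the raw bytes. */
int raw_equal(unsigned char a, unsigned char b) { return a == b; }
int raw_and(unsigned char a, unsigned char b)   { return (a & b) != 0; }
int raw_xor(unsigned char a, unsigned char b)   { return (a ^ b) != 0; }

/* The same operations applied after normalizing both operands. */
int normalized_equal(unsigned char a, unsigned char b)
{
    return normalize(a) == normalize(b);
}

int normalized_and(unsigned char a, unsigned char b)
{
    return (normalize(a) & normalize(b)) != 0;
}

int normalized_xor(unsigned char a, unsigned char b)
{
    return (normalize(a) ^ normalize(b)) != 0;
}
```

With the bytes 1 and 2, both of which count as "true", the raw versions report "not equal", "AND is false" and "XOR is true", while the normalized versions report "equal", "AND is true" and "XOR is false".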

In C++, a native type bool was added in the same way (variables of this type can be false or true). Its size is not fixed: it probably depends on the platform's word size, for performance or some other reasons (the data type sizes for specific versions of Microsoft's compilers are listed in the compiler documentation).
There is also no explicit definition of the internal codes of false and true, although the code examples accompanying the definitions of false and true indirectly imply that false has the internal numeric code 0 and true the internal numeric code 1.
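
A small C sketch of the same observations, using C99's bool from <stdbool.h> as a stand-in for the C++ type (an assumption made to keep the examples in one language; the function names are hypothetical). It shows only what is visible through conversion to int and through sizeof, not what happens when some other byte is forced into the variable's storage.

```c
#include <stdbool.h>
#include <stddef.h>

/* Integer value observed when `true` is converted to int. */
int code_of_true(void)
{
    return (int)true;
}

/* Integer value observed when `false` is converted to int. */
int code_of_false(void)
{
    return (int)false;
}

/* Converting an integer to bool normalizes it to 0 or 1. */
int to_bool_code(int value)
{
    bool b = value;
    return (int)b;
}

/* Size of bool in bytes; the value is implementation-defined. */
size_t bool_size(void)
{
    return sizeof(bool);
}
```

Here code_of_true() yields 1, code_of_false() yields 0, and to_bool_code(5) yields 1, because the conversion itself performs the normalization; bool_size() has to be checked on each platform.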

We took this historical excursion into the origins of Boolean types to see the pitfalls of working with such a seemingly simple data type. With that understanding, we can approach the internal structure of the Boolean type in C#, discuss why the program produced the results it did, and see how to work correctly with Boolean values in C# when interacting with unmanaged code.
The digression turned out to be rather long, so we will consider these questions next time.

Source: https://habr.com/ru/post/252419/
