
Everyone knows that "++i + ++i" is bad, but what happens behind the scenes?

Every programmer surely knows that expressions like the one in the title are not merely undesirable but strictly forbidden. Constructs for which the compiler's behavior is not defined can cause many subtle bugs and unpleasant consequences. Still, I am sure many novice programmers would like to understand the problem more deeply and, peeking behind the compiler's curtain, find out exactly what happens in such cases. This post examines one example of such code. Read on :)

Object of study


As an example, I will examine how the expression "++i + ++i" behaves in two different languages: C and C#. The first, as you know, compiles to native processor code, while the second, roughly speaking, runs on a virtual stack machine. So, here are the examples themselves:

Source code in C:
  #include <stdio.h>

  int main(void)
  {
      int i = 5;
      i = ++i + ++i;  /* the expression under study */
      printf("%d\n", i);
      return 0;
  }

And the source code in C#:
  using System;

  public class Test
  {
      public static void Main()
      {
          int i = 5;
          i = ++i + ++i;  // the expression under study
          Console.WriteLine(i);
      }
  }

Behind the scenes: C


Using a disassembler, let's look at what the C compiler generated:

 #A#5: int i = 5;
   cs:0295 BE0500   mov  si,0005
 #A#6: i = ++i + ++i;
   cs:0298 46       inc  si
   cs:0299 46       inc  si
   cs:029A 8BC6     mov  ax,si
   cs:029C 03C6     add  ax,si
   cs:029E 8BF0     mov  si,ax
 #A#7: printf("%d\n", i);
   cs:02A0 56       push si
   cs:02A1 B8AA00   mov  ax,00AA
   cs:02A4 50       push ax
   cs:02A5 E8330C   call _printf

As the listing shows, the compiler mapped the variable i to the x86 SI register. It then incremented that register twice and added it to itself via the AX accumulator. As a result, i ends up equal to 14.

Behind the scenes: C#


With the help of Ildasm, let's see what lies behind the C# code:

 .method public hidebysig static void Main() cil managed
 {
   .entrypoint
   // Code size 21 (0x15)
   .maxstack 3
   .locals init (int32 V_0)
   IL_0000: ldc.i4.5   // push 5                     stack: 5
   IL_0001: stloc.0    // i := pop()                 stack: (empty)
   IL_0002: ldloc.0    // push i                     stack: i
   IL_0003: ldc.i4.1   // push 1                     stack: i, 1
   IL_0004: add        // push (pop() + pop())       stack: (i + 1), i.e. 6
   IL_0005: dup        // copy top of stack          stack: 6, 6
   IL_0006: stloc.0    // i := pop(), i.e. i := 6    stack: 6
   IL_0007: ldloc.0    // push i                     stack: 6, i
   IL_0008: ldc.i4.1   // push 1                     stack: 6, i, 1
   IL_0009: add        // push (pop() + pop())       stack: 6, (i + 1), i.e. 6, 7
   IL_000a: dup        // copy top of stack          stack: 6, 7, 7
   IL_000b: stloc.0    // i := pop(), i.e. i := 7    stack: 6, 7
   IL_000c: add        // push (pop() + pop())       stack: 13
   IL_000d: stloc.0    // i := pop()                 stack: (empty)
   IL_000e: ldloc.0    // push i                     stack: i
   IL_000f: call void [mscorlib]System.Console::WriteLine(int32)
   IL_0014: ret
 } // end of method Test::Main

For clarity, I added comments to the assembly code of the virtual stack machine. The listing shows that to increment the variable i, the variable itself and the constant 1 are pushed onto the stack. The add instruction then pops the two values, adds them, and pushes the result back. Next, the top of the stack is duplicated and one copy is stored back into the variable, so 5 + 1, i.e. 6, remains on the stack. The same cycle repeats for the second increment: the variable is pushed, followed by the constant 1, they are added, the top of the stack is duplicated, and the result of the second increment is stored back into the variable. Now i equals 7, and the stack holds 6 from the first increment and 7 from the second. Finally, the add instruction runs and the result, 13, is stored into the variable.

Results


So there you have it: code that looks identical executes in completely different ways in different environments. Avoid such code in your programs.

If you found the topic interesting, let me know, and I will write about a couple more curious moments from the world of compilation ;)

UPD. Commenters report that PHP, Java, ActionScript 3, JavaScript and Tcl also give 13, while Perl, C++ and C compiled with GCC give 14 (though the GCC result may depend on the optimizer settings) :)

PL/SQL and Python stand out with 10 (as Captain Obvious points out, because these languages have no increment operator), while Bash gives 12.

There are also compilers that do not allow such code to be written at all, Ruby for example.

And what result does your compiler give, %username%?

UPD2. Related links: Wikipedia on sequence points, Alena C++.

Source: https://habr.com/ru/post/88185/

