I'll get straight to the point. The task: at any point in the code, by calling a special method, create a second thread that starts executing from the point where that method was called in the parent thread, while keeping debugging possible and preserving the values of all local variables at every level of the call chain.
The implementation does not depend on the target platform (.NET / Java), because it is written in C++ / Asm; the user code, however, is in C#, since that is the language I write in.

Now that I have finally stabilized the example for 32-bit systems, I have the courage to show it to the public as fully ready. And yes, I repeat: once adapted, it will work on any platform.

To begin with, the full list of articles published in this series on Habr.
Goals
The aim of this work is to build thread-related functionality that the operating system does not provide. As the model we took the fork() call of the Linux operating system, adjusted for the realities of Windows.
So, if we have a method Original that calls Fork.CloneThread() somewhere in its body, a second thread of execution should appear. It starts at the point where Fork.CloneThread() was called and ends when Original returns, and all local variable values of the source thread are preserved in the second thread. In other words, the CloneThread() call splits the current thread in two.
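For comparison, here is a minimal C++ sketch of the Linux fork() behavior used as the model. It assumes a POSIX system, and it splits a whole process rather than a thread, so it only illustrates the "continue from the call point with the same locals" semantics, not the library described here. The function name fork_and_report_local is made up for the illustration.

```cpp
#include <cassert>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Illustrative helper (hypothetical name). Returns the exit code of the
// child, which is derived from a local variable set before fork().
int fork_and_report_local() {
    int local = 40;          // assigned before the split
    pid_t pid = fork();      // both processes continue from this point
    if (pid < 0) {
        return -1;           // fork failed
    }
    if (pid == 0) {
        // Child: it has its own copy of `local` with the pre-fork value.
        _exit(local + 2);
    }
    // Parent: waits for the child and reads its exit code.
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Both branches run after the same fork() call and both see the value that local had before it, which is the effect CloneThread() is meant to reproduce for a single thread.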
What is required from the reader
- No fear of reading assembly. It's simple =) Where something is unclear, use Google
- Understanding that each thread has its own stack, and what that stack is for
Materials for preparation:
Thread cloning
What do we have initially? We have our thread. We can also create a new thread, or schedule a task on the thread pool, and run our code there. We also know that information about nested calls is stored on the call stack and that, if we want to, we can manipulate it (for example, using C++/CLI). Moreover, if we follow the calling conventions and put onto the top of the stack the return address for the ret instruction and the value of the EBP register, and allocate space for locals (if needed), we can simulate a method call.
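The idea of simulating a call by hand can be sketched on a toy model. The code below is an assumption-laden illustration, not the library's code: the stack is a plain array of 32-bit words, "addresses" are array indices, and the names ToyStack, enter_frame and return_addresses are invented for the example. It pushes the return address and the saved EBP, reserves room for locals, and then walks the saved-EBP chain.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy model of a downward-growing 32-bit stack. Addresses are word indices
// into `mem`; this is an illustration, not real machine memory.
struct ToyStack {
    std::vector<std::uint32_t> mem;
    std::uint32_t esp;  // index of the current top of the stack
    std::uint32_t ebp;  // index of the current frame's saved-EBP slot

    explicit ToyStack(std::uint32_t words)
        : mem(words, 0), esp(words), ebp(0) {}  // ebp == 0 means "no caller"

    void push(std::uint32_t value) { mem[--esp] = value; }

    // Mimics `call` plus the usual prologue:
    // push ebp; mov ebp, esp; sub esp, locals.
    void enter_frame(std::uint32_t return_address, std::uint32_t local_words) {
        push(return_address);  // what `call` leaves for `ret`
        push(ebp);             // saved EBP of the caller
        ebp = esp;             // new frame pointer
        esp -= local_words;    // room for locals
    }
};

// Walks the saved-EBP chain and collects return addresses, innermost first.
std::vector<std::uint32_t> return_addresses(const ToyStack& s) {
    std::vector<std::uint32_t> result;
    for (std::uint32_t frame = s.ebp; frame != 0; frame = s.mem[frame]) {
        result.push_back(s.mem[frame + 1]);  // slot just above the saved EBP
    }
    return result;
}
```

After three enter_frame calls, return_addresses yields the three return addresses from the innermost frame outward, which is the same walk a debugger does when it builds a call stack.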
What needs to be done to clone the thread?
- Saving
- Inside the CloneThread method (C#), we take the address of any local variable
- We call the C++ method, passing it that address. At this stage, the call stack looks like this:

Or, in abbreviated form, like this:

- Inside it, we read the value of EBP, the pointer to our call frame, and walk up the chain by dereferencing the pointer until we reach the CloneThread method, comparing the current EBP with the address of the local variable in CloneThread. This is needed to get past all the proxy calls between C# and C++ that the JIT compiler generates.
- We go up one more frame to leave the CloneThread frame and land in the code that called our library function. Everything from the resulting address down to ESP is the chain of calls from user code. We save it to a buffer, create a thread (or take one from the pool), and pass it the address of this buffer, which holds the copy of the stack.
- Restoring. For the new thread to continue from the copy point in the parent, we have to imitate a CloneThread() call from a user method that was supposedly called in the new thread (and which nobody actually called). To do this, we add the saved piece of the parent thread's stack to the top of our call stack, fix up the EBP chain that links the stack frames, and run the code.
- Initially, when our code has just started running in the second thread, the call stacks look like this:

- We read the current value of ESP.
- We push onto the stack the address of the body of the current method: this is the return address for the user method being simulated
- We push EBP to keep the chain of stack frames intact. Together with the stack copy on the heap, the call stacks now look like this:

- Before copying, we fix up the saved EBP chain in the copy of the stack (this cannot be done in place)

- Using push instructions, we put the copy of the parent thread's stack onto the current stack (simulating a call to the user method that called CloneThread, which in turn called a number of proxies and, finally, the C++ method)

- We do a far JMP into the C++ CloneThread method, where we make sure return is executed
- This returns into CloneThread (C#), which then returns into the user method
- Voila: in both threads the code runs from the same point. The thread fork is complete.
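The save, fix-up and restore steps above can be sketched on the same kind of toy model, where a stack is an array of 32-bit words and addresses are indices. This is a simplified illustration under my own assumptions, not the code from the library: the names StackSlice, save_slice, relocate and restore are invented, the real work happens on raw memory in C++ / Asm, and patching the outermost return address is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Copy of a piece of the parent's stack (hypothetical helper type).
struct StackSlice {
    std::vector<std::uint32_t> words;  // words[0] is the word at old_esp
    std::uint32_t old_esp;             // address of words[0] in the parent
    std::uint32_t old_ebp;             // innermost frame pointer in the parent
};

// Saving: copy everything in [esp, end) out of the parent's stack.
StackSlice save_slice(const std::vector<std::uint32_t>& parent,
                      std::uint32_t esp, std::uint32_t ebp, std::uint32_t end) {
    return StackSlice{
        std::vector<std::uint32_t>(parent.begin() + esp, parent.begin() + end),
        esp, ebp};
}

// Fix-up: rewrite the saved-EBP chain inside the copy so it is valid once
// the copy sits at new_esp. The outermost link, which pointed outside the
// slice, is redirected to host_ebp (the new thread's own frame).
// Returns the frame pointer to use in the new thread.
std::uint32_t relocate(StackSlice& s, std::uint32_t new_esp,
                       std::uint32_t host_ebp) {
    const std::uint32_t old_end =
        s.old_esp + static_cast<std::uint32_t>(s.words.size());
    std::uint32_t frame = s.old_ebp;
    while (true) {
        std::uint32_t& saved = s.words[frame - s.old_esp];
        const std::uint32_t next = saved;
        if (next < s.old_esp || next >= old_end) {
            saved = host_ebp;  // end of the copied chain
            break;
        }
        saved = next - s.old_esp + new_esp;  // shift the link by the offset
        frame = next;
    }
    return s.old_ebp - s.old_esp + new_esp;
}

// Restoring: push the copy onto the child's stack, last word first, so that
// words[0] ends up on top. Returns the child's new ESP.
std::uint32_t restore(std::vector<std::uint32_t>& child,
                      std::uint32_t child_esp, const StackSlice& s) {
    for (auto it = s.words.rbegin(); it != s.words.rend(); ++it) {
        child[--child_esp] = *it;
    }
    return child_esp;
}
```

The fix-up runs on the copy rather than on the parent's stack because the parent keeps executing on its own frames, so its saved EBP values must stay untouched. In this sketch the caller computes new_esp as the child's current ESP minus the slice size before calling relocate.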
Why do it
The main reason for doing this is to consolidate an understanding of how everything works, and to show that once you know it, you can start manipulating it.
Resources
- DotNetEx project on GitHub: the project in it is called AdvancedThreadingLibrary; to try it out, run RocketScience/01-forkingThread. By the way, the same library also has examples with sizeof(ReferenceType), IoC with the shipped assembly, and a pool of objects in its own heap.