
Dirty reverse engineering solutions

image

Developers often face a choice: do everything properly and spend a lot of time on the problem, or just make it work without really digging into why it works. From the customer's point of view the most attractive option is, of course, a middle ground, where the programmer understands the task well and still spends as few man-hours on it as possible. For developers it is not so simple either. On the one hand, wanting to understand what is happening in your own code is perfectly natural (especially if supporting the product will also fall on your shoulders). On the other hand, if the result of the application's work is directly observable (graphics, sound or video fragments, etc.), the development is a one-off, and the testing department says everything is fine, why not spend the rest of the working day scrolling through Habr and taking some time for yourself?

Let's get to the point. In one of my previous articles I already mentioned a program called “Talker”. Despite the name, it does not voice anything by itself; it merely serves as a link between the user and speech engines, providing a friendlier interface and configuration options. One of the engines most popular in narrow circles is the "Digalo 2000 text-to-speech engine" (hereinafter Digalo), a link to which can be found on the “Talker” site. As you have probably guessed from the topics of my previous articles, it is not flawless either, and it has bugs of its own. This time the problem showed up with the text “aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa”. After a little experimenting, I found that once a certain number of consecutive characters without a separator is reached, Digalo crashes and offers to debug its process. Well, why not?

Read on below the cut to see how the process went and what came of it (before reading this article, I strongly recommend that you familiarize yourself with the previous ones, which can be found, for example, here).

I think it goes without saying that Digalo does not come with source code (moreover, even the binaries of the version I am interested in cannot be downloaded from the official site), so our best friends will once again be a disassembler, a debugger and a hex editor, all of which can in fact be covered by OllyDbg alone. But before studying the disassembly listing, yes, you first need to download the application and check whether it is protected by a protector or packed with a packer.

Download and install Digalo, go to the installation directory, and examine the executable in DiE and PEiD:

image

image

It is easy to see that both analyzers concluded that DIGALO_RUS.exe is packed with PECompact, and we have no particular reason to doubt them.

Although PECompact and ASPack (which was discussed in one of the previous articles) are completely different packers, the unpacking principle is the same for both. Load DIGALO_RUS.exe into OllyDbg, step to the PUSHFD instruction, which is executed right after the first JMP, open the Command Line with Alt-F1, and set a hardware breakpoint on ESP-4 with the command hr esp-4. Press F9 until execution stops just after a POPFD instruction, step to the nearest RETN, press F8, and we land at 0x0045B97B, which in this case is the OEP:

image

Dump the process with the OllyDump plugin, leaving the “Rebuild Import” checkbox ticked, then check that the unpacked application still works and... we see that it does (on the strings it handled correctly before, of course).

Now we face an important question: how do we debug this speech engine? The problem is that it crashes almost immediately after starting, which rules out attaching to an already running process. There is a small trick here: we can replace the first byte at the OEP with an INT3 instruction. Because no debugger is attached to the process, this makes the OS show its standard dialog offering to debug the process in the current JIT debugger. Make OllyDbg the JIT debugger (Options -> Just-in-time debugging -> Make OllyDbg just-in-time debugger) and replace the first byte at the OEP, 0x55 (PUSH EBP), with 0xCC (INT3):

image
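The same one-byte patch can also be applied outside the debugger. Below is a minimal sketch of the idea, not the way it was done in this article: the helper name patch_byte is hypothetical, the buffer is a small stand-in rather than the real executable, and only its first byte (0x55) is taken from the text above. In a real file you would first have to translate the OEP's virtual address into a file offset using the section table.

```python
INT3 = 0xCC      # one-byte breakpoint opcode
PUSH_EBP = 0x55  # the byte the article finds at the OEP


def patch_byte(image: bytearray, offset: int, expected: int, replacement: int) -> int:
    """Swap one byte in place and return the old value.

    Refuses to patch when the byte at `offset` is not the expected one,
    which guards against writing to a wrongly computed offset.
    """
    current = image[offset]
    if current != expected:
        raise ValueError(
            f"expected {expected:#04x} at offset {offset:#x}, found {current:#04x}"
        )
    image[offset] = replacement
    return current


# Stand-in for the code at the OEP. Only the first byte comes from the article;
# the next two bytes (MOV EBP,ESP) are an assumed, typical function prologue.
code = bytearray([0x55, 0x8B, 0xEC])

saved = patch_byte(code, 0, PUSH_EBP, INT3)  # plant the breakpoint
print(code.hex())                            # cc8bec

restored = bytearray(code)
patch_byte(restored, 0, INT3, saved)         # undo it, as done by hand in the debugger
print(restored.hex())                        # 558bec
```

Returning the old byte makes the later "put PUSH EBP back" step explicit instead of relying on memory.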

Save the changes (right-click in the CPU window -> Copy to executable -> All modifications -> Copy all -> right-click in the window that opens -> Save file), replace the original executable, and run the console version of “Talker” with the argument “aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa”:

image

Click “Debug the program”, change INT3 back to PUSH EBP, press F9, and see that we are dealing with an access violation:

image

Start the application again, set a breakpoint at the address where the access violation occurs (0x00428B9D in my case), and find out how many times this place is hit before the crash. It turns out the breakpoint fires twice before the crash: after the first hit everything is fine, but on the second hit the ECX register holds the very address whose access raises the exception. Let's start a run trace from this point after the first breakpoint hit and compare what the “Run trace” window shows when the application works correctly (for example, when “Talker” is started with the argument “Hello”) and when it crashes:

In case of a crash
Address   Thread    Command                                ; Registers and comments
Flushing gathered information
00428B9D  00002410  MOV EDX,DWORD PTR DS:[ECX+64]          ; EDX=0051A820
00428BA0  00002410  LEA EAX,DWORD PTR DS:[EAX+EAX*2]       ; EAX=00000003
00428BA3  00002410  MOV CL,BYTE PTR DS:[ESI]               ; ECX=03007BA0
[...]
004303FB  00002410  CMP EDX,ECX
004303FD  00002410  JL digalo_r.00430282
00430282  00002410  MOV EDI,1                              ; EDI=00000001
00430287  00002410  MOV EAX,DWORD PTR SS:[ESP+1C]          ; EAX=00000028
0043028B  00002410  MOV EDX,DWORD PTR DS:[4A8BDC]          ; EDX=004A8DC8
00430291  00002410  MOVSX ECX,BYTE PTR SS:[ESP+EAX+113]    ; ECX=00000074

In case of correct operation
Address   Thread    Command                                ; Registers and comments
Flushing gathered information
00428B9D  000024D8  MOV EDX,DWORD PTR DS:[ECX+64]          ; EDX=0059A880
00428BA0  000024D8  LEA EAX,DWORD PTR DS:[EAX+EAX*2]       ; EAX=00000003
00428BA3  000024D8  MOV CL,BYTE PTR DS:[ESI]               ; ECX=02F37BE5
[...]
004303FB  000024D8  CMP EDX,ECX
004303FD  000024D8  JL digalo_r.00430282
00430403  000024D8  MOV EDI,DWORD PTR SS:[ESP+28]          ; EDI=00000001
00430407  000024D8  MOV ESI,DWORD PTR SS:[ESP+244]         ; ESI=02F37B20
0043040E  000024D8  LEA EDX,DWORD PTR SS:[ESP+114]         ; EDX=024CF348

Reading the output from the end, we see that the two runs differ at least in that, in the crashing run, the conditional jump to 0x00430282 is taken, which does not happen in the correct run.
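With long traces, comparing by eye gets tedious, so the comparison can be scripted. The sketch below assumes both traces were saved as plain text with the instruction address in the first column, as in the listings above. The helper names parse_addresses and first_divergence are hypothetical, and the sample data is just the tail of the two traces from this article. It walks the traces forward in lockstep, which only works if both start from the same point.

```python
def parse_addresses(trace_text: str) -> list[int]:
    """Extract the instruction address (first column) from each trace line."""
    addresses = []
    for line in trace_text.splitlines():
        fields = line.split()
        # Instruction lines start with an 8-digit hex address; header and
        # "[...]" lines do not and are skipped.
        if not fields or len(fields[0]) != 8:
            continue
        try:
            addresses.append(int(fields[0], 16))
        except ValueError:
            continue  # 8 characters, but not hex (e.g. the word "Flushing")
    return addresses


def first_divergence(trace_a: list[int], trace_b: list[int]):
    """Return (step, addr_a, addr_b) for the first differing step, or None."""
    for step, (a, b) in enumerate(zip(trace_a, trace_b)):
        if a != b:
            return step, a, b
    return None


CRASH_TRACE = """\
Flushing gathered information
004303FB 00002410 CMP EDX,ECX
004303FD 00002410 JL digalo_r.00430282
00430282 00002410 MOV EDI,1
"""

OK_TRACE = """\
Flushing gathered information
004303FB 000024D8 CMP EDX,ECX
004303FD 000024D8 JL digalo_r.00430282
00430403 000024D8 MOV EDI,DWORD PTR SS:[ESP+28]
"""

crash = parse_addresses(CRASH_TRACE)
ok = parse_addresses(OK_TRACE)
step, crash_addr, ok_addr = first_divergence(crash, ok)
# The instruction just before the divergence is the branch worth looking at.
print(f"diverged after {crash[step - 1]:08X}: {crash_addr:08X} vs {ok_addr:08X}")
```

Only addresses are compared, because thread IDs and register values naturally differ between runs.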

Well, let's try to NOP out this conditional jump and see what happens. Yes, Digalo now really does pronounce this long drawn-out “a”! But another problem has appeared: after reading the text, the engine again crashes with an access violation, this time in a completely different place:

image
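For reference, here is what patching out the conditional jump at 0x004303FD amounts to at the byte level. This is a sketch under assumptions. In the trace of the correct run the next instruction is at 0x00430403, so the JL takes 6 bytes, which matches the near form 0F 8C rel32. The actual bytes were not shown in the article, so the code reconstructs them from the two addresses. The helper names are hypothetical, and the buffer holds only the jump itself, not the real executable.

```python
import struct

NOP = 0x90


def encode_jl_near(insn_addr: int, target: int) -> bytes:
    """Encode a near JL (0F 8C rel32).

    The displacement is relative to the address of the *next* instruction,
    which is insn_addr + 6 for this 6-byte form.
    """
    rel = target - (insn_addr + 6)
    return b"\x0f\x8c" + struct.pack("<i", rel)


def nop_out(image: bytearray, offset: int, expected: bytes) -> None:
    """Overwrite `expected` at `offset` with NOPs, but only if it is really there."""
    end = offset + len(expected)
    if bytes(image[offset:end]) != expected:
        raise ValueError("unexpected bytes at the patch site")
    image[offset:end] = bytes([NOP]) * len(expected)


JL_ADDR = 0x004303FD    # address of the JL, from the run trace
JL_TARGET = 0x00430282  # where it jumps in the crashing run

jl = encode_jl_near(JL_ADDR, JL_TARGET)
print(jl.hex())         # 0f8c7ffeffff

code = bytearray(jl)    # stand-in buffer containing just the jump
nop_out(code, 0, jl)
print(code.hex())       # 909090909090
```

After the patch, execution always falls through to 0x00430403, which is the path the correct run takes.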

The addresses alone make it clear that this time we are somewhere in the depths of the system libraries. Let's look at the call stack with Alt-K and find that the crash occurred inside the HeapFree WinAPI function:

image

Of course, with 99% probability we have not found a bug in kernel32.dll but simply passed it wrong parameters. If we set breakpoints on the calls to HeapFree, we see that in all other cases the argument passed as the pMemory parameter holds an address very different from the one passed when the application crashed:

image

Suspicious, isn't it? But what can we do? There are two options: either investigate, long and tediously, how this address ended up here, or simply give up on freeing the memory. Most of you are probably already starting to curse me, but if you think about it, there may be nothing terrible about this at all. Yes, you heard right. I agree, of course, that removing every HeapFree call from the code would be wrong, to put it mildly, because the application may allocate a huge amount of memory while running (for example, when reading a long text), and never freeing it could lead to new problems. But what is wrong with removing a memory release that happens when the application is exiting? Since we are only talking about Windows, the OS will reclaim the resources anyway (on some platforms and systems this could be critical, I agree).

Let's look at the call stack to see how we got here. Start the application again and set breakpoints at 0x0045A2B3 and 0x0041136C. The breakpoint at the first address fires many times, which suggests that this function is most likely a wrapper over HeapFree used for freeing memory in general. The breakpoint at the second address, however, fires only after the text has been read, which most likely means that this procedure is called only when the application terminates:

image

Let's NOP out the call to procedure 0x0045A273 located at 0x0041136C and check whether this fixes our problem. Yes, it does: the engine speaks the given phrase and exits cleanly:

image
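Before wiping out a call, it is worth checking that the bytes really are a call to the function you mean. The sketch below assumes the instruction at 0x0041136C is a direct near CALL (E8 rel32, 5 bytes). The article does not show its encoding, so the bytes here are reconstructed from the two addresses, and the helper name call_target is hypothetical. Note also that simply removing a call is only safe if the callee does not clean up its own stack arguments (cdecl-style). That is an assumption here too, although the fact that the patched engine exits cleanly suggests it holds.

```python
import struct

NOP = 0x90


def call_target(insn_addr: int, insn: bytes) -> int:
    """Decode the destination of a direct near CALL (E8 rel32)."""
    if len(insn) != 5 or insn[0] != 0xE8:
        raise ValueError("not an E8 rel32 call")
    (rel,) = struct.unpack("<i", insn[1:5])
    # The displacement is relative to the instruction that follows the call.
    return (insn_addr + 5 + rel) & 0xFFFFFFFF


CALL_ADDR = 0x0041136C  # where the call sits, per the call stack
WRAPPER = 0x0045A273    # the presumed HeapFree wrapper

# Reconstructed encoding (assumed, not copied from the binary).
insn = b"\xe8" + struct.pack("<i", WRAPPER - (CALL_ADDR + 5))
print(insn.hex())       # e8028f0400

# Patch only if the call really goes where we think it does.
if call_target(CALL_ADDR, insn) == WRAPPER:
    patched = bytes([NOP]) * len(insn)
else:
    patched = insn
print(patched.hex())    # 9090909090
```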

Since my goal was to get Digalo to pronounce one particular drawn-out “a”, the task can be considered done. Yes, we did not dig into why the application crashes when calling HeapFree, and we did not fully establish whether it is safe to simply NOP out the conditional jump to avoid the original problem, but in the end, why spend too much time on this task? Was the sound pronounced? It was. For all other phrases and sounds you can keep using the original Digalo executable, so there is no need to worry about unintended consequences of our changes.

Afterword


With this article I wanted to show that getting what you want here and now may not be as hard as you think. If a program refuses to save the results of your work or, for example, does not do what is expected of it in certain situations, you can quite easily solve the problem yourself without waiting for a reply from the technical support of the application in question (which, by the way, may not exist at all). Whether that is good or bad is for you to decide. In the end, that is a rather strange question to ask about reverse engineering anyway, isn't it?

Thank you for your attention, and once again I hope the article was useful to someone.

Source: https://habr.com/ru/post/260861/

