
We continue to deal with the "historical reasons" in cmd.exe


In the previous article, we discussed a possible fix for having to specify the "/D" switch for the CD command built into cmd.exe, the standard command interpreter of the Windows operating system. It is time to talk about another behavior that has been carried along since time immemorial for no apparent reason.

This time we will talk about path auto-completion, which in most environments and software products (cmd.exe is no exception) is triggered by pressing Tab / Shift-Tab. I think no one will argue that the feature is quite useful and often saves the few seconds it would take to type the full path to the file or directory the user is interested in. It's great that it is also present in cmd.exe, however...

Let's experiment. Run cmd.exe (Win-R -> cmd), start typing "CD C:/", press Tab, and... Instead of the expected directories like "Program Files" and "Windows", we get the alphabetically first object from %HOMEPATH%, glued onto "C:/" (in my case the result was "C:/.vim"). Why? I think those who often deal with cmd.exe in their work have already understood what is going on: for auto-completion to work correctly, you must use a backslash instead of a forward slash (by the way, there are other exceptions in this regard). This is especially unusual for those who spend most of their time in other systems (for example, *nix-like ones), where the path separator is the forward slash, not the backslash. Why Microsoft chose this particular character instead of the forward slash, which was already familiar to many users at the time, is explained, for example, here. So we can either come to terms with this, or pick up a debugger and start exploring cmd.exe. Had we chosen the first path, there would be no article, so you have probably already guessed where this is going.

Read on to see how it went and what came of it (careful, lots of screenshots).

First of all, we need to find out why cmd.exe decided to search for objects not in the directory the user specified, but in %HOMEPATH%.

Iterating over objects in a directory with WinAPI is usually done with the FindFirstFile and FindNextFile functions, as well as their variations such as FindFirstFileEx, FindFirstFileTransacted, etc. Launch OllyDbg, load cmd.exe into it (copied beforehand, of course, to any directory other than "%WINDIR%\system32"), open the window with the list of intermodular calls (right-click in the CPU window -> Search for -> All intermodular calls), type "FindFirstFile" and set breakpoints on all the calls with the F2 key:

image

Enter the command under examination, "CD C:/", press Tab, and see the following picture:

image

Pay attention to the first argument passed to FindFirstFileEx. According to the documentation, it is the one that sets the search criterion:

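lpFileName [in]
The directory or path, and the file name. The file name can include wildcard characters, for example, an asterisk (*) or a question mark (?).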

In my case, it pointed to the address 0x0030F660, which held the Unicode string "C:\Program Files\*". Why that one? Because that was the directory I was in when I entered the CD command.

Let's do the same thing with a backslash instead of a forward slash. Press F9, enter the command "CD C:\", press Tab, and see:

image

Yes, now this argument points to the string "C:\*", as expected. So when a forward slash is used as the path separator, cmd.exe looks for completion candidates in the current directory.

We walk through the calls of the procedures on the call stack (opened with Alt-K) and near one of them see something that looks like parsing of the user's command:

image

We set a breakpoint at the beginning of this procedure (in my case it is 0x4ACE1877), press F9, enter our command with a backslash followed by Tab again, and start stepping through the code. After vigorously pressing F7 for a while, we realize that we are in a loop that iterates over all the characters of the command entered by the user:

image

EBP+8 points to the Unicode string with the command, EBP+10 holds the length of the command, and EDI is the loop counter.

Almost immediately after this loop comes a call to std::memcpy. With a backslash, "C:\" ends up in dest:

image

With a forward slash, dest receives an empty string:

image

Well, let's try to figure out what happens in this loop by translating its algorithm into a high-level programming language. IDA Pro could decompile the code for us, but, unfortunately, it costs a lot of money, so we will translate it into C++ ourselves:

#include <cstddef>
#include <cstring>
#include <cwchar>
#include <iostream>
#include <string>

int main()
{
    std::wstring command;
    std::getline(std::wcin, command);
    auto command_size = command.size();

    int ebx = -1; // Index just past the last ':' or '\\'
    int esi = 0;  // Index where the text to complete starts
    int edx = 0;
    const int ebp_24 = 0; // Always 0 in our case, since it only changes in the '"' branch

    // Not actually used in our case
    int ebp_1c = 0;
    int ebp_28 = 0;
    int ebp_2c = 0;

    /**
     * 4ACE18C7 | > / 897D D0 / MOV DWORD PTR SS : [EBP - 30], EDI
     * 4ACE18CA | . | 8B45 10 | MOV EAX, DWORD PTR SS : [EBP + 10]
     * 4ACE18CD | . | 3BF8 | CMP EDI, EAX
     * 4ACE18CF | . | 0F8D 90000000 | JGE cmd.4ACE1965
     */
    for (std::wstring::size_type i = 0; i < command_size; ++i)
    {
        /**
         * 4ACE18D5 | . 8B45 08 | MOV EAX, DWORD PTR SS : [EBP + 8]
         * 4ACE18D8 | . 0FB70478 | MOVZX EAX, WORD PTR DS : [EAX + EDI * 2]
         */
        const wchar_t cur_symbol = command[i];

        // 4ACE18DC | . 66:83F8 2F | CMP AX, 2F
        if (cur_symbol == L'/')
        {
            /**
             * 4ACE18E2 | . 8D77 01 | LEA ESI, DWORD PTR DS : [EDI + 1]
             * 4ACE18E5 | . 8975 D8 | MOV DWORD PTR SS : [EBP - 28], ESI
             */
            esi = static_cast<int>(i) + 1;
            ebp_28 = esi;
        }
        else if (cur_symbol == L'"')
        {
            // ...
        }

        // 4ACE18F0 | . 3955 DC | CMP DWORD PTR SS : [EBP - 24], EDX
        if (ebp_24 == edx)
        {
            /**
             * 4ACE190C | . 50 | PUSH EAX; / w
             * 4ACE190D | . 68 E008D04A | PUSH cmd.4AD008E0; | wstr = " &()[]{}^=;!%'+,`~"
             * 4ACE1912 | .FF15 F010CC4A | CALL DWORD PTR DS : [<&msvcrt.wcschr>]; \wcschr
             * 4ACE1918 | . 59 | POP ECX
             * 4ACE1919 | . 59 | POP ECX
             * 4ACE191A | . 85C0 | TEST EAX, EAX
             */
            if (std::wcschr(L" &()[]{}^=;!%'+,`~", cur_symbol) != NULL)
            {
                /**
                 * 4ACE191E |. 8D77 01 |LEA ESI,DWORD PTR DS:[EDI+1]
                 * 4ACE1921 |. 8975 D8 |MOV DWORD PTR SS:[EBP-28],ESI
                 * 4ACE1924 |. 8365 E4 00 |AND DWORD PTR SS:[EBP-1C],0
                 * 4ACE1928 |. 33D2 |XOR EDX,EDX
                 */
                esi = static_cast<int>(i) + 1;
                ebp_28 = esi;
                ebp_1c = 0;
                edx = 0;
            }
            else
            {
                // 4ACE192C | > \33D2 | XOR EDX, EDX
                edx = 0;

                /**
                 * 4ACE1935 | . 66:83F8 3A | CMP AX, 3A
                 * 4ACE1939 | . 74 1B | JE SHORT cmd.4ACE1956
                 * 4ACE193B | . 66 : 83F8 5C | CMP AX, 5C
                 * 4ACE193F | . 74 15 | JE SHORT cmd.4ACE1956
                 */
                if (cur_symbol == L':' || cur_symbol == L'\\')
                {
                    /**
                     * 4ACE1956 | > \8D5F 01 | LEA EBX, DWORD PTR DS : [EDI + 1]
                     * 4ACE1959 | . 895D D4 | MOV DWORD PTR SS : [EBP - 2C], EBX
                     * 4ACE195C | > 8955 E4 | MOV DWORD PTR SS : [EBP - 1C], EDX
                     */
                    ebx = static_cast<int>(i) + 1;
                    ebp_2c = ebx;
                    ebp_1c = edx;
                }
                else if (cur_symbol == L'*' || cur_symbol == L'?')
                {
                    // ...
                }
            }
        }
    }

    /**
     * 4ACE1965 |> \83FB FF CMP EBX,-1
     * 4ACE1968 |. 74 04 JE SHORT cmd.4ACE196E
     * 4ACE196A |. 3BDE CMP EBX,ESI
     * 4ACE196C |. 7D 05 JGE SHORT cmd.4ACE1973
     */
    if (ebx == -1 || ebx < esi)
    {
        /**
         * 4ACE196E | > \8BDE MOV EBX, ESI
         * 4ACE1970 | . 895D D4 MOV DWORD PTR SS : [EBP - 2C], EBX
         */
        ebx = esi;
        ebp_2c = ebx;
    }

    /**
     * 4ACE1973 | > \2BC6 SUB EAX, ESI
     * 4ACE1975 | . 03C0 ADD EAX, EAX
     * 4ACE1977 | . 8BF8 MOV EDI, EAX
     * 4ACE1979 | . 57 PUSH EDI; / n
     * 4ACE197A | . 8B45 08 MOV EAX, DWORD PTR SS : [EBP + 8]; |
     * 4ACE197D | . 8D0470 LEA EAX, DWORD PTR DS : [EAX + ESI * 2]; |
     * 4ACE1980 | . 50 PUSH EAX; | src
     * 4ACE1981 | .FF75 E0 PUSH DWORD PTR SS : [EBP - 20]; | dest
     * 4ACE1984 | .E8 52FAFDFF CALL <JMP.&msvcrt.memcpy>; \memcpy
     */
    // The original doubles the character count because a WCHAR is 2 bytes on Windows;
    // sizeof(wchar_t) keeps the copy correct on platforms with a wider wchar_t.
    const std::size_t count = (command_size - esi) * sizeof(wchar_t);
    wchar_t dest[1024] = { 0 };
    std::memcpy(dest, command.substr(esi).c_str(), count);

    std::wcout << "Result: " << dest << std::endl;
}

The places marked with the "// ..." comment are not reached in the cases we are considering.

Characters like '*' and '\' were identified by the ASCII code table:

image

Experimenting with the input data, you can see the following:

CD C:\
Result: C:\

CD C:/
Result:

CD C:\Windows\
Result: C:\Windows\

CD C:/Windows\
Result: Windows\

It is easy to see that a forward slash causes problems wherever it appears in the path entered by the user, whether at the end or in the middle.
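The observations above can be reproduced with a condensed version of the loop. This is only a sketch: the function name completion_text is made up for illustration, and the quote and wildcard branches (the "// ..." places) are left out on the assumption that they do not matter for these inputs.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Condensed model of the scanning loop (quote and wildcard branches omitted).
// Returns the part of the command that gets copied out for auto-completion.
// The name completion_text is hypothetical, not something found in cmd.exe.
std::wstring completion_text(const std::wstring& command) {
    const std::wstring delimiters = L" &()[]{}^=;!%'+,`~";
    std::size_t start = 0;  // plays the role of ESI
    for (std::size_t i = 0; i < command.size(); ++i) {
        const wchar_t c = command[i];
        // A forward slash or any delimiter restarts the text right after itself.
        if (c == L'/' || delimiters.find(c) != std::wstring::npos) {
            start = i + 1;
        }
    }
    assert(start <= command.size());
    return command.substr(start);
}
```

Calling completion_text(L"CD C:/Windows\\") yields only the part after the forward slash, which matches the last experiment: the forward slash is treated like a delimiter, so everything before it is thrown away.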

The solution could be to replace all forward slashes with backslashes right after cmd.exe realizes that auto-completion is needed. To do this, I propose approaching from the other side: stepping through the code right after the user's input is read from the standard input stream.

However, data can be read from stdin in many different ways. How do we find out which one cmd.exe uses? Quite simply: press F9, then F12 (Pause), look at the call stack, and find the WinAPI function ReadConsole among the calls:

image

By default, ReadConsole returns control to its caller after the Enter key is pressed, but that is apparently not our case, since it must also return, for example, after Tab is pressed.

We set a software breakpoint on its call and wait for it to trigger:

image

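Note the last parameter, labeled pReserved here. It is actually called pInputControl and is responsible for the following:

pInputControl [in, optional]
A pointer to a CONSOLE_READCONSOLE_CONTROL structure that specifies a control character to signal the end of the read operation. This parameter can be NULL.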

In our case, it is not NULL at all, so let's see what the CONSOLE_READCONSOLE_CONTROL structure looks like:

 typedef struct _CONSOLE_READCONSOLE_CONTROL {
     ULONG nLength;
     ULONG nInitialChars;
     ULONG dwCtrlWakeupMask;
     ULONG dwControlKeyState;
 } CONSOLE_READCONSOLE_CONTROL, *PCONSOLE_READCONSOLE_CONTROL;

Looking at raw bytes is not very convenient, so let's use StollyStructs, an OllyDbg plugin designed to visualize structures. Download it, unpack the .dll and .ini into the directory containing the OllyDbg executable (provided that directory is set as the plugin path, which it is by default), and restart the debugger. After cmd.exe is restarted, the addresses may change, but the "tail" of each address will most likely stay the same. For example, if the ReadConsole call we are interested in was previously located at 0x4ACD3589, it will now probably be at 0xXXXXX589:

image
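Why the tail of the address survives a restart can be illustrated with a little arithmetic. The sketch below assumes the change comes from the image being loaded at a different base address, and relies on Windows placing image bases on 64 KB boundaries, so the low bits of every address in the module stay the same. Both base addresses in the usage note are hypothetical; the real ones should be read from the debugger's module list.

```cpp
#include <cassert>
#include <cstdint>

// Translate an address noted in one debugging session to the next one,
// given the module base in each session. The helper name and the idea of
// passing the bases explicitly are illustrative assumptions.
std::uint32_t rebase(std::uint32_t address,
                     std::uint32_t old_base,
                     std::uint32_t new_base) {
    assert(address >= old_base);  // the address must lie inside the module
    const std::uint32_t offset = address - old_base;  // offset within the image
    return new_base + offset;
}
```

For example, with the made-up bases 0x4ACC0000 and 0x4A140000, rebase(0x4ACD3589, 0x4ACC0000, 0x4A140000) gives 0x4A153589, which still ends in 589.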

We set a breakpoint, stop at it, click Plugins -> StollyStruct -> Select structure, enter the address passed as the pInputControl argument in the Address field, and... we do not find the CONSOLE_READCONSOLE_CONTROL structure in the drop-down list. Well, the author never promised that every WinAPI structure would be predefined. There are two options: either add a description of this structure to the plugin's configuration file, or use another structure similar to the one we need. The first one that came to my mind was RECT, which also has four fields, the only difference being that it uses LONGs instead of ULONGs, which hardly matters in this case:

 typedef struct _RECT {
     LONG left;
     LONG top;
     LONG right;
     LONG bottom;
 } RECT, *PRECT;

The result is the following:

image

The characters that end the input are specified via the dwCtrlWakeupMask field:

dwCtrlWakeupMask
A user-defined control character used to signal that the read is complete.

As you can see, in our case it contains the value 0x200, which is the result of the bit shift 1 << 0x9, where 0x9 is the ASCII code of Tab.
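The relationship between a control character and its bit in the mask can be written down as a one-liner. This is a sketch based only on the value observed above (0x200 for Tab); the helper name wakeup_bit is made up, and the assumption is that bit N of the mask stands for the character with code N.

```cpp
#include <cassert>
#include <cstdint>

// Bit for one control character in a wakeup mask, assuming bit N stands for
// the character with code N. Only codes 0..31 fit into a 32-bit mask.
std::uint32_t wakeup_bit(unsigned code) {
    assert(code < 32);
    return std::uint32_t{1} << code;
}
```

With this helper, wakeup_bit(0x09) reproduces the 0x200 seen in the debugger.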

Well, we have confirmed that ReadConsole returns after the user presses either Enter or Tab. Now let's get back to step-by-step debugging.

After stepping a little further, we find ourselves in another loop that iterates over all the characters of the command entered by the user:

image

Here, EDI points to the Unicode string with the command, and EAX is the loop counter.

As you can see, each character is compared first with 0x0D, then with the value at 0x4A1640A0, and then with the value at 0x4A1640A4. The ASCII table shows that 0x0D is simply a carriage return. Both of the addresses above hold the same value, 0x9, which, as mentioned earlier, is the ASCII code of Tab:

image
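In a high-level language, the scan described above might look roughly like the sketch below. It is a simplified model, not decompiled code: the enum and the function name are invented for illustration, and the two separate comparisons with Tab are folded into one.

```cpp
#include <cassert>
#include <cstddef>

// Which terminating character, if any, the scan runs into first.
enum class Wakeup { None, Enter, Tab };

// Model of the scan: `buffer` corresponds to the string at EDI and the loop
// index to EAX. All names here are hypothetical.
Wakeup first_wakeup(const wchar_t* buffer, std::size_t length) {
    assert(buffer != nullptr);
    for (std::size_t i = 0; i < length; ++i) {
        if (buffer[i] == L'\r') return Wakeup::Enter;  // 0x0D, carriage return
        if (buffer[i] == L'\t') return Wakeup::Tab;    // 0x09, Tab
    }
    return Wakeup::None;
}
```

The Tab case is the one that matters for us, since that is where execution heads off towards the auto-completion code.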

And not far from the address that is jumped to when the current character equals Tab, there is the command-parsing code we saw earlier. In my opinion, this is the best place to put the jump to our code cave.

What will we do in it? I propose the following: walk from the end of the string towards its beginning, up to the first space encountered, checking each character for equality with a forward slash and replacing it with a backslash. It will look something like this:

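 PUSHFD
 PUSHAD
 ; Nothing to do if the string is empty
 ; (the user pressed only Tab)
 TEST EAX,EAX
 JZ l1
l4:
 DEC EAX
 ; Load the current character into ECX
 MOVZX ECX,WORD PTR DS:[EDI+EAX*2]
 ; Is it a forward slash?
 CMP CX,2F
 JE l2
 ; Is it a space?
 CMP CX,20
 JE l1
 JMP l3
l2:
 ; Replace the forward slash with a backslash
 MOV WORD PTR DS:[EDI+EAX*2],5C
l3:
 ; Keep going until the beginning of the string is reached
 TEST EAX,EAX
 JNZ l4
l1:
 POPAD
 POPFD
 ; Jump to where the original conditional branch would have gone
 JMP 4ACD42CD

Find a place for the code cave (you can do this with Ctrl-B -> a run of zeros in the "HEX +0C" field) and write the following code there (the addresses, of course, may differ):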
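For readers who find assembly hard to follow, the same logic can be sketched in C++. The function name is made up; `buffer` stands for the string at EDI and `length` for the value of EAX on entry, which is assumed here to be the number of characters typed before the Tab.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// C++ rendering of the code cave: walk backwards from the end of the typed
// text, turning '/' into '\\', and stop at the first space.
void normalize_slashes(std::wstring& buffer, std::size_t length) {
    assert(length <= buffer.size());
    for (std::size_t i = length; i > 0; ) {
        --i;                       // DEC EAX
        if (buffer[i] == L'/') {
            buffer[i] = L'\\';     // MOV WORD PTR DS:[EDI+EAX*2],5C
        } else if (buffer[i] == L' ') {
            break;                 // JE l1
        }
    }
}
```

Stopping at the first space keeps switches such as "/b" earlier in the command intact, but it also means that in a quoted path containing spaces only the part after the last space is fixed.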

 4A163CC5    9C               PUSHFD
 4A163CC6    60               PUSHAD
 4A163CC7    85C0             TEST EAX,EAX
 4A163CC9    74 1D            JE SHORT cmd.4A163CE8
 4A163CCB    48               DEC EAX
 4A163CCC    0FB70C47         MOVZX ECX,WORD PTR DS:[EDI+EAX*2]
 4A163CD0    66:83F9 2F       CMP CX,2F
 4A163CD4    74 08            JE SHORT cmd.4A163CDE
 4A163CD6    66:83F9 20       CMP CX,20
 4A163CDA    74 0C            JE SHORT cmd.4A163CE8
 4A163CDC    EB 06            JMP SHORT cmd.4A163CE4
 4A163CDE    66:C70447 5C0>   MOV WORD PTR DS:[EDI+EAX*2],5C
 4A163CE4    85C0             TEST EAX,EAX
 4A163CE6  ^ 75 E3            JNZ SHORT cmd.4A163CCB
 4A163CE8    61               POPAD
 4A163CE9    9D               POPFD
 4A163CEA  ^ E9 DE05FFFF      JMP cmd.4A1542CD

where 0x4A1542CD is the address that the conditional branch at 0x4A154299, which checks whether the current character of the command equals Tab, would have jumped to. That branch, in turn, is replaced with a jump to our code cave:

image

You have probably noticed that the new jump overwrote the instruction that followed. That's fine, because it was in fact a similar check of the current character against the same Tab, and there was no other way to reach it. To verify this, you can select our changes, restore the original code with Alt-Backspace, select the line with that instruction and press Ctrl-R; the list will contain one and only one line, with the same address:

image
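If you prefer to patch bytes by hand rather than type instructions into the debugger, the encoding of a near JMP is easy to compute: the opcode E9 followed by a 32-bit little-endian displacement counted from the end of the five-byte instruction. The sketch below uses made-up helper names and checks itself against the last line of the code cave listing.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Encode `JMP target` located at `address` as E9 + rel32 (little-endian).
std::array<std::uint8_t, 5> encode_near_jmp(std::uint32_t address,
                                            std::uint32_t target) {
    // The displacement is relative to the address right after the instruction;
    // unsigned arithmetic wraps modulo 2^32, which handles backward jumps.
    const std::uint32_t rel = target - (address + 5);
    std::array<std::uint8_t, 5> bytes{{
        0xE9,
        static_cast<std::uint8_t>(rel),
        static_cast<std::uint8_t>(rel >> 8),
        static_cast<std::uint8_t>(rel >> 16),
        static_cast<std::uint8_t>(rel >> 24)}};
    return bytes;
}

// Recover the jump target from an encoded near JMP located at `address`.
std::uint32_t decode_near_jmp(std::uint32_t address,
                              const std::array<std::uint8_t, 5>& bytes) {
    assert(bytes[0] == 0xE9);  // only the E9 rel32 form is handled
    const std::uint32_t rel = std::uint32_t{bytes[1]} |
                              (std::uint32_t{bytes[2]} << 8) |
                              (std::uint32_t{bytes[3]} << 16) |
                              (std::uint32_t{bytes[4]} << 24);
    return address + 5 + rel;
}
```

For the jump at 0x4A163CEA to 0x4A1542CD, the displacement is 0x4A1542CD - 0x4A163CEF = 0xFFFF05DE, so the bytes are E9 DE 05 FF FF, exactly what the listing shows. The five-byte length is also why a jump written over a shorter instruction spills into whatever follows it.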

We check that it works, and... When Tab is pressed, forward slashes really are replaced with backslashes, so auto-completion is performed in the directory the user specified, regardless of which slashes were typed originally.

Afterword


Some may say this is all trivial. Some may not like that we are solving this problem rather than the developers at Microsoft. Some may not like anything at all. But the fact remains: we solved our problem, and cmd.exe now works the way we wanted at the very beginning of the article. Whether to do the same is up to you.

In fairness, it should be noted that in PowerShell this "problem", like the situation with the "/D" switch for the CD command, has indeed been fixed.

Thank you for your attention, and again I hope that the article was useful to someone.

Source: https://habr.com/ru/post/261233/

