
In the standard Notepad shipped with every version of Windows since about 2001, there is a bug that almost everyone knows about but nobody is going to fix. That is understandable: it is not a critical vulnerability and it threatens nobody's security. And does anyone use Notepad at all?
Still, the fact itself is rather odd, so let's find this bug in the 64-bit and 32-bit notepad.exe from Windows 7, fix it, and finally work out where it came from. The bug is as follows:
If the Word Wrap option is turned on in Notepad, then after saving a file various glitches begin: lines break apart, the cursor jumps away, text is entered somewhere other than where you expect, and so on.
To begin, let's pin down what exactly happens. Open or type some text with lines long enough to wrap. Save the file. If you now try to edit it, for example by adding the word "blue", the lines wrap incorrectly and the formatting breaks:

If you shrink the Notepad window, the lines get cut (this is visible in the title picture), and when you stretch it they stay as they are instead of filling the widening window. It is as if a hard line break had appeared in each line at the point where it ended at the moment of saving. Apparently the text gets corrupted in memory somehow:

If you now save the file again, it gets even worse. All the lines are reformatted, but the window is not redrawn. As a result the cursor may end up somewhere else, and if you start typing, the text goes not where the cursor appears to be but into a completely different place. The programmers who wrote Notepad reasoned logically: saving a file should not change anything in the window, so there is no point in redrawing it. In our case, however, because of this bug the entire text changes. Every Windows user can reproduce the situation, because the last version without this bug was Windows 98, and it is unlikely that anyone still has it.
So, apparently, something goes wrong when the file is saved and the text gets corrupted. How do we find that place in the code? Open notepad.exe in a debugger. As you know, a 64-bit system has two Notepads for compatibility, a 32-bit one and a 64-bit one; don't mix them up.
Enter a text on which the corruption at the wrap point will be easy to spot. Type "first text line second text line" on one line, then shrink the window so that the line wraps in the middle.

It is reasonable to assume that the file is written with the WriteFile function. It turns out to be called in as many as 6 places in the code. Without thinking twice, we set breakpoints on all 6 calls, run Notepad and click "Save". Execution stops here:

Let's look at the registers that hold the call parameters. rcx contains 104; it is not obvious what that is (rcx carries the first argument, so most likely it is the file handle). And rdx = 002D45E0 looks like a memory address. Let's see what is there.

Fine, this is where the data is written from. Let's try executing the code further to see where the text gets corrupted. Almost immediately, however, this data is overwritten, which means it is only a temporary buffer and the text itself is stored somewhere else. Let's look further up in the code.

Aha: before saving, the text is apparently converted from a multibyte encoding to a single-byte one. As before, let's look at the parameters. rax = 002D45E0, which still holds zeros: this is where the result will go. esi = 20 is the length of the text. ecx = 4E3, no comment. edx = 400, likewise. But r8 = 002D6780:

Again we continue execution, watching the contents of this region of memory. After a few dozen instructions we leave the subroutine; various jumps and calls follow, but we ignore them and keep pressing "step over", executing the code step by step and watching only the window with the text. At some point it changes. As you can see, between the first and second lines the codes 0d, 0d, 0a have appeared:

As usually happens, we stepped right past the instruction we needed while mashing the button, so we have to repeat everything, remembering roughly where it happened. This time we slow down as we approach the right place in the code and establish precisely that the text is corrupted by this call:

We can check what happens if this call is not made. We get to this place again and, right there in the debugger, change RIP (the register that holds the address of the instruction being executed) to 00000000FFA38EE1, as if we had skipped the call that spoiled everything. Surprisingly, everything works and the text is not broken!
It must be said that in such cases people usually don't bother to understand what the subroutine is, what it does and why; they simply cut it out of the EXE file. This can be done in different ways, for example by filling it all with NOPs, or by changing the conditional jump "je", which happens to sit right before it, into an unconditional "jmp".
But we are less interested in fixing the bug than in finding out where it came from. So we step inside and look:

Here is this wonderful little routine. We step through it. First, two variables are compared with zero; as a result, the first call (to who knows what) is not made, but two calls to SendMessage are made in a row. In other words, all that happens is that two Windows messages are sent, and the text is corrupted right after the first one (highlighted in green). It is plain to see that the message codes are passed in EDX (highlighted in red). We look up the code 0C8h.
This turns out to be the EM_FMTLINES message. That sounds about right: a message was sent to format the lines, and formatted they were. Time to read the documentation. MSDN tells us the following:
This message sets whether "soft" line break characters are included in a multiline edit control. A "soft" line break consists of two [CR] characters and one [LF], and is inserted at the end of a line that is broken because of word wrapping.
Parameter wParam: TRUE inserts the characters, FALSE removes them.
The message only affects the buffer returned by the EM_GETHANDLE and WM_GETTEXT messages, and does not affect the text displayed in the edit control. It also does not affect "hard" line breaks, which consist of one [CR] and one [LF].
In addition, we learn that this message was introduced no later than Windows 95. Well, that settles it. In 95 it was assumed not to affect the displayed text, but now we see that it does, and how. After studying the code a little we find several similar calls, and the following picture forms in the mind's eye:
Long ago, in the first half of the 1990s, Microsoft programmers wrote Notepad for Windows 95. To implement the remarkable word wrap feature, they came up with the idea of sending the window (or rather its edit control) a message telling it to reformat itself by inserting special characters. To distinguish these characters from an ordinary line break, they invented the sequence 0d, 0d, 0a. To keep it out of the file, all such codes were removed before saving and added back after saving.
Later, when Windows XP was being made, the edit control began to wrap everything properly by itself and no longer needed this message. However, nobody remembered what it was for, so they decided to leave it as it was, just in case. Besides, everything seemed to work, and nobody noticed the problems after saving. The code has stayed there ever since, surviving into Windows 7 and 8. I have not installed Windows 10, but most likely the bug is there too.
Now let's fix the bug. After message 0C8h another one, 0B1h, is sent; this is EM_SETSEL, which sets the selection. So throwing out the whole subroutine would not be quite right after all, and there is also that strange call at the beginning. It is better to remove only the first call to SendMessage, or change its parameter from 1 to 0, or redirect the jump so that, after the variable [0FFA40054h] is checked, execution goes straight to the second call. There are many options, but the result is the same.
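The soft-break mechanism described in the MSDN excerpt can be modelled in a few lines. This is only an illustrative sketch in Python, not Notepad's actual code: the helper names `wrap_line`, `add_soft_breaks` and `remove_soft_breaks` are invented here, and a simple greedy wrap by character count stands in for the edit control's real, pixel-based wrapping.

```python
SOFT_BREAK = "\r\r\n"  # CR CR LF: the "soft" break described by MSDN
HARD_BREAK = "\r\n"    # CR LF: an ordinary line ending


def wrap_line(line, width):
    """Greedily split one hard line into rows of roughly `width` characters.

    Each row keeps its trailing space, so joining the rows restores the line.
    """
    rows, current = [], ""
    for word in line.split(" "):
        if current and len(current) + len(word) > width:
            rows.append(current)
            current = ""
        current += word + " "
    rows.append(current[:-1])  # drop the space added after the last word
    return rows


def add_soft_breaks(text, width):
    """Model of EM_FMTLINES with wParam = TRUE: mark every wrap point."""
    lines = text.split(HARD_BREAK)
    return HARD_BREAK.join(SOFT_BREAK.join(wrap_line(l, width)) for l in lines)


def remove_soft_breaks(text):
    """Model of EM_FMTLINES with wParam = FALSE: drop the markers again."""
    return text.replace(SOFT_BREAK, "")


if __name__ == "__main__":
    original = "first text line second text line"
    formatted = add_soft_breaks(original, 20)
    print(formatted.encode("ascii").hex(" "))
    print(remove_soft_breaks(formatted) == original)
```

In the hex dump the bytes 0d 0d 0a sit between the two halves of the test line, the same codes that showed up in the debugger's memory window. As long as the markers are removed again the text round-trips unchanged; the trouble starts when they are left in the buffer that the user keeps editing.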

Where is the parameter that equals 1? Very simple: it is in register r8. To keep the code short, the compiler avoids loading a literal zero into a register. Such an instruction takes 6 bytes: 2 bytes of opcode and 4 bytes for the 32-bit zero. Instead, the register is XORed with itself, which yields zero and takes only 3 bytes. Then r9, which is zero, is copied into r8 with one added (highlighted in green). That operation takes only 4 bytes. This green 1 is what we need to change to 0, and then the text will no longer be corrupted.
Now let's find the same procedure in the 32-bit version of Notepad. If you don't feel like repeating all the debugging steps, you can find it simply by searching for the number 0C8h.
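The byte counts above can be checked against the standard x86-64 encodings. The sketch below assumes the instructions operate on the 32-bit halves r9d and r8d (the screenshot is not reproduced here, so the exact operands are an assumption); `patch_wparam` is a hypothetical helper that only illustrates which byte gets changed.

```python
# Standard x86-64 encodings, assuming 32-bit operands (r9d / r8d).
ENCODINGS = {
    "mov r9d, 0": bytes.fromhex("41 B9 00 00 00 00"),  # REX + opcode, then imm32
    "xor r9d, r9d": bytes.fromhex("45 33 C9"),         # REX, opcode, ModRM
    "lea r8d, [r9+1]": bytes.fromhex("45 8D 41 01"),   # REX, opcode, ModRM, disp8
}


def patch_wparam(lea_bytes):
    """Turn 'lea r8d, [r9+1]' into 'lea r8d, [r9+0]' by zeroing the disp8 byte."""
    if lea_bytes[-1] != 0x01:
        raise ValueError("unexpected displacement byte")
    return lea_bytes[:-1] + b"\x00"


if __name__ == "__main__":
    for text, code in ENCODINGS.items():
        print(f"{text:<18} {len(code)} bytes  {code.hex(' ')}")
    print("patched:", patch_wparam(ENCODINGS["lea r8d, [r9+1]"]).hex(" "))
```

The last byte of the `lea` instruction is the "green 1": it is the displacement added to r9, so replacing it with 0 leaves r8 equal to zero and EM_FMTLINES is sent with wParam = FALSE.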
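Outside a disassembler, a search for the constant can be approximated by scanning the raw bytes for 0C8h stored as a little-endian 32-bit immediate. This is a rough sketch: `find_all` is a helper written for this example, the sample bytes are made up, and on a real executable such a scan will also report unrelated matches that have to be checked by hand.

```python
EM_FMTLINES = 0xC8


def find_all(data, pattern):
    """Return every offset at which `pattern` occurs in `data`."""
    hits, start = [], 0
    while True:
        index = data.find(pattern, start)
        if index == -1:
            return hits
        hits.append(index)
        start = index + 1


if __name__ == "__main__":
    pattern = EM_FMTLINES.to_bytes(4, "little")  # C8 00 00 00
    # Made-up fragment: "push 1; push 0C8h; call [...]" followed by "mov edx, 0C8h".
    sample = bytes.fromhex("6A 01 68 C8 00 00 00 FF 15 BA C8 00 00 00")
    print([hex(offset) for offset in find_all(sample, pattern)])
```

To scan a real file, read it with `open(path, "rb").read()` and pass the result as `data`.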

As you can see, the code is entirely analogous, only 32-bit. Now, to fix the bug, all that remains is to find this place in the EXE file and change the right byte. Before doing so, don't forget to take ownership of the file and give yourself permission to modify it.
64-bit notepad.exe (193536 bytes): change the byte at address [80FC] from 1 to 0
32-bit notepad.exe (179712 bytes): change the byte at address [6FC8] from 1 to 0
I have no doubt that somewhere in the depths of Microsoft's code there are many more places where ancient bugs lie sleeping, bugs that most likely nobody will ever fix. We can only hope that they are all as harmless as this one, and that nothing terrible happens when they are carried over into the next operating system, which users around the world will happily install.
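The patch itself can be scripted. The following is a cautious sketch rather than a ready-made tool: `patch_byte` is a name invented here, the addresses above are assumed to be hexadecimal file offsets, and the function refuses to touch a file whose size or current byte does not match, since other builds of notepad.exe will differ. Work on a copy of the file.

```python
import os


def patch_byte(path, offset, expected_size, old=0x01, new=0x00):
    """Replace one byte at `offset`, but only if the file looks like the expected build."""
    size = os.path.getsize(path)
    if size != expected_size:
        raise ValueError(f"unexpected file size {size}, expected {expected_size}")
    with open(path, "r+b") as f:
        f.seek(offset)
        current = f.read(1)
        if current != bytes([old]):
            raise ValueError(f"byte at {offset:#x} is {current.hex()}, expected {old:02x}")
        f.seek(offset)
        f.write(bytes([new]))


# Values from the article; the offsets are assumed to be hexadecimal.
PATCHES = {
    "64-bit": {"size": 193536, "offset": 0x80FC},
    "32-bit": {"size": 179712, "offset": 0x6FC8},
}

if __name__ == "__main__":
    # "notepad_copy.exe" is a hypothetical path to a copy of the 64-bit binary.
    target = PATCHES["64-bit"]
    patch_byte("notepad_copy.exe", target["offset"], target["size"])
```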