A couple of years ago,
I wrote about using Royal Mail's PAF (Postcode Address File) database to bring users' mailing addresses to a standard form. Since PAF is Royal Mail's main intellectual property, getting it is not so easy: an annual subscription costs from £400, depending on the completeness of the database and the frequency of updates. A week or two after subscribing, a solid red box with a CD arrives in the mail:

On the disk is an EXE file that asks for a "serial number" and unpacks the database (a set of CSV files) to disk. The serial number is sent separately, so that an attacker who intercepts the package cannot use the database. (What will they think of next: a text file with a serial number!) Each client gets their own number, so that in case of a "leak" it is clear whom to hold responsible. However, the serial number does nothing to prevent the data itself from leaking, and
Postzon (one of the components of the PAF) appeared on WikiLeaks in 2009. The commentary on it noted that "
this database was compiled at taxpayers' expense, and activists, including The Guardian and Sir Tim Berners-Lee, have long tried to convince Royal Mail to open free access to PAF; but so far these attempts have not been crowned with success." However, a year after Postzon appeared on WikiLeaks, a similar database was released openly in the name of the British mapping agency Ordnance Survey, under the title
OS Code-Point Open. In this way Royal Mail saved face: it did not give in to the activists' demands, yet the leaked data gained public status. However, the full PAF is still not publicly available. (While I was preparing this article, Postzon disappeared from WikiLeaks as well; but
Google remembers everything.)
A year after receiving the PAF, I needed to look at it again, but the sheet with the serial number, sent separately from the disk, had managed to get lost in the meantime. Then I got curious: how hard would it be to bypass the serial number check in a product so respectable and so fiercely guarded against "information liberators"? Half an hour later I had the data on my hard drive, and the unpacking program itself struck me as a good demo for beginner reverse engineers. No IDA is required, only free tools that are quick to install.
Hardcore assembly programmers,
whom even Kaspersky is afraid of, will surely find this example a toy and will yawn their way through the rest of the article. Well, fine: tutorials in the style of
"how to draw an owl" annoy me a lot more than those that spell out simple things in detail.
Let's start by running the program under
windbg :

We fill the serial field with garbage and click "Begin". A message appears saying that the serial is incorrect.
Break into the debugger:

Switch to the main thread (
~0s
) and look at the call stack (
k
):

We see that execution stopped somewhere inside the message loop launched from the
MessageBox
, and the return address from the
MessageBox
is 0041f086. Let's see what was in the code
before the MessageBox
call, say, starting 0x40 bytes before the return address (command
u 041f086-40 l 1f
):

So, the
MessageBox
is called (with the return address 0041f086), if the byte
[ebp-45h]
turned out to be zero, and this byte is written just a few instructions earlier: it holds the result of the call to the function at 00402adc, which evidently checks the serial number.
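In higher-level terms, the logic described above appears to boil down to something like the following. This is only a sketch in Python: the function and variable names are mine (hypothetical), the real algorithm at 00402adc is not reproduced, and the placeholder check simply rejects everything, as happened with the garbage serial.

```python
def check_serial(serial: str) -> int:
    """Stand-in for the routine at 00402adc (hypothetical body).

    The real check is not shown in the article; this placeholder
    rejects every input.
    """
    return 0


def on_begin(serial: str, check=check_serial) -> str:
    # The result of the check is kept in a local byte ([ebp-45h]).
    flag = check(serial) & 0xFF
    if flag == 0:
        # Zero means failure: this is the path that ends in MessageBox.
        return "MessageBox: incorrect serial"
    # Any nonzero value lets the installer move on.
    return "next screen"


print(on_begin("garbage"))                     # rejected by the check
print(on_begin("garbage", check=lambda s: 1))  # result forced to nonzero
```

The second call models what overriding the return value in a debugger achieves: the caller only looks at the flag, not at how it was computed.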
Great! Run the program again, setting a breakpoint right after the call to the checking function:
g 0041f051
.
As soon as we press "Begin" on the first screen, the debugger stops the program, and in the eax register we see the result of the serial check: zero.
We "fix" it to one:
r ax=1
, and resume execution:
g
Hooray! We see the second screen - the Royal Mail license agreement.

We click Next, Next, Next, and finally the program reports that the database has been unpacked successfully.
But... what on earth has it unpacked for us?!

Sneaky! It looks like the entered serial number is checked again somewhere.
How to find where?
Let's take the disassembler
dumppe, point it at our file, and search the listing for the string "INVALID KEY".
In my favorite FAR Manager, this is done with the command
edit:<dumppe /disasm SetupRM.exe
, and fans of other file managers will have to redirect the output to a file.

The string was found at 00481AE6, and it is referred to (Xref) at 00401CB2:

We see the check and
jz
at 00401CAD. There is no point in "correcting" the register value before the check this time, because the check runs many thousands of times, once for each unpacked row. So we need to patch the code itself.
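As an aside, a cross-reference like the one the disassembler reported can also be hunted down by hand. On x86, a reference to a string is typically encoded as a 32-bit little-endian immediate holding the string's address, so one can search the raw bytes for that pattern. The sketch below is Python, runs on a tiny synthetic buffer rather than on SetupRM.exe, and the helper name is hypothetical. On a real file the hits are file offsets that still have to be mapped to virtual addresses through the PE section table, and some hits may be coincidental data rather than real references.

```python
import struct


def find_imm32_refs(code: bytes, target_va: int) -> list[int]:
    """Return offsets in `code` where `target_va` appears as a
    little-endian 32-bit value (a candidate push/mov operand)."""
    needle = struct.pack("<I", target_va)
    hits = []
    pos = code.find(needle)
    while pos != -1:
        hits.append(pos)
        pos = code.find(needle, pos + 1)
    return hits


# Synthetic example: nop; push 0x00481AE6; nop
sample = b"\x90" + b"\x68" + struct.pack("<I", 0x00481AE6) + b"\x90"
print(find_imm32_refs(sample, 0x00481AE6))  # [2]
```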
Run the program for the third time, with the now-familiar moves:
g 0041f051
, “Begin”,
r ax=1
.
Then we fix the check: we replace the opcode
jz
(74) with
jmp short
(eb) using the command
eb 401cad eb
. Raymond Chen has a
handy selection of frequently used x86 opcodes.
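The same one-byte patch can be illustrated on a copy of the bytes in Python. This is a sketch only: the buffer and offset are made up for illustration, and it assumes the short form of the jump (opcode 74 followed by a one-byte displacement). Because `jmp short` (EB) has the same two-byte layout, the displacement byte stays as it is and only the opcode changes.

```python
JZ_SHORT = 0x74   # jz/je rel8
JMP_SHORT = 0xEB  # jmp rel8


def patch_jz_to_jmp(code: bytearray, offset: int) -> None:
    """Turn the conditional short jump at `offset` into an unconditional one."""
    if code[offset] != JZ_SHORT:
        raise ValueError(
            f"expected 74 at offset {offset:#x}, found {code[offset]:02x}"
        )
    code[offset] = JMP_SHORT


# Hypothetical bytes: test al,al ; jz +5
code = bytearray(b"\x84\xc0\x74\x05")
patch_jz_to_jmp(code, 2)
print(code.hex())  # 84c0eb05
```

Checking the original byte before overwriting it guards against patching the wrong address, which is easy to do when typing offsets by hand.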
We continue execution (
g
) and happily agree to all of Royal Mail's terms.

Here is our long-awaited data!

So, the minimal set of windbg commands needed:
Command | Mnemonic | Action |
~xs | set active thread | switch to the context of thread x |
k | stack | show the call stack |
u address l length | unassemble | show the code at the address |
g address | go | run until the address is reached |
r register = value | register | change a register's value |
eb address value | enter byte | change the byte at the address |
For lovers of "programming with the mouse", all of this has equivalents somewhere in the menus and toolbars; and, as you can see, these six commands are already enough to deal with programs of Royal Mail's caliber.
Reverse engineering is available to everyone! Out of sporting interest, I later repeated the exercise with the next release of the PAF, from 2012. It turned out that the protection there had been significantly strengthened: instead of two
je/jne
checks, as many as seven had to be patched!