Breaking down resource compression in Might and Magic III

I don’t remember very well how I ended up in the DOSBox debugger, and why I was picking at the 16-bit assembler, restoring the function of unpacking the MM3.CC resource files — but it was great. I got the game on one of the recent sales of the humble bundle, and then on the net I came across the Jeff Ludwig page , which described problems with modifying the game related to compression in MM3.CC. In particular, the following was written there:

It turned out that this algorithm is quite difficult to crack, and so far no one has learned how to unpack this data.

The call has been accepted. His article describes how he tried to fight the algorithm. I will sign it, as I did myself, and in the end I will give a link to an open source utility that can not only unpack, but also package the MM3.CC file.

Dos packer

Looking at MM3.EXE, I discovered that it is a compressed DOS executable file, with some uncompressed overlay, at the beginning of which stands FBOV. I didn’t know anything about DOS compressor, but I’ve seen from Jeff Ludwig that he is using a thing called “Universal Program Cracker” v1.11. I found version 1.10 (released on June 25, 1997) and unpacked the exe. And I even managed to correctly process the overlay data. Still, I wanted to know the name of the packer. I was prompted to use the Detect It Easy program, and indeed - it issued:

EXECUTRIX-COMPRESSOR(-)[by Knowledge Dynamics Corp] Borland TLINK(2.0)[-]

')
For history buffs, I can recommend the old thread of discussions concerning this software - from 1991 and 1995:

https://groups.google.com/forum/#!topic/comp.os.msdos.programmer/QsjHLY6Kb4s
https://groups.google.com/forum/#!topic/comp.compression/IAj2-VHbtl4

IDA DOS loader

Unpacking the exe is fine, but correctly disassembling it is even better. Unfortunately, IDA stumbled on it. He correctly defined the overlay, but could not load it. After reviewing the code, I realized that analyzing it without an overlay would result in a headache, since the code clearly missed areas (despite the fact that the unpacking procedure is stored in the exe file). While searching for FBOV in Google, I came across the source code of the IDA DOS loader, which confirmed that IDA should be able to download this overlay without any problems. I recompiled the debug version of IDA DOS loader and traced its work through Visual Studio to understand why it does not load the overlay. For this, I had to describe several internal parameters of the FBOV structure. The title is described as follows:

 #define FB_MAGIC 0x4246 #define OV_MAGIC 0x564F struct fbov_t { ushort fb; // = FB_MAGIC ushort ov; // = OV_MAGIC uint32 ovrsize; uint32 exeinfo; int32 segnum; };

exeinfo is the offset (absolute, from the beginning of the MZ header header) array of structures describing each segment stored in the overlay. segnum - the number of segments. They are described by the following structure:

 struct seginfo_t { ushort seg; ushort maxoff; ushort flags; ushort minoff; };

This is all in theory, and in IDA DOS loader, all this is implemented in the LoadCppOverlays () function. But with this exe theory ceases to work - though, mistaken only a few bytes. During the debug, I realized that exeinfo indicates the position immediately after the mentioned array of segments. I added one line in LoadCppOverlays ():

  fbov.exeinfo -= fbov.segnum*sizeof(seginfo_t);

And it all worked. I did not find any documentation for FBOV, so I don’t know for sure if there are several implementations of these overlays. I am sure that the IDA DOS loader was implemented to work with the correct version, because surely the person who wrote it checked it with live examples. Maybe it was a special feature of MM3 developers, who knows.

We are looking for unpacker

For the search, I used the DOSBox debugger and a set of breakpoints on int 21h (standard interrupt for working with the DOS API); I was especially interested in the 3Dh (open file), 3Fh (file reading) and 42h (file search) functions. And I quickly found what I was looking for.

Algorithm analysis

Now almost all the picking of packers / unpackers is done through Hex-Rays Decompiler, which does not present any particular difficulties. However, it does not work with assembler 16-bit x86, and it is unlikely when it will work. It’s as if I’ve come back to 2005, when I wrote my first static unpacker. It was IDA 4.9 time, then there was not even an interactive flowchart mode that appeared in March 2006. I mention this because for me this technology was breakthrough and dramatically accelerated the reverse engineering of algorithms. I offer you a graphical representation of the unpacker:

The purple blocks are the initialization of the algorithm, the brown ones are the main unpacking cycle, the white ones are working with the memory and the structure of the CC file. Removing an algorithm from an assembler usually takes several steps.

1. Data collection. The complexity depends on the complexity of the algorithm. It is necessary to collect information on input and output buffers, temporary buffers, variables (local and global) and constants that can be used in calculations. To simplify the work, you need to assign variables to the CPU registers whose names are similar to the register name (_ax, _bx, _cx, etc.). The type of the variable must match the size of the register - in our case it is uint16_t. In some cases, it is better to represent registers as unions (union) in order to simplify access to 8-bit parts of registers. Working with local and global variables is quite complicated, especially at first, since it is not always clear whether this is a normal type of variable that occupies 1,2,4 bytes, or is it an array. If there are a lot of accesses to memory addresses located close to each other, we can assume that this is an array, and later we can divide them into separate variables, if they are accessed, not as an array. This is especially useful when working with local variables, so all access to esp / ebp needs to be processed through arrays (for simplicity, let's call it _stack). In this phase it is very important to initialize all known data.

2. Search cycles - this is for me the most interesting. An interactive graphical representation in IDA is one of the best tools for this. It is convenient to start with the simplest internal cycles, and go upstairs. Each cycle can be painted in different colors and grouped. Cycle grouping simplifies graphical representation, and hiding internal loops helps to find the next level cycles, and so on.

3. Code rewriting is the most tedious part. Rewrite each opcode or opcode group into expressions in a higher level language, block by block. If all the cycles were correctly found, it is not very difficult to do - but, like any tedious work, this part is error prone. The remaining logic is conditional expressions that are easy to translate into high-level language. It is convenient to mark the processed blocks with a different color so as not to get lost in them.

4. Validation. In most cases, the first time you will earn nothing. Errors made in the first three stages are common and difficult to find. Fixing them usually involves running two debuggers at the same time — for the original code and for your version.

5. Decorating code. After everything has worked, it's good to go over the code and deal with all the variables, arrays and constants that have not been uniquely identified before. It is also good to give variables normal names and get rid of optional constructions that mimic assembler. Ideally, then there should be no variables named after the x86 registers or arrays that simulate a stack. Everything should look like a normal high level code.

In the case of Might and Magic III, I went through all the stages, and in the end I got a working unpacker. To avoid the mistakes of clause 4, I worked with the debugger of the original code, and checked each rewritten simple block to generate an identical assembler code. Walking through all the compressed MM3CC streams, I found that one of the branches of the algorithm was not involved, so for the time being I left it empty. The red blocks on the graph show the part that is not executed on the game files.

Then I started to google the name of the algorithm. I searched for the hexadecimal constants found in the code and the word “decompress”:

"0x13A" decompress
"0x4E6" decompress
“0x274” decompress
“0x139” decompress
"0xFC4" decompress <- success

I found the source code unpacker of some old format from Amiga . In addition to this constant, there were also table constants presented in MM3. It turned out that MM3 uses the LZHUF algorithm . Having learned this, I even more combed the code I received, and copied the missing parts of the algorithm (red blocks) from this source . The MM3 version is identical to the original LZHUF implementation with a few exceptions — instead of using the value 0x20 to initialize the dictionary, it uses the value obtained from the argument. The 8-bit value of all compressed streams in the MM3 .CC file is different. I guessed that this was in each case the most common byte in the data.

MM3.CC Packer / Unpacker

I decided to finish the titanic work with a normal utility that someone else can use. The CC file format is described in the Xeen Wiki , but this description only works for CC files from Might and Magic IV and V. And for MM3.CC, the file structure is similar, but file name hashing and compression are different. The file header and table of contents are exactly the same as described in the Xeen Wiki:

 struct FileEntry; struct FileHeader { uint16_t NumberOfFileEntries FileEntry FileEntries[NumberOfFileEntries]; }; struct FileEntry { uint16_t hash; uint16_t offsetLo; uint8_t offsetHi; uint16_t compressedSize; // includes 4 bytes header uint8_t padding; };

The FileEntries array is encrypted using the algorithm below (the same as in the Xeen Wiki):

 void encryptHeader(uint8_t* buf, size_t size) { uint8_t key = 0xAC; for (size_t i = 0; i < size; i++) { buf[i] = _rotr8(buf[i] - key, 2); key += 0x67; } } void decryptHeader(uint8_t* buf, size_t size) { uint8_t key = 0xAC; for (size_t i = 0; i < size; i++) { buf[i] = _rotl8(buf[i], 2) + key; key += 0x67; } }

Files in the CC container are identified by a 16-bit hash (FileEntry.hash):

 uint16_t hashFileName(const char* fileName) { uint16_t hash = 0; while (0 != *fileName) { uint8_t c = ((*fileName & 0x7F) < 0x60) ? *fileName : *fileName - 0x20; hash = _rotl16(hash, 9); // xchg bl, bh | rol bx, 1 hash += c; fileName++; } return hash; }

The first two files are special. These are uncompressed texts, and their size is directly recorded in the exe file, so it is better not to change them. All the others are compressed data blocks with a small descriptor at the beginning of each block:

 { uint16_t decompressionInitializer; uint16_t decompressedSize; }

The decompressionInitializer could be uint8_t, since it always stores an 8-bit value in the upper and lower 8 bits. I do not know why it is stored that way. decompressedSize is stored in the big-endian value, which is also strange. Another oddity is that after re-compression with the help of my utility, the MM3.CC file has decreased by 33 Kb. I also prepared a list of file names compiled from MM3.EXE so that the correct file names were obtained when unpacking (the list is incomplete - I missed 15 of the 556 file names). That's almost all - I provide a link to the github repository, where the MM3CC file packer / unpacker is located

github.com/rwfpl/rewolf-mm3-dumper

Compression is taken from standard LZHUF with minor changes, unpacking is the result of three-day reverse engineering. Squeezer and uncleaner do not check buffers, therefore it is not recommended to use them for serious things. Using them is easy:

 x:\mm3>mm3_cc_dumper.exe Might and Magic III CC file packer/unpacker v1.0 Copyrigh (c) 2015 ReWolf http://blog.rewolf.pl Usage: Unpack: mm3_cc_dumper.exe dump input_file.cc Pack: mm3_cc_dumper.exe pack input_directory output_file.cc

Source: https://habr.com/ru/post/264947/

All Articles