📜 ⬆️ ⬇️

Reverse Engineering Star Wars: Yoda Stories

image

[Approx. Lane: Zach Barth from Zachtronics not only writes great puzzle video games (SpaceChem, TIS-100, SHENZHEN I / O) and upgrades electronic smash machines , but also enjoys the backward development of old game resources. This article is dedicated to the successful cracking of his childhood favorite game.]

Foreword


I do not know why, but I was always attracted by the reverse-engineering of data files of computer games. Decompiling a game’s code is a difficult task, and analyzing data files is often much easier (because they contain a lot of well-visible content, such as text and sprites) and allows you to modify the game if you understand the files well enough.

In 2006, a few days after the release of the Star Wars: Empire at War demo, I published a couple of rudimentary tools that allowed you to dump and repack the game data files, including a simple mod that allowed you to play as the Empire (this feature was disabled in the demo). There are almost no traces of my work on the network, but I managed to get a free T-shirt for the game developer Petroglyph, and this is already something.
')
Many years before that, I had another Star Wars game called Star Wars: Yoda Stories . She was quite confused and got bad reviews, but that didn’t stop me. As a novice programmer and a staunch Star Wars fan, I tried to find game resources to make my own terrible Star Wars game. However, I was able to detect only sound effects and a small number of sprites that were distributed as theme icons for the desktop.


Fast forward to sixteen years: I study my ancient collection of CDs in search of old games that can be played at a computer party in the 1990s. After inserting the CD into the drive, I immediately realized that I had found a data file of about four megabytes in size, which was waiting for the knowledge I had received at the college for hacking. Better later than never!

File structure (complexity: padawan)


I think that a hex editor will be enough for reverse engineering of something like this, but later we will use a decompiler, a calculator and knowledge of the corresponding programs. I'm a big fan of HxD , so I’ll use this editor.

If you want to experiment yourself, here is a link to the game data file: YODESK.DTA

It's time to open this file!


So this is definitely not a text and is not meant to be read by humans, but this does not surprise us at all. I am sure that sixteen years ago I opened the same file in Notepad and immediately closed it, even not having understood what I saw.

First, we see four-letter ASCII characters that seem to me similar to section identifiers. Down below, we see this confirmation: SNDS, TILE, CHAR, PUZ2 and more. The file ends with an ENDF section, and this implies that the overall structure of the file is some sort of hierarchy or list of sections with labels.

The VERS ID obviously starts the “version” section, which contains the following four bytes: 0x00, 0x02, 0x00, 0x00. I suspect that this is version 2.0 of the file format, because Yoda Stories came out after the game about Indiana Jones, in which, it seems, the same engine was used. However, this is not particularly important, because this piece of data does not interest us very much.

Next comes the STUP (setup?) Section, which contains a lot of mysterious data:


Obviously, there is a pattern here, but without detailed knowledge of the game, we cannot understand what it is for. The following question seems to me more important: how to skip it? Although it can be assumed that this section has a fixed length, in fact it is not.

If we look again at the beginning of the section (previous screenshot), we will see that four suspicious bytes are following the STUP identifier: 0x00, 0x44, 0x01, 0x00. If we check the remaining data in this section after these four bytes, we will see that they are exactly 0x00014400 bytes long. Coincidence? I do not think!

Obviously, these four bytes are an unsigned 32-bit integer indicating the amount of data that makes up the remainder of the STUP section. If it seems to you that the bytes are written in reverse order, then you are right: they are stored in order from low to high (little-endian) , in which the least significant byte is stored first - a common practice for x86 and x86-64 processors. If we count the length value, we can skip the rest of the section, without knowing anything about the data stored in it.

Reading binary files manually, even as small as 4 MB, is not a very productive way of working, so it’s time to start working with a program that parses a file and / or dumps data dumps in the process of figuring out how they are encoded. I prefer the C # programming language, so I will use it; assuming that the file format is not completely confused, I can create a fairly detailed BinaryReader class and start working. Here is the program that we have prepared so far:

static void Main(string[] args) { using (BinaryReader binaryReader = new BinaryReader(File.OpenRead("YODESK.DTA"))) { bool keepReading = true; while (keepReading) { string section = new string(binaryReader.ReadChars(4)); switch (section) { case "VERS": uint version = binaryReader.ReadUInt32(); break; case "STUP": uint length = binaryReader.ReadUInt32(); byte[] data = binaryReader.ReadBytes((int)length); break; default: throw new Exception("Unknown section: " + section); } } } } 

Unfortunately, it will work only for the first two sections: as soon as we reach the third section (SNDS), it becomes obvious that we will have to handle all the options that are in the file. This happens quite often with reverse-engineering file formats, because they have many different meanings that have one of many types; so we need to understand every type we can meet. Fortunately, in almost all sections of this file, the section identifier is followed by its unsigned 32-bit length, that is, we can use the STUP section processing code to skip them.

 static void Main(string[] args) { using (BinaryReader binaryReader = new BinaryReader(File.OpenRead("YODESK.DTA"))) { bool keepReading = true; while (keepReading) { string section = new string(binaryReader.ReadChars(4)); switch (section) { case "VERS": uint version = binaryReader.ReadUInt32(); break; case "STUP": case "SNDS": case "ZONE": case "TILE": case "PUZ2": case "CHAR": case "CHWP": case "CAUX": case "TNAM": uint sectionLength = binaryReader.ReadUInt32(); byte[] sectionData = binaryReader.ReadBytes((int)sectionLength); break; case "ENDF": keepReading = false; break; } } } } 



It seems that in the ZONE section, its length is indicated immediately after the identifier (0x00010292), but if we jump to this number of bytes, we will find ourselves in a void, that is, most likely, we are doing something wrong. However, it is obvious that immediately after the start there are other identifiers: IZON, IZAX, IZX2, IZX3, IZX4 and from zero to several IACT after them. This set of identifiers is repeated one more time below, and then it appears in the file several more times. Here we can use the knowledge of how the game works: one of the important “features” of Yoda Stories is that with each new passage a random map is generated, but after several games it becomes obvious that the cards are actually made up of smaller cards size. The fact that IACT sections contain text that is similar to scripts confirms the idea that each IZON actually corresponds to a map fragment:


If we cannot skip the ZONE section in a trivial way, then we will have to learn more about its structure so that we can analyze it and continue parsing the rest of the document. As a result, I managed to do this, despite the fact that this part was much more sophisticated than the previous ones, because next to the section identifiers there was added data that had nothing to do with the length.

There are many sections of IZON, so it is obvious that there is some data that allows you to move from one to another; however, it is possible that the program simply scans the data until it finds the next line “IZON”, although this is a very unreliable approach: who knows, maybe someone will add “IZON” sometime in the NPC dialog line! Measuring the length from the beginning of the first IZON to the next IZON, we get the length 0x064C, which is suspiciously close to four bytes after four bytes following the identifier ZONE (0x00000646). If we take these four bytes as a length identifier and look at the next bytes 0x646, we begin to notice the pattern:


4 : "ZONE"
2 : ????
2 : ????
4 : (X)
X : zone
2 : ????
4 : (X)
X : zone
2 : ????
4 : (X)
X : zone
...
2 : ????
4 : (X)
X : zone
4 : "PUZ2" ( )


Further study revealed the facts of varying degrees of utility:


If we update the program to read the ZONE section on the basis of this information, we can read the entire file.

 static void Main(string[] args) { using (BinaryReader binaryReader = new BinaryReader(File.OpenRead("YODESK.DTA"))) { bool keepReading = true; while (keepReading) { string section = new string(binaryReader.ReadChars(4)); switch (section) { case "VERS": uint version = binaryReader.ReadUInt32(); break; case "STUP": case "SNDS": case "TILE": case "PUZ2": case "CHAR": case "CHWP": case "CAUX": case "TNAM": uint sectionLength = binaryReader.ReadUInt32(); byte[] sectionData = binaryReader.ReadBytes((int)sectionLength); break; case "ZONE": ushort count = binaryReader.ReadUInt16(); for (int i = 0; i < count; i++) { ushort unknown = binaryReader.ReadUInt16(); uint zoneLength = binaryReader.ReadUInt32(); byte[] zoneData = binaryReader.ReadBytes((int)zoneLength); } break; case "ENDF": keepReading = false; break; default: throw new Exception("Unknown section: " + section); } } } } 

With this code, we can read the entire file without encountering any exceptions that would indicate that we made wrong assumptions about the overall structure of the file format. And although we have not yet learned anything from it, we are already on the right track.

Tiles (difficulty: Jedi Knight)


Now that we can parse the entire file structure, it is time to dive deeper and start dumping what we came here for: a strange but exhaustive set of game tiles.


If we go down to the middle of the TILE section, it will immediately become obvious to us that there is something interesting here: raster images! Game images are 32x32 pixel sprites, so if similar patterns start appearing in a 32-byte-wide column, there is a high probability that this format is one-byte per pixel uncompressed bitmap.


If we move to the beginning of the TILE section, then it is easy to notice the 1024-byte (32x32) chunk of data corresponding to the ASCII character space, presumably being sprite data with a prefix of four bytes. If we examine the interval of values ​​of these four bytes in the entire data array, then we note that they are relatively random, but they often repeat and look like an array of bit flags.

4 :
1024 :
4 :
1024 :
...
4 :
1024 :


Perhaps the bit flags describe the properties of the tiles? What's the difference! We are already close to extracting images! It remains for us to expand the functionality of our program to dump the intended image data into image files and ...

 case "TILE": { Directory.CreateDirectory(@"Tiles"); uint tileSectionLength = binaryReader.ReadUInt32(); for (int i = 0; i < tileSectionLength / 0x404; i++) { uint unknown = binaryReader.ReadUInt32(); Bitmap tile = new Bitmap(32, 32); for (int j = 0; j < 0x400; j++) { byte pixelData = binaryReader.ReadByte(); Color pixelColor = Color.FromArgb(pixelData, pixelData, pixelData); tile.SetPixel(j % 32, j / 32, pixelColor); } tile.Save(string.Format(@"Tiles\{0}.png", i)); } break; } 



Victory! Or something like that ...

Obviously, the game has color, so simply assigning a pixel value to each RGB component will not solve the problem; we will never get anything but grayscale. Wikipedia tells us that the standard 8-bit true color color scheme is 0bRRRGGGBB, so let's try it:

 case "TILE": { Directory.CreateDirectory(@"Tiles"); uint tileSectionLength = binaryReader.ReadUInt32(); for (int i = 0; i < tileSectionLength / 0x404; i++) { uint unknown = binaryReader.ReadUInt32(); Bitmap tile = new Bitmap(32, 32); for (int j = 0; j < 0x400; j++) { byte pixelData = binaryReader.ReadByte(); byte r = (byte)((pixelData & 0xE0) << 0); byte g = (byte)((pixelData & 0x1C) << 3); byte b = (byte)((pixelData & 0x03) << 6); Color pixelColor = Color.FromArgb(r, g, b); tile.SetPixel(j % 32, j / 32, pixelColor); } tile.Save(string.Format(@"Tiles\{0}.png", i)); } break; } 


Obviously, this is also the wrong pixel format, so enough to guess, let's start analyzing the numbers themselves. Make a screenshot of the game and compare the values ​​corresponding to the colors in the final sprite:


These shades of blue are very interesting: they are very close, but they have some different shades. If we take the pixel values ​​from the image processing program, we get the following values:

0x1F 0x00 / 0x00 / 0x00
0x92 0x0B / 0x53 / 0xFB
0x93 0x00 / 0x00 / 0xFB
0x94 0x00 / 0x00 / 0xCB
0x95 0x00 / 0x00 / 0x9F
0x96 0x00 / 0x00 / 0x6F


I was too optimistic that there was some kind of bit conversion from a pixel value to a color component, but the fact that 0x1F somehow corresponds to black (all zeros) indicates my error. If the decrease in the pixel value from 0x93 to 0x92 magically adds two bytes of information to color, then it is time to recognize the inevitable: the color palette is stored somewhere, and we have no idea where it is.

Of course, we can take a bunch of screenshots and automate the process of matching pixel values, but there should be a more optimal way; data palettes must be stored somewhere. If I programmed this system, I would probably just store the array of colors, in which the array index is the value of the pixel data. For fun, let's look in the file for one of the colors used in the R2D2 sprite, namely, 0x0B53FB ...

Nah, nothing.

Another popular pixel format - BGR, maybe 0xFB530B?

Also empty.

However, we did not search in another place: in the executable file of the game. Usually a lot of gameplay information is stored as constants in the code, perhaps this is the case in our case. Searching 0x0B53FB in a binary file does not give results, but if we look for the BGR value ...


My God, this is a color palette! The next four bytes have the value 0xFB000000, that is, the color 0x93 in BGRA, so there is a high probability that we stumbled upon something useful. If we look for the data in the disassembler, we can understand a little more:


This image is hard to understand, but this is our palette data, right where we expect to find an array of compile time constants in the decompiled program. Knowing that this color is 0x92, we can start work from the end and find the beginning of the palette to copy it into our program for exporting images with the correct colors:

 private static readonly byte[] PaletteData = new byte[] { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xFF, 0xFF, 0x8B, 0x00, 0xC3, 0xCF, 0x4B, 0x00, 0x8B, 0xA3, 0x1B, 0x00, 0x57, 0x77, 0x00, 0x00, 0x8B, 0xA3, 0x1B, 0x00, 0xC3, 0xCF, 0x4B, 0x00, 0xFB, 0xFB, 0xFB, 0x00, 0xEB, 0xE7, 0xE7, 0x00, 0xDB, 0xD3, 0xD3, 0x00, 0xCB, 0xC3, 0xC3, 0x00, 0xBB, 0xB3, 0xB3, 0x00, 0xAB, 0xA3, 0xA3, 0x00, 0x9B, 0x8F, 0x8F, 0x00, 0x8B, 0x7F, 0x7F, 0x00, 0x7B, 0x6F, 0x6F, 0x00, 0x67, 0x5B, 0x5B, 0x00, 0x57, 0x4B, 0x4B, 0x00, 0x47, 0x3B, 0x3B, 0x00, 0x33, 0x2B, 0x2B, 0x00, 0x23, 0x1B, 0x1B, 0x00, 0x13, 0x0F, 0x0F, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xC7, 0x43, 0x00, 0x00, 0xB7, 0x43, 0x00, 0x00, 0xAB, 0x3F, 0x00, 0x00, 0x9F, 0x3F, 0x00, 0x00, 0x93, 0x3F, 0x00, 0x00, 0x87, 0x3B, 0x00, 0x00, 0x7B, 0x37, 0x00, 0x00, 0x6F, 0x33, 0x00, 0x00, 0x63, 0x33, 0x00, 0x00, 0x53, 0x2B, 0x00, 0x00, 0x47, 0x27, 0x00, 0x00, 0x3B, 0x23, 0x00, 0x00, 0x2F, 0x1B, 0x00, 0x00, 0x23, 0x13, 0x00, 0x00, 0x17, 0x0F, 0x00, 0x00, 0x0B, 0x07, 0x00, 0x4B, 0x7B, 0xBB, 0x00, 0x43, 0x73, 0xB3, 0x00, 0x43, 0x6B, 0xAB, 0x00, 0x3B, 0x63, 0xA3, 0x00, 0x3B, 0x63, 0x9B, 0x00, 0x33, 0x5B, 0x93, 0x00, 0x33, 0x5B, 0x8B, 0x00, 0x2B, 0x53, 0x83, 0x00, 0x2B, 0x4B, 0x73, 0x00, 0x23, 0x4B, 0x6B, 0x00, 0x23, 0x43, 0x5F, 0x00, 0x1B, 0x3B, 0x53, 0x00, 0x1B, 0x37, 0x47, 0x00, 0x1B, 0x33, 0x43, 0x00, 0x13, 0x2B, 0x3B, 0x00, 0x0B, 0x23, 0x2B, 0x00, 0xD7, 0xFF, 0xFF, 0x00, 0xBB, 0xEF, 0xEF, 0x00, 0xA3, 0xDF, 0xDF, 0x00, 0x8B, 0xCF, 0xCF, 0x00, 0x77, 0xC3, 0xC3, 0x00, 0x63, 0xB3, 0xB3, 0x00, 0x53, 0xA3, 0xA3, 0x00, 0x43, 0x93, 0x93, 0x00, 0x33, 0x87, 0x87, 0x00, 0x27, 0x77, 0x77, 0x00, 0x1B, 0x67, 0x67, 0x00, 0x13, 0x5B, 0x5B, 0x00, 0x0B, 0x4B, 0x4B, 0x00, 0x07, 0x3B, 0x3B, 0x00, 0x00, 0x2B, 0x2B, 0x00, 0x00, 0x1F, 0x1F, 0x00, 0xDB, 0xEB, 0xFB, 0x00, 0xD3, 0xE3, 0xFB, 0x00, 0xC3, 0xDB, 0xFB, 0x00, 0xBB, 0xD3, 0xFB, 0x00, 0xB3, 0xCB, 0xFB, 0x00, 0xA3, 0xC3, 0xFB, 0x00, 0x9B, 0xBB, 0xFB, 0x00, 0x8F, 0xB7, 0xFB, 0x00, 0x83, 0xB3, 0xF7, 0x00, 0x73, 0xA7, 0xFB, 0x00, 0x63, 0x9B, 0xFB, 0x00, 0x5B, 0x93, 0xF3, 0x00, 0x5B, 0x8B, 0xEB, 0x00, 0x53, 0x8B, 0xDB, 0x00, 0x53, 0x83, 0xD3, 0x00, 0x4B, 0x7B, 0xCB, 0x00, 0x9B, 0xC7, 0xFF, 0x00, 0x8F, 0xB7, 0xF7, 0x00, 0x87, 0xB3, 0xEF, 0x00, 0x7F, 0xA7, 0xF3, 0x00, 0x73, 0x9F, 0xEF, 0x00, 0x53, 0x83, 0xCF, 0x00, 0x3B, 0x6B, 0xB3, 0x00, 0x2F, 0x5B, 0xA3, 0x00, 0x23, 0x4F, 0x93, 0x00, 0x1B, 0x43, 0x83, 0x00, 0x13, 0x3B, 0x77, 0x00, 0x0B, 0x2F, 0x67, 0x00, 0x07, 0x27, 0x57, 0x00, 0x00, 0x1B, 0x47, 0x00, 0x00, 0x13, 0x37, 0x00, 0x00, 0x0F, 0x2B, 0x00, 0xFB, 0xFB, 0xE7, 0x00, 0xF3, 0xF3, 0xD3, 0x00, 0xEB, 0xE7, 0xC7, 0x00, 0xE3, 0xDF, 0xB7, 0x00, 0xDB, 0xD7, 0xA7, 0x00, 0xD3, 0xCF, 0x97, 0x00, 0xCB, 0xC7, 0x8B, 0x00, 0xC3, 0xBB, 0x7F, 0x00, 0xBB, 0xB3, 0x73, 0x00, 0xAF, 0xA7, 0x63, 0x00, 0x9B, 0x93, 0x47, 0x00, 0x87, 0x7B, 0x33, 0x00, 0x6F, 0x67, 0x1F, 0x00, 0x5B, 0x53, 0x0F, 0x00, 0x47, 0x43, 0x00, 0x00, 0x37, 0x33, 0x00, 0x00, 0xFF, 0xF7, 0xF7, 0x00, 0xEF, 0xDF, 0xDF, 0x00, 0xDF, 0xC7, 0xC7, 0x00, 0xCF, 0xB3, 0xB3, 0x00, 0xBF, 0x9F, 0x9F, 0x00, 0xB3, 0x8B, 0x8B, 0x00, 0xA3, 0x7B, 0x7B, 0x00, 0x93, 0x6B, 0x6B, 0x00, 0x83, 0x57, 0x57, 0x00, 0x73, 0x4B, 0x4B, 0x00, 0x67, 0x3B, 0x3B, 0x00, 0x57, 0x2F, 0x2F, 0x00, 0x47, 0x27, 0x27, 0x00, 0x37, 0x1B, 0x1B, 0x00, 0x27, 0x13, 0x13, 0x00, 0x1B, 0x0B, 0x0B, 0x00, 0xF7, 0xB3, 0x37, 0x00, 0xE7, 0x93, 0x07, 0x00, 0xFB, 0x53, 0x0B, 0x00, 0xFB, 0x00, 0x00, 0x00, 0xCB, 0x00, 0x00, 0x00, 0x9F, 0x00, 0x00, 0x00, 0x6F, 0x00, 0x00, 0x00, 0x43, 0x00, 0x00, 0x00, 0xBF, 0xBB, 0xFB, 0x00, 0x8F, 0x8B, 0xFB, 0x00, 0x5F, 0x5B, 0xFB, 0x00, 0x93, 0xBB, 0xFF, 0x00, 0x5F, 0x97, 0xF7, 0x00, 0x3B, 0x7B, 0xEF, 0x00, 0x23, 0x63, 0xC3, 0x00, 0x13, 0x53, 0xB3, 0x00, 0x00, 0x00, 0xFF, 0x00, 0x00, 0x00, 0xEF, 0x00, 0x00, 0x00, 0xE3, 0x00, 0x00, 0x00, 0xD3, 0x00, 0x00, 0x00, 0xC3, 0x00, 0x00, 0x00, 0xB7, 0x00, 0x00, 0x00, 0xA7, 0x00, 0x00, 0x00, 0x9B, 0x00, 0x00, 0x00, 0x8B, 0x00, 0x00, 0x00, 0x7F, 0x00, 0x00, 0x00, 0x6F, 0x00, 0x00, 0x00, 0x63, 0x00, 0x00, 0x00, 0x53, 0x00, 0x00, 0x00, 0x47, 0x00, 0x00, 0x00, 0x37, 0x00, 0x00, 0x00, 0x2B, 0x00, 0x00, 0xFF, 0xFF, 0x00, 0x00, 0xE3, 0xF7, 0x00, 0x00, 0xCF, 0xF3, 0x00, 0x00, 0xB7, 0xEF, 0x00, 0x00, 0xA3, 0xEB, 0x00, 0x00, 0x8B, 0xE7, 0x00, 0x00, 0x77, 0xDF, 0x00, 0x00, 0x63, 0xDB, 0x00, 0x00, 0x4F, 0xD7, 0x00, 0x00, 0x3F, 0xD3, 0x00, 0x00, 0x2F, 0xCF, 0x00, 0x97, 0xFF, 0xFF, 0x00, 0x83, 0xDF, 0xEF, 0x00, 0x73, 0xC3, 0xDF, 0x00, 0x5F, 0xA7, 0xCF, 0x00, 0x53, 0x8B, 0xC3, 0x00, 0x2B, 0x2B, 0x00, 0x00, 0x23, 0x23, 0x00, 0x00, 0x1B, 0x1B, 0x00, 0x00, 0x13, 0x13, 0x00, 0x00, 0xFF, 0x0B, 0x00, 0x00, 0xFF, 0x00, 0x4B, 0x00, 0xFF, 0x00, 0xA3, 0x00, 0xFF, 0x00, 0xFF, 0x00, 0x00, 0xFF, 0x00, 0x00, 0x00, 0x4B, 0x00, 0x00, 0xFF, 0xFF, 0x00, 0x00, 0xFF, 0x33, 0x2F, 0x00, 0x00, 0x00, 0xFF, 0x00, 0x00, 0x1F, 0x97, 0x00, 0xDF, 0x00, 0xFF, 0x00, 0x73, 0x00, 0x77, 0x00, 0x6B, 0x7B, 0xC3, 0x00, 0x57, 0x57, 0xAB, 0x00, 0x57, 0x47, 0x93, 0x00, 0x53, 0x37, 0x7F, 0x00, 0x4F, 0x27, 0x67, 0x00, 0x47, 0x1B, 0x4F, 0x00, 0x3B, 0x13, 0x3B, 0x00, 0x27, 0x77, 0x77, 0x00, 0x23, 0x73, 0x73, 0x00, 0x1F, 0x6F, 0x6F, 0x00, 0x1B, 0x6B, 0x6B, 0x00, 0x1B, 0x67, 0x67, 0x00, 0x1B, 0x6B, 0x6B, 0x00, 0x1F, 0x6F, 0x6F, 0x00, 0x23, 0x73, 0x73, 0x00, 0x27, 0x77, 0x77, 0x00, 0xFF, 0xFF, 0xEF, 0x00, 0xF7, 0xF7, 0xDB, 0x00, 0xF3, 0xEF, 0xCB, 0x00, 0xEF, 0xEB, 0xBB, 0x00, 0xF3, 0xEF, 0xCB, 0x00, 0xE7, 0x93, 0x07, 0x00, 0xE7, 0x97, 0x0F, 0x00, 0xEB, 0x9F, 0x17, 0x00, 0xEF, 0xA3, 0x23, 0x00, 0xF3, 0xAB, 0x2B, 0x00, 0xF7, 0xB3, 0x37, 0x00, 0xEF, 0xA7, 0x27, 0x00, 0xEB, 0x9F, 0x1B, 0x00, 0xE7, 0x97, 0x0F, 0x00, 0x0B, 0xCB, 0xFB, 0x00, 0x0B, 0xA3, 0xFB, 0x00, 0x0B, 0x73, 0xFB, 0x00, 0x0B, 0x4B, 0xFB, 0x00, 0x0B, 0x23, 0xFB, 0x00, 0x0B, 0x73, 0xFB, 0x00, 0x00, 0x13, 0x93, 0x00, 0x00, 0x0B, 0xD3, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xFF, 0xFF, 0xFF, 0x00, }; case "TILE": { Directory.CreateDirectory(@"Tiles"); uint tileSectionLength = binaryReader.ReadUInt32(); for (int i = 0; i < tileSectionLength / 0x404; i++) { uint unknown = binaryReader.ReadUInt32(); Bitmap tile = new Bitmap(32, 32); for (int j = 0; j < 0x400; j++) { byte pixelData = binaryReader.ReadByte(); byte r = PaletteData[pixelData * 4 + 2]; byte g = PaletteData[pixelData * 4 + 1]; byte b = PaletteData[pixelData * 4 + 0]; Color pixelColor = pixelData == 0 ? Color.Transparent : Color.FromArgb(r, g, b); tile.SetPixel(j % 32, j / 32, pixelColor); } tile.Save(string.Format(@"Tiles\{0}.png", i)); } break; } 


It worked! We have extracted more than two thousand strange, but charming sprites. It is worth noting that the color 0x00 is actually transparent, and this is quite obvious if you look at the raw pixel data.

Although it is much more boring than dumping images, it’s not so difficult to decipher the flags if we export them as part of the image file name and look at them as a whole. The logic and close study of small images and the corresponding numbers gives us the following:

:
bit0 =
bit1 = ,
bit2 = ,
bit3 = , /
bit4 = , ( , )
bit5 = -
bit6 =
bit7 =
bit8 =

:
bit16 =
bit17 = (????)
bit18 =
bit19 =

:
bit16 = -
bit17 =
bit18 = (???)
bit19 = ( ???)
bit20 =
bit22 =

:
bit16 =
bit17 =
bit18 =

:
bit16 =

-:
bit17 = - ()
bit18 = - (, )
bit19 = - (, )
bit20 = - (, )
bit21 = - (, )
bit22 = - ( , )
bit23 = - ( , )
bit24 = - ( , )
bit25 = - ( , )
bit26 = - ( , )
bit27 = - ( , )
bit28 = - ( , )
bit29 = - ( , )
bit30 = - ()


It is noteworthy that bits from 16 and above are used with different values ​​depending on bits 0-8, and a little confusing at first, if we expect that a particular bit should have only one value.

Maps (difficulty: Jedi Knight)


Although we have already achieved what we were striving for, the zone maps also look like an interesting object for research. If we return to how the zones are described in the hexadecimal editor, we will see that there are several subsections in the zone records. To analyze the zone, and not skip them, you need to learn more about each of them:


It is difficult to see this without self-loading the file into the hex editor, but the lengths of the IZON subsections are rather inconsistent. The first four bytes after IZON are similar to the length, but if we track them, then in fact they will drop us either too close or beyond the identifier of the subsection IZAX:


Very often this process is like a game in a complex puzzle (you can believe me, I write them), but in the end it hit me: there is a small header that determines, among other things, the size of the map (9x9 or 18x18), then there are 6 bytes on the square of the grid, then a 16-bit integer, which determines the number of 12-byte structures following it.

Other subsections are as confusing: IZAX determines its length plus six, the same with IZX2 and IZX3; IZX4 has a constant length, and the frequently occurring IACT directly indicates its length. Connect all together and get the following:

4 : "IZON"
4 :
2 : (W)
2 : (H)
1 : ( )
5 : ( )
1 : (0x01 = , 0x02 = , 0x03 = , 0x05 = )
1 : ( )
W*H*6 :
2 : (X)
X*12 :
4 : "IZAX"
2 : (X)
X-6 : IZAX
4 : "IZX2"
2 : (X)
X-6 : IZX2
4 : "IZX3"
2 : (X)
X-6 : IZX3
4 : "IZX4"
8 : IZX4
4 : "IACT"
4 : (X)
X :
4 : "IACT"
4 : (X)
X :
...
4 : "IACT"
4 : (X)
X :


We know that there are six bytes per grid square, that is, there is a high probability that they correspond to what is written in the grid square when the map is loaded. Just looking at the first square of the first card (0x00, 0x00, 0xFF, 0xFF, 0x01, 0x00), I was almost sure that these are three 16-bit integers, each of which represents a tile placed in this place (because Tiles are much more than 256). Expanding the functionality of the program to connect the extracted files based on this information, we get the following:


Cards! Looks like our guess was right. It turns out that there are more than 600 cards in the game, and each has its own script and scheme. It's amazing how much effort was put into this seemingly simple game.



Also of interest is the length of the variable “object information” at the end of the IZON subsection. It seems that it defines additional properties of tiles located on the map. Examining the data as one giant fragment in a hexadecimal editor does not help much, but if we tabulate them as 12-byte entries and compare them with the map image, some patterns begin to appear:



N/AXYN/A
09 00 00 00 04 00 04 00 01 00 01 00
09 00 00 00 09 00 09 00 01 00 05 00
09 00 00 00 0F 00 09 00 01 00 09 00
0F 00 00 00 09 00 0D 00 01 00 5D 00
06 00 00 00 0A 00 03 00 01 00 AE 04
06 00 00 00 0A 00 04 00 01 00 AE 04


If we process each record as six 16-bit integers, the first is a type flag, and the third and fourth are map coordinates. The second and fifth constantly have the same values ​​(0x0000 and 0x0001), and the latter is very messy, which makes it look like an argument. When we start looking at which tiles these records indicate, then each entry of type 0x09 indicates a door, where the argument is the ID of the card that the door leads to in the game.

Figuring out what the different types of records correspond to is a rather monotonous task, but by teaching our program to pause interaction when it finds an undeciphered record type, we can systematically study each type of data in the data file. A bit of logic and guesswork, and we get the following:

Type idDescriptionArgument
0x00trigger location
0x01spawn point
0x02place associated with the Force
0x03transport to an additional cardthe card to which the player is transported
0x04transport to main cardthe card to which the player is transported
0x05object that gives the player a locator
0x06item boxinside object
0x07Puzzle NPCTile ID of the corresponding character
0x08weapon boxinternal weapon
0x09door (at the entrance)the card with which the door is connected
0x0Adoor (exit)
0x0Bnot used
0x0Cblocking
0x0Dteleport
0x0EX-Wing from Dagobathe card on which the X-Wing transfers the player
0x0FX-Wing to the planet Dagobathe card on which the X-Wing transfers the player

Scripts (complexity: Jedi Master)


Having dealt with the format of tiles and maps, it would be logical to begin to understand the scripting language that controls the gameplay logic of specific maps. It is filled with ASCII characters, there are fragments in it that resemble commands, so it probably won't be too difficult ... right?


Unfortunately, a lot happens in the scripts that are hard to decipher. Unlike tiles and maps that are visually obvious after they are decrypted, scripts use variables and gameplay states that are not so easy to see and check.If I strongly sought to reverse-develop this game, I would assume that you can load levels and match the observed gameplay behavior with the data in the scripts, but I'm not ready for that. Before I finish, I want to try to do something with what we have learned ...

Creating a mod


The IZON subsection for map number 95 begins at offset 0x266512 in the data file. We can skip 20 bytes to get to the map data, and then skip another 6 * (18 * 9 + 12) bytes to go to the point (12, 11). If we then use the hex editor to reset the values ​​of tiles 1985 and 1986 to the "average" value of the square of each grid and overwrite the game data file, we get the following:


I don’t quite understand why there is a sports car in the game's set of tiles, but I’m sure that Yoda would ride him on dates with the best girls.

ADDITION: A sehugg user with Hacker News reported that this is actually a car from the 1978 movie “ Summer in Search of Corvette ” , in which Mark Hamill (played by Luke Skywalker) was filmed. That's how Easter eggs!

Finally


Obviously, with patience, much more can be done with this data; after reverse engineering most of the game data file, it will surprisingly simply re-create from scratch (it now works in a terribly small window) or write tools for exporting, editing and repacking the contents of the game to create mods. I don’t know why you might need it, but there is definitely such an opportunity. At a minimum, I now have access to a cool and comprehensive set of strange-looking Star Wars sprites. I hope you learned something interesting about reverse engineering of data file formats of old computer games!

Source: https://habr.com/ru/post/346230/


All Articles