
Xenoblade Chronicles - parsing game data


Hello! My name is Artem; online I am better known by the silly nickname TTEMMA, but that is beside the point. I am one of the founders of the amateur translation group Russian Studio Video 7 and the only ROM-hacking programmer on the team.

My team and I were the first to give Resident Evil fans translations of two iconic Nintendo GameCube games, Resident Evil Remake and Resident Evil Zero. Someday I will tell you how we did it, but in this article I would like to talk about a gorgeous game, Xenoblade Chronicles for the Nintendo Wii, and about how the ROM hacking of this game went. Everything in this game is done in the Japanese style: it is strange, and at some moments you just ask yourself "Why?", but then you remember how peculiar Japanese game development can be and the questions disappear. Well, shall we begin?

Foreword


Xenoblade Chronicles is the kind of game for which it is worth, no, necessary to buy a Nintendo Wii. It is a JRPG with a large open world, a heap of side quests and a gripping storyline that will stretch a playthrough not over weeks but over months. Anyone familiar with the Nintendo Wii knows that the console is aimed at technically modest, colorful family games like Super Mario, but what Monolith Soft created deserves praise: their Xenoblade Chronicles has beautiful graphics despite the console's severe technical limitations (only Resident Evil Remake and Resident Evil Zero can compete with it graphically).

After looking at this game with the team, we decided that it had to be translated into our great and mighty Russian language. But, as you know, it is not worth taking on a translation without understanding the technical side of the game. So now let's talk about the technical side.

Technical nuances


I will say a little about the Wii itself and about what this article will not cover.

Like the Nintendo GameCube, the Nintendo Wii runs on an IBM processor with the PowerPC architecture. This processor works in big-endian mode, which is important to remember (for me, hacking big-endian files is much more convenient than little-endian ones because of the byte order).

Fortunately, Nintendo took great care of game developers and provided a huge number of formats for every purpose in its SDKs, but they are not the subject of this article. Maybe I will cover them in detail later, though my laziness is unlikely to allow it. I want to single out just one Nintendo format, BRFNA (Binary Revolution Font), which is discussed below.
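To make the byte-order remark concrete, here is a minimal sketch of reading big-endian integers from a byte array in C#. The class and method names are made up for illustration; they are not taken from the author's tools.

```csharp
using System;

// Hypothetical helper: reads integers stored most-significant byte first,
// which is how the Wii's PowerPC processor lays them out.
public static class BigEndian
{
    public static ushort ReadUInt16(byte[] data, int offset)
    {
        return (ushort)((data[offset] << 8) | data[offset + 1]);
    }

    public static uint ReadUInt32(byte[] data, int offset)
    {
        return ((uint)data[offset] << 24)
             | ((uint)data[offset + 1] << 16)
             | ((uint)data[offset + 2] << 8)
             | data[offset + 3];
    }
}
```

With this layout the bytes 00 00 60 00 read as 0x6000, the same order in which they appear in a hex editor, which is why big-endian files are pleasant to inspect by eye.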

Font


The font in Xenoblade Chronicles (hereinafter XC) is stored in a standard format that is rarely used in games, with the extension BRFNA.

There are only two standard Wii font formats:

  1. BRFNT (more popular)
  2. BRFNA (rarely used)

I will not delve into their structure, but only talk about the differences:


BRFNA managed to cause a lot of problems: first, an unknown type of compression; second, a slightly odd split by encoding. Oddly enough, what rescued us was Nintendo's official font converter from the 3DS SDK. It had its own problems, though: I had to study the encodings used in XC itself, write separate configuration files for the texture converter, and fiddle with the converter's settings until the font was identical to the original. It paid off: after several days of torment, I was able to display Russian letters using the UTF-8 codes of the Russian characters.


True, for a long time the game balked at the size of the new font and crashed at the very start of loading. At first I suspected that my crooked hands were doing something wrong, but after I removed the umlauts from the font, the game started without complaint. I categorically did not want to remove the umlauts, though, so I came at it from the other side: I simply changed the texture format from IA4 (4 bits of color, 4 bits of transparency) to plain I4 (4 bits of color, no transparency), and voilà, XC started up just fine.

Why did I decide to change the texture format? Because I can! Seriously, though, it did not degrade the quality of the characters. Text rendering in this game uses only the alpha channel and ignores the main (color) channel entirely, but if the font format has no transparency, the game has nothing to use except the main channel. A waste, I thought, and decided to do without transparency so as not to squander space.

At this point the work on the font was finished and crossed off the task list.
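The idea can be illustrated with a rough sketch that moves the alpha nibble of each IA4 pixel into an I4 pixel, halving the data size. This is not the author's converter. It assumes that alpha sits in the high nibble of each IA4 byte and that the first I4 pixel goes into the high nibble of its byte, and it ignores the tiled block layout that real Wii textures use. All names here are hypothetical.

```csharp
using System;

public static class FontTexture
{
    // Packs the alpha nibble of each IA4 pixel (one byte per pixel) into
    // I4 data (two pixels per byte).
    // ASSUMPTION: alpha is the high nibble of an IA4 byte; tiling is ignored.
    public static byte[] Ia4AlphaToI4(byte[] ia4Pixels)
    {
        if (ia4Pixels.Length % 2 != 0)
            throw new ArgumentException("Expected an even number of pixels.");

        byte[] i4 = new byte[ia4Pixels.Length / 2];
        for (int i = 0; i < i4.Length; i++)
        {
            int first = ia4Pixels[2 * i] >> 4;       // alpha of pixel 2i
            int second = ia4Pixels[2 * i + 1] >> 4;  // alpha of pixel 2i + 1
            i4[i] = (byte)((first << 4) | second);   // two pixels per byte
        }
        return i4;
    }
}
```

Because IA4 spends 8 bits per pixel and I4 only 4, the converted glyph sheet takes exactly half the space, which fits the observation that the game stopped crashing on the font size.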

PKB / PKH - file containers


I seem to have started my story from the wrong end. To get to many of the main files, you first have to extract them from a PKB container.

PKB is just a container, without any pointers, sizes or file names. All you can see in it is a pile of files aligned to 2048 bytes.

Image: PKB example

The most interesting part is stored in the PKH files, but getting to them takes some effort. All PKH files are inside a U8 archive named static.arc, with a separate archive for each language.

Image: static.arc for English

PKH is a very strange index for PKB that stores the size, pointer and index of each file. From the index the game somehow obtains the full file name, but I did not dig into that because it is too tedious and pointless.

I could not work out the structure of this container completely, but what I had was enough to extract and repack files.

PKH can be divided into two blocks, Header and Entry, which is what I did.

    public class pkhModuleEntry
    {
        public uint ID;          // file index
        public uint unk;         // unknown value
        public ushort sizeFile;  // size in 2048-byte blocks
        public uint offsetFile;  // pointer to the file in the PKB

        public pkhModuleEntry()
        {
            ID = unk = offsetFile = sizeFile = 0;
        }
    }

    public class pkhModule
    {
        uint Magic;
        uint version;
        uint tableOffset;
        uint pkhSize;
        uint countFiles;
        pkhModuleEntry[] entry;
        string[] extensions;
        ...
    }

The entries start at the tableOffset pointer. The only catch is that the entry data is split into several blocks; all the information about the files is loaded as follows:

    // Block 1: file indices and unknown values
    for (int i = 0; i < countFiles; i++)
    {
        entry[i] = new pkhModuleEntry();
        entry[i].ID = mainPkhSfa.ReadUInt32();
        entry[i].unk = mainPkhSfa.ReadUInt32();
    }

    // Block 2: file sizes
    for (int i = 0; i < countFiles; i++)
        entry[i].sizeFile = mainPkhSfa.ReadUInt16();

    // Block 3: file pointers
    for (int i = 0; i < countFiles; i++)
        entry[i].offsetFile = mainPkhSfa.ReadUInt32();

From the code above you can see that all the information about the files is divided into 3 blocks:

  1. File indices and unknown values
  2. File sizes
  3. File pointers
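The three-block layout can also be sketched without the author's stream helper (mainPkhSfa), working directly on a byte array. This is a hedged sketch: the names PkhEntry and PkhTable are invented, and it assumes big-endian fields and that the three blocks follow one another without padding, which is what the sequential reads above imply.

```csharp
using System;

public class PkhEntry
{
    public uint Id;
    public uint Unk;
    public ushort SizeInBlocks;
    public uint Offset;
}

public static class PkhTable
{
    static ushort U16(byte[] d, int o) => (ushort)((d[o] << 8) | d[o + 1]);

    static uint U32(byte[] d, int o) =>
        ((uint)d[o] << 24) | ((uint)d[o + 1] << 16) | ((uint)d[o + 2] << 8) | d[o + 3];

    // Reads countFiles entries starting at tableOffset.
    // ASSUMPTION: big-endian fields, blocks stored back to back.
    public static PkhEntry[] Read(byte[] pkh, int tableOffset, int countFiles)
    {
        var entries = new PkhEntry[countFiles];
        int pos = tableOffset;

        // Block 1: (index, unknown) pairs, 8 bytes per file.
        for (int i = 0; i < countFiles; i++, pos += 8)
            entries[i] = new PkhEntry { Id = U32(pkh, pos), Unk = U32(pkh, pos + 4) };

        // Block 2: sizes, 2 bytes per file.
        for (int i = 0; i < countFiles; i++, pos += 2)
            entries[i].SizeInBlocks = U16(pkh, pos);

        // Block 3: pointers, 4 bytes per file.
        for (int i = 0; i < countFiles; i++, pos += 4)
            entries[i].Offset = U32(pkh, pos);

        return entries;
    }
}
```

Storing each field in its own block means one file's description is spread over three places in the PKH, which is why the table cannot be read as a simple array of fixed-size records.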

You can see that the pointer to a file is stored in a uint32, that is, a 4-byte variable, while the size, for some reason, is stored in a 2-byte one. There is an explanation for this oddity: as I said above, files in the PKB are aligned to 2048 bytes, and that was done for a reason. The file size is specified not in bytes but in data blocks. For example, if the stored size is 0xC, the size in the PKB is 0xC * 0x800 = 0x6000 bytes.
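The arithmetic is small enough to write down as code. The sketch below converts a block count into bytes and rounds a length up to the next 2048-byte boundary, which a repacker would presumably need when writing files back. The names are hypothetical.

```csharp
using System;

public static class PkbMath
{
    public const int BlockSize = 0x800; // 2048-byte alignment used by PKB

    // Converts a PKH size field (a count of 2048-byte blocks) into bytes.
    public static long BlocksToBytes(ushort sizeInBlocks)
    {
        return (long)sizeInBlocks * BlockSize;
    }

    // Rounds a byte length up to the next multiple of the block size.
    public static long AlignUp(long length)
    {
        return (length + BlockSize - 1) / BlockSize * BlockSize;
    }
}
```

BlocksToBytes(0xC) gives 0x6000, matching the example above. It also shows why two bytes are enough for the size: 0xFFFF blocks is 0x7FFF800 bytes, just under 128 MiB.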

Image: PKH example

Having studied this structure, I quickly knocked together an unpacker/packer and moved on to the containers that hold the text.

Containers with text


As usual, the Japanese developers did some odd things in their game. After a long study of the game's containers, I identified 3 places where game text lives:

  1. The BDAT container stores various data and strings, mostly system ones (menus, trading, settings).
  2. The SB container stores scripts and the strings for conversations with residents.
  3. The REV container stores data and strings used in cutscenes.

The Japanese developers did an excellent job of protecting their strings, which we did not like at all.
In each container only the strings are encrypted. That would not be a problem if a single encryption algorithm were used. But alas, they decided to develop a separate encryption algorithm for each container, which created many problems for us.

In this article I will only talk about the BDAT container and its encryption algorithm. I will keep quiet about the encryption in the SB container for now, and I cannot say anything about the encryption in the REV container, since it is still being cracked.

BDAT Container



The very first container I cracked was BDAT. From a quick inspection it was hard to tell that it holds text. But we were not born yesterday, so we immediately googled the format. Some information about the structure of this container turned up on a foreign forum, along with proof that text is stored in it. We even found a tool that extracts it, but for some reason it choked on my files. After digging through more foreign forums, I realized that in their version of the game the text is stored in the clear, while in my files I could not see it. All sorts of guesses rushed through my head, and only one turned out to be true: the Japanese developers had encrypted it. Only one thing remained: to figure out how.

After some manipulation I had both a memory dump with the decrypted BDAT and the original file, and I began analyzing them. Even after spending a lot of time comparing the two, I could not figure out the encryption. I saw no patterns, and there was only one way out: debugging!

Unfortunately, Dolphin has a lousy debugger (or I have just been spoiled by the PCSX debugger, which has every debugging function you could want). I needed to find out in which area of memory BDAT is decrypted and put a breakpoint on writes to it, but Dolphin can only break on the instruction at a given address; it cannot break on reads from or writes to a specific region of RAM, and that became a problem. I went looking for a Dolphin build with extra debugging features and found one: Dolphin DebugFast, based on version 4, with exactly one feature added, a breakpoint on reads/writes to RAM. Just what I needed, I thought, and carried on with the hack.

Having found the region of memory with the data I needed, I set a breakpoint and began studying how the game decrypts its BDAT. Everything turned out to be simple and interesting at the same time. A BDAT contains a 2-byte key: the first byte is loaded into register R5 and the second into R0 (in the C# code further down, each key byte is first subtracted from 0xFF). There is also a Boolean variable, which is set to 1 (true) at the start.

If the Boolean variable is 1, decryption uses the R5 register; if it is 0, decryption uses the R0 register.

The encryption is based on a simple XOR. The decryption procedure for each byte is as follows:

  1. Decrypted byte = Encrypted byte ^ R (5 or 0)
  2. R (5 or 0) = (Encrypted byte + R (5 or 0)) & 0xFF
  3. Flip the Boolean variable to the opposite value

C# code:

    public static void BDAT_DecryptPart(int offset, int size, ushort key, MemoryStream data)
    {
        data.Position = offset;
        int endOffset = offset + size;
        if (endOffset > data.Length)
            endOffset = (int)data.Length;

        bool reg = true;                                  // true: use R5, false: use R0
        byte _r0 = (byte)(0xFF - (key & 0xFF));           // low byte of the key
        byte _r5 = (byte)(0xFF - ((key >> 8) & 0xFF));    // high byte of the key
        byte inByte = 0;

        while (offset < endOffset)
        {
            inByte = data.GetBuffer()[offset];
            if (reg)
            {
                data.GetBuffer()[offset] = (byte)(inByte ^ _r5);
                _r5 = (byte)((_r5 + inByte) & 0xFF);      // advance with the encrypted byte
                reg = false;
            }
            else
            {
                data.GetBuffer()[offset] = (byte)(inByte ^ _r0);
                _r0 = (byte)((_r0 + inByte) & 0xFF);
                reg = true;
            }
            offset += 1;
        }
    }
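To put text back into the game, the reverse operation is needed as well. The article does not show the author's encryptor, so the following is only a sketch of the inverse that follows from the decryption steps: since each register is advanced with the encrypted byte, the encryptor must advance it with the byte it has just produced. The class and method names are made up, and the sketch works on a plain byte array rather than a MemoryStream.

```csharp
using System;

public static class BdatCipher
{
    // Decrypts data[offset .. offset + size) in place, per the steps above.
    public static void Decrypt(byte[] data, int offset, int size, ushort key)
    {
        Transform(data, offset, size, key, true);
    }

    // Hypothetical inverse of Decrypt, derived from the algorithm rather than
    // taken from the game or from the author's tool.
    public static void Encrypt(byte[] data, int offset, int size, ushort key)
    {
        Transform(data, offset, size, key, false);
    }

    static void Transform(byte[] data, int offset, int size, ushort key, bool decrypt)
    {
        int end = Math.Min(offset + size, data.Length);
        byte r5 = (byte)(0xFF - ((key >> 8) & 0xFF));
        byte r0 = (byte)(0xFF - (key & 0xFF));
        bool useR5 = true;

        for (int i = offset; i < end; i++)
        {
            byte r = useR5 ? r5 : r0;
            byte output = (byte)(data[i] ^ r);

            // The register always advances with the ENCRYPTED byte: the input
            // when decrypting, the output when encrypting.
            byte cipherByte = decrypt ? data[i] : output;
            r = (byte)((r + cipherByte) & 0xFF);

            if (useR5) r5 = r; else r0 = r;
            data[i] = output;
            useR5 = !useR5;
        }
    }
}
```

Calling Encrypt and then Decrypt with the same key over the same range should return the original bytes, which is a convenient sanity check before writing a modified BDAT back into a PKB.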

The encryption is designed in a very interesting way: each byte depends on the previous ones, and with alternation on top of that. Brilliant! At the same time, decryption costs almost no resources, yet the essence of the algorithm cannot be understood without debugging.

Having finished with the encryption, I turned to the structure of BDAT itself. After decrypting the string data, I noticed some names at the beginning of the file that looked like the names of blocks.

Image: example, the encrypted block from 0x2C to 0x66

But I put off analyzing that block and decided to deal with the overall structure first. Painstaking analysis revealed that the header occupies only 0x20 bytes; its structure is described below.

I will not go into how I worked all this out; I will just describe what each of these bytes means.

    class header
    {
        public uint magic;
        public byte mode;
        public byte unk;
        public ushort offsetToNameBlock;
        public ushort sizeTableStruct;
        public ushort unkTableOffset;
        public ushort unk2;
        public ushort offsetToMainData;
        public ushort countEntryMain;
        public ushort unk3;
        public ushort unk4;
        public ushort cryptKey;
        public uint offsetToStringBlock;
        public uint sizeStringBlock;
        ...
    }
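As a rough illustration of how these fields map onto the 0x20 bytes, here is a self-contained parser sketch. It assumes the fields are packed in exactly the declared order with no padding (their sizes do add up to 0x20) and that they are big-endian like everything else on the Wii. The BdatHeader name and the byte offsets in the comments are derived from that assumption, not confirmed by the source.

```csharp
using System;

public class BdatHeader
{
    public const int Size = 0x20;

    public uint Magic;
    public byte Mode;
    public byte Unk;
    public ushort OffsetToNameBlock;
    public ushort SizeTableStruct;
    public ushort UnkTableOffset;
    public ushort Unk2;
    public ushort OffsetToMainData;
    public ushort CountEntryMain;
    public ushort Unk3;
    public ushort Unk4;
    public ushort CryptKey;
    public uint OffsetToStringBlock;
    public uint SizeStringBlock;

    static ushort U16(byte[] d, int o) => (ushort)((d[o] << 8) | d[o + 1]);

    static uint U32(byte[] d, int o) =>
        ((uint)d[o] << 24) | ((uint)d[o + 1] << 16) | ((uint)d[o + 2] << 8) | d[o + 3];

    // ASSUMPTION: tightly packed, big-endian, fields in declaration order.
    public static BdatHeader Parse(byte[] d)
    {
        if (d.Length < Size)
            throw new ArgumentException("A BDAT header needs 0x20 bytes.");

        return new BdatHeader
        {
            Magic = U32(d, 0x00),
            Mode = d[0x04],
            Unk = d[0x05],
            OffsetToNameBlock = U16(d, 0x06),
            SizeTableStruct = U16(d, 0x08),
            UnkTableOffset = U16(d, 0x0A),
            Unk2 = U16(d, 0x0C),
            OffsetToMainData = U16(d, 0x0E),
            CountEntryMain = U16(d, 0x10),
            Unk3 = U16(d, 0x12),
            Unk4 = U16(d, 0x14),
            CryptKey = U16(d, 0x16),
            OffsetToStringBlock = U32(d, 0x18),
            SizeStringBlock = U32(d, 0x1C)
        };
    }
}
```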


After the header and before offsetToNameBlock comes data that was unknown at first. As it turned out, this is information about the blocks in MainData, and it has this structure:

    class typeStruct
    {
        public byte unk;
        public byte type;
        public ushort idx;
        ...
    }


The last remaining block is the one at offsetToNameBlock; it has the following structure:

    class nameBlock
    {
        public string bdatName;
        public nameBlockEntry[] nameEntry;

        public nameBlock(StreamFunctionAdd sfa, int countName)
        {
            bdatName = sfa.ReadAnsiStringStopByte();
            sfa.SeekValue(2);
            nameEntry = new nameBlockEntry[countName];
            for (int i = 0; i < countName; i++)
            {
                nameEntry[i] = new nameBlockEntry(sfa);
            }
        }
    }

    class nameBlockEntry
    {
        public ushort offsetToStructType;
        public ushort unk;
        public string name;
        public typeStruct type;

        public nameBlockEntry(StreamFunctionAdd sfa)
        {
            offsetToStructType = sfa.ReadUInt16();
            unk = sfa.ReadUInt16();
            name = sfa.ReadAnsiStringStopByte();
            type = new typeStruct(sfa, offsetToStructType);
            sfa.SeekValue(2);
        }
    }

I want to single out the countName variable, which does not appear anywhere in the header. It is calculated by taking the pointer to NameBlock, subtracting 0x20 and dividing the result by 4. Let me explain why: the header ends at 0x20 and NameBlock starts well after it, and, as we know, right after the header comes the information about the structure of the blocks in MainData, which takes 4 bytes per structure. To find the number of such structures, you take the size of the structure information alone and divide it by the size of one structure, that is, 4.

At first glance the structure seems complex, so I will try to explain it another way:

There is a block where all the data is stored, MainData. It is divided into several records: their number is given by countEntryMain, and the size of each one by sizeTableStruct. What data is stored inside one record is described by typeStruct entries, of which there can be one or more. For each typeStruct there is a name, stored in a nameBlockEntry.
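The calculation fits in one line of code. A small sketch with hypothetical names:

```csharp
using System;

public static class BdatNames
{
    public const int HeaderSize = 0x20;     // the header ends at 0x20
    public const int TypeStructSize = 4;    // one typeStruct takes 4 bytes

    // Number of typeStruct entries (and therefore names) in a BDAT table.
    public static int CountNames(int offsetToNameBlock)
    {
        if (offsetToNameBlock < HeaderSize)
            throw new ArgumentException("NameBlock cannot start inside the header.");

        return (offsetToNameBlock - HeaderSize) / TypeStructSize;
    }
}
```

For instance, a table whose NameBlock starts at 0x2C would have (0x2C - 0x20) / 4 = 3 structures.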
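If the records really are stored back to back, which the description suggests but does not state outright, the position of any record follows from two header fields. The sketch below makes that assumption explicit; the names are invented for illustration.

```csharp
using System;

public static class BdatRecords
{
    // ASSUMPTION: records are contiguous, so record i starts at
    // offsetToMainData + i * sizeTableStruct.
    public static int RecordOffset(int offsetToMainData, int sizeTableStruct, int index)
    {
        return offsetToMainData + index * sizeTableStruct;
    }

    // Copies one record out of the BDAT data.
    public static byte[] ReadRecord(byte[] bdat, int offsetToMainData,
                                    int sizeTableStruct, int countEntryMain, int index)
    {
        if (index < 0 || index >= countEntryMain)
            throw new ArgumentOutOfRangeException(nameof(index));

        byte[] record = new byte[sizeTableStruct];
        int start = RecordOffset(offsetToMainData, sizeTableStruct, index);
        Array.Copy(bdat, start, record, 0, sizeTableStruct);
        return record;
    }
}
```

Each typeStruct would then say how to interpret some field inside such a record, and the matching nameBlockEntry gives that field its name.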

That's all: BDAT has been taken apart, and I knocked together a tool to extract and replace text, which works well.

Image: sample lines extracted from BDAT

Conclusion


In this article I tried to describe how I hacked one of the legendary games on the Wii, and to show you how the Japanese developers keep doing everything they can so that nobody pokes around in their files.

I may continue the breakdown of this game's formats, but that is not certain. If you liked my article, I will tell you how we translated Resident Evil Remake and Resident Evil Zero.

Thank you for your time!

P.S. This is my first article on a subject like this, so please do not throw slippers at me; it is better to point out the errors right away. Maybe I left out something necessary or did not explain something; please point that out so that I do not repeat such mistakes.

Source: https://habr.com/ru/post/330612/
