Format Analysis: Sound in some games on the Unreal Engine

Game modding has a long history. The earliest example I remember is Wolfenstein 3D (1992): if I’m not mistaken, you could draw your own maps and, later, new enemies, and replace textures and sounds. The main obstacle in modding is parsing unknown data formats. Let us leave the moral side of the matter to other resources and dwell on the technical difficulties that can arise in this hard task.

I have accumulated quite a few stories of this kind, from the simplest, such as parsing a basic archive that stores many thousands of game files in a single file, to replacing 3D models and researching and writing custom audio codecs. I'll tell you one of them, of medium difficulty.
Suppose you want to replace certain phrases in a game, or even produce a full dub in a language for which the developers lacked the energy or resources. It would seem you only need to record the sound, find where it lives in the game, and replace the right files. But it is not always that easy. For example, the latest games in the Batman: Arkham series use the Wwise sound engine, which has been integrated into the Unreal Engine for quite some time.

I have come across UE more than once, but as you know, commercial developers can change any part of the engine code, so almost every game is unique in its data structures, and each one is interesting to dig into.

First, let's look at the sound files. They usually sit in the audio folder, assembled into one big package with an unexpected .WAD extension (hello, DOOM). You can even extract all the sounds from it if you like, but you will get several thousand nameless files, and finding anything among them is very hard unless you listen to them all by hand. I must say that things are usually simpler: developers, for their own convenience, leave a file with a list of phrases somewhere. Not this time.

It is logical to assume that since the game itself somehow finds the right sounds and their subtitles, this information is somewhere in the files; it just has to be found. The texts are nowhere in the localization folders, which means they are scattered across the individual levels of the game, as is often the case. Take one of the .upk files whose name looks like a level and unpack it. Fortunately, tools for this exist, even with source code.

Inside, files of type .RDialogueEvent quickly turn up, with the phrase texts in 11 languages visible to the naked eye.

The file names resemble source sound names. Great, all that remains is to match them to the sound files. This is where the problems begin. The audio package does contain identifiers: the 30-bit hash that Wwise always uses for sounds. Unfortunately, they are nowhere to be found in the dialogue files. There are only obscure numbers everywhere, nothing resembling a sound ID, which would be noticeable at once. On the other hand, this is understandable: the engine is not that simple, and a game cannot just take a sound file and play it. The sound lives in an audio bank and has many properties that apply various effects, and so on.

And it turns out that each dialogue folder contains an .akbank file, apparently the Wwise audio bank.

It has a lot of identifiers inside. Trying them at random, we find that one of them (highlighted in green) is present in the audio package. Extracting the data stored under this identifier gives a segment of several sounds glued together. We convert these sounds from the internal Wwise format to ordinary ogg. And indeed, in one of them Batman says “I don’t have time for this,” and in another file someone answers him. The phrases correspond exactly to the texts of this particular dialogue.
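The "try them at random" step can also be automated. Below is a minimal sketch, assuming the identifiers are stored as 32-bit little-endian values at arbitrary byte offsets (the article does not state the byte order or alignment). The function name and the sample ID are made up for illustration.

```python
import struct

def find_u32_matches(blob: bytes, known_ids: set[int]) -> list[tuple[int, int]]:
    """Return (offset, value) for every 32-bit little-endian value in
    `blob` that also appears in `known_ids`.

    Every byte offset is tried, because we do not know whether the
    identifiers are aligned (an assumption, not a documented fact).
    """
    hits = []
    for offset in range(len(blob) - 3):
        (value,) = struct.unpack_from("<I", blob, offset)
        if value in known_ids:
            hits.append((offset, value))
    return hits

# Synthetic stand-in for a bank: two junk bytes, one made-up ID, padding.
bank = b"\x00\x01" + struct.pack("<I", 0x1234ABCD) + b"\xff" * 3
# IDs that were previously collected from the audio package (made up here).
package_ids = {0x1234ABCD}

hits = find_u32_matches(bank, package_ids)
for offset, value in hits:
    print(f"offset {offset}: 0x{value:08X}")
```

A blind scan like this can produce false positives on real data, so each hit still deserves a look in a hex editor.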

Not bad already! In principle, we could stop here: all the dialogues are sorted into folders, and each has a bank with a reference to its audio segment. Of course, we don’t know which file is which, but cutting the segment into parts, listening to them, and assigning the few phrases to their places (dialogues usually have only 3-4) can be done by hand.

But we are not looking for easy ways. If we are going to figure it out, let's go all the way. Just in case, let's check: maybe the sounds simply go in order? Of course not, they are shuffled. Like it or not, the link between the sounds in the segment and the dialogue text must be stored somewhere. I dug through various files for quite a long time hoping to find something, all in vain. Fine. In that case, let's unpack every package in the game. That is several gigabytes, but never mind, it is hardly the first time. Yet even a full search across all the game data turned up nothing. The only place that contains sound identifiers is the audio bank, so the link must go through it alone. There is nothing for it: we have to get inside and figure out how it works.

Now, to be sure, let's pick a dialogue in the game that can be checked quickly. After a spectacular intro with a charming reporter and a raid by masked troops, Batman is captured by Hugo Strange. Strange says a couple of phrases starting with “I feel I should thank you”, then leaves, and the game begins. This is where the first save occurs. This moment will suit us.

Find the villain's phrase in the files. It turns up in the OW_E8_Ch1z_Anim package, not something you would guess right away. Inside there is only one dialogue, which contains the entire beginning of the game. That is as many as 24 phrases, but perhaps this is even good: in a jumble of codes the number 24 is easier to spot than 1 or 2. So, we were about to examine the contents of the .akbank file.

The Wwise bank format has already been partially explored. Let's hope that information is enough for our purpose. Judging by the beginning of the .akbank file, it holds 5 audio banks at once, for 5 languages. The first is the INT (English) bank, so that is the one we will look at.

First, after the SRRC header, comes an obscure table, then quite a lot of zeros (as seen in the last picture), then the BKHD segment, and then the HIRC segment, which apparently describes all the audio objects. In this case there are 79 of them (0x4F, highlighted in green). According to the description, the objects in the segment follow one another, and each starts with a type (1 byte), then a 32-bit length, then an ID. The length and content of an object depend on its type.
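The object list described above can be walked with a few lines of code. This is a minimal sketch, not a complete parser. It assumes little-endian fields, a 32-bit section length right after the HIRC tag, then a 32-bit object count, and an object length that counts the ID plus the rest of the object. Those details come from the publicly circulated descriptions of the format, not from this article, so check them against your own files. `parse_hirc` and `make_object` are illustrative names.

```python
import struct

def parse_hirc(blob: bytes, start: int = 0) -> list[tuple[int, int, bytes]]:
    """Parse the first HIRC section found at or after `start`.

    Returns a list of (object_type, object_id, payload) tuples, where
    payload is whatever follows the ID inside the object.
    """
    pos = blob.index(b"HIRC", start)          # raises ValueError if absent
    _section_len, count = struct.unpack_from("<II", blob, pos + 4)
    pos += 12                                 # tag + section length + count
    objects = []
    for _ in range(count):
        obj_type, obj_len = struct.unpack_from("<BI", blob, pos)
        (obj_id,) = struct.unpack_from("<I", blob, pos + 5)
        payload = blob[pos + 9 : pos + 5 + obj_len]
        objects.append((obj_type, obj_id, payload))
        pos += 5 + obj_len                    # jump to the next object
    return objects

def make_object(obj_type: int, obj_id: int, payload: bytes = b"") -> bytes:
    """Build a synthetic object for the demo below."""
    body = struct.pack("<I", obj_id) + payload
    return struct.pack("<BI", obj_type, len(body)) + body

# Synthetic section with one type 2 object and one type 4 object.
body = make_object(2, 0x100, b"\xAA\xBB") + make_object(4, 0x200)
blob = b"HIRC" + struct.pack("<II", 4 + len(body), 2) + body

objects = parse_hirc(blob)
for obj_type, obj_id, payload in objects:
    print(obj_type, hex(obj_id), payload.hex())
```

Because the file in question holds several banks, `start` can be set to the offset of the bank you care about so that the right HIRC section is picked up.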

Objects of type 2 are the sounds themselves. The type is highlighted in red, the length in yellow. Each contains the ID of the object itself (green) and the ID of the sound file (purple) that holds its data. Below it you can see the beginning of the next object of the same type.

Objects of type 3 are sound actions. Each of them appears to be a “play sound” action with some unknown parameters, and each has its own ID (gray) and the ID of the sound to play (green).

Objects of type 4 are sound events. These are very short entries that hold only the event ID (blue), an indication that the event contains a single action, and the ID of that action (gray).

Well, it looks like we have 24 chains of events of the following type:

event -> action -> sound

They are linked by identifiers and end in references to sound files. How do we find those files? Searching for these codes, we find them in that very table at the beginning of the bank. Apparently the table records where the individual sounds sit inside the sound segment. Indeed, it has exactly 24 entries, and each gives the same ID we saw in the sound object, an offset from the beginning, and a length. Congratulations! We now have a complete link from the audio events in the bank to the individual sound files.
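Once the objects and the table are parsed, following the chain is a series of dictionary lookups. The sketch below assumes the bank has already been reduced to four mappings; all IDs, names, and the sample segment are made up for illustration, and the offsets are assumed to be relative to the start of the extracted segment.

```python
def resolve_event(event_id, events, actions, sounds, file_table, segment):
    """Follow event -> action -> sound -> file and cut the file's bytes
    out of the concatenated sound segment.

    events:     event ID  -> action ID  (one action per event, as observed)
    actions:    action ID -> sound object ID
    sounds:     sound object ID -> sound file ID
    file_table: sound file ID -> (offset, length) within `segment`
    """
    action_id = events[event_id]
    sound_id = actions[action_id]
    file_id = sounds[sound_id]
    offset, length = file_table[file_id]
    return file_id, segment[offset : offset + length]

# Made-up identifiers standing in for the real 32-bit values.
events = {0xE1: 0xA1}
actions = {0xA1: 0x51}
sounds = {0x51: 0xF1}
file_table = {0xF1: (4, 3)}
segment = b"0123abc789"        # stand-in for the extracted audio segment

file_id, data = resolve_event(0xE1, events, actions, sounds, file_table, segment)
print(hex(file_id), data)
```

A missing key at any step raises `KeyError`, which is a convenient way to notice that an assumption about one of the object types was wrong.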

So our source data is the IDs of several events, one per phrase of the dialogue, and for each of them we can find a sound file. But how do we connect them to the dialogue itself?

Let's try searching for these identifiers elsewhere. Once again, they are not in the dialogue files. The folder also contains some very short .akevent files, 24 of them as well. Obviously, these are the audio event files. Inside are a few small numbers, identical in every file, so they are of no use. The only thing that differs is the ID of the audio event we found in the bank.

A dead end again: we have identifiers for all the events, but no link between them and the dialogue text! Just in case, let's run a test: change the ID in the relevant file and start the game. Indeed, Hugo opens his mouth but says nothing. So this is exactly the data the game uses to find the right sound. Note also that the subtitle is still shown. So in our case the dialogue texts are primary, and the sound is derived from them.

And here I recall that the UE3 engine has a habit of referring to objects in a package by their sequence number inside the package, that is, in the order in which they are packed. Let's look at the export file generated when unpacking packages.

The numbers here are decimal and start from zero, while in the game they start from 1, so the event files in the export turn out to be numbered 0x35-0x4. Let's see whether these numbers appear anywhere in the dialogues. We start looking, and what do you know: right at the beginning of the file is this very number!
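The conversion between the two numbering schemes is just an off-by-one plus a change of base. A small sketch, with illustrative function names and a made-up export list:

```python
def export_reference(zero_based_index: int) -> int:
    """Turn a zero-based position from the unpacker's export listing
    into the one-based number stored in the game files."""
    return zero_based_index + 1

def lookup_export(exports: list[str], reference: int) -> str:
    """Turn a one-based reference found in a game file back into
    the name of the exported object."""
    return exports[reference - 1]

# Decimal 52 in the export listing is what a hex editor shows as 0x35.
print(hex(export_reference(52)))

# Made-up export list, only to show the lookup direction.
exports = ["Bank", "Dialogue", "Event_A", "Event_B"]
print(lookup_export(exports, 3))
```

Keeping both directions as named functions helps avoid mixing up the two bases when comparing the listing with a hex dump.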

Here is the last missing link. Next to it we also find 0x2, the number of the bank file, so if a folder happens to hold several dialogues, they can be told apart too. Now we know in full how to find the sound that corresponds to a dialogue text.

The resulting interaction scheme is rather complex. It seems the developers did not bother about convenience and simply relied on the engine's internal mechanisms, which in this case led to this result. And cases, as I said, vary a lot. The file structure and the links between files can be completely different. Here the dialogue text links to the sound. Elsewhere it is the other way around: the sound is primary, and the text is attached to it by identifier. Or a game script event is primary, and links go from it to both sound and text. Sometimes files are referenced not by name but by hash. But in any case they are all connected somehow; it only remains to find the connection.

As a final touch, let's check our results. We find the dialogue file for the phrase “I feel I should thank you” and replace 4B with 4C in it. We start the game, and our friend Hugo, instead of this phrase, meaningfully says: "I guarantee everyone."
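The same edit can be made with a short script instead of a hex editor. This is a minimal sketch, assuming the reference is stored as a 32-bit little-endian value somewhere in the first bytes of the file. The article only says the number sits near the beginning, so the search window and the field width are guesses, and `patch_u32` is an illustrative name.

```python
import struct

def patch_u32(data: bytes, old: int, new: int, search_limit: int = 64) -> bytes:
    """Replace the first 32-bit little-endian occurrence of `old` within
    the first `search_limit` bytes of `data` with `new`.

    The file size does not change, so nothing else in the file moves.
    """
    needle = struct.pack("<I", old)
    pos = data.find(needle, 0, search_limit)
    if pos < 0:
        raise ValueError("reference not found near the start of the file")
    return data[:pos] + struct.pack("<I", new) + data[pos + 4 :]

# Synthetic stand-in for a dialogue file: padding, the reference, a tail.
original = b"\x00" * 4 + struct.pack("<I", 0x4B) + b"tail"
patched = patch_u32(original, 0x4B, 0x4C)
print(patched.hex())
```

A blind search can land on an unrelated value that happens to match, so it is worth confirming the offset by eye and keeping a backup of the original file before repacking.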

Let's leave Batman at this; the study can be considered complete. In writing, the process looks quick, but in reality each stage can involve long contemplation of hexadecimal numbers, with no hope that at some point they will arrange themselves into meaningful chains and you will understand what they mean. But sometimes that does happen.

Source: https://habr.com/ru/post/257793/
