
Taking apart the Qlie visual novel engine



Amateur translation of visual novels differs from translating other kinds of games in several ways, above all in the sheer amount of text involved. The overwhelming majority of visual novels were released in Japanese; only a few have been translated into English (officially or by fans), and even fewer into other languages.

So a translator has to deal with Japanese engines, many of which are not very friendly to localizers. You quickly realize that having translation skills, language skills, plenty of enthusiasm and free time does not mean that a translated version of the game will be released any time soon.

Very roughly, the translation process for any game (not only visual novels) looks like this:

However, in the case of Japanese visual novels, this usually looks like this:


I hope our experience will be useful for someone.

Back in 2013 (possibly earlier), I decided to translate the visual novel Bishoujo Mangekyou -Norowareshi Densetsu no Shoujo- (美少女万華鏡 -呪われし伝説の少女-) from Japanese. I already had experience translating games, but until then I had only translated short novels on relatively simple and well-known engines such as Kirikiri.

This time our translation team had to crack open the novel's engine before we could even get to the text itself.

Let's start with the description of the .exe file, which mentions the words QLIE and IMOSURUME. The file itself contains the string "FastMM Borland Edition 2004, 2005 Pierre le Riche", which suggests the engine is most likely written in Delphi.
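This kind of check is easy to script. The sketch below is only an illustration: `find_markers` and `scan_file` are hypothetical helper names, and it assumes the markers are stored as plain ASCII bytes (strings kept as UTF-16 in the version resource would not be found this way).

```python
from pathlib import Path

# Marker strings mentioned in the text, searched as raw ASCII bytes.
MARKERS = (b"QLIE", b"IMOSURUME", b"FastMM Borland Edition")


def find_markers(data: bytes, markers=MARKERS) -> dict:
    """Return {marker: offset of its first occurrence} for every marker found."""
    found = {}
    for marker in markers:
        offset = data.find(marker)
        if offset != -1:
            found[marker.decode("ascii")] = offset
    return found


def scan_file(path: str) -> dict:
    """Read the whole file into memory and look for the markers."""
    return find_markers(Path(path).read_bytes())
```

Calling `scan_file` on the game executable would list which of the three markers are present and where.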



A quick search shows that Qlie is the name of a visual novel engine from Warmth Entertainment. Apparently IMOSURUME is the internal name of the script engine, and Qlie is the commercial name. There is a site, qlie.net, that lists the games released on this engine, as well as the official Warmth Entertainment site.

As expected, though, there are no official tools for working with the engine and no documentation for it anywhere in public.

So you have to take the game apart yourself, relying on unofficial utilities. The first step is to find all the parts of the game that need translating.

The game archives are the files data0.pack, data1.pack, and data7.pack in the \GameData subfolder. The intro movies are in the \GameData\Movie folder, but they can be left alone for now.


A hex editor shows that the game's .pack archives have no recognizable header, but at the end of the file there is a block that looks like a table of contents, along with the label FilePackVer3.0.
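The same observation can be checked without a hex editor. This is a minimal sketch, not a description of the archive format: the helper names are hypothetical, and the assumption that the label sits within the last 64 bytes of the file is ours.

```python
import re

# The label seen in the hex editor; the version digits may differ between games.
LABEL_RE = re.compile(rb"FilePackVer(\d)\.(\d)")


def detect_pack_version(tail: bytes):
    """Return the version string (e.g. '3.0') of the last label in `tail`, or None."""
    matches = list(LABEL_RE.finditer(tail))
    if not matches:
        return None
    last = matches[-1]
    return "{}.{}".format(last.group(1).decode(), last.group(2).decode())


def read_tail(path: str, size: int = 64) -> bytes:
    """Read up to `size` bytes from the end of a file (64 is an assumed default)."""
    with open(path, "rb") as handle:
        handle.seek(0, 2)                    # jump to the end of the file
        length = handle.tell()
        handle.seek(max(0, length - size))
        return handle.read()
```

For example, `detect_pack_version(read_tail("data0.pack"))` should report the format version if the label is where we expect it.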


Fortunately, unpackers for this format already exist, and more than one. We used the console tool exfp3_v3 by asmodean.

Unpacking is not as simple as it may seem. The engine supports several archive formats (FilePackVer1.0, FilePackVer2.0, FilePackVer3.0); this game uses FilePackVer3.0, and unpacking it additionally requires the key file key.fkey, with which the archive is encrypted. It is located in the \Dll subfolder.


In addition, exfp3_v3 needs to know which game the archive comes from.
So you either pick the game's number from the list the unpacker prints (the Bishoujo Mangekyou games are number 15 there), or pass the game's executable file as the third parameter.


Once the game files were unpacked, an obvious question came up: how do we pack the game back with the finished translation? After all, the unpacker does not support the reverse operation.
At our request, w8m (many thanks to him) added the ability to pack game archives to his program arc_conv.exe. It is enough to pack all the modified files into a new archive (for example, data8.pack) and place it in the GameData folder; the game picks them up automatically.

Let's return to the unpacked resources. The game scenario files from the data0.pack archive can be found in the \scenario\ks_01\ subfolder.

All script files with the .s extension use Shift JIS, which is far from the most convenient encoding, and the engine does not support any Unicode encoding. The strings to translate look something like this:

 【キリエ】
 ％1_kiri1478％
 「へえ……分かっているじゃない」
 私が献上したロシアンティーを見て、キリエは嬉しそうに目を細める。
 ^cface,,赤目微笑01
 【キリエ】
 ％1_kiri1479％
 「日本人は、ジャムを紅茶に入れて飲むのが、ロシアンティーだと勘違いしている人が多いのだけれど……」

You can see that each Japanese phrase is preceded by the name of the character who says it, in Japanese brackets (【】); in the game the name is displayed in the upper part of the text window. Narration has no name in front of it.
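To pull the translatable lines out of a script, it helps to sort lines by kind first. The sketch below assumes one element per line, as in the sample above; `classify_line` is a hypothetical helper, and treating every line that starts with ^ as a command is our simplification.

```python
import re

SPEAKER_RE = re.compile(r"^【(.+?)】$")
# Accept both the fullwidth (％) and the ASCII (%) percent sign around a voice id.
VOICE_RE = re.compile(r"^[％%](.+?)[％%]$")


def classify_line(line: str):
    """Return a (kind, value) pair for one script line."""
    line = line.strip()
    if not line:
        return ("blank", "")
    speaker = SPEAKER_RE.match(line)
    if speaker:
        return ("speaker", speaker.group(1))
    voice = VOICE_RE.match(line)
    if voice:
        return ("voice", voice.group(1))
    if line.startswith("^"):
        return ("command", line)
    # Everything else is text shown to the player, i.e. what gets translated.
    return ("text", line)
```

Only lines classified as "text" (and the speaker names) need to go to the translators; the rest can be passed through untouched.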


But there are also service commands.

The engine's script commands are somewhat reminiscent of the TeX markup language, but they are far less intuitive and convenient than Kirikiri or Ren'Py commands.

Here are some of them:

@@@ (a triple at sign). Script files often begin with this command. Apparently it loads definitions from other files.

For example:

 @@@Library\Avg\header.s 

@@ (a double at sign). A label in the script file that can be jumped to later.

％1_kiri1478％ plays a voice file. These commands are inserted between the character's name and the text displayed on the screen. Here "1_kiri1478" is a file name from the \voice\ folder of the data1.pack archive. Interestingly, the command uses the Japanese fullwidth percent sign (％) rather than the ordinary one.

^savedate, ^saveroute, and ^savescene are three commands that are most likely used by the game's save system; they record the place and time at which the player saved.

For example:

 ^savedate,"現在"
 ^saveroute,"美少女万華鏡-1-"
 ^savescene,"呪われし伝説の少女 オープニング"

That is, date: present moment; route: Bishoujo Mangekyou -1-; scene: Norowareshi Densetsu no Shoujo, Opening. This data was apparently meant to be displayed in the save slot, but the developers seem to have dropped the idea. As a result, ^saveroute is the same in every part of the scenario, ^savedate switches between "present moment" and "daydreaming", and ^savescene changes with the in-game days (and nights).

^facewindow sets the state of the on-screen text box (1 for shown, 0 for hidden).

^sload plays an in-game sound from the \sound\ folder on the given channel.

 ^sload,Env1,◆セミ01アブラゼミ

Playing a cicada sound on channel Env1.

The command has two optional parameters: the first controls looping of the sound, and the second is still a mystery, although it is rarely used in the game.

 ^sload,SE1,■クチュ音01,1 

Playing a looped sound on channel SE1.

^eeffect shows a special effect for a certain number of seconds. Apparently it supports showing several effects in sequence.

 ^eeffect,WhiteFlash 

The effect of a white flash.

^ffade is the transition effect used when the screen changes.
It has a whole bunch of additional parameters, but only a few are really useful: the name of the transition effect, an additional picture if the effect requires one, and the transition time.

 ^ffade,Overlap,,1000 

Dissolving one picture into another over 1 second.

^iload loads a background image onto the screen. The image can be given an id so that it can be referred to later.

 ^iload,BG1,0_black.png 

Showing the file 0_black.png as the background with id BG1.

^we and ^wd turn the window image on and off.

^facewindow,1 and ^facewindow,0 Turns on and off the image of the hero in the dialog box.

^mload - playing music on a specific channel.

 ^mload,BGM1,nbgm13 

Playing the track nbgm13 on channel BGM1
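The commands listed so far share one shape: a caret, a name, then comma-separated arguments. Here is a minimal sketch of splitting such a line, assuming arguments never contain a comma and that double quotes only wrap a whole argument; `parse_command` is a hypothetical helper, not something taken from the engine.

```python
def parse_command(line: str):
    """Split '^name,arg1,arg2' into ('name', ['arg1', 'arg2'])."""
    line = line.strip()
    if not line.startswith("^"):
        raise ValueError("not a caret command: %r" % line)
    # Empty arguments (two commas in a row) are kept as empty strings.
    name, *args = line[1:].split(",")
    return name, [arg.strip('"') for arg in args]


if __name__ == "__main__":
    for sample in ("^ffade,Overlap,,1000", "^sload,SE1,■クチュ音01,1"):
        print(parse_command(sample))
```

Keeping the empty arguments matters: in ^ffade the position of the transition time only makes sense if the skipped picture parameter is preserved.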

Some of the most important commands are:
\jmp jumps to the label with the specified name.

^select - displays a selection window on the screen where the player must select one of the options.

For example:

 ^select, ,
 \jmp,"@@route01a"+ResultBtnInt[0]
 @@route01a0

Here the jump is performed after the player answers the question, and the answer number (0 or 1) is returned in ResultBtnInt[0]. As a result, \jmp moves the story to the label @@route01a plus the answer number, that is, to @@route01a0 or @@route01a1.
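The label arithmetic can be illustrated with a few lines of Python. This is a sketch of the idea only, not of how the engine works internally; `find_labels` and `resolve_jump` are hypothetical names, and the option texts A and B in the usage example are placeholders.

```python
def find_labels(script_lines):
    """Map each '@@label' line to its line index ('@@@' include lines are skipped)."""
    labels = {}
    for index, line in enumerate(script_lines):
        line = line.strip()
        if line.startswith("@@") and not line.startswith("@@@"):
            labels[line] = index
    return labels


def resolve_jump(base_label: str, answer: int, labels: dict) -> int:
    """Return the line index of base_label + answer number, e.g. '@@route01a' + 1."""
    target = base_label + str(answer)
    if target not in labels:
        raise KeyError("label not found: " + target)
    return labels[target]


if __name__ == "__main__":
    script = [
        "^select,A,B",
        '\\jmp,"@@route01a"+ResultBtnInt[0]',
        "@@route01a0",
        "text for the first answer",
        "@@route01a1",
        "text for the second answer",
    ]
    print(resolve_jump("@@route01a", 1, find_labels(script)))
```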

An unpleasant quirk is that the ordinary comma serves as the separator in these commands and cannot be used inside the answer options themselves. The Japanese have no such problem: they use the Japanese comma (、). In our case, the comma can be replaced with ‚ (U+201A SINGLE LOW-9 QUOTATION MARK).

For example:

 ^select, ‚  , ‚  
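The substitution is easy to automate when generating the translated script. A minimal sketch, with `build_select` as a hypothetical helper name and made-up English option texts:

```python
LOW_QUOTE = "\u201a"  # SINGLE LOW-9 QUOTATION MARK, visually close to a comma


def build_select(options):
    """Build a ^select line, replacing commas inside each option text."""
    safe = [option.replace(",", LOW_QUOTE) for option in options]
    return "^select," + ",".join(safe)


if __name__ == "__main__":
    print(build_select(["Yes, of course", "No"]))
```

After the replacement the only real commas left in the line are the separators, so the number of options the engine sees stays correct.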

The remaining commands are not that important for a first pass.

Before translating, of course, the script should be re-encoded into something more convenient, for example UTF-8, so that Cyrillic and Japanese characters can coexist in one file.
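We did this step in Notepad++, but it can also be scripted. The sketch below is an assumption-laden alternative, not what we actually ran: it decodes with Python's cp932 codec (the Windows flavour of Shift JIS) and the function names are hypothetical.

```python
def sjis_bytes_to_utf8(raw: bytes) -> bytes:
    """Decode Shift JIS bytes with the Windows table (cp932) and re-encode as UTF-8."""
    return raw.decode("cp932").encode("utf-8")


def convert_file(src_path: str, dst_path: str) -> None:
    """Convert one script file; the whole file is read into memory."""
    with open(src_path, "rb") as src:
        data = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(sjis_bytes_to_utf8(data))
```

Working on bytes rather than text-mode files keeps the line endings exactly as they were in the original script.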

After the engine is modified (more on that in the next part), the game accepts both Russian and Japanese text. For the time being, though, compatibility requires Japanese characters to be encoded in Shift JIS and Cyrillic characters in cp1251.

We quickly sketched a Python program for the conversion that takes the Cyrillic alphabet into account:

UTF8 to cp1251 and ShiftJIS
    # -*- coding: utf-8 -*-
    # UTF8 to cp1251 and ShiftJIS recoder
    # by Chtobi and Nazon, 2016
    import codecs
    import argparse
    from os import path

    JAPANESE_CODEPAGE = 'shift_jis'
    UTF_CODEPAGE = 'utf-8'
    RUS_CODEPAGE = 'cp1251'


    def nonrus_handler(e):
        """Called for every character cp1251 cannot encode; falls back to Shift JIS."""
        char = e.object[e.start:e.end]
        if char == '\uff5e':    # FULLWIDTH TILDE, UTF-8: 0xEFBD9E -> SHIFT-JIS: 0x8160
            japstr_byte = b'\x81\x60'
        elif char == '\uff0d':  # FULLWIDTH HYPHEN-MINUS, UTF-8: 0xEFBC8D -> SHIFT-JIS: 0x817C
            japstr_byte = b'\x81\x7c'
        else:
            japstr_byte = char.encode(JAPANESE_CODEPAGE)
        return japstr_byte, e.end


    if __name__ == '__main__':
        arg_parser = argparse.ArgumentParser(
            prog="Recode to cp1251 and ShiftJIS",
            description="Program to encode UTF8 text file to "
                        "cp1251 for all cyrillic symbols and ShiftJIS for others. "
                        "Output file will be inputfilename.s",
            usage="recode_to_cp1251_shiftjis.py file_name")
        arg_parser.add_argument('file_name',
                                help="Input text file name. Only files coded in UTF8 are allowed.\n")
        codecs.register_error('nonrus_handler', nonrus_handler)

        input_name = arg_parser.parse_args().file_name
        output_name = path.splitext(input_name)[0] + ".s"

        with open(input_name, 'rt', encoding=UTF_CODEPAGE) as input_file:
            with open(output_name, 'wb') as output_file:
                for line in input_file:
                    # Encode character by character so that the fallback handler
                    # picks the code page for each character separately.
                    for char1 in line:
                        output_file.write(char1.encode(RUS_CODEPAGE, "nonrus_handler"))
        print("Done.")


However, it was not without problems. When trying to convert a tilde (U+FF5E FULLWIDTH TILDE), the program failed with the error "UnicodeEncodeError: 'shift_jis' codec can't encode character '\uff5e' in position 0: illegal multibyte sequence".

At first I blamed Python, but in the end I discovered a rather unusual nuance: the mapping between Unicode and the non-Unicode Japanese encodings is ambiguous and depends on the specific implementation.

As a result, Windows maps the Shift JIS character with code 0x8160 to ～ (U+FF5E FULLWIDTH TILDE), while other converters (for example, the iconv utility) map the same character to 〜 (U+301C WAVE DASH), in accordance with the official Unicode mapping table: ftp://ftp.unicode.org/Public/MAPPINGS/OBSOLETE/EASTASIA/JIS/SHIFTJIS.TXT

For its mappings, Microsoft apparently decided to use the scheme from its own cp932 encoding, which is an extended version of Shift JIS.

The same goes for the character with code 0x817C, which is converted to Unicode as － (U+FF0D FULLWIDTH HYPHEN-MINUS) on Windows, or as − (U+2212 MINUS SIGN) by iconv.
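Python itself ships both tables, so the difference is easy to reproduce: its shift_jis codec follows the standard mapping, and its cp932 codec follows the Windows one. A small sketch (`compare_mappings` is just an illustrative name):

```python
# The two problematic Shift JIS byte pairs discussed above.
SAMPLES = {"0x8160": b"\x81\x60", "0x817C": b"\x81\x7c"}


def compare_mappings(samples=SAMPLES):
    """Return {label: (code point via shift_jis, code point via cp932)}."""
    result = {}
    for label, raw in samples.items():
        standard = raw.decode("shift_jis")   # standard mapping table
        windows = raw.decode("cp932")        # Microsoft mapping table
        result[label] = ("U+%04X" % ord(standard), "U+%04X" % ord(windows))
    return result


if __name__ == "__main__":
    for label, (standard, windows) in compare_mappings().items():
        print(label, "shift_jis ->", standard, "| cp932 ->", windows)
```

The same bytes decode to different code points, which is exactly why text decoded on Windows cannot always be encoded back with the shift_jis codec.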

Since all the script files had first been converted from Shift JIS to UTF-8 with Notepad++ (which uses the mapping table adopted in Windows), the reverse conversion from UTF-8 to Shift JIS through our Python program ran into this error.

Therefore, we had to handle ～ and － as separate special cases.

There were other minor flaws too: for example, the ellipsis … (U+2026 HORIZONTAL ELLIPSIS) was replaced with the Cyrillic one from cp1251 rather than the Japanese one from Shift JIS.
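One possible way around the ellipsis problem is to reverse the priority: send only Cyrillic letters to cp1251 and try the Japanese code page first for everything else. This is a sketch of that alternative, not the approach our recoder takes; `encode_char` is a hypothetical name, and it uses cp932 instead of shift_jis with special cases.

```python
def encode_char(char: str) -> bytes:
    """Encode one character: Cyrillic via cp1251, everything else via cp932 first."""
    if "\u0400" <= char <= "\u04ff":
        # Cyrillic block: must be checked first, because cp932 also contains
        # Cyrillic letters, but as two-byte Japanese-font characters.
        return char.encode("cp1251")
    try:
        return char.encode("cp932")
    except UnicodeEncodeError:
        # Last resort for symbols that exist only in cp1251.
        return char.encode("cp1251")


def encode_text(text: str) -> bytes:
    return b"".join(encode_char(char) for char in text)
```

With this order the ellipsis comes out as the two-byte Japanese character, while Russian letters still use their single-byte cp1251 codes.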

After translating the text, you can proceed to work with game graphics.

The game's graphics are in the same .pack archives, but after unpacking they still take some work. For example, almost all png images are unpacked as files with names like sample+DPNG000+x32y0.png. In other words, png images are cut into horizontal strips 88 pixels high, and each strip is stored in a separate file. The file name contains the sequence number of the strip (DPNG000 ... 009) and its x, y coordinates.
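The naming scheme is regular enough to parse with one regular expression. A minimal sketch, assuming the index always has three digits as in the example name; `parse_dpng_name` is a hypothetical helper:

```python
import re

DPNG_RE = re.compile(
    r"^(?P<base>.+)\+DPNG(?P<index>\d{3})\+x(?P<x>\d+)y(?P<y>\d+)\.png$")


def parse_dpng_name(file_name: str):
    """Return (base name, strip index, x, y), or None if the name does not match."""
    match = DPNG_RE.match(file_name)
    if match is None:
        return None
    return (match.group("base"), int(match.group("index")),
            int(match.group("x")), int(match.group("y")))


if __name__ == "__main__":
    print(parse_dpng_name("sample+DPNG000+x32y0.png"))
```

The base name tells which strips belong to the same picture, and x, y give the offset at which each strip has to be pasted.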


I still wonder why this was necessary. If the goal was to make it harder to rip resources from the game, this is clearly not the best method.

To glue the sliced png files back together, asmodean wrote a small Perl script, merge_dpng, which uses ImageMagick. Unfortunately, it came with problems of its own. First, it required Perl, which I did not use, and even after I installed it, the script turned out not to work correctly.

So we wrote a similar program in Python:

Qlie engine dpng files merger
    # -*- coding: utf-8 -*-
    # Qlie engine dpng files merger
    # by Chtobi and Nazon, 2016
    # Requires ImageMagick magick.exe next to this script.
    import os
    import glob
    import re
    import argparse
    import subprocess

    SCRIPT_DIR = os.path.dirname(os.path.abspath(__file__))
    IMGMAGIC = os.path.join(SCRIPT_DIR, 'magick.exe')
    IMGMAGIC_PARAMS1 = ['-background', 'rgba(0,0,0,0)']
    IMGMAGIC_PARAMS2 = ['-mosaic']
    INPUT_FILES_MASK = '*+DPNG[0-9][0-9][0-9]+*.png'
    SPLIT_MASK = '+DPNG'
    x_y_ajusts_re = re.compile(r'(.+)\+DPNG[0-9][0-9][0-9]\+x(\d+)y(\d+)\.')

    if __name__ == '__main__':
        arg_parser = argparse.ArgumentParser(
            prog="DPNG Merger",
            description="Program to merge sliced png files from QLIE engine. "
                        "All files with mask *+DPNG[0-9][0-9][0-9]+*.png "
                        "in the input directory will be merged and copied to the "
                        "output directory.",
            usage="connect_png.py input_dir [output_dir]\n")
        arg_parser.add_argument("input_dir_param",
                                help="Full path to the input directory.\n")
        arg_parser.add_argument("output_dir_param", nargs='?', default=SCRIPT_DIR,
                                help="Full path to the output directory. "
                                     "It would be a script parent directory if not specified.\n")
        args = arg_parser.parse_args()
        input_dir = os.path.abspath(args.input_dir_param)
        output_dir = os.path.abspath(args.output_dir_param)

        # Select only files with DPNG in the name
        all_append_files = [os.path.basename(found) for found in
                            glob.glob(os.path.join(glob.escape(input_dir), INPUT_FILES_MASK))]

        # Group the slices that belong to the same picture:
        # the part of the file name before +DPNG is unique for each picture.
        prep_bunches = {}
        for file_in_dir in all_append_files:
            name_first_part = file_in_dir.partition(SPLIT_MASK)[0]
            prep_bunches.setdefault(name_first_part, []).append(file_in_dir)

        for name_first_part, prepared_bunch in prep_bunches.items():
            # Prepare -page params for ImageMagick: every slice goes to its x, y offset
            connect_png_list = []
            for part_file in sorted(prepared_bunch):
                match = x_y_ajusts_re.match(part_file)
                connect_png_list += ["(", "-page",
                                     "+{0}+{1}".format(match.group(2), match.group(3)),
                                     os.path.join(input_dir, part_file), ")"]
            output_file = os.path.join(output_dir, name_first_part + ".png")
            subprocess.check_output([IMGMAGIC] + IMGMAGIC_PARAMS1 + connect_png_list +
                                    IMGMAGIC_PARAMS2 + [output_file])


So now we have the whole set of pictures that appear in the game? Not at all: if you look through all the merged pictures from all the archives, it turns out that some are still missing, even though they appear in the game. The reason is that the engine has one more file type, with the extension .b. It is something like an animation with images and sounds stored inside.

The resources stored inside are fairly easy to get, but, alas, none of the existing unpackers for .b files worked properly in our case. Either some files were left unpacked, or there were errors caused by Japanese file names, and I did not want to reboot into a Japanese locale.

Here another script of ours came in handy. Since we were not yet familiar with anything like Kaitai Struct at the time, we had to work almost from scratch.

The .b file format turned out to be simple, and besides, our unpacker only had to extract resources from this one game. Other games on the Qlie engine have additional resource types inside .b files, but we will not go into them here.

So, open any .b file in a hex editor and look at the beginning. Before digging in, note that all numeric values are stored in little-endian byte order.
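In Python, little-endian values are read with the struct module and the "<" prefix. The sketch below shows only the first step, checking the file signature and reading one 32-bit size; the 16-byte zero-padded signature is the one our extractor further down checks for, and the helper names are hypothetical.

```python
import struct

ABMP_SIGNATURE = b"abmp12" + bytes(10)   # 16 bytes, padded with zeros


def has_abmp_header(data: bytes) -> bool:
    """Check that the data starts with the .b file signature."""
    return data[:len(ABMP_SIGNATURE)] == ABMP_SIGNATURE


def read_u32_le(data: bytes, offset: int) -> int:
    """Read an unsigned 32-bit little-endian integer at the given offset."""
    return struct.unpack_from("<L", data, offset)[0]


def read_u16_le(data: bytes, offset: int) -> int:
    """Read an unsigned 16-bit little-endian integer at the given offset."""
    return struct.unpack_from("<H", data, offset)[0]
```

For example, the bytes 10 00 00 00 read this way give 16, not 0x10000000.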


And finally, the png file itself.


The absound section is similar in structure to abimage.

AnimatedBMP extractor
    # -*- coding: utf-8 -*-
    # Extract b
    # AnimatedBMP extractor for Bishoujo Mangekyou game files
    # by Chtobi and Nazon, 2016
    import glob
    import os
    import struct
    import argparse

    signa_len = 16  # every signature is padded with zeros to 16 bytes
    b_hdr = b'abmp12' + bytes(10)
    b_abdata = (b'abdata10' + bytes(8), b'abdata11' + bytes(8),
                b'abdata12' + bytes(8), b'abdata13' + bytes(8))
    b_img = (b'abimage10' + bytes(7), b'abimage11' + bytes(7), b'abimage12' + bytes(7),
             b'abimage13' + bytes(7), b'abimage14' + bytes(7))
    # not sure about structure of sound11 and sound12
    b_sound = (b'absound10' + bytes(7), b'absound11' + bytes(7), b'absound12' + bytes(7))

    # Every entry (abimgdatNN or absnddatNN) is laid out as:
    # name size (2) | name | hash size (2) | hash | unknown bytes | data size (4) | data.
    # Only the number of unknown bytes depends on the entry signature.
    UNKNOWN_LEN = {
        b'abimgdat13' + bytes(6): 1 + 12,
        b'abimgdat14' + bytes(6): 77,
        b'absnddat11' + bytes(6): 1,
    }
    DEFAULT_UNKNOWN_LEN = 1  # probably abimgdat10, abimgdat11 and others


    def create_parser():
        arg_parser = argparse.ArgumentParser(
            prog='AnimatedBMP extractor',
            usage='extract_b input_file_name output_dir\n',
            description='AnimatedBMP extractor for QLIE engine *.b files.\n')
        arg_parser.add_argument('input_file_name', nargs='+',
                                help="Input file with full path(wildcards are supported).\n")
        arg_parser.add_argument('output_dir', nargs=1, help="Output directory.\n")
        return arg_parser


    def check_type(file_buf):
        """Guess the file extension from the first bytes of the data."""
        if file_buf.startswith(b'\x89' + b'PNG'):
            return '.png'
        elif file_buf.startswith(b'BM'):
            return '.bmp'
        elif file_buf.startswith(b'JFIF', 6):
            return '.jpg'
        elif file_buf.startswith(b'IMOAVI'):
            return '.imoavi'
        elif file_buf.startswith(b'OggS'):
            return '.ogg'
        elif file_buf.startswith(b'RIFF'):
            return '.wav'
        else:
            return ''


    def check_signa(f_buffer):
        """Return the section type if the buffer ends with a known section signature."""
        if f_buffer.endswith(b_abdata):
            return 'abdata'
        elif f_buffer.endswith(b_img):
            return 'abimgdat'
        elif f_buffer.endswith(b_sound):
            return 'absound'
        return None


    def prepare_filename(out_file_name, out_dir, postfix=''):
        return os.path.join(out_dir, os.path.basename(out_file_name) + postfix)


    def safe_name(file_name):
        """Replace any unusable symbols from filename with _."""
        for letter in '<>*/':
            file_name = file_name.replace(letter, '_')
        return file_name


    def create_file(file_name_hndl, out_buffer):
        if len(out_buffer) != 0:
            with open(file_name_hndl, 'wb') as ext_file:
                ext_file.write(out_buffer)
        else:
            print("Zero file. Skipped.")


    def check_file_header(file_handle, bytes_num):
        file_handle.seek(0)
        readed_bytes = file_handle.read(bytes_num)
        if readed_bytes == b_hdr:
            print("File is valid abmp")
            return True
        else:
            print("Can't read header. Probably, wrong file...")
            return False


    def read_entry(file_handle, unknown_len):
        """Read one picture or sound entry; return its name (decoded) and its data."""
        name_size = struct.unpack('<H', file_handle.read(2))[0]
        name = file_handle.read(name_size).decode('shift_jis', 'strict')
        hash_size = struct.unpack('<H', file_handle.read(2))[0]
        file_handle.seek(hash_size, os.SEEK_CUR)    # hash of the data (don't need it)
        file_handle.seek(unknown_len, os.SEEK_CUR)  # bytes of unknown purpose
        data_size = struct.unpack('<L', file_handle.read(4))[0]
        return name, file_handle.read(data_size)


    def extract_sections(bfile_h, b_file, output_dir):
        file_buffer = bytearray()
        read_byte = bfile_h.read(1)
        while read_byte:
            file_buffer.extend(read_byte)
            # Finding content sections signature
            check_result = check_signa(file_buffer)
            if check_result == 'abdata':
                size = struct.unpack('<L', bfile_h.read(4))[0]
                # Adding _abdata to separate from other parts
                outfile_name = prepare_filename(b_file, output_dir, '_abdata')
                create_file(outfile_name, bfile_h.read(size))
                file_buffer = bytearray()
            elif check_result in ('abimgdat', 'absound'):
                # Number of pictures or sounds in the section
                entries_number = struct.unpack('B', bfile_h.read(1))[0]
                for _ in range(entries_number):
                    entry_hdr = bfile_h.read(signa_len)
                    unknown_len = UNKNOWN_LEN.get(entry_hdr, DEFAULT_UNKNOWN_LEN)
                    file_name, data = read_entry(bfile_h, unknown_len)
                    # Checking file signature and adding proper extension
                    outfile_name = prepare_filename(
                        b_file, output_dir, '_' + safe_name(file_name) + check_type(data))
                    create_file(outfile_name, data)
                file_buffer = bytearray()
            read_byte = bfile_h.read(1)


    if __name__ == '__main__':
        arguments = create_parser().parse_args()
        output_dir = arguments.output_dir[0]
        for file_mask in arguments.input_file_name:
            for b_file in glob.glob(file_mask):
                with open(b_file, 'rb') as bfile_h:
                    if check_file_header(bfile_h, len(b_hdr)):
                        extract_sections(bfile_h, b_file, output_dir)


The script should automatically unpack any png, jpg, bmp, ogg and wav files it finds. Besides those, files in the unknown imoavi format also turn up.

The thing is that all the animations in the game are made either as full-fledged video in ogv format, or as images animated by the engine and stored in .b files, or as animated jpg sequences in imoavi format.

In this case we were interested in the jpg images, so we had to deal with imoavi as well.

There are two sections in imoavi: SOUND and MOVIE. In the MOVIE section, 47 bytes after the header, there are four bytes of the jpg file size. The files are recorded one after another in the original form, separated by a sequence of 19 bytes, where the size of the next file is recorded.

We did not come across any imoavi with sound in this game, so the SOUND section was always empty.

And since we had set about pulling out all of the game's resources anyway, we also wrote a small script to extract the jpg files from imoavi.

Imoavi extractor
    # -*- coding: utf-8 -*-
    # Extract imoavi
    # Imoavi extractor for Bishoujo Mangekyou game files
    # by Chtobi and Nazon, 2016
    import glob
    import os
    import struct
    import argparse

    imoavi_hdr = b'IMOAVI'
    hdr_len = len(imoavi_hdr)


    def create_file(file_name, out_buffer, wr_mode='wb'):
        if len(out_buffer) != 0:
            with open(file_name, wr_mode) as ext_file:
                ext_file.write(out_buffer)
        else:
            print("Zero file. Skipped.")


    def prepare_filename(file_name, out_dir, postfix=''):
        return os.path.join(out_dir, os.path.basename(file_name) + postfix)


    def read_u32(file_handle):
        """Read a little-endian 32-bit value; return None at the end of the file."""
        raw = file_handle.read(4)
        if len(raw) < 4:
            return None
        return struct.unpack('<L', raw)[0]


    def create_parser():
        arg_parser = argparse.ArgumentParser(
            prog='Imoavi extractor',
            usage='extract_imoavi input_file_name output_dir\n',
            description='Imoavi extractor for QLIE engine *.imoavi files.\n')
        arg_parser.add_argument('input_file_name', nargs='+',
                                help="Input file with full path(wildcards are supported).\n")
        arg_parser.add_argument('output_dir', nargs=1, help="Output directory.\n")
        return arg_parser


    if __name__ == '__main__':
        arguments = create_parser().parse_args()
        all_imoavi = glob.glob(arguments.input_file_name[0])
        output_dir = arguments.output_dir[0]

        for imoavi_f in all_imoavi:
            with open(imoavi_f, 'rb') as imoavi_h:
                # Read imoavi file header
                imoavi_h.read(hdr_len)
                imoavi_h.seek(2, os.SEEK_CUR)   # 0x00
                imoavi_h.seek(1, os.SEEK_CUR)   # 0x64
                imoavi_h.seek(3, os.SEEK_CUR)   # 0x00
                imoavi_h.seek(5, os.SEEK_CUR)   # SOUND
                imoavi_h.seek(3, os.SEEK_CUR)   # 0x00
                imoavi_h.seek(1, os.SEEK_CUR)   # 0x64
                imoavi_h.seek(11, os.SEEK_CUR)
                imoavi_h.seek(5, os.SEEK_CUR)   # Movie
                imoavi_h.seek(3, os.SEEK_CUR)   # 00 ??
                imoavi_h.seek(1, os.SEEK_CUR)   # 0x64
                imoavi_h.seek(3, os.SEEK_CUR)   # 0x00 ??
                imoavi_h.seek(4, os.SEEK_CUR)   # ??
                imoavi_h.seek(1, os.SEEK_CUR)   # Number of jpg files in section
                imoavi_h.seek(4, os.SEEK_CUR)   # 0x00
                imoavi_h.seek(1, os.SEEK_CUR)   # 0x05 ???
                imoavi_h.seek(2, os.SEEK_CUR)   # 0x00 ??
                imoavi_h.seek(4, os.SEEK_CUR)   # 720 ??
                imoavi_h.seek(4, os.SEEK_CUR)   # Full size without header?

                to_next_size = struct.unpack('<L', imoavi_h.read(4))[0]  # Bytes till next header
                imoavi_h.seek(16, os.SEEK_CUR)  # 0x00
                jpg_size = struct.unpack('<L', imoavi_h.read(4))[0]
                imoavi_h.seek(4, os.SEEK_CUR)   # 0x00

                file_num = 0
                outfile_name = prepare_filename(imoavi_f, output_dir,
                                                '_' + str(file_num).zfill(3) + '.jpg')
                create_file(outfile_name, imoavi_h.read(jpg_size))

                while to_next_size != 0:
                    to_next_size = read_u32(imoavi_h)
                    if to_next_size is None:  # end of file reached
                        break
                    if to_next_size == 24:  # 0x1C header for index part
                        outfile_name = prepare_filename(imoavi_f, output_dir, '_' + '.index')
                        # concatenate with index file
                        create_file(outfile_name, imoavi_h.read(to_next_size), 'ab')
                    else:
                        imoavi_h.seek(2, os.SEEK_CUR)   # unknown
                        imoavi_h.seek(2, os.SEEK_CUR)   # Unknown, almost always FF FF or FF FE
                        num_raw = imoavi_h.read(1)      # File number
                        if not num_raw:
                            break
                        file_num = struct.unpack('B', num_raw)[0]
                        imoavi_h.seek(11, os.SEEK_CUR)  # 0x00
                        jpg_size = read_u32(imoavi_h)
                        if jpg_size is None:
                            break
                        imoavi_h.seek(4, os.SEEK_CUR)   # 0x00
                        outfile_name = prepare_filename(imoavi_f, output_dir,
                                                        '_' + str(file_num).zfill(3) + '.jpg')
                        create_file(outfile_name, imoavi_h.read(jpg_size))


After unpacking, you can verify that the animation from the title screen in the menu is stored in imoavi format inside the file 1_タイトル画面ムービー.b.


That covers all of the game's resources.

Unfortunately, the translation process revealed several more unpleasant issues that we could not overcome. As I already wrote, the game does not support Unicode encodings, so all translated text is displayed with the wrong letter spacing. There were also a few problems with repacking files and with running the game without switching the system locale to Japanese.

At some point we (or rather, the person responsible for the technical side of the translation in our team) thought: maybe we should not struggle with the old engine at all, and instead port the novel to the Ren'Py engine, getting cross-platform support along the way?
Perhaps we were hasty, but after a while it would have been a pity to quit, and there was nothing left to do but finish the translation.

What did we run into during porting?
More on that in the second part.

References:

Our bitbucket scripts

About the Qlie engine in Japanese

Shift Jis encoding table

More about the problem of transcoding from Shift Jis to UTF-8

Utility exfp3_v3 from asmodean

Source: https://habr.com/ru/post/426431/

