
Localizing a visual novel, using Hoshizora no Memoria as an example

Just don't tell me that geeks don't read visual novels. So, there is this visual novel (it has a profile on vndb).
It is very good, by the way, with a rating of 8.06; I recommend it to anyone who is not put off by a certain amount of hentai.
But that is not what this article is about.
There is an English patch and there is a Chinese one, but no Russian one. That is unfair. Let's try to fix it.
We will need:



Research


Let's start a new game and look at the first sentence: "俺は彼女が好きだった". In the English version it is "I liked her". Remember it, it will come in handy. Close the game and go to its folder. There are a couple of suspicious files there: "Memoria.hcb" and "MemoriaEN.hcb", the original and the English script respectively. Let's start with the first one. Open it in a hex editor and look at the first 4 bytes. They look very familiar. Checking against MemoriaEN.hcb confirms that this is nothing other than the file size minus 2029 bytes, in Big Endian. The last 2029 bytes hold the code that runs when the game is closed (a smooth fade-out of the picture and the farewell text). In principle it may need changing too, but this version of the localization does not require it.
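The header check can be sketched roughly like this. The function names are made up for this illustration, and the byte order is left as a parameter: the only thing taken from the article is the relation "first four bytes = file size minus 2029", so the order is worth verifying against your own copy of the file.

```cpp
#include <cstdint>

// Decode 4 bytes as an unsigned 32-bit number in the requested byte order.
// (Hypothetical helper, not part of the original tools.)
std::uint32_t decodeU32(const unsigned char *p, bool bigEndian)
{
    if (bigEndian)
        return (std::uint32_t(p[0]) << 24) | (std::uint32_t(p[1]) << 16) |
               (std::uint32_t(p[2]) << 8)  |  std::uint32_t(p[3]);
    return (std::uint32_t(p[3]) << 24) | (std::uint32_t(p[2]) << 16) |
           (std::uint32_t(p[1]) << 8)  |  std::uint32_t(p[0]);
}

// True if the first four bytes of the script hold "file size minus tailSize".
// For the files described above tailSize would be 2029.
bool headerMatches(const unsigned char *first4, std::uint32_t fileSize,
                   std::uint32_t tailSize, bool bigEndian)
{
    return fileSize >= tailSize &&
           decodeU32(first4, bigEndian) == fileSize - tailSize;
}
```

Calling headerMatches with both values of bigEndian on the real file shows at once which interpretation holds.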

Now let's search for the phrase "I liked her".


If you write out a few such lines in a row, a pattern becomes visible:

Suppose this is an offset again. Well, let's follow it. We land right after the phrase "俺は彼女が好きだった". Almost: the first couple of characters have turned into some kind of mess.



The first 5 bytes have been replaced by something similar: 06 XX XX XX XX! And that address is exactly where "I liked her" begins, or rather the 0E byte in front of it. After the string some kind of magic begins. It was established experimentally that the sequence 08 08 08 02 E3 89 04 00 02 7F 98 04 00 waits for a mouse click or a key press, which confirms moving on to the next line. This will come in handy too.

To generalize, we get the following structure.
In the original script:
<0E> <size, 1 byte> <text, size bytes, encoded in Shift-JIS> <13 bytes, confirmation that the player has read the text>
<some checks; depending on them, <06> <address> jumps to the next line or to another branch of the story>
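As an illustration, a record of this shape could be read from a buffer as follows. This is only a sketch: TextRecord and parseTextRecord are hypothetical names, the script is assumed to be loaded into memory as a whole, and the checks that follow the 13 bytes are not interpreted at all.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// The 13 bytes that wait for a click or a key press (taken from the article).
static const unsigned char kClickWait[13] = {
    0x08, 0x08, 0x08, 0x02, 0xE3, 0x89, 0x04, 0x00, 0x02, 0x7F, 0x98, 0x04, 0x00};

struct TextRecord {
    std::string text;    // exactly `size` raw bytes (Shift-JIS in the original)
    std::size_t next;    // offset of the first byte after the record
    bool waitsForClick;  // true if the 13-byte sequence follows the text
};

// Try to read one record at `pos`. Returns false if there is no 0E byte
// there or the record would run past the end of the buffer.
bool parseTextRecord(const std::vector<unsigned char> &script, std::size_t pos,
                     TextRecord &out)
{
    if (pos + 2 > script.size() || script[pos] != 0x0E)
        return false;
    std::size_t size = script[pos + 1];
    std::size_t textStart = pos + 2;
    if (textStart + size > script.size())
        return false;
    out.text.assign(script.begin() + textStart,
                    script.begin() + textStart + size);
    out.next = textStart + size;
    out.waitsForClick =
        out.next + sizeof kClickWait <= script.size() &&
        std::memcmp(&script[out.next], kClickWait, sizeof kClickWait) == 0;
    if (out.waitsForClick)
        out.next += sizeof kClickWait;   // step over the click wait as well
    return true;
}
```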

The English patch uses a classic trick: if you need to insert code at a certain place but must not shift the addresses, you jump to the end of the file, do your business there, and jump back.

So the first 5 bytes of the original string are replaced with <06> <address a little past the original end of the file>,
and at that address there is <0E> <size> <English text> <06> <jump back to that 13-byte check>
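The trick itself fits in a few lines. The sketch below works on a script held in memory; redirectLine and putU32LE are hypothetical names, jump targets are written little-endian as in the tools in this article, and the first four bytes of the file are left untouched. The new text is written byte for byte, so if the terminating \0 should be counted in the size (as the tools here do), pass it as part of newText.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Write `v` as 4 little-endian bytes starting at buf[at].
void putU32LE(std::vector<unsigned char> &buf, std::size_t at, std::uint32_t v)
{
    for (int i = 0; i < 4; ++i)
        buf[at + i] = static_cast<unsigned char>(v >> (8 * i));
}

// Replace the 0E record at `linePos` with a jump to a new copy of the line
// appended to the end of the buffer. Returns false if the record is
// malformed, too short to hold the 5-byte jump, or the new text does not
// fit into the one-byte size field.
bool redirectLine(std::vector<unsigned char> &script, std::size_t linePos,
                  const std::string &newText)
{
    if (linePos + 2 > script.size() || script[linePos] != 0x0E)
        return false;
    std::size_t oldSize = script[linePos + 1];
    std::size_t resume = linePos + 2 + oldSize;  // first byte after the old text
    if (resume > script.size() || resume - linePos < 5 || newText.size() > 255)
        return false;

    std::uint32_t target = static_cast<std::uint32_t>(script.size());
    script[linePos] = 0x06;                      // jump ...
    putU32LE(script, linePos + 1, target);       // ... to the appended line

    script.push_back(0x0E);
    script.push_back(static_cast<unsigned char>(newText.size()));
    script.insert(script.end(), newText.begin(), newText.end());
    script.push_back(0x06);                      // jump back ...
    script.resize(script.size() + 4);
    putU32LE(script, script.size() - 4, static_cast<std::uint32_t>(resume));
    return true;
}
```

The rest of the old string stays in place as dead bytes; nothing ever jumps into it, so the addresses of everything after it are preserved.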

Ripping text


Here is a hastily written utility that rips the text out of the script:
ripper
#include <cstdio>

int main()
{
    FILE *en  = fopen("MemoriaEN.hcb", "rb");
    FILE *jp  = fopen("Memoria.hcb", "rb");
    FILE *out = fopen("out.txt", "w");
    if (!en || !jp || !out) return 1;

    // Position both scripts at the first text line.
    fseek(en, 0x8c827, SEEK_SET);
    fseek(jp, 0x8c827, SEEK_SET);

    unsigned char op_en, op_jp;       // current opcode in each script
    unsigned char size_en, size_jp;
    char str_en[255], str_jp[255];
    unsigned int off_en;              // 4-byte jump target, read as stored

    while (true) {
        fread(&op_en, 1, 1, en);
        fread(&op_jp, 1, 1, jp);
        // The patch has a jump (06) where the original has a string (0E).
        if (op_en != 0x06 || op_jp != 0x0E) break;

        fread(&size_jp, 1, 1, jp);
        fread(str_jp, size_jp, 1, jp);

        // Follow the jump to the English string near the end of the file.
        fread(&off_en, 1, 4, en);
        fseek(en, off_en, SEEK_SET);
        fread(&op_en, 1, 1, en);
        if (op_en != 0x0E) break;
        fread(&size_en, 1, 1, en);
        fread(str_en, size_en, 1, en);

        // Uncomment the next two lines for a Japanese-English bilingual dump.
        //fwrite(str_jp, size_jp, 1, out);
        //fwrite("\n", 1, 1, out);
        fwrite(str_en, size_en, 1, out);
        fwrite("\n\n", 1, 2, out);

        // After the English string comes the jump back into the original code.
        fread(&op_en, 1, 1, en);
        if (op_en != 0x06) break;
        fread(&off_en, 1, 4, en);
        if (off_en == 0x0045793e) break;   // hard-coded stop address
        fseek(en, off_en, SEEK_SET);
        fseek(jp, ftell(en), SEEK_SET);

        // Scan both files in lockstep until the next 06/0E pair: the next line.
        do {
            fread(&op_jp, 1, 1, jp);
            fread(&op_en, 1, 1, en);
            if (op_en == 0x06 && op_jp == 0x0E) break;
        } while (!feof(en));
        fseek(en, -1, SEEK_CUR);
        fseek(jp, ftell(en), SEEK_SET);
    }

    fclose(jp);
    fclose(en);
    fclose(out);
    return 0;
}


The code is far from optimal, but it does its job: in a few seconds it produces 2.5 megabytes (or is it mebibytes?) of text. If you uncomment the 2 lines in the middle, the output becomes a bilingual Japanese-English text.

Here I cheated a little: I did not study the format of the checks after the text in detail, so I find the address where the next line begins by comparing the English and Japanese scripts.

Putting text back


Suppose we have somehow translated the extracted text; now it would be nice to stuff it back in.
One more little tool. The code below already contains the fix; the bug itself is described further down.

puffer
#include <cstdio>
#include <cstring>

unsigned char buff[0x7fffff];    // the whole English script
unsigned char s[1024];           // current translated line, including its \0
// prompt for a click
unsigned char cl[13] = {0x08, 0x08, 0x08, 0x02, 0xE3, 0x89, 0x04, 0x00, 0x02, 0x7F, 0x98, 0x04, 0x00};
const int ro = 250;              // max length of one string
const unsigned char jmp = 0x06;  // jump to xx xx xx xx (little-endian)
const unsigned char stl = 0x0E;  // marks a new string

int main()
{
    FILE *en    = fopen("MemoriaENO.hcb", "rb"); // unmodified English script
    FILE *text  = fopen("text.txt", "r");        // translated strings
    FILE *trans = fopen("MemoriaEN.hcb", "wb");  // translated script
    if (!en || !text || !trans) return 1;

    fseek(en, 0, SEEK_END);
    unsigned int end = ftell(en);                // Russian lines are appended from here
    rewind(en);
    fread(buff, end, 1, en);
    fwrite(buff, end, 1, trans);
    fseek(en, 0x45798e, SEEK_SET);

    while (true) {
        // read the next translated line, terminated with \0
        int ts = 0;
        do {
            fread(&s[ts], 1, 1, text);
            ts++;
        } while (s[ts - 1] != 0);
        fseek(text, 4, SEEK_CUR);                // skip the 4 bytes between lines

        fseek(trans, ftell(en), SEEK_SET);
        unsigned char todo_en, size_en;
        fread(&todo_en, 1, 1, en);               // should be 0x0E
        //if (todo_en != 0x0E) break;
        fread(&size_en, 1, 1, en);
        fseek(en, size_en, SEEK_CUR);
        unsigned int retpos = ftell(en);         // end of the English string: return here

        fwrite(&jmp, 1, 1, trans);
        fwrite(&end, 4, 1, trans);               // jump to the Russian line
        fseek(trans, end, SEEK_SET);
        end += ts + 2 + 5;                       // text + (0E, size) + jump back

        // While the line does not fit into a one-byte size, cut it after the
        // last space that fits (assumes there is one in the first ro bytes).
        while (ts > ro) {
            int d = ro;
            while (d > 1 && s[d - 1] != ' ') d--;
            s[d - 1] = 0x00;                     // the space becomes the terminator
            unsigned char len = (unsigned char)d;
            fwrite(&stl, 1, 1, trans);
            fwrite(&len, 1, 1, trans);
            fwrite(s, d, 1, trans);
            fwrite(cl, 13, 1, trans);            // wait for a click before the next part
            end += 15;                           // extra 0E, size and 13 bytes of cl
            memmove(s, &s[d], ts - d);           // regions overlap, so not memcpy
            ts -= d;
        }

        // write the remaining (or the only) part of the line
        unsigned char len = (unsigned char)ts;
        fwrite(&stl, 1, 1, trans);
        fwrite(&len, 1, 1, trans);
        fwrite(s, ts, 1, trans);
        fwrite(&jmp, 1, 1, trans);
        fwrite(&retpos, 4, 1, trans);            // jump back to the English script
        fseek(en, 5, SEEK_CUR);                  // skip the English line's own jump
        if (retpos == 0x732aaf) break;           // exit after the last line
    }

    fclose(en);
    fclose(trans);
    fclose(text);
    return 0;
}



Another "masterpiece" of code, of course, but it does what is needed, and also in a couple of seconds.

We run it and the script gets built. We try to play with the translation. The first few sentences are fine,
and suddenly: Runtime Error!
Back to the hex editor to look for the offending sentence. There it is. The length of a line is stored in 1 byte and the translation is longer, so the size overflows, and the game thinks the text is not 300 bytes long but 45. The engine then tries to "execute" some character in the middle of the sentence as an instruction, with a predictable result.
Remember those 13 bytes? When the size limit is exceeded, we split the text into parts: output the first part,
wait for a click, output the second, and so on until the text runs out. Of course, you could rephrase the translation instead, but it is better to be safe.
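The splitting step on its own can be sketched like this. It is a standalone illustration, not the code of the tool above: splitLine is a hypothetical name, the limit is a parameter (the tool uses 250), and a one-byte size field keeps only the length modulo 256, which is why the limit has to stay below that. Each returned part would then be written as its own 0E record, with the 13-byte click wait between parts.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Split `line` into parts of at most `maxLen` bytes, cutting at the last
// space that fits; the space at the cut is dropped. A word longer than
// maxLen is cut in the middle as a last resort (in a multi-byte encoding
// this could split a character).
std::vector<std::string> splitLine(const std::string &line, std::size_t maxLen)
{
    std::vector<std::string> parts;
    if (maxLen == 0) {                 // nothing sensible to do: return as is
        parts.push_back(line);
        return parts;
    }
    std::size_t start = 0;
    while (line.size() - start > maxLen) {
        // Last space at or before position start + maxLen.
        std::size_t cut = line.rfind(' ', start + maxLen);
        if (cut == std::string::npos || cut <= start) {
            parts.push_back(line.substr(start, maxLen));    // hard cut
            start += maxLen;
        } else {
            parts.push_back(line.substr(start, cut - start));
            start = cut + 1;                                // skip the space
        }
    }
    if (start < line.size() || parts.empty())
        parts.push_back(line.substr(start));
    return parts;
}
```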

Result



We install the Segoe UI font into the game ...



True, the line breaks are not always ideal; sometimes a phrase is broken right in the middle of a word.
The game supports marking the break position manually: you place a tilde at the desired spot. The insertion tool should be extended so that it places the tildes itself. Also, the character names and the menus are not translated yet. Names are easy to change right in the script, but I am not sure about the menus yet. It looks like more resources will have to be dug out and some of the graphics redrawn.

I have created a translation project on Notabenoid; anyone who wants to can take part, since on my own I would not get through it even in a year.

Source: https://habr.com/ru/post/160603/

