
The Tale of the Compressor That Can Be Named, Though I Don't Remember How

For your attention, a not-quite-New-Year's story with a plot, intrigue, a detective investigation, a chase, deceit, the wisdom of the ancients, and a happy ending. Under the cut you will find archaeological excavations of the Habr of the perestroika era, with a pinch of x86 assembly to taste.


The Setup


I love playing old computer games released in the late 80s and early 90s. They have an indescribable old-fashioned atmosphere, game mechanics and plot that take priority over visual flair, and everything else for which people love the old school. When I was in high school, I got hold of SkyCat by Gamos, a 2D shooter with puzzle elements. I enjoyed playing it, but I could not get past even the first level. By then I was already into Pascal programming and had run up against the limited graphics capabilities of the GRAPH.TPU unit, so I was slowly making my way into the world of x86 assembly. And then one day, looking at IDA (the disassembler by Comrade Guilfanov) and at the SKYCAT.EXE file, which weighed a mere 15 kilobytes, I made a rash decision: to fully reverse engineer the unbeatable game and find out what was inside it. I have been at this fascinating process for many years at a very leisurely pace, and once it is finished I plan to share the results with the community (hopefully not in as many years again; it is now much closer to a successful end than to the beginning). But this article is not exactly about that.

During the 2016 New Year holidays, I decided to amuse myself by playing Sid Meier's Civilization on the hardest difficulty. As you have already gathered, I am a rather clumsy gamer, so after a couple of unsuccessful games I turned to the trainer in the game folder, CIVHELP.EXE, which does quite simple things: it changes the year and the treasury in the game. Looking at the 9-kilobyte file size and recalling my moderately successful experience of reverse engineering SkyCat, I made another rash decision: why not poke around in it, really! IDA warned me with the message “Possibly packed file, continue?”, but in vain. Miracles began under the hood.

Intrigue


After analyzing the file, IDA painted a disappointing picture: it found only a pitiful pinch of instructions, about 300 bytes' worth, and identified everything else as packed data. No matter: we load the file into Turbo Debugger, set a breakpoint right after the unpacking, run it, and take a memory dump. And…

I got a bunch of zeros instead of an unpacked program. This happens when the executable has some kind of anti-debugging protection, or when the code is overwritten while the program runs. Either way, the file could not be unpacked for free; I would at least have to look at the unpacker's code.

So as not to tire the reader, whose attention is probably a little dulled after the New Year holidays, I will briefly describe what happens in those Spartan 300 bytes. First, the unpacker code is moved to higher memory addresses (so that it does not overwrite itself while unpacking); then the packed data segment is unpacked and written back to the original location. These manipulations did not strike me as particularly complicated, but the unpacker code looked familiar. Something very similar unpacked the sound files in SkyCat!


The robot's voice in the intro screen is stored in a packed file

I was, to put it mildly, surprised. I went off to compare the assembly code of the sound unpacker in SkyCat with the CIVHELP code. They turned out to be identical apart from the label names. The conclusion suggested itself: the unpacker must be well known enough to end up in two programs by different authors. I did not wait for a third coming of the mysterious unpacker and decided to find out the name of the hero of the occasion. But how do you do that with only a handful of assembly code?

Detective Investigation


The unpacker turned out to be fairly simple, so while picking through its code during the SkyCat analysis I had worked out how it operates: read a control word from the input stream, then read commands from that word bit by bit, each one either copying a byte from the input stream to the output or copying a part of the already decoded stream to the output; when the control word is used up, fetch the next one from the input stream, and so on until the input is exhausted. If the reader is not afraid of ancient prophecies in the terrible language of x86 assembly, the resulting code can be found here. The remnants of university knowledge in my head suggested that the Lempel-Ziv algorithm and all its numerous descendants are built on exactly this idea. Accordingly, my search narrowed (albeit only slightly) to the LZ family, leaving aside such remarkable people as Huffman, Burrows, Wheeler, Fano, and Shannon. Having studied the impressive list of LZ family algorithms, I did not find a single implementation that used the approach of my mysterious decompressor: no dictionaries when unpacking, no spending several bits to flag the copy of a single character, and other subtle points.
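The working principle described above can be sketched in a few lines of Python. This is only an illustration under invented assumptions, not the real bitstream of the unpacker: here the control word is assumed to be 16 bits, little-endian, consumed from the lowest bit; a 1 bit means "copy one literal byte", and a 0 bit means "read a one-byte distance and a one-byte length, then copy from the already decoded output". The name `lz_unpack` and the byte layout are hypothetical.

```python
def lz_unpack(packed: bytes) -> bytes:
    """Decode a toy LZ stream driven by 16-bit control words.

    Assumed (hypothetical) format, not the real unpacker's layout:
      bit = 1 -> copy one literal byte from input to output
      bit = 0 -> read (distance, length) bytes, copy from decoded output
    """
    out = bytearray()
    pos = 0
    while pos + 2 <= len(packed):
        # Fetch the next control word from the input stream.
        control = int.from_bytes(packed[pos:pos + 2], "little")
        pos += 2
        for bit in range(16):
            if pos >= len(packed):
                return bytes(out)  # input exhausted: decoding is done
            if (control >> bit) & 1:
                out.append(packed[pos])  # literal byte
                pos += 1
            else:
                if pos + 2 > len(packed):
                    return bytes(out)  # truncated command: stop
                distance, length = packed[pos], packed[pos + 1]
                pos += 2
                if distance == 0 or distance > len(out):
                    raise ValueError("back-reference outside decoded data")
                # Byte-by-byte copy, so the source may overlap the output.
                for _ in range(length):
                    out.append(out[-distance])
    return bytes(out)


# Three literals (bits 1,1,1), then one back-reference (bit 0).
print(lz_unpack(b"\x07\x00abc\x03\x06"))  # b'abcabcabc'
```

Note that no dictionary is kept during unpacking: the already decoded output itself serves as the only "dictionary", and a literal costs exactly one flag bit.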

The next idea (which, in fact, would have occurred to a reasonable person first) was signature analysis. Programs often leave special byte markers in files to indicate which program created the file and/or which program can read, edit, or run it. For example, DOS executables start with the characters MZ, and BMP images begin with the BM header. As for SkyCat, I was sure I knew the purpose of every byte, so there could be no markers there. In CIVHELP.EXE, the unpacking code was followed by 5 bytes of data which, when converted to a string, read *FAB* . A quick googling turned up nothing computer-related for this piece of text. The attempt at signature analysis failed before it had really begun.
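As a rough illustration of signature analysis, the sketch below checks the two leading markers mentioned above and searches a byte string for an embedded marker such as *FAB*. The function names and the tiny signature table are made up for the example; real tools know hundreds of signatures.

```python
# Only the two leading markers mentioned in the text; a hypothetical table.
SIGNATURES = {
    b"MZ": "DOS executable",
    b"BM": "BMP image",
}


def identify(header: bytes) -> str:
    """Guess the file type from its first bytes."""
    for magic, name in SIGNATURES.items():
        if header.startswith(magic):
            return name
    return "unknown"


def find_marker(data: bytes, marker: bytes) -> int:
    """Return the offset of an embedded marker, or -1 if it is absent."""
    return data.find(marker)


print(identify(b"MZ\x90\x00"))                        # DOS executable
print(find_marker(b"\x00\x01*FAB*\x02", b"*FAB*"))    # 2
```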

In my excitement, I began reading various websites about the successors to the work of Lempel and Ziv, peering into their source code and studying how they work, but found nothing similar. After several hours of searching, I came across a collection of various compressors with source code and began downloading the archives one by one, unpacking them, and looking through the sources... and suddenly my eye caught familiar assembly code!

Chase


 ; "" - ,     , ;         ;  (C)  1991 

The FROG.ASM file began with this amusing description, followed by the code of the compressor itself! It would seem that my search was over, but I still had a number of questions.
The folder with the compressor contained a document that read:
Yu. D. Krasilnikov. A file compression/decompression program and a compressor for .COM files.
The motivation for writing these programs was I. Taranenko's article “The SQUEEZE data packing procedure”, published earlier in Softpanorama. The procedure used the principle of the LZEXE packer. This algorithm is notable for its simplicity, efficiency, and very high unpacking speed.

It already followed from this that Comrade Krasilnikov did not offer packing of EXE files, because Comrade Taranenko and the LZEXE utility had managed that before him. The description went on to say that FROG.ASM is merely an auxiliary unpacking routine and that the COM packer itself is written in C. So who packed CIVHELP.EXE, and with which program? And why were the sound files in SkyCat packed as plain data?

I decided to dig in the direction of LZEXE . It turned out that its author, Fabrice Bellard, is a very cool programmer, and I should be ashamed that I had not heard of him before. The mysterious *FAB* string flashed through my head, and it became clear that this was the author's signature. However, Fabrice did not publish the source code of his remarkable compressor. Some consolation was that Mitugu Kurizono had created the UNLZEXE utility, which restores the output of LZEXE to its original state. Well, that is just what we need! I fired up DOSBox and typed at the command line
 UNLZEXE.EXE CIVHELP.EXE 

and got the message
 CIVHELP.EXE is not LZEXE file 


Deceit


- What on earth! - I thought. - That *FAB* string cannot be in the file for no reason!

Fortunately, I had the UNLZEXE.C file at my disposal, with very concise source code. It made clear that, for unpacking to work, the string "LZ91" must be located at offset 1C in the EXE file. We fix 4 bytes with a hex editor, and voila! We get as many as 12 kilobytes out of 8, the file runs, and IDA shows a large pile of code and a small group of human-readable strings. Why the header contained the wrong bytes, I do not know. Fabrice states that he released only two versions of the compressor, and in both the signatures were in place. My guess is that it was done by hand to cover the traces of using a compressor. Why anyone would do that when making a small save-game trainer is a separate mystery.
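The hex-editor fix can also be expressed as a short script. This is a minimal sketch that relies only on the two facts stated above (the string "LZ91" at offset 1C); the function names are hypothetical, and the sketch works on an in-memory copy of the file rather than touching anything on disk.

```python
LZEXE_SIG_OFFSET = 0x1C   # offset named in the text
LZEXE_SIG = b"LZ91"       # signature named in the text


def has_lzexe_signature(image: bytes) -> bool:
    """True if the expected 4-byte signature sits at offset 0x1C."""
    return image[LZEXE_SIG_OFFSET:LZEXE_SIG_OFFSET + 4] == LZEXE_SIG


def restore_lzexe_signature(image: bytes) -> bytes:
    """Return a copy of the EXE image with the signature bytes put back."""
    if image[:2] != b"MZ" or len(image) < LZEXE_SIG_OFFSET + 4:
        raise ValueError("not a DOS EXE image or header too short")
    patched = bytearray(image)
    patched[LZEXE_SIG_OFFSET:LZEXE_SIG_OFFSET + 4] = LZEXE_SIG
    return bytes(patched)


# A fake 32-byte header stands in for the real file here.
fake = b"MZ" + bytes(30)
fixed = restore_lzexe_signature(fake)
print(has_lzexe_signature(fake), has_lzexe_signature(fixed))  # False True
```

To apply it to a real file, one would read the bytes, pass them through `restore_lzexe_signature`, and write the result to a new file, keeping the original as a backup.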

The wisdom of the ancients


One question remained: the origin of the compressor in SkyCat. I have no exact answer, but I do have a guess. The frog program described above was published in Softpanorama, a programmers' bulletin founded in 1989 by Nikolai Bezrukov. The main goals of Softpanorama were the free exchange of programs in source form and protection against computer viruses. Since the community was born before Fidonet spread across the country, communication took place by mailing and copying floppy disks. In effect it was a prehistoric Habr with a programming bias, which deserves a serious article of its own (and it would be very desirable for one of the then-active participants to write it). I think it is likely that, after reading one of the Softpanorama issues, the developers at the domestic company Gamos borrowed the open source code of fellow countryman Krasilnikov.

Happy ending


Since my search for the origin of the mysterious algorithm was crowned with success, I calmed down, satisfied. And diving into the Softpanorama archive took me back to the unforgettable world of nineties programming news, which made me rather happy. I advise you, too, to soak up the spirit of the era.

Source: https://habr.com/ru/post/274461/

