It so happens that I am a big fan of William Gibson's work. I came to this remarkable writer through a deep fascination with cyberpunk aesthetics and a subsequent search for the "roots" of the genre, of which Gibson is considered the founding father (even if he himself denies it at every opportunity). With the Sprawl trilogy (1984-1988), and above all its opening novel Neuromancer (1984), Gibson popularized the ideas of cyberspace, virtual reality and a worldwide computer network. He coined the term "cyberspace" himself, first using it in the short story "Burning Chrome" (1982), but it only became widespread after the publication of Neuromancer. Besides the obvious influence on pop culture (Sonic Youth, U2, Ghost in the Shell, The Matrix, Deus Ex, Shadowrun and many other titles inspired by the author in one way or another), some argue that Gibson also had a less obvious influence on information technology. In the preface to the anniversary edition of Neuromancer, the American science fiction writer Jack Womack asks:
Could it be that Gibson’s vision of the global information space ultimately became the reason why the Internet today looks the way it looks and works the way it works?
I tend to give a positive answer to this question.
I learned about the Neuromancer game at the very beginning of my acquaintance with Gibson. The video game of the same name was developed by Interplay Productions (Wasteland, Fallout) and published by Mediagenic (now known as Activision) in 1988 for the Amiga, Apple II, Apple IIGS, Commodore 64 and MS-DOS. In its time the game was warmly received by the press and the gaming community, and in 1996 Computer Gaming World included it in its list of the 150 best video games of all time. The obsession to "port this game to Win64" took hold during yet another run of Neuromancer under DOSBox. Having obtained the first results, I decided to document my work in the form of an article (and, I hope, eventually a series of articles), which you see before you.
I don’t think I can finish this work. However, at the moment, I intend to continue, because for me it is:
I hope my experience will be useful to you in one way or another. After all, given the content of the source material, don't you find the very process of reverse engineering Neuromancer somewhat ironic? :)
The Neuromancer distribution I got hold of consists of one 16-bit executable in MZ format, NEURO.EXE, and two binary files, NEURO1.DAT and NEURO2.DAT (presumably resources). Loading the "patient" into IDA immediately produced the warning "Possibly packed file", followed by "sp-analysis failed": clear evidence that the file is compressed, encrypted, or both. Most likely just compressed: in an era when every byte of disk space was fought over, compression was a must. A few minutes later I had the DOS utility UNP at hand. Running the executable through it confirmed my compression hypothesis:

    > :\UNP411\UNP.EXE C:\NEURO\NEURO.EXE
    UNP 4.11 Executable file restore utility, written by Ben Castricum, 05/30/95

    processing file : C:\NEURO\NEURO.EXE
    DOS file size   : 94921
    file-structure  : executable (EXE)
    EXE part sizes  : header 512, image 94409, overlay 0 bytes
    processed with  : EXEPACK V4.05 or V4.06
    action          : decompressing... done
    new file size   : 313376
    writing to file : C:\NEURO\NEURO.EXE
The first layer of "ice" is broken, and IDA now works quietly and without errors.
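The sizes UNP reports come straight from the MZ header. As a minimal sketch (the function and type names here are my own, not anything from the game or from UNP), the header and image sizes can be derived from three little-endian header fields like this:

```c
#include <stddef.h>
#include <stdint.h>

/* Sizes derived from a DOS MZ header (all fields are little-endian). */
typedef struct mz_sizes_t {
    uint32_t header_size; /* e_cparhdr * 16 */
    uint32_t image_size;  /* load image, header excluded */
    uint32_t total_size;  /* header + image, as recorded in the header */
} mz_sizes_t;

static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Returns 0 on success, -1 if the buffer is too short or not an MZ file. */
static int mz_get_sizes(const uint8_t *hdr, size_t len, mz_sizes_t *out)
{
    if (len < 10 || hdr[0] != 'M' || hdr[1] != 'Z')
        return -1;

    uint16_t last_page_bytes = rd16(hdr + 2); /* e_cblp: bytes used in the last page */
    uint16_t pages           = rd16(hdr + 4); /* e_cp: number of 512-byte pages */
    uint16_t header_paras    = rd16(hdr + 8); /* e_cparhdr: header size in 16-byte paragraphs */

    if (pages == 0)
        return -1;

    uint32_t total = (uint32_t)pages * 512u;
    if (last_page_bytes != 0)
        total -= 512u - last_page_bytes;     /* the last page is only partly used */

    uint32_t header = (uint32_t)header_paras * 16u;
    if (header > total)
        return -1;

    out->header_size = header;
    out->image_size  = total - header;
    out->total_size  = total;
    return 0;
}
```

The actual header bytes of NEURO.EXE are not shown in this article, but hypothetical field values of 186 pages, 201 bytes in the last page and 32 header paragraphs give exactly the figures above: header 512, image 94409, total 94921.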
I take a look at what I'm dealing with. [FLASHFORWARD: at the time it did not occur to me that binary visualization tools can be used for a first superficial look. Having tried them since, I realize I could have saved myself a lot of time. For the unpacked file, the binvis.io service produces the picture referred to below.]
A quick pass through IDA suggests that the code itself takes up about 1/5 of the total size (~64k, the upper part of the image). The code is full of MS-DOS system calls (int 21) and calls to various standard C library functions. In the data segment I find a large number of readable strings (the blue sectors at the bottom). Most of the binary is taken up by zeros (the black sectors), most likely statically allocated working arrays and structures.
All in all, things look good! Given the recognized system and library calls, many of which deal with file I/O, I decided to start by reversing the resource management system.
While tracing the procedures that open file descriptors, I came across a function that turned out to be the key to understanding the structure of the game's .DAT files. [Obfuscated disassembler output is far from the most pleasant thing to look at, but I will still occasionally back up my words with code.] The function takes a pointer to a null-terminated resource name, determines the number of the .DAT file containing that resource, opens a descriptor for it (_dos_open) and sets the read/write offset (lseek) to the start of the resource. Let's look at one of the call chains leading to this function (locate_resource_in_dat):
    ...
    mov     ax, 4CD2h
    push    ds
    push    ax
    mov     ax, 505Eh
    push    ax                      ; resource_name
    call    sub_127DA               ; sub_127DA(char *resource_name, void *, void *)
    ...

    sub_127DA proc:
    ...
    arg_0 = word ptr 4              ; resource_name
    arg_2 = word ptr 6
    arg_4 = word ptr 8

    push    bp
    mov     bp, sp
    sub     sp, 6
    ...
    push    [bp+arg_0]              ; resource_name
    call    locate_resource_in_dat  ; locate_resource_in_dat(char *resource_name)
    ...
In the data segment, at address 0x505E, is the string:
dseg:505E aConfigNmc db 'config.nmc',0
Which .DAT file a resource belongs to is determined inside locate_resource_in_dat by comparing this static string (sub_127DA("config.nmc", ...)) with strings located in the data segment every 22 (0x16) bytes, starting at explicitly specified addresses. Here is what lies at one of them:
    dseg:00A2    db 'R1.BIH', 0
    dseg:00A9    db 0
    dseg:00AA    db 0
    dseg:00AB    db 0
    dseg:00AC    db 0
    dseg:00AD    db 0
    dseg:00AE    db 0
    dseg:00AF    db 0
    dseg:00B0    db 0
    dseg:00B1    db 0
    dseg:00B2    db 0
    dseg:00B3    db 0
    dseg:00B4    db 0EEh
    dseg:00B5    db 5
    dseg:00B6    db 0
    dseg:00B7    db 0
    dseg:00B8    db 'R1.PIC', 0
    dseg:00BF    db 0
    dseg:00C0    db 0
    dseg:00C1    db 0
    dseg:00C2    db 0
    dseg:00C3    db 0
    dseg:00C4    db 0
    dseg:00C5    db 0
    dseg:00C6    db 0EEh
    dseg:00C7    db 5
    dseg:00C8    db 0
    dseg:00C9    db 0
    dseg:00CA    db 046h
    dseg:00CB    db 013h
    dseg:00CC    db 0
    dseg:00CD    db 0
    dseg:00CE    db 'R1.ANH', 0
    ...
It looks like an array of structures of the form:
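    struct resource_t {
        char name[14];  // resource name, zero-padded
        int  offset;    // 32-bit offset of the resource within its .DAT file
        int  size;      // 32-bit size of the resource in bytes
    };

    struct resource_t resources[] = {
        { "R1.BIH", 0,     0x5EE  },
        { "R1.PIC", 0x5EE, 0x1346 },
        ...
    };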
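As a minimal sketch of how such a table can be read and searched from a modern program: the 22-byte records are parsed field by field as little-endian values, so the result does not depend on the host compiler's `int` size or struct padding. The names `resource_entry`, `parse_entry` and `find_resource` are my own, and the case-insensitive comparison is an assumption based on the code passing "config.nmc" while the table stores upper-case names.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ENTRY_SIZE 22 /* 14-byte name + two 32-bit little-endian fields */

typedef struct resource_entry {
    char     name[14];
    uint32_t offset; /* position of the resource inside its .DAT file */
    uint32_t size;   /* size of the resource in bytes */
} resource_entry;

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decodes one raw 22-byte record. */
static void parse_entry(const uint8_t *raw, resource_entry *out)
{
    memcpy(out->name, raw, sizeof out->name);
    out->name[sizeof out->name - 1] = '\0'; /* guard against a missing terminator */
    out->offset = rd32(raw + 14);
    out->size   = rd32(raw + 18);
}

/* Case-insensitive name comparison (an assumption, see the lead-in). */
static int names_equal(const char *a, const char *b)
{
    while (*a && *b) {
        if (toupper((unsigned char)*a) != toupper((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;
}

/* Returns the index of the matching record and fills *out, or -1 if not found. */
static int find_resource(const uint8_t *table, size_t count,
                         const char *name, resource_entry *out)
{
    for (size_t i = 0; i < count; i++) {
        resource_entry e;
        parse_entry(table + i * ENTRY_SIZE, &e);
        if (names_equal(e.name, name)) {
            *out = e;
            return (int)i;
        }
    }
    return -1;
}
```

Run against the two records from the listing above, a lookup of "r1.pic" returns index 1 with offset 0x5EE and size 0x1346.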
There are two such arrays, one per .DAT file. They are located at addresses 0xA2 and 0x81C and consist of 87 and 151 elements respectively. In total I counted 8 different resource types (judging by the extensions in the names): .BIH, .PIC, .ANH, .BIN, .IMH, .NMC, .TXH and .SAV. Names with the .NMC and .SAV extensions occur only once and speak for themselves: CONFIG.NMC and SAVEGAME.SAV. The .IMH resources also have telling names: CURSORS.IMH, SPRITES.IMH, TITLE.IMH and others. The rest are less obvious. My guess is that .TXH is text and .PIC is an image. [By now I am fairly sure that location backgrounds are stored in .PIC.] The contents of the remaining types are still unknown.
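A quick consistency check on the table addresses given above: with 22-byte records, 0xA2 + 87 × 22 = 0x81C, so the second table begins exactly where the first one ends. A tiny sketch of that arithmetic (the helper name `table_end` is hypothetical):

```c
#include <stdint.h>

#define ENTRY_SIZE 22u /* size of one resource record in bytes */

/* Address of the first byte after a table of `count` records starting at `base`. */
static uint32_t table_end(uint32_t base, uint32_t count)
{
    return base + count * ENTRY_SIZE;
}
```

By the same arithmetic, the second table of 151 records would end at 0x1516.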
Stepping through the code in the debugger, I stumbled upon the purpose of the .BIN resources. It started when I could not find the place in the listing where the 256-color VGA mode (mode 0x13) is initialized. However, the DOSBox debugger output showed that it does happen, and roughly when:
    43597504: INT10:Set Video Mode 13
    43597504: VGA:Blinking 0
Having traced the program up to this point, I found that the code that actually performs these actions is loaded into memory from the TVGA.BIN resource, and the program simply makes a call to the address at which it was loaded. So the .BIN files contain dynamically loaded compiled code. Something like dynamic libraries, but not quite. And it was not clear why it was done this way. [So I thought, until I discussed it with someone who was programming back then. It turned out to be a fairly standard approach for DOS, commonly used to reduce the size of executables.]
I wanted some tangible result. I decided to tackle the .IMH format, because judging by the names with this extension (SPRITES.IMH) it holds graphics. Of all the resources of this type I picked the smallest one and saved it to a separate file:
    CURSORS.IMH: 243
    0x0000: 0001 0203 0400 060D 0009 0A0B 0000 000F
    0x0010: 0001 0203 0405 080B 0809 0A0B 0C0D 0E0F
    0x0020: 3F01 0000 0440 7EEC D0C2 2D0A 46A3 3FF2
    0x0030: 7111 6230 640E 104C E0C5 8505 FCFF 5001
    0x0040: 04C0 260F E2C0 85C2 5017 FA05 54B0 6C2D
    0x0050: 7413 E9C7 451D D8B6 2C18 E714 7B8B D79B
    0x0060: BBE5 EB60 4D2F EF70 3A1E 42FE D9C0 5DC0
    0x0070: EBCD EE07 B9BF 821C 0141 BB15 360B 743A
    0x0080: 4BC8 749E 45D7 5B26 63F4 24E9 DAEA 5CC6
    0x0090: E859 D793 41F7 94DF B7AC 4DB7 EFC2 CC4F
    0x00A0: 5D5F 66D1 E5E3 3F2B AC42 7C5E 3AF1 9F95
    0x00B0: D7D9 E8E9 3D75 62DE 9B05 86E6 0EBF 794C
    0x00C0: 1D0B 3A76 9A97 31BA 1274 ED75 2663 FE79
    0x00D0: 175F BC87 FD58 9B6F DD0B 12F2 652F 12FB
    0x00E0: 1E99 5E37 B24C 424C 4F4C AF1B 6CC4 BEC7
    0x00F0: C99F C0
This is definitely not a plain bitmap, so I need to work out what else it could be [in order of increasing complexity]:
The first option was ruled out fairly quickly. I tried a dozen different extensions and viewers and could not open the file. It got as far as the exotic: I found out that images with the .imh extension are used by IRAF (Image Reduction and Analysis Facility), a software suite for astronomers (literally). But that was a dead end.
At this stage there was not much difference for me between the second and third options. Of course, knowing the specific algorithm, it would be much easier to implement it in a high-level language or, better still, take a ready-made implementation. However, not being an expert (or even an amateur) in data compression, I found it extremely hard to identify an algorithm from disassembled code or from signatures. Either way, the task was to isolate in IDA the part of the code responsible for decompression and adapt it to 64-bit assembly (in my case, MASM).
And I got lucky. My CURSORS.IMH is one of the first resources loaded in main:
    ...
    mov     ax, 2
    mov     dx, seg seg009
    push    dx
    push    ax
    mov     ax, 506Ah               ; "cursors.imh"
    push    ax
    call    sub_126CB
    ...
A long and persistent tracing of sub_126CB followed. But it was worth the effort. The output of this function ends up at the address stored in dx:
    0x0000: 0000 0000 0800 0A00 0666 6666 6666 6600
    0x0010: 6777 7777 7777 7760 6777 7777 7766 7776
    0x0020: 0666 6676 6677 7776 0006 7767 7777 7766
    0x0030: 0000 6667 7777 7676 0000 6776 6666 6776
    0x0040: 0000 0666 6667 7766 0000 0067 7677 7660
    0x0050: 0000 0006 6666 6600 0400 0000 0600 0C00
    0x0060: 0000 3000 0000 0003 B300 0000 003B BB30
    0x0070: 0000 03BB BBB3 0000 3BBB BBBB 3000 333B
    0x0080: BB33 3000 003B BB30 0000 003B BB30 0000
    0x0090: 003B BB30 0000 003B BB30 0000 003B BB30
    0x00A0: 0000 0033 3330 0000 0B00 0400 0600 0900
    0x00B0: 0000 0033 0000 0000 003B 3000 3333 333B
    0x00C0: B300 3BBB BBBB BB30 3BBB BBBB BBB3 3BBB
    0x00D0: BBBB BB30 3333 333B B300 0000 003B 3000
    0x00E0: 0000 0033 0000 0400 0B00 0600 0C00 0033
    0x00F0: 3330 0000 003B BB30 0000 003B BB30 0000
    0x0100: 003B BB30 0000 003B BB30 0000 003B BB30
    0x0110: 0000 333B BB33 3000 3BBB BBBB 3000 03BB
    0x0120: BBB3 0000 003B BB30 0000 0003 B300 0000
    0x0130: 0000 3000 0000 0000 0400 0600 0900 0000
    0x0140: 3300 0000 0003 B300 0000 003B B333 3333
    0x0150: 03BB BBBB BBB3 3BBB BBBB BBB3 03BB BBBB
    0x0160: BBB3 003B B333 3333 0003 B300 0000 0000
    0x0170: 3300 0000
This already looks like a bitmap! If so, arranging the data properly should give me a picture. I try splitting it into rows of different widths, to no avail. Then I notice the values of the 5th (0x08) and 7th (0x0A) bytes. Suppose these are the width and height; multiplying them, I take the next 80 bytes (0x08 * 0x0A = 0x50 (80)):
    0x0008: 0666 6666 6666 6600 6777 7777 7777 7760
    0x0018: 6777 7777 7766 7776 0666 6676 6677 7776
    0x0028: 0006 7767 7777 7766 0000 6667 7777 7676
    0x0038: 0000 6776 6666 6776 0000 0666 6667 7766
    0x0048: 0000 0067 7677 7660 0000 0006 6666 6600
Arrange them so that each row has 8 bytes and there are 10 rows in total:
    0666 6666 6666 6600
    6777 7777 7777 7760
    6777 7777 7766 7776
    0666 6676 6677 7776
    0006 7767 7777 7766
    0000 6667 7777 7676
    0000 6776 6666 6776
    0000 0666 6667 7766
    0000 0067 7677 7660
    0000 0006 6666 6600
This is it! [I knew because I had seen what the cursor looks like on screen.] I try wrapping this data in a .BMP header with a standard palette: something gets drawn, but not at all what I expected. I set a breakpoint in the debugger to dump the contents of video memory (0xA000+) and see what is actually displayed on the screen. It turns out that each 4-bit value has to be padded with zeros to a full byte [search in your browser for the value "06" and, in the array below, you will see something like a hand with an outstretched index finger]:
    00 06 06 06 06 06 06 06 06 06 06 06 06 06 00 00
    06 07 07 07 07 07 07 07 07 07 07 07 07 07 06 00
    06 07 07 07 07 07 07 07 07 07 06 06 07 07 07 06
    00 06 06 06 06 06 07 06 06 06 07 07 07 07 07 06
    00 00 00 06 07 07 06 07 07 07 07 07 07 07 06 06
    00 00 00 00 06 06 06 07 07 07 07 07 07 06 07 06
    00 00 00 00 06 07 07 06 06 06 06 06 06 07 07 06
    00 00 00 00 00 06 06 06 06 06 06 07 07 07 06 06
    00 00 00 00 00 00 06 07 07 06 07 07 07 06 06 00
    00 00 00 00 00 00 00 06 06 06 06 06 06 06 00 00
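The expansion shown above treats every source byte as two 4-bit pixels, high nibble first. A minimal sketch of that step (the function name `expand_4bpp` is my own):

```c
#include <stddef.h>
#include <stdint.h>

/* Expands packed 4-bit pixels (two per byte, high nibble first) into one
 * byte per pixel. `dst` must have room for 2 * n bytes. */
static void expand_4bpp(const uint8_t *src, size_t n, uint8_t *dst)
{
    for (size_t i = 0; i < n; i++) {
        dst[2 * i]     = (uint8_t)(src[i] >> 4);   /* left pixel */
        dst[2 * i + 1] = (uint8_t)(src[i] & 0x0F); /* right pixel */
    }
}
```

Applied to the first packed row, 06 66 66 66 66 66 66 00, it yields the first expanded row of the array above, so an 8-byte row becomes 16 pixels wide.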
I wrap this data (and the rest of the original array) in BMP and get the following images (slightly enlarged here):
It turned out that the first 0x20 (32) bytes of CURSORS.IMH [and of the other .IMH resources] are loaded into a different area of memory and take no part in decompression. At first I thought this was a palette, but no: the correct colors are obtained with the standard one (as in the picture above). [I found the place in the code where this memory is accessed, but I haven't yet figured out what exactly happens there.]
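For reference, wrapping one-byte-per-pixel data in a BMP with a 16-color palette can be sketched as follows. This is not the code I actually used: the function names are hypothetical, and the palette is assumed to be the common 16-color EGA/VGA set, which is my reading of "the standard one".

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed palette: the common 16-color EGA/VGA set, written as 0xRRGGBB. */
static const uint32_t EGA16[16] = {
    0x000000, 0x0000AA, 0x00AA00, 0x00AAAA, 0xAA0000, 0xAA00AA, 0xAA5500, 0xAAAAAA,
    0x555555, 0x5555FF, 0x55FF55, 0x55FFFF, 0xFF5555, 0xFF55FF, 0xFFFF55, 0xFFFFFF
};

enum { BMP_DATA_OFFSET = 14 + 40 + 16 * 4 }; /* file header + info header + palette */

static void wr16(uint8_t *p, uint32_t v) { p[0] = (uint8_t)v; p[1] = (uint8_t)(v >> 8); }
static void wr32(uint8_t *p, uint32_t v) { wr16(p, v); wr16(p + 2, v >> 16); }

/* Rows in a BMP are padded to a multiple of 4 bytes. */
static uint32_t bmp_stride(uint32_t width) { return (width + 3u) & ~3u; }

/* Total size in bytes of the BMP produced by bmp_wrap(). */
static size_t bmp_size(uint32_t width, uint32_t height)
{
    return BMP_DATA_OFFSET + (size_t)bmp_stride(width) * height;
}

/* Builds an 8-bit indexed BMP in `out` (bmp_size() bytes) from
 * one-byte-per-pixel rows given top to bottom. Returns the bytes written. */
static size_t bmp_wrap(const uint8_t *pixels, uint32_t width, uint32_t height,
                       uint8_t *out)
{
    uint32_t stride = bmp_stride(width);
    size_t total = bmp_size(width, height);

    memset(out, 0, total);
    out[0] = 'B';
    out[1] = 'M';
    wr32(out + 2, (uint32_t)total);   /* bfSize */
    wr32(out + 10, BMP_DATA_OFFSET);  /* bfOffBits */
    wr32(out + 14, 40);               /* biSize (BITMAPINFOHEADER) */
    wr32(out + 18, width);            /* biWidth */
    wr32(out + 22, height);           /* biHeight, positive means bottom-up rows */
    wr16(out + 26, 1);                /* biPlanes */
    wr16(out + 28, 8);                /* biBitCount */
    wr32(out + 34, stride * height);  /* biSizeImage */
    wr32(out + 46, 16);               /* biClrUsed */

    for (int i = 0; i < 16; i++) {    /* palette entries are stored as B, G, R, 0 */
        uint8_t *e = out + 54 + 4 * i;
        e[0] = (uint8_t)EGA16[i];
        e[1] = (uint8_t)(EGA16[i] >> 8);
        e[2] = (uint8_t)(EGA16[i] >> 16);
    }
    for (uint32_t y = 0; y < height; y++) /* the top source row goes last in the file */
        memcpy(out + BMP_DATA_OFFSET + (size_t)(height - 1 - y) * stride,
               pixels + (size_t)y * width, width);
    return total;
}
```

For the 16 by 10 cursor above this produces a 278-byte file. Storing one pixel per byte wastes space compared with a 4-bit BMP, but it matches the expanded array directly and keeps the row handling simple.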
I rewrote the function sub_126CB; it came out to about 800 lines of MASM. But it works, and now I can unpack any .IMH resource. The illustration in the title, by the way, comes from the same place (TITLE.IMH). And here are the main character's sprites (SPRITES.IMH):
Yes, I can unpack any .IMH resource, but not all of them in one go. When porting the code to 64-bit assembly I made mistakes related to carries in address arithmetic, and because of this the program works only every other time, even on a single resource. This is fixable: the code just needs to be rewritten as if it had been designed for a 64-bit platform from the start. Ideally, it should be rewritten in C, and then these errors would disappear by themselves. However, from what I saw I could not make out the algorithm, and rewriting obfuscated assembly in C risks producing even more obfuscated C. I needed help. After capturing the data at key points, I posted a question on Reverse Engineering StackExchange in the hope that someone could use this data to tell whether I was dealing with a known compression algorithm.
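To illustrate the class of carry error mentioned above (the faulty instructions themselves are not shown here, so this is only a sketch with made-up helper names): on the 8086 a 16-bit addition wraps modulo 0x10000 and reports the overflow in the carry flag, while the same addition in a wider register silently keeps the extra bit.

```c
#include <stdint.h>

/* Emulates `add ax, bx` on 16-bit registers: the result wraps modulo
 * 0x10000 and *carry receives the value of the carry flag. */
static uint16_t add16(uint16_t a, uint16_t b, int *carry)
{
    uint32_t wide = (uint32_t)a + (uint32_t)b;
    *carry = wide > 0xFFFFu;
    return (uint16_t)wide;
}

/* What a naive port computes when the same values live in 64-bit registers:
 * no wrap-around, so the "carry" ends up inside the result. */
static uint64_t add_wide(uint64_t a, uint64_t b)
{
    return a + b;
}
```

For example, 0xFFF0 + 0x0020 gives 0x0010 with the carry set in the 16-bit case, but 0x10010 in the wide case. Any code that used the wrapped value as an offset, or branched on the carry flag, behaves differently after such a port.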
Currently I'm reversing an old MS-DOS game, Neuromancer (Interplay Productions, 1988; based on William Gibson's novel). So far I have got to the sprites. Sprites are bitmaps compressed with some fancy algorithm. I have ported the 16-bit assembly to 64-bit assembly (I'm working on Windows 10 and using MASM in MSVS 2017). It resulted in ~800 lines of pretty obfuscated assembly (which became even more obfuscated after porting it to 64-bit), and that makes it impossible for me to identify the actual algorithm.
And here is the problem. I would like to identify the data compression algorithm. I will provide an example of the data at each step of the decompression process. If anything is unclear, I will clarify. Thank you in advance!
Here we go, decompressing in-game cursors:
    SRC: Initial data, 211 bytes stored in 512 byte buffer:
    0x0000 3f 01 00 00 04 40 7e ec d0 c2 2d 0a 46 a3 3f f2
    0x0010 71 11 62 30 64 0e 10 4c e0 c5 85 05 fc ff 50 01
    0x0020 04 c0 26 0f e2 c0 85 c2 50 17 fa 05 54 b0 6c 2d
    0x0030 74 13 e9 c7 45 1d d8 b6 2c 18 e7 14 7b 8b d7 9b
    0x0040 bb e5 eb 60 4d 2f ef 70 3a 1e 42 fe d9 c0 5d c0
    0x0050 eb cd ee 07 b9 bf 82 1c 01 41 bb 15 36 0b 74 3a
    0x0060 4b c8 74 9e 45 d7 5b 26 63 f4 24 e9 da ea 5c c6
    0x0070 e8 59 d7 93 41 f7 94 df b7 ac 4d b7 ef c2 cc 4f
    0x0080 5d 5f 66 d1 e5 e3 3f 2b ac 42 7c 5e 3a f1 9f 95
    0x0090 d7 d9 e8 e9 3d 75 62 de 9b 05 86 e6 0e bf 79 4c
    0x00A0 1d 0b 3a 76 9a 97 31 ba 12 74 ed 75 26 63 fe 79
    0x00B0 17 5f bc 87 fd 58 9b 6f dd 0b 12 f2 65 2f 12 fb
    0x00C0 1e 99 5e 37 b2 4c 42 4c 4f 4c af 1b 6c c4 be c7
    0x00D0 c9 9f c0
Processing starts from the beginning of SRC and suspends at the 44th byte. At this point the following data has been built right after the SRC buffer (the addresses below are relative to the start of the SRC buffer):
    0x0200 00 80 01 00 5c ea 3f 01 00 00
    0x020A 02 00 02 00 0e 00 04 00 1f 00 05 00 04 00 04 00
    0x021A 18 00 05 00 3a 00 07 00 0d 00 05 00 00 00 00 00
    0x022A 05 00 04 00 df 00 08 00 0d 00 08 00 de 00 08 00
    0x023A 05 00 07 00 00 00 00 00 00 00 00 00 00 00 00 00
    0x024A 00 00 05 00 05 00 05 00 00 00 00 00 00 00 00 00
    0x025A 00 00 00 00 00 00 00 00 0c 00 08 00 00 00 00 00
    0x026A 00 00 00 00 0f 00 08 00 0e 00 08 00 00 00 00 00
    ... ZEROES ...
    0x02CA 1a 00 05 00 00 00 00 00 00 00 00 00 0c 00 05 00
    0x02DA 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0x02EA 04 00 05 00 00 00 00 00 00 00 00 00 00 00 00 00
    ... ZEROES ...
    0x038A 6e 00 07 00 1c 00 06 00 00 00 00 00 00 00 00 00
    0x039A 00 00 00 00 00 00 00 00 09 00 08 00 00 00 00 00
    ... ZEROES ...
    0x040A 19 00 05 00 00 00 00 00 00 00 00 00 07 00 05 00
    0x041A 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0x042A 06 00 05 00 00 00 00 00 00 00 00 00 00 00 00 00
    ... ZEROES ...
    0x05EA 00 00 00 00 3b 00 07 00 00 00 00 00 08 00 08 00
    0x05FA 36 00 06 00 0f 00 05 00 1e 00 05 00 01 00 04 00
    0x060A 05 10 00 00 05 10 00 00 05 10 00 00 05 10 00 00
    0x061A 05 10 00 00 05 10 00 00 05 10 00 00 05 10 00 00
    0x062A 08 fb 00 00 08 66 00 00 07 0c 00 00 07 0c 00 00
    0x063A 08 16 00 00 08 0a 00 00 08 1a 00 00 08 19 00 00
    0x064A 04 ff 00 00 04 ff 00 00 04 ff 00 00 04 ff 00 00
    0x065A 04 ff 00 00 04 ff 00 00 04 ff 00 00 04 ff 00 00
    0x066A 04 ff 00 00 04 ff 00 00 04 ff 00 00 04 ff 00 00
    0x067A 04 ff 00 00 04 ff 00 00 04 ff 00 00 04 ff 00 00
    0x068A 05 38 00 00 05 38 00 00 05 38 00 00 05 38 00 00
    0x069A 05 38 00 00 05 38 00 00 05 38 00 00 05 38 00 00
    0x06AA 05 11 00 00 05 11 00 00 05 11 00 00 05 11 00 00
    0x06BA 05 11 00 00 05 11 00 00 05 11 00 00 05 11 00 00
    0x06CA 05 88 00 00 05 88 00 00 05 88 00 00 05 88 00 00
    0x06DA 05 88 00 00 05 88 00 00 05 88 00 00 05 88 00 00
    0x06EA 05 83 00 00 05 83 00 00 05 83 00 00 05 83 00 00
    0x06FA 05 83 00 00 05 83 00 00 05 83 00 00 05 83 00 00
    0x070A 04 03 00 00 04 03 00 00 04 03 00 00 04 03 00 00
    0x071A 04 03 00 00 04 03 00 00 04 03 00 00 04 03 00 00
    0x072A 04 03 00 00 04 03 00 00 04 03 00 00 04 03 00 00
    0x073A 04 03 00 00 04 03 00 00 04 03 00 00 04 03 00 00
    0x074A 04 08 00 00 04 08 00 00 04 08 00 00 04 08 00 00
    0x075A 04 08 00 00 04 08 00 00 04 08 00 00 04 08 00 00
    0x076A 04 08 00 00 04 08 00 00 04 08 00 00 04 08 00 00
    0x077A 04 08 00 00 04 08 00 00 04 08 00 00 04 08 00 00
    0x078A 05 33 00 00 05 33 00 00 05 33 00 00 05 33 00 00
    0x079A 05 33 00 00 05 33 00 00 05 33 00 00 05 33 00 00
    0x07AA 05 06 00 00 05 06 00 00 05 06 00 00 05 06 00 00
    0x07BA 05 06 00 00 05 06 00 00 05 06 00 00 05 06 00 00
    0x07CA 06 61 00 00 06 61 00 00 06 61 00 00 06 61 00 00
    0x07DA 07 05 00 00 07 05 00 00 07 f9 00 00 07 f9 00 00
    0x07EA 05 fd 00 00 05 fd 00 00 05 fd 00 00 05 fd 00 00
    0x07FA 05 fd 00 00 05 fd 00 00 05 fd 00 00 05 fd 00 00
    0x080A 02 00 00 00 02 00 00 00 02 00 00 00 02 00 00 00
    0x081A 02 00 00 00 02 00 00 00 02 00 00 00 02 00 00 00
    0x082A 02 00 00 00 02 00 00 00 02 00 00 00 02 00 00 00
    0x083A 02 00 00 00 02 00 00 00 02 00 00 00 02 00 00 00
    0x084A 02 00 00 00 02 00 00 00 02 00 00 00 02 00 00 00
    0x085A 02 00 00 00 02 00 00 00 02 00 00 00 02 00 00 00
    0x086A 02 00 00 00 02 00 00 00 02 00 00 00 02 00 00 00
    0x087A 02 00 00 00 02 00 00 00 02 00 00 00 02 00 00 00
    0x088A 02 00 00 00 02 00 00 00 02 00 00 00 02 00 00 00
    0x089A 02 00 00 00 02 00 00 00 02 00 00 00 02 00 00 00
    0x08AA 02 00 00 00 02 00 00 00 02 00 00 00 02 00 00 00
    0x08BA 02 00 00 00 02 00 00 00 02 00 00 00 02 00 00 00
    0x08CA 02 00 00 00 02 00 00 00 02 00 00 00 02 00 00 00
    0x08DA 02 00 00 00 02 00 00 00 02 00 00 00 02 00 00 00
    0x08EA 02 00 00 00 02 00 00 00 02 00 00 00 02 00 00 00
    0x08FA 02 00 00 00 02 00 00 00 02 00 00 00 02 00 00 00
    0x090A 05 04 00 00 05 04 00 00 05 04 00 00 05 04 00 00
    0x091A 05 04 00 00 05 04 00 00 05 04 00 00 05 04 00 00
    0x092A 05 80 00 00 05 80 00 00 05 80 00 00 05 80 00 00
    0x093A 05 80 00 00 05 80 00 00 05 80 00 00 05 80 00 00
    0x094A 05 30 00 00 05 30 00 00 05 30 00 00 05 30 00 00
    0x095A 05 30 00 00 05 30 00 00 05 30 00 00 05 30 00 00
    0x096A 06 fc 00 00 06 fc 00 00 06 fc 00 00 06 fc 00 00
    0x097A 07 60 00 00 07 60 00 00 08 0b 00 00 08 09 00 00
    0x098A 04 01 00 00 04 01 00 00 04 01 00 00 04 01 00 00
    0x099A 04 01 00 00 04 01 00 00 04 01 00 00 04 01 00 00
    0x09AA 04 01 00 00 04 01 00 00 04 01 00 00 04 01 00 00
    0x09BA 04 01 00 00 04 01 00 00 04 01 00 00 04 01 00 00
    0x09CA 05 fe 00 00 05 fe 00 00 05 fe 00 00 05 fe 00 00
    0x09DA 05 fe 00 00 05 fe 00 00 05 fe 00 00 05 fe 00 00
    0x09EA 05 02 00 00 05 02 00 00 05 02 00 00 05 02 00 00
    0x09FA 05 02 00 00 05 02 00 00 05 02 00 00 05 02 00 00
Processing continues from the 44th byte of SRC. The remaining bytes are processed using an external intermediate buffer:
    IMM: Intermediate data, 319 bytes:
    0x0000 00 00 00 00 08 00 0a 00 ff 06 05 66 fe 00 61 05
    0x0010 11 ff 60 04 00 fc 11 00 16 61 01 11 ff 01 01 11
    0x0020 01 00 fe 06 60 02 11 01 00 fc 10 00 06 11 02 00
    0x0030 fe 01 10 01 00 ff 01 03 11 02 00 fc 61 10 00 01
    0x0040 01 10 01 00 fe 06 01 01 10 fe 01 06 02 00 fb 61
    0x0050 10 11 10 60 04 00 00 00 06 00 0c 00 01 00 ff 30
    0x0060 03 00 fe 03 83 03 00 fd 38 08 30 01 00 fc 03 80
    0x0070 00 83 01 00 ff 38 01 00 f9 08 30 00 08 80 00 88
    0x0080 01 00 ff 33 01 00 fe 03 30 19 00 fe 08 88 02 00
    0x0090 0b 00 04 00 06 00 09 00 02 00 ff 33 04 00 fd 08
    0x00A0 30 00 02 33 fc 00 83 00 08 01 88 fd 80 08 30 04
    0x00B0 00 ff 83 04 00 fe 83 08 01 88 fd 80 08 30 02 33
    0x00C0 fe 00 83 03 00 fd 08 30 00 04 00 0b 00 06 00 0c
    0x00D0 00 ff 00 01 33 ff 30 02 00 fe 08 88 1a 00 ff 33
    0x00E0 01 00 f9 03 30 00 08 80 00 88 01 00 ff 38 01 00
    0x00F0 f9 08 30 00 03 80 00 83 02 00 fd 38 08 30 02 00
    0x0100 fe 03 83 02 00 00 00 04 00 06 00 09 00 01 00 ff
    0x0110 33 03 00 fe 03 80 03 00 fe 38 00 02 33 fd 03 80
    0x0120 08 01 88 fe 80 38 04 00 ff 38 04 00 fd 03 80 08
    0x0130 01 88 fc 80 00 38 00 02 33 fd 00 03 80 02 00
Finally, we have the result:
    DST: The Result, 372 bytes:
    0x0000 00 00 00 00 08 00 0A 00 06 66 66 66 66 66 66 00
    0x0010 67 77 77 77 77 77 77 60 67 77 77 77 77 66 77 76
    0x0020 06 66 66 76 66 77 77 76 00 06 77 67 77 77 77 66
    0x0030 00 00 66 67 77 77 76 76 00 00 67 76 66 66 67 76
    0x0040 00 00 06 66 66 67 77 66 00 00 00 67 76 77 76 60
    0x0050 00 00 00 06 66 66 66 00 04 00 00 00 06 00 0C 00
    0x0060 00 00 30 00 00 00 00 03 B3 00 00 00 00 3B BB 30
    0x0070 00 00 03 BB BB B3 00 00 3B BB BB BB 30 00 33 3B
    0x0080 BB 33 30 00 00 3B BB 30 00 00 00 3B BB 30 00 00
    0x0090 00 3B BB 30 00 00 00 3B BB 30 00 00 00 3B BB 30
    0x00A0 00 00 00 33 33 30 00 00 0B 00 04 00 06 00 09 00
    0x00B0 00 00 00 33 00 00 00 00 00 3B 30 00 33 33 33 3B
    0x00C0 B3 00 3B BB BB BB BB 30 3B BB BB BB BB B3 3B BB
    0x00D0 BB BB BB 30 33 33 33 3B B3 00 00 00 00 3B 30 00
    0x00E0 00 00 00 33 00 00 04 00 0B 00 06 00 0C 00 00 33
    0x00F0 33 30 00 00 00 3B BB 30 00 00 00 3B BB 30 00 00
    0x0100 00 3B BB 30 00 00 00 3B BB 30 00 00 00 3B BB 30
    0x0110 00 00 33 3B BB 33 30 00 3B BB BB BB 30 00 03 BB
    0x0120 BB B3 00 00 00 3B BB 30 00 00 00 03 B3 00 00 00
    0x0130 00 00 30 00 00 00 00 00 04 00 06 00 09 00 00 00
    0x0140 33 00 00 00 00 03 B3 00 00 00 00 3B B3 33 33 33
    0x0150 03 BB BB BB BB B3 3B BB BB BB BB B3 03 BB BB BB
    0x0160 BB B3 00 3B B3 33 33 33 00 03 B3 00 00 00 00 00
    0x0170 33 00 00 00
Decompression completed and it's easy to derive bitmaps like this:
00 00 00 00 08 00 0A 00
08 * 0A = 50 (80)
Get next 80 bytes of DST:
    06 66 66 66 66 66 66 00 67 77 77 77 77 77 77 60
    67 77 77 77 77 66 77 76 06 66 66 76 66 77 77 76
    00 06 77 67 77 77 77 66 00 00 66 67 77 77 76 76
    00 00 67 76 66 66 67 76 00 00 06 66 66 67 77 66
    00 00 00 67 76 77 76 60 00 00 00 06 66 66 66 00
Arrange those bytes assuming that 08 and 0A are width and height respectively:
    06 66 66 66 66 66 66 00
    67 77 77 77 77 77 77 60
    67 77 77 77 77 66 77 76
    06 66 66 76 66 77 77 76
    00 06 77 67 77 77 77 66
    00 00 66 67 77 77 76 76
    00 00 67 76 66 66 67 76
    00 00 06 66 66 67 77 66
    00 00 00 67 76 77 76 60
    00 00 00 06 66 66 66 00
Extend this with zeroes:
    00 06 06 06 06 06 06 06 06 06 06 06 06 06 00 00
    06 07 07 07 07 07 07 07 07 07 07 07 07 07 06 00
    06 07 07 07 07 07 07 07 07 07 06 06 07 07 07 06
    00 06 06 06 06 06 07 06 06 06 07 07 07 07 07 06
    00 00 00 06 07 07 06 07 07 07 07 07 07 07 06 06
    00 00 00 00 06 06 06 07 07 07 07 07 07 06 07 06
    00 00 00 00 06 07 07 06 06 06 06 06 06 07 07 06
    00 00 00 00 00 06 06 06 06 06 06 07 07 07 06 06
    00 00 00 00 00 00 06 07 07 06 07 07 07 06 06 00
    00 00 00 00 00 00 00 06 06 06 06 06 06 06 00 00
That's it! Wrapping it in a BMP header gives us a cute cursor image.
I got lucky again! It turned out that two algorithms were applied in sequence. The first is some as yet unknown compression algorithm, and the second is a kind of run-length encoding [I won't describe which one here; I did that in an answer to my own question on Reverse Engineering StackExchange]. As a result, the second stage fit into 50 lines of C instead of ~400 lines of assembly:
    #include <stdint.h>
    #include <string.h>

    typedef struct rle_hdr_t {
        uint32_t unknown;
        uint16_t width;
        uint16_t height;
    } rle_hdr_t;

    static int decode_rle(uint8_t *_src, uint32_t len, uint8_t *_dst)
    {
        uint32_t total_len = 0;
        uint8_t *src = _src, *dst = _dst;
        rle_hdr_t *rle;

        while (len) {
            uint32_t bm_size = 0, bm_width = 0, bm_height = 0;
            uint8_t *p = NULL;

            /* every image starts with an 8-byte header, copied to the output as is */
            rle = (rle_hdr_t*)src;
            bm_width = rle->width;
            bm_height = rle->height;
            bm_size = bm_width * bm_height;

            memmove(dst, src, sizeof(rle_hdr_t));
            total_len += sizeof(rle_hdr_t) + bm_size;
            src += sizeof(rle_hdr_t);
            dst += sizeof(rle_hdr_t);
            len -= sizeof(rle_hdr_t);
            p = dst;

            while (bm_size) {
                if (*src > 0x7F) {
                    /* literal run: copy the next (0x100 - n) bytes */
                    int i = 0x100 - *src++;
                    len--;
                    while (i--) {
                        *dst++ = *src++;
                        bm_size--;
                        len--;
                    }
                } else {
                    /* repeated run: the next byte is repeated (n + 1) times */
                    int num = *src++, val = *src++;
                    len -= 2;
                    memset(dst, val, (size_t)++num);
                    dst += num;
                    bm_size -= num;
                }
            }

            /* every row is stored XORed with the row above it; undo that top to bottom
               (i + 1 < bm_height avoids unsigned wrap-around when the height is 0) */
            for (uint32_t i = 0; i + 1 < bm_height; i++) {
                for (uint32_t j = 0; j < bm_width; j++) {
                    p[((i + 1)*bm_width) + j] ^= p[(i*bm_width) + j];
                }
            }
        }

        return total_len;
    }
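The function above trusts its input completely. As a hedged variation rather than the code I actually use, here is a sketch that decodes the body of a single image with bounds checks, following the same scheme as the IMM dump: a control byte above 0x7F starts a literal run of 0x100 - n bytes, any other control byte n repeats the following byte n + 1 times, and every row is then XORed with the row above it. The name `rle_decode_image` and the convention of returning 0 on error are my own.

```c
#include <stddef.h>
#include <stdint.h>

/* Decodes one image body of w * h bytes from `src` into `dst` and undoes the
 * row XOR. The 8-byte header is expected to have been skipped already.
 * Returns the number of source bytes consumed, or 0 on malformed input. */
static size_t rle_decode_image(const uint8_t *src, size_t src_len,
                               uint8_t *dst, size_t w, size_t h)
{
    size_t need = w * h, in = 0, out = 0;

    while (out < need) {
        if (in >= src_len)
            return 0;
        uint8_t ctrl = src[in++];

        if (ctrl > 0x7F) {
            /* literal run: copy the next (0x100 - ctrl) bytes */
            size_t n = 0x100u - ctrl;
            if (n > need - out || n > src_len - in)
                return 0;
            for (size_t k = 0; k < n; k++)
                dst[out++] = src[in++];
        } else {
            /* repeated run: the next byte appears (ctrl + 1) times */
            size_t n = (size_t)ctrl + 1;
            if (in >= src_len || n > need - out)
                return 0;
            uint8_t val = src[in++];
            for (size_t k = 0; k < n; k++)
                dst[out++] = val;
        }
    }

    /* each row was stored XORed with the previous one; restore top to bottom */
    for (size_t y = 1; y < h; y++)
        for (size_t x = 0; x < w; x++)
            dst[y * w + x] ^= dst[(y - 1) * w + x];

    return in;
}
```

Feeding it the first 11 bytes of the IMM image body (ff 06 05 66 fe 00 61 05 11 ff 60) with w = 8 and h = 2 consumes all 11 bytes and reproduces the first two rows of DST.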
, .IMH
-. 27- 108 BMP
.
. , . ( .PIC
) . , github . . , :)
«». 2:
Source: https://habr.com/ru/post/352050/