
Reverse Engineering ESP8266 - part 2

We continue our study of the ESP8266 module. This time we look at how to load the firmware into a disassembler.

The first part of the article is here.


Content


  1. Introduction
  2. ESP8266 Architecture
    • Memory map (address space)
    • Firmware format
    • Startup process
  3. Tools
  4. Loading firmware for analysis
  5. Xtensa Assembler
    • Registers
    • Basic instructions
    • Functions
    • Conditional branches
  6. Conclusion
  7. Links


4. Loading firmware for analysis


With the necessary tools prepared, we come to the most interesting part: loading and disassembling the firmware.
ELF

Let's start with the simplest case: loading app.out, the firmware in ELF format produced by the SDK.

As a rule, app.out is only available when you also have the firmware's source code, which is much easier and more logical to study. Still, I suggest starting with it to get acquainted with the compiler's output, see how the segments are laid out, and see the function names.

After compiling and linking, the build folder contains app.out, which holds the compiled user code, data, libraries and debug information. The firmware cannot be flashed to the module in this form, so after building the ELF file the SDK converts app.out into one or two files in the firmware folder, 0x00000.bin and 0x40000.bin, which can be flashed directly.

Opening app.out in HIEW and looking at the table of segments (Enter, F8, F6), we see the following picture:


The VirtAddr column contains the start addresses of the segments in the Xtensa address space. Note that three segments (.data, .rodata and .bss) are loaded into the RAM area, the .text segment goes to the area for user executable code, and the .irom0.text segment goes to the area for SDK library code. The remaining segments, whose start address is zero, contain service information and are not included in the firmware that is flashed to the module.

Looking ahead, the .irom0.text segment is copied to the 0x40000.bin file unchanged, while the .data, .rodata, .bss and .text segments are packed into the 0x00000.bin file using the firmware format discussed earlier.
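The same segment table can be approximated without HIEW. Below is a minimal Python sketch, assuming app.out is a 32-bit little-endian ELF file; the function name list_elf32_sections is hypothetical and not part of any SDK. It walks the ELF section header table and returns each name together with its virtual address and size.

```python
import struct

def list_elf32_sections(data):
    """Return [(name, virt_addr, size), ...] for a little-endian ELF32 image."""
    # e_ident: magic, then EI_CLASS == 1 (32-bit) and EI_DATA == 1 (little-endian)
    if data[:4] != b"\x7fELF" or data[4] != 1 or data[5] != 1:
        raise ValueError("not a little-endian ELF32 file")
    header = struct.unpack_from("<16sHHIIIIIHHHHHH", data, 0)
    e_shoff, e_shentsize = header[6], header[11]
    e_shnum, e_shstrndx = header[12], header[13]
    # Each section header is ten 32-bit words: word 0 is the name offset,
    # word 3 the virtual address, word 4 the file offset, word 5 the size.
    headers = [struct.unpack_from("<10I", data, e_shoff + i * e_shentsize)
               for i in range(e_shnum)]
    names_off = headers[e_shstrndx][4]  # file offset of the section name table
    sections = []
    for sh in headers:
        start = names_off + sh[0]
        end = data.index(b"\x00", start)
        name = data[start:end].decode("ascii", "replace")
        sections.append((name, sh[3], sh[5]))
    return sections
```

Called on the bytes of app.out, it should report the RAM and code segments with non-zero addresses and the service segments with address zero, matching the VirtAddr column described above.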

To load app.out into IDA, follow these steps:

1. Open IDA Pro 6.6 or higher
2. Click "Go" - we will not open any files yet.
3. Open the menu item File - Script file and select the processor definition script xtensa.py
4. Open the app.out file; in the dialog, select the processor type and click "Set":


5. In the following dialogs, which warn about an unknown machine type and offer to load debug information, click "Yes"
6. As a result, we get a file ready for analysis:


What is missing from it? The system ROM, which contains the module's basic functions. If necessary, it can be loaded manually; we will do this in the next section.

Module system firmware

We have looked at the fairly simple case of loading firmware into IDA as an ELF file. In practice, however, you often need to examine ready-made firmware, either extracted from the module's flash (by connecting directly to the flash chip) or distributed as 0x00000.bin and 0x40000.bin files. This takes a little manual work. Let's start by loading the system ROM image. In the first part I gave a link to the archive containing the file 40000000.bin, which is that image. The sequence of actions is as follows:

1. Open IDA Pro 6.6 or higher
2. Click "Go"
3. Open the menu item File - Script file and select the processor definition script xtensa.py
4. Open the file 40000000.bin
5. Select the processor type Tensilica Xtensa [xtensa] and click "Set"
6. Next, specify the memory layout so that the binary file loads correctly. Here we create a code segment at 0x40000000 and load our file into it:


7. The ROM image is loaded, but it is hard to read because there are no function names. Now run the script 40000000.idc, which does the extra work of naming the functions and creating additional segments in the address space: File - Script file - 40000000.idc. Here is the result:


At this point the system ROM is loaded and you can start analyzing it. The script has named the system ROM functions, so you can now work out what each function called from the SDK does.

And here, by the way, is the function that copies the user firmware from flash to SoC memory:


There is no such function in the SDK, so I gave it an arbitrary name.

But the firmware is not complete without the user part, the files 0x00000.bin and 0x40000.bin. So we will load these files into the same database, alongside the system ROM.

User firmware

We have loaded the module's system ROM into IDA, and the script has prepared several segments for loading the remaining parts. Let's start with the simple one: the library code.

As I said above, the firmware file 0x40000.bin is an image of a code segment without any service information, and it is mapped directly into the processor's address space at 0x40240000. To load it into IDA, do the following:

1. Make sure that the database for 40000000.bin is open and that the script 40000000.idc has created the additional segments: RAM, ROM, IRAM, IROM
2. Select File - Load file - Additional binary file in the menu and open the firmware file 0x40000.bin
3. In the next window, set the load options. Note that the loading segment is given in paragraphs, i.e. instead of the address we enter a value 10h times smaller (drop the last hexadecimal zero), so 0x40240000 becomes 0x4024000. The checkbox for creating a segment can be cleared, since the segment has already been created:


4. The file is loaded. After marking the start of the code (in this case 4024000Ch), we get approximately the following picture:


Unlike with the ELF file, function and variable names are not defined here, and nothing can be done about that.
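The paragraph arithmetic from step 3 is easy to get wrong by one digit, so here is a small Python sketch of it. The helper name to_paragraphs is hypothetical; it only divides a 16-byte-aligned address by 10h, as the step describes.

```python
def to_paragraphs(address):
    """Convert a linear address to a value in paragraphs (16-byte units)."""
    if address % 0x10:
        raise ValueError("address must be a multiple of 10h")
    return address >> 4

# Load address of 0x40000.bin, as given above
print(hex(to_paragraphs(0x40240000)))  # 0x4024000
```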

A short digression
How do you analyze bare assembly code? How do you work out what a given function does? For many well-known SDKs, IDA has FLIRT signatures that identify the names of standard functions. That does not help here, since IDA has no signatures for this SDK, so the work has to be done manually. Here are a few methods, in order of increasing difficulty:

1. Look for the code of the function under study in the disassembly of an ELF file built with the same SDK version. There is a chance that you will find it there, with a name taken from the debug information. This is one of the reasons I covered loading firmware in ELF format.
2. Known constants - the function may refer to text strings or binary data. With experience you learn many such constants by heart; if a constant is unfamiliar, search for it. Here is an example:


We see two distinctive constants. The very first Google result describes a strlen algorithm that uses these constants:


Comparing the implementations of the algorithm, it is safe to say that strlen is located at address 40100E70h.
Similarly, a piece of code like this immediately gives away the division function:


3. Studying the assembly code itself and trying to understand what the function does. Sometimes you can tell that you are looking at a "local" implementation of a familiar algorithm, and sometimes you cannot.
In any case, the skill of reading assembly code comes with experience, so go for it!
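The known-constants method can be partly automated. The following is a minimal Python sketch, assuming the constant is stored as a 4-byte-aligned little-endian word (on Xtensa, large constants are typically fetched from such words with the l32r instruction) and that the image's base address is itself 4-byte aligned. The function name find_word and the constant 0x80808080 are illustrative assumptions; they are not the constants from the screenshot.

```python
import struct

def find_word(image, base, value):
    """Return addresses of 4-byte-aligned little-endian occurrences of value."""
    needle = struct.pack("<I", value)
    hits = []
    pos = image.find(needle)
    while pos != -1:
        if pos % 4 == 0:  # keep only word-aligned matches
            hits.append(base + pos)
        pos = image.find(needle, pos + 1)
    return hits

# Tiny in-memory sample: one matching word at offset 4
sample = b"\x00" * 4 + struct.pack("<I", 0x80808080) + b"\x00" * 4
print([hex(a) for a in find_word(sample, 0x40100000, 0x80808080)])
```

Each reported address is a candidate literal; following the cross-references to it in IDA leads to the functions that use the constant.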

Now let's move on to loading the file 0x00000.bin. Recall that this is not just an image, but a structured file that describes data and code segments. This is how it can look in a hex viewer:


First come the 8 bytes of the general firmware header, which defines the number of segments and the entry point. Then come the segments themselves, each with its own 8-byte header holding the address and length.
To load them into IDA properly, I cut the data of each segment (without the headers) into separate files, naming each file after its load address:
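The cutting can be scripted. Below is a minimal Python sketch, assuming the commonly documented ESP8266 image layout: a magic byte 0xE9, a segment count byte, two flash configuration bytes and a little-endian 32-bit entry point, followed by segments that each start with a 32-bit address and a 32-bit length. The function names and the "%08X.bin" file name pattern are my own assumptions, not taken from the screenshots.

```python
import os
import struct

def split_firmware(image):
    """Return (entry_point, [(load_address, data), ...]) for a 0x00000.bin-style image."""
    magic, seg_count, _flash_mode, _flash_cfg, entry = struct.unpack_from("<BBBBI", image, 0)
    if magic != 0xE9:
        raise ValueError("unexpected magic byte: 0x%02X" % magic)
    segments = []
    offset = 8  # skip the 8-byte general header
    for _ in range(seg_count):
        address, length = struct.unpack_from("<II", image, offset)
        offset += 8  # skip the 8-byte segment header
        data = image[offset:offset + length]
        if len(data) != length:
            raise ValueError("segment at 0x%08X is truncated" % address)
        segments.append((address, data))
        offset += length
    return entry, segments

def write_segments(image, out_dir="."):
    """Write every segment to <load address>.bin (hypothetical naming) in out_dir."""
    _entry, segments = split_firmware(image)
    for address, data in segments:
        with open(os.path.join(out_dir, "%08X.bin" % address), "wb") as f:
            f.write(data)
```

A typical call would be write_segments(open("0x00000.bin", "rb").read()); the entry point returned by split_firmware is a natural place to start disassembling.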


Now it remains to load them into IDA. For each file, we perform a sequence of actions similar to loading the system ROM:

1. File - Load file - Additional binary file, select the data file
2. In the load options, specify the loading segment (the file name without its last zero, i.e. the load address in paragraphs); do not create a segment.

That's it: the firmware is now fully loaded and ready to explore!


In this part of the article, we reviewed the procedures for loading various types of ESP8266 firmware into IDA Pro for disassembly. In the final part, we will look at the features of the Xtensa processor, its differences from the x86 architecture, and its registers and instructions.

Source: https://habr.com/ru/post/255153/

