There are many articles on Habr about how the Swift and Objective-C runtimes work, but for a fuller understanding of what happens under the hood it helps to go down to the lowest level and see how iOS application code is laid out in binary files. Going under the hood is also unavoidable in reverse engineering tasks. This article covers the simplest Objective-C constructs; Swift and more complex examples are left for subsequent articles.
In the first two sections I have tried to spell out the practical details as fully as possible, so that a reader who wants to follow this tutorial does not need to google “where to find the application binary in Xcode” every two minutes. After that, the material is as dense as possible.
We will study binaries for the 64-bit arm64 architecture. The objects of interest in a binary file are 16-bit, 32-bit and 64-bit words written in a row, plus null-terminated strings, so it is convenient to describe them as C structures. For example, I will say that the description of a method in a binary file looks like this:
struct objc_method {
    uint64 name_addr;  // address of the method name, a null-terminated string
    uint64 type;       // address of the type signature, a null-terminated string
    uint64 imp_addr;   // address of the method implementation
}
In addition to uint64, there are also 32-bit uint32 and 16-bit uint16 fields, while int64 and int32 are used for relative pointers.
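As a minimal sketch of how such words can be pulled out of a raw byte buffer, the helpers below read little-endian values (arm64 is little-endian). The function names are hypothetical, and `resolve_rel32` assumes the usual convention that a relative pointer is a signed offset counted from the address of the field that stores it; that convention is an assumption here, not something this article states.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 16-bit little-endian word: least significant byte first. */
static uint16_t read_u16_le(const uint8_t *p) {
    assert(p != NULL);
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Read a 32-bit little-endian word. */
static uint32_t read_u32_le(const uint8_t *p) {
    assert(p != NULL);
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Read a 64-bit little-endian word as two 32-bit halves. */
static uint64_t read_u64_le(const uint8_t *p) {
    return (uint64_t)read_u32_le(p) | ((uint64_t)read_u32_le(p + 4) << 32);
}

/* Resolve a 32-bit relative pointer.
 * ASSUMPTION: the stored value is a signed offset from the address
 * of the field itself. */
static uint64_t resolve_rel32(uint64_t field_addr, uint32_t raw) {
    int64_t offset = (raw & 0x80000000u) ? (int64_t)raw - 0x100000000LL
                                         : (int64_t)raw;
    return field_addr + (uint64_t)offset;
}
```

Reading byte by byte avoids alignment problems and keeps the code independent of the byte order of the machine that runs the analysis.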
iOS application binaries are in the Mach-O format. An overview of this format can be found here (chapter “Mach-O in Brief”). One convenient way to view Mach-O files is the Hopper disassembler; a trial version can be downloaded here.
To navigate in Hopper, it helps to know a couple of shortcuts:
Shift + S - list of sections
G - go to
You can double-click any address that appears in the disassembly to jump to it. In addition, Hopper parses the names of many entities, and you can search by them (the search field at the top left).
Sometimes it is useful to look at the raw, unparsed bytes. To do this, select Hexadecimal Mode with the switch at the top:
We will write the application under study ourselves. In Xcode, create a Single View Application in Objective-C. For convenience, you can leave only the arm64 architecture in Build Settings. We will build (Cmd + B) for the always-available Generic iOS Device:
In principle, you can also build for a real device, but not for the simulator, because the resulting binary would be very different (a different architecture). Let our application be called InspectedObjc. For compactness, we will not use .h files and will write everything in .m files. Create the file InspectedObject.m and define a class in it with all kinds of members (the code is in the next section).
Do not forget to add the file to the target, then build. The finished application appears in the Products folder:
Then choose Show in Finder, and Show Package Contents on InspectedObjc.app. Now you can feed the InspectedObjc binary to Hopper.
This part is, to some extent, an explanation of the code from here. We will study how the class InspectedObject from this InspectedObject.m is laid out in the binary file:
#import <Foundation/Foundation.h>

@protocol InspectedProtocol <NSObject>
- (int)instanceMethod:(NSString *)string;
+ (NSNumber *)classMethod:(NSNumber *)number;
@end

@interface InspectedObject : NSObject<InspectedProtocol> {
    int intIvar;
    NSString __weak *weakStringIvar;
    NSNumber *strongNumberIvar;
}
@property(nonatomic, strong) NSString *strongStringProperty;
@property(weak) NSNumber *weakNumberProperty;
- (int)instanceMethod:(NSString *)string;
+ (NSNumber *)classMethod:(NSNumber *)number;
@end

@implementation InspectedObject
- (int)instanceMethod:(NSString *)string {
    return intIvar;
}
+ (NSNumber *)classMethod:(NSNumber *)number {
    return @234;
}
@end
Now let us look at the binary file. The objc_classlist section is a list of class addresses:
struct objc_classlist {
    uint64 classes[num_classes];
}
In our binary file num_classes = 3. Hopper parses the names, so it is clear that our class is the third:
The other two were generated when the Single View Application was created. You can also identify the required class without parsed names; for that you need to extract the names yourself. How to do this will be clear from what follows.
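Since the section is just a row of 64-bit addresses, the number of classes follows from the section size. The sketch below illustrates this under the assumption that the section bytes have already been copied into a buffer; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 64-bit little-endian word from a byte buffer. */
static uint64_t read_u64_le(const uint8_t *p) {
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--) {
        v = (v << 8) | p[i];
    }
    return v;
}

/* Each entry is one 64-bit address, so num_classes = size / 8. */
static size_t classlist_count(size_t section_size) {
    return section_size / 8;
}

/* Return the address of the class with the given index. */
static uint64_t classlist_entry(const uint8_t *section, size_t section_size,
                                size_t index) {
    assert(section != NULL);
    assert(index < classlist_count(section_size));
    return read_u64_le(section + index * 8);
}
```

For the binary in this article the section is 24 bytes long, which gives the three classes mentioned above.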
So, go to _OBJC_CLASS_$_InspectedObject. In Hopper, it looks like this:
which corresponds to the following structure:
struct objc_class {
    uint64 metaclass_addr;   // address of the metaclass; class methods are stored there
    uint64 superclass_addr;  // address of the superclass, here NSObject
    uint64 cache_addr;       // address of __objc_empty_cache
    uint64 vtable_addr;      // 0 in our binary
    uint64 raw_data_addr;    // address of the raw_data structure (see below)
}
Here you can see roughly which methods end up in the table of virtual functions.
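We will refer to the instance variables of a class by the English term ivars (singular: ivar). The class data:
struct raw_data {
    uint32 flags;
    uint32 instance_start;           // offset at which this class's ivars start within an instance
    uint32 instance_size;            // size of an instance
    uint32 reserved;                 // present only on 64-bit architectures
    uint64 strong_ivar_layout_addr;  // layout of strong ivars
    uint64 name_addr;                // class name
    uint64 method_list_addr;         // list of methods
    uint64 protocol_list_addr;       // list of protocols
    uint64 ivar_list_addr;           // list of ivars
    uint64 weak_ivar_layout_addr;    // layout of weak ivars
    uint64 properties_list_addr;     // list of properties
}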
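With these two layouts, the class name can be recovered by hand: raw_data_addr is the fifth 64-bit word of objc_class (offset 32), and name_addr sits at offset 24 of raw_data, after four uint32 fields and one uint64. The sketch below follows that chain. It makes a loud simplifying assumption: addresses are treated as plain offsets into a flat image buffer, whereas a real Mach-O file requires translating virtual addresses to file offsets first. The function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum {
    OBJC_CLASS_SIZE = 40,       /* five 64-bit words */
    CLASS_RAW_DATA_OFFSET = 32, /* raw_data_addr is the fifth word */
    RAW_DATA_SIZE = 72,         /* 4 * uint32 + 7 * uint64 */
    RAW_DATA_NAME_OFFSET = 24   /* name_addr inside raw_data */
};

/* Read a 64-bit little-endian word from a byte buffer. */
static uint64_t read_u64_le(const uint8_t *p) {
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--) {
        v = (v << 8) | p[i];
    }
    return v;
}

/* Follow class -> raw_data -> name.
 * ASSUMPTION: every address is an offset into `image`.
 * Returns NULL if any step leaves the buffer. */
static const char *class_name_at(const uint8_t *image, size_t size,
                                 uint64_t class_addr) {
    assert(image != NULL);
    if (class_addr > size || size - class_addr < OBJC_CLASS_SIZE) {
        return NULL;
    }
    uint64_t raw = read_u64_le(image + class_addr + CLASS_RAW_DATA_OFFSET);
    if (raw > size || size - raw < RAW_DATA_SIZE) {
        return NULL;
    }
    uint64_t name = read_u64_le(image + raw + RAW_DATA_NAME_OFFSET);
    if (name >= size) {
        return NULL;
    }
    /* Make sure the string is terminated inside the buffer. */
    if (memchr(image + name, 0, size - name) == NULL) {
        return NULL;
    }
    return (const char *)(image + name);
}
```

Applying the same walk to every address from the class list is one way to identify the class of interest without relying on the names Hopper parses.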
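The lists of methods, protocols, ivars, and properties each consist of a 64-bit header followed by a sequence of identical structures.
Let's start with the list of methods. It starts with a 64-bit header:
struct objc_list_header {
    uint32 flags;  // size of one entry, combined with flag bits
    uint32 size;   // number of entries in the list
}
The header is followed by a row of structures of the following type:
struct objc_method {
    uint64 name;            // address of the method name
    uint64 signature;       // address of the encoded type signature
    uint64 implementation;  // address of the method code
}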
The signature may look, for example, like this: i24@0:8@16. The numbers (the total size of the arguments and the byte offset of each one) are not needed to restore the interface, and the rest is decoded as follows:
i -> int - the return value
@ -> Objective-C object - 1st argument, self
: -> selector - 2nd argument
@ -> Objective-C object - 3rd argument (1st argument after “:”)
Exact Objective-C object types cannot be restored by signature.
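The decoding above can be sketched in a few lines: drop the decimal numbers and map each remaining character to a type. This is a minimal illustration with hypothetical function names, and it assumes a simple signature. Encodings for structs, arrays, or quoted class names contain digits and extra characters that this sketch does not handle.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Copy the type characters of a signature into `out`, skipping the
 * numbers. Returns the number of characters written, not counting
 * the terminating NUL. */
static size_t strip_offsets(const char *sig, char *out, size_t out_size) {
    assert(sig != NULL && out != NULL);
    size_t n = 0;
    if (out_size == 0) {
        return 0;
    }
    for (; *sig != '\0' && n + 1 < out_size; sig++) {
        if (!isdigit((unsigned char)*sig)) {
            out[n++] = *sig;
        }
    }
    out[n] = '\0';
    return n;
}

/* Map one type character to a readable name (small subset only). */
static const char *type_name(char c) {
    switch (c) {
    case 'i': return "int";
    case 'v': return "void";
    case '@': return "id";
    case ':': return "SEL";
    case '#': return "Class";
    default:  return "?";
    }
}
```

For i24@0:8@16 the stripped form is i@:@: the first character is the return type, the next two are self and the selector, and the rest are the explicit arguments.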
Note that this list also includes methods that we did not define ourselves, namely the property getters and setters and the method -[InspectedObject .cxx_destruct], used by ARC (and Objective-C++).
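Walking a method list then comes down to reading the header and stepping through 24-byte entries. The sketch below assumes that the second 32-bit word of the header is the entry count and that the list bytes are already in a buffer; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint64_t name;           /* address of the method name */
    uint64_t signature;      /* address of the encoded signature */
    uint64_t implementation; /* address of the method code */
} objc_method;

enum { LIST_HEADER_SIZE = 8, METHOD_ENTRY_SIZE = 24 };

/* Read a 32-bit little-endian word. */
static uint32_t read_u32_le(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Read a 64-bit little-endian word. */
static uint64_t read_u64_le(const uint8_t *p) {
    return (uint64_t)read_u32_le(p) | ((uint64_t)read_u32_le(p + 4) << 32);
}

/* ASSUMPTION: the second header word holds the number of entries. */
static uint32_t method_count(const uint8_t *list) {
    assert(list != NULL);
    return read_u32_le(list + 4);
}

/* Read the entry with the given index; returns 0 on success and -1
 * if the index is out of range. */
static int method_at(const uint8_t *list, uint32_t index, objc_method *out) {
    assert(out != NULL);
    if (index >= method_count(list)) {
        return -1;
    }
    const uint8_t *entry =
        list + LIST_HEADER_SIZE + (size_t)index * METHOD_ENTRY_SIZE;
    out->name = read_u64_le(entry);
    out->signature = read_u64_le(entry + 8);
    out->implementation = read_u64_le(entry + 16);
    return 0;
}
```

The ivar and property lists can be walked the same way with their own entry sizes.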
The protocol list header is simply a 64-bit list size. It is followed by the 64-bit addresses of the protocols that the class conforms to. A protocol looks like this in memory:
struct objc_protocol {
    uint64 isa_addr;        // 0 in the binary file
    uint64 name_addr;       // protocol name
    uint64 protocols_addr;  // list of protocols that this protocol conforms to
    uint64 instance_methods_addr;
    uint64 class_methods_addr;
    uint64 optional_instance_methods_addr;
    uint64 optional_class_methods_addr;
    uint64 instance_properties_addr;
}
The uncommented fields point to lists arranged in the same way as the lists in the class's raw_data.
All ivars are written to objc_ivar_list, which starts with an objc_list_header, and each of them looks like this:
struct objc_ivar {
    uint64 offset_addr;  // address of the location that stores the ivar offset
    uint64 name_addr;    // ivar name
    uint64 type_addr;    // ivar type encoding
    uint32 alignment;    // alignment
    uint32 size;         // ivar size
}
Note that this list also contains the synthesized ivars (in our case, NSString *_strongStringProperty and NSNumber *_weakNumberProperty). The strong and weak layout fields from raw_data show which members the object holds strong references to and which weak ones. The remaining variables are plain assign, stored by value.
A layout is a sequence of bytes terminated by a zero byte, and each byte holds two 4-bit numbers. The low number is the count of consecutive (in objc_ivar_list) variables of the given kind, and the high number is the gap before that block, that is, the number of variables skipped. In our case the variables go in the order assign-weak-strong-strong-weak. For strong references, 2 non-strong variables come first, followed by 2 strong ones, so the strong layout is 0x22. For weak references we skip 1 and take 1, then skip 2 and take 1, so the weak layout is 0x11 0x21. Hopper parses layouts as strings and therefore shows the second one, for example, as "\x11!". To see the original byte sequence, switch to Hexadecimal Mode.
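The layout bytes from this example can be decoded mechanically. The sketch below splits every byte into its high nibble (how many variables to skip) and low nibble (how many consecutive variables of the kind follow); the type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    unsigned skip; /* variables skipped before the block */
    unsigned run;  /* consecutive variables of the given kind */
} layout_block;

/* Decode a zero-terminated layout into (skip, run) pairs.
 * Returns the number of blocks written, at most max_blocks. */
static size_t decode_layout(const uint8_t *layout, layout_block *out,
                            size_t max_blocks) {
    assert(layout != NULL && out != NULL);
    size_t n = 0;
    while (n < max_blocks && layout[n] != 0) {
        out[n].skip = layout[n] >> 4;   /* high nibble */
        out[n].run = layout[n] & 0x0F;  /* low nibble */
        n++;
    }
    return n;
}
```

For the strong layout 0x22 this gives one block (skip 2, run 2), and for the weak layout 0x11 0x21 two blocks (skip 1, run 1) and (skip 2, run 1), matching the order assign-weak-strong-strong-weak.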
The property list starts with a header like objc_list_header and consists of the following structures:
struct objc_property {
    uint64 name_addr;        // property name
    uint64 attributes_addr;  // attribute string
}
Let us describe the structure of the attribute string. It starts with “T” followed by the property type, and then come the attributes, separated by commas:
R - readonly
C - copy
& - retain (strong if using ARC)
N - nonatomic
G - custom getter
S - custom setter
D - dynamic
W - weak
P - property suitable for automatic garbage collection
t - type in the old encoding
The attribute string ends with the name of the property's ivar, prefixed with V. For our property
@property(nonatomic, strong) NSString *strongStringProperty;
the attribute string is "T@\"NSString\",&,N,V_strongStringProperty".
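A small parser makes the structure of the attribute string concrete. This is a sketch under simplifying assumptions: the struct and function names are hypothetical, only some of the attribute codes listed above are recognized, and commas are treated as separators unless they appear inside double quotes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    char type[64]; /* text after T, for example @"NSString" */
    char ivar[64]; /* text after V, the backing ivar name */
    int readonly, copy, retain, nonatomic, weak, dynamic;
} property_attrs;

/* Copy at most dst_size - 1 characters and terminate the string. */
static void copy_field(char *dst, size_t dst_size, const char *src,
                       size_t len) {
    if (len >= dst_size) {
        len = dst_size - 1;
    }
    memcpy(dst, src, len);
    dst[len] = '\0';
}

/* Parse an attribute string. Returns 0 on success, or -1 if the
 * string does not start with T. G, S, P and t are ignored here. */
static int parse_property_attributes(const char *attrs, property_attrs *out) {
    assert(attrs != NULL && out != NULL);
    memset(out, 0, sizeof *out);
    if (attrs[0] != 'T') {
        return -1;
    }
    const char *p = attrs;
    while (*p != '\0') {
        /* Find the end of the current item; quoted commas do not split. */
        const char *end = p;
        int in_quotes = 0;
        while (*end != '\0' && (in_quotes || *end != ',')) {
            if (*end == '"') {
                in_quotes = !in_quotes;
            }
            end++;
        }
        if (end > p) {
            size_t len = (size_t)(end - p) - 1; /* payload after the code */
            switch (*p) {
            case 'T': copy_field(out->type, sizeof out->type, p + 1, len); break;
            case 'V': copy_field(out->ivar, sizeof out->ivar, p + 1, len); break;
            case 'R': out->readonly = 1; break;
            case 'C': out->copy = 1; break;
            case '&': out->retain = 1; break;
            case 'N': out->nonatomic = 1; break;
            case 'W': out->weak = 1; break;
            case 'D': out->dynamic = 1; break;
            default: break;
            }
        }
        p = (*end == ',') ? end + 1 : end;
    }
    return 0;
}
```

For the string from this example the parser yields the type @"NSString", the ivar _strongStringProperty, and the retain and nonatomic flags set.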
Note that the property list, like the method list, contains some automatically generated entries:
@property(readonly) NSUInteger hash;
@property(readonly, copy) NSString *description;
@property(readonly, copy) NSString *debugDescription;
@property(readonly) id superclass; // the attribute string starts with "T#", i.e. the type is Class
It is worth noting that ivars are not synthesized for these properties.
To complete the picture, it remains to cover class methods. These are ordinary methods in which the receiver is the class itself, so they are stored in the class of the class, that is, in the metaclass. The metaclass also has its own raw_data with a list of methods. Putting it all together, we get the following restored class interface:
@interface InspectedObject : NSObject<InspectedProtocol> {
    int intIvar;
    NSString __weak *weakStringIvar;
    NSNumber *strongNumberIvar;
    NSString *_strongStringProperty;
    NSNumber __weak *_weakNumberProperty;
}
@property(nonatomic, strong) NSString *strongStringProperty;
@property(weak) NSNumber *weakNumberProperty;
@property(readonly) NSUInteger hash;
@property(readonly, copy) NSString *description;
@property(readonly, copy) NSString *debugDescription;
@property(readonly) id superclass;
- (int)instanceMethod:(id)arg;
- (void).cxx_destruct;
- (id)strongStringProperty;
- (void)setStrongStringProperty:(id)arg;
- (id)weakNumberProperty;
- (void)setWeakNumberProperty:(id)arg;
+ (id)classMethod:(id)arg;
@end
Removing the automatically generated members, we get:
@interface InspectedObject : NSObject<InspectedProtocol> {
    int intIvar;
    NSString __weak *weakStringIvar;
    NSNumber *strongNumberIvar;
}
@property(nonatomic, strong) NSString *strongStringProperty;
@property(weak) NSNumber *weakNumberProperty;
- (int)instanceMethod:(id)arg;
+ (id)classMethod:(id)arg;
@end
In total, apart from the loss of the exact types of Objective-C objects in method arguments and return values, the interface is restored completely.
In conclusion, I want to make a small announcement: if you are interested in checking your code with an automatic analyzer, now is the time when this can be done absolutely free. Here you can read about Solar inCode and get trial access to one free scan.
Source: https://habr.com/ru/post/323346/