Reading binary files of iOS applications. Part 2: Swift

We continue the series about reading the binary files of iOS applications. To follow the technical details, it is recommended to read the first part beforehand. In this article we will look at how Swift code is laid out in a binary file.

So, create a Single View Application in Swift and add the following file, Inspected.swift:

    import Foundation

    class InspectedObject {
        var intVar: Int = 57
        let stringConst = "const string"
        func instanceMethod(arg: Int) -> Int { return arg + 57 }
        func toBeOverriden() {}
        static func classMethod() {}
    }

    class SubInspectedObject: InspectedObject {
        var subConstInt = 1543
        let subStringVar = "sub const string"
        func subInstanceMethod() {}
        override func toBeOverriden() {}
    }

Note that it only makes sense to build this code in the debug configuration, since in a release build Swift will inline and devirtualize everything.

We find our class again via objc_classlist. Instead of the name, we see a mangled string: __TMC12InspectedApp15InspectedObject. I will not describe the Swift mangling algorithm in detail here, and it is not really necessary, because reasonably recent versions of Xcode ship with the swift-demangle utility, which lives at approximately the following path:

/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/swift-demangle

Running the name through swift-demangle, we get:

 _TMC12InspectedApp15InspectedObject ---> type metadata for InspectedApp.InspectedObject
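To make the shape of such a name less mysterious, here is a minimal sketch in C that decodes only this one simple case: the prefix _TMC followed by two length-prefixed identifiers (module, then class). The function names are hypothetical, and this is in no way a replacement for swift-demangle, which handles the full mangling grammar.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Reads one length-prefixed identifier such as "12InspectedApp".
 * Hypothetical helper; returns 0 on success, -1 on malformed input. */
static int read_ident(const char **p, char *out, size_t cap) {
    size_t len = 0;
    if (!isdigit((unsigned char)**p)) return -1;
    while (isdigit((unsigned char)**p)) {
        len = len * 10 + (size_t)(**p - '0');
        (*p)++;
    }
    if (len == 0 || len >= cap || strlen(*p) < len) return -1;
    memcpy(out, *p, len);
    out[len] = '\0';
    *p += len;
    return 0;
}

/* Decodes names of the form _TMC<len><module><len><class> only.
 * Writes "Module.Class" into out; returns 0 on success, -1 otherwise. */
int demangle_class_metadata(const char *mangled, char *out, size_t cap) {
    char module[128], cls[128];
    const char *p;
    int n;
    assert(mangled != NULL && out != NULL);
    if (strncmp(mangled, "_TMC", 4) != 0) return -1;
    p = mangled + 4;
    if (read_ident(&p, module, sizeof module) != 0) return -1;
    if (read_ident(&p, cls, sizeof cls) != 0) return -1;
    if (*p != '\0') return -1;  /* anything more complex is out of scope */
    n = snprintf(out, cap, "%s.%s", module, cls);
    return (n < 0 || (size_t)n >= cap) ? -1 : 0;
}
```

The numbers 12 and 15 in the mangled string are simply the lengths of "InspectedApp" and "InspectedObject".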

That is, the description of the class InspectedObject is located at this address, which is logical. Looking at the description, we see almost the same structure as for an Objective-C class, with some differences:

The two 64-bit words before the beginning of the structure also belong to the class description.
The lowest bit of the pointer to raw_data is 1. This bit marks the class as written in Swift.
After a set of fixed fields comes a variable-size part: the virtual method table and other members of the class.
The raw_data structure is also present, but all the information it contains is also available in the class descriptor.
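The check on the raw_data pointer can be sketched as follows. The macro and function names are our own and purely illustrative; the only fact taken from the text is that the lowest bit of the stored value marks a Swift class and has to be cleared to obtain the real address.

```c
#include <stdbool.h>
#include <stdint.h>

/* Lowest bit of the data field; the name is an assumption for this sketch. */
#define SWIFT_CLASS_BIT UINT64_C(1)

/* True when the stored data field marks the class as written in Swift. */
bool is_swift_class(uint64_t data_field) {
    return (data_field & SWIFT_CLASS_BIT) != 0;
}

/* Address of raw_data with the marker bit cleared. */
uint64_t raw_data_address(uint64_t data_field) {
    return data_field & ~SWIFT_CLASS_BIT;
}
```

For a pure Objective-C class the stored value is already the address, so raw_data_address returns it unchanged.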

The layout of a Swift class in a binary file can be studied in the Swift source code. A class record is assembled from the fields of the following classes:

HeapMetadataHeaderPrefix (destructor),
TypeMetadataHeader (value witness table),
TypeMetadataHeader (kind = isa),
TargetClassMetadata (everything else).

Putting it together:

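    struct swift_class {
        uint64 destructor_addr;      // destructor
        uint64 witness_table_addr;   // value witness table
        uint64 metaclass_addr;       // as in Objective-C
        uint64 superclass_addr;      // as in Objective-C
        uint64 cache_addr;           // as in Objective-C
        uint64 vtable_addr;          // as in Objective-C
        uint64 data_addr;            // as in Objective-C, plus 1
        uint32 class_flags;          // Swift flags (see below)
        uint32 inst_addr_point;      // address point of instances of this class
        uint32 inst_size;            // instance size
        uint16 inst_align_mask;      // instance alignment mask
        uint16 reserved;             // reserved
        uint32 class_size;           // size of the class object
        uint32 class_addr_point;     // offset of the address point within the class object
        int64 descriptor_rel_addr;   // relative address of the class descriptor (see below)
        int64 ivar_destroyer;        // function that destroys the instance variables (may be null)
    }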

The Swift flags field is an object of type ClassFlags.
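As a quick sanity check of this layout, here is the same record written with fixed-width C types. The helper functions are hypothetical. They assume, following the text, that the address found in objc_classlist points at the metaclass field, with two 64-bit words in front of it, and that the variable part starts right after the fixed one.

```c
#include <stddef.h>
#include <stdint.h>

struct swift_class {
    uint64_t destructor_addr;
    uint64_t witness_table_addr;
    uint64_t metaclass_addr;      /* the class address points here */
    uint64_t superclass_addr;
    uint64_t cache_addr;
    uint64_t vtable_addr;
    uint64_t data_addr;
    uint32_t class_flags;
    uint32_t inst_addr_point;
    uint32_t inst_size;
    uint16_t inst_align_mask;
    uint16_t reserved;
    uint32_t class_size;
    uint32_t class_addr_point;
    int64_t  descriptor_rel_addr;
    int64_t  ivar_destroyer;
};

/* Start of the whole record, given the address listed in objc_classlist. */
uint64_t record_start(uint64_t class_addr) {
    return class_addr - offsetof(struct swift_class, metaclass_addr);
}

/* Address of the first 64-bit word of the variable-size part. */
uint64_t members_start(uint64_t class_addr) {
    return record_start(class_addr) + sizeof(struct swift_class);
}
```

With natural alignment the fixed part takes 96 bytes, so the variable part begins 80 bytes after the address that objc_classlist refers to.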

After this fixed structure, the members of the class follow in this order:

Superclass members (recursively).
A reference to the parent's data is supposed to go here, but in the current implementation it is always a zero 64-bit word.
Generic parameters of this class.
Class variables (if Swift ever supports them in this form).
Virtual methods.

Let's look at the InspectedObject and SubInspectedObject classes in the binary file we built. Pay attention to the variable part that follows the ivar destroyer. It consists of several 64-bit words. Hopper does not parse them, so it shows them as raw data in which, for example, 0x100008144 and 0x100008158 are written back to back.

Let's present this in a more digestible form. InspectedObject:

    0x1000041b4 // intVar getter
    0x1000041c8 // intVar setter
    0x1000041e0 // intVar.materializeForSet(), an accessor used for in-place modification of intVar (details: https://github.com/apple/swift/blob/swift-3.0.1-preview-2-branch/docs/proposals/Accessors.rst)
    0x100004108 // instanceMethod(arg: Int) -> Int
    0x100004138 // InspectedObject.toBeOverriden(), will be overridden in the subclass
    0x1000081d8 // InspectedObject.init() -> InspectedObject
    0x10        // offset of intVar
    0x18        // offset of stringConst

SubInspectedObject:

    0x1000041b4 // intVar getter
    0x1000041c8 // intVar setter
    0x1000041e0 // intVar.materializeForSet()
    0x100004108 // instanceMethod(arg: Int) -> Int
    0x100004344 // SubInspectedObject.toBeOverriden(), replaces the parent's implementation
    0x10000447c // SubInspectedObject.init() -> SubInspectedObject, init also replaces the parent's init
    0x10        // offset of intVar
    0x18        // offset of stringConst

This is where the members of the superclass end. Next come the members of SubInspectedObject itself:

    0x1000043e8 // subConstInt getter
    0x1000043fc // subConstInt setter
    0x100004414 // subConstInt.materializeForSet()
    0x100004334 // SubInspectedObject.subInstanceMethod()
    0x30        // offset of subConstInt
    0x38        // offset of subStringVar
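The ivar offsets in these tables can be read as byte offsets from the start of an instance. Below is a small sketch that reads and writes an Int ivar in a raw copy of an object. It assumes a 64-bit Int, as on arm64, and the function names and the buffer size used in the example are our own assumptions rather than values from the binary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Offsets taken from the tables above. */
enum { INTVAR_OFFSET = 0x10, SUBCONSTINT_OFFSET = 0x30 };

/* Reads a 64-bit Int ivar from a raw copy of an instance. */
int64_t read_int_ivar(const unsigned char *object, size_t object_size,
                      size_t offset) {
    int64_t value;
    assert(object != NULL && offset + sizeof value <= object_size);
    memcpy(&value, object + offset, sizeof value);
    return value;
}

/* Writes a 64-bit Int ivar into a raw copy of an instance. */
void write_int_ivar(unsigned char *object, size_t object_size,
                    size_t offset, int64_t value) {
    assert(object != NULL && offset + sizeof value <= object_size);
    memcpy(object + offset, &value, sizeof value);
}
```

For example, after storing 57 at INTVAR_OFFSET and 1543 at SUBCONSTINT_OFFSET in a zeroed buffer, read_int_ivar returns the initial values that Inspected.swift assigns to intVar and subConstInt.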

A few things are worth noting.

First, the reference to the toBeOverriden() method is located at the same position in InspectedObject and SubInspectedObject. This allows Swift to call virtual methods by their offset from the beginning of the class.

Second, Swift does not generate getters and setters for every ivar, and it does not follow the seemingly logical rule "generate them for variables, but not for constants".

Third, note that the names and signatures of the methods were supplied by Hopper, which took them from the symbol table. However, the corresponding symbols are not needed for the program to function, so in practice they are stripped from the binary file. As a result, information about the signatures of Swift methods usually cannot be obtained from a binary file, except in the case that we will discuss later.
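The first observation can be illustrated with the addresses listed earlier. The sketch below stores the first six words of each table in an array and resolves a method by its slot index; the enum and function names are ours and exist only for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Slot indices, in the order the members appear in the tables above. */
enum {
    SLOT_INTVAR_GETTER,
    SLOT_INTVAR_SETTER,
    SLOT_INTVAR_MATERIALIZE,
    SLOT_INSTANCE_METHOD,
    SLOT_TO_BE_OVERRIDEN,
    SLOT_INIT,
    SLOT_COUNT
};

const uint64_t inspected_object_members[SLOT_COUNT] = {
    0x1000041b4, 0x1000041c8, 0x1000041e0,
    0x100004108, 0x100004138, 0x1000081d8
};

const uint64_t sub_inspected_object_members[SLOT_COUNT] = {
    0x1000041b4, 0x1000041c8, 0x1000041e0,
    0x100004108, 0x100004344, 0x10000447c
};

/* A virtual call only needs the slot index: the same index works for the
 * base class and for the subclass, and the table decides which
 * implementation is reached. */
uint64_t resolve_virtual(const uint64_t *members, size_t slot) {
    return members[slot];
}
```

Resolving SLOT_TO_BE_OVERRIDEN gives 0x100004138 for InspectedObject and 0x100004344 for SubInspectedObject, while SLOT_INSTANCE_METHOD gives 0x100004108 for both.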

Let us now turn to the class descriptor. The pointer to the descriptor is signed and relative. For example, in our binary file this pointer lies at address 0x1000094a0 and holds 0xffffffffffffd9e8, which is the two's-complement representation of -0x2618. We get 0x1000094a0 - 0x2618 = 0x100006e88, the address where the descriptor lies. The following data is stored in the descriptor:

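    struct {
        int32 name_addr;                     // relative address of the class name
        uint32 num_fields;                   // number of fields
        uint32 fields_offsets_vector_offset; // offset of the field offsets vector within the class record
        int32 fields_names_addr;             // relative address of the field names
        int32 fields_types_accessor_addr;    // relative address of the function that returns the field types
        uint32 generic_pattern_and_kind;     // generic metadata pattern and kind
        int32 metadata_accessor_addr;        // relative address of the function that returns the class metadata
    }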
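The arithmetic for the descriptor pointer, and the same idea applied to the 32-bit fields of the descriptor, can be sketched like this. All function names are hypothetical. The sketch assumes that every relative field is measured from the address of the field itself, which is how the 64-bit descriptor pointer in the example above behaves.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct swift_class_descriptor {
    int32_t  name_addr;
    uint32_t num_fields;
    uint32_t fields_offsets_vector_offset;
    int32_t  fields_names_addr;
    int32_t  fields_types_accessor_addr;
    uint32_t generic_pattern_and_kind;
    int32_t  metadata_accessor_addr;
};

/* Reinterprets a raw 64-bit word as a signed offset (two's complement). */
int64_t word_to_offset(uint64_t raw) {
    int64_t value;
    memcpy(&value, &raw, sizeof value);
    return value;
}

/* Target of a relative pointer stored at field_addr. Unsigned addition
 * wraps modulo 2^64, so a negative offset moves the address backwards. */
uint64_t resolve_relative(uint64_t field_addr, int64_t offset) {
    return field_addr + (uint64_t)offset;
}

/* Target of a 32-bit relative field located field_offset bytes into a
 * descriptor that starts at descriptor_addr. */
uint64_t descriptor_field_target(uint64_t descriptor_addr,
                                 size_t field_offset, int32_t stored) {
    return resolve_relative(descriptor_addr + field_offset, (int64_t)stored);
}
```

With the numbers from the text, word_to_offset(0xffffffffffffd9e8) is -0x2618, and resolve_relative(0x1000094a0, -0x2618) is 0x100006e88.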

It turns out that information about the types of the ivars is not stored explicitly. However, it can be extracted from the code of the fields types accessor function. For example, consider the fields types accessor for InspectedObject (arm64 assembly).

It stores on the stack the types referenced from addresses 0x100008000 and 0x100008008. Let's see what lies there.

We see the symbols __TMSS and __TMSi, recognized by Hopper, which demangle to Swift.String and Swift.Int. The corresponding symbols are non-local, so they are not stripped from the symbol table.

So, putting everything together and assuming the absence of symbols corresponding to internal methods, we get the following restored interface of the InspectedObject class:

    class InspectedObject {
        var intVar: Int
        var stringConst: String
        func sub_100004108()
        func sub_100004138()
    }

Note that the class method classMethod() is generated as an independent function, and its presence cannot be recovered from the binary code alone.

In general, the interface restored for a pure Swift class is rather poor. However, if a class has an Objective-C class as an ancestor, it supports Objective-C compatibility, and all its Swift methods are wrapped in Objective-C methods, which makes it possible to recover their names.

So, let's add inheritance from NSObject to the InspectedObject declaration:

 class InspectedObject : NSObject { ... }

We look at the binary file. Now raw_data is filled in, and we see all the declared methods, including setters and getters, as well as classMethod() in the metaclass. Method names are slightly changed: for example, instead of “instanceMethod” we see “instanceMethodWithArg:”. Let's look at the code of this method.

This is arm64 assembly again, and all we need to know about it is that calls to other functions correspond to bl instructions. We see that the corresponding Swift method is called. Even without a symbol table this method can be identified, since all the other calls (bl instructions) are to retain and release, whose symbols are not stripped.
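To find such calls without a disassembler, it is enough to recognize the bl encoding. The sketch below relies on the A64 encoding of BL: the top six bits are 100101, followed by a signed 26-bit offset counted in 4-byte instructions. The function names are our own.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when the 32-bit instruction word is an A64 BL (branch with link). */
bool is_bl(uint32_t insn) {
    return (insn & 0xFC000000u) == 0x94000000u;
}

/* Address called by a BL instruction located at address pc. */
uint64_t bl_target(uint64_t pc, uint32_t insn) {
    int64_t imm26 = (int64_t)(insn & 0x03FFFFFFu);
    if (imm26 & 0x02000000) {
        imm26 -= 0x04000000;  /* sign-extend the 26-bit field */
    }
    /* The offset is in instructions, so scale by 4 bytes; unsigned
     * addition wraps, which handles backward branches. */
    return pc + (uint64_t)(imm26 * 4);
}
```

Scanning the body of the Objective-C wrapper with is_bl and discarding the targets that resolve to retain and release leaves the address of the wrapped Swift method.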

classMethod() is found in the same way in the metaclass. Now the interface is recovered much better:

    class InspectedObject {
        var intVar: Int
        let stringConst: String
        func instanceMethodWithArg(Int) -> Int
        func toBeOverriden()
        static func classMethod()
    }

Source: https://habr.com/ru/post/325644/
