Objective-C Runtime for C Programmers. Part 2

Hello again. This series of articles is for programmers who have moved from C to Objective-C and want answers to the questions “how exactly is Objective-C built on C?” and “how does it all work on the inside?”

Many thanks to everyone for the feedback; your interest is what motivates me to continue this thorough study of the Objective-C Runtime. I begin this part by restating what the series is about, because I want to make a couple of clarifications:

My articles are not a manual on Objective-C. We study the Objective-C Runtime at a level low enough to understand it in terms of the C language.
My articles are not a guide to the C language or to debuggers. We go down to the level of C, but no lower. Therefore, I do not cover topics such as how data is laid out in memory; I assume you already know all that.

Of course, the articles may be of interest to other programmers as well, but keep these two points in mind.

If you have not read the first article, I strongly recommend reading it first: http://habrahabr.ru/post/250955/ . If you already have, read on.

We "call the methods," and the pedants "send messages."

In the previous article, we dealt with “calling methods” or, as it is also called, “sending messages”:

[myObj someMethod];

We concluded that at run time such a construct turns into a call to the objc_msgSend() function, and we got thoroughly acquainted with selectors.

Now let's look at the objc_msgSend() function in detail to understand the principles behind this notorious sending of messages to an object.

This function is called every time you call a method on an object, so it is logical to assume that its speed greatly affects the speed of the entire application. That is why, if you look at its source code, you will find that it is implemented in assembly language for each platform.

Before dealing with the source code, I suggest reading the documentation:

...

The messaging function does everything necessary for dynamic binding:

It first finds the procedure (method implementation) that the selector refers to. Since the same method can be implemented differently by separate classes, the precise procedure that it (the objc_msgSend function, author's note) finds depends on the class of the receiver (the object to which we send the message, author's note).
It then calls the procedure, passing it the receiving object (a pointer to it) and all the arguments that were passed in the method call.
Finally, it returns the result of the procedure as its own result.

...
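The three steps quoted above can be modelled in plain C. This is only a sketch under loud assumptions: the `Object`, `Selector`, `MethodEntry`, `find_implementation` and `send_message` names are hypothetical, selectors are treated as plain strings, and every method is assumed to take one `int` argument. The real objc_msgSend() is written in assembly and works differently in detail.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the runtime's types; not Apple's definitions. */
typedef struct Object Object;
typedef const char *Selector;   /* assumption: a selector is just a name */
typedef int (*Implementation)(Object *self, Selector cmd, int arg);

typedef struct {
    Selector name;
    Implementation imp;
} MethodEntry;

struct Object {
    const MethodEntry *methods; /* methods of the receiver's class */
    size_t method_count;
    int value;
};

/* Step 1: find the procedure the selector refers to, for this receiver. */
Implementation find_implementation(const Object *receiver, Selector sel) {
    for (size_t i = 0; i < receiver->method_count; i++) {
        if (strcmp(receiver->methods[i].name, sel) == 0) {
            return receiver->methods[i].imp;
        }
    }
    return NULL;
}

/* Steps 2 and 3: call the procedure with the receiver, the selector and
   the argument, then return its result as our own result. */
int send_message(Object *receiver, Selector sel, int arg) {
    Implementation imp = find_implementation(receiver, sel);
    if (imp == NULL) {
        return -1; /* toy error value; the real runtime does not do this */
    }
    return imp(receiver, sel, arg);
}

/* A sample "method": note the hidden receiver and selector parameters. */
int add_to_value(Object *self, Selector cmd, int arg) {
    (void)cmd;
    return self->value + arg;
}

const MethodEntry counter_methods[] = { { "addToValue:", add_to_value } };
```

With `Object obj = { counter_methods, 1, 40 };`, the call `send_message(&obj, "addToValue:", 2)` finds `add_to_value`, calls it with the receiver and the selector in front of the ordinary argument, and passes its result back to the caller.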

Lyrical digression

From the documentation alone we can already see that the phrase “call a method” is perfectly correct when applied to Objective-C. So if some smart aleck corrects you, insisting that the right phrase is “send a message” rather than “call a method”, you can confidently send them off to read the documentation.

The second and third points are clear enough. The first one needs a closer look: how exactly is a completely abstract selector turned into a very concrete function?

Communicating with class methods in C

Since the familiar objc_msgSend() function first looks for the function that implements the method being called, we can find that function and call it ourselves.

Let's write a small test program that will let us take a closer look at method calls:

    #import <Foundation/Foundation.h>
    #import <objc/runtime.h>

    @interface TestClass : NSObject
    - (void)someMethod;
    - (void)callSomeMethod;
    - (void)methodWithParam:(const char *)param;
    @end

    @implementation TestClass

    // Each method logs its two hidden arguments: self and _cmd.
    // sel_getName() converts the selector to a C string for %s.
    - (void)someMethod {
        NSLog(@"Hello from %p.%s!", self, sel_getName(_cmd));
    }

    - (void)callSomeMethod {
        NSLog(@"Hello from %p.%s!", self, sel_getName(_cmd));
        [self someMethod];
    }

    - (void)methodWithParam:(const char *)param {
        NSLog(@"Hello from %p.%s! My parameter is: <%s>", self, sel_getName(_cmd), param);
    }

    @end

    int main(int argc, const char * argv[]) {
        TestClass * myObj = [[TestClass alloc] init];

        [myObj someMethod];
        [myObj callSomeMethod];
        [myObj methodWithParam:"I'm a parameter"];

        return 0;
    }

From the documentation we know that when objc_msgSend() calls the function it has found, it passes the parameters in the following order:

Pointer to the object whose method we called
The selector by which we called the method
The remaining arguments that we passed to the method

That is why our test program looks the way it does: each method logs self and _cmd, which hold the pointer “to itself” and the selector, respectively.

If you run this program, the output will be something like:

2015-02-21 12:43:18.817 ObjCRuntimeTest[7092:2454834] Hello from 0x1002061f0.someMethod!
2015-02-21 12:43:18.818 ObjCRuntimeTest[7092:2454834] Hello from 0x1002061f0.callSomeMethod!
2015-02-21 12:43:18.819 ObjCRuntimeTest[7092:2454834] Hello from 0x1002061f0.someMethod!
2015-02-21 12:43:18.819 ObjCRuntimeTest[7092:2454834] Hello from 0x1002061f0.methodWithParam:! My parameter is: <I'm a parameter>

Now let's try to call these methods using plain C. To do this, we ask the object for a pointer to the function that implements a method of our class. Since we are working at the C level, we need to define types that let us work with pointers to these functions. With all that in mind, we get the following code in main():

    int main(int argc, const char * argv[]) {
        // C function pointer types matching the method signatures:
        // receiver, selector, then the method's own arguments.
        typedef void (*MethodWithoutParams)(id, SEL);
        typedef void (*MethodWithParam)(id, SEL, const char *);

        TestClass * myObj = [[TestClass alloc] init];

        // methodForSelector: returns a generic IMP, so cast it to the exact type.
        MethodWithoutParams someMethodImplementation = (MethodWithoutParams)[myObj methodForSelector:@selector(someMethod)];
        MethodWithoutParams callSomeMethodImplementation = (MethodWithoutParams)[myObj methodForSelector:@selector(callSomeMethod)];
        MethodWithParam methodWithParamImplementation = (MethodWithParam)[myObj methodForSelector:@selector(methodWithParam:)];

        someMethodImplementation(myObj, @selector(someMethod));
        callSomeMethodImplementation(myObj, @selector(callSomeMethod));
        methodWithParamImplementation(myObj, @selector(methodWithParam:), "I'm a parameter");

        return 0;
    }

So, we have called the methods using almost nothing but plain C. The only exceptions are the selectors, which we explored thoroughly in the previous article, and the methodForSelector: method, which is still a black box for us.

Message Engine in the Objective-C Runtime

The key to how the messaging mechanism is implemented in the Objective-C Runtime is the way the compiler represents your classes and objects.

In C++ terms, objects are created in memory not only for each instance of your classes, but also for each class itself. That is, if you declare a class that inherits from the base class NSObject and create two instances of it, at run time you will have the two objects you created plus one class object.

This class object contains a pointer to the class object of the parent class and a table mapping selectors to function addresses, called the dispatch table. The objc_msgSend() function uses this table to find the function that must be called for the selector passed to it.

Every object whose class inherits from NSObject or NSProxy has an isa field, which is precisely the pointer to its class object. When you call a method on an object, the objc_msgSend() function follows the isa pointer to the class object and searches it for the address of the function that implements the method. If it does not find one, it moves on to the class object of the parent class and searches there. This continues until the function is found. If the function is not found anywhere, including in the class object of NSObject, we get the exception familiar to everyone:

unrecognized selector sent to instance ...
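The search through the isa pointer and the chain of parent classes can be modelled in a few lines of C. The sketch below is an illustration only: `ClassObject`, `Instance`, `DispatchEntry` and `lookup` are hypothetical names, selectors are compared as strings, and the dispatch table is a plain array, none of which reflects the actual layout used by the runtime.

```c
#include <stddef.h>
#include <string.h>

typedef struct ClassObject ClassObject;
typedef struct Instance Instance;
typedef const char *(*Implementation)(Instance *self, const char *cmd);

typedef struct {
    const char *selector;          /* assumption: selectors are strings */
    Implementation imp;
} DispatchEntry;

struct ClassObject {
    const ClassObject *superclass; /* NULL for the root class */
    const DispatchEntry *table;    /* the dispatch table */
    size_t count;
};

struct Instance {
    const ClassObject *isa;        /* every instance points at its class object */
};

/* Start at the receiver's class object and walk up through the parents. */
Implementation lookup(const Instance *obj, const char *selector) {
    for (const ClassObject *cls = obj->isa; cls != NULL; cls = cls->superclass) {
        for (size_t i = 0; i < cls->count; i++) {
            if (strcmp(cls->table[i].selector, selector) == 0) {
                return cls->table[i].imp;
            }
        }
    }
    return NULL; /* nothing found, even in the root class */
}

const char *root_description(Instance *self, const char *cmd) {
    (void)self; (void)cmd;
    return "root";
}

const char *child_hello(Instance *self, const char *cmd) {
    (void)self; (void)cmd;
    return "child";
}

const DispatchEntry root_table[]  = { { "description", root_description } };
const DispatchEntry child_table[] = { { "hello", child_hello } };

const ClassObject RootClass  = { NULL, root_table, 1 };
const ClassObject ChildClass = { &RootClass, child_table, 1 };
```

For an `Instance` whose isa is `&ChildClass`, `lookup` finds "hello" in the child's own table, finds "description" only after moving to `RootClass`, and returns NULL for an unknown selector, which is the point where the real runtime would end up raising the exception.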

And in fact...

In practice, the rather slow process of searching for functions has been improved somewhat. Once a method called on an object has been found, it is placed in a cache table. So if you call methodForSelector: on some object, the first call performs the full search for the function; once the function is found in the class object of NSObject, it is cached in the table of your class, and the next search for it takes very little time.

In addition, an exception is not raised immediately if the method's implementation is not found: there is also a mechanism called Message Forwarding.

Let's confirm this by actually digging into the source code of the Objective-C Runtime and the NSObject class.

As we already know, NSObject has a methodForSelector: method whose source code looks like this:
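The caching idea can be sketched in C as follows. Everything here is a toy model under stated assumptions: the fixed-size linear cache, the `slow_lookups` counter (present only so the effect can be observed) and the names `cached_lookup` and `slow_lookup` are hypothetical and do not mirror the runtime's real cache.

```c
#include <stddef.h>
#include <string.h>

typedef void (*Implementation)(void);

enum { CACHE_SLOTS = 8 };

typedef struct {
    const char *selector;
    Implementation imp;
} Entry;

typedef struct ClassObject {
    const struct ClassObject *superclass;
    const Entry *table;
    size_t count;
    Entry cache[CACHE_SLOTS]; /* toy per-class cache, filled lazily */
    size_t cached;
    unsigned slow_lookups;    /* demo only: counts full searches */
} ClassObject;

/* Full search: this class, then every parent class. */
Implementation slow_lookup(const ClassObject *cls, const char *selector) {
    for (; cls != NULL; cls = cls->superclass) {
        for (size_t i = 0; i < cls->count; i++) {
            if (strcmp(cls->table[i].selector, selector) == 0) {
                return cls->table[i].imp;
            }
        }
    }
    return NULL;
}

/* Check the receiver's class cache first; fall back to the full search
   and remember the result in the cache of the class we started from. */
Implementation cached_lookup(ClassObject *cls, const char *selector) {
    for (size_t i = 0; i < cls->cached; i++) {
        if (strcmp(cls->cache[i].selector, selector) == 0) {
            return cls->cache[i].imp;
        }
    }
    cls->slow_lookups++;
    Implementation imp = slow_lookup(cls, selector);
    if (imp != NULL && cls->cached < CACHE_SLOTS) {
        /* The selector pointer is stored as is, so it must outlive the cache. */
        cls->cache[cls->cached].selector = selector;
        cls->cache[cls->cached].imp = imp;
        cls->cached++;
    }
    return imp;
}

void root_method(void) {}

const Entry root_table[] = { { "methodForSelector:", root_method } };
ClassObject RootClass = { NULL, root_table, 1, { { NULL, NULL } }, 0, 0 };
ClassObject MyClass   = { &RootClass, NULL, 0, { { NULL, NULL } }, 0, 0 };
```

Calling `cached_lookup(&MyClass, "methodForSelector:")` twice performs the walk up to `RootClass` only once; the second call is answered from the cache of `MyClass`, even though the method itself lives in the root class.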

    + (IMP)methodForSelector:(SEL)sel {
        if (!sel) [self doesNotRecognizeSelector:sel];
        return object_getMethodImplementation((id)self, sel); // self is the class object
    }

    - (IMP)methodForSelector:(SEL)sel {
        if (!sel) [self doesNotRecognizeSelector:sel];
        return object_getMethodImplementation(self, sel); // self is the instance
    }

As we can see, this method is implemented both as a class method and as an instance method. In both cases the same object_getMethodImplementation() function is used:

    IMP object_getMethodImplementation(id obj, SEL name)
    {
        Class cls = (obj ? obj->getIsa() : nil);
        return class_getMethodImplementation(cls, name);
    }

Stop! What is this "(obj ? obj->getIsa() : nil)" construct!? After all, every article tells us ...

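It all starts with the build settings in the Objective-C Runtime project file, which show that the runtime itself is compiled as C++ (Objective-C++), so an object there is a C++ structure that can have member functions: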

CLANG_CXX_LANGUAGE_STANDARD = "gnu++0x";
CLANG_CXX_LIBRARY = "libc++";

And here is the implementation of the getIsa () method itself:

    inline Class
    objc_object::getIsa()
    {
        if (isTaggedPointer()) {
            uintptr_t slot = ((uintptr_t)this >> TAG_SLOT_SHIFT) & TAG_SLOT_MASK;
            return objc_tag_classes[slot];
        }
        return ISA();
    }
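For readers more at home in C than in C++, the same branch can be written as an ordinary function. This is a sketch with assumed values: the `TAG_BIT`, `SLOT_SHIFT` and `SLOT_MASK` constants, the `make_tagged` helper and the class names are invented for the illustration, and the real TAG_SLOT_SHIFT and TAG_SLOT_MASK depend on the platform. The idea it models is that a tagged pointer is not an address at all: the class is encoded in a few bits of the value, so it is read from a table instead of from an isa field.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct ClassObject { const char *name; } ClassObject;
typedef struct { const ClassObject *isa; } HeapObject;

/* Assumed values for illustration only. */
enum { TAG_BIT = 1, SLOT_SHIFT = 1, SLOT_MASK = 0x7, PAYLOAD_SHIFT = 4 };

const ClassObject SmallNumberClass = { "SmallNumber" };
const ClassObject PlainClass       = { "Plain" };

/* One class per tag slot; only slot 0 is used in this sketch. */
const ClassObject *tag_classes[SLOT_MASK + 1] = { &SmallNumberClass };

/* Real objects are aligned, so the lowest bit of their address is 0. */
int is_tagged(uintptr_t ref) {
    return (ref & TAG_BIT) != 0;
}

/* Hypothetical helper: pack a slot number and a payload into one word. */
uintptr_t make_tagged(unsigned slot, uintptr_t payload) {
    return (payload << PAYLOAD_SHIFT) | ((uintptr_t)slot << SLOT_SHIFT) | TAG_BIT;
}

/* C counterpart of getIsa(): table lookup for tagged values,
   an ordinary field read for real objects. */
const ClassObject *get_isa(uintptr_t ref) {
    if (is_tagged(ref)) {
        uintptr_t slot = (ref >> SLOT_SHIFT) & SLOT_MASK;
        return tag_classes[slot];
    }
    return ((const HeapObject *)ref)->isa;
}
```

Given `HeapObject obj = { &PlainClass };`, `get_isa((uintptr_t)&obj)` reads the isa field, while `get_isa(make_tagged(0, 42))` never dereferences anything and returns `&SmallNumberClass` straight from the table.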

In short, any object in Objective-C must contain the isa field, and a class object is no exception.

All of this looks rather confusing. The methodForSelector: method has exactly the same implementation as an instance method and as a class method. The only difference is that in the instance method self points to our object, while in the class method it points to the class object.

Damn it, what is going on here!? How can we call obj->getIsa() on a class object?

The point is that a class object really does have the same field, and it points to the “class object of this class”. To use the correct term, it points to the metaclass. When you call an instance method (one that starts with a “-” sign), its implementation is looked up in the object's class. When you call a class method (one that starts with a “+” sign), its implementation is looked up in the class's metaclass.

I lied to you a little earlier in this article when I said that creating two objects of your class gives you three objects at run time: two instances of your class and one class object. In fact, a class object is always created together with a metaclass object, so in the end you get four objects.
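The four objects and the two lookup paths can be modelled in C like this. It is a deliberately simplified sketch: the `ObjectHeader` layout and the `lookup` function are hypothetical, superclasses are left out, and the metaclass's own isa is set to NULL here, which is an assumption made only to keep the example short.

```c
#include <stddef.h>
#include <string.h>

typedef struct ObjectHeader ObjectHeader;
typedef const char *(*Implementation)(void *self, const char *cmd);

typedef struct {
    const char *selector;
    Implementation imp;
} Entry;

/* Class objects and metaclass objects share one layout, and both start
   with isa, exactly like an ordinary instance does. */
struct ObjectHeader {
    const ObjectHeader *isa;
    const Entry *table;
    size_t count;
};

typedef struct {
    const ObjectHeader *isa;
} Instance;

/* The receiver may be an instance or a class object: either way the first
   field is isa, and the search happens in whatever isa points to. */
Implementation lookup(const void *receiver, const char *selector) {
    const ObjectHeader *cls = *(const ObjectHeader *const *)receiver;
    if (cls == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < cls->count; i++) {
        if (strcmp(cls->table[i].selector, selector) == 0) {
            return cls->table[i].imp;
        }
    }
    return NULL;
}

const char *some_instance_method(void *self, const char *cmd) {
    (void)self; (void)cmd;
    return "instance";
}

const char *some_class_method(void *self, const char *cmd) {
    (void)self; (void)cmd;
    return "class";
}

const Entry instance_methods[] = { { "someInstanceMethod", some_instance_method } };
const Entry class_methods[]    = { { "someClassMethod", some_class_method } };

/* "-" methods live in the class object, "+" methods in the metaclass. */
const ObjectHeader TestMetaclass = { NULL, class_methods, 1 };
const ObjectHeader TestClass     = { &TestMetaclass, instance_methods, 1 };
```

Two instances declared as `Instance a = { &TestClass }, b = { &TestClass };` together with `TestClass` and `TestMetaclass` give exactly the four objects mentioned above. `lookup(&a, "someInstanceMethod")` searches the class object, while `lookup(&TestClass, "someClassMethod")` searches the metaclass, using the very same code.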

To help visualize this whole mess, I will insert a picture from this article here:

Let's return to our case, where self eventually ends up in a call to the class_getMethodImplementation() function:

    IMP class_getMethodImplementation(Class cls, SEL sel)
    {
        IMP imp;

        if (!cls || !sel) return nil;

        imp = lookUpImpOrNil(cls, sel, nil,
                             YES/*initialize*/, YES/*cache*/, YES/*resolver*/);

        // Translate forwarding function to C-callable external version
        if (!imp) {
            return _objc_msgForward;
        }

        return imp;
    }
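The notable detail in this function is that a failed lookup does not produce a null pointer: the caller receives the forwarding function instead. A minimal C model of that fallback is shown below, where `message_forward` is a hypothetical stand-in for _objc_msgForward that merely returns a marker value, and `lookup_or_null` is an invented simplification of lookUpImpOrNil() without caching, initialization or resolvers.

```c
#include <stddef.h>
#include <string.h>

typedef int (*Implementation)(void *self, const char *cmd);

typedef struct {
    const char *selector;
    Implementation imp;
} Entry;

typedef struct {
    const Entry *table;
    size_t count;
} ClassObject;

enum { FORWARDED = -1 };

/* Hypothetical stand-in for _objc_msgForward: the real one starts the
   message forwarding machinery, this one only reports that it ran. */
int message_forward(void *self, const char *cmd) {
    (void)self; (void)cmd;
    return FORWARDED;
}

/* Simplified search that returns NULL when nothing is found. */
Implementation lookup_or_null(const ClassObject *cls, const char *selector) {
    for (size_t i = 0; i < cls->count; i++) {
        if (strcmp(cls->table[i].selector, selector) == 0) {
            return cls->table[i].imp;
        }
    }
    return NULL;
}

/* Same shape as the function above: NULL only for bad arguments,
   otherwise a callable pointer, falling back to the forwarder. */
Implementation get_method_implementation(const ClassObject *cls, const char *selector) {
    if (cls == NULL || selector == NULL) {
        return NULL;
    }
    Implementation imp = lookup_or_null(cls, selector);
    return imp != NULL ? imp : message_forward;
}

int answer(void *self, const char *cmd) {
    (void)self; (void)cmd;
    return 42;
}

const Entry test_table[] = { { "answer", answer } };
const ClassObject TestClass = { test_table, 1 };
```

With this model, `get_method_implementation(&TestClass, "missing")` returns `message_forward`, so the caller can always invoke the returned pointer, and what happens next is decided inside the forwarder rather than at the call site.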

The curious can see that the lookUpImpOrNil() function uses the lookUpImpOrForward() function, whose implementation is again available on the Apple website. The function is written in C, so you can verify for yourself that everything works exactly as the documentation describes.

Summing up

And finally, as last time, let's call the methods using plain C only:

    #import <Foundation/Foundation.h>
    #import <objc/runtime.h>

    @interface TestClass : NSObject
    @end

    @implementation TestClass

    + (void)someClassMethod {
        NSLog(@"Hello from some class method!");
    }

    - (void)someInstanceMethod {
        NSLog(@"Hello from some instance method!");
    }

    @end

    int main(int argc, const char * argv[]) {
        typedef void (*MyMethodType)(id, SEL);

        TestClass * myObj = [[TestClass alloc] init];

        // The object's isa leads to the class object, whose isa leads to the metaclass.
        Class myObjClassObject = object_getClass(myObj);
        Class myObjMetaclassObject = object_getClass(myObjClassObject);

        // Instance methods are found in the class, class methods in the metaclass.
        // class_getMethodImplementation() returns a generic IMP, so cast it.
        MyMethodType instanceMethod = (MyMethodType)class_getMethodImplementation(myObjClassObject, @selector(someInstanceMethod));
        MyMethodType classMethod = (MyMethodType)class_getMethodImplementation(myObjMetaclassObject, @selector(someClassMethod));

        instanceMethod(myObj, @selector(someInstanceMethod));
        classMethod(myObjClassObject, @selector(someClassMethod));

        return 0;
    }

In fact, we are still far from fully understanding the messaging mechanism in Objective-C. For example, we have not looked at how results are returned from called methods. Read about that in the following parts :).

Source: https://habr.com/ru/post/250977/
