“Flat” object model, or what is “syntactic sugar” worth?

Over the last 15-20 years, an incredible number of new programming languages have appeared. Many of them are presented as “true” object-oriented languages or, at a minimum, as languages that support object-oriented programming (OOP). In any case, it is stressed that these languages allow development that follows an object-oriented methodology. For a programming language to be an OOP language, it must implement some object model. Because OOP has evolved from a style of programming into something closer to a concept and a methodology, different languages have come to adhere to more or less the same object model. In addition, however, these languages have become overgrown with various syntactic constructs, some of which belong to so-called “syntactic sugar” and make frequently performed actions on objects more compact.
Because the object model in most programming languages is very complicated, we will try to describe a, so to speak, “flat” object model, removing from it everything that can be removed.

1. A terminal symbol (term) of type “object” is defined. The expression “terminal symbol of type ‘object’” means only that special lexical constructs, expressed at the language level, are available for this term. We will consider only two of them: assignment and method call.
2. The value of a terminal symbol of type “object” is a reference to some entity, which is described in the following clauses.
3. The operation "=" is defined: assignment by reference. Assigning one terminal symbol (term1) to another (term2) means copying the reference to the entity from term1 to term2, so that both refer to the same entity.
4. An entity that is the value of a term has a number of methods. A method is called and its parameters are passed using the lexical constructs "." and "()", for example term.method(param1, param2).
5. An entity has an unordered collection of (other) entities that make up its state.
6. There is at least one entity that can create other entities with a given set of methods and a given initial state.
7. There is at least one entity that can copy state and a set of methods from one entity to another.
8. During copying (clause 7), methods and state elements are not replaced if the entity being copied into already has ones with the same name.

The object model described in clauses 1-8 will be called the “flat object model”.
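To make the eight clauses more tangible, here is a minimal sketch in Python. It is only an illustration under assumptions: the names Entity, create and copy_into are hypothetical and not part of the model's definition, and plain functions stand in for the "entities that create and copy" of clauses 6 and 7 (in Python a function is itself an object).

```python
class Entity:
    """An entity with named methods and an unordered collection of state (clauses 4 and 5)."""

    def __init__(self, methods=None, state=None):
        self.methods = dict(methods or {})
        self.state = dict(state or {})

    def call(self, name, *args):
        # Stand-in for the term.method(param1, param2) construct of clause 4.
        return self.methods[name](self, *args)


def create(methods, state):
    # Clause 6: creates an entity with the given methods and initial state.
    return Entity(methods, state)


def copy_into(source, target):
    # Clauses 7 and 8: copy methods and state, never replacing same-named ones.
    for name, method in source.methods.items():
        target.methods.setdefault(name, method)
    for name, value in source.state.items():
        target.state.setdefault(name, value)


animal = create(
    {"speak": lambda self: "...", "name": lambda self: self.state["name"]},
    {"name": "animal", "legs": 4},
)
dog = create({"speak": lambda self: "Woof"}, {"name": "Rex"})
copy_into(animal, dog)  # dog gains name() and legs, but keeps its own speak() and name

alias = dog              # clause 3: assignment copies the reference, not the entity
alias.state["legs"] = 3  # visible through dog as well
```

In this sketch copy_into plays the role of inheritance, and the "do not replace" rule of clause 8 is what lets dog keep its own speak, which is the polymorphism the article attributes to that clause.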
Generally speaking, the model defined above is greatly simplified. For example, default inheritance from a base type is not considered at all, and there is not even an inheritance mechanism built into each object; instead, it is assumed that there is one predefined entity that can perform this action on other entities.
A flat object model is enough for a language that supports it to be called object-oriented. The terms most often mentioned when characterizing OOP, namely encapsulation, inheritance and polymorphism, are covered by clauses 4 and 5, 6 and 7, and 8, respectively. By the word “hack” we will mean a lexical construct of a language that allows an action to be performed on objects in a way not described in clauses 1-8. I want to point out right away that language features that do not depend on the object model of the language in question, and that are intended for other purposes unrelated to manipulating objects, will not be considered hacks. For example, if we consider an Objective-C compiler built on top of a C++ compiler, then the C++ object model is not taken into account when looking for hacks in the main Objective-C object model. Let us briefly describe how some languages, including those that appeared not so long ago, have become overgrown with hacks, and how successful these hacks have been.

No hacks needed

Indeed, why do we need hacks when programs can be written beautifully and clearly without them? Besides, the interpreter or compiler of such a language is simpler and faster, because there are no additional syntactic constructs to process. The flat object model discussed above assumes a minimum of syntactic constructs, assignment and method call, and a pure object-oriented language that contains nothing but the constructs needed for a flat object model will be very simple. Smalltalk is very close to the flat object model. It has the two syntactic constructs specified for a flat object model, assignment and method invocation (message sending), but it also has several hacks. First, there are literal forms for creating special objects: blocks, arrays, characters. But these are very neat and useful hacks, and there is nothing to find fault with. The metamodel is another matter. Strangely enough, Smalltalk was created with the expectation that children would be able to program in it, yet even a seasoned programmer can hardly figure out the Smalltalk metamodel. Smalltalk has classes, and classes are an obvious hack: an object-oriented model can do quite well without them. But for Smalltalk this is almost not a hack, because both classes and metaclasses are just global objects that, in particular, do the work of clauses 7 and 8. So Smalltalk's hacks are very neat and gentle, and hardly anyone will have complaints about them.
For example, there are no constructors in the usual sense of the word; there is a new method, which a subclass inherits when it is created. Nor is there a special construct for inheritance; there is just a subclass: message sent to the class being inherited from. And classes are simply global objects, with all that this implies:

    Object subclass: #Person
        instanceVariableNames: 'Family Name'
        ...
That is, all the complexity of managing objects is expressed not at the level of the language syntax but at the level of the metamodel, which is itself written within the object model of the language, following all its rules.
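The same idea, a class produced by an ordinary call on a global object rather than by dedicated syntax, can be sketched in Python with the built-in type(name, bases, namespace). This is only an analogy to the Smalltalk message send, not a description of how Smalltalk works; person_init and factory are names made up for the illustration.

```python
def person_init(self, family, name):
    # Plays the role of the instance initializer for the new class.
    self.family = family
    self.name = name


# No class statement: the class is created by calling the global object `type`.
Person = type("Person", (object,), {"__init__": person_init})

# The class is an ordinary object: it can be assigned and passed around.
factory = Person
someone = factory("Smith", "Ann")
```

Here Person is a value like any other, which is what "classes are simply global objects" amounts to in practice.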
Self fits the flat object model very well. It does not even have an assignment operation, which is replaced by a method call (message sending), so in this sense it is even cleaner than the flat object model discussed above.

Hacks out of necessity

The old hacks that appeared long ago in many programming languages are classes, static members, public, private and protected members, constructors and destructors. These things are so familiar that very few people, if any, would call them hacks. But the fact is that they were introduced into programming to make it not so much simpler as clearer. Moreover, in statically typed languages it is very hard to do without classes, which also serve as the types of objects. Although, frankly, a statically typed language is not an ideal environment for OOP. On the whole, these are quite appropriate things; what is surprising is how easily they have taken root in dynamically typed languages. And this trend continues, for example, in CoffeeScript. JavaScript got it right: an object is a hash table, which is logical, because from the outside an object is just an unordered set of properties and methods. It only remained to add inheritance and polymorphism, which JavaScript did very well. In general, a prototype-based language is very close to a flat object model, and JavaScript shows this well. But building on top of such a successful language as JavaScript another language with classes, CoffeeScript, already looks like a tribute to tradition, although some people do find this approach more familiar.

A few more thoughts on why classes and static members are a hack on the object model. Classes in fact play the role of global objects that create other objects. Yet in some languages classes lie outside the object model (C++, Java), while in others they are inside it and are objects themselves (Smalltalk, Python). It turns out that if there were no classes, global objects could be used for the same purposes. Many would probably say that global objects, like global variables, are bad. Global variables are bad, but global objects in a language where they replace classes are not, because they are no different from classes. On the contrary, classes are sometimes more trouble than global objects. Take Java, for example. For the Java runtime to “remember” that a class exists and to initialize its static part, you sometimes have to write Class.forName("** "); that is, the class seems to be known to the environment, but in practice you still have to tell it explicitly that the class is there.
Accordingly, static members can simply be made ordinary (non-static) members of such global objects. But again, hacks of this kind are quite normal; at least, the absolute majority of developers accept them as such.
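A small Python sketch of this replacement, assuming a hypothetical global object persons that both creates objects and holds what would otherwise be a static counter. The names persons, created and _new_person are invented for the illustration.

```python
from types import SimpleNamespace


def _new_person(name):
    # Ordinary member access on the global object instead of a static field.
    persons.created += 1
    return SimpleNamespace(name=name, id=persons.created)


# One global object takes the place of a class together with its static part.
persons = SimpleNamespace(created=0, new=_new_person)

ann = persons.new("Ann")
bob = persons.new("Bob")
```

Nothing here is static: created is a plain member of persons, and persons itself can be passed around or replaced like any other object.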

Generics can also be counted among the forced, and therefore successful, hacks, because they make it possible to overcome some of the limitations of statically typed languages when creating classes that share common logic. But while generics merely substitute for type casts and check them for compatibility at compile time, C++ templates are another story. They are perhaps one of the most powerful hacks on the object model, allowing static metaprogramming, which is clearly a good thing... if you know how to use it.

There is a very interesting case of how the constraints of one language's object model migrated to another. PHP has an object model that is syntactically very similar to Java's. That would be fine, but there is one small difference between them: Java is a statically typed language, and PHP is a dynamically typed one. PHP lets you specify the type of an argument passed to a function. But although this looks syntactically the same in Java and PHP, the different typing disciplines mean that the type declaration plays a completely different role. In Java, it tells the compiler how to treat the argument at compile time, while in PHP it amounts to reflection, that is, a runtime check that the argument has the given supertype.
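What such a runtime check looks like in a dynamically typed language can be sketched in Python, which is used here only as a stand-in for PHP. The decorator checked and the classes Shape and Circle are hypothetical; the sketch assumes that every annotation is an actual class.

```python
import inspect
from functools import wraps


def checked(func):
    """Check annotated argument types on every call, at run time."""
    signature = inspect.signature(func)

    @wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = signature.parameters[name].annotation
            # Unannotated parameters are skipped; the rest are checked by supertype.
            if expected is not inspect.Parameter.empty and not isinstance(value, expected):
                raise TypeError(
                    f"{name} must be {expected.__name__}, got {type(value).__name__}"
                )
        return func(*args, **kwargs)

    return wrapper


class Shape: ...


class Circle(Shape): ...


@checked
def describe(shape: Shape):
    return type(shape).__name__
```

The declaration shape: Shape reads like a static type, but the check only happens when describe is actually called, by inspecting the function and its argument.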

Neat hacks

First of all, I would put operator overriding and overloading into the category of neat hacks. These are no longer forced hacks but a very convenient extension of the object model. Clearly, every overloaded operator can be replaced by a method call, but operators make the code more natural and expressive. Features of this kind can also be counted among the successful extensions of the object model.
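The equivalence between an overloaded operator and a method call is easy to see in a short Python sketch. The class Money is hypothetical and exists only for this comparison.

```python
class Money:
    def __init__(self, cents):
        self.cents = cents

    def add(self, other):
        # Plain method call: all that the flat object model requires.
        return Money(self.cents + other.cents)

    def __add__(self, other):
        # The operator form simply forwards to the method.
        return self.add(other)


a, b = Money(150), Money(275)
via_method = a.add(b)
via_operator = a + b
```

Both forms produce the same result; the operator only changes how the call is spelled.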

Powerful hack

Of course, one of the most powerful hacks is the closure. Currying is possible with or without closures. In general, functional programming (FP) has ceased to be the property of the highest caste of programmers and has come to familiar languages. But FP has come not just to popular languages; it has come to languages where OOP already ruled, and these are, to put it mildly, “slightly” different paradigms. Clearly, FP could not displace OOP, so it had no choice but to lie on top of OOP (in a good sense; I just don't know how else to put it). Outwardly, syntactically, closures look cool, and it seems that FP really works “honestly”. But judging soberly, how can FP work honestly in, say, Python, where objects have taken over everything from classes to functions and modules? That's right, it cannot: what actually works is not FP but OOP, into which the interpreter converts all these wrappers, lambdas and decorators. So it turns out that all these gadgets from the world of FP are nothing more than ordinary hacks on top of OOP. What are they converted into? Into very specialized objects; in C#, for example, into objects whose supertype is Delegate. These objects override just one method, in which the closure is essentially implemented. And what if there were no delegates? In C#, this would cause some strain, because it has no non-static inner classes that could bind to the context of the enclosing object, as is possible in Java. But it would not be a catastrophe; one could simply pass a reference to the object being extended when explicitly creating the delegating object:

    using System;

    class SomeContext
    {
        public int number { private set; get; }

        public SomeContext(int initialNumber)
        {
            number = initialNumber;
        }

        // Each access creates a new Addition object bound to this context.
        public Addition addition
        {
            get { return new Addition(this); }
        }

        // The logic is moved out into a separate nested class.
        public class Addition
        {
            public Addition(SomeContext context)
            {
                _context = context;
            }

            public void add(int number)
            {
                // The private setter is accessible because Addition is nested in SomeContext.
                _context.number += number;
            }

            private SomeContext _context;
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
            SomeContext context = new SomeContext(9);
            SomeContext.Addition addToContext = context.addition;
            // external logic, delegated to the Addition object
            addToContext.add(6);
            Console.WriteLine(context.number);
        }
    }

Of course, using the delegate, it would be simpler:

    using System;

    class SomeContext
    {
        public int number { private set; get; }

        public SomeContext(int initialNumber)
        {
            number = initialNumber;
        }

        public Addition addition
        {
            get { return delegate(int what) { number += what; }; }
        }

        public delegate void Addition(int number);
    }

    class Program
    {
        static void Main(string[] args)
        {
            SomeContext context = new SomeContext(9);
            SomeContext.Addition addToContext = context.addition;
            // external delegate logic
            addToContext(6);
            Console.WriteLine(context.number);
        }
    }

In both cases, the object into which the separate logic is moved (an object of type Addition) is bound to the internal context of the object being extended, of type SomeContext. The logic moved out can be quite nontrivial, which is exactly why its processing is delegated to another object. However, the first of the two solutions has an advantage: the separate object can keep its own state, if it has any, whereas in the second case an object of type Delegate is created that keeps no state of its own between calls. It would be possible to write this even more concisely using lambda expressions, but that is beside the point here. So closures are not such a powerful thing after all, although in simple cases they can look very concise.
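The same comparison can be sketched in Python, where a closure is visibly an object too. This is an illustration rather than a translation of the C# fragments: make_adder and Adder are hypothetical names, and a dictionary stands in for SomeContext.

```python
def make_adder(context):
    # Closure: captures `context` from the enclosing call.
    def add(amount):
        context["number"] += amount

    return add


class Adder:
    """Explicit object bound to the same context, with state of its own."""

    def __init__(self, context):
        self.context = context
        self.calls = 0  # state kept between calls

    def __call__(self, amount):
        self.calls += 1
        self.context["number"] += amount


context = {"number": 9}

add_closure = make_adder(context)
add_closure(6)

add_object = Adder(context)
add_object(1)
add_object(2)

# The closure is itself an object; the captured variable lives in a cell.
captured = add_closure.__closure__[0].cell_contents
```

The explicit object carries its own calls counter alongside the shared context, which is the kind of extra state the class-based variant makes room for; the closure, for its part, turns out to be an ordinary function object holding a reference to the context.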

Hack upon hack

Hacks on top of a flat object model can be useful and convenient. But in some cases there are so many of them that they seem to be no longer manageable. C# is certainly one of the most hacked languages. It seems that its creators wanted to cram into the language everything they could, and moreover tried to cram in the same thing more than once. Because of this there are, for example, four different ways to define the same callback construct. That seems like overkill... And all these constructs use the syntax of the language and therefore overload it. There are also many other oddities in how intricately the folks at Microsoft decided to hack the C# object model:

Two kinds of objects: reference and value, with explicit and implicit conversions (boxing and unboxing) between them. Objects “by value” are extremely atypical, not only for a flat object model but even for a more or less familiar one. And this despite the fact that all objects, including those “by value”, inherit from Object, which is a reference type. Strange, isn't it?
Non-virtual methods. Generally speaking, non-virtual methods are polymorphism switched off, that is, once again a departure from the rules of the usual object model. But this is not the strangest thing. Classes with non-virtual methods can implement interfaces. That is, at the first level, in going from the interface to the implementing class with non-virtual methods, polymorphism works, and then it stops.
A polymorphic chain can be broken at any level of the inheritance hierarchy by declaring methods as new.
Structures, which by definition cannot contain virtual functions, can implement interfaces. Structures are very interesting in general. A structure is a value type, and an interface is an object (reference) type. Suppose the structure SomeStructure implements the interface SomeInterface, and there is a ready-made instance of it, instance, of type SomeStructure. We assign it to a variable: SomeInterface anObject = instance. Now the question: will anObject refer to the structure instance by value or by reference? By value, as it turns out, even though interfaces are of object (reference) type. That is, this assignment creates a copy of the structure instance, which then lives on by reference. If you don't know this in advance, you can get caught out.

Of course, none of these hacks is critical. At least C# contains a more or less normal object model as a subset of the hacked one, and you can stick to the normal one.

Conclusions

Fancy syntax does not always provide really great capabilities for implementing application logic. Most often, if not always, any “syntactic sweets” can be replaced with the ordinary use of objects, and even within a flat object model you can write complex things without ending up with poorly readable and poorly structured code. More than that, with a fairly simple object model at your disposal you can gracefully and simply cope with the scary beasts that silver bullets do not kill, and you may even have more freedom to implement non-trivial logic. This can be seen by comparing the two C# fragments above, one based on a new class and one based on a delegate. Where the solution is based on a class, state can be kept between calls in the object to which the logic has been moved.
Another conclusion is that the functionality of an object model can work better when it is moved into a metamodel rather than living at the level of the language syntax. The Smalltalk metamodel was cited as an example. In this case we get no less functionality, but without cluttering up the syntax. In practice, however, the opposite trend has been observed recently: programming languages are evolving toward more complex syntax (C#, CoffeeScript, Scala), while languages with pure object models are fading into oblivion. For example, the development of Self has stopped completely, and Smalltalk is barely alive.

Source: https://habr.com/ru/post/149882/
