I continue publishing excerpts from our company's introductory course on industrial programming.
Part Three: Syntactic Sugar, or the History of the Development of Languages
This part covers the history of the development of programming languages and explains what OOP and functional programming are. The other parts can be found here.
Syntactic sugar is the general term for additions to a programming language's syntax that make the language more convenient to use but add no new capabilities to it.
The whole history of programming-language development is a history of the syntactic sugar getting sweeter.
Machine Languages
It all started with machine-dependent languages: languages built around the structure and peculiarities of specific computer platforms. Anyone who has programmed a calculator remembers how programs were written on them.
A dozen registers to hold the results of calculations (where were those gigabytes of RAM?), a couple of shift registers (remember the Turing machine? yes, the data register was used to fetch the next command!), and a command register into which the next operation had to be written (read a value, write a value, add the values of two memory registers, and so on).
Background: the von Neumann architecture
The architecture of these devices did not even always correspond to the von Neumann architecture that is standard for modern computers.
The von Neumann architecture implies separating memory from the processor and storing modifiable programs in that memory. Calculators are typically devices with a fixed set of built-in programs.
It was the transition to the von Neumann architecture that made it possible to load programs automatically from an external source: first from punched tapes and punched cards.
People programmed by punching a hole in the card at the position corresponding to a specific register, thereby setting its value bit by bit. Plenty of stories are told about how a program of hundreds of punch cards literally fell apart when a clumsy technician dropped the stack of cardboard on the floor.
Assembler
Programming in machine code was not very convenient, so at the first opportunity an assembler appeared: a language that mirrors machine operations, but with human-readable commands and the ability to describe an algorithm on paper not as a set of bits but as somewhat more meaningful text.
Assembly language is still tied to the machine's architecture (since its commands mirror the processor's instructions), but the first step into the abyss had been taken, and languages began to grow more and more sugar crystals.
Stack languages
The first sign was the use of data stacks. The stack appeared to solve the problem of temporarily storing arbitrary data. Of course, data can be stored in registers, but then you need to remember the name of every register you want to read the data back from.
The defining feature of a stack is the special order in which data is retrieved: at any moment only the top element of the stack is accessible, i.e. the element pushed last. Popping the top element makes the next one accessible, much like a rifle magazine: the first cartridge loaded into it can only be taken out last.
Today this may seem wildly inconvenient, but it is what made subroutines possible.
Before calling a subroutine, we fill a designated stack with data. The subroutine, knowing the order in which the parameters were pushed, can take them off the stack and use them during execution, and, when done, push the results of its work onto the same or a different stack. In addition, the main program can save its own data on the stack before handing control to the subroutine. After control returns, the program simply restores its values from the stack and need not care that the subroutine may have overwritten the processor's registers.
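The calling convention described above can be sketched in a few lines. This is a toy TypeScript illustration, not real machine code: the stack, the push order, and the addSubroutine name are all made up for the example.

```typescript
// A shared stack standing in for the hardware data stack.
const stack: number[] = [];

// The "subroutine": it pops its two parameters off the stack
// and pushes its result back onto the same stack.
function addSubroutine(): void {
  const b = stack.pop()!;
  const a = stack.pop()!;
  stack.push(a + b);
}

// The caller pushes arguments in the agreed-upon order...
stack.push(2);
stack.push(3);
addSubroutine();
// ...and pops the result after control returns.
const result = stack.pop()!; // 5
```

The caller and the subroutine never share variable names; the only contract between them is the order of values on the stack.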
Macro assembler
The next step was the macro assembler. A macro assembler is a language for a macro processor, which in turn was a translator from a higher-level language (the macro assembler) into machine code. It became possible to create your own commands, for example for using the stack.
Commands for working with the stack (push, pop) and commands for copying whole stacks of data are born.
The macro assembler spawns higher-level languages whose commands expand into dozens or even hundreds of processor instructions. Forth, ALGOL, and BASIC begin their journey...
Modular languages
Having tasted the forbidden fruit of extended syntax, programmers did not stop and craved modularity: it is so convenient to call a separately written module of the program without digging into its algorithm. The main thing is to know what input data it takes and how it returns the result.
Assembly language gains commands that make it easier to name and link modules and to pass and return control when calling various subroutines. Data-exchange interfaces evolve. The concept of a data structure arises.
Procedural languages
The logical addition to modular languages was the concept of a procedure, or subroutine. A subroutine has two important features:
1. it is named, i.e. we can call the subroutine by name
2. having called a subroutine, we know for sure that it will return control to the place it was called from
For example, in BASIC a subroutine was called as GOSUB Label.
Function
Only one thing was missing: we wanted the variables of the parent program (the one the subroutine was called from) not to be corrupted. But how did it work back then? All variables lived in the global space; as soon as the subroutine used them, it overwrote them.
Thus the concept of a function with local variables was invented: we call a named subroutine and pass some values to it. The subroutine treats the passed values as local named variables.
As functions developed, they gained the ability to return a result: before that, the return value was written into one of the global variables.
The function has the following features:
1. it is named
2. parameters are passed to it
3. the passed parameters are available as named parameters only inside the function; they are not visible outside it
4. the function can use its own local named variables, which are not visible outside the function
5. the function can return the result of its work
The introduction of function syntax harmoniously rounds out the procedural programming languages.
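The five features listed above can be shown in one toy TypeScript sketch (the names total and average are invented for the example):

```typescript
// The caller's "global" state that must survive the call intact.
let total = 100;

// A named function: parameters a and b and the local `total`
// exist only inside it and are invisible outside.
function average(a: number, b: number): number {
  const total = a + b; // a local variable; it shadows, and never touches, the outer total
  return total / 2;    // the function returns the result of its work
}

const avg = average(4, 8); // 6
// The outer `total` is still 100: the caller's data was not corrupted.
```

This is exactly the isolation the text describes: the subroutine gets its own copies and its own locals, and the parent program's variables come out unscathed.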
Functional languages
The natural next desire was to let a called function observe the local variables of the function that called it.
To solve this, some somber genius engenders the notion of an execution context: the region of named variables that a function can access at run time. This region is made extensible by inheritance: when a child function is called, it creates its own context, populated with the variables declared inside it. Outside the child function these variables are not visible, but they will be available when a grandchild function is called, and a great-grandchild, and so on.
The ability to inherit the execution context is called a closure.
Full-fledged work with the execution context gives rise to functional programming languages.
They are completed by adding the ability to pass a function as a parameter when calling another function, as well as to return a function as the result of a subroutine.
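Both ingredients named above, a closure over the parent's context and functions passed and returned as values, fit into a short TypeScript sketch (makeCounter and applyTwice are illustrative names):

```typescript
// A closure: the returned child function keeps access to `count`,
// a variable living in the parent function's execution context.
function makeCounter(): () => number {
  let count = 0;
  return () => ++count; // invisible outside, but inherited by the child
}

// A higher-order function: takes a function as a parameter.
function applyTwice(f: (x: number) => number, x: number): number {
  return f(f(x));
}

const next = makeCounter(); // a function returned as a result
next(); // 1
next(); // 2
const nine = applyTwice((x) => x * 3, 1); // 3 * 3 = 9
```

Note that count is not reachable by name anywhere outside makeCounter; only the returned function carries the inherited context with it.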
Data types
Meanwhile, programmers' thinking did not stand still either. Programmers invented data types.
Initially, as things were, data was available exclusively in binary form: zeros and ones.
To solve practical problems, it is more convenient for people to operate with higher-level abstractions. Thus integer data types appear, at first without any way to indicate whether a number is negative (byte, unsigned integer, unsigned long integer, and so on).
Then, as they developed, came data types able to record a negative number (encoded by the first bit, which led to amusing incidents of +0 and -0 being unequal). Finally, for more convenient work with floating-point numbers, the float and double types emerged (as you might guess, a double is the same float, but able to record more digits both before and after the decimal point).
The byte representation of the float type is interesting: in principle, to convey the number all we need is the same integer, with or without a sign, plus an indication of how many digits from the start of the number the point should be placed.
For logical operations the same zero and one would in principle have sufficed, but for maximum tidiness they were wrapped in a boolean type with the two values true and false (behind which, incidentally, the same one and zero still stood).
The next type programmers badly needed was the array. A data array differs fundamentally from a stack in allowing free access not just to the last pushed element but to any element by its number. Programmers pictured an array as a row of glued-together cells with data inside; because of this, an array was initially allocated at a fixed size, and that size could not be changed.
But the cells are not necessarily all filled, are they? So a designation for an empty cell was needed, and the null type appears. At first it was actually the character with code 0x0, which led to amusing incidents when a zero value had to be written into a cell and then read back: it would be interpreted as null rather than as an unsigned integer with the value 0.
To declare an array, a fragment of memory (a buffer) was reserved, specifying how many cells would be located in it and what kind of elements it would hold. And Ritchie forbid you write an element of type long into an array of type int! At best the subsequent elements were corrupted; at worst there was a buffer overflow, and unrelated data located right after the allocated buffer could be damaged.
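The fixed-size, fixed-type buffer described above can be mimicked in TypeScript with a typed array over a reserved buffer; what follows is a sketch of the idea, not of real C semantics (a JavaScript typed array wraps an oversized value instead of overflowing into neighboring memory):

```typescript
// Reserve a memory fragment for exactly 4 cells of 4 bytes each...
const buffer = new ArrayBuffer(4 * 4);
// ...and declare how its cells are to be interpreted: 32-bit integers.
const cells = new Int32Array(buffer);

cells[0] = 42;       // a value that fits in an int cell
cells[1] = 2 ** 31;  // too big for int32: it silently wraps to -2147483648,
                     // the tame analogue of "damaging" the stored data
// cells.length is always 4: there is no push and no resize.
```

The size and the element type are fixed at allocation time, exactly as in the early arrays the text describes.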
Strings, by the way, first appeared precisely as arrays of characters (another data type, char, had to be introduced, essentially corresponding to byte). Because of this, the length of a string had to be declared in advance.
To cope with strings of variable length, someone invented marking the end of the string with a null marker. That is, the string was still an array, but the array was allocated large enough up front to hold any string (640 KB of memory is enough for any program, right?). The string started at the beginning of the array, its end was marked with a null byte, and whatever followed the null was not considered part of the string.
An idea that looked good on paper turned out, on closer inspection, to be terrible: nothing prevented you from inserting a null in the middle of a string and reaping a pile of lulz from it. Thus began the era of C strings.
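The null-terminator scheme, including the lulz, can be simulated with a byte array in TypeScript (cString is an invented helper name; this is a model of C strings, not an actual C runtime):

```typescript
// Read a C-style string out of a fixed buffer: collect character codes
// until the first zero byte; everything after it is not part of the string.
function cString(buf: Uint8Array): string {
  let out = "";
  for (const code of buf) {
    if (code === 0) break; // the null marker ends the string
    out += String.fromCharCode(code);
  }
  return out;
}

const buf = new Uint8Array(16); // a "large enough" buffer, zero-filled
[..."Hi!"].forEach((ch, i) => (buf[i] = ch.charCodeAt(0)));

const before = cString(buf); // "Hi!"
buf[1] = 0;                  // a stray null lands in the middle...
const after = cString(buf);  // "H" -- the rest of the string is gone
```

One overwritten byte and the string silently loses its tail, which is exactly the failure mode the paragraph above complains about.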
References
Organizing data as a memory buffer created an interesting possibility: when calling a function, we can pass not the data itself but a reference to it.
How was it done before? The values of variables were passed to functions and copied into the functions' named variables, in order to avoid damaging the original data.
But we could just pass the address of the allocated memory fragment into the function and then read a variable of any suitable data type out of it! So another data type appeared: the reference.
A reference is a pointer to some variable for which a block of memory has been allocated. Programming languages acquire ways of working with variables both by value (directly with this block of memory) and by reference (we read the pointer from a variable, follow it, and change the value in the memory there).
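The two modes can be contrasted in TypeScript, where primitives behave like pass-by-value and objects act as references to shared data (byValue, byReference, and box are illustrative names):

```typescript
// By value: the callee receives a copy; the caller's variable is safe.
function byValue(n: number): void {
  n = 999; // changes only the local copy
}

// By reference: the callee follows the reference and mutates
// the very same data the caller holds.
function byReference(box: { value: number }): void {
  box.value = 999;
}

let n = 1;
const box = { value: 1 };

byValue(n);       // n is still 1
byReference(box); // box.value is now 999
```

The same call syntax, two very different outcomes: that is the gun-pointed-at-the-foot the next section talks about.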
Data structures
It would seem: what they fought for is what they ran into. We carefully isolated variables inside functions so as not to corrupt them, and now we hand ourselves a gun we can shoot our own foot with!
But no: passing variables by reference gave a unique opportunity to assemble entire structures out of simple data types: data structures!
For example, it became possible to organize a reference to an array of references to arrays of... An entire tree can be built this way!
Naturally, such an array carries no practical value by itself, since all of this can be organized with simple data types. But if you add to the program functions like addNode and removeNode that work with the tree, and pass these functions a reference to the data structure, you get a working and very seductive construction.
Structural Languages
So it turns out the programmer can create his own data types, convenient for his program: it is enough to create a data structure and describe the functions for working with it!
This is how structural programming languages appear. They add the ability to describe a new data type, to give this type a name, and to define some operations for it.
For example, a string can be represented not just as an array, but as a doubly linked list with concatenation via the + operator and access to an arbitrary character via the [] operator.
A rapid growth of structural languages (Pascal, C) immediately begins, with the following features:
1. they have a formal language for describing data structures (the *.h files in C)
2. they have the ability to give the described structure a name (BTree)
3. they have the ability to define operations for working with this data structure
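All three features can be shown at once with the string-as-list example from above, sketched in TypeScript. TypeScript has no operator overloading, so the + and [] operations become plain named functions here; CharList, concat, and charAt are invented names:

```typescript
// 1. A formal description of the data structure...
interface CharList {
  chars: string[];
}

// 3. ...and operations defined for it.
function concat(a: CharList, b: CharList): CharList {
  return { chars: [...a.chars, ...b.chars] }; // stands in for the + operator
}

function charAt(s: CharList, i: number): string {
  return s.chars[i]; // stands in for the [] operator
}

// 2. The structure has a name, so it can be used like a built-in type.
const hello: CharList = { chars: [..."Hello"] };
const world: CharList = { chars: [..."World"] };
const greeting = concat(hello, world);
charAt(greeting, 5); // "W"
```

From the caller's point of view CharList behaves like a new data type, even though underneath it is just an array and two functions.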
An object
The ability to create your own data types stirs in programmers the desire to have, inside the data type itself, functions for working with it.
In response to the aspirations of these bright minds, the concept of the Object is born. An object is no longer just a data type, not just a reference through which structured information is stored, but also the functions for processing this information, accessible via the same reference.
All of this is brought under a universal philosophy:
“An object is an entity in virtual space, possessing a certain state and behavior, having specified values of properties (attributes) and operations on them (methods).”
Encapsulation
Deep philosophical studies lead to the realization that an object has a property called encapsulation, defined as the property of an object to combine data and the methods of working with this data. Philosophers do love recursive definitions.
The essence of encapsulation is simple: an object is not an object if its state (that is, the data it contains) can be changed without applying the object's methods. At the same time, public variables of an object, open to anyone and anything for modification, are also considered methods of changing the internal state of that object.
So $object->property = 12345; is considered equivalent to the method call $object->setProperty(12345);, because you cannot reach the $property variable directly without naming $object in the operation.
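The same setProperty example translates directly into a TypeScript class, where the compiler enforces what PHP leaves to convention (Account is an invented class name):

```typescript
// Encapsulation: the state is private; the methods are the only
// doors into it.
class Account {
  private property = 0; // not reachable as object.property from outside

  setProperty(value: number): void {
    this.property = value;
  }

  getProperty(): number {
    return this.property;
  }
}

const object = new Account();
object.setProperty(12345);
object.getProperty(); // 12345
// object.property = 0;  // would not compile: 'property' is private
```

Data and the methods for working with that data travel together behind one reference, which is the whole recursive definition in one picture.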
Inheritance
Even before the philosophers got to the Object, programmers working with data structures badly wanted, and figured out how, to extend data structures: to inherit the layout of a parent structure in a child structure.
Creating an Object that combines data with functions posed an interesting engineering problem: how to arrange things so as to inherit the structure, inherit the functions, and still add new features in the heir.
The thing is, the Parent Object has a function, the Heir Object has a function, they do different things, but their names are the same. Oops. The solution to this problem was named polymorphism.
Polymorphism
The philosophers were quick off the mark here too, supplying the definition: "Polymorphism is the ability of objects with the same specification to have different implementations." The specification here means the names and signatures of the methods for working with an object (including its public variables).
There are many realizations of polymorphism; here are some of them:
- pure polymorphism (the same code accepts arguments of different types)
- parametric polymorphism (the type itself is a parameter, as in generics)
- overriding (a descendant method replaces an ancestor method; abstract classes rely on this)
- overloading (several methods share a name but differ in signature)
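The Parent/Heir name clash from the previous section resolves through overriding, sketched here in TypeScript (the class names follow the text; describe is an invented method name):

```typescript
class Parent {
  describe(): string {
    return "parent";
  }
}

class Heir extends Parent {
  // Same name, same signature, different implementation: overriding.
  describe(): string {
    return "heir";
  }
}

// One specification, different implementations: callers need not know
// which concrete object they hold.
const objects: Parent[] = [new Parent(), new Heir()];
const names = objects.map((o) => o.describe()); // ["parent", "heir"]
```

Each object answers the same call in its own way, which is precisely what the philosophers' definition promises.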
Abstraction
Philosophical thought did not stand still either. Having studied the properties of inheritance, the philosophers realized it can be replaced by abstraction.
Abstraction is... how to explain it? Here you have an Object: great, something concrete. And there is also an idea of what such an object could be: which methods it should exhibit, what these methods should do, but without any specifics, abstractly (reminds you of customers, doesn't it?). We have just described the interface of an object, or an abstract ancestor whose covenants can be cast into the reality of code.
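The "abstract ancestor" idea can be sketched as a TypeScript interface and one concrete implementation (Shape and Square are invented names):

```typescript
// The idea of an object: which methods it must exhibit, no specifics.
interface Shape {
  area(): number; // a covenant with no implementation
}

// Casting the covenant into the reality of code.
class Square implements Shape {
  constructor(private side: number) {}

  area(): number {
    return this.side * this.side;
  }
}

const s: Shape = new Square(3);
const squareArea = s.area(); // 9
```

The caller works only with the Shape idea; any object honoring the covenant can stand behind the reference.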
OOP
OOP stands for Object-Oriented Programming. It is the ability to work with objects within all three concepts: encapsulation, inheritance, and polymorphism. Or, if you prefer, encapsulation, abstraction, and polymorphism.
The OOP paradigm has nothing to do with the MVC model (contrary to the opinion of some PHP programmers). OOP is simply working with data and its processing methods as inheritable objects.
This is unlike the procedural and structural programming paradigms, where objects, if they exist at all, are not inheritable; or there are no objects, and all data is passed around in arrays, structures, and allocated memory buffers.
Class-oriented programming
Object programming requires creating many objects (oddly enough). Accordingly, the hierarchy of objects must somehow be organized, and objects must somehow be stamped out in quantity.
In response to these aspirations, the concept of a class instance was developed. What is a class? A class is a set of methods and functions without data. By itself a class is something non-working; data is needed for it to work. To get a working object you need to instantiate a class: to say "create me an object with the functions described in this class and the data I am about to give you."
In essence, a class is a piece of syntactic sugar that makes it possible not only to describe an object's API (as an interface does) but also to specify the data-processing functions.
The class system allows you to formally describe the properties of an object, the rules for inheriting object properties, and the rules for accessing object data. The use of classes defines the class-oriented programming paradigm.
A class is a cool thing, but not a necessary one for OOP: there are object-oriented languages that manage perfectly well without classes.
Prototype programming
Another way to specify inheritance is the prototype. In prototype programming there are no instances of objects; each object exists in a single copy. But for every object you can specify a prototype, or several prototypes: a list of objects whose properties and methods it will inherit.
Historically, the prototype-based model of inheritance, espoused by a language like JavaScript, is older than the class-based one. But class-oriented programming turned out to be more convenient for describing APIs and frameworks (and, as everyone knows, every mature Java programmer is required to write his own framework, just as the maturity of a PHP programmer is measured by a self-written CMS), which is why it became more widespread.
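Prototype inheritance can be shown with JavaScript's own mechanism, here written as TypeScript (the animal/dog objects are invented for the example):

```typescript
// No class, no instances: `animal` is just an object in a single copy.
const animal = {
  legs: 4,
  walk(): string {
    return `walking on ${this.legs} legs`;
  },
};

// `dog` also exists in a single copy; it simply names `animal`
// as its prototype and inherits properties and methods through it.
const dog = Object.create(animal);
dog.barks = true; // a property of dog's own, invisible to animal

dog.walk(); // "walking on 4 legs" -- found via the prototype chain
dog.legs;   // 4 -- read from the prototype, not stored on dog
```

Nothing was instantiated and nothing was copied: lookup simply walks the chain of prototypes until it finds the requested name.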