Statics in a programming language: what else besides types?

Programming languages are commonly divided into static and dynamic ones. In a static language the types of all values are known at compile time, so the compiler can check that each value is used correctly and that a given operation applies to it. Learning about an error at compile time is more pleasant than learning about it at run time: fewer errors slip past testing and reach the user. This is what static languages are valued for.

But why stop at data types? Let's fantasize a little about what else the compiler could do.

Suppose we have a constant for the number of seconds in a day:

const int secondsPerDay = 24*60*60;

We did not multiply the numbers on a calculator, so that a reader of the code can see how the value was obtained (or maybe there was simply no calculator at hand). But the program is under no obligation to redo the multiplication at every initialization! Let the compiler multiply the numbers itself and substitute the finished value. I suspect a good compiler does exactly that.
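
In today's C++ this expectation can be written down rather than merely suspected. A minimal sketch, assuming a C++11 or later compiler; the helper secondsInDays is hypothetical and is here only to show the constant being used:

```cpp
#include <cassert>

// constexpr requires the initializer to be a constant expression,
// so the multiplication is done by the compiler.
constexpr int secondsPerDay = 24 * 60 * 60;

// Checked during compilation; a wrong value would stop the build.
static_assert(secondsPerDay == 86400, "24*60*60 should be 86400");

// Hypothetical helper (not from the article) that uses the constant.
constexpr int secondsInDays(int days) { return days * secondsPerDay; }

static_assert(secondsInDays(2) == 172800, "two days of seconds");
```

The source still shows how the number was obtained, while the build guarantees that nothing is multiplied at run time.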

Now we move into science fiction territory (both in the subject matter and with respect to C++).
We decided to release a localized version of our program for Mars residents:

enum Planet { Earth, Mars };
const Planet locale = Mars;

int secondsPerDay() {
    if (locale == Earth) {
        return 24*60*60;
    }
    if (locale == Mars) {
        return 24*60*60 + 37*60 + 23;
    }
    assert(false, "unknown locale");
}

The secondsPerDay() function is completely constant: in the compiled code it always returns the same value. Our hypothetical compiler therefore has the right to evaluate the function and substitute the result at the call site. And if we later add Venus but forget to update secondsPerDay(), then compiling with locale = Venus will trigger the assert (at compile time, not at run time as happens in real life).
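
Part of this fantasy can be approximated with constexpr functions. The sketch below assumes C++14 or later. It passes the locale as a parameter, renames the constant to targetLocale, and already includes Venus in the enum; all three are my own adjustments, not the article's code.

```cpp
#include <cassert>
#include <stdexcept>

enum class Planet { Earth, Mars, Venus };

// When called in a constant expression, the compiler evaluates the body.
// Reaching the throw during that evaluation is a compilation error.
constexpr int secondsPerDay(Planet planet) {
    if (planet == Planet::Earth) return 24 * 60 * 60;
    if (planet == Planet::Mars) return 24 * 60 * 60 + 37 * 60 + 23;
    throw std::logic_error("unknown locale");
}

// Assumed name; plays the role of the article's "locale" constant.
constexpr Planet targetLocale = Planet::Mars;

// Forced compile-time evaluation: the value is substituted here.
constexpr int daySeconds = secondsPerDay(targetLocale);
static_assert(daySeconds == 88643, "Mars day length in seconds");

// Uncommenting the next line should make the build fail, because
// the Venus branch was never written:
// constexpr int venusDay = secondsPerDay(Planet::Venus);
```

Unlike the hypothetical compiler, C++ evaluates the function at compile time only where a constant expression is required; an ordinary run-time call with Planet::Venus would throw instead.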

Recall the pseudo-function sizeof(), known since the days of C. The construct looks like an ordinary function call, but in fact the compiler itself computes the number of bytes its argument occupies in memory, and no function call happens at run time. C++ templates can also be counted as compile-time computation. But more flexible facilities are, for some reason, not supported.
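
For reference, here is roughly what these two existing mechanisms look like. This is a small sketch assuming C++11 or later; the Point type and the Factorial template are illustrative names of my own.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical record type used only for this illustration.
struct Point { int x; int y; };

// sizeof is resolved by the compiler; nothing is called at run time.
constexpr std::size_t pointSize = sizeof(Point);
static_assert(pointSize >= 2 * sizeof(int), "Point holds two ints");

// Classic template metaprogram: the recursion is carried out
// entirely during template instantiation.
template <unsigned N>
struct Factorial {
    static constexpr unsigned value = N * Factorial<N - 1>::value;
};

template <>
struct Factorial<0> {
    static constexpr unsigned value = 1;
};

static_assert(Factorial<5>::value == 120, "5! computed by the compiler");
```

Both work, but neither lets the compiler walk over the fields of a structure, which is what the following examples ask for.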

Consider a more interesting example. Let's write a function that generates an SQL query from the name of a structure type and its fields:

T fetchById<T>(Connection conn, int id) {
    // compile-time checks on the type T
    assert(is_struct(T), name_of(T) + " is not struct");
    assert(has_field(T, id), name_of(T) + " doesn't contain field id");

    string query = "SELECT ";
    // this loop over the fields of T is executed by the compiler
    foreach (field f in T) {
        if (f.index != 0) {
            query += ", ";
        }
        query += f.name;
    }
    query += " FROM " + name_of(T) + " WHERE ID=?ID";

    // executing the query is omitted here; it yields a Record r
    ......

    T result = new T;
    // this loop is unrolled by the compiler as well
    foreach (field f in T) {
        result[f] = r.getFieldValue<f.type>(f.name);
    }
    return result;
}

Now our function is no longer constant as a whole, but some of its fragments are, and the hypothetical compiler executes entire loops, replacing them with constant values or with sequences of statements.
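
To make the gap concrete, here is a sketch of how the query-building half can be imitated in C++17. Standard C++ up to C++23 has no way to enumerate the fields of a structure, so the metadata is written by hand. The Student type, the Meta trait and selectByIdQuery are all hypothetical names introduced for this illustration.

```cpp
#include <array>
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical example record; the article leaves T abstract.
struct Student { int id; std::string name; char mark; };

// Hand-written stand-in for the article's name_of(T) and
// "foreach (field f in T)".
template <typename T> struct Meta;

template <> struct Meta<Student> {
    static constexpr std::string_view name = "Student";
    static constexpr std::array<std::string_view, 3> fields = {"id", "name", "mark"};
};

// Builds "SELECT f1, f2, ... FROM T WHERE ID=?ID" from the metadata.
template <typename T>
std::string selectByIdQuery() {
    std::string query = "SELECT ";
    bool first = true;
    for (std::string_view f : Meta<T>::fields) {
        if (!first) query += ", ";
        query += f;
        first = false;
    }
    query += " FROM ";
    query += Meta<T>::name;
    query += " WHERE ID=?ID";
    return query;
}
```

Here the loop still runs at run time and the field list can silently drift away from the structure definition; the hypothetical compiler from the text would remove both drawbacks.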

And don't tell me RTTI has already been invented! RTTI is run-time type information; it works while the program is running. Here the type information is used at compile time, so that the compiler itself can execute the constant instructions.

So, having received a structure as the result of the function, we enjoy static checking when we refer to its fields by name, instead of accessing the record's fields through their string representation. However, if the structure in our program does not match the actual fields in the database, we still run into a run-time error. But at least the code it depends on is localized in the definition of the structure. As a safety net, we can write a similarly generic function that checks the database against the declared types and, if necessary, restructures the database.

Curiously, the problem above has been solved in at least two static languages (in dynamic languages it is solved without difficulty).

LINQ, the query language built into C# and VB.NET, lets you work with databases statically. Many of its features were implemented quite legitimately: lambda expressions, anonymous types, operator forms of methods. But to tie statically declared classes to tables and fields in the database, Microsoft resorted to eighth-level magic called “integration of SQL schema information into CLR metadata” [1].

Another example is the HaskellDB library [2,3]. Haskell itself, with its monads, can be regarded as first-level magic. But the developers needed third-level magic in the form of a nonstandard language extension, Trex extensible records. And even so, to link with the database every field of a structure needs a verbose declaration. An example declaring a table with two fields:

students :: Table(name :: String, mark :: Char)
name :: r\name =>
Attr (name :: String | r) String
mark :: r\mark =>
Attr (mark :: Char | r) Char

To show the benefits of these innovations more clearly, here are a few more examples.
Suppose we have two structures:

struct Order {
    Date date;
    Client client;
    Product product;
    int qty;
    Numeric cost;
};

struct Sales {
    Product product;
    int qty;
    Numeric cost;
};

Somewhere we need to copy the matching data from an Order to a Sales:

void foo(Order o, Sales s) {
    foreach (field f in Sales) {
        s[f.name] = o[f.name];
    }
}

Now we can add new fields to our structures without changing the foo() function! And if foo() can no longer be evaluated because the fields do not match, we get a compilation error.
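
For comparison, the closest thing I can sketch in C++17 uses pointers to members. The field pairs have to be listed by hand, which is exactly the part the hypothetical compiler would do for us, but a type mismatch between paired fields is still caught at compile time. The names copyFields and toSales are hypothetical, and the field types are simplified to standard ones.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Simplified versions of the article's structures.
struct Order { std::string client; std::string product; int qty; double cost; };
struct Sales { std::string product; int qty; double cost; };

// Copies each (source member, destination member) pair. The fold
// expression expands into one assignment per pair during compilation.
template <typename Src, typename Dst, typename... Pairs>
void copyFields(const Src& src, Dst& dst, Pairs... fields) {
    ((dst.*(fields.second) = src.*(fields.first)), ...);
}

// Hand-maintained list of matching fields.
void toSales(const Order& o, Sales& s) {
    copyFields(o, s,
               std::make_pair(&Order::product, &Sales::product),
               std::make_pair(&Order::qty, &Sales::qty),
               std::make_pair(&Order::cost, &Sales::cost));
}
```

Adding a field to Sales still means touching toSales, so this only illustrates the compile-time checking, not the automatic enumeration of fields.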

An example from the field of object-oriented design:

class Shape {
    ..............................
};

class Square: public Shape {
public:
    Square();
};

class Circle: public Shape {
public:
    Circle();
};

....................................

// Creates a shape by the name of its type; the loop over
// the descendants of Shape is executed by the compiler
Shape* shapeFactory(string shapeName) {
    foreach (type T in descendants(Shape)) {
        if (name_of(T) == shapeName) {
            return new T;
        }
    }
    return null;
}

During compilation, shapeFactory() turns into an ordinary function:

Shape* shapeFactory(string shapeName) {
    if ("Square" == shapeName) {
        return new Square;
    }
    if ("Circle" == shapeName) {
        return new Circle;
    }
    return null;
}
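
In C++ as it exists today, the usual substitute is a hand-maintained table of names and creator functions. The sketch below assumes C++17; the Entry struct, the registry array, the create helper and the name() method are my own illustrative additions, and the table replaces descendants(Shape), which has no standard equivalent.

```cpp
#include <cassert>
#include <memory>
#include <string_view>

struct Shape {
    virtual ~Shape() = default;
    virtual std::string_view name() const = 0;
};
struct Square : Shape {
    std::string_view name() const override { return "Square"; }
};
struct Circle : Shape {
    std::string_view name() const override { return "Circle"; }
};

// Generic creator; one instantiation per registered type.
template <typename T>
std::unique_ptr<Shape> create() { return std::make_unique<T>(); }

struct Entry {
    std::string_view name;
    std::unique_ptr<Shape> (*make)();
};

// Hand-maintained stand-in for descendants(Shape).
constexpr Entry registry[] = {
    {"Square", &create<Square>},
    {"Circle", &create<Circle>},
};

// Returns a new shape for a known name, or nullptr otherwise.
std::unique_ptr<Shape> shapeFactory(std::string_view shapeName) {
    for (const Entry& e : registry) {
        if (e.name == shapeName) return e.make();
    }
    return nullptr;
}
```

The table is built at compile time, but a newly added subclass has to be registered by hand, and forgetting to do so is not reported by the compiler.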

References:
1. LINQ: .NET Language-Integrated Query.
2. HaskellDB, official project page.
3. Daan Leijen, Erik Meijer. Domain Specific Embedded Compilers. A paper on how HaskellDB works; it also discusses the problem of coupling SQL with a programming language.

Source: https://habr.com/ru/post/123261/
