Tuples in programming languages. Part 1

Many programming languages now have a construct called a tuple. In some languages tuples are built in, in one form or another; in others they are implemented, again in one form or another, by libraries: C++, C#, D, Python, Ruby, Go, Rust, Swift (as well as Erlang, F#, Groovy, Haskell, Lisp, OCaml and many others)...
What is a tuple? Wikipedia gives a fairly precise definition: a tuple is an ordered set of fixed length. The definition is accurate but of little use to us, and here is why: how many programmers have asked themselves why such an entity is needed at all? Programming has plenty of data structures, of both fixed and variable length, that store values of the same type or of different types: arrays, associative arrays, lists, structures... Why tuples on top of all that? In weakly typed languages the difference between tuples and lists or vectors is blurrier still: you cannot add elements to a tuple, but so what? This can be misleading. It is therefore worth digging deeper to figure out why tuples are really needed, how they differ from other language constructs, and what an ideal syntax and semantics of tuples would look like in an ideal (or nearly ideal) programming language.

In the first part we will look at tuples and tuple-like constructs in programming languages both common and less common. In the second part I will try to generalize, extend the picture, and propose the most universal syntax and semantics for tuples.

The first important point, one that Wikipedia does not mention: a tuple is a compile-time structure. In other words, it is an entity that groups several objects together at the compilation stage. This is very important. Tuples are used implicitly in every programming language, even in C and assembly. Let us look for them in C, C++, or any other compiled language:
The argument list of a function is a tuple.
The initializer list of a structure or an array is also a tuple.
The argument list of a template or a macro is also a tuple.
A structure declaration, and even an ordinary code block, is also a tuple, except that its elements are syntactic constructs rather than objects.

There are far more tuples in a program than it seems at first glance. But all of them are implicit, each firmly bolted to some particular syntactic construct. Older languages made no provision for using tuples explicitly. More modern languages have begun to offer some explicit uses, though by no means all of them. Here we will mostly consider tuples of values, either variables or constants. Tuples of arbitrary syntactic elements may be covered in later parts.
Let's start with the most obvious use: returning multiple values from a function. Ever since school I have been puzzled by this injustice: why can a function take as many values as it likes but return only one? Why is y = x * x a proper parabola, while y = sqrt(x) is just some truncated half of one? Is that not a violation of mathematical harmony? In programming you can, of course, return a structure object, but the essence stays the same: one object is returned, not several.

Go implements multiple return values directly: a function can explicitly return several values. The syntax lets you assign those values to several variables at once, perform group assignment, and even swap values in a single statement. No group operations other than assignment are provided, however.

    func foo() (r1 int, r2 int) {
        return 7, 4
    }

    x, y := foo()
    x, y = 1, 2
    x, y = y, x

An interesting feature worth noting is that the several return values of one function can be passed to another function as a “batch”.

    func bar(x int, y int) {
    }

    bar(foo())

This batch passing is extremely interesting in itself. On the one hand it looks quite elegant; on the other hand it is too implicit and not universal. For example, if you add a third parameter to bar and try to mix batch passing with an ordinary argument,

 bar(foo(), 100)

then it will not work: the result is a compile error.

Another interesting aspect is unused return values. Recall C and C++: there, as in the overwhelming majority of other languages (Java, C#, Objective-C, D...), you can safely ignore the value a function returns. Go allows this too, and you can ignore either a single return value or a whole group. However, an attempt to use the first return value while implicitly ignoring the second results in a compile error. You can ignore a value, but only explicitly, with the special identifier "_":

 x, _ := foo()

That is, an all-or-nothing principle applies: you can either ignore all the return values or use them, but then all of them.

Rust has similar features. Functions can likewise return multiple values (as a tuple), and those values can initialize new variables. At the time of writing there was no multiple assignment as such, only initialization; destructuring assignment was added later, in Rust 1.59. As in Go, the "_" symbol stands for values you do not need, and the returned values can be either ignored entirely or received in full. Tuples can also be compared:

    let x = (1, 2, 3);
    let y = (2, 3, 4);
    if x == y {
        println!("yes");
    } else {
        println!("no");
    }

Note this fact: we have just met the first operation on tuples other than assignment. Rust also offers another interesting possibility: creating named tuples (tuple structs, in Rust terms) and then using them as a whole.

Swift offers broadly similar capabilities. The interesting parts are access to tuple elements by constant index using dot notation, and the ability to name the elements of a tuple and access them by those names.

    let httpStatus = (statusCode: 200, description: "OK")
    print("The status code is \(httpStatus.0)")
    print("The status code is \(httpStatus.statusCode)")

Such tuples come close to structures, yet they are still not structures. Here I would like to step away from the examples and turn to my own thoughts. The difference between tuples and structures is that a tuple is not a data type but something lower-level; one could say that a tuple is simply a (possibly named) group of (possibly named) compile-time objects. At this point, recall C and C++. The simplest initializations of an array and of a structure look like this:

    int arr[] = {1, 2, 3};
    Point3D pt = {1, 2, 3};

Note that the two initializer lists are identical here, and yet they initialize completely different data objects. Such behavior is not typical of a data type. It is, however, close to another interesting feature that is occasionally (though rarely) found in programming languages: structural typing. The construct in braces is a typical tuple. Incidentally, C has named initialization of structure fields (the idea is very similar to Swift's), which did not make it into C++17; a restricted form of it was later added in C++20:

 Point3D pt = {.x=1, .y=2, .z=3};

C++ went in a slightly different direction and introduced "uniform initialization syntax and initializer lists". Syntactically these are the same tuples that can be used to initialize objects. In addition to the old capabilities, uniform initialization syntax lets you pass objects to functions and return them from functions as tuples.

    Point3D pt{10, 20, 30};  // uniform initialization
    Point3D foo(Point3D a)
    {
        return {1, 2, 3};    // return a "tuple"
    }
    foo({3, 2, 1});          // pass a "tuple"

Another interesting feature is initializer lists. They are used to initialize dynamic data structures such as vectors and lists. Initializer lists in C++ must be homogeneous, that is, all elements of the list must have the same type. Technically, such a list forms a constant array in memory, which is accessed through the iterators of std::initializer_list. One could say that the class template std::initializer_list is a special compiler-level interface to homogeneous tuples (in fact, to constant arrays). Of course, initializer lists can be used not only in constructors but also as arguments of any functions and methods. I think that if C++ had had from the start some template data type corresponding to an array literal and carrying the length of that array, it would have been quite suitable for the role of std::initializer_list.

The C++ standard library (and Boost) also provides tuples implemented with templates. Since this implementation is not part of the language, the syntax is somewhat cumbersome and not universal. The type of a tuple has to be declared explicitly, listing the types of all fields (or deduced with auto); objects are constructed with the function std::make_tuple; a tuple is created on the fly from existing variables with another template, std::tie; and elements are accessed with a special function template, std::get, which requires a constant index.

    std::tuple<int,char> t1(10,'x');
    auto t2 = std::make_tuple("test", 3.1, 14, 'y');
    int myint;
    char mychar;
    std::tie(myint, mychar) = t1;                            // unpack elements
    std::tie(std::ignore, std::ignore, myint, mychar) = t2;  // unpack (with ignore)
    std::get<2>(t2) = 100;
    char mychr = std::get<3>(t2);

The example uses unpacking with the special value std::ignore. It corresponds exactly to the underscore "_" used for the same purpose with group returns from functions in Go and Rust.

Tuples are implemented in a similar way in C#, though in simplified form compared to C++. They are created with the Tuple.Create() methods and a set of generic Tuple<> classes; elements are accessed through properties with fixed names Item1 ... Item7, plus Rest for any further elements (this is how the index is kept constant).

The D language has fairly rich support for tuples. The tuple function forms a tuple and, among other things, allows multiple return from a function. Tuple elements are accessed by indexing with constant indices. A tuple can also be constructed with the Tuple template, which lets you create a tuple with named fields.

    auto t = Tuple!(int, "number", string, "message")(123, "hello");
    writeln("by index 0 : ", t[0]);
    writeln("by .number : ", t.number);
    writeln("by index 1 : ", t[1]);
    writeln("by .message: ", t.message);

Tuples can be passed to functions. This is done with slicing (indexing with a range). Syntactically it looks as though one argument is being passed, whereas in fact the tuple is expanded into several arguments at once. Unlike Go, D does not require the number of tuple elements to match the number of function parameters exactly, so single arguments and tuples can be mixed in one call.

    void bar(int i, double d, char c) { }
    auto t = tuple(1, "2", 3.3, '4');
    bar(t[0], t[$-2..$]);

D has many more tuple-related features: compile-time foreach for iterating over tuples at compile time, the AliasSeq template, the .tupleof property... All of this deserves a large article of its own.

Finally, let us look at how tuples are implemented in a little-known extension of the C language, CForAll or C∀. (Amusingly, at the time of writing I could not find the language's site with Google; quite possibly it was shut down long ago and no references to it remained. This is exactly why I regularly search the web for new programming languages and download everything I can get my hands on.)

Tuples in C∀ are declared at the language level by enclosing a list of objects in square brackets. A tuple type is formed the same way, with a list of types in square brackets. Both tuple objects and tuple types can be declared explicitly. Tuples can be passed to functions, where they are expanded into argument lists (unlike Go, where this is possible only when the tuple matches the function's argument list exactly).

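    [ int, int ] w1;        // a tuple object of two elements
    [ int, int, int ] w2;   // a tuple object of three elements
    void f (int, int, int); // a function that takes three arguments
    f( 1, 2, 3 );           // an ordinary argument list
    f( [ 1, 2, 3 ] );       // a tuple literal expanded into three arguments
    f( w1, 3 );             // a tuple mixed with a single argument
    f( w2 );                // a tuple object expanded into three arguments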

Another interesting topic is nested tuples and the rules for expanding them. C and C++ also use nesting, when initializing arrays of structures whose elements are in turn arrays and structures. C∀ has rules called "tuple coercions", in particular flattening, which expands tuples with internal structure, and its opposite, structuring, in which a flat tuple adapts to a complex internal structure (this feature is rather controversial, and it will be discussed in the next part). All of this applies to assignment only; there is no mention of using these features with other operations.

 [ a, b, c, d ] = [ 1, [ 2, 3 ], 4 ];

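C∀ provides both group (mass) assignment, where one value is assigned to every element of a tuple, and multiple assignment, where values are assigned element by element: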

    [ x, y, z ] = 1.5;
    [ x, y, z ] = [ 1, 2, 3 ];

It even allows tuples to be used to access structure fields:

 obj.[ f3, f1, f2 ] = [ x , 11, 17 ];

Since no compiler was available, I could not try all these features in practice, but they are certainly excellent food for thought. The next part of the article will be devoted to those thoughts.

Source: https://habr.com/ru/post/276871/
