Rust programming language overview

Rust is a new experimental programming language developed by Mozilla. It is a compiled, multi-paradigm language positioned as an alternative to C/C++, which is interesting in itself, since there are not many contenders in that space: Walter Bright's D and Google's Go come to mind.
Rust supports functional, parallel, procedural, and object-oriented programming, i.e. almost the entire spectrum of paradigms actually used in application programming.

I do not aim to translate the documentation (it is very sparse and changes constantly, since the language has not yet had an official release). Instead, I want to highlight the most interesting features of the language. The information is collected both from the official documentation and from the very few mentions of the language on the Internet.

First impression

The syntax of the language follows the traditional C-like style (which is welcome, since it is already the de facto standard). Naturally, the well-known design mistakes of C/C++ have been taken into account.
The traditional Hello World looks like this:

    use std;
    fn main(args: [str]) {
        std::io::println("hello world from " + args[0] + "!");
    }

A slightly more complicated example is a function that computes the factorial:

    fn fac(n: int) -> int {
        let result = 1, i = 1;
        while i <= n {
            result *= i;
            i += 1;
        }
        ret result;
    }

As the example shows, functions are declared in a “functional” style (which has some advantages over the traditional “int fac(int n)”). We also see automatic type inference (the let keyword) and the absence of parentheses around the while condition (as in Go). The compactness of the keywords is also immediately apparent: the creators of Rust deliberately made all keywords as short as possible and, to be honest, I like it.
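
For comparison, here is a minimal sketch of the same function in present-day stable Rust, which differs from the pre-release syntax shown in this article. The choice of u64 is my assumption (it holds factorials up to n = 20); everything else mirrors the original loop.

```rust
// Factorial in present-day Rust; assumes n <= 20 so the result fits in u64.
fn fac(n: u64) -> u64 {
    let mut result = 1; // type inferred as u64 from the return type
    let mut i = 1;      // type inferred as u64 from the comparison with n
    while i <= n {
        result *= i;
        i += 1;
    }
    result // the final expression is the return value
}
```

The overall shape is unchanged: types after names, an arrow before the return type, and no parentheses around the loop condition.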

Small but interesting syntactic features

You can insert underscores in numeric constants. This is convenient, and many new languages now have the feature.
0xffff_ffff_ffff_ffff_ffff_ffff
Binary constants. Of course, a real programmer should convert binary to hex in their head, but this is more convenient: 0b1111_1111_1001_0000
The body of any statement (even one consisting of a single expression) must be enclosed in braces. For example, in C you could write if(x>0) foo(); whereas in Rust foo() must be wrapped in braces.
The conditions of if, while, and similar statements, on the other hand, do not need to be enclosed in parentheses.
In many cases, code blocks can be treated as expressions. In particular, the following is possible:
```
 let x = if the_stars_align() { 4 } else if something_else() { 3 } else { 0 }; 
```
The syntax of a function declaration is: the fn keyword, then the function name, then the argument list in parentheses, with the type of each argument given after its name; then, if the function returns a value, the arrow "->" and the type of the return value.
Variables are declared in a similar way: the let keyword, the name of the variable, optionally a colon and the type, and then the initial value:
let count: int = 5;
By default, all variables are immutable; the mutable keyword is used to declare mutable variables.
The names of the basic types are the most compact I have seen: i8, i16, i32, i64, u8, u16, u32, u64, f32, f64.
As mentioned above, automatic type inference is supported.
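
Several of the features listed above survive in present-day Rust. The following is a small sketch under that assumption; the function name literals_demo and its parameter are made up for illustration, and the mutable keyword from the article is spelled mut today.

```rust
// Returns the computed values so the behaviour can be checked.
fn literals_demo(stars_align: bool) -> (u32, u16, i32) {
    let mask: u32 = 0xffff_ffff;           // underscores are ignored by the compiler
    let bits: u16 = 0b1111_1111_1001_0000; // binary literal, equal to 0xff90
    // `if` is an expression, so its value can be bound directly.
    let x = if stars_align { 4 } else { 0 };
    let mut count: i32 = 5; // bindings are immutable unless marked `mut`
    count += x;
    (mask, bits, count)
}
```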

The language has built-in debugging tools:
The fail keyword terminates the current process.
The log keyword writes any language expression to the log (for example, to stderr).
The assert keyword checks an expression and, if it is false, terminates the current process.
The note keyword lets you output additional information if the process crashes.

Data types

Rust, like Go, supports structural typing (although, according to the authors, the languages developed independently, so this reflects the influence of their common predecessors: Alef, Limbo, and so on). What is structural typing? Suppose you have a structure (or, in Rust terminology, a “record”) declared in some file:
type point = {x: float, y: float};
You can declare a number of variables and functions with arguments of type “point”. Then, somewhere else, you can declare another structure, for example
type MySuperPoint = {x: float, y: float};
and variables of this type will be fully compatible with variables of type point.

In contrast, the nominative typing adopted in C, C++, C# and Java does not allow this. With nominative typing, each structure is a unique type, incompatible with other types by default.
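
The idea can still be sketched in present-day Rust, with one caveat: named structs there are nominal, so this sketch uses tuples and type aliases, which do behave structurally. The names Point, MySuperPoint, norm_squared and structural_demo are illustrative only.

```rust
// Two independently declared aliases for the same tuple shape.
type Point = (f64, f64);
type MySuperPoint = (f64, f64);

// Accepts a `Point`...
fn norm_squared(p: Point) -> f64 {
    p.0 * p.0 + p.1 * p.1
}

// ...and a `MySuperPoint` can be passed without any conversion,
// because both names denote the same structural type.
fn structural_demo() -> f64 {
    let q: MySuperPoint = (3.0, 4.0);
    norm_squared(q)
}
```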

Structures in Rust are called “records”. There are also tuples, which are the same as records but with unnamed fields. The elements of a tuple, unlike the fields of a record, cannot be mutable.

There are vectors, which in some ways resemble ordinary arrays and in others the std::vector type from the STL. A vector literal uses square brackets rather than curly braces as in C/C++:

 let myvec = [1, 2, 3, 4];

A vector is, however, a dynamic data structure; in particular, vectors support concatenation.

    let v: mutable [int] = [1, 2, 3];
    v += [4, 5, 6];

There are templates. Their syntax is quite logical, without the clutter of “template” from C++. Both function templates and data type templates are supported.

    fn for_rev<T>(v: [T], act: block(T)) {
        let i = std::vec::len(v);
        while i > 0u {
            i -= 1u;
            act(v[i]);
        }
    }
    type circular_buf<T> = {start: uint, end: uint, buf: [mutable T]};
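
As a rough present-day counterpart of for_rev, here is a sketch that assumes today's generics and closure traits. The block(T) argument becomes impl FnMut(&T), and the helper reversed_copy is invented only to exercise the function.

```rust
// Calls `act` on each element of `v`, from the last to the first.
fn for_rev<T>(v: &[T], mut act: impl FnMut(&T)) {
    let mut i = v.len();
    while i > 0 {
        i -= 1;
        act(&v[i]);
    }
}

// Hypothetical helper: collects the elements in reverse order.
fn reversed_copy(v: &[i32]) -> Vec<i32> {
    let mut out = Vec::new();
    for_rev(v, |x| out.push(*x));
    out
}
```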

The language supports so-called tags. A tag is nothing more than a C union with an additional field holding the code of the variant in use (that is, something between a union and an enumeration). From a theoretical point of view, it is an algebraic data type.

    tag shape {
        circle(point, float);
        rectangle(point, point);
    }

In the simplest case, a tag is identical to an enumeration:

    tag animal { dog; cat; }
    let a: animal = dog;
    a = cat;

In more complex cases, each element of the “enumeration” is an independent structure with its own “constructor”.
Another interesting example is a recursive structure that defines an object of the “list” type:

    tag list<T> { nil; cons(T, @list<T>); }
    let a: list<int> = cons(10, @cons(12, @nil));

Tags can participate in pattern matching expressions, which can be quite complex.

    alt x {
        cons(a, @cons(b, _)) { process_pair(a,b); }
        cons(10, _) { process_ten(); }
        _ { fail; }
    }
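
In present-day Rust, tags are called enums and alt has become match. Below is a sketch of the list example under that assumption; Box stands in for the @ pointer, and the functions sum and sample are illustrative names, not part of any library.

```rust
// A singly linked list as an algebraic data type.
enum List {
    Nil,
    Cons(i32, Box<List>),
}

// Sums the elements by matching on the variant.
fn sum(list: &List) -> i32 {
    match list {
        List::Nil => 0,
        List::Cons(head, tail) => *head + sum(tail),
    }
}

// Builds the list 10 -> 12 -> nil from the article's example.
fn sample() -> List {
    List::Cons(10, Box::new(List::Cons(12, Box::new(List::Nil))))
}
```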

Pattern matching

To begin with, pattern matching can be thought of as an improved switch. The alt keyword is followed by the expression being analyzed, and the body of the statement lists patterns together with the actions to take when a pattern matches.

    alt my_number {
        0 { std::io::println("zero"); }
        1 | 2 { std::io::println("one or two"); }
        3 to 10 { std::io::println("three to ten"); }
        _ { std::io::println("something else"); }
    }

Not only constants (as in C) can be used as patterns, but also more complex expressions: variables, tuples, ranges, types, and placeholders ('_'). Additional conditions can be attached with a when clause immediately following the pattern. There is also a special variant of the statement for matching on types. This is possible because the language has a universal variant type, any, whose objects can hold values of any type.
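
A sketch of the same kinds of patterns in present-day Rust, where the range is written 3..=10 and the extra condition (a guard) uses if rather than when. The function name describe and the "negative" arm are additions for illustration.

```rust
// Classifies a number using literal, alternative, range and guarded patterns.
fn describe(n: i32) -> &'static str {
    match n {
        0 => "zero",
        1 | 2 => "one or two",
        3..=10 => "three to ten",
        // A guard adds an extra condition to a pattern.
        x if x < 0 => "negative",
        _ => "something else",
    }
}
```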

Pointers

In addition to ordinary C-style pointers, Rust supports special “smart” pointers: shared boxes, which have built-in reference counting, and unique boxes, which have a single owner. They are somewhat similar to shared_ptr and unique_ptr from C++. They have their own syntax: @ for shared and ~ for unique. Unique pointers are not copied; instead, there is a special move operation:

    let x = ~10;
    let y <- x;

After such a move, the pointer x is deinitialized.
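
The closest present-day equivalents, as far as I can tell, are Box for the unique pointer and std::rc::Rc for the shared one. The sketch below assumes that mapping; the function names are made up for the example.

```rust
use std::rc::Rc;

// Moving a unique pointer: ownership transfers from `x` to `y`.
fn move_box() -> i32 {
    let x = Box::new(10);
    let y = x; // `x` is moved here; using `x` afterwards is a compile error
    *y
}

// A shared, reference-counted pointer: cloning increments the count.
fn shared_count() -> usize {
    let a = Rc::new(10);
    let _b = Rc::clone(&a);
    Rc::strong_count(&a)
}
```

A plain assignment now performs the move, so the separate <- operator is no longer needed.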

Closures, partial applications, iterators

This is where functional programming begins. Rust fully supports higher-order functions, that is, functions that can take other functions as arguments and return them.

1. The lambda keyword is used to declare a nested function or functional data type.

    fn make_plus_function(x: int) -> lambda(int) -> int {
        lambda(y: int) -> int { x + y }
    }
    let plus_two = make_plus_function(2);
    assert plus_two(3) == 5;

In this example, the function make_plus_function takes one argument “x” of type int and returns a function of type “int -> int” (here lambda is a keyword). The returned function is defined in the function body. The absence of a “return” statement is a bit confusing at first, but it is common in functional programming.
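
In present-day Rust the lambda keyword is gone and the same idea is written with a closure. This is a sketch assuming the impl Fn return type available in stable Rust; the function name is kept from the article.

```rust
// Returns a closure that captures `x` by value (`move`) and adds it to its argument.
fn make_plus_function(x: i32) -> impl Fn(i32) -> i32 {
    move |y| x + y
}
```

It is used the same way as before: make_plus_function(2) yields a function that can be called with one argument.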

2. The block keyword is used to declare a functional type for a function argument, in place of which something resembling an ordinary block of code can be passed.

    fn map_int(f: block(int) -> int, vec: [int]) -> [int] {
        let result = [];
        for i in vec { result += [f(i)]; }
        ret result;
    }
    map_int({|x| x + 1 }, [1, 2, 3]);

Here we have a function whose inputs are a block (essentially a lambda of type “int -> int”) and a vector of int (vector syntax is covered in the section on data types). The “block” itself is written in the calling code with the somewhat unusual syntax {|x| x + 1}. Personally, I prefer the lambdas in C#; the | symbol stubbornly reads as bitwise OR (which, by the way, Rust also has, like all the good old C operators).
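
A present-day sketch of map_int, assuming closures passed as impl Fn and vectors of type Vec<i32>. The closure syntax kept the vertical bars but lost the surrounding braces.

```rust
// Applies `f` to every element and collects the results into a new vector.
fn map_int(f: impl Fn(i32) -> i32, vec: &[i32]) -> Vec<i32> {
    let mut result = Vec::new();
    for &i in vec {
        result.push(f(i));
    }
    result
}
```

A call then looks like map_int(|x| x + 1, &[1, 2, 3]).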

3. Partial application is the creation of a function from another function with more arguments by fixing the values of some of those arguments. This is done with the bind keyword and the placeholder character "_":

 let daynum = bind std::vec::position(_, ["mo", "tu", "we", "do", "fr", "sa", "su"])

To make it clearer, I will say right away that this can be done in plain C by writing a simple wrapper, something like this:
    const char* daynum(int i) {
        /* an array of strings, so the declaration needs [] */
        static const char *s[] = {"mo", "tu", "we", "do", "fr", "sa", "su"};
        return s[i];
    }

But partial application belongs to the functional style, not the procedural one (by the way, it is not clear from the example above how to use partial application to obtain a function with no arguments).

Another example: the add function is declared with two int arguments and returns an int. Next, the functional type single_param_fn is declared, which takes one int argument and returns an int. With bind, two function objects, add4 and add5, are built from add with one argument bound in each.

    fn add(x: int, y: int) -> int { ret x + y; }
    type single_param_fn = fn(int) -> int;
    let add4: single_param_fn = bind add(4, _);
    let add5: single_param_fn = bind add(_, 5);

Function objects can be called just like normal functions.

    assert (add(4,5) == add4(5));
    assert (add(4,5) == add5(4));
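
Present-day Rust has no bind keyword; a closure that fixes one argument gives the same effect. This is only a sketch of that substitute technique, and the function partial_application_demo is an invented wrapper.

```rust
fn add(x: i32, y: i32) -> i32 {
    x + y
}

// Each closure fixes one argument of `add` and leaves the other open.
fn partial_application_demo() -> (i32, i32) {
    let add4 = |y| add(4, y); // first argument bound to 4
    let add5 = |x| add(x, 5); // second argument bound to 5
    (add4(5), add5(4))
}
```

A closure with no parameters, such as || add(4, 5), would also answer the question of how to obtain a function without arguments.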

4. Pure functions and predicates
Pure functions are functions that have no side effects (which includes not calling any functions other than pure ones). Such functions are marked with the pure keyword.

  pure fn lt_42(x: int) -> bool { ret (x < 42); }

Predicates are pure functions that return bool. Such functions can be used in the typestate system (see below), that is, called at compile time for various static checks.

Syntax macros
A planned, but very useful, feature. In Rust it is still at an early stage of development.

 std::io::println(#fmt("%s is %d", "the answer", 42));

An expression similar to printf, but evaluated at compile time (so all argument errors are detected at compile time). Unfortunately, there is very little material on syntax macros, and they are still under development, but there is hope that something like Nemerle's macros will emerge.
By the way, unlike in Nemerle, the decision to mark macros syntactically with the # symbol is a very sound one: a macro is an entity very different from a function, and I consider it important to see at a glance where the code calls functions and where it invokes macros.
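
In present-day Rust the marker moved to the end of the name (format!, println!), but the compile-time check of the format string is still there. A small sketch under that assumption, with answer_line as an invented name:

```rust
// `format!` checks the format string against its arguments at compile time.
fn answer_line() -> String {
    // Removing the last argument would be rejected by the compiler:
    // two placeholders but only one value.
    format!("{} is {}", "the answer", 42)
}
```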

Attributes

A concept similar to attributes in C# (and even with similar syntax), for which special thanks to the developers. As you would expect, attributes add meta-information to the entity they annotate:

 #[cfg(target_os = "win32")] fn register_win_service() { /* ... */ }

There is another variant of the attribute syntax: the same line, but with a semicolon at the end, annotates the current context, that is, whatever corresponds to the nearest enclosing curly braces.

 fn register_win_service() { #[cfg(target_os = "win32")]; /* ... */ }
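
The cfg attribute still exists in present-day Rust, although the Windows target is now spelled "windows" rather than "win32". The sketch below assumes that spelling; the function name platform is illustrative.

```rust
// Only one of these two definitions is compiled, depending on the target.
#[cfg(target_os = "windows")]
fn platform() -> &'static str {
    "windows"
}

#[cfg(not(target_os = "windows"))]
fn platform() -> &'static str {
    "not windows"
}
```

The attribute that annotates the enclosing context is now written with an exclamation mark, #![...], instead of a trailing semicolon.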

Parallel computing

Perhaps one of the most interesting parts of the language. At the same time, it is currently not described in the tutorial at all :)
A Rust program consists of a “task tree”. Each task has an entry function, its own stack, means of interacting with other tasks (channels for outgoing information and ports for incoming), and owns some of the objects on the dynamic heap.
Many Rust tasks can exist within a single operating system process. Rust tasks are “lightweight”: each task consumes less memory than an OS process, and switching between them is faster than switching between OS processes (here, presumably, threads are meant).

A task consists of at least one function with no arguments. A task is launched with the spawn function. Each task can have channels through which it passes information to other tasks. A channel is a special template type, chan, parameterized by the type of data the channel carries. For example, chan<u8> is a channel for transmitting unsigned bytes.
To send to a channel, the send function is used; its first argument is the channel and its second is the value to send. In effect, this function places the value in the channel's internal buffer.
Ports are used to receive data. A port is a template type, port, parameterized by the type of data the port carries: port<u8> is a port for receiving unsigned bytes.
To read from a port, the recv function is used; its argument is the port and its return value is the data from the port. Reading blocks the task: if the port is empty, the task waits until another task sends data to a channel connected to the port.
Connecting channels to ports is very simple: a channel is initialized with a port using the chan keyword:

let reqport = port();
let reqchan = chan(reqport);

Multiple channels can be connected to one port, but not vice versa: one channel cannot be connected to several ports at once.
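
The same many-senders, one-receiver shape can be sketched with the present-day standard library, assuming std::sync::mpsc and OS threads in place of lightweight tasks. Here Sender plays the role of the channel and Receiver the role of the port; collect_bytes is an invented name.

```rust
use std::sync::mpsc;
use std::thread;

// Spawns `workers` threads that each send one byte to a single receiver.
fn collect_bytes(workers: u8) -> Vec<u8> {
    let (tx, rx) = mpsc::channel::<u8>();
    let mut handles = Vec::new();
    for id in 0..workers {
        let tx = tx.clone(); // many senders may feed one receiver
        handles.push(thread::spawn(move || {
            tx.send(id).expect("receiver is still alive");
        }));
    }
    drop(tx); // drop the original sender so the receive loop can finish
    // Receiving blocks until a value arrives or every sender is gone.
    let mut received: Vec<u8> = rx.iter().collect();
    for h in handles {
        h.join().expect("worker panicked");
    }
    received.sort(); // arrival order is not deterministic
    received
}
```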

Typestate

I did not find a generally accepted Russian translation of the term “typestate”, so I will call it “type states”. The essence of this feature is that, in addition to the usual type checking of static typing, additional context-dependent checks are possible at compile time.
In one form or another, type states are familiar to every programmer from the compiler message “variable is used without initialization”. The compiler determines where a variable that has never been written to is read, and issues a warning. More generally, the idea is this: every object has a set of states it can be in. In each state, certain operations on the object are valid and others are invalid. The compiler can check whether a specific operation on an object is permissible at a particular place in the program. Importantly, these checks are performed at compile time.

For example, an object of type “file” may be in the state “closed” or “open”, and reading from the file is invalid if the file is closed. In modern languages, the read function usually throws an exception or returns an error code. A typestate system could detect such an error at compile time: just as the compiler determines that a variable is read before any possible write, it could determine that the “Read” method, valid only in the open state, is called before the “Open” method that puts the object into that state.
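
The file example can be approximated in present-day Rust without a dedicated typestate system, by giving each state its own type. This is a sketch of that substitute technique, not of the typestate feature described here; the types ClosedFile and OpenFile are invented, and no real file is touched.

```rust
// Two distinct types stand for the two states of a file handle.
struct ClosedFile {
    name: String,
}

struct OpenFile {
    name: String,
    contents: String,
}

impl ClosedFile {
    fn new(name: &str) -> ClosedFile {
        ClosedFile { name: name.to_string() }
    }

    // Consumes the closed handle and yields an open one.
    // The contents are faked for the sake of the example.
    fn open(self) -> OpenFile {
        let contents = format!("contents of {}", self.name);
        OpenFile { name: self.name, contents }
    }
}

impl OpenFile {
    // `read` exists only on `OpenFile`, so reading a closed file cannot compile.
    fn read(&self) -> &str {
        &self.contents
    }

    // Consumes the open handle, returning to the closed state.
    fn close(self) -> ClosedFile {
        ClosedFile { name: self.name }
    }
}
```

Because open and close take self by value, the old handle cannot be used after a state change, which is the compile-time guarantee the paragraph above describes.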

Rust has the concept of “predicates”: special functions that have no side effects and return bool. The compiler can call such functions at compile time to statically check certain conditions.

Constraints are special checks that can be performed at compile time. The check keyword is used for this.

    pure fn is_less_than(a: int, b: int) -> bool {
        ret a < b;
    }
    fn test() {
        let x: int = 10;
        let y: int = 20;
        check is_less_than(x, y);
    }

Predicates can be attached to the input parameters of a function like this:

    fn test(x: int, y: int) : is_less_than(x, y) { ... }

Information on typestate is very scarce, so many points are still unclear, but the concept is interesting in any case.

That's all. It is quite possible that I missed some interesting points, but the article has already grown long. If you wish, you can build the Rust compiler now and try out various examples. Build instructions are given on the official website of the language.

Source: https://habr.com/ru/post/135712/
