
Abstraction without overhead: traits in Rust

In the previous post, we touched upon two pillars of Rust's design: memory safety without garbage collection, and concurrency without data races.

This post begins the story of the third pillar: abstraction without overhead.

One of the mantras of C++, and one of the qualities that makes it so well suited to systems programming, is the principle of zero-cost abstraction:
C++ implementations obey the zero-overhead principle: what you don't use, you don't pay for [Stroustrup, 1994]. And further: what you do use, you couldn't hand code any better.

- Bjarne Stroustrup

This mantra did not always apply to Rust, which once had mandatory garbage collection. But over time Rust's ambitions became ever lower-level, and zero-cost abstraction is now a core principle of the language.

The cornerstone of abstraction in Rust is traits.

All told, the trait system is the secret sauce that gives Rust the ergonomics and expressiveness of high-level languages while retaining low-level control over code execution and data representation.

This overview post walks through each of these points at a high level, without getting bogged down in the details, to give a sense of how the design achieves these goals.

Background: methods in Rust


Before delving into traits, we need to look at a small but important detail of the language: the difference between methods and functions.

Rust offers both methods and free-standing functions, and the two are closely related:

    struct Point {
        x: f64,
        y: f64,
    }

    // a free-standing function that converts a (borrowed) point to a string
    fn point_to_string(point: &Point) -> String { ... }

    // an "inherent impl" block defines the methods available directly on a type
    impl Point {
        // this method is available on any Point, and automatically borrows the
        // Point value
        fn to_string(&self) -> String { ... }
    }

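Methods like to_string are called "inherent" methods because they are tied to a single concrete "self" type (named in the impl block header) and are automatically available on any value of that type. Unlike functions, inherent methods are always in scope.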


The first parameter of a method is always an explicit "self", written as self, &mut self, or &self depending on the level of ownership required (mut self is also possible, but with respect to ownership it is the same as self; translator's note). Methods are invoked with dot notation (.), as in familiar OOP languages, and the self parameter is implicitly borrowed as required by the method signature:

    let p = Point { x: 1.2, y: -3.7 };
    let s1 = point_to_string(&p); // calling a free function, explicit borrow
    let s2 = p.to_string();       // calling a method, implicit borrow as &p
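To make the difference concrete, here is a minimal runnable sketch of the snippets above. The function bodies were elided in the original, so the "(x, y)" formatting and the into_tuple method are assumptions added purely for illustration.

```rust
struct Point {
    x: f64,
    y: f64,
}

// Free-standing function: the caller has to borrow explicitly with `&`.
// The "(x, y)" output format is an assumption made for this sketch.
fn point_to_string(point: &Point) -> String {
    format!("({}, {})", point.x, point.y)
}

impl Point {
    // Inherent method taking `&self`: the call site borrows automatically.
    fn to_string(&self) -> String {
        point_to_string(self)
    }

    // Hypothetical method taking `self` by value: calling it consumes the Point.
    fn into_tuple(self) -> (f64, f64) {
        (self.x, self.y)
    }
}

fn main() {
    let p = Point { x: 1.2, y: -3.7 };
    let s1 = point_to_string(&p); // explicit borrow
    let s2 = p.to_string();       // implicit borrow as &p
    assert_eq!(s1, s2);
    println!("{}", s1);

    // `p` is moved into the call and cannot be used afterwards.
    let (x, y) = p.into_tuple();
    println!("{} {}", x, y);
}
```

The three self forms differ only in what the call does to the receiver: &self borrows it, &mut self borrows it mutably, and self takes ownership.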

Methods and auto-borrowing are an important part of Rust's ergonomics, supporting simple, fluent APIs such as the one for spawning processes:

    let child = Command::new("/bin/cat")
        .arg("rusty-ideas.txt")
        .current_dir("/Users/aturon")
        .stdout(Stdio::piped())
        .spawn();

Traits are interfaces


Interfaces specify the expectations that one piece of code has of another, allowing each to be changed independently. For traits, this specification largely revolves around methods.

Take, for example, the following simple trait for hashing:

    trait Hash {
        fn hash(&self) -> u64;
    }

To implement this trait for a given type, we must provide a hash method with a matching signature:

    impl Hash for bool {
        fn hash(&self) -> u64 {
            if *self { 0 } else { 1 }
        }
    }

    impl Hash for i64 {
        fn hash(&self) -> u64 {
            *self as u64
        }
    }

Unlike interfaces in languages such as Java, C#, or Scala, new traits can be implemented for existing types (as with Hash in the example above). That means abstractions can be created after the fact and applied to existing libraries.

Unlike inherent methods, trait methods are in scope only when their trait is. But assuming Hash is in scope, you can write true.hash(). So implementing a trait extends the set of methods available on a type.
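The scoping rule is easiest to see when the trait lives in its own module. The sketch below assumes a hypothetical module named hashing; the trait and both impls are the ones from the text.

```rust
mod hashing {
    // The trait from the text, placed in a (hypothetical) module so that
    // the scoping rule becomes visible.
    pub trait Hash {
        fn hash(&self) -> u64;
    }

    impl Hash for bool {
        fn hash(&self) -> u64 {
            if *self { 0 } else { 1 }
        }
    }

    impl Hash for i64 {
        fn hash(&self) -> u64 {
            *self as u64
        }
    }
}

// Bringing the trait into scope is what makes `.hash()` callable below.
// Without this line, `true.hash()` is rejected by the compiler.
use hashing::Hash;

fn main() {
    println!("{}", true.hash());
    println!("{}", 42_i64.hash());
}
```

Note that bool and i64 are defined by the language, not by this code, yet they gain a hash method as soon as the trait is imported.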

And... that's it! Defining and implementing a trait is really nothing more than abstracting out a common interface satisfied by more than one type.

Static dispatch


Things get more interesting on the other side: consuming traits. The most common way of doing so is through generics:

    fn print_hash<T: Hash>(t: &T) {
        println!("The hash is {}", t.hash())
    }

The print_hash function is generic over an unknown type T, but requires that T implement the Hash trait. That means we can use it with bool and i64 values:

    print_hash(&true);   // instantiates T = bool
    print_hash(&12_i64); // instantiates T = i64

Generic functions are specialized into concrete implementations at compile time, which gives static dispatch. As with C++ templates, the compiler generates two copies of print_hash, one for each type used in place of the type parameter. This in turn means that the internal call to t.hash(), the point where the abstraction is actually used, has zero cost: it is compiled to a direct, static call to the relevant hash implementation:

    // The compiled code:
    __print_hash_bool(&true);   // invoke specialized bool version directly
    __print_hash_i64(&12_i64);  // invoke specialized i64 version directly
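Putting the pieces of this section together gives a small runnable program. This is only a sketch: hash_message is a hypothetical helper added so the result can be inspected as a String, and the explicit ::<T> calls merely spell out the instantiations that the compiler otherwise infers.

```rust
trait Hash {
    fn hash(&self) -> u64;
}

impl Hash for bool {
    fn hash(&self) -> u64 {
        if *self { 0 } else { 1 }
    }
}

impl Hash for i64 {
    fn hash(&self) -> u64 {
        *self as u64
    }
}

// Hypothetical helper: builds the message instead of printing it.
fn hash_message<T: Hash>(t: &T) -> String {
    format!("The hash is {}", t.hash())
}

// Generic over any T implementing Hash; specialized per concrete T.
fn print_hash<T: Hash>(t: &T) {
    println!("{}", hash_message(t))
}

fn main() {
    print_hash(&true);   // T = bool, inferred
    print_hash(&12_i64); // T = i64, inferred

    // The same two instantiations, written out explicitly.
    print_hash::<bool>(&true);
    print_hash::<i64>(&12_i64);
}
```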

This compilation model is not very interesting for a function like print_hash, but it is very useful for more realistic uses of hashing. Suppose we also introduce a trait for equality comparison:

    trait Eq {
        fn eq(&self, other: &Self) -> bool;
    }

(The Self type here is replaced by whatever type the trait is implemented for; in impl Eq for bool it refers to bool.)

We can then define a hash map type that is generic over a Key type implementing both Hash and Eq:

 struct HashMap<Key: Hash + Eq, Value> { ... } 

The static compilation model for generics then yields several benefits:

Each use of HashMap with concrete Key and Value types results in a separate concrete HashMap type, which means that HashMap can lay out keys and values in-line (without indirection) in its buckets. This saves space, avoids pointer dereferences, and makes fuller use of the cache.

Each method on HashMap likewise generates code specialized for the given types. So there is no extra cost for dispatching calls to hash and eq. It also means that the optimizer gets to work with fully concrete code: from the optimizer's point of view, there is no abstraction. In particular, static dispatch allows inlining across uses of generics.
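As a rough illustration of these two points, here is a toy map built on the Hash and Eq traits from this post. It is a sketch under stated assumptions, not how the standard library's HashMap works: the TinyMap name, the fixed bucket count, and the chaining strategy are all invented for the example.

```rust
trait Hash {
    fn hash(&self) -> u64;
}

trait Eq {
    fn eq(&self, other: &Self) -> bool;
}

impl Hash for i64 {
    fn hash(&self) -> u64 {
        *self as u64
    }
}

impl Eq for i64 {
    fn eq(&self, other: &Self) -> bool {
        *self == *other
    }
}

// Hypothetical toy map: keys and values are stored in-line in the buckets,
// with no boxing or other indirection per entry.
struct TinyMap<K: Hash + Eq, V> {
    buckets: Vec<Vec<(K, V)>>,
}

impl<K: Hash + Eq, V> TinyMap<K, V> {
    fn new(bucket_count: usize) -> Self {
        assert!(bucket_count > 0, "need at least one bucket");
        let buckets: Vec<Vec<(K, V)>> =
            (0..bucket_count).map(|_| Vec::new()).collect();
        TinyMap { buckets }
    }

    fn bucket_index(&self, key: &K) -> usize {
        // Statically dispatched call to K's implementation of hash.
        (key.hash() % self.buckets.len() as u64) as usize
    }

    fn insert(&mut self, key: K, value: V) {
        let idx = self.bucket_index(&key);
        let bucket = &mut self.buckets[idx];
        for entry in bucket.iter_mut() {
            // Statically dispatched call to K's implementation of eq.
            if Eq::eq(&entry.0, &key) {
                entry.1 = value; // overwrite the existing value
                return;
            }
        }
        bucket.push((key, value));
    }

    fn get(&self, key: &K) -> Option<&V> {
        let idx = self.bucket_index(key);
        self.buckets[idx]
            .iter()
            .find(|entry| Eq::eq(&entry.0, key))
            .map(|entry| &entry.1)
    }
}

fn main() {
    let mut map = TinyMap::new(4);
    map.insert(1_i64, "one");
    map.insert(5_i64, "five"); // lands in the same bucket as 1
    println!("{:?} {:?}", map.get(&1), map.get(&5));
}
```

For TinyMap<i64, &str> the compiler emits code in which key.hash() and Eq::eq are direct calls to the i64 implementations, which makes them candidates for inlining.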

Altogether, just as with C++ templates, these aspects of generics mean that you can write quite high-level abstractions that compile down to fully concrete code that "you couldn't hand code any better".

However, unlike with C++ templates, clients of traits are fully type-checked in advance. That is, when you compile HashMap in isolation, its code is checked once for type correctness against the abstract Hash and Eq traits, rather than being checked again each time it is applied to concrete types. That means clearer and earlier compilation errors for library authors, and less type-checking overhead (that is, faster compilation) for clients.

Dynamic dispatch


We have seen one compilation model for traits, in which all abstraction is compiled away statically. But sometimes abstraction is not just about reuse or modularity: sometimes it plays an essential role at runtime and cannot be compiled away.

For example, GUI frameworks often use callbacks to react to events, such as mouse clicks:

    trait ClickCallback {
        fn on_click(&self, x: i64, y: i64);
    }

It is also common for GUI elements to allow multiple callbacks to be registered for a single event. With generics, you might imagine writing:

    struct Button<T: ClickCallback> {
        listeners: Vec<T>,
        ...
    }

But the problem is immediately apparent: each button is specialized to exactly one implementor of ClickCallback, and the type of the button reflects that. That is not what we want! Instead, we would like a single Button type with a set of heterogeneous listeners, each potentially of a different concrete type but each implementing ClickCallback.

One immediate difficulty is that we are dealing with a group of heterogeneous types, each of which may have a different size. So how can we lay them out in a vector? The answer is the usual one: indirection. We store pointers to the callbacks in the vector:

    struct Button {
        // current Rust spells the trait object type with the dyn keyword
        listeners: Vec<Box<dyn ClickCallback>>,
        ...
    }

Here we are using the ClickCallback trait as if it were a type. In fact, in Rust, traits are types, but they are "unsized", which roughly means that they may only be used behind a pointer such as Box (which points onto the heap) or & (which can point anywhere).

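In Rust, a type like &dyn ClickCallback or Box<dyn ClickCallback> (written without the dyn keyword in older Rust) is called a "trait object". It consists of a pointer to an instance of a type T that implements ClickCallback, and a vtable: a pointer to T's implementation of each method in the trait (here, just on_click). That information is enough to dispatch method calls correctly at runtime and to ensure a uniform representation for all T. So Button is compiled just once, and the abstraction lives on at runtime.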
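Here is a minimal sketch of such a button with two unrelated listener types. The Logger and Beeper types, the shared log, and the add_listener and click methods are all hypothetical names introduced for this example. The code uses the dyn keyword, which current Rust requires for trait object types.

```rust
use std::cell::RefCell;
use std::rc::Rc;

trait ClickCallback {
    fn on_click(&self, x: i64, y: i64);
}

// Shared log so the example can observe what each listener did.
type Log = Rc<RefCell<Vec<String>>>;

// Two unrelated listener types with different sizes.
struct Logger {
    log: Log,
}

struct Beeper {
    log: Log,
    volume: u8,
}

impl ClickCallback for Logger {
    fn on_click(&self, x: i64, y: i64) {
        self.log.borrow_mut().push(format!("logger saw ({}, {})", x, y));
    }
}

impl ClickCallback for Beeper {
    fn on_click(&self, _x: i64, _y: i64) {
        self.log.borrow_mut().push(format!("beep at volume {}", self.volume));
    }
}

// One concrete Button type holding heterogeneous listeners behind pointers.
struct Button {
    listeners: Vec<Box<dyn ClickCallback>>,
}

impl Button {
    fn new() -> Self {
        Button { listeners: Vec::new() }
    }

    fn add_listener(&mut self, listener: Box<dyn ClickCallback>) {
        self.listeners.push(listener);
    }

    fn click(&self, x: i64, y: i64) {
        for listener in &self.listeners {
            // Dynamic dispatch: the call goes through the listener's vtable.
            listener.on_click(x, y);
        }
    }
}

// Registers both listeners, clicks once, and returns what was logged.
fn run_demo() -> Vec<String> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    let mut button = Button::new();
    button.add_listener(Box::new(Logger { log: Rc::clone(&log) }));
    button.add_listener(Box::new(Beeper { log: Rc::clone(&log), volume: 3 }));
    button.click(10, 20);
    let entries = log.borrow().clone();
    entries
}

fn main() {
    for line in run_demo() {
        println!("{}", line);
    }
}
```

Both Box::new calls produce a concrete box that is coerced to Box<dyn ClickCallback> at the call to add_listener, which is what lets one vector hold both listener types.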

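Static and dynamic dispatch are complementary tools, each appropriate for different scenarios. Rust's traits provide a single, simple notion of interface that can be used in both styles, with minimal and predictable costs. Trait objects satisfy Stroustrup's "pay as you go" principle: you have vtables when you need them, but the same trait can be compiled away statically when you don't.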


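The many uses of traits


We have seen the mechanics and basic use of traits above, but they also end up playing several other important roles in Rust. Here is a taste: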

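Closures. Much like the ClickCallback trait, closures in Rust are simply particular traits. You can read more about how this works in Huon Wilson's in-depth post on the topic.

Conditional APIs. Generics make it possible to implement a trait conditionally: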

    struct Pair<A, B> {
        first: A,
        second: B,
    }

    impl<A: Hash, B: Hash> Hash for Pair<A, B> {
        fn hash(&self) -> u64 {
            self.first.hash() ^ self.second.hash()
        }
    }
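Here, the Pair type implements Hash if, and only if, its components do. This allows the single Pair type to be used in different contexts while supporting the largest API available in each. The pattern is so common in Rust that there is built-in support for generating certain kinds of "mechanical" implementations automatically: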

    #[derive(Hash)]
    struct Pair<A, B> { .. }
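The hand-written conditional implementation can be tried out directly. In the sketch below, Opaque is a hypothetical type that deliberately has no Hash implementation, to show that a Pair containing it can still be constructed but does not get a hash method.

```rust
trait Hash {
    fn hash(&self) -> u64;
}

impl Hash for bool {
    fn hash(&self) -> u64 {
        if *self { 0 } else { 1 }
    }
}

impl Hash for i64 {
    fn hash(&self) -> u64 {
        *self as u64
    }
}

struct Pair<A, B> {
    first: A,
    second: B,
}

// Pair<A, B> implements Hash only when both A and B do.
impl<A: Hash, B: Hash> Hash for Pair<A, B> {
    fn hash(&self) -> u64 {
        self.first.hash() ^ self.second.hash()
    }
}

// Hypothetical type with no Hash implementation.
struct Opaque;

fn main() {
    let p = Pair { first: true, second: 12_i64 };
    println!("{}", p.hash());

    // Pairs nest: Pair<Pair<bool, i64>, bool> is hashable as well.
    let nested = Pair { first: p, second: false };
    println!("{}", nested.hash());

    // Still a valid Pair, just without the Hash API.
    let unhashable = Pair { first: Opaque, second: 1_i64 };
    // unhashable.hash(); // rejected at compile time: Opaque lacks Hash
    let _ = (unhashable.first, unhashable.second);
}
```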
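Extension methods. Traits can be used to extend an existing type (defined elsewhere) with new methods for convenience, similar to extension methods in C#. This falls directly out of the scoping rules for traits: you define the new methods in a trait, implement it for the type in question, and the methods become available!

Markers. Rust has a handful of "markers" that classify types: Send, Sync, Copy, Sized. These markers are just traits with empty bodies, which can then be used in both generics and trait objects. Some of them are implemented automatically, in the style of #[derive]: if all of a type's components are Send, for example, then so is the type. Markers can be very powerful: the Send marker is how Rust guarantees thread safety.

Overloading. Rust does not support traditional overloading, where the same method is defined with multiple signatures. But traits provide much of the benefit: if a method is defined generically over a trait, it can be called with any type implementing that trait. Compared to traditional overloading, this has two advantages. First, the overloading is less ad hoc: once you understand a trait, you immediately understand the overloading pattern of any API that uses it. Second, it is extensible: you can effectively add new overloads downstream of a method by providing new trait implementations.

Operators. Rust lets you overload operators such as + for your own types. Each operator is defined by a corresponding standard library trait, and any type implementing the trait automatically supports the operator as well.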
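Two of the uses listed above, extension methods and operators, fit in a few lines. The Shout trait and this Point type are hypothetical and exist only for the example; std::ops::Add is the standard library trait behind the + operator.

```rust
use std::ops::Add;

// Extension trait: adds a new method to the existing type `str`.
trait Shout {
    fn shout(&self) -> String;
}

impl Shout for str {
    fn shout(&self) -> String {
        format!("{}!", self.to_uppercase())
    }
}

#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: i64,
    y: i64,
}

// Implementing Add is what makes `a + b` available for Point values.
impl Add for Point {
    type Output = Point;

    fn add(self, other: Point) -> Point {
        Point {
            x: self.x + other.x,
            y: self.y + other.y,
        }
    }
}

fn main() {
    // Works because the Shout trait is in scope here.
    println!("{}", "hello".shout());

    let sum = Point { x: 1, y: 2 } + Point { x: 10, y: 20 };
    println!("{:?}", sum);
}
```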
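The point: despite their seeming simplicity, traits are a unifying concept that supports a wide range of use cases and patterns, without having to pile on additional language features.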


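The future


One of the primary ways languages evolve is in their abstraction facilities, and Rust is no exception: many of the post-1.0 priorities are extensions of the trait system in one direction or another. Here are some highlights: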

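Statically dispatched outputs. Right now, functions can use generics for their parameters, but there is no equivalent for their results: you cannot say "this function returns a value of some type that implements the Iterator trait" and have that abstraction compiled away. This is particularly problematic when you want to return a closure that should be statically dispatched. In today's Rust you simply cannot. We want to make this possible, and already have some ideas.

Specialization. Rust does not allow overlap between trait implementations, so there is never ambiguity about which code to run. On the other hand, there are cases where you can give a "blanket" implementation for a wide range of types, but would then like to provide a more specialized implementation for a few of them, often for performance reasons. We hope to propose a design in the near future.

Higher-kinded types (HKT). Traits today can only be applied to types, not to type constructors (that is, to things like Vec<u8>, not to Vec itself). This limitation makes it difficult to provide a good set of container traits, which are therefore not part of the current standard library. HKT is a major, cross-cutting feature that will represent a big step forward in Rust's abstraction capabilities.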

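Efficient reuse. Finally, while traits provide some mechanisms for reusing code (which this post did not cover), there are still patterns of reuse that do not fit well into the language today, notably the object-oriented hierarchies found in things like the DOM, GUI frameworks, and many games. Accommodating these use cases without adding too much overlap or complexity is a very interesting design problem, and one that Niko Matsakis has started a separate blog series about. It is not yet clear whether all of this can be done with traits, or whether some other ingredients are needed.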

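Of course, we are on the eve of the 1.0 release, and it will take some time for the dust to settle and for the community to gain enough experience to start landing these extensions. But that makes it an exciting time to get involved: from influencing the design at this early stage, to working on the implementation, to trying out different use cases in your own code. We would love to have your help!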
Source: https://habr.com/ru/post/257775/

