
Why is ownership/borrowing in Rust so complicated?

The original article was written by Ivan Sagalaev, who lives in Washington and is the husband of the well-known Alena C++.

The article itself.

Working with pure functions is simple: you pass arguments and get a result, with no side effects. If, on the other hand, a function does have side effects, such as mutating its arguments or global objects, it is harder to reason about. But we are used to those too: when we see something like player.set_speed(5), we can be reasonably sure that it will change the player object in a predictable way (and maybe send some signals somewhere).

Rust's ownership/borrowing system is hard because it creates a completely new class of side effects.

Simple example
Consider this code:
    let point = Point {x: 0, y: 0};
    let result = is_origin(point);
    println!("{}: {}", point, result);

Most programmers' experience does not prepare them for the fact that the point object suddenly becomes inaccessible after the call to is_origin()! The compiler will not let you use it on the next line. This is a side effect: something happened to the argument, but it is nothing like what you have seen in other languages.

This happens because the point object is moved (rather than copied) into the function, so the function becomes responsible for destroying it. The compiler forbids any use of the object after ownership has changed hands. There is a way to fix this: either pass the argument by reference or teach the type to copy itself. It all makes sense once you know about move-by-default. But these things tend to pop up unexpectedly during some innocent refactoring or, say, when adding logging.
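
The two fixes mentioned above can be sketched roughly as follows. The article does not show the definition of Point or the body of is_origin, so both are assumptions made here for illustration, as is the helper name is_origin_ref.

```rust
// Hypothetical Point definition; the article does not show one.
// Deriving Copy (which requires Clone) opts the type into copy semantics.
#[derive(Debug, Clone, Copy)]
struct Point {
    x: i32,
    y: i32,
}

// Option 1: borrow the argument; the caller keeps ownership.
fn is_origin_ref(point: &Point) -> bool {
    point.x == 0 && point.y == 0
}

// Option 2: take the argument by value. Because Point is Copy,
// the caller's value is copied rather than moved.
fn is_origin(point: Point) -> bool {
    point.x == 0 && point.y == 0
}

fn main() {
    let point = Point { x: 0, y: 0 };
    let by_ref = is_origin_ref(&point);
    let by_value = is_origin(point);
    // `point` is still usable here in both cases.
    println!("{:?}: {} {}", point, by_ref, by_value);
}
```

Either option alone is enough; without the Copy derive, only the by-reference call would leave point usable afterwards.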

A more complicated example
Consider a parser that takes some data from a lexer and keeps some state:
    struct Parser {
        lexer: Lexer,
        state: State,
    }

    impl Parser {
        fn consume_lexeme(&mut self) -> Lexeme {
            self.lexer.next()
        }

        pub fn next(&mut self) -> Event {
            let lexeme = self.consume_lexeme();
            if lexeme == SPECIAL_VALUE {
                self.state = State::Closed
            }
        }
    }

The seemingly unnecessary consume_lexeme() is just a convenient wrapper around a longer chain of calls that I have in the actual code. lexer.next() returns a self-contained lexeme by copying data from the lexer's internal buffer. Now we want to optimize this so that lexemes hold only references to that data and avoid the copying. This changes the method declaration to the following:
 pub fn next<'a>(&'a mut self) -> Lexeme<'a> 

The 'a annotation explicitly tells us that the lifetime of the lexeme is now tied to the lifetime of the reference to the lexer on which we call the .next() method. That is, the lexeme cannot live on its own; it depends on the data in the lexer's buffer. And now Parser::next() stops compiling:
    error: cannot assign to `self.state` because it is borrowed [E0506]
        self.state = State::Closed
        ^~~~~~~~~~~~~~~~~~~~~~~~~~
    note: borrow of `self.state` occurs here
        let lexeme = self.consume_lexeme();
                     ^~~~

Simply put, the Rust compiler tells us that as long as lexeme is available in this block of code, it will not let us change self.state, a different part of the parser. And that makes no sense whatsoever! The culprit here is consume_lexeme(). Although it actually needs only self.lexer, we tell the compiler that it takes a reference to the entire parser (note the self). And since that reference is mutable, the compiler will not let anyone else touch any part of the parser, lest they change the data that lexeme now depends on. So we have a side effect again: although we did not change the actual types in the function signature, and the code is still sound and should work correctly, a change in the ownership dynamics suddenly prevents it from compiling.

Even though I understood the problem in general, it took me no less than two days before it all clicked and the fix became obvious.

Fix
Changing consume_lexeme() to take a reference to just the lexer rather than to the entire parser fixed the problem, but the code no longer looked idiomatic, because dot notation had been replaced with a plain function call:
    let lexeme = consume_lexeme(&mut self.lexer); // a plain call instead of self.<...>

Fortunately, Rust also makes it possible to do this the right way. Since in Rust the definition of data fields (struct) is separate from the definition of methods (impl), I can define my own methods for any struct, even one imported from another module of the same crate:
    use lexer::Lexer;

    impl Lexer {
        pub fn consume(&mut self) -> Lexeme { .. }
    }

    // ...

    let lexeme = self.lexer.consume();

Neat!
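
To see why borrowing only self.lexer is enough, here is a minimal self-contained sketch of the fixed design. All type definitions are simplified stand-ins invented for illustration (the article elides them), and next() returns the lexeme instead of an Event so that the lexeme is provably still alive when self.state is assigned.

```rust
// Simplified, hypothetical stand-ins for the article's types.
#[derive(Debug, PartialEq)]
struct Lexeme<'a>(&'a str);

#[derive(Debug, PartialEq)]
enum State {
    Open,
    Closed,
}

// Assumed marker lexeme that closes the parser.
const SPECIAL_VALUE: Lexeme<'static> = Lexeme("}");

struct Lexer {
    words: Vec<String>, // stands in for the lexer's internal buffer
    pos: usize,
}

impl Lexer {
    // The returned lexeme borrows from the lexer's buffer; nothing is copied.
    fn consume(&mut self) -> Lexeme<'_> {
        let i = self.pos;
        self.pos += 1;
        // In this sketch an empty lexeme signals the end of input.
        Lexeme(self.words.get(i).map(String::as_str).unwrap_or(""))
    }
}

struct Parser {
    lexer: Lexer,
    state: State,
}

impl Parser {
    // Returning the lexeme keeps the borrow of `self.lexer` alive
    // across the assignment to `self.state`.
    fn next(&mut self) -> Lexeme<'_> {
        let lexeme = self.lexer.consume(); // borrows only `self.lexer`
        if lexeme == SPECIAL_VALUE {
            self.state = State::Closed; // a different field, so this is allowed
        }
        lexeme
    }
}

fn main() {
    let mut parser = Parser {
        lexer: Lexer {
            words: vec!["a".to_string(), "}".to_string()],
            pos: 0,
        },
        state: State::Open,
    };
    println!("{:?}", parser.next());
    println!("{:?}", parser.state);
    println!("{:?}", parser.next());
    println!("{:?}", parser.state);
}
```

The compiler tracks borrows of individual struct fields within a function body, which is why the call through self.lexer does not lock self.state, whereas a helper taking &mut self borrows the whole parser.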

Rust's borrow checker is a wonderful thing that forces you to write more robust code. But it is different from what you are used to, and it takes time to develop the skills to work with it efficiently.

Reader comments
Juarez: I got the impression that Rust adds unnecessary complexity by making “move by default” apply to elementary types. The programmer carries the extra burden of wrapping values in references everywhere. To me, the following would seem more natural:
a) “copy by default” for elementary types
b) “reference by default” for composite types (structs, traits, etc.)
c) “move by default” for composite types in asynchronous methods, case by case.
Did I miss something?
Ralf: Note, however, that what you call an “effect” here is very, very different from the “effects” people usually mean when they talk about “side effects”. Ownership and moves are purely compile-time concepts; they do not change what your code does. Consequently, they do not make reasoning about the behavior of your code any harder, since the behavior does not change.

In fact, side effects are now much more manageable. This is especially true compared with reasoning about unrestricted effects, as in C++, where almost anything can hold aliases to all kinds of data.

The job of the borrow and ownership checker is not to add a new side effect; it is to restrict existing side effects. If you own something or hold a mutable reference (which is necessarily unique), you can be sure that there are no unexpected (non-local) side effects on that object, because no one else can hold an alias to it. By this I mean that calling some function on some other data will never magically change the data you own. If you hold a shared reference, you can be sure that there are no side effects at all, because no one can change the data. When the compiler tells you that the data has been moved and you cannot use it, this is not a new side effect. It is “simply” the compiler's understanding of side effects, which it needs in order to make sure that everything is under control.
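
A small sketch may make the distinction between shared and mutable references concrete. The function names total and bump_all are hypothetical, chosen only for this illustration.

```rust
// Hypothetical helpers, named here only for illustration.

// A shared reference: this function can read the data but cannot change it.
fn total(values: &[i32]) -> i32 {
    values.iter().sum()
}

// A mutable (unique) reference: the function may change the data,
// and the caller has to say so explicitly by writing `&mut`.
fn bump_all(values: &mut [i32]) {
    for v in values.iter_mut() {
        *v += 1;
    }
}

fn main() {
    let mut data = vec![1, 2, 3];
    let before = total(&data); // cannot modify `data`
    bump_all(&mut data); // the possible mutation is visible at the call site
    let after = total(&data);
    println!("{} -> {}", before, after);
}
```

Reading only the call sites in main is enough to tell which call can change data, which is the kind of local reasoning the comment describes.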

In C++, if you pass a Point by value to a function, the compiler makes a shallow copy, and if Point contains pointers this can lead to confusion. Here, the Point object is perfectly safe to copy, but you must explicitly tell the compiler that this is what you want:
 #[derive(Copy,Clone)] struct Point { ... } 

You may wonder why the compiler cannot figure this out automatically, exactly as it does for Send.
The problem here is interface stability. If you write a library that exports a type that implements Copy, the library must keep that type Copy forever. Guaranteeing that a type is and always will be Copy has to be a conscious choice by the library author, hence the explicit annotation.

Afterword from the translator: the impetus for translating this article was the desire to find out what kind of "completely new class of side effects" had appeared in Rust. Although the article as a whole is interesting, the author is somewhat mistaken about the completely new class.

Source: https://habr.com/ru/post/278779/

