
Focusing on ownership

Translator's note: the original post is dated May 13, 2014, so some details, including the source code, may no longer match the current state of affairs. As for why such an old post is worth translating: its content is valuable for understanding one of the fundamental concepts of the Rust language, ownership.


Over time, I have become convinced that it would be better to drop the distinction between mutable and immutable local variables in Rust. Many people are, to say the least, skeptical about this. I wanted to lay out my position in public. I will give several motives: a philosophical one, an educational one, and a practical one, and I will also address the main defense of the current system. (Note: I considered submitting this as a Rust RFC, but decided that the tone was better suited to a blog post, and I don't have time to rewrite it now.)


A clarification


I have written this article rather forcefully, and I do believe that the path I am advocating is the right one. However, if we end up keeping the current system, it will not be a disaster or anything like that. It has its advantages, and in general I find it quite pleasant. I just think we can improve on it.


In a word


I would like to remove the distinction between immutable and mutable local variables and rename &mut pointers to &my , &only or &uniq (it makes no difference to me which). There would be no mut keyword at all.


Philosophical motive


The main reason I want to do this is that I think it will make the language more coherent and easier to understand. Essentially, it reorients us from talking about mutability to talking about aliasing (which I will call "sharing", see below).


Mutability then becomes a corollary of uniqueness: "You can always mutate anything you have unique access to. Shared data is generally immutable, but if you must, you can mutate it using some kind of Cell type."


In other words, over time it has become clear to me that problems with data races and memory safety arise when you have both aliasing and mutability at the same time. The functional approach to solving this problem is to remove mutability. Rust's approach would be to remove aliasing. This gives us a story that we can tell, and one that helps build understanding.


Terminology note: I think that we should refer to aliasing as sharing (translator's note: from here on, "sharing" in the sense of "shared ownership" is used in place of "aliasing", since a literal rendering of "aliasing" does not convey what is at stake). In the past, we avoided this term because of its multithreaded connotations. However, if and when we implement the data parallelism plans that I have proposed, this connotation is not at all inappropriate. In fact, given the close relationship between memory safety and data races, I actually want to promote this connotation.


Educational motive


I think that the current rules are harder to understand than they need to be. It is not obvious, for example, that &mut T implies the absence of sharing. In addition, the notation &mut T suggests that &T implies no mutability, which is not entirely accurate because of types such as Cell . And nobody can agree on what to call them ("mutable/immutable references" is the most common choice, but it is not quite right).


In contrast, a type like &my T or &only T seems to make explanations much simpler. It is a unique reference: naturally, you cannot have two of them pointing to the same place. And mutability is an orthogonal thing: it follows from uniqueness, but it is also available through cells. And the type &T is simply the opposite, a shared reference. RFC PR #58 makes a number of similar arguments. I will not repeat them here.
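The two sources of mutability described here can be seen side by side in present-day Rust. The following is a minimal sketch, not code from the original post; the function names `bump_unique`, `bump_shared` and `demo` are made up for illustration.

```rust
use std::cell::Cell;

// Mutation through a unique reference: while `n` is live,
// no other reference to the same value can be used.
fn bump_unique(n: &mut usize) {
    *n += 1;
}

// Mutation through a shared reference: permitted only because
// `Cell` explicitly provides it.
fn bump_shared(n: &Cell<usize>) {
    n.set(n.get() + 1);
}

// Hypothetical driver that exercises both paths.
fn demo() -> (usize, usize) {
    let mut a = 0;
    bump_unique(&mut a);

    let b = Cell::new(0); // note: no `mut` on this binding
    let (r1, r2) = (&b, &b); // two shared references to the same cell
    bump_shared(r1);
    bump_shared(r2);

    (a, b.get())
}
```

Calling `demo()` yields `(1, 2)`: one write through the unique reference, and two writes through two coexisting shared references to the same `Cell`.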


Practical motive


Currently, there is a gap between borrowed pointers, which can be either shared or mutable and unique, and local variables, which are always unique but may be mutable or immutable. The end result is that users have to put mut declarations on things that are never directly mutated.


Local variables cannot be modeled using references


This phenomenon arises from the fact that references are not as expressive as local variables. In general, this hinders abstraction. Let me give a few examples to explain what I mean. Imagine that I have an environment struct that stores a pointer to an error counter:


    struct Env<'a> {
        errors: &'a mut usize
    }

Now I can create instances of this struct (and use them):


    let mut errors = 0;
    let env = Env { errors: &mut errors };
    ...
    if some_condition {
        *env.errors += 1;
    }

OK, now imagine that I want to factor the code that mutates env.errors out into a separate function. I would expect that, since the env variable is not declared as mutable, I could use an immutable & reference:


    let mut errors = 0;
    let env = Env { errors: &mut errors };
    helper(&env);

    fn helper(env: &Env) {
        ...
        if some_condition {
            *env.errors += 1; // ERROR
        }
    }

But this does not work. The problem is that &Env is a shared type (translator's note: as you know, more than one immutable reference to an object can exist at a time), and therefore env.errors appears in a location that may be shared. For this code to work, I must declare env as mutable and use an &mut reference (translator's note: &mut tells the compiler that env is uniquely borrowed, since only one mutable reference to an object can exist at a time and a data race is ruled out; mut is needed because you cannot create a mutable reference to an immutable binding):


    let mut errors = 0;
    let mut env = Env { errors: &mut errors };
    helper(&mut env);

This problem arises because we know that local variables are unique, but we cannot put that knowledge into a borrowed reference without making it mutable.
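The asymmetry can be reproduced with a present-day compiler. Below is a minimal sketch under some assumptions: the lifetime parameter on `Env`, the `some_condition` argument and the `count_errors` wrapper are additions made only so that the fragment compiles on its own.

```rust
struct Env<'a> {
    errors: &'a mut usize,
}

// Takes `&mut Env` only because it needs unique access to `env.errors`;
// `env` itself is never reassigned.
fn helper(env: &mut Env, some_condition: bool) {
    if some_condition {
        *env.errors += 1;
    }
}

// Hypothetical wrapper that exercises both styles of use.
fn count_errors() -> usize {
    let mut errors = 0;
    {
        // Direct use: the binding is not `mut`, yet the write is accepted,
        // because a local variable is known to be unique.
        let env = Env { errors: &mut errors };
        *env.errors += 1;
    }
    {
        // Through a helper: the binding must be `mut`,
        // otherwise `&mut env` cannot be taken.
        let mut env = Env { errors: &mut errors };
        helper(&mut env, true);
        helper(&mut env, false);
    }
    errors
}
```

Here `count_errors()` returns 2. Removing `mut` from the second `env` binding makes the compiler reject `helper(&mut env, ...)`, even though nothing ever assigns to `env`.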


This problem crops up in a number of other places. So far we have papered over it in various ways, but the feeling persists that we are covering up a gap that simply should not be there.


Type checking for closures


We had to work around this limitation in the case of closures. Closures are mostly desugared into structs like Env , but not quite. This is because I did not want to require that local variables be declared mut if they are used via &mut inside the closure. In other words, take some code like this:


    fn foo(errors: &mut usize) {
        do_something(|| *errors += 1)
    }

The closure expression will actually create an instance of a struct like this:


    struct ClosureEnv<'a, 'b> {
        errors: &'a uniq &'b mut usize
    }

Note the &uniq reference. This is not something an end user can write. It means a "unique but not necessarily mutable" pointer. It is required to pass type checking. If the user tried to write this struct by hand, they would have to write &mut &mut usize , which in turn would require the errors parameter to be declared as mut errors: &mut usize .
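The closure special case is still observable in present-day Rust. The following is a minimal sketch; `do_something` is a hypothetical stand-in that simply calls the closure twice, and `run_foo` is an illustrative wrapper.

```rust
// Hypothetical stand-in for `do_something`: calls the closure twice.
fn do_something<F: FnMut()>(mut f: F) {
    f();
    f();
}

fn foo(errors: &mut usize) {
    // `errors` is not declared `mut`, yet the closure may write through it:
    // the capture is a unique borrow that the user cannot spell out by hand.
    do_something(|| *errors += 1);
}

// Hypothetical wrapper used to observe the result.
fn run_foo() -> usize {
    let mut errors = 0;
    foo(&mut errors);
    errors
}
```

`run_foo()` returns 2. Writing the environment by hand as a struct holding `&mut &mut usize` would force the parameter to be declared `mut errors: &mut usize`, which is exactly the asymmetry described above.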


Unboxed closures and procs


I foresee this limitation being a problem for unboxed closures. Let me elaborate on the design that I was considering. Basically, the idea was that a || expression is equivalent to some fresh struct type that implements one of the Fn traits:


    trait Fn<A, R> { fn call(&self, ...); }
    trait FnMut<A, R> { fn call(&mut self, ...); }
    trait FnOnce<A, R> { fn call(self, ...); }

The precise trait would be selected based on the expected type, as it is today. In this case, consumers of closures can write one of two things:


    fn foo(&self, closure: FnMut<usize, usize>) { ... }
    fn foo<T: FnMut<usize, usize>>(&self, closure: T) { ... }

We ... probably want to fix the syntax, perhaps adding sugar such as FnMut(usize) -> usize , or keeping |usize| -> usize , etc. That is not so important; what matters is that we would be passing the closure by value. Note that under the current DST (Dynamically-Sized Types) rules, it is legal to pass a trait type by value as an argument, so the FnMut<usize, usize> argument is valid DST and is not a problem.


Aside: this design is not complete, and I will describe all the details in a separate post.


The problem is that calling the closure will require an &mut reference. Since the closure is passed by value, users will again have to write mut where it looks out of place:


    fn foo(&self, mut closure: FnMut<usize, usize>) {
        let x = closure.call(3);
    }

This is the same problem as in the Env example above: what is actually happening is that the FnMut trait just wants a unique reference, but since that is not part of the type system, it asks for a mutable reference.


Now, we could perhaps work around this in various ways. One option is to have the || syntax expand not to "some struct type" but rather to "a struct type or a pointer to a struct type, as dictated by type inference". In that case, the callee could write:


    fn foo(&self, closure: &mut FnMut<usize, usize>) {
        let x = closure.call(3);
    }

I do not want to say that this is the end of the world. But it is one more step in the growing contortions that we have to go through in order to maintain this split between local variables and references.
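Both styles can be tried in present-day Rust, written with the FnMut(usize) -> usize sugar rather than the FnMut<usize, usize> form used in this post. This is a minimal sketch; the names `call_by_value`, `call_by_ref` and `run_calls` are illustrative and do not come from the original design.

```rust
// By value: the parameter needs `mut` purely so that it can be called,
// since calling an `FnMut` closure borrows it uniquely.
fn call_by_value<F: FnMut(usize) -> usize>(mut closure: F) -> usize {
    closure(3)
}

// By unique reference: no `mut` is needed on the binding itself.
fn call_by_ref(closure: &mut dyn FnMut(usize) -> usize) -> usize {
    closure(3)
}

// Hypothetical driver: both closures update the same running total.
fn run_calls() -> (usize, usize, usize) {
    let mut total = 0;
    let a = call_by_value(|x| { total += x; total });
    let b = call_by_ref(&mut |x| { total += x; total });
    (a, b, total)
}
```

`run_calls()` returns `(3, 6, 6)`. Dropping `mut` from the parameter of `call_by_value` produces a compile error, although the function never reassigns `closure`.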


Other parts of the API


I have not done an exhaustive study, but, naturally, this distinction creeps in elsewhere too. For example, to read from a Socket , I need a unique pointer, so I have to declare the socket mutable. As a result, code like this sometimes does not work:


    let socket = Socket::new();
    socket.read() // ERROR: need a mutable reference

Naturally, under my proposal, such code would work fine. You would still get an error if you tried to read from an &Socket , but it would say something like "cannot create a unique reference to a shared reference", which I personally find clearer.
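To make the Socket case concrete, here is a minimal sketch with a hypothetical in-memory `Socket`. The fields, the canned data and the `read_two` wrapper are all assumptions made for illustration; this is not a real networking API.

```rust
// Hypothetical stand-in for `Socket`: serves bytes from a buffer.
struct Socket {
    data: Vec<u8>,
    pos: usize,
}

impl Socket {
    fn new() -> Socket {
        Socket { data: vec![1, 2, 3], pos: 0 }
    }

    // Reading advances a cursor, so it needs unique access to the socket.
    fn read(&mut self) -> Option<u8> {
        let byte = self.data.get(self.pos).copied();
        if byte.is_some() {
            self.pos += 1;
        }
        byte
    }
}

// Hypothetical wrapper used to observe the result.
fn read_two() -> (Option<u8>, Option<u8>) {
    // Without `mut` on this binding, `socket.read()` is rejected,
    // even though `socket` is never reassigned.
    let mut socket = Socket::new();
    (socket.read(), socket.read())
}
```

`read_two()` returns `(Some(1), Some(2))`. The `mut` on the binding exists only to obtain the unique reference that `read` asks for.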


But don't we need mut for safety?


No, not at all. Rust programs would be equally sound if you simply declared all bindings as mut . The compiler is perfectly capable of tracking which local variables are being mutated at any point, precisely because they are local to the current function. What the type system really cares about is uniqueness.


The value that I see in the current mut rules, and I will not deny that there is value, is primarily that they help declare intent. That is, when I read code, I know which variables may be reassigned. On the other hand, I also spend a lot of time reading C++ code and, frankly, I have never noticed this being a major stumbling block. (The same goes for the time I have spent reading code in Java, JavaScript, Python, or Ruby.)


It is also true that I sometimes find bugs because I declared a variable as mut and then forgot to mutate it. I think that we could get similar benefits through other, more aggressive lints (for example, checking that at least one of the variables used in a loop condition is mutated in the loop body). I personally cannot recall ever running into the opposite situation: that is, if the compiler says that something must be mutable, it basically always means that I forgot a mut keyword somewhere. (Consider: when was the last time you responded to a compiler error about an illegal mutation by doing anything other than restructuring the code to make the mutation legal?)


Alternatives


I see three alternatives to the current system:


  1. The one that I have presented, where you just throw out "mutability" and track only uniqueness.
  2. The one where you have three reference types: & , &uniq and &mut . (As I wrote, this is in fact the type system we have today, at least from the borrow checker's point of view.)
  3. A stricter variant, in which "non-mut" variables are always considered shared. This would mean that you have to write:


     let mut errors = 0;
     let mut p = &mut errors; // Note that `p` must be declared `mut`.
     *p += 1;

    You need to declare p as mut , because otherwise the variable would be considered shared, even though it is a local variable, and hence mutating *p would not be allowed. What is strange about this scheme is that the local variable is NOT shared, and we know this for sure, because we allow it to be moved, have its destructor run, and so on. That is, we would still have a notion of "owned" that is distinct from "not shared".


    On the other hand, if we described this system by saying that mutability is inherited through &mut pointers, without talking about sharing at all, it might make sense.
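For contrast with the third alternative, present-day Rust does not require mut on a binding that merely holds a unique reference. This is a minimal sketch; the wrapper function `bump` is an illustrative addition.

```rust
// Hypothetical wrapper around the three-line example from the list above.
fn bump() -> usize {
    let mut errors = 0;
    let p = &mut errors; // no `mut` on `p` itself
    *p += 1; // writing through `p` is fine: `p` is unique
    errors
}
```

`bump()` returns 1. Under the stricter scheme, the same code would need `let mut p`.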



Of these three, I definitely prefer #1. It is the simplest, and right now I am most interested in how we can simplify Rust while retaining its character. Failing that, I prefer the system we have right now.


Conclusion


In principle, I find that the current rules on mutability have some value, but they come at a cost. They are a kind of leaky abstraction: that is, they tell a simple story that turns out to be incomplete. This leads to confusion when people move from the initial understanding, in which &mut is how mutability works, to the full understanding: sometimes mut is needed only to get uniqueness, and sometimes mutability comes without the mut keyword.


Moreover, we have to bend over backwards to maintain the fiction that mut stands for mutability rather than uniqueness. We have added special cases to the borrow checker for closures. We have to make the rules around &mut mutability more complex in general. We must either add mut to closures so that they can be called, or make the closure syntax expand in a less obvious way. And so on.


In the end, it all adds up to a more complicated language overall. Instead of just thinking about sharing and uniqueness, the user has to think about sharing and mutability, and the two are somehow tangled together.


I do not think it's worth it.



Source: https://habr.com/ru/post/418735/

