Rust: Type States

Rust used to have type states, but they were removed from the language even before the official 1.0 release. In this article I will let you in on a secret: Rust still supports type states.


Wait, what are "type states"?

Let's look at an object that represents a file; let's call it, say, MyFile. Before MyFile is opened, it cannot be read from. After MyFile is closed, it cannot be read from either. Type states are a mechanism that allows the borrow checker to prevent the following errors:

    fn read_contents_of_file(path: &Path) -> String {
        let mut my_file = MyFile::new(path);
        my_file.open(); // Error: this may fail.
        // We should have checked the result of `my_file.open()`.
        let result = my_file.read_all();
        my_file.close();
        my_file.seek(Seek::Start); // Error: we have already closed `my_file`.
        result
    }

In this example, we made two errors:

we read from a file that may not have been opened successfully;
we moved the current position in a file that had already been closed.

In most programming languages, we can easily design an API for MyFile that makes the first error impossible, by throwing an exception when the file cannot be opened. Some standard libraries have chosen to deviate from this rule for the sake of flexibility, but the language itself makes it possible.

The second error, however, is much harder to catch. Most programming languages have features that make this error harder to make, most often by closing the file automatically when the object reaches the end of its lifetime. The only non-academic language I know of that can prevent this error outright is Rust.
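The scope-based cleanup mentioned above can be sketched in Rust with the `Drop` trait. This is only an illustration: `Handle`, its `log` field and `scoped_use` are hypothetical names invented for the sketch, not part of MyFile.

```rust
use std::cell::RefCell;

/// Hypothetical resource that records when it is closed.
struct Handle<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<String>>,
}

impl<'a> Drop for Handle<'a> {
    // Runs automatically when the value goes out of scope.
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("closed {}", self.name));
    }
}

/// Uses a handle inside an inner scope and returns the recorded events.
fn scoped_use() -> Vec<String> {
    let log = RefCell::new(Vec::new());
    {
        let _h = Handle { name: "a.txt", log: &log };
        log.borrow_mut().push("using a.txt".to_string());
    } // `_h` is dropped here, which closes the handle.
    log.borrow_mut().push("after scope".to_string());
    log.into_inner()
}
```

Cleanup at the end of the scope guarantees that the resource is eventually released, but on its own it says nothing about calls made after an explicit close. That is the gap the rest of the article addresses.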

Simple "type states" in Rust

How do we implement this in Rust?

The easiest way is to introduce appropriate types to represent the operations on MyFile.

    impl MyFile {
        // `open` is the only way to create a `MyFile`.
        pub fn open(path: &Path) -> Result<MyFile, Error> { ... }

        // `seek` requires a `MyFile`.
        pub fn seek(&mut self, pos: Seek) -> Result<(), Error> { ... }

        // `read_all` requires a `MyFile`.
        pub fn read_all(&mut self) -> Result<String, Error> { ... }

        // `close` takes `self`, not `&self` or `&mut self`,
        // which means that it "consumes" the object
        // (moves it into the method).
        pub fn close(self) -> Result<(), Error> { ... }
    }

    impl Drop for MyFile {
        // A destructor that closes the `MyFile` automatically
        // if it has not been closed yet.
        fn drop(&mut self) { ... }
    }
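For readers who want something that compiles, here is a minimal sketch of such an API. It assumes that MyFile simply wraps `std::fs::File`, and it uses `std::io::SeekFrom` and `std::io::Result` in place of the article's `Seek` and `Error` types. These choices are assumptions made for the sketch, not the author's implementation. No explicit `Drop` is needed here, because `std::fs::File` already closes itself when dropped.

```rust
use std::fs::File;
use std::io::{self, Read, Seek, SeekFrom};
use std::path::Path;

/// A thin wrapper around `std::fs::File` (an assumption of this sketch).
pub struct MyFile {
    inner: File,
}

impl MyFile {
    /// The only way to obtain a `MyFile` is a successful `open`.
    pub fn open(path: &Path) -> io::Result<MyFile> {
        Ok(MyFile { inner: File::open(path)? })
    }

    /// Moves the read position.
    pub fn seek(&mut self, pos: SeekFrom) -> io::Result<()> {
        self.inner.seek(pos).map(|_| ())
    }

    /// Reads everything from the current position to the end.
    pub fn read_all(&mut self) -> io::Result<String> {
        let mut contents = String::new();
        self.inner.read_to_string(&mut contents)?;
        Ok(contents)
    }

    /// Takes `self` by value: the file is moved in and dropped here,
    /// so the caller cannot use it afterwards.
    pub fn close(self) -> io::Result<()> {
        Ok(())
    }
}

pub fn read_contents_of_file(path: &Path) -> io::Result<String> {
    let mut my_file = MyFile::open(path)?;
    let result = my_file.read_all()?;
    my_file.close()?;
    // my_file.seek(SeekFrom::Start(0))?; // would not compile: use of moved value
    Ok(result)
}
```

Uncommenting the `seek` line after `close` should make the compiler reject the function with a "use of moved value" error.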

Let's rewrite the example above:

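    fn read_contents_of_file(path: &Path) -> Result<String, Error> {
        let mut my_file = MyFile::open(path)?;
        // Note the `?` operator above. It is a simple operator that
        // checks whether the operation succeeded and returns early if not.
        // The *only* way to obtain a `MyFile` is
        // a successful call to `MyFile::open`.
        // Here `my_file` is a `MyFile`, which means
        // that we are allowed to use it.

        let result = my_file.read_all()?; // Valid.
        my_file.close(); // Valid.
        // Since `my_file.close()` "consumed" `my_file`,
        // the variable no longer exists.

        my_file.seek(Seek::Start)?; // Error, detected by the compiler.
        Ok(result)
    }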

This also works in more complex cases:

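    fn read_contents_of_file(path: &Path) -> Result<String, Error> {
        // As above.
        let mut my_file = MyFile::open(path)?;
        let result = my_file.read_all()?; // Valid.

        if are_we_happy_yet() {
            my_file.close(); // Valid.
        }
        // Since `my_file.close()` "consumed" `my_file`, the variable
        // may no longer exist at this point (if `are_we_happy_yet()`
        // returned true).

        my_file.seek(Seek::Start)?; // Error, detected by the compiler.
        Ok(result)
        // If we have not closed `my_file`, it is closed automatically here.
    }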

The Rust type system checks that a variable is not used after it has been "consumed" (moved). In this example, my_file.close() consumes the variable.

Even if we tried to stash the variable somewhere and use it again after calling my_file.close(), the compiler would stop us:

    fn read_contents_of_file(path: &Path) -> Result<String, Error> {
        // As above.
        let mut my_file = MyFile::open(path)?;
        let result = my_file.read_all()?;

        let mut my_file_sneaky_backup = my_file;
        // We have moved `my_file` into `my_file_sneaky_backup`,
        // so we can no longer use `my_file`.

        my_file.close(); // Error, detected by the compiler.
        my_file_sneaky_backup.seek(Seek::Start)?;
        Ok(result)
        // If we have not closed the file, it is closed automatically here.
    }

Let's try to trick the compiler by making the file available after it has been closed:

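    fn read_contents_of_file(path: &Path) -> Result<String, Error> {
        let my_shared_file = Rc::new(RefCell::new(MyFile::open(path)?));
        // `my_shared_file` is a shared pointer to a mutable
        // `MyFile`, much like an object reference in Java, C#, Python.

        let result = my_shared_file.borrow_mut()
            .read_all()?; // Valid.

        let my_shared_file_sneaky_backup = my_shared_file.clone();
        // We have cloned the pointer, so
        // `my_shared_file` is still valid.
        // Both pointers now refer to the same file.

        my_shared_file_sneaky_backup.borrow_mut().seek(Seek::Start)?; // Valid.
        my_shared_file.borrow_mut().seek(Seek::Start)?; // Valid.
        // We can use both `my_shared_file` and
        // `my_shared_file_sneaky_backup`,
        // just as we could in Java, C#, Python!
        // However, we cannot call `close()` through `my_shared_file`,
        // because the `MyFile` is shared, which means
        // that no single pointer is allowed to "consume" it.

        my_shared_file.borrow_mut().close(); // Error, detected by the compiler.
        my_shared_file_sneaky_backup.borrow_mut().seek(Seek::Start)?;
        Ok(result)
        // When the last pointer goes away, the file is closed automatically.
    }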

Once again, the compiler stopped us: without using unsafe, we cannot break the invariant that seek cannot be called after close.
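If shared code really does need to close the file, one option is to regain unique ownership first. The sketch below uses `Rc::try_unwrap`, which succeeds only when no other clones of the pointer remain. `Resource` and `close_if_unique` are hypothetical names introduced here, with `Resource` standing in for MyFile.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical stand-in for the article's `MyFile`.
struct Resource;

impl Resource {
    /// Consumes the resource, like `MyFile::close`.
    fn close(self) {}
}

/// Tries to close a shared resource. Returns `true` if it was closed,
/// `false` if another owner still holds a pointer to it.
fn close_if_unique(shared: Rc<RefCell<Resource>>) -> bool {
    match Rc::try_unwrap(shared) {
        // We were the last owner: unwrap the cell and consume the value.
        Ok(cell) => {
            cell.into_inner().close();
            true
        }
        // Another clone is still alive, so nothing may consume the value.
        Err(_still_shared) => false,
    }
}
```

The invariant is preserved either way: the value can be consumed only once nobody else is able to reach it.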

This example shows the first building block of type states in Rust: typed move semantics. So far so good. However, we have only considered the simple case in which a file is either open or closed.

Let's see if we can work with more complex cases.

More complex "type states"

Instead of files, consider the following network protocol:

1. The sender sends "HELLO".
2. The receiver receives "HELLO" and responds with "HELLO, YOU".
3. The sender receives "HELLO, YOU" and responds with a random number.
4. The receiver receives the sender's number and responds with the same number.
5. The sender receives the same number back from the receiver and responds with "BYE".
6. The receiver receives the sender's "BYE" and replies "BYE, YOU".
7. Return to step 1.

All other messages are ignored.
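To make the steps concrete, the receiver's side can be sketched as an ordinary state machine that is checked at run time. The names `Message`, `Expect` and `RuntimeReceiver` are assumptions made for this sketch. It is the conventional approach, shown for contrast with the compile-time technique discussed next.

```rust
/// Messages exchanged in the protocol (names are assumptions).
#[derive(Debug, Clone, PartialEq)]
enum Message {
    Hello,
    HelloYou,
    Number(u32),
    Bye,
    ByeYou,
}

/// What the receiver expects to see next.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Expect {
    Hello,
    Number,
    Bye,
}

struct RuntimeReceiver {
    expect: Expect,
}

impl RuntimeReceiver {
    fn new() -> Self {
        RuntimeReceiver { expect: Expect::Hello }
    }

    /// Returns the reply, or `None` if the message is ignored.
    fn on_message(&mut self, msg: Message) -> Option<Message> {
        match (self.expect, msg) {
            (Expect::Hello, Message::Hello) => {
                self.expect = Expect::Number;
                Some(Message::HelloYou)
            }
            (Expect::Number, Message::Number(n)) => {
                self.expect = Expect::Bye;
                Some(Message::Number(n))
            }
            (Expect::Bye, Message::Bye) => {
                self.expect = Expect::Hello;
                Some(Message::ByeYou)
            }
            // All other messages are ignored.
            _ => None,
        }
    }
}
```

Here the current state lives in a field and every transition is checked while the program runs. A message that arrives out of order is only detected at that moment.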

We can design Sender (and Receiver) so that the operations are guaranteed to happen in the correct order. For now, we are not concerned with identifying the other party or with how the number is chosen.

Let's combine typed moves with another technique, phantom types, which is common in strongly typed functional programming languages.

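    // Zero-sized types, which exist only at compile time.
    // They are used only as markers ("phantom types").
    struct SenderReadyToSendHello;
    struct SenderHasSentHello;
    struct SenderHasSentNumber;
    struct SenderHasReceivedNumber;

    struct Sender<S> {
        /// The actual I/O implementation.
        inner: SenderImpl,
        /// A zero-sized field, used only so that the type checker can track the state.
        state: S,
    }

    /// The following methods may be called in any state.
    impl<S> Sender<S> {
        /// The port in use.
        fn port(&self) -> usize { ... }

        /// Close the sender.
        fn close(self) { ... }
    }

    /// The following method may be called only in state
    /// SenderReadyToSendHello.
    impl Sender<SenderReadyToSendHello> {
        /// Send "HELLO". This consumes the sender in its current state
        /// and returns it in the next state.
        fn send_hello(mut self) -> Sender<SenderHasSentHello> {
            self.inner.send_message("HELLO");
            Sender {
                // Move the I/O implementation.
                // The compiler is usually smart enough to see
                // that this does not require any work at run time.
                inner: self.inner,
                // Replace the zero-sized field.
                // This operation is erased at compile time.
                state: SenderHasSentHello
            }
        }
    }

    /// The following methods may be called only in state SenderHasSentHello.
    impl Sender<SenderHasSentHello> {
        /// Wait until the receiver has sent "HELLO, YOU",
        /// respond with the number.
        ///
        /// Return the sender in state `SenderHasSentNumber`.
        fn wait_respond_to_hello_you(mut self) -> Sender<SenderHasSentNumber> {
            // ...
        }

        /// If the receiver has sent "HELLO, YOU", respond with the number and
        /// return the sender in state `SenderHasSentNumber`.
        ///
        /// Otherwise, return the unchanged state.
        fn try_respond_to_hello_you(mut self) -> Result<Sender<SenderHasSentNumber>, Self> {
            // ...
        }
    }

    /// The following methods may be called only in state SenderHasSentNumber.
    impl Sender<SenderHasSentNumber> {
        /// Wait until the receiver has sent the number, respond "BYE".
        ///
        /// Return the sender in state `SenderReadyToSendHello`.
        fn wait_respond_to_hello_you(mut self) -> Sender<SenderReadyToSendHello> {
            // ...
        }

        /// If the receiver has sent the number, respond "BYE" and
        /// return the sender in state `SenderReadyToSendHello`.
        ///
        /// Otherwise, return the unchanged state.
        fn try_respond_to_hello_you(mut self) -> Result<Sender<SenderReadyToSendHello>, Self> {
            // ...
        }
    }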

It is clear that Sender can only follow the protocol:

from step 1 (SenderReadyToSendHello), it can only proceed to step 3;
from step 3 (SenderHasSentHello), it can only remain in step 3 or go to step 5;
from step 5 (SenderHasSentNumber), it can only remain in step 5 or go back to step 1.

Any attempt to deviate from the protocol will be rejected by the type system.
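Here is a compact version of the sender that actually compiles. The I/O layer is replaced by an in-memory `SenderImpl` holding a queue of incoming messages and a list of sent ones. The number is passed in as a parameter instead of being random, and the method `try_respond_to_number` and the helper `run_round` are names chosen for this sketch. All of these are assumptions made to keep the example self-contained, not the author's implementation.

```rust
use std::collections::VecDeque;

/// In-memory stand-in for the real I/O layer (an assumption of this sketch).
struct SenderImpl {
    sent: Vec<String>,
    incoming: VecDeque<String>,
}

// Zero-sized marker types for the states.
struct SenderReadyToSendHello;
struct SenderHasSentHello;
struct SenderHasSentNumber;

struct Sender<S> {
    inner: SenderImpl,
    #[allow(dead_code)] // Never read: it only tracks the state in the type.
    state: S,
}

impl Sender<SenderReadyToSendHello> {
    /// Creates a sender whose peer will answer with `incoming`, in order.
    fn new(incoming: &[&str]) -> Self {
        Sender {
            inner: SenderImpl {
                sent: Vec::new(),
                incoming: incoming.iter().map(|s| s.to_string()).collect(),
            },
            state: SenderReadyToSendHello,
        }
    }

    /// Step 1: send "HELLO" and move to the next state.
    fn send_hello(mut self) -> Sender<SenderHasSentHello> {
        self.inner.sent.push("HELLO".to_string());
        Sender { inner: self.inner, state: SenderHasSentHello }
    }
}

impl Sender<SenderHasSentHello> {
    /// Step 3: on "HELLO, YOU", send the number; otherwise stay in this state.
    fn try_respond_to_hello_you(mut self, number: u32) -> Result<Sender<SenderHasSentNumber>, Self> {
        match self.inner.incoming.pop_front() {
            Some(msg) if msg == "HELLO, YOU" => {
                self.inner.sent.push(number.to_string());
                Ok(Sender { inner: self.inner, state: SenderHasSentNumber })
            }
            _ => Err(self),
        }
    }
}

impl Sender<SenderHasSentNumber> {
    /// Step 5: on the echoed number, send "BYE" and return to the first state.
    fn try_respond_to_number(mut self, number: u32) -> Result<Sender<SenderReadyToSendHello>, Self> {
        match self.inner.incoming.pop_front() {
            Some(msg) if msg == number.to_string() => {
                self.inner.sent.push("BYE".to_string());
                Ok(Sender { inner: self.inner, state: SenderReadyToSendHello })
            }
            _ => Err(self),
        }
    }
}

/// Runs one full round of the protocol and returns the messages sent.
fn run_round() -> Vec<String> {
    let sender = Sender::<SenderReadyToSendHello>::new(&["HELLO, YOU", "42"]);
    // sender.try_respond_to_hello_you(42); // would not compile: wrong state
    let sender = sender.send_hello();
    let sender = sender.try_respond_to_hello_you(42).ok().expect("no HELLO, YOU");
    let sender = sender.try_respond_to_number(42).ok().expect("number not echoed");
    sender.inner.sent
}
```

Each transition takes `self` by value and returns a sender of a different type, so the old state cannot be reused. Calling a method that belongs to another state, as in the commented-out line, is a compile-time error because that method does not exist on the current type.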

When you need to work with network protocols, device drivers, industrial devices with specific safety instructions, or OpenGL / DirectX and the like (in short, anything that involves complex interaction with hardware), you will appreciate this mechanism and the guarantees it provides.

Welcome to the world of type states.

Quick note: beyond "type states"

By the way, continuing our network example, what if we want to store the number sent by Sender so that we can check that the reply matches? We can store the number in SenderHasSentNumber:

    struct SenderHasSentNumber {
        number_sent: u32,
    }

The compiler will check that code accesses number_sent only when the sender is in the SenderHasSentNumber state.
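A minimal sketch of what this looks like in use, with the I/O left out. The methods `send_number` and `reply_matches` are hypothetical names introduced for illustration.

```rust
/// State marker without data.
struct SenderHasSentHello;

/// State that remembers the number we sent.
struct SenderHasSentNumber {
    number_sent: u32,
}

struct Sender<S> {
    state: S,
}

impl Sender<SenderHasSentHello> {
    /// Records the number in the new state (the actual sending is omitted).
    fn send_number(self, number: u32) -> Sender<SenderHasSentNumber> {
        Sender { state: SenderHasSentNumber { number_sent: number } }
    }
}

impl Sender<SenderHasSentNumber> {
    /// Exists only in this state, so `number_sent` is always available here.
    fn reply_matches(&self, reply: u32) -> bool {
        self.state.number_sent == reply
    }
}
```

In any other state the field does not exist, so code that tried to read it there would simply not compile.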

We lose (slightly) in performance: the compiler can no longer optimize away the conversion of Sender between identical representations. This is usually worth it.
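One way to see why the representations stop being identical is to compare sizes. In the sketch below, `Marker`, `WithNumber` and the `u64` field standing in for the I/O handle are all assumptions made for illustration.

```rust
use std::mem::size_of;

/// Zero-sized state marker.
struct Marker;

/// State that carries data.
#[allow(dead_code)]
struct WithNumber {
    number_sent: u32,
}

#[allow(dead_code)]
struct Sender<S> {
    inner: u64, // Stand-in for the real I/O handle (an assumption).
    state: S,
}

/// Sizes of the marker, of a sender with a marker state,
/// and of a sender with a data-carrying state.
fn sizes() -> (usize, usize, usize) {
    (
        size_of::<Marker>(),
        size_of::<Sender<Marker>>(),
        size_of::<Sender<WithNumber>>(),
    )
}
```

The marker takes no space, so a sender in a marker state is no bigger than its `inner` field. Once the state carries a number the struct grows, and moving from one state to the next is no longer a change of type alone.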

Closing words

I hope that this quick demonstration has convinced you of the power of typed move semantics combined with phantom types. It is a great tool for keeping your code safe. It is used in many places in the Rust standard library and in many well-designed third-party libraries.

At the moment I do not know of another programming language that provides typed move semantics (note that C++ has untyped move semantics), but I think other languages will eventually adopt the same mechanism if there is demand for it. As for me, I can no longer do without it :)

Source: https://habr.com/ru/post/350372/
