One of the first things I wrote in Rust was a struct with a &str field. As you might guess, the borrow checker did not let me do much with it and severely limited the expressiveness of my APIs. This article demonstrates the problems that arise when storing raw &str references in struct fields, and ways to solve them. Along the way I will show some intermediate APIs that make such structs easier to use but reduce the efficiency of the generated code. In the end, I want to arrive at an implementation that is both expressive and efficient.
Let's imagine that we are writing a library for the example.io API, and that we sign each call with a token, which we define as follows:
// Token for the example.io API
pub struct Token<'a> {
    raw: &'a str,
}
Then we implement the function new, which creates an instance of the token from a &str.
impl<'a> Token<'a> {
    pub fn new(raw: &'a str) -> Token<'a> {
        Token { raw: raw }
    }
}
Such a naive token works well only for static strings of type &'static str, which are embedded directly in the binary. However, imagine that the user does not want to embed the secret key in the code, or wants to load it from some secret store. We could write this code:
// Assume that a function like secret_from_vault exists
let secret: String = secret_from_vault("api.example.io");
let token = Token::new(&secret[..]);
Such an implementation has a big limitation: the token cannot outlive the secret key, which means that it cannot leave this stack frame.
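Here is a minimal sketch of that limitation. The accessor as_str and the function token_len are hypothetical names added only for this illustration, and a plain String stands in for the value loaded from the vault.

```rust
// Borrowing token, as defined above.
pub struct Token<'a> {
    raw: &'a str,
}

impl<'a> Token<'a> {
    pub fn new(raw: &'a str) -> Token<'a> {
        Token { raw }
    }

    // Hypothetical accessor, added so the example has something to read.
    pub fn as_str(&self) -> &str {
        self.raw
    }
}

// Works: the token is created and dropped while `secret` is still alive.
pub fn token_len(secret: String) -> usize {
    let token = Token::new(&secret[..]);
    token.as_str().len()
}

// Would NOT compile: the returned token would borrow from `secret`,
// which is dropped at the end of the function.
//
// pub fn make_token<'a>() -> Token<'a> {
//     let secret = String::from("abc123");
//     Token::new(&secret[..])
// }

fn main() {
    println!("{}", token_len(String::from("abc123")));
}
```

The token is fine as long as it stays inside the scope that owns the secret; the commented-out function is the case the borrow checker rejects.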
What if Token stored a String instead of a &str? This would let us drop the lifetime parameter on the struct, turning it into an owning type.
Let's make changes to the token and the new function.
struct Token {
    raw: String,
}

impl Token {
    pub fn new(raw: String) -> Token {
        Token { raw: raw }
    }
}
Every call site that supplies a String must be updated:
// The String is now passed by value
let token = Token::new(secret_from_vault("api.example.io"));
However, this hurts usability with &'static str. For example, this code will not compile:
// Does not compile: new expects a String, not a &str
let token = Token::new("abc123");
The user of this API will have to explicitly convert the &'static str to a String.
let token = Token::new(String::from("abc123"));
You could make new take a &str instead of a String and hide String::from in the implementation, but then passing a String becomes less convenient and requires an extra heap allocation. Let's see how it looks.
// new now takes a &str
impl Token {
    pub fn new(raw: &str) -> Token {
        Token { raw: String::from(raw) }
    }
}

// Works nicely with &str
let token = Token::new("abc123");

// With a String, we have to borrow it to call new
let secret = secret_from_vault("api.example.io");
let token = Token::new(&secret[..]); // extra allocation!
However, there is a way to make new take arguments of both types without having to allocate memory in the case of a String.
The standard library has a trait called Into that will help solve our problem with new. Its definition looks like this:
pub trait Into<T> {
    fn into(self) -> T;
}
The into function is quite simple: it consumes self (something that implements Into) and returns a value of type T. Here is an example of how this can be used:
impl Token {
    // Create a new token.
    //
    // Accepts both &str and String.
    pub fn new<S>(raw: S) -> Token
        where S: Into<String>
    {
        Token { raw: raw.into() }
    }
}

// &str
let token = Token::new("abc123");

// String
let token = Token::new(secret_from_vault("api.example.io"));
Several interesting things happen here. First, the function takes an argument raw of generic type S, and the where clause restricts S to types that implement the trait Into<String>.
Since the standard library already provides Into<String> for &str and String, our case is handled without any extra work. [1]
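To check that passing a String really avoids a new allocation, here is a small sketch that compares buffer addresses before and after the conversion. The helpers raw_ptr, as_str and reuses_buffer are hypothetical names introduced only for this illustration.

```rust
// Owning token with the generic constructor from above.
pub struct Token {
    raw: String,
}

impl Token {
    pub fn new<S>(raw: S) -> Token
    where
        S: Into<String>,
    {
        Token { raw: raw.into() }
    }

    // Hypothetical helper: address of the token's heap buffer.
    pub fn raw_ptr(&self) -> *const u8 {
        self.raw.as_ptr()
    }

    // Hypothetical accessor for the stored string.
    pub fn as_str(&self) -> &str {
        &self.raw
    }
}

// Hypothetical helper: true if the token kept the caller's buffer,
// i.e. the String was moved in rather than copied.
pub fn reuses_buffer(secret: String) -> bool {
    let original = secret.as_ptr();
    let token = Token::new(secret);
    token.raw_ptr() == original
}

fn main() {
    // A String is moved into the token without copying its bytes.
    println!("String reused: {}", reuses_buffer(String::from("abc123")));

    // A &str is still accepted, but it is copied into a new String.
    let token = Token::new("abc123");
    println!("from &str: {}", token.as_str());
}
```

For a String, into() is just a move, so the heap buffer is the same one the caller allocated.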
Although this API is now much more convenient to use, it still has a noticeable flaw: passing a &str to new requires a memory allocation so that it can be stored as a String.
The standard library has a special container called std::borrow::Cow, which lets us keep the convenience of Into<String> on the one hand, and on the other lets the struct hold borrowed values of type &str.
Here is the scary-looking definition of Cow:
pub enum Cow<'a, B> where B: 'a + ToOwned + ?Sized {
    Borrowed(&'a B),
    Owned(B::Owned),
}
Let's break this definition down.
Cow<'a, B> has two generic parameters: the lifetime 'a and some generic type B, which has the following bounds: 'a + ToOwned + ?Sized.
Let's look at them in more detail:
'a - B cannot have a shorter lifetime than 'a.
ToOwned - B must implement the ToOwned trait, which allows borrowed data to be turned into owned data by making a copy of it.
?Sized - The size of type B may be unknown at compile time. This does not matter in our case, but it means that trait objects can be used with Cow.
There are two variants that the Cow container can hold:
Borrowed(&'a B) - A reference to a value of type B; the container's lifetime 'a is exactly that of the borrowed value.
Owned(B::Owned) - The container owns a value of the associated type B::Owned.
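Substituting str for B, the definition effectively becomes the following (pseudocode, not valid Rust):
enum Cow<'a, str> {
    Borrowed(&'a str),
    Owned(String),
}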
In short, Cow<'a, str> will either be a &str with lifetime 'a, or it will be a String that is not tied to this lifetime.
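To see both variants side by side, here is a minimal sketch. The function describe is a hypothetical helper written only for this illustration.

```rust
use std::borrow::Cow;

// Hypothetical helper: reports which variant a Cow<str> currently holds.
pub fn describe(value: &Cow<str>) -> &'static str {
    match value {
        Cow::Borrowed(_) => "borrowed",
        Cow::Owned(_) => "owned",
    }
}

fn main() {
    let borrowed: Cow<str> = Cow::Borrowed("abc123");
    let owned: Cow<str> = Cow::Owned(String::from("abc123"));

    // Both variants deref to str, so they can be read in the same way.
    println!("{} ({})", &*borrowed, describe(&borrowed));
    println!("{} ({})", &*owned, describe(&owned));
    assert_eq!(borrowed.len(), owned.len());
}
```

Code that only reads the string does not need to care which variant it was given, because Cow<str> dereferences to str.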
That sounds like a good fit for our Token type. It will be able to store either a &str or a String.
use std::borrow::Cow;

struct Token<'a> {
    raw: Cow<'a, str>
}

impl<'a> Token<'a> {
    pub fn new(raw: Cow<'a, str>) -> Token<'a> {
        Token { raw: raw }
    }
}

// Create the tokens
let token = Token::new(Cow::Borrowed("abc123"));

let secret: String = secret_from_vault("api.example.io");
let token = Token::new(Cow::Owned(secret));
Now Token can be created from either an owned type or a borrowed type, but using the API has become less convenient. Into can bring the same improvement to our Cow<'a, str> as it did for a plain String earlier. The final implementation of the token looks like this:
struct Token<'a> {
    raw: Cow<'a, str>
}

impl<'a> Token<'a> {
    pub fn new<S>(raw: S) -> Token<'a>
        where S: Into<Cow<'a, str>>
    {
        Token { raw: raw.into() }
    }
}

// Create the tokens.
let token = Token::new("abc123");
let token = Token::new(secret_from_vault("api.example.io"));
Now the token can be transparently created from either a &str or a String. The lifetime attached to the token is no longer a problem for data created on the stack. You can even send a token between threads!
use std::thread;

// Note: Token must derive Debug for {:?} to work
let raw = String::from("abc");
let token_owned = Token::new(raw);
let token_static = Token::new("123");

thread::spawn(move || {
    println!("token_owned: {:?}", token_owned);
    println!("token_static: {:?}", token_static);
}).join().unwrap();
However, an attempt to send a token that holds a reference with a non-static lifetime will fail.
// Build a token from a reference with a non-static lifetime
let raw = String::from("abc");
let s = &raw[..];
let token = Token::new(s);

// This does not compile
thread::spawn(move || {
    println!("token: {:?}", token);
}).join().unwrap();
Indeed, the example above fails to compile with the error:
error: `raw` does not live long enough
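One possible way around this, which is my own addition and not something the article's Token provides, is to convert the token into a fully owned one before sending it. The sketch below assumes hypothetical methods into_owned and as_str and a hypothetical function send_to_thread; Cow::into_owned itself is part of the standard library.

```rust
use std::borrow::Cow;
use std::thread;

#[derive(Debug)]
pub struct Token<'a> {
    raw: Cow<'a, str>,
}

impl<'a> Token<'a> {
    pub fn new<S>(raw: S) -> Token<'a>
    where
        S: Into<Cow<'a, str>>,
    {
        Token { raw: raw.into() }
    }

    // Hypothetical helper: copies borrowed data (if any) so that the
    // resulting token no longer depends on the lifetime 'a.
    pub fn into_owned(self) -> Token<'static> {
        Token { raw: Cow::Owned(self.raw.into_owned()) }
    }

    // Hypothetical accessor for the stored string.
    pub fn as_str(&self) -> &str {
        &self.raw
    }
}

// Builds a token from a short-lived borrow, detaches it from that borrow,
// and reads it on another thread.
pub fn send_to_thread(raw: String) -> String {
    let token = Token::new(&raw[..]).into_owned();
    thread::spawn(move || token.as_str().to_string())
        .join()
        .unwrap()
}

fn main() {
    println!("{}", send_to_thread(String::from("abc")));
}
```

The price is one allocation at the point where the token has to outlive its source, which is paid only by callers who actually need that.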
If you would like more examples, take a look at the PagerDuty API client, which uses Cow extensively.
Thank you for reading!
[1] If you go looking for implementations of Into<String> for &str and String, you will not find them. This is because there is a blanket implementation of Into for every type T whose target type implements From<T>. It looks like this:
impl<T, U> Into<U> for T where U: From<T> {
    fn into(self) -> U {
        U::from(self)
    }
}
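As a quick sketch of what this means in practice, implementing only From for your own type is enough to get Into for free. ApiKey and make_key are hypothetical names used just for this illustration.

```rust
// Hypothetical type used only for this illustration.
#[derive(Debug, PartialEq)]
pub struct ApiKey(String);

// We implement only From<&str> for our type...
impl<'a> From<&'a str> for ApiKey {
    fn from(raw: &'a str) -> ApiKey {
        ApiKey(raw.to_string())
    }
}

// ...and the blanket implementation makes &str satisfy Into<ApiKey>.
pub fn make_key<S: Into<ApiKey>>(raw: S) -> ApiKey {
    raw.into()
}

fn main() {
    let key = make_key("abc123");
    println!("{:?}", key);
}
```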
Translator's note: the original article says nothing about how Cow works or about copy-on-write semantics.
In brief: as long as the container only borrows its data, no copy of the data is made; the data is actually cloned only when you try to modify the value stored inside the container.
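In Rust this behaviour is exposed through Cow::to_mut, which clones borrowed data into an owned value the first time it is called. Here is a small sketch; with_suffix is a hypothetical function made up for this illustration.

```rust
use std::borrow::Cow;

// Hypothetical helper: appends a suffix only when one is given.
pub fn with_suffix<'a>(input: &'a str, suffix: Option<&str>) -> Cow<'a, str> {
    let mut value = Cow::Borrowed(input);
    if let Some(s) = suffix {
        // to_mut clones the borrowed data into a String on the first write.
        value.to_mut().push_str(s);
    }
    value
}

fn main() {
    // No suffix: the result still borrows the input, nothing is copied.
    let untouched = with_suffix("abc", None);
    // With a suffix: the data is cloned and then modified.
    let changed = with_suffix("abc", Some("123"));
    println!("{} {}", untouched, changed);
}
```

Callers that never modify the value never pay for an allocation.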
Source: https://habr.com/ru/post/282708/