Possible Rust

Learning what’s possible in Rust.
3 Things to Try When You Can't Make a Trait Object

Feb 2nd, 2021 · Pattern · #trait objects

Trait objects are Rust’s usual mechanism for dynamic dispatch, and when they work they’re wonderful. Still, many Rust programmers have struggled with the question of when a trait can become a trait object, and what to do when a trait they’re using can’t. This post describes several options for handling a trait that can’t be made into a trait object, discusses their trade-offs, and explains exactly what the trait object limitations are and why they exist in the first place.

Imagine I have a type containing a collection of optional fields, and I want to serialize only the fields that are present. Imagine that it’s a Url type, and that the code looks something like this.

use serde::{ser::SerializeStruct, Serialize, Serializer};
use std::convert::From;
use url::Url as FullUrl;

// This `Url` may or may not contain any of these fields.
struct Url<'a> {
    scheme: Option<&'a str>,
    username: Option<&'a str>,
    password: Option<&'a str>,
    host: Option<&'a str>,
    path: Option<&'a str>,
    fragment: Option<&'a str>,
    query: Option<&'a str>,
}

impl<'a> From<&'a FullUrl> for Url<'a> {
    // We construct this `Url` from the `url` crate's `Url` type
    // (called `FullUrl` here).
    fn from(other: &FullUrl) -> Url<'_> {
        let scheme = Some(other.scheme());
        let username = match other.username() {
            "" => None,
            u => Some(u),
        };
        let password = other.password();
        let host = other.host_str();
        let path = Some(other.path());
        let fragment = other.fragment();
        let query = other.query();

        Url {
            scheme,
            username,
            password,
            host,
            path,
            fragment,
            query,
        }
    }
}

impl<'a> Serialize for Url<'a> {
    fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
    where
        S: Serializer,
    {
        // We need to know the number of fields we're serializing
        // ahead of time when we serialize a struct, so we'll make a
        // Vec and fill it, then use `Vec::len` to get the number
        // of fields we're actually serializing.
        let mut fields: Vec<(&str, Box<dyn Serialize>)> = Vec::new();

        if let Some(scheme) = self.scheme {
            fields.push(("scheme", Box::new(scheme)));
        }

        if let Some(username) = self.username {
            fields.push(("username", Box::new(username)));
        }

        if let Some(password) = self.password {
            fields.push(("password", Box::new(password)));
        }

        if let Some(host) = self.host {
            fields.push(("host", Box::new(host)));
        }

        if let Some(path) = self.path {
            fields.push(("path", Box::new(path)));
        }

        if let Some(fragment) = self.fragment {
            fields.push(("fragment", Box::new(fragment)));
        }

        if let Some(query) = self.query {
            fields.push(("query", Box::new(query)));
        }

        let mut url = serializer.serialize_struct("Url", fields.len())?;

        for (name, value) in &fields {
            url.serialize_field(name, value)?;
        }

        url.end()
    }
}
Code 1

An initial attempt to use a trait object to optionally serialize fields based on their presence in the struct.

This example doesn’t compile because you can’t turn Serialize into a trait object (more on why that is later). In this situation, you have several options available.

Option 1: Try an Enum

As described in Enum or Trait Object, trait objects represent an open set of types, while enums represent a closed set of types. With a trait object, any type that implements the trait may be converted to a trait object and used as one. With an enum, only types present in the variants of the enum can be represented. Trait objects are more accepting, but because we know less about them, they’re also more restrictive.

Sometimes we may be trying to use a trait object in a context where an enum will do.

If you’re trying to turn a trait into a trait object and failing, ask yourself if you know all the possible types you’d want to use in the place of that trait object. If you know all of those types, define an enum with a variant per type instead, and use that enum where you’re currently trying to use the trait object.

In the case of our Url example, we can use this strategy easily!

use serde::{ser::SerializeStruct, Serialize, Serializer};
use std::convert::From;
use url::Url as FullUrl;

struct Url<'a> {
    scheme: Option<&'a str>,
    username: Option<&'a str>,
    password: Option<&'a str>,
    host: Option<&'a str>,
    path: Option<&'a str>,
    fragment: Option<&'a str>,
    query: Option<&'a str>,
}

impl<'a> From<&'a FullUrl> for Url<'a> {
    fn from(other: &FullUrl) -> Url<'_> {
        let scheme = Some(other.scheme());
        let username = match other.username() {
            "" => None,
            u => Some(u),
        };
        let password = other.password();
        let host = other.host_str();
        let path = Some(other.path());
        let fragment = other.fragment();
        let query = other.query();

        Url {
            scheme,
            username,
            password,
            host,
            path,
            fragment,
            query,
        }
    }
}

// This enum has a variant for each field.
enum Field<'a> {
    Scheme(&'a str),
    Username(&'a str),
    Password(&'a str),
    Host(&'a str),
    Path(&'a str),
    Fragment(&'a str),
    Query(&'a str),
}

impl<'a> Serialize for Url<'a> {
    fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
    where
        S: Serializer,
    {
        // Notice we're using `Field` instead of `Box<dyn Serialize>` now!
        let mut fields: Vec<(&str, Field)> = Vec::new();

        if let Some(scheme) = self.scheme {
            fields.push(("scheme", Field::Scheme(scheme)));
        }

        if let Some(username) = self.username {
            fields.push(("username", Field::Username(username)));
        }

        if let Some(password) = self.password {
            fields.push(("password", Field::Password(password)));
        }

        if let Some(host) = self.host {
            fields.push(("host", Field::Host(host)));
        }

        if let Some(path) = self.path {
            fields.push(("path", Field::Path(path)));
        }

        if let Some(fragment) = self.fragment {
            fields.push(("fragment", Field::Fragment(fragment)));
        }

        if let Some(query) = self.query {
            fields.push(("query", Field::Query(query)));
        }

        let mut url = serializer.serialize_struct("Url", fields.len())?;

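        // `serialize_field` takes a `&'static str` key, so copy each name
        // out of the tuple rather than borrowing it from `fields`.
        for &(name, ref value) in &fields {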
            // Manually dispatching based on the field variant we have.
            match value {
                Field::Scheme(s) => url.serialize_field(name, s)?,
                Field::Username(s) => url.serialize_field(name, s)?,
                Field::Password(s) => url.serialize_field(name, s)?,
                Field::Host(s) => url.serialize_field(name, s)?,
                Field::Path(s) => url.serialize_field(name, s)?,
                Field::Fragment(s) => url.serialize_field(name, s)?,
                Field::Query(s) => url.serialize_field(name, s)?,
            }
        }

        url.end()
    }
}
Code 2

Fixing our initial example using an enum representing the fields.

In this case, one trade-off is that you end up implementing the dispatch to each of the variants yourself, which is more manual work, but has the benefit of compiling.

note Info 1 Yes, this example is slightly contrived

Technically, this example is contrived because each of the fields has the same type, &str, so you could just use &str directly instead of trying to use a trait object at all, or even an enum. In that case, use the type itself. Hopefully it still illustrates the concept.
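
To make that concrete, here is a rough sketch of the simpler approach: since every field is a &str, the present fields can be gathered as plain name/value pairs with no enum or trait object. The present_fields helper is a name invented for this illustration, the struct is trimmed to three fields, and the Serde calls are left out so the snippet stands alone.

```rust
// Trimmed-down version of the `Url` type from the examples above.
struct Url<'a> {
    scheme: Option<&'a str>,
    host: Option<&'a str>,
    query: Option<&'a str>,
}

// Hypothetical helper: gathers only the fields that are present.
// Every value has the same type, so a plain `&str` is enough.
fn present_fields<'a>(url: &Url<'a>) -> Vec<(&'static str, &'a str)> {
    let candidates = [
        ("scheme", url.scheme),
        ("host", url.host),
        ("query", url.query),
    ];

    candidates
        .iter()
        .filter_map(|&(name, value)| value.map(|v| (name, v)))
        .collect()
}

fn main() {
    let url = Url {
        scheme: Some("https"),
        host: Some("example.com"),
        query: None,
    };

    for (name, value) in present_fields(&url) {
        println!("{}: {}", name, value);
    }
}
```

In a real Serialize impl, the length of the returned Vec would play the same role as fields.len() in the examples above.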

Option 2: Try Type Erasure

This second option is a bit more complex, and comes courtesy of the inimitable David Tolnay and his erased-serde library. In this library, David illustrates an interesting trick, which I’ll show here.

// This code taken from David Tolnay's erased-serde library,
// with modifications to explain the parts.

// Suppose these are the real traits from Serde.
trait Querializer {}

// This trait is not object safe.
trait Generic {
    fn generic_fn<Q: Querializer>(&self, querializer: Q);
}

// Implement the trait for reference to some type `T` which
// implements the trait.
impl<'a, T: ?Sized> Querializer for &'a T where T: Querializer {}

// Implement the trait for boxed pointers to some type `T` which
// implements the trait.
impl<T: ?Sized> Generic for Box<T>
where
    T: Generic,
{
    fn generic_fn<Q: Querializer>(&self, querializer: Q) {
        (**self).generic_fn(querializer)
    }
}

/////////////////////////////////////////////////////////////////////
// This is an object-safe equivalent that interoperates seamlessly.

trait ErasedGeneric {
    // Replace the generic parameter with a trait object.
    fn erased_fn(&self, querializer: &dyn Querializer);
}

// Impl the not-object-safe trait for a trait object of the
// object-safe trait.
impl Generic for dyn ErasedGeneric {
    // Depending on the trait method signatures and the upstream
    // impls, could also implement for:
    //
    //   - &'a dyn ErasedGeneric
    //   - &'a (dyn ErasedGeneric + Send)
    //   - &'a (dyn ErasedGeneric + Sync)
    //   - &'a (dyn ErasedGeneric + Send + Sync)
    //   - Box<dyn ErasedGeneric>
    //   - Box<dyn ErasedGeneric + Send>
    //   - Box<dyn ErasedGeneric + Sync>
    //   - Box<dyn ErasedGeneric + Send + Sync>
    fn generic_fn<Q: Querializer>(&self, querializer: Q) {
        self.erased_fn(&querializer)
    }
}

// If `T` impls the not-object-safe trait, it impls the
// object-safe trait too.
impl<T> ErasedGeneric for T
where
    T: Generic,
{
    fn erased_fn(&self, querializer: &dyn Querializer) {
        self.generic_fn(querializer)
    }
}

fn main() {
    struct T;
    impl Querializer for T {}

    struct S;
    impl Generic for S {
        fn generic_fn<Q: Querializer>(&self, _querializer: Q) {
            println!("querying the real S");
        }
    }

    // Construct a trait object.
    let trait_object: Box<dyn ErasedGeneric> = Box::new(S);

    // Seamlessly invoke the generic method on the trait object.
    //
    // THIS LINE LOOKS LIKE MAGIC. We have a value of type trait
    // object and we are invoking a generic method on it.
    trait_object.generic_fn(T);
}
Code 3

An example of the type erasure pattern, courtesy of David Tolnay.

The type erasure trick shown here has three steps. Start with a trait that’s not object safe because some of its methods have generic parameters (as is the case in Serde). Then create an equivalent trait which replaces the generic parameters with trait objects (this requires that the traits used as bounds on those generic parameters are themselves object safe). Finally, impl the not-object-safe trait for trait objects of the object-safe trait.

If that was a bit dense, don’t worry. What’s happening here is we’re resolving the central problem of making a trait object for a trait with a generic parameter, namely: how do I know what particular code to dispatch to? With a generic parameter, that code may vary based on the parameter present. With a trait object, the dispatching can be resolved dynamically, so you’re all good.

The trick of implementing a trait for a trait object is also neat. Trait objects are their own types, implemented as two pointers (one to the data and one to the vtable containing the information necessary for dynamic dispatch), and as their own types they can implement traits.
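
As a small, hedged illustration of both points, the sketch below checks the size of a trait object reference and implements one trait for the trait object type of another. Greet, Describe, and English are made-up names, and the two-pointer size reflects how Rust currently lays out these references rather than a promise from this post.

```rust
use std::mem::size_of;

trait Greet {
    fn name(&self) -> String;
}

// A second trait, implemented below for the trait object type itself.
trait Describe {
    fn describe(&self) -> String;
}

// `dyn Greet` is a type in its own right, so it can have trait impls.
impl Describe for dyn Greet {
    fn describe(&self) -> String {
        format!("a trait object named {}", self.name())
    }
}

struct English;

impl Greet for English {
    fn name(&self) -> String {
        String::from("English")
    }
}

fn main() {
    // A reference to a trait object is two pointers wide (data + vtable)...
    assert_eq!(size_of::<&dyn Greet>(), 2 * size_of::<usize>());
    // ...while a reference to a sized type is a single pointer.
    assert_eq!(size_of::<&English>(), size_of::<usize>());

    // `describe` comes from the impl on `dyn Greet`, not on `English`.
    let object: Box<dyn Greet> = Box::new(English);
    println!("{}", object.describe());
}
```

Note that English itself never implements Describe; only the trait object does, which is the same move Code 3 makes with Generic and dyn ErasedGeneric.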

The magic here is that once this erased version of a trait is present, you can create the types you need and have them impl the original trait as normal, and then simply create a trait object of the erased trait instead. Because of the blanket impl saying “anything that implements the original trait impls the erased trait too,” you’re all good to make this trait object, and then to call functions on the original trait exactly as you were before, because the erased trait object you’ve made impls the original trait!

note Info 2 Check out erased-serde to see how it works!

I’m not going to include the modified version of my example here, because the code for an erased version of Serialize that’s needed is a bit more complicated than can be shown in a blog-post level code snippet (in particular, there’s some trickiness around defining the Ok variant of the Result returned by Serialize::serialize). I encourage you to check out the library yourself to see how it’s done!

One trade-off with this approach is that you’ve replaced what would previously have been a static dispatch to any methods on your generic type parameter with dynamic dispatch on a trait object. The benefit is that it now compiles!

One limitation is that there are other reasons a trait may not be object safe, and this doesn’t address those cases. This trick only resolves the problem when a trait isn’t object safe due to the presence of generic parameters in its methods.

Option 3: Change the Trait

Finally, and most frustratingly, you can try to change the trait. If you own the definition of the trait, maybe this isn’t too bad. If you don’t own the trait definition, then it may be challenging to convince the owner of the relevant crate to change their definition. At the very least, you’ll have moved out of the realm of things which you immediately control.

In terms of how to change the trait, for that you’ll need to understand exactly why a trait may not be object safe, and so we finally come to the rules for object safety.

What makes a trait object safe?

So, what is object safety, and when is a trait object safe?

“Object safety” is a property of a trait that says “this trait can be turned into a trait object.” It’s accompanied by a set of rules (this is a simplification of the rules, and omits a number of complexities and corner cases):

  • A trait can’t do any of the following:
    • Have methods that take self by value as the receiver without a Self: Sized bound.
    • Include any associated functions (which don’t take self at all).
    • Have methods that reference the Self type outside of the receiver position, except to access associated types for Self’s impl of the current trait or any supertraits.
    • Have methods that include any generic type parameters.

As mentioned in “How to Read Rust Functions, Part 1,” unsized receivers don’t work: a value passed by value needs a size known at compile time. For a trait object, where the receiver is inherently an unsized type (it doesn’t implement Sized), taking self by value isn’t permissible.

Associated functions are disallowed because an associated function with no other parameters is nonsensical (where would any data to operate on come from?), and associated functions with other parameters ought to just be free functions which are handled outside of the trait object system.
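
One escape hatch is worth sketching here: an associated function can stay in an object-safe trait if it carries a where Self: Sized bound, which keeps it off the trait object while leaving it available on concrete types. The Shape trait, Square type, and total_area function below are hypothetical names used only for illustration.

```rust
trait Shape {
    // An associated function (no `self`). The `where Self: Sized` bound
    // excludes it from trait objects, so the trait stays object safe.
    fn unit() -> Self
    where
        Self: Sized;

    fn area(&self) -> f64;
}

struct Square(f64);

impl Shape for Square {
    fn unit() -> Self {
        Square(1.0)
    }

    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

// Only `area` is reachable here; `unit` does not exist on `dyn Shape`.
fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|shape| shape.area()).sum()
}

fn main() {
    // `unit` is still callable on the concrete type...
    let shapes: Vec<Box<dyn Shape>> =
        vec![Box::new(Square::unit()), Box::new(Square(2.0))];
    // ...and `area` is dispatched dynamically through the trait object.
    println!("total area: {}", total_area(&shapes));
}
```

Removing the where clause from unit should make the Box<dyn Shape> lines fail to compile, which is a quick way to see the rule in action.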

References to Self outside of the receiver are disallowed because the type of Self is erased at runtime; the only thing that’s known is how to dispatch based on the info in the vtable. So any reference to Self elsewhere would be referencing an unknown (at runtime) type. The one exception is that accessing associated types for the current trait or any supertrait is permitted, as that information is known: the associated types are named as part of the trait object’s type (as in dyn Iterator<Item = u32>).
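
A minimal sketch of that exception, using invented names (Source, Countdown, drain): the method below mentions Self only to name an associated type, and the trait can still be made into a trait object.

```rust
trait Source {
    type Item;

    // `Self::Item` mentions `Self`, but only to name an associated type
    // of this trait, which the exception above permits.
    fn next_item(&mut self) -> Option<Self::Item>;
}

struct Countdown(u32);

impl Source for Countdown {
    type Item = u32;

    fn next_item(&mut self) -> Option<u32> {
        if self.0 == 0 {
            None
        } else {
            self.0 -= 1;
            Some(self.0 + 1)
        }
    }
}

// The trait object type has to spell out the associated type.
fn drain(mut source: Box<dyn Source<Item = u32>>) -> Vec<u32> {
    let mut items = Vec::new();
    while let Some(item) = source.next_item() {
        items.push(item);
    }
    items
}

fn main() {
    println!("{:?}", drain(Box::new(Countdown(3))));
}
```

By contrast, a method returning plain Self (as Clone::clone does) has no such fixed type to fall back on, which is why it runs into the rule.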

Finally, generic methods are disallowed because generic methods in Rust are monomorphized at compile time, converted into distinct functions for each concrete type they’re called with, and there’s no clear way to monomorphize in the presence of a trait object as self.
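
When you own the trait, one common way around this rule is to take a trait object where the generic parameter used to be. The sketch below is only an illustration under that assumption; Report, Passed, Failed, and render_all are hypothetical names, and std::fmt::Write stands in for whatever bound the generic parameter had.

```rust
use std::fmt::{self, Write};

trait Report {
    // A generic version, `fn render<W: Write>(&self, out: &mut W)`,
    // would need one compiled copy per `W`, so it can't be called
    // through a vtable. Taking `&mut dyn Write` avoids the generic.
    fn render(&self, out: &mut dyn Write) -> fmt::Result;
}

struct Passed(u32);
struct Failed(&'static str);

impl Report for Passed {
    fn render(&self, out: &mut dyn Write) -> fmt::Result {
        write!(out, "{} passed", self.0)
    }
}

impl Report for Failed {
    fn render(&self, out: &mut dyn Write) -> fmt::Result {
        write!(out, "failed: {}", self.0)
    }
}

// Works on a mixed list because `Report` is object safe.
fn render_all(reports: &[Box<dyn Report>]) -> Result<String, fmt::Error> {
    let mut out = String::new();
    for report in reports {
        report.render(&mut out)?;
        out.push('\n');
    }
    Ok(out)
}

fn main() {
    let reports: Vec<Box<dyn Report>> =
        vec![Box::new(Passed(3)), Box::new(Failed("timeout"))];
    print!("{}", render_all(&reports).unwrap());
}
```

The cost is the same one noted for type erasure: calls on the writer now go through dynamic dispatch instead of being resolved statically.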

So, all of these rules collectively ensure that nonsensical code which would fail to function is disallowed. They’re not optional rules, although they can be annoying at times.

Conclusion

Hopefully this has been a helpful guide through resolving the thorny problem of not being able to turn a trait into a trait object. To review, in that situation:

  • If you know the set of types you’ll use for the trait object, make an enum.
  • If the trait isn’t object safe due to the presence of generic parameters, use the type erasure technique.
  • Otherwise, change the trait to address whatever other issues are making it non-object-safe.

note Info 3 Thank You

A huge thank you to Huon Wilson’s writing about object safety, which is still (~6 years later) the best on the subject. Thank you as well to the creators of the implementation of object safety checks in the Rust compiler, to which I referred while writing this section. The full set of rules is more complex than I’ve reflected here, and I encourage anyone deeply interested to read the source.

Possible Rust succeeds when more people can join in learning about Rust! Please take a moment to share, especially if you have questions or if you disagree!
