If you’ve worked with Rust for a while, you’ve probably heard the phrase “making illegal states unrepresentable”. It’s a phrase that’s often used when people praise Rust’s type system. But what exactly does it mean? And how can you apply it to your own code?
What Is Illegal State?
Imagine we’re writing an application that manages a list of users.
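A first attempt at the user model might look like the sketch below. The field names are assumptions for illustration, and the small `Date` struct is only a stand-in for a real date type (for example one from the chrono crate) so that the snippet has no dependencies.

```rust
// Stand-in for a real date type such as chrono's NaiveDate.
// Assumption: defined locally only to keep this sketch dependency-free.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Date {
    year: i32,
    month: u32,
    day: u32,
}

// Naive model: the username is just a plain String,
// so nothing restricts which values it can hold.
#[derive(Debug)]
struct User {
    username: String,
    birthdate: Date,
}
```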
Looks simple enough, but is it correct?
What happens if we create a user with an empty username?
let user = User { username: "".to_string(), birthdate };
Intuitively, we know that this is not what we want, but the compiler can’t help us. We did not give it enough information about usernames. Already, with this simple example, we managed to introduce illegal state.
Now, how can we fix this?
The Type System Is Your Friend
Consider the String type. It's a type that represents an arbitrary sequence of Unicode characters. In our case, we need much stricter constraints. For a start, we want to make sure that the username is not empty.
Whenever you’re uncertain how to model something in Rust, start by defining your basic types — your domain. That takes some practice, but your code will be much better for it.
In our case, we want to define a type that represents a username.
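Here is a minimal sketch of such a type, assuming a tuple struct around String and a plain string slice as the error type. The error message is illustrative; a real project might prefer a dedicated error enum.

```rust
/// A username that is guaranteed to be non-empty
/// (as long as it is created through `Username::new`).
#[derive(Debug, Clone, PartialEq)]
struct Username(String);

impl Username {
    /// Validates the input once, at construction time.
    fn new(username: String) -> Result<Self, &'static str> {
        if username.is_empty() {
            return Err("Username cannot be empty");
        }
        Ok(Self(username))
    }
}
```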
Note how the constructor now returns a Result. Also note that wrapping the String in a struct is a zero-cost abstraction. The compiler will optimize it away, so there's no performance penalty!
We can now use this type in our User struct.
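As a sketch, the updated struct could look like this. The `Date` struct is again a hypothetical stand-in for a real date type, and the Username definition is repeated only so the snippet compiles on its own.

```rust
#[derive(Debug, PartialEq)]
struct Username(String);

impl Username {
    fn new(username: String) -> Result<Self, &'static str> {
        if username.is_empty() {
            return Err("Username cannot be empty");
        }
        Ok(Self(username))
    }
}

// Stand-in for a real date type (assumption, keeps the sketch dependency-free).
#[derive(Debug, PartialEq)]
struct Date {
    year: i32,
    month: u32,
    day: u32,
}

#[derive(Debug, PartialEq)]
struct User {
    username: Username, // previously: String
    birthdate: Date,
}
```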
See how the compiler now guides us towards idiomatic Rust code?
It’s subtle, but username is now of type Username instead of String.
This means we have much stronger guarantees around our own type as we can’t accidentally create a user with an empty username.
The username has to be constructed first:
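let username = Username::new("johndoe".to_string())?; // example value
let birthdate = NaiveDate::from_ymd(1990, 1, 1); // example date, using chrono
let user = User { username, birthdate };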
Side Note: How do we get rid of Username::new?
You could implement TryFrom:
use std::convert::TryFrom;
let user = User { username: "johndoe".to_string().try_into()?, birthdate };
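The TryFrom implementation itself could look roughly like the sketch below. It assumes the same non-empty check as before; `parse_username` is a hypothetical helper that only exists to show the call site.

```rust
use std::convert::{TryFrom, TryInto};

#[derive(Debug, PartialEq)]
struct Username(String);

impl TryFrom<String> for Username {
    type Error = &'static str;

    fn try_from(value: String) -> Result<Self, Self::Error> {
        if value.is_empty() {
            return Err("Username cannot be empty");
        }
        Ok(Self(value))
    }
}

// Hypothetical helper: `try_into` is available on String automatically,
// via the blanket TryInto impl that the standard library derives from TryFrom.
fn parse_username(raw: &str) -> Result<Username, &'static str> {
    raw.to_string().try_into()
}
```

The validation logic is unchanged; only the way callers spell the conversion differs.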
What About the Birthdate?
A new user that is 1000 years old is probably not a valid user. Let’s add some constraints.
use chrono::Datelike;
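To keep things self-contained, the following sketch does not use chrono. It reduces the birthdate to a birth year and takes the current year as a parameter. Both simplifications, as well as the 150-year limit, are assumptions made for illustration.

```rust
/// Oldest age we are willing to accept (an arbitrary, illustrative limit).
const MAX_AGE_YEARS: i64 = 150;

/// A validated birth year. Simplified: a real type would wrap a full date.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Birthdate {
    year: i32,
}

impl Birthdate {
    /// `current_year` is passed in instead of being read from the system
    /// clock, which keeps the check deterministic.
    fn new(year: i32, current_year: i32) -> Result<Self, &'static str> {
        if year > current_year {
            return Err("Birthdate cannot be in the future");
        }
        // Widen to i64 so the subtraction cannot overflow for extreme inputs.
        let age = i64::from(current_year) - i64::from(year);
        if age > MAX_AGE_YEARS {
            return Err("Birthdate is too far in the past");
        }
        Ok(Self { year })
    }
}
```

Because the constructor is a pure function of its arguments, a check such as `Birthdate::new(1024, 2024).is_err()` needs no clock and no fixtures.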
No mocking, no complicated setup, testing becomes a breeze.
Our User struct now looks like this:
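A compact sketch, with the constructors of both newtypes left out for brevity and the birthdate simplified to a year (an assumption, not a recommendation):

```rust
// Validated newtypes; their constructors are omitted in this sketch.
struct Username(String);
struct Birthdate(i32); // assumption: simplified to a birth year

// Every field now carries its own guarantees in its type.
struct User {
    username: Username,
    birthdate: Birthdate,
}
```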
Adding More Constraints
It might sound simple, trivial even, but this is a very powerful technique.
What’s important is that you’re handling errors at the lowest possible level. In this case, when you create the Username object, and not when you insert it into your database, for example.
This will make your code much more robust and easier to reason about, and it’s quick to add more constraints as you go along. For example, we might want to make sure that the username is not shorter than 3 characters, not longer than 256 characters, and that it contains only alphanumeric characters or dashes and underscores:
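A sketch of what the extended constructor might look like follows. Two assumptions: "alphanumeric" is read as ASCII letters and digits, and length is counted in characters. `my_crate` in the doc example is a placeholder for your crate name.

```rust
/// A validated username.
///
/// # Examples
///
///     use my_crate::Username; // `my_crate` is a placeholder
///
///     assert!(Username::new("john_doe-42".to_string()).is_ok());
///     assert!(Username::new("ab".to_string()).is_err());
///     assert!(Username::new("john doe".to_string()).is_err());
#[derive(Debug, Clone, PartialEq)]
pub struct Username(String);

impl Username {
    /// Checks length and allowed characters, in that order.
    pub fn new(username: String) -> Result<Self, &'static str> {
        let len = username.chars().count();
        if len < 3 {
            return Err("Username must be at least 3 characters long");
        }
        if len > 256 {
            return Err("Username must be at most 256 characters long");
        }
        // Assumption: only ASCII letters and digits count as alphanumeric.
        let allowed = |c: char| c.is_ascii_alphanumeric() || c == '-' || c == '_';
        if !username.chars().all(allowed) {
            return Err("Username may only contain letters, digits, '-' and '_'");
        }
        Ok(Self(username))
    }
}
```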
I’ve added some usage examples, which will be shown in the documentation of the Username struct. This is a great way to document your constraints and to show how to use your types! As an added bonus, you can run these examples as tests with cargo test --doc.
Here’s a link to the code on the Rust Playground.
Does This Really Prevent Illegal States?
The keen reader might have noticed that we could still create invalid objects manually:
let username = Username("".to_string()); // uh oh
In any real-world scenario, we would probably encapsulate our logic in a module and only expose a constructor function to the outside world:
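One possible shape for this is sketched below. The module name `user` and the `as_str` accessor are assumptions added for illustration. The important part is that the struct is public while its inner field is not.

```rust
mod user {
    /// Public type, private field: code outside this module
    /// cannot build a Username by writing the tuple constructor.
    #[derive(Debug, PartialEq)]
    pub struct Username(String);

    impl Username {
        /// The only way to obtain a Username from outside the module.
        pub fn new(username: String) -> Result<Self, &'static str> {
            if username.is_empty() {
                return Err("Username cannot be empty");
            }
            Ok(Self(username))
        }

        /// Read-only access to the validated value (hypothetical helper).
        pub fn as_str(&self) -> &str {
            &self.0
        }
    }
}
```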
If we now tried to create a Username object from the outside, we’d get a compiler error:
let username = Username("".to_string());
error: tuple ;
With that, the only way to create a Username object is by using our new function:
let username = Username::new("johndoe".to_string())?;
This means that illegal states are avoided for users of our module.
In a way, we only made them “unconstructable”, though.
If we really wanted, we could model our struct to avoid illegal states at compile time, but it would be rather tedious to work with.
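As one illustration of the idea (not necessarily the design the text has in mind), non-emptiness can be encoded in the shape of the data itself: store the first character separately from the rest. The type and its names are hypothetical, and the sketch only covers the "not empty" rule.

```rust
use std::fmt;

/// Non-empty by construction: there is always a `first` character,
/// so an empty username cannot be expressed at all.
#[derive(Debug, Clone, PartialEq)]
struct NonEmptyUsername {
    first: char,
    rest: String,
}

impl NonEmptyUsername {
    /// Infallible: no Result is needed, because every combination
    /// of arguments yields a non-empty value.
    fn new(first: char, rest: &str) -> Self {
        Self {
            first,
            rest: rest.to_string(),
        }
    }
}

impl fmt::Display for NonEmptyUsername {
    // Reassembles the full username for display.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}{}", self.first, self.rest)
    }
}
```

Callers now have to split their input into a first character and a remainder before they can build the value, which is where the tedium comes from.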
We get the benefit of compile-time safety, but at the cost of ergonomics. However, this pattern can be useful in other cases, as we will see in an article about compile-time invariants.
Library Support
I personally prefer to write validation functions as shown above, but you could consider using a validation library like validator instead.
Conclusion
If possible, use self-contained, custom types to model your domain. It will improve your system design, making it easier to test and reason about. Handle errors at the lowest possible level (as early as possible).