Working on something completely unrelated, I stumbled across this comment in a Rust project:
// TODO: Make sure this has at least one element
let kafka_brokers = vec![];
The intent here was to make sure that there is at least one Kafka broker (server) to connect to.
Right now, the program would happily compile with an empty vector. We'd have to add a runtime check to make sure that the vector is not empty:
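A minimal sketch of such a check could look like this. The function name `validate_brokers` and its error message are hypothetical and not taken from the project in question:

```rust
/// Hypothetical helper: rejects an empty broker list at runtime.
fn validate_brokers(brokers: &[String]) -> Result<(), String> {
    if brokers.is_empty() {
        return Err("at least one Kafka broker is required".to_string());
    }
    Ok(())
}
```

Every caller that relies on a non-empty list has to remember to call this first.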
That's not great, because we have to remember to validate the vector before we use it. During refactoring, the check could be accidentally removed, which would lead to a runtime error.
Wait, there's a crate for that!
Maybe you know about the vec1 crate, which provides a Vec1 type that can only be constructed with at least one element.
let kafka_brokers = vec1!["kafka1:9092"]; // works
let kafka_brokers = vec1![]; // compile error
Now the program would not compile if we tried to use vec1! with zero elements, which is exactly what we want!
Type-driven development
There's a deeper lesson here, which applies to idiomatic Rust code in general:
If in doubt, lean into the type system to enforce invariants at compile-time.
An invariant is a condition that must always hold true. You want to enforce these invariants as early as possible.
I recently learned about the term type-driven development, a practice that emphasizes the use of types to ensure program correctness.
To see how it works, let's try to implement a basic version of vec1 ourselves.
The vec1 macro
Let's create a macro that behaves like vec!, but doesn't allow an empty vector.
The macro handles two cases:
- The first case matches a single element, followed by zero or more elements.
- The second case matches an empty invocation and fails at compile time.
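macro_rules! vec1 {
    ($first:expr $(, $rest:expr)*) => {
        vec![$first $(, $rest)*]
    };
    () => {
        compile_error!("Vec1 needs at least one element")
    };
}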
We use compile_error! from the standard library, which always makes compilation fail with the given error message.
This does indeed look like it solves our problem!
However, it only does so superficially.
There's a subtle bug hiding here.
Since we return an ordinary Vec, the information that there must be at least one element is not part of the type, so the type system will not enforce the invariant later on.
We only handle the case where we construct the vector, but not when we use it.
Nothing keeps us from passing an empty vector to a function that requires at least one element. As a result, we would still need our brokers.is_empty() runtime check from above. Dolorous.
All of the hard work was for naught.
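To make the problem concrete, here's a small sketch. The macro follows the two cases described earlier; `first_broker` is a hypothetical consumer that I made up for illustration:

```rust
macro_rules! vec1 {
    ($first:expr $(, $rest:expr)*) => {
        vec![$first $(, $rest)*]
    };
    () => {
        compile_error!("Vec1 needs at least one element")
    };
}

/// Hypothetical consumer that needs at least one broker.
/// The signature only says `Vec<String>`, so an empty vector is
/// still a valid argument and has to be handled at runtime.
fn first_broker(brokers: Vec<String>) -> Option<String> {
    brokers.into_iter().next()
}
```

Calling `first_broker(Vec::new())` compiles without complaint, so the macro alone guarantees nothing about the values that reach this function.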
But, can we do better?
Second Try: Implementing a Vec1 type
The trick is to create a new type, Vec1, that behaves like an ordinary Vec, but encapsulates the knowledge we just gained about our input.
We do this by using two fields internally, first and rest:
pub struct Vec1<T> {
    // Note: Fields are not public, so we can enforce
    // invariants during construction
    first: T,
    rest: Vec<T>,
}
Let's update our macro to return a Vec1 instead of a Vec:
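macro_rules! vec1 {
    ($first:expr $(, $rest:expr)*) => {
        Vec1 {
            first: $first,
            rest: vec![$($rest),*],
        }
    };
    () => {
        compile_error!("Vec1 needs at least one element")
    };
}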
To make our Vec1 behave like any ordinary Vec, we can implement the same traits.
For example, we probably want to iterate over the elements:
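One way this could look, assuming the first/rest layout described above. The struct and macro are repeated so that the snippet compiles on its own:

```rust
pub struct Vec1<T> {
    first: T,
    rest: Vec<T>,
}

macro_rules! vec1 {
    ($first:expr $(, $rest:expr)*) => {
        Vec1 {
            first: $first,
            rest: vec![$($rest),*],
        }
    };
    () => {
        compile_error!("Vec1 needs at least one element")
    };
}

impl<T> IntoIterator for Vec1<T> {
    type Item = T;
    // `first` wrapped in a one-element iterator, followed by `rest`
    type IntoIter = std::iter::Chain<std::iter::Once<T>, std::vec::IntoIter<T>>;

    fn into_iter(self) -> Self::IntoIter {
        std::iter::once(self.first).chain(self.rest)
    }
}
```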
The associated IntoIter type is a bit tricky, but it's just a way to chain together the first element and the rest of the elements without additional allocations.
We also want to index into our Vec1:
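Here's a sketch of an Index implementation under the same first/rest assumption. Index 0 maps to `first`, everything else is shifted by one into `rest`. As with a plain Vec, an out-of-bounds index panics:

```rust
use std::ops::Index;

pub struct Vec1<T> {
    first: T,
    rest: Vec<T>,
}

macro_rules! vec1 {
    ($first:expr $(, $rest:expr)*) => {
        Vec1 {
            first: $first,
            rest: vec![$($rest),*],
        }
    };
    () => {
        compile_error!("Vec1 needs at least one element")
    };
}

impl<T> Index<usize> for Vec1<T> {
    type Output = T;

    fn index(&self, index: usize) -> &T {
        match index {
            0 => &self.first,
            // Shift by one because `first` occupies index 0
            n => &self.rest[n - 1],
        }
    }
}
```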
To behave exactly like a Vec, we would need to implement a lot more traits, but remember that the goal here is to learn about the general pattern of using types to enforce invariants.
Note that we cannot simply implement Deref and DerefMut to delegate to the underlying Vec, because that would allow us to mutate the vector in a way that violates our invariant. Furthermore, we would need a custom implementation of remove and pop to make sure that the vector is never empty.
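As an illustration, a custom pop could only ever take elements from `rest`, so `first` always stays in place. This is my own sketch with a hypothetical signature; the vec1 crate may expose a different API:

```rust
pub struct Vec1<T> {
    first: T,
    rest: Vec<T>,
}

macro_rules! vec1 {
    ($first:expr $(, $rest:expr)*) => {
        Vec1 {
            first: $first,
            rest: vec![$($rest),*],
        }
    };
    () => {
        compile_error!("Vec1 needs at least one element")
    };
}

impl<T> Vec1<T> {
    /// The first element, which is guaranteed to exist.
    pub fn first(&self) -> &T {
        &self.first
    }

    /// Number of elements; always at least 1.
    pub fn len(&self) -> usize {
        1 + self.rest.len()
    }

    /// Hypothetical `pop`: removes and returns the last element,
    /// but returns `None` instead of removing the only remaining one.
    pub fn pop(&mut self) -> Option<T> {
        self.rest.pop()
    }
}
```

With this design, popping a single-element Vec1 returns None and leaves the vector untouched.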
On the other hand, your own types are probably very specific to your use-case, so you might not need to implement as many traits.
With that, let's write some tests:
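A couple of tests along these lines might look as follows. This is a sketch that repeats the type, the macro, and the two trait implementations so it stands alone; the test names are my own:

```rust
use std::ops::Index;

pub struct Vec1<T> {
    first: T,
    rest: Vec<T>,
}

macro_rules! vec1 {
    ($first:expr $(, $rest:expr)*) => {
        Vec1 {
            first: $first,
            rest: vec![$($rest),*],
        }
    };
    () => {
        compile_error!("Vec1 needs at least one element")
    };
}

impl<T> IntoIterator for Vec1<T> {
    type Item = T;
    type IntoIter = std::iter::Chain<std::iter::Once<T>, std::vec::IntoIter<T>>;

    fn into_iter(self) -> Self::IntoIter {
        std::iter::once(self.first).chain(self.rest)
    }
}

impl<T> Index<usize> for Vec1<T> {
    type Output = T;

    fn index(&self, index: usize) -> &T {
        match index {
            0 => &self.first,
            n => &self.rest[n - 1],
        }
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn single_element() {
        let v = vec1![42];
        assert_eq!(v[0], 42);
    }

    #[test]
    fn iterates_in_order() {
        let v = vec1![1, 2, 3];
        assert_eq!(v.into_iter().collect::<Vec<_>>(), vec![1, 2, 3]);
    }
}
```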
We now have a vector with at least one element. If you want to play around with the code, here's a link to the Rust playground.
In any real-world scenario, you would probably want to use the vec1 crate instead of rolling your own implementation. And yes, the vec1 crate implements all the necessary traits to make Vec1 behave like a Vec in a similar fashion to what we did above.
Using a fixed-size array
If we assume that the list of Kafka brokers doesn't change at runtime (a reasonable assumption, given that it's a configuration setting), we could also use a fixed-size array.
By using an array, we can statically allocate the whole collection, which is more efficient than using a Vec (a dynamically allocated datatype on the heap).
Here is the same code as above, but using an array instead of a Vec.
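// Array1 and array1! are illustrative names
pub struct Array1<T, const N: usize> {
    first: T,
    rest: [T; N],
}

macro_rules! array1 {
    ($first:expr $(, $rest:expr)*) => {
        Array1 {
            first: $first,
            rest: [$($rest),*],
        }
    };
    () => {
        compile_error!("Array1 needs at least one element")
    };
}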
Note that the compiler will infer N (the number of elements in rest) as well as T (the type of elements).
Our trait implementations now expect a const parameter, e.g.:
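For instance, an Index implementation for the array-backed variant might look like the sketch below. The names `Array1` and `array1!` are placeholders I chose for illustration, and the type and macro are included so the snippet is self-contained:

```rust
use std::ops::Index;

// `Array1` and `array1!` are illustrative names
pub struct Array1<T, const N: usize> {
    first: T,
    rest: [T; N],
}

macro_rules! array1 {
    ($first:expr $(, $rest:expr)*) => {
        Array1 {
            first: $first,
            rest: [$($rest),*],
        }
    };
    () => {
        compile_error!("Array1 needs at least one element")
    };
}

impl<T, const N: usize> Array1<T, N> {
    /// Total number of elements, known at compile time.
    pub const fn len(&self) -> usize {
        N + 1
    }
}

// The const parameter `N` has to be declared on the impl as well
impl<T, const N: usize> Index<usize> for Array1<T, N> {
    type Output = T;

    fn index(&self, index: usize) -> &T {
        match index {
            0 => &self.first,
            n => &self.rest[n - 1],
        }
    }
}
```

Here `array1!["kafka1:9092", "kafka2:9092"]` has type `Array1<&str, 1>`, because N only counts the elements in rest.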
The unit tests are the same as before. Here's the entire code.
I like the fact that this avoids heap allocation entirely, which can be helpful in memory-constrained environments or in situations where dynamic allocation is not possible.
Whether this is a better approach than using vec1 depends on your use-case. It's fun how all of the guarantees can be enforced without any runtime overhead, though!
Further Improvements
One could think of a few ways to add more checks. For example, since we know that Kafka brokers get represented as URLs, we could also enforce that invariant at compile-time.
Which additional checks you want to add depends on your use-case.
For instance, if Kafka brokers are a central part of your application, you might want to create your own KafkaBroker type that enforces additional invariants.
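As a rough sketch of that idea, assume a broker address has the form host:port. That format, the field names, and the error messages are my assumptions here. The check happens once, when parsing, rather than at compile time, but every KafkaBroker value that exists afterwards is known to be well-formed:

```rust
use std::str::FromStr;

/// Hypothetical broker type; fields are private so the only way
/// to obtain a value is through the validating `FromStr` impl.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct KafkaBroker {
    host: String,
    port: u16,
}

impl KafkaBroker {
    pub fn host(&self) -> &str {
        &self.host
    }

    pub fn port(&self) -> u16 {
        self.port
    }
}

impl FromStr for KafkaBroker {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // Split at the last colon: everything before is the host
        let (host, port) = s
            .rsplit_once(':')
            .ok_or_else(|| format!("missing port in broker address: {}", s))?;
        if host.is_empty() {
            return Err(format!("missing host in broker address: {}", s));
        }
        let port = port
            .parse::<u16>()
            .map_err(|e| format!("invalid port in broker address {}: {}", s, e))?;
        Ok(KafkaBroker {
            host: host.to_string(),
            port,
        })
    }
}
```

Combined with a non-empty collection, a function taking `Vec1<KafkaBroker>` would then need neither an emptiness check nor an address check.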
Here's a nice general rule of thumb:
Model your data using the most precise data structure you reasonably can.
— Alexis King in Parse, don't validate
Conclusion
Rust has great support for type-driven design, which can guide you towards robust, idiomatic code that enforces invariants at compile time.
Always be on the lookout for ways to let the type system guide you towards stronger abstractions.
If you liked this, you might also be interested in my previous post on making illegal states unrepresentable in Rust.