Working on something completely unrelated, I stumbled across this comment in a Rust project:
// TODO: Make sure this has at least one element
let kafka_brokers = vec![];
The intent here was to make sure that there is at least one Kafka broker (server) to connect to.
Right now, the program would happily compile with an empty vector. We'd have to add a runtime check to make sure that the vector is not empty:
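Such a check might look like the following minimal sketch. The function name `connect`, its return value, and the `String` error type are assumptions I made up for illustration; they are not taken from the project in question.

```rust
// Hypothetical helper: the name and the error type are illustrative only.
fn connect(brokers: &[String]) -> Result<usize, String> {
    // This check runs every time the function is called, not at compile time.
    if brokers.is_empty() {
        return Err("at least one Kafka broker is required".to_string());
    }
    // A real implementation would open connections here; this sketch just
    // reports how many brokers it received.
    Ok(brokers.len())
}
```

Nothing forces a caller (or a future refactoring) to keep the `is_empty()` branch around, which is the weakness discussed next.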
That's not great, because we have to remember to validate the vector before we use it. During refactoring, the check could be accidentally removed, which would lead to a runtime error.
Wait, there's a crate for that!
Maybe you know about the vec1 crate, which provides a Vec1 type that can only be constructed with at least one element.
let kafka_brokers = vec1!["broker-1:9092"]; // works (placeholder address)
let kafka_brokers = vec1![]; // compile error
Now the program would not compile if we tried to use vec1! with zero elements, which is exactly what we want!
Type-driven development
There's a deeper lesson here, which applies to idiomatic Rust code in general:
If in doubt, lean into the type system to enforce invariants at compile-time.
An invariant is a condition that must always hold true. You want to enforce these invariants as early as possible.
I recently learned about the term type-driven development, a practice that emphasizes the use of types to ensure program correctness.
To see how it works, let's try to implement a basic version of vec1 ourselves.
Variant 1: The vec1 macro
Let's create a macro that behaves like vec!, but doesn't allow an empty vector.
The macro handles two cases:
- The first case matches a single element, followed by zero or more elements.
- The second case matches an empty vector and makes compilation fail with an error message.
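Here is a minimal sketch of a macro with these two cases. The error message text, the optional trailing comma, and the `example_brokers` helper are my own assumptions for illustration.

```rust
macro_rules! vec1 {
    // Case 1: one element, followed by zero or more elements.
    ($first:expr $(, $rest:expr)* $(,)?) => {
        vec![$first $(, $rest)*]
    };
    // Case 2: no elements at all is rejected while compiling.
    () => {
        compile_error!("vec1! requires at least one element")
    };
}

// Hypothetical usage: the result is still an ordinary `Vec`.
fn example_brokers() -> Vec<&'static str> {
    vec1!["broker-1:9092", "broker-2:9092"]
}

// Uncommenting the next line makes the build fail with the message above:
// fn no_brokers() -> Vec<&'static str> { vec1![] }
```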
We use compile_error!, from the standard library, which makes compilation fail with the given error message whenever that macro case is expanded.
This does indeed look like it solves our problem!
However, it only does so superficially.
Since we return an ordinary Vec, the information that there is at least one element is lost and the invariant is not enforced by the type system later on.
Nothing keeps us from passing an empty vector to a function that requires at least one element, so we would still need our brokers.is_empty() runtime check from above.
We haven't gained much.
Can we do better?
Variant 2: Implementing a Vec1 type
The trick is to create a new type, Vec1, that behaves like an ordinary Vec, but encapsulates the knowledge we just gained about our input.
We do this by using two fields internally, first and rest. Note that the fields are not public, so we can enforce invariants during construction.
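A minimal sketch of such a type could look like this. The derives, the `new` constructor, and the `first` and `len` accessors are assumptions I added so the example stands on its own.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct Vec1<T> {
    // Private fields: code outside this module cannot build a `Vec1`
    // without supplying a first element.
    first: T,
    rest: Vec<T>,
}

impl<T> Vec1<T> {
    /// Hypothetical constructor: the signature itself demands one element.
    pub fn new(first: T, rest: Vec<T>) -> Self {
        Vec1 { first, rest }
    }

    /// Always succeeds, so there is no `Option` in the return type.
    pub fn first(&self) -> &T {
        &self.first
    }

    /// The length is at least 1 by construction.
    pub fn len(&self) -> usize {
        1 + self.rest.len()
    }
}
```

Because `first` is a plain `T` rather than an `Option<T>`, "the collection is empty" is simply not a state this type can represent.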
Let's update our macro to return a Vec1 instead of a Vec:
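A sketch of what the updated macro could look like. To keep it self-contained, it repeats a minimal `Vec1` with a hypothetical `new` constructor and accessors; those names and the error message are assumptions.

```rust
pub struct Vec1<T> {
    first: T,
    rest: Vec<T>,
}

impl<T> Vec1<T> {
    pub fn new(first: T, rest: Vec<T>) -> Self {
        Vec1 { first, rest }
    }

    pub fn first(&self) -> &T {
        &self.first
    }

    pub fn len(&self) -> usize {
        1 + self.rest.len()
    }
}

macro_rules! vec1 {
    // The first element goes into `first`, everything else into `rest`.
    ($first:expr $(, $rest:expr)* $(,)?) => {
        Vec1::new($first, vec![$($rest),*])
    };
    // An empty invocation is still rejected while compiling.
    () => {
        compile_error!("vec1! requires at least one element")
    };
}
```

This time the "at least one element" knowledge survives the macro call, because it is carried by the return type.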
To make our Vec1 behave like any ordinary Vec, we can implement the same traits.
For example, we probably want to iterate over the elements:
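One way to do that, sketched under the assumption that `Vec1` stores `first` and `rest` as described above, is to chain the first element with the remaining ones. The `new` constructor is a hypothetical helper.

```rust
pub struct Vec1<T> {
    first: T,
    rest: Vec<T>,
}

impl<T> Vec1<T> {
    pub fn new(first: T, rest: Vec<T>) -> Self {
        Vec1 { first, rest }
    }
}

impl<T> IntoIterator for Vec1<T> {
    type Item = T;
    // `first` is yielded once, followed by everything in `rest`.
    type IntoIter = std::iter::Chain<std::iter::Once<T>, std::vec::IntoIter<T>>;

    fn into_iter(self) -> Self::IntoIter {
        std::iter::once(self.first).chain(self.rest)
    }
}
```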
We also want to index into our Vec1:
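A possible `Index` implementation, again as a sketch on top of a minimal `Vec1` with a hypothetical `new` constructor: index 0 maps to `first`, and every other index is shifted by one into `rest`.

```rust
use std::ops::Index;

pub struct Vec1<T> {
    first: T,
    rest: Vec<T>,
}

impl<T> Vec1<T> {
    pub fn new(first: T, rest: Vec<T>) -> Self {
        Vec1 { first, rest }
    }
}

impl<T> Index<usize> for Vec1<T> {
    type Output = T;

    fn index(&self, index: usize) -> &T {
        match index {
            0 => &self.first,
            // Panics when out of bounds, just like indexing a `Vec`.
            n => &self.rest[n - 1],
        }
    }
}
```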
To behave exactly like a Vec, we would need to implement a lot more traits, but remember that the goal here is to learn about the general pattern of using types to enforce invariants.
Note that we cannot simply implement Deref and DerefMut to delegate to the underlying Vec, because that would allow us to mutate the vector in a way that violates our invariant. Furthermore, we would need a custom implementation of remove and pop to make sure that the vector is never empty.
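To make that concrete, here is one possible design for an invariant-preserving `pop`. This is my own sketch, not necessarily how the vec1 crate does it: it only ever removes from `rest`, so `first` always stays in place.

```rust
pub struct Vec1<T> {
    first: T,
    rest: Vec<T>,
}

impl<T> Vec1<T> {
    pub fn new(first: T, rest: Vec<T>) -> Self {
        Vec1 { first, rest }
    }

    pub fn first(&self) -> &T {
        &self.first
    }

    pub fn len(&self) -> usize {
        1 + self.rest.len()
    }

    /// Removes and returns the last element, but never the first one:
    /// returns `None` once only `first` is left.
    pub fn pop(&mut self) -> Option<T> {
        self.rest.pop()
    }
}
```

Returning `None` instead of emptying the collection is a design choice; returning a `Result` with a dedicated error type would work just as well.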
On the other hand, your own types are probably very specific to your use-case, so you might not need to implement as many traits.
With that, let's write some tests:
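The following is a self-contained sketch of such tests. It repeats minimal versions of the type, the macro, and the two trait implementations so that it compiles on its own; the test names, broker address, and error message are placeholders of mine.

```rust
use std::ops::Index;

#[derive(Debug)]
pub struct Vec1<T> {
    first: T,
    rest: Vec<T>,
}

impl<T> Vec1<T> {
    pub fn new(first: T, rest: Vec<T>) -> Self {
        Vec1 { first, rest }
    }
}

macro_rules! vec1 {
    ($first:expr $(, $rest:expr)* $(,)?) => {
        Vec1::new($first, vec![$($rest),*])
    };
    () => {
        compile_error!("vec1! requires at least one element")
    };
}

impl<T> IntoIterator for Vec1<T> {
    type Item = T;
    type IntoIter = std::iter::Chain<std::iter::Once<T>, std::vec::IntoIter<T>>;

    fn into_iter(self) -> Self::IntoIter {
        std::iter::once(self.first).chain(self.rest)
    }
}

impl<T> Index<usize> for Vec1<T> {
    type Output = T;

    fn index(&self, index: usize) -> &T {
        match index {
            0 => &self.first,
            n => &self.rest[n - 1],
        }
    }
}

#[test]
fn single_element_can_be_indexed() {
    let brokers = vec1!["broker-1:9092"];
    assert_eq!(brokers[0], "broker-1:9092");
}

#[test]
fn iterates_in_insertion_order() {
    let numbers = vec1![1, 2, 3];
    assert_eq!(numbers.into_iter().collect::<Vec<_>>(), vec![1, 2, 3]);
}
```

There is no test for the empty case, because `vec1![]` cannot be written in code that compiles.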
We now have a vector with at least one element. If you want to play around with the code, here's a link to the Rust playground.
In any real-world scenario, you would probably want to use the vec1 crate instead of rolling your own implementation.
Variant 3: Using an Array
If we assume that the list of Kafka brokers doesn't change at runtime (a reasonable assumption, given that it's a configuration value), we could also use a fixed-size array.
By using an array, we can avoid heap allocation for the collection entirely, which is more efficient than using a Vec (a dynamically allocated, heap-backed type).
Here is the same code as above, but using an array instead of a Vec.
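A minimal sketch of the array-based variant. The names `Array1` and `array1!`, the `new` constructor, and the accessors are assumptions of mine; the const parameter `N` is the number of elements in `rest`.

```rust
pub struct Array1<T, const N: usize> {
    first: T,
    rest: [T; N],
}

impl<T, const N: usize> Array1<T, N> {
    pub const fn new(first: T, rest: [T; N]) -> Self {
        Array1 { first, rest }
    }

    pub fn first(&self) -> &T {
        &self.first
    }

    pub fn rest(&self) -> &[T] {
        &self.rest
    }

    /// Known at compile time: one `first` plus `N` remaining elements.
    pub const fn len(&self) -> usize {
        1 + N
    }
}

macro_rules! array1 {
    // A single element: `rest` is the empty array, so N = 0.
    ($first:expr $(,)?) => {
        Array1::new($first, [])
    };
    // Two or more elements: N is inferred from the array literal.
    ($first:expr, $($rest:expr),+ $(,)?) => {
        Array1::new($first, [$($rest),+])
    };
    // No elements: rejected while compiling.
    () => {
        compile_error!("array1! requires at least one element")
    };
}
```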
Note that we have to introduce a special case for a single element.
That's because we need to know the value N for our type parameter, but it varies depending on the number of elements in rest.
Our trait implementations now expect a const parameter, e.g.:
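For instance, an `Index` implementation for an array-backed type might look like this sketch. The type name `Array1` and its `new` constructor are hypothetical; the point is the extra `const N: usize` in the `impl` header.

```rust
use std::ops::Index;

pub struct Array1<T, const N: usize> {
    first: T,
    rest: [T; N],
}

impl<T, const N: usize> Array1<T, N> {
    pub fn new(first: T, rest: [T; N]) -> Self {
        Array1 { first, rest }
    }
}

// The const parameter has to be declared on the impl and passed to the type.
impl<T, const N: usize> Index<usize> for Array1<T, N> {
    type Output = T;

    fn index(&self, index: usize) -> &T {
        match index {
            0 => &self.first,
            // Panics when out of bounds, like indexing a plain array.
            n => &self.rest[n - 1],
        }
    }
}
```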
The unit tests are the same as before.
I like the fact that this avoids any runtime overhead, which can be helpful in memory-constrained environments or in situations where dynamic allocation is not possible.
Whether this is a better approach than using vec1 depends on your use-case. It's fun how all of the guarantees can be enforced without any runtime overhead, though!
Further Improvements
One could think of a few ways to add more checks. For example, since we know that Kafka brokers get represented as URLs, we could also enforce that invariant at compile-time.
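As a rough illustration of the idea, a `const fn` constructor can reject malformed values when it is evaluated in a `const` context. Everything in this sketch is an assumption of mine: the `BrokerAddr` name, and especially the deliberately simplistic `host:port` check, which is nowhere near real URL validation.

```rust
/// Hypothetical newtype for a broker address.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct BrokerAddr(&'static str);

impl BrokerAddr {
    /// Deliberately simplistic: accepts only `host:port` with a numeric port.
    pub const fn new(addr: &'static str) -> Self {
        let bytes = addr.as_bytes();
        // Find the last ':'; `bytes.len()` doubles as "not found".
        let mut colon = bytes.len();
        let mut i = 0;
        while i < bytes.len() {
            if bytes[i] == b':' {
                colon = i;
            }
            i += 1;
        }
        assert!(
            colon > 0 && colon + 1 < bytes.len(),
            "broker address must look like host:port"
        );
        // Everything after the colon has to be an ASCII digit.
        let mut j = colon + 1;
        while j < bytes.len() {
            assert!(bytes[j].is_ascii_digit(), "port must be numeric");
            j += 1;
        }
        BrokerAddr(addr)
    }

    pub const fn as_str(&self) -> &'static str {
        self.0
    }
}

// Evaluated by the compiler, because it initializes a `const` item.
const BROKER: BrokerAddr = BrokerAddr::new("localhost:9092");

// Uncommenting the next line makes the build fail:
// const BAD: BrokerAddr = BrokerAddr::new("localhost");
```

When `new` is called with a value only known at runtime, the same assertions turn into ordinary panics, so the compile-time guarantee only holds for `const` and `static` items.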
Which additional checks you want to add depends on your use-case. As a general rule of thumb, I like to follow this advice:
Model your data using the most precise data structure you reasonably can.
— Alexis King
Conclusion
Rust has great support for type-driven design, which can guide you towards robust, idiomatic code that enforces invariants at compile-time.
Always be on the lookout for ways to let the type system guide you towards stronger abstractions.
You might also be interested in my previous post on making illegal states unrepresentable in Rust.