The phone buzzes at 3 AM.
You roll out of bed, open your laptop, and see this in the logs:
thread 'tokio-runtime-worker' panicked at 'called `Result::unwrap()` on an `Err` value: Error', src/parsers/universal.rs:47:23
You open the codebase and find this:
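Something in this spirit, sketched with invented names (DataFormat, UniversalParser, Record are all made up for illustration, and serde is left out so the sketch stays self-contained):

```rust
use std::marker::PhantomData;

// A trait "for extensibility": every input format gets its own impl.
trait DataFormat {
    fn parse_row(&self, raw: &str) -> Result<Vec<String>, String>;
}

struct Csv;

impl DataFormat for Csv {
    fn parse_row(&self, raw: &str) -> Result<Vec<String>, String> {
        if raw.is_empty() {
            return Err("empty row".into());
        }
        Ok(raw.split(',').map(str::to_owned).collect())
    }
}

// A generic parser that *also* holds its format as a trait object,
// with a PhantomData to tie in the otherwise unused type parameter.
struct UniversalParser<R> {
    format: Box<dyn DataFormat>,
    _marker: PhantomData<R>,
}

impl<R: From<Vec<String>>> UniversalParser<R> {
    fn new(format: Box<dyn DataFormat>) -> Self {
        Self { format, _marker: PhantomData }
    }

    fn parse(&self, raw: &str) -> R {
        // The 3 AM panic: an unwrap buried under three layers of indirection.
        self.format.parse_row(raw).unwrap().into()
    }
}

struct Record(Vec<String>);

impl From<Vec<String>> for Record {
    fn from(fields: Vec<String>) -> Self {
        Record(fields)
    }
}
```

Each piece seemed reasonable on its own: a trait for extensibility, a generic for flexibility, a PhantomData to appease the compiler.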
A few thoughts rush through your head:
“What the hell is a PhantomData?”
“Why is there a trait object?”
“This is going to be a long night.”
The error must be buried somewhere in the interaction between that DataFormat
trait, the generic parser, and serde deserialization.
You scroll through 200 lines of trait implementations and generic constraints.
Each layer adds another level of indirection.
The stack trace is 15 levels deep.
It’s like peeling an onion… it makes you cry.
You run git blame and curse the colleague who wrote this code.
Whoops, it was you a few months ago.
Quick rewind. The phone buzzes at 3 AM.
You roll out of bed, open your laptop, and see this in the logs:
Error: CSV parse error at line 847: invalid UTF-8 sequence at byte index 23
You find this code:
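A plausible sketch of it (parse_customers is a name I made up; production code would likely use the csv crate, but the shape is the same):

```rust
// Parse raw bytes into rows of fields, reporting the exact line and
// byte offset when a line isn't valid UTF-8.
fn parse_customers(data: &[u8]) -> Result<Vec<Vec<String>>, String> {
    data.split(|&b| b == b'\n')
        .enumerate()
        .map(|(i, line)| -> Result<Vec<String>, String> {
            let line = std::str::from_utf8(line).map_err(|e| {
                format!(
                    "CSV parse error at line {}: invalid UTF-8 sequence at byte index {}",
                    i + 1,
                    e.valid_up_to()
                )
            })?;
            Ok(line.split(',').map(str::to_owned).collect())
        })
        .collect()
}
```

No traits, no generics, and the error message points straight at the broken line.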
All right, we seem to be parsing some customer data from a CSV file.
You look at line 847 of the input file and see corrupted character encoding. You remove the bad line, deploy a fix, and go back to sleep.
Don’t Be Clever
Rust programmers tend to be very clever. Too clever for their own good at times. Let me be the first to admit that I’m guilty of this myself.
We love to stretch Rust to its limits. After all, this is Rust! An empowering playground of infinite possibility. Shouldn’t we use the language to its full extent?
Nothing in Rust forces us to be fancy. You can write straightforward code in Rust just like in any other language. But in code reviews, I often see people trying to outsmart themselves and tripping over their own shoelaces. They reach for every advanced feature at their disposal without thinking much about maintainability.
Here’s the problem: writing code is easy; reading it isn’t. Advanced features are like salt: a pinch enhances the flavor, but too much ruins the dish. Used carelessly, they overcomplicate things and hurt readability.
Software engineering is all about managing complexity, and complexity creeps in when we’re not looking. We should focus on keeping complexity down.
Of course, some complexity is truly unavoidable. That’s the inherent complexity of the task. What we should avoid, however, is the accidental complexity, which we introduce ourselves. As projects grow, accidental complexity tends to grow with them. That is the cruft we all should challenge.
And simplicity also has other benefits:
Simplicity is prerequisite for reliability.
I don’t always agree with Edsger W. Dijkstra, but in this case, he was spot-on. Without simplicity, reliability is impossible (or at least hard to achieve). That’s because simple systems have fewer moving parts to reason about.
Good code is mostly boring, especially for production use. Simple is obvious. Simple is predictable. Predictable is good.
Why Simple is Hard
But if simplicity is so obviously “better,” why isn’t it the norm? Because achieving simplicity is hard! It doesn’t come naturally. Simplicity is typically not the first attempt but the last revision. 1
Simplicity and elegance are unpopular because they require hard work and discipline to achieve.
Well put, Edsger.
It takes effort to build simple systems. It takes even more effort to keep them simple. That’s because you constantly have to fight entropy. Going from simple to more complex is much easier than the reverse.
Let’s come back to our 3 AM phone call.
The first version of the code was built by an engineer who wanted to make the system “flexible and extensible.” The second was written by a developer who just solved the problem at hand and tried to parse a CSV file.
It turns out there was never a need to parse anything other than CSV files. One lesson here is that the path to complexity is paved with good intentions. A series of individually reasonable decisions can lead to an overly complex, unmaintainable system. Taken in isolation, each small bit of complexity looks harmless, but complexity compounds quickly.
More experienced developers tend to use more abstractions because they get excited about the possibilities. And I can’t blame them, really. Writing simple code is oftentimes pretty boring. It’s much more fun to test out that new feature we just learned. But after a while we forget how Rust beginners feel about our code: it’s the curse of knowledge.
Remember: abstractions are never zero cost. 2
Not all abstractions are created equal.
In fact, many are not abstractions at all — they’re just thin veneers, layers of indirection that add complexity without adding real value.
Abstractions cause complexity, and complexity has a very real cost. At some point, complexity will slow you down because it causes cognitive load. And cognitive load matters a lot.
People starting with Rust are often overwhelmed by the complexity of the language. Keep that in mind as you become more proficient. If you don’t, you might alienate less experienced team members, and they might give up on the project, or on Rust altogether.
Furthermore, if you leave the company and leave behind a complex codebase, the team will have a hard time maintaining it and onboarding new members. Often the biggest holdup is how quickly people can get up to speed with Rust; don’t make it even harder for them. From time to time, look at your Rust code through a beginner’s eyes.
Generics Are A Liability
For some reason I feel compelled to talk about generics for a moment…
Not only do they make the code harder to understand; they also have a real cost in compile times. Each generic function gets monomorphized: the compiler generates a separate copy of the code for every concrete type it’s used with.
My advice is to only make something generic if you need to switch out the implementation right now. Resist premature generalization! (Which is related to, but not identical to, premature optimization.)
“We might need it in the future” is a dangerous statement. Be careful with that assumption because it’s hard to predict the future. 3
Your beautiful abstraction might become your biggest nemesis. If you can defer the decision for just a little longer, it’s often better to do so.
Generics have an impact on the “feel” of the entire codebase. If you use a lot of generics, you will have to deal with the consequences everywhere. You will have to understand the signatures of functions and structs as well as the error messages that come with them. The hidden compilation cost of generics is hard to measure and optimize for.
Be careful with generics. They have a real cost! The thinking should be “this is generic functionality” instead of “I could make this generic.”
Let’s say you are working on a public API.
A function that will be used a lot will need to take some string-based data from the user.
You wonder whether you should take a &str, a String, or something else as input to your functions, and why.
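The borrowed version might start out like this (process and its body are invented for illustration):

```rust
// Borrow the string; nothing is allocated or copied.
fn process(input: &str) -> usize {
    input.trim().len()
}
```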
That’s quite simple and doesn’t allocate. But what if the caller wants to pass a String?
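A sketch of the owning variant:

```rust
// Take ownership; callers with a &str now have to allocate a String first.
fn process(input: String) -> usize {
    input.trim().len()
}
```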
We take ownership of the input. But hold on, what if we don’t need ownership and we want to support both?
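The usual answer is to go generic:

```rust
// Accept anything string-like: &str, String, Cow<str>, ...
fn process<S: AsRef<str>>(input: S) -> usize {
    input.as_ref().trim().len()
}
```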
That works. But do you see how the complexity goes up?
Behind the scenes, the compiler monomorphizes the function for each type that implements AsRef<str>. If we pass both a String and a &str, we get two copies of that function, which means longer compile times and larger binaries.
Wait, what if we need to return something that references the input and we need the result to live as long as the input?
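Now the signature grows a lifetime (again, a hypothetical sketch):

```rust
// The returned slice borrows from the input, so the output
// lifetime must be tied to the input lifetime.
fn process<'a, S: AsRef<str> + ?Sized>(input: &'a S) -> &'a str {
    input.as_ref().trim()
}
```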
Oh wait, we might need to send this across threads:
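A sketch of where that road ends:

```rust
use std::thread;

// To cross a thread boundary we need Send + 'static, so we give up
// the borrow again and take ownership, plus a where clause to read.
fn process<S>(input: S) -> thread::JoinHandle<usize>
where
    S: AsRef<str> + Send + 'static,
{
    thread::spawn(move || input.as_ref().trim().len())
}
```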
See how we went from a simple &str parameter to this monstrosity?
Each step seemed “reasonable,” but we’ve created a nightmare that nobody wants to read or debug.
The problem is so simple, so how did that complexity creep in? We were trying to be clever! We were trying to make the function “better” by making it more generic. But is it really “better”?
All we wanted was a simple function that takes a string and does something with it.
Stay simple. Don’t overthink it!
Say we’re writing a link checker and we want to build a bunch of requests to check the links.
We could use a function that returns a Vec<Result<Request>>.
Or, we could return an iterator instead:
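Hedged sketches of both options (Request and build_request are stand-ins invented so the example is self-contained; a real link checker would build HTTP requests, e.g. with reqwest):

```rust
#[derive(Debug)]
struct Request(String);

// Pretend validation: a stand-in for real request construction.
fn build_request(url: &str) -> Result<Request, String> {
    if url.starts_with("http") {
        Ok(Request(url.to_owned()))
    } else {
        Err(format!("invalid URL: {url}"))
    }
}

// Variant 1: build everything up front and return a Vec.
fn build_requests(urls: &[&str]) -> Vec<Result<Request, String>> {
    urls.iter().map(|u| build_request(u)).collect()
}

// Variant 2: return a lazy iterator instead.
fn build_requests_iter<'a>(
    urls: &'a [&'a str],
) -> impl Iterator<Item = Result<Request, String>> + 'a {
    urls.iter().map(|u| build_request(u))
}
```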
The iterator doesn’t look too bad, but the vec is simpler. What to do? The caller likely needs to collect the results anyway. Since we’re processing a finite set of URLs, the link checker needs all results to report successes/failures, and the results will probably be iterated multiple times. Memory usage isn’t a big concern here since the number of URLs in a document is typically small. All else being equal, the vec is probably the simpler choice.
Simple Code Is Often Fast Code
There’s a prejudice that simple code is slow. Quite the contrary! It turns out many effective algorithms are surprisingly simple. In fact, some of the simplest algorithms we’ve discovered are also the most efficient.
Take quicksort or path tracing, for example. Both can be written down in a handful of lines and described in a few sentences.
Here’s an ad-hoc version of quicksort in Rust:
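A sketch in that spirit (the author's exact listing may differ; this one supports usize only and happily allocates intermediate vectors):

```rust
fn quicksort(mut v: Vec<usize>) -> Vec<usize> {
    // Empty list: nothing to sort.
    if v.is_empty() {
        return v;
    }
    // Take the first element as the pivot.
    let pivot = v.remove(0);
    // Split into elements smaller than the pivot and the rest.
    let (smaller, larger): (Vec<usize>, Vec<usize>) =
        v.into_iter().partition(|&x| x < pivot);
    // Recurse and stitch the pieces back together.
    let mut sorted = quicksort(smaller);
    sorted.push(pivot);
    sorted.extend(quicksort(larger));
    sorted
}
```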
The idea is pretty simple and can fit on a napkin:
- Try to get the first element from the list. If there is no element, the list is empty and we’re done.
- Split the list into two sublists: elements smaller than the pivot and elements larger than or equal to the pivot.
- Sort each sublist recursively.
- By combining the sorted smaller list, the pivot, and the sorted larger list, you get the fully sorted list!
The implementation is not too far off from the description of the algorithm.
Yes, my simple version only supports usize for now, and for any real-world use case you should use the built-in sort algorithm, which was about 20x faster in my benchmarks. But my point stands: simple code packs a punch.
On my machine, this implementation sorts 100k numbers in 1 millisecond.
This is an O(n log n) algorithm in the average case. That’s as fast as it gets for a comparison-based sort, and it’s just a few lines of code. 4
Often, simple code can be optimized by the compiler more easily and runs faster on CPUs. That’s because CPUs are optimized for basic data structures and predictable access patterns. And parallelizing work is also easier when that is the case. All of that works in our favor when our code is simple.
Somewhat counterintuitively, especially when you’re doing something complicated, you should be extra careful to keep it simple. Simplicity is a sign of deep insight, of great understanding, of clarity—and clarity has a positive effect on the way a system functions. And since complicated systems are, well, complicated, that extra clarity helps keep things under control.
What I appreciate about Rust is how it balances high-level and low-level programming. We should make good use of that. Most of the time, I write Rust code in a straightforward manner, and when that extra bit of performance becomes critical, Rust always lets me go back and optimize.
Keep Your Fellow Developers in Mind
Most of the code you’ll write for companies is application code, not library code. That’s because most companies don’t make money selling libraries; they make money from business logic. There’s no need to get fancy here. Application code should be straightforward.
Library code can be a slightly different story. It can get complicated if it ends up being an important building block for other code. For example, in hot code paths, avoiding allocations might make sense, at which point you might have to deal with lifetimes. This uncertainty about how code might get used by others can lead to overabstraction. Try to make the common case straightforward. The correct path should be the obvious path users take.
Say you’re building a base64 encoder.
It’s safe to assume that most people will want to encode a string (probably a Unicode string like a &str) and that they want a “canonical” or “standard” base64 encoding.
Don’t expect your users to jump through hoops to do the most common thing.
Unless you have a really good reason, your API should have a function like this somewhere:
/// Encode input as Base64 string
fn base64_encode(input: &str) -> String;
Yes, you could make it generic over AsRef<[u8]> or support multiple alphabets:

/// Generic base64 encoder supporting multiple alphabets
fn base64_encode<T: AsRef<[u8]>>(input: T, alphabet: &Alphabet) -> String;
…and you might even offer a builder pattern for maximum flexibility:
let encoded = Base64Builder::new()
    .with_alphabet(Alphabet::UrlSafe)        // What is UrlSafe?
    .with_decode_allow_trailing_bits(true)   // Huh?
    .with_decode_padding_mode(PaddingMode::Indifferent) // I don't even...
    .encode(input);
But all that most users want is to get an encoded string:
let encoded = base64_encode(input);
You could call the function above base64_encode_simple or base64_encode_standard to make it clear that it’s a simplified version of a more generic algorithm.
It’s fine to offer additional functionality, but don’t make the easy thing hard in the process.
Simplicity is especially important when working with other developers because code is a way to communicate ideas, and you should strive to express your ideas clearly.
Tips For Fighting Complexity
Start Small
Jerry Seinfeld had two writing modes: creating mode and editing mode.
- When in creation mode, he’s exploring possibilities and letting ideas flow freely.
- When in editing mode, he’s refining, cutting, and polishing.
These modes require different mindsets, and trying to do both simultaneously leads to paralysis. As a consequence, Seinfeld would never edit while creating because it would kill the creative flow.
The same principle applies to coding. Don’t try to architect the perfect solution on your first attempt. Write the naive implementation first, then let your inner editor refine it. Switch off that inner critic. Who knows? You might just come up with a simpler design.
Resist the Temptation To Optimize Early
It can be tempting to use all of these fine, sharp tools you have at your disposal. But sharp tools they are! To master Rust is to say “no” to these tools more often than you say “yes.”
You might see an optimization opportunity and feel the urge to jump at it. But time and again, I see people make that optimization without prior validation. The result is that the code performs the same or becomes slower. Measure twice, cut once.
Defer Refactoring
That might sound counterintuitive. After all, shouldn’t constant refactoring make our code better as we go?
The problem is that we have limited information at the time of writing our first prototype. If we refactor too early, we might end up in a worse place than where we started.
Take the CSV exporter from the beginning as an example: a smart engineer saw an opportunity to refactor the code to support multiple input formats. That locked us into a generic exporter, which became a huge debugging burden and hid a better abstraction we might have found had we deferred the refactoring. Maybe we would have noticed that we’re always dealing with CSV data, but that we could decouple data validation from data export, which would have led to better, more precise error messages.
This opportunity was lost because we jumped the gun and refactored too early.
I propose solving the problem at hand first and refactoring afterward. That’s because refactoring a simple program is way easier than doing the same for a complex one. Everyone can do the former, while I can count on one hand those who can do the latter. Preserve the opportunity to refactor your code. Refactoring might look like the smart thing to do at the time, but if you allow the simple code to just stick around for a little longer, the right opportunity for the refactor might reveal itself.
A good time to reflect is when your code starts to feel repetitive. That’s a sign that there’s a hidden pattern in your data. The right abstraction is trying to talk to you and reveal itself! It’s fine to do multiple attempts at an abstraction. See what feels right. If none of it does, just go back to the simple version and document your findings.
Performance Crimes Are “OK”
Rust is super fast, so you can literally commit all the performance crimes you want. Clone liberally, iterate over the same data structure multiple times, use a vector if a hashmap is too daunting.
It simply doesn’t matter. Hardware is fast and cheap, so put it to work.
Be Curious But Conservative
All of the above doesn’t mean you should not learn about all of these abstractions. It’s fun to learn and to be knowledgeable.
But you can focus on learning new concepts without hurting yourself. Understanding macros, lifetimes, interior mutability, etc. is very helpful, but in everyday “normal” Rust code you almost never make use of these concepts, so don’t worry about them too much.
Use all the features you need and none that you don’t.
How to Recognize The Right Level of Abstraction
One litmus test I like to use is “Does it feel good to add new functionality?”
Good abstractions tend to “click” together: there’s no overlap between them, no grunt work, no extra conversions needed. The next step always feels obvious. Testing works without much mocking, and the documentation for your structs almost writes itself; there’s no “this struct does X and Y”, it’s either X or Y. Explaining the design to a fellow developer is straightforward. That’s when you know you have a winner. Getting there is not easy and can take many iterations. What you see in popular libraries is often the result of that process.
The right abstractions guide you to do the right thing: to find the obvious place to add new functionality, the right place to look for a bug, the right spot to make that database query.
All of that is easier if the code is simple. That’s why experienced developers always have simplicity in mind when they build out abstractions.
It’s possible that the most common error of a smart engineer is to optimize a thing that should not exist in the first place. Cross that bridge when you get there.
Write Code for Humans
Be clear, not clever. Write code for humans, not computers.
Simplicity is clarity. Simplicity is to succinctly express the essence of a thing. Simplicity is about removing the unnecessary, the irrelevant, the noise. Simple is good. Be simple.
1. I think there’s a similarity to writing here, where elegance (which is correlated with simplicity, in my opinion) requires an iterative process of constant improvement. The editing process is what makes most writing great. In 1657, Blaise Pascal famously wrote: “I have only made this letter longer because I have not had the time to make it shorter.” I think about that a lot when I write.
2. For reference, see “The Power of 10 Rules” by Gerard J. Holzmann of the NASA/JPL Laboratory for Reliable Software.
3. I should know, because I passed on a few very risky but lucrative investment opportunities because I lacked the ability to accurately predict the future.
4. Of course, this is not the most efficient implementation of quicksort. It allocates a lot of intermediate vectors and has O(n^2) worst-case performance. There are optimizations for partially sorted data, better pivot selection strategies, and in-place partitioning. But they are just that: optimizations. The core idea remains the same.