My first time using Rust "for real"
Or "Rust vs C++" if you like things spicy.
I used Rust for the first time as part of a serious project and it was a great experience. In fact, it was useful enough that I ported the rest of the project to Rust.
This post goes into depth on some of my thoughts on Rust and how it compares to C++.
But first, a Haiku:
I know C++
I make pretty fast software
All jokes aside, I think it's important context to mention that I'm pretty experienced with C++. I've used it almost daily for several years as part of my job. I've worked on very large C++ codebases and small ones as well. I've also worked on safety critical systems where performance is important.
Thoughts on Rust
When you first start using Rust, it takes a lot longer to get to the point where things compile (vs other languages), but it feels much more likely to work once you get to that point.
One of the core parts of Rust is its borrow checker. A lot of the language's safety guarantees come from this feature.
I find it useful to think of the borrow checker almost as a "compile-time read/write lock": at any given time, either one writer (mutable reference) or any number of readers (immutable references) is allowed for a given variable. This rule, along with a few others, applies to basically every variable in Rust*.
This might seem a little limiting, but I ended up really liking the added "restrictions" because of the additional compile time safety guarantees they provide.
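To make the "read/write lock" analogy concrete, here is a minimal sketch using only the standard library. The function and variable names are made up for illustration; the commented-out line shows the kind of code the compiler would reject.

```rust
/// Illustrative only: demonstrates "many readers OR one writer".
fn demo() -> Vec<i32> {
    let mut scores = vec![3, 1, 4];

    // Any number of shared ("reader") borrows may coexist.
    let first = &scores;
    let second = &scores;
    let total: i32 = first.iter().sum();

    // Uncommenting the next line is rejected at compile time: it needs a
    // mutable borrow while the shared borrow `second` is still used below.
    // scores.push(0);
    let largest = *second.iter().max().expect("vector is not empty");

    // The shared borrows are not used past this point, so a single
    // exclusive ("writer") borrow is now allowed.
    let writer = &mut scores;
    writer.push(total);
    writer.push(largest);

    scores
}

fn main() {
    println!("{:?}", demo());
}
```

Unlike a runtime lock, none of this costs anything when the program runs; the check happens entirely during compilation.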
We'll talk about some of these guarantees and "restrictions" later in the post.
For now, let's talk about a bunch of things that help during the journey of getting your code to compile:
The error messages and warnings are exquisite. They'll generally give you more explanation than the bare minimum and help you solve the issue.
Instead of just saying what is wrong, the messages usually go a step or two further and tell you why the code is wrong and how to fix it.
Third party dependencies
In C++ projects, I tend to use Bazel as a build system. It was created at Google and is super powerful. Bazel has a very steep learning curve, but once you learn how to use it in depth, there's a lot you can do with it (caching, sysroots, distributed builds, cross compiling, code coverage, pulling in dependencies, etc).
The Rust ecosystem has a package manager called Cargo.
In a super generalized/handwavy way, npm/yarn is to nodejs as Cargo is to Rust.
Cargo is great. It makes it so much easier to pull in third party dependencies. In C++, there's a lot of build system wrangling to pull in complex dependencies. For example, pulling a project that uses CMake into a Bazel project is non-trivial. Even pulling a Bazel project into a Bazel project can be non-trivial. This isn't just a Bazel problem; there isn't really a "standard" build system/package manager for C/C++.
I'm talking about dependencies built from source, not static or shared libs (which also have some issues). I'll touch on this more below.
I know that a few C++ package managers exist, but none that are universally accepted as "the way to do it." There are also several build systems out there, which makes this more difficult. [1]
Cargo is the way to build packages in Rust. This means pulling in a dependency is usually as easy as adding a line to your
I didn't expect this to have as much of an impact as it did, but I was much less hesitant to pull in third party dependencies and this turned out to be quite useful (as we'll talk about in more depth below).
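As a rough illustration of how small that change is, here is a sketch of a dependency section, assuming the file in question is the project's Cargo.toml manifest. The crates chosen and the version requirements are only examples.

```toml
# Cargo.toml (fragment). Versions here are illustrative, not recommendations.
[dependencies]
log = "0.4"
serde = { version = "1", features = ["derive"] }
```

The next cargo build downloads and compiles the listed crates along with their own dependencies.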
Pretty much all Rust packages (which are called crates) are distributed on https://crates.io/
Because there's a standard way of distributing libraries (and some standard documentation practices), basically all Rust libraries have docs on https://docs.rs/.
This helps reduce cognitive load when working with several new libraries.
For large C++ projects, I've noticed fairly poor performance (+ high CPU usage) from the C++ tools VS Code extension. Linting, errors, and warnings also break every once in a while.
Anecdotally, the VS Code Rust extension seems to have much better performance and the iteration cycles seem faster (typing code -> warnings/errors).
I touched on this a little above, but being able to pull in dependencies easily is a game changer.
Profiling and Benchmarking
In Rust, I can pull in the criterion crate for microbenchmarking and get profiles and flamegraphs "for free" by using the
Google Benchmark is generally a fairly easy dependency to add to a C++ project regardless of the build system you're using (unless you're running with ASan, TSan, and/or UBSan - more on that below), but adding criterion as a dependency to a Rust project is literally a one-line change.
Actually using either Benchmark or Criterion is a little more work, but the amount of additional effort is about the same so I'm ignoring it.
Running a benchmark and getting a flamegraph without manually running any other tools is great.
And I can still use perf directly if I want.
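For readers who haven't written a microbenchmark before, the core idea can be sketched with the standard library alone. This is explicitly not Criterion's API: a real harness adds warm-up, statistical analysis, and reporting on top. The function names are hypothetical.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// The code under test: an iterative Fibonacci, chosen only as a stand-in.
fn fibonacci(n: u32) -> u64 {
    let (mut a, mut b) = (0u64, 1u64);
    for _ in 0..n {
        let next = a + b;
        a = b;
        b = next;
    }
    a
}

/// Runs `f` `iterations` times and returns the average wall-clock time per call.
fn time_per_call<F: FnMut()>(iterations: u32, mut f: F) -> Duration {
    assert!(iterations > 0, "need at least one iteration");
    let start = Instant::now();
    for _ in 0..iterations {
        f();
    }
    start.elapsed() / iterations
}

fn main() {
    // `black_box` discourages the optimizer from deleting the work being measured.
    let per_call = time_per_call(100_000, || {
        black_box(fibonacci(black_box(40)));
    });
    println!("fibonacci(40): roughly {:?} per call", per_call);
}
```

A naive loop like this is easily skewed by noise and compiler optimizations, which is a large part of why a dedicated benchmarking crate is worth the one-line dependency.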
gdb work out of the box with Rust which likely means a lot of the debugging tools you're used to will work.
There's enough content here for several posts so I'll try and keep it concise.
Generally, high-performance I/O-bound code will run significantly faster and more efficiently when written as async code. This basically means your program does other work while it's waiting for I/O operations to complete (instead of blocking the thread and yielding back to the OS scheduler).
C++ doesn't really have strong, built-in, high performance async support. You could build something with promises, but the built-in versions aren't really full featured (e.g. no support for executors).
Upcoming versions of C++ are adding support for things like this, but it's not currently plug-and-play.
Folly is a C++ library from Facebook that has lots of really useful C++ primitives including fibers, async, executors, futures, and a bunch of other things. Unfortunately, it's really non-trivial to pull into a project (as a from-source dependency).
Rust has async functions built in, and the tokio async runtime is great. It makes building high performance, I/O-bound applications much easier.
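To show what "doing something else while waiting" looks like, here is a toy sketch using only the standard library. It is not tokio: the PretendIo future, the busy-polling run_all executor, and all other names are invented for illustration. A real runtime sleeps until the OS reports that I/O is ready.

```rust
use std::cell::RefCell;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

type Task = Pin<Box<dyn Future<Output = ()>>>;
type Log = Rc<RefCell<Vec<String>>>;

/// Stand-in for an I/O operation: not ready on the first poll, ready on the second.
struct PretendIo {
    polled_once: bool,
}

impl Future for PretendIo {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        if self.polled_once {
            Poll::Ready(())
        } else {
            self.polled_once = true;
            Poll::Pending
        }
    }
}

/// A waker that does nothing; enough for an executor that simply polls in a loop.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// An async function: at the `.await` it hands control back instead of blocking.
async fn task(name: &'static str, log: Log) {
    log.borrow_mut().push(format!("{}: start I/O", name));
    PretendIo { polled_once: false }.await;
    log.borrow_mut().push(format!("{}: I/O done", name));
}

/// Toy round-robin executor: polls every unfinished task in turn until all finish.
fn run_all(mut tasks: Vec<Task>) {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    while !tasks.is_empty() {
        tasks.retain_mut(|t| t.as_mut().poll(&mut cx).is_pending());
    }
}

/// Runs two tasks on one thread and returns the order in which they made progress.
fn interleaved_log() -> Vec<String> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    let a: Task = Box::pin(task("a", log.clone()));
    let b: Task = Box::pin(task("b", log.clone()));
    run_all(vec![a, b]);
    let entries = log.borrow().clone();
    entries
}

fn main() {
    for line in interleaved_log() {
        println!("{}", line);
    }
}
```

Task b starts its "I/O" before task a has finished, all on a single thread. A production runtime provides the same interleaving with real sockets, timers, and a multi-threaded scheduler.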
Serialization, tracing, logging, CLI, etc.
Rust has crates that make a ton of things easy:
- serde: serialize/deserialize arbitrary structs into a bunch of formats
- bincode: uses serde to serialize/deserialize structs to chunks of bytes
- tracing: trace functions/bits of code and record how long they took along with additional information you want to store
- tracing-chrome: generate traces using tracing that can be opened with
- log: a standard logging interface
- clap: CLI + argument parsing
You can do most/all of the above in C++ (e.g. spdlog for logging), but Rust makes it easy because it's straightforward to pull in and use these dependencies.
Also, Rust macros make things like serializing/deserializing structs usually as simple as adding #[derive(Serialize, Deserialize)] above your struct. Speaking from experience, C++ serialization/deserialization is definitely not that easy.
Rust also interoperates well with C and C++ (aka FFI). You can call C functions and expose Rust functions to C using the extern keyword. As you might expect, there are crates that make this even easier. For example:
- The bindgen crate automatically generates Rust bindings for arbitrary C libraries (and some C++ libraries).
- The libc crate lets you make calls to the C standard library from Rust.
This is super useful when incrementally bringing Rust into a C or C++ project.
Note: I did run into a few rough edges here around symbol visibility and exposing C symbols in transitive crates, but fixed it by restructuring my project.
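As a small sketch of the extern keyword in both directions, the example below declares two C standard library functions by hand (no bindgen or libc crate) and defines one Rust function with the C calling convention. The wrapper and function names are hypothetical, and the unsafe extern spelling assumes a reasonably recent toolchain; older compilers write the block as plain extern "C".

```rust
use std::ffi::CString;
use std::os::raw::{c_char, c_int};

// Hand-written declarations for two functions from the C standard library,
// which Rust programs already link against on common platforms.
unsafe extern "C" {
    fn strlen(s: *const c_char) -> usize;
    fn abs(x: c_int) -> c_int;
}

/// Safe wrapper around `strlen`: `CString` guarantees a NUL-terminated buffer
/// that stays alive for the duration of the call.
fn c_string_length(text: &str) -> usize {
    let c_text = CString::new(text).expect("text must not contain NUL bytes");
    unsafe { strlen(c_text.as_ptr()) }
}

/// A Rust function using the C calling convention. To be callable by name
/// from C it would also need an unmangled symbol (the `no_mangle` attribute).
pub extern "C" fn add_from_rust(a: c_int, b: c_int) -> c_int {
    a.wrapping_add(b)
}

fn main() {
    println!("strlen(\"hello\") = {}", c_string_length("hello"));
    println!("abs(-7) = {}", unsafe { abs(-7) });
    println!("add_from_rust(2, 3) = {}", add_from_rust(2, 3));
}
```

The compiler cannot check that hand-written declarations match the real C signatures, which is why calls into C sit inside unsafe and why generated bindings are usually preferable.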
Being able to quickly and easily try dependencies (and switch them out if they don't work) is really powerful.
Obviously, being able to pull in dependencies easily has its downsides (e.g. the JS left-pad fiasco), but I think the pros outweigh the cons.
One of Rust's big selling points is memory safety. Specifically:
- No data races
- No null pointer dereferences
- "You will never endure a dangling pointer, a use-after-free, or any other kind of Undefined Behavior (a.k.a. UB)." - the Rust docs
If safe Rust code compiles, we have the above guarantees.
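As one illustration of the "no data races" guarantee, here is a minimal standard-library sketch with hypothetical names. Sharing a plain mutable counter across threads does not compile, so the code has to state its synchronization explicitly, here with Arc and Mutex.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Increments one shared counter from several threads and returns the total.
fn count_in_parallel(threads: usize, increments: usize) -> usize {
    // Handing each thread a bare `&mut usize` to the same counter is rejected
    // by the compiler; wrapping it in Arc<Mutex<..>> makes the sharing
    // (Arc) and the exclusive access (Mutex) explicit.
    let counter = Arc::new(Mutex::new(0usize));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..increments {
                    *counter.lock().expect("mutex poisoned") += 1;
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().expect("worker thread panicked");
    }

    let total = *counter.lock().expect("mutex poisoned");
    total
}

fn main() {
    println!("total = {}", count_in_parallel(4, 1_000));
}
```

The equivalent C++ with an unsynchronized counter compiles without complaint, and the bug only shows up at run time.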
Unfortunately, C and C++ don't give us those guarantees. Because of that, there are a lot of tools that try to help.
Some tools [2] from Google that are built into LLVM/Clang:
- AddressSanitizer (or ASan): Detects use-after-free (+ other things)
- MemorySanitizer (or MSan): Detects reads of uninitialized memory
- ThreadSanitizer (or TSan): Detects data races
- UndefinedBehaviorSanitizer (or UBSan): Detects various kinds of undefined behavior
One important thing to note is that all of the above require running your program and some have fairly significant runtime overhead.
Hmm, that list looks suspiciously similar to the list of compile-time guarantees we get from Rust.
I've mentioned "from source" C++ dependencies a few times and you might be wondering "why not just use static libs or shared libs?" That's a good question and one of the answers is that TSan requires building all code (including dependencies) with the -fsanitize=thread flag in order to work properly.
So if you want to test your C/C++ code for data races with TSan, you need to build everything from source (or somehow get instrumented libraries).
I have run into a few issues with Rust:
- Things like specialization with Rust generics aren't stable
- Thread locals are sometimes slow and are also awkward to use
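For context on the thread-local point, this is roughly what the standard library's thread_local! macro looks like in use. It is a sketch with made-up names: every access goes through with and a closure, which is one plausible reading of "awkward".

```rust
use std::cell::Cell;
use std::thread;

thread_local! {
    // Each thread gets its own lazily initialized copy of this counter.
    static CALLS: Cell<u32> = Cell::new(0);
}

/// Increments the calling thread's counter and returns the new value.
fn record_call() -> u32 {
    // The value is only reachable inside the closure passed to `with`.
    CALLS.with(|calls| {
        let updated = calls.get() + 1;
        calls.set(updated);
        updated
    })
}

/// Spawns a fresh thread, records `n` calls there, and returns its final count.
fn calls_in_fresh_thread(n: u32) -> u32 {
    thread::spawn(move || {
        let mut last = 0;
        for _ in 0..n {
            last = record_call();
        }
        last
    })
    .join()
    .expect("worker thread panicked")
}

fn main() {
    // Each thread starts from zero because the counter is per-thread.
    println!("first thread:  {}", calls_in_fresh_thread(3));
    println!("second thread: {}", calls_in_fresh_thread(2));
}
```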
There are some other issues I'm aware of that I haven't experienced directly. Things like async not working in traits or lots of generics slowing down compile times (because of monomorphization).
As an aside, heavy usage of templates in C++ can significantly slow down compile times in large codebases. I haven't experienced the "slow compile times with generics" Rust issue so I can't really intelligently compare Rust and C++ here 🤷♂️
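For anyone unfamiliar with monomorphization, here is a small sketch with hypothetical function names. The generic version is compiled once per concrete type it is used with (much like a C++ template), while the trait-object version is compiled once and dispatches at run time.

```rust
use std::fmt::Display;

/// Generic: the compiler emits a separate copy of this function for every
/// concrete `T` it is called with (monomorphization).
fn describe_generic<T: Display>(value: T) -> String {
    format!("value = {}", value)
}

/// Trait object: a single compiled copy; the right `Display` implementation
/// is looked up through a vtable at run time.
fn describe_dynamic(value: &dyn Display) -> String {
    format!("value = {}", value)
}

fn main() {
    // Three distinct instantiations of `describe_generic` get compiled here.
    println!("{}", describe_generic(42));
    println!("{}", describe_generic(2.5));
    println!("{}", describe_generic("text"));

    // All three calls share the one compiled `describe_dynamic`.
    println!("{}", describe_dynamic(&42));
    println!("{}", describe_dynamic(&2.5));
    println!("{}", describe_dynamic(&"text"));
}
```

More instantiations mean more code for the compiler to generate and optimize, which is where the compile-time cost comes from in generic-heavy codebases.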
In general, I liked my experience with Rust enough that I ported the rest of the project from C++ to Rust.
There are some downsides, but especially for low-level stuff, I really like the tradeoff of the added "restriction" of the borrow checker/longer time to runnable code for more confidence in the program working correctly early on.
Sometimes the borrow checker seems a little too aggressive, but it's not wrong and I've found myself getting good at predicting when it's going to complain about something.
There are "escape hatches" like unsafe, inline assembly, etc. that are helpful, but it's really useful to be able to trust that, generally, memory corruptions and (certain types of) race conditions aren't possible. Especially if you're writing low-level, high-performance code.
In general, I think a goal of programming languages should be to strike a good balance between making it relatively easy to express yourself and relatively difficult to introduce unintended behavior. I think Rust strikes a very good balance here while having similar performance characteristics to C++.
I have many more nuanced thoughts in this area, but this post is already getting a bit long. If you want to see an in-depth post about tradeoffs when picking programming languages/frameworks for a project (or maybe even a more formal analysis of "Rust vs C++"), let me know on Twitter.
I spent some time thinking about this, but I don't really see a case where I'd start something new as a C++ project instead of starting it as a Rust one. There most likely are cases where C++ is a better choice (because it's quite rare for one thing to be strictly "better" without any tradeoffs), but none that I anticipate running into in the near future.
If you haven't tried Rust before and have an opportunity to do so, I'd recommend giving it a shot!
If you enjoyed this article, please follow me on Twitter. I'd really appreciate it, thanks!
[1] Random aside: Bazel (open source) is based on Google's internal build system, Blaze. Some people who worked with Blaze/Bazel at Google created and open-sourced Buck at Facebook, which has a lot of similarities to Bazel, but is not directly compatible. PyTorch has independent build configuration for CMake, Buck, and Bazel.