Title is slightly misleading but the content is good. It's the "Safe Rust" in the title that's weird to me. These apply to Rust altogether, you don't avoid them by writing unsafe Rust code. They also aren't unique to Rust.
A less baity title might be "Rust pitfalls: Runtime correctness beyond memory safety."
It is consistent with the way the Rust community uses "safe": as "passes static checks and thus protects from many runtime errors."
This regularly drives C++ programmers mad: the statement "C++ is all unsafe" is taken as some kind of hyperbole, attack or dogma, while the intent may well be to factually point out the lack of statically checked guarantees.
It is subtle but not inconsistent that strong static checks ("safe Rust") may still leave the possibility of runtime errors. So there is a legitimate, useful broader notion of "safety" where Rust's static checking is not enough. That's a bit hard to express in a title - "correctness" is not bad, but maybe a bit too strong.
No, the Rust community almost universally understands "safe" as referring to memory safety, as per Rust's documentation, and especially the unsafe book, aka Rustonomicon [1]. In that regard, Safe Rust is safe, Unsafe Rust is unsafe, and C++ is also unsafe. I don't think anyone is saying "C++ is all unsafe."
You might be talking about "correct", and that's true, Rust generally favors correctness more than most other languages (e.g. Rust being obstinate about turning a byte array into a file path, because not all file paths are made of byte arrays, or e.g. the myriad string types to denote their semantics).
Mostly, there is a subculture that promotes marking everything that could be used incorrectly as unsafe, instead of reserving it for memory-safety-related operations.
That subculture is called “people who haven’t read the docs”, and I don’t see why anyone would give a whole lot of weight to their opinion on what technical terms mean
but it is false advertising when it's used all over the internet with: rust is safe!
telling the whole world to rtfm for your co-opting of the generic word "safe" is like advertisers telling you to read the fine print: a sleazy tactic.
It's not that either, and you are validating the GP's point. Rust has a very specific 'unsafe' keyword that every Rust developer interprets implicitly and instinctively as 'potentially memory-unsafe'. Consequently, 'safe' is interpreted as the opposite - 'guaranteed memory-safe'. Using that word as an abbreviation among Rust developers is therefore not uncommon.
However, while speaking about the Rust language in general, all half-decent Rust developers specify that it's about memory safety. Even the Rust language homepage has only two instances of the word - 'memory-safety' and 'thread-safety'. The accusations of sleaziness and false advertising are disingenuous at best.
There is, since the zero is used as a niche value optimisation for enums, so that Option<NonZero<u32>> occupies the same amount of memory as u32.
But this can be used with other enums too, and in those cases, having a zero NonZero would essentially transmute the enum into an unexpected variant, which may cause an invariant to break, thus potentially causing memory unsafety in whatever required that invariant.
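This niche optimisation is easy to observe with `size_of` - a quick sketch:

```rust
use std::mem::size_of;
use std::num::NonZeroU32;

fn main() {
    // The forbidden zero value is reused to encode Option's None: no extra tag byte
    assert_eq!(size_of::<Option<NonZeroU32>>(), size_of::<u32>());
    // Without a niche, Option needs space for a discriminant plus padding
    assert_eq!(size_of::<Option<u32>>(), 8);
}
```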
> which may cause an invariant to break, thus potentially causing memory unsafety in whatever required that invariant
By that standard anything and everything might be tainted as "unsafe", which is precisely GP's point. Whether the unsafety should be blamed on the outside code that's allowed to create a 0-valued NonZero<…> or on the code that requires this purported invariant in the first place is ultimately a matter of judgment, that people may freely disagree about.
EDIT: A summary of this is that it is impossible to write a sound std::Vec implementation if NonZero::new_unchecked is a safe function. This is specifically because creating a value of NonZero which is 0 is undefined behavior which is exploited by niche optimization. If you created your own `struct MyNonZero(u8)`, then you wouldn't need to mark MyNonZero::new_unchecked as unsafe because creating MyNonZero(0) is a "valid" value which doesn't trigger undefined behavior.
The issue is that this could potentially allow creating a struct whose invariants are broken in safe rust. This breaks encapsulation, which means modules which use unsafe code (like `std::vec`) have no way to stop safe code from calling them with the invariants they rely on for safety broken. Let me give an example starting with an enum definition:
    // Assume std::vec has this definition
    struct Vec<T> {
        capacity: usize,
        length: usize,
        arena: *mut T,
    }

    enum Example {
        First {
            capacity: usize,
            length: usize,
            arena: usize,
            discriminator: NonZero<u8>,
        },
        Second {
            vec: Vec<u8>,
        },
    }
Now assume the compiler has used niche optimization so that if the byte corresponding to `discriminator` is 0, then the enum is `Example::Second`, while if the byte corresponding to `discriminator` is not 0, then the enum is `Example::First` with discriminator being equal to its given non-zero value. Furthermore, assume that `Example::First`'s `capacity`, `length`, and `arena` fields are in the same position as the fields of the same name for `Example::Second.vec`. If we allow `fn NonZero::new_unchecked(u8) -> NonZero<u8>` to be a safe function, we can create an invalid Vec:
    fn main() {
        let evil = NonZero::new_unchecked(0);
        // We write as an Example::First,
        // but this is read as an Example::Second
        // because discriminator == 0 and niche optimization
        let first = Example::First {
            capacity: 9001,
            length: 9001,
            arena: 0x20202020,
            discriminator: evil,
        };
        if let Example::Second { vec: mut bad_vec } = first {
            // If the layout of Example is as I described,
            // and no optimizations occur, we should end up in here.
            // This writes 255 to address 0x20202020
            bad_vec[0] = 255;
        }
    }
So if we allowed new_unchecked to be safe, then it would be impossible to write a sound definition of Vec.
Yeah, anything can (and should) be marked unsafe if it could lead to memory safety problems. And so if it potentially breaks an invariant which is relied on for memory safety, it should be marked unsafe (conversely, code should not rely on an unchecked, safe condition for memory safety). That's basically how it works, Rust has the concept of unsafe functions so that libraries can communicate to users about what can and can't be relied on to keep memory safety without manual checking. This requires a common definition of 'safe', but it then means there isn't any argument about where the bug is: if the invariant isn't enforced by the compiler in safe code, then other code should not rely on it. If it is, then the bug is in the unsafe code that broke the invariant.
> Whether the unsafety should be blamed on the outside code that's allowed to create a 0-valued NonZero<…> or on the code that requires this purported invariant in the first place is ultimately a matter of judgment, that people may freely disagree about.
It's not, though. NonZero<T> has an invariant that a zero value is undefined behavior. Therefore, any API which allows for the ability to create one must be unsafe. This is a very straightforward case.
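The standard library reflects exactly this division: the checked constructor is safe and returns an Option, while the unchecked one is an `unsafe fn` whose caller must uphold the non-zero invariant. A minimal sketch:

```rust
use std::num::NonZeroU32;

fn main() {
    // Safe constructor: zero is rejected at runtime instead of being UB
    assert!(NonZeroU32::new(0).is_none());
    assert_eq!(NonZeroU32::new(5).map(|n| n.get()), Some(5));

    // Unsafe constructor: the caller promises the value is non-zero.
    // Calling this with 0 would be undefined behavior.
    let n = unsafe { NonZeroU32::new_unchecked(5) };
    assert_eq!(n.get(), 5);
}
```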
Because of the cult-like belief structures growing up around rust, it's clear as day for us on the outside. I see it from the evangelists in the company I work for - "rust is faster and safer to develop with when compared to c++" - I'm no c++ fan, but it's obviously nonsense.
I feel people took the comparison of rust to c and extrapolated to c++, which is blatantly disingenuous.
The cult that I see growing online a lot are those who are invested in attacking Rust for some reason, though their arguments often indicate that they haven't even tried it. I believe that we're focusing so much on Rust evangelists that we're neglecting the other end of the zealotry spectrum - the irrational haters.
The Rust developers I meet are more interested in showing off their creations than in evangelizing the language. Even those on dedicated Rust forums are generally very receptive to other languages - you can see that in action on topics like goreleaser or Zig's comptime.
And while you have already dismissed the other commenter's experience of finding Rust nicer than C++ to program in, I would like to add that I share their experience. I have nothing against C++, and I would like to relearn it so that I can contribute to some projects I like. But the reason why I started with Rust in 2013 was because of the memory-safety issues I was facing with C++. There are features in Rust that I find surprisingly pleasant, even with 6 additional years of experience in Python. Your opinion that Rust is unpleasant to the programmer is not universal and its detractions are not nonsense.
I appreciate the difficulty in learning Rust - especially getting past the stage of fighting the borrow checker. That's the reason why I don't promote Rust for immediate projects. However, I feel that the knowledge required to get past that stage is essential even for correct C and C++. Rust was easy for me to get started in, because of my background in digital electronics, C and C++. But once you get past that peak, Rust is full of very elegant abstractions that are similar to what's seen in Python. I know it works because I have trained js and python developers in Rust. And their feedback corroborates those assumptions about learning Rust.
Care to explain the obvious, then? Rust is quite a lot nicer to write than C++ in my experience (and in fact, it seems like rust is most attractive to people who were already writing C++: people who still prefer C are a lot less likely to like Rust).
There is nothing attractive about c++ or rust, I really don't understand how anyone can think so, it has to be some sort of Stockholm syndrome. Think about it, before you started programming what about your experiences would make you appreciate the syntax soup of rust and c++?
I dunno, there's not much about my previous experience that would indicate much one way or the other. I have found, though, that I tend to prefer slightly denser, more heterogeneous code and syntax than average. Low-syntax languages like Haskell and Lisps make my head hurt because the code is so formless it becomes hard for me to parse, while languages with more syntax and symbols are easier (though there is a limit; APL, K, etc. are a little far, I find).
I see this subculture far more in online forums than with fellow Rust developers.
Most often, the comments come from people who don’t even write much Rust. They either know just enough to be dangerous or they write other languages and feel like it’s a “gotcha” they can use against Rust.
Formally the team/docs are very clear, but I think many users of Rust miss that nuance and lump memory safety together with all the other features that create the "if it compiles it probably works" experience
So I agree with the above comment that the title could be better, but I also understand why the author gave it this title
> ... with all the other features that create the "if it compiles it probably works" experience
While it's true that Rust's core safety feature is almost exclusively about memory safety, I think it contributes more to the overall safety of the program.
My professional background is more in electronics than in software. So when the Rust borrow checker complains, I tend to map them to nuances of the hardware and seek work-arounds for those problems. Those work-arounds often tend to be better restructuring of the code, with proper data isolation. While that may seem like hard work in the beginning, it's often better towards the end because of clarity and modularity it contributes to the code.
Rust won't eliminate logical bugs or runtime bugs from careless coding. But it does encourage better coding practices. In addition, the strict, but expressive type system eliminates more bugs by encoding some extra constraints that are verified at compile time. (Yes, there are other languages that do this better).
And while it is not guaranteed, I find Rust programs to just work if it compiles, more often than in the other languages I know. And the memory-safety system has a huge role in that experience.
The commonly given response to this question is two-fold, and both parts have a similar root cause: smart pointers and "safety" being bolted-on features developed decades after the fact. The first part is the standard library itself. You can put your data in a vec for instance, but if you want to iterate, the standard library gives you back a regular pointer that can be dereferenced unchecked, and can be invalidated while still held in the event of a mutation. The second part is third-party libraries. You may be diligent about managing memory with smart pointers, but odds are any library you might use probably wants a dumb pointer, and whether or not it assumes responsibility for freeing that pointer later is at best documented in natural language.
This results in an ecosystem where safety is opt-in, which means in practice most implementations are largely unsafe. Even if an individual developer wants to be proactive about safety, the ecosystem isn't there to support them to the same extent as in rust. By contrast, safety is the defining feature of the rust ecosystem. You can write code and the language and ecosystem support you in doing so, rather than being a barrier you have to fight back against.
Yep. Safe rust also protects you from UB resulting from incorrect multi-threaded code.
In C++ (and C#, Java, Go and many other “memory safe languages”), it’s very easy to mess up multithreaded code. Bugs from multithreading are often insanely difficult to reproduce and debug. Rust’s safety guardrails make many of these bugs impossible.
This is also great for performance. C++ libraries have to decide whether it’s better to be thread safe (at a cost of performance) or to be thread-unsafe but faster. Lots of libraries are thread safe “just in case”. And you pay for this even when your program / variable is single threaded. In rust, because the compiler prevents these bugs, libraries are free to be non-threadsafe for better performance if they want - without worrying about downstream bugs.
I've written some multithreaded rust and I've gotta say, this does not reflect my experience. It's just as easy to make a mess, as in any other language.
Safe rust prevents you from writing data races. All concurrent access is forced to be guarded by synchronization primitives. Eliminating an entire class of bugs.
You can still create a mess from logical race conditions, deadlocks and similar bugs, but you won't get segfaults because after the tenth iteration you forgot to diligently manage the mutex.
Personally I feel that in rust I can mostly reason locally, compared to say Go when I need to understand a global context whenever I touch multithreaded code.
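A minimal sketch of that guardrail: shared mutable state across threads only compiles once it's behind a synchronization primitive (here `Arc<Mutex<_>>`):

```rust
use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    let counter = Arc::new(Mutex::new(0));
    let handles: Vec<_> = (0..4)
        .map(|_| {
            let counter = Arc::clone(&counter);
            // Capturing a plain `&mut i32` across threads would be rejected
            // by the compiler; the Mutex makes the shared mutation explicit.
            thread::spawn(move || *counter.lock().unwrap() += 1)
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    assert_eq!(*counter.lock().unwrap(), 4);
}
```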
Me too. I agree that it's not a bed of roses - and all the memory safety guarantees in the world don't stop you from making a huge mess. But I haven't run into any of the impossible-to-debug crashes / heisenbugs in my multithreaded rust code that I have in C/C++.
Most likely because all that multi-threaded code accesses in-memory data structures internal to the process memory - the only scenario in multi-threaded systems that Rust has some support for.
Make those threads access external resources simultaneously, or memory mapped to external writers, and there is no support from Rust type system.
> Make those threads access external resources simultaneously, or memory mapped to external writers, and there is no support from Rust type system.
I don’t think that’s true.
External thread-unsafe resources like that are similar in a way to external C libraries: they’re sort of unsafe by default. It’s possible to misuse them to violate rust’s safe memory guarantees. But it’s usually also possible to create safe struct / API wrappers around them which prevent misuse from safe code. If you model an external, thread-unsafe resource as a struct that isn’t Send / Sync then you’re forced to use the appropriate threading primitives to interact with the resource from multiple threads. When you use it like that, the type system can be a great help. I think the same trick can often be done for memory mapped resources - but it might come down to the specifics.
Shared memory, shared files, hardware DMA, shared database connections to the same database.
You can control safety as much as you like from the Rust side; there is no way to validate that the data coming into the process memory doesn't get corrupted by the other side while it is being read from the Rust side.
Unless access is built in a way that all parties accessing the resource have to play by the same validation rules before writing into it, via OS IPC resources like shared mutexes, semaphores, critical sections.
The kind of typical readers-writers algorithms in distributed computing.
The standard library doesn't give you a regular pointer, though (unless you specifically ask for that). It gives you an iterator, which is pointer-like, but exists precisely so that other behaviors can be layered. There's no reason why such an iterator can't do bounds checking etc, and, indeed, in most C++ implementations around, iterators do make such checks in debug builds.
The problem, rather, is that there's no implementation of checked iterators that's fast enough for release build. That's largely a culture issue in C++ land; it could totally be done.
Unfortunately, operator[] on std::vector is inherently unsafe. You can potentially try to ban it (using at() instead), but that has its own problems.
There’s a great talk by Louis Brandy called “Curiously Recurring C++ Bugs at Facebook” [0] that covers this really well, along with std::map’s operator[] and some more tricky bugs. An interesting question to ask if you try to watch that talk is: How does Rust design around those bugs, and what trade offs does it make?
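For Rust's part of that question: indexing is always bounds-checked (a loud panic rather than a silent out-of-bounds read), and `get` offers a non-panicking alternative. A small sketch:

```rust
fn main() {
    let v = vec![1, 2, 3];
    // Checked access: returns None instead of reading out of bounds
    assert_eq!(v.get(10), None);
    assert_eq!(v.get(1), Some(&2));
    // v[10] panics at runtime rather than corrupting memory
    assert!(std::panic::catch_unwind(|| v[10]).is_err());
}
```

The trade-off the talk invites you to consider: the failure is immediate and visible, where C++'s `operator[]` is undefined behavior that may pass unnoticed.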
Thank you for sharing. Seems I still have more to learn!
It seems the bug you are flagging here is a null reference bug - I know Rust has Optional as a workaround for “null”
Are there any pitfalls in Rust when Optional does not return anything? Or does Optional close this bug altogether? I saw Optional pop up in Java to quiet down complaints on null pointer bugs but remained skeptical whether or not it was better to design around the fact that there could be the absence of “something” coming into existence when it should have been initialized
It's not so much Optional that deals with the bug. It's the fact that you can't just use a value that could possibly be null in a way that would break at runtime if it is null - the type system won't allow you, forcing an explicit check. Different languages do this in different ways - e.g. in C# and TypeScript you still have null, but references are designated as nullable or non-nullable - and an explicit comparison to null changes the type of the corresponding variable to indicate that it's not null.
I think sum types in general and Option<T> in particular are nicer. But the reason C# has nullability isn't that they disagree with me; it's that fundamentally the CLR has the same model as Java: all these types can be null. Even though in the modern C# language you can say "No, not null, that's never OK", at runtime on the CLR, too bad, maybe it's null.
For example if I write a C# function which takes a Goose, specifically a Goose, not a Goose? or similar - well, too bad the CLR says my C# function can be called by this obsolete BASIC code which has no idea what a Goose is, but it's OK because it passed null. If my code can't cope with a null? Too bad, runtime exception.
In real C# apps written by an in-house team this isn't an issue, Ollie may not be the world's best programmer but he's not going to figure out how to explicity call this API with a null, he's going to be stopped by the C# compiler diagnostic saying it needs a Goose, and worst case he says "Hey tialaramex, why do I need a Goose?". But if you make stuff that's used by people you've never met it can be an issue.
> For example if I write a C# function which takes a Goose, specifically a Goose, not a Goose? or similar - well, too bad the CLR says my C# function can be called by this obsolete BASIC code which has no idea what a Goose is, but it's OK because it passed null. If my code can't cope with a null? Too bad, runtime exception.
That's actually no different to Rust still; if you try, you can pass a 0 value to a function that only accepts a reference (i.e. a non-zero pointer), be it by unsafe, or by assembly, or whatever.
Disagreeing with another comment on this thread, this isn't a matter of judgement around "whose bug is it? Should the callee check for null, or the caller?". Rust's win is by clearly articulating that the API takes non-zero, so the caller is buggy.
As you mention it can still be an issue, but there should be no uncertainty around whose mistake it is.
The difference is that C# has well-defined behavior in this case - a non-nullable annotation is really "not-nullable-ish", and there are cases even in the language itself where code without any casts in it will observe null values of such types. It's just a type system hole they allow for convenience and back-compat.
OTOH with Rust you'd have to violate its safety guarantees, which if I understand correctly triggers UB.
Rust’s Optional does close this altogether, yes. All (non-unsafe) users of Optional are required to have some defined behavior in both cases. This is enforced by the language in the match statement, and most of the “member functions” on Optional use match under the hood.
This is an issue with the C++ standardization process as much as with the language itself. AIUI when std::optional (and std::variant, which has similar issues) were defined, there was a push to get new syntax into the language itself that would’ve been similar to Rust’s match statement.
However, that never made it through the standardization process, so we ended up with “library variants” that are not safe in all circumstances.
> whether or not it was better to design around the fact that there could be the absence of “something” coming into existence when it should have been initialized
So this is actually why "no null, but optional types" is such a nice spot in the programming language design space. Because by default, you are making sure it "should have been initialized," that is, in Rust:
    struct Point {
        x: i32,
        y: i32,
    }
You know that x and y can never be null. You can't construct a Point without those numbers existing.
By contrast, here's a point where they could be:
    struct Point {
        x: Option<i32>,
        y: Option<i32>,
    }
You know by looking at the type if it's ever possible for something to be missing or not.
> Are there any pitfalls in Rust when Optional does not return anything?
So, Rust will require you to handle both cases. For example:
let x: Option<i32> = Some(5); // adding the type for clarity
dbg!(x + 7); // try to debug print the result
This will give you a compile-time error:
    error[E0369]: cannot add `{integer}` to `Option<i32>`
     --> src/main.rs:4:12
      |
    4 |     dbg!(x + 7); // try to debug print the result
      |          - ^ - {integer}
      |          |
      |          Option<i32>
      |
    note: the foreign item type `Option<i32>` doesn't implement `Add<{integer}>`
It's not so much "pitfalls" exactly, but you can choose to do the same thing you'd get in a language with null: you can choose not to handle that case:
    let x: Option<i32> = Some(5); // adding the type for clarity

    let result = match x {
        Some(num) => num + 7,
        None => panic!("we don't have a number"),
    };

    dbg!(result); // try to debug print the result
This will successfully print, but if we change `x` to `None`, we'll get a panic, and our current thread dies.
Because this pattern is useful, there's a method on Option called `unwrap()` that does this:
let result = x.unwrap();
And so, you can argue that Rust doesn't truly force you to do something different here. It forces you to make an active choice, to handle it or not to handle it, and in what way. Another option, for example, is to return a default value. Here it is written out, and then with the convenience method:
    let result = match x {
        Some(num) => num + 7,
        None => 0,
    };

    let result = x.unwrap_or(0);
And you have other choices, too. These are just two examples.
--------------
But to go back to the type thing for a bit, knowing statically you don't have any nulls allows you to do what some dynamic language fans call "confident coding," that is, you don't always need to be checking if something is null: you already know it isn't! This makes code more clear, and more robust.
To add on another pitfall: iterator invalidation. In C++ you generally aren't allowed to modify a container while you're iterating through it, because it may re-allocate the memory and leave dangling pointers in the iterator, but the compiler doesn't check this. Rust's lifetime analysis closes this particular issue.
(Basically, the 'newer' C++ features do help a little with memory safety, but it's still fairly easy to trip up even if you restrict your own code from 'dangerous' operations. It's not at all obvious that a useful memory-safe subset of C++ exists. Even if you were to re-write the standard library to correct previous mistakes, it seems likely you would still need something like the borrow checker once you step beyond the surface level).
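A sketch of how the borrow checker closes that particular hole: iterating borrows the container, so mutating it mid-loop fails to compile, while mutation after the borrow ends is fine:

```rust
fn main() {
    let mut v = vec![1, 2, 3];
    for x in &v {
        // v.push(*x); // error[E0502]: cannot borrow `v` as mutable
        //             // because it is also borrowed as immutable
        println!("{x}");
    }
    v.push(4); // fine once the iteration (and its borrow) has ended
    assert_eq!(v, [1, 2, 3, 4]);
}
```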
Many also need to learn that there are configuration settings on their compilers that make those two cases the same, enabling bounds checking on operator[]().
Sure, but at() is guaranteed to throw an exception and operator[] can throw an exception when you go out of bounds. C++26 is tweaking this, but it's still going to differ implementation to implementation.
At least that's my understanding of the situation. Happy to be corrected though.
The problem with the title is that the phrase "pitfalls of safe rust" implies that these pitfalls are unique to, or made worse by, safe rust. But they aren't. They are challenges in any programming language, which are no worse in rust than elsewhere.
It's like if I wrote an article "pitfalls of Kevlar vests" which talked about how they don't protect you from being shot in the head. It's technically correct, but misleading.
Safe Rust code doesn't have accidental remote code execution. C++ often does. C++ people need to stop pretending that "safety" is some nebulous and ill-defined thing. Everyone, even C++ people, knows perfectly damn well what it means. C++ people are just miffed that Rust built it while they slept.
Accidental remote code execution isn't limited to just memory safety bugs. I'm a huge rust fan but it's not good to oversell things. It's okay to be humble.
RCEs are almost exclusively due to buffer overruns. Sure, there are examples where that's not the case, but it's not really an exaggeration or hyperbole when you're comparing it to C/C++.
The recent postgresql sql injection bug was similar. It happened because nobody was checking if a UTF8 string was valid. Postgres’s protections against sql injection assumed that whatever software passed it a query string had already checked that the string was valid UTF8 - but in some languages, this check was never being performed.
This sort of bug is still possible in rust. (Although this particular bug is probably impossible - since safe rust checks UTF8 string validity at the point of creation).
Rust’s static memory protection does still protect you against most RCE bugs. Most is not all. But that’s still a massive reduction in security vulnerabilities compared to C or C++.
In fact "exclusively" doesn't belong in the statement at all. A very small number of successful RCE attacks use exploits at all, and of those, most target (often simple command) injection vulnerabilities like Log4Shell.
If you think back to the big breaches over the last five years, though -- SolarWinds, Colonial Pipeline, Uber, Okta (and through them Cloudflare), Change Healthcare, etc. -- all of these were basic account takeovers.
To the extent that anyone has to choose between investing in "safe" code and investing in IT hygiene, the correct answer today is IT hygiene.
Can you back up your 'very small number " with some data? I don't think it lines up with my own experience here. It's really not an either or matter. Good security requires a multifaceted approach. Memory safety is definitely a worthwhile investment.
What do you count as data? I can keep naming big breaches that didn't involve exploits, like the Caesars and MGM ransomware attacks, or Russia getting deep into Microsoft. There aren't good public data sets, though.
As an example of a bad data set for this conversation, the vast majority of published CVEs have never been used by an attacker. CISA's KEVs give a rough gauge of this, with a little north of 1300 since 2021, and that includes older CVEs that are still in use, like EternalBlue. Some people point to the cardinality of CVE databases as evidence of something, but that doesn't hold up to scrutiny of actual attacks. And this is all before filtering down to memory safety RCE CVEs.
Probably the closest thing to a usable data set here would be reports from incident response teams like Verizon's, but their data is of course heavily biased towards the kinds of incidents that require calling in incident response teams. Last year they tagged something like 15% of breaches as using exploits, and even that is a wild overestimate.
> Memory safety is definitely a worthwhile investment.
In a vacuum, sure, but Python, Java, Go, C#, and most other popular languages are already memory safe. How much software is actively being written in unsafe languages? Back in atmosphere, there's way more value in first making sure all of your VPNs have MFA enabled, nobody's using weak or pwned passwords, employee accounts are deactivated when they leave the company, your help desk has processes to prevent being social engineered, and so on.
> How much software is actively being written in unsafe languages?
Well, let's see. Most major operating system kernels for starters. Web browsers. OpenSSL. Web servers/proxies like Apache, Nginx, HAProxy, IIS, etc. GUI frameworks like Gtk, Qt, parts of Flutter. And so on.
Research I've seen seems to say that 70-80% of vulnerabilities come from memory safety problems[0]. Eliminating those is of course a huge improvement, but is rust doing something to kill the other 20-30%? Or is there something about RCE that makes it the exclusive domain of memory safety problems?
Rust also provides guarantees that go beyond mere memory safety. You get data-race safety as well, which avoids certain kinds of concurrency issues. You also get type safety, which is a step up when it comes to parsing untrusted input, at least compared to C for example. If untrusted input can be parsed into your expected type system, it's more likely to not cause harm by confusing the program about what's in the variables. Rust doesn't straight up eliminate all sources of error, but it makes major strides forward in areas that go beyond mere memory safety.
If English had static checks, this kind of runtime pedantry would be unnecessary. Sometimes it's nice to devote part of your brain to productivity rather than checking coherence.
I haven't actually used it but I do have experience of refinement types / liquid types (don't ask me about the nomenclature) and IMO they occupy a very nice space just before you get to "proper" formal verification and having to deal with loop invariants and all of that complexity.
I find it strange that the article doesn't talk about the alternative to checked arithmetic: explicit Wrapping [0] and Saturating [1] types, also provided as methods on numeric types (e.g. `usize::MAX.saturating_add(1)`).
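A quick sketch of those alternatives side by side (checked, saturating, wrapping, and the dedicated wrapper type):

```rust
use std::num::Wrapping;

fn main() {
    // checked_*: overflow is surfaced as None
    assert_eq!(u8::MAX.checked_add(1), None);
    // saturating_*: clamp at the numeric bounds
    assert_eq!(u8::MAX.saturating_add(1), u8::MAX);
    // wrapping_*: explicit modular arithmetic
    assert_eq!(u8::MAX.wrapping_add(1), 0);
    // Or bake the behavior into the type so it can't be forgotten at a call site
    let x = Wrapping(u8::MAX);
    assert_eq!((x + Wrapping(1)).0, 0);
}
```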
Regarding `as` casting, I completely agree. I am trying to use safe `From::from` instead. However, this is a bit noisy: `usize::from(n)` vs `n as usize`.
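For completeness, the fallible counterpart is `TryFrom`, which turns a potentially lossy cast into a `Result` instead of silently truncating the way `as` does. A sketch:

```rust
fn main() {
    let n: u16 = 300;
    // Infallible widening: a u16 always fits in a u32, so From is available
    assert_eq!(u32::from(n), 300);
    // `as` silently truncates: 300 % 256 == 44
    assert_eq!(n as u8, 44);
    // TryFrom makes the possible loss explicit
    assert!(u8::try_from(n).is_err());
    assert_eq!(u8::try_from(200u16), Ok(200));
}
```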
True, I should add the wrapping types. They are actually quite useful if you know that you have a fixed range of values and you can't go above the min/max. Like a volume dial that just would stay at "max" if you turn up the volume; it wouldn't wrap around.
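For illustration, a small sketch of the explicit alternatives to unchecked arithmetic (saturating, wrapping, and checked):

```rust
use std::num::Wrapping;

fn main() {
    // Saturating methods clamp at the numeric bounds instead of wrapping,
    // like the volume dial that just stays at max.
    assert_eq!(u8::MAX.saturating_add(1), 255);
    assert_eq!(0u8.saturating_sub(1), 0);

    // Wrapping methods perform explicit two's-complement wraparound.
    assert_eq!(u8::MAX.wrapping_add(1), 0);

    // Checked methods return None on overflow, leaving the decision
    // to the caller.
    assert_eq!(u8::MAX.checked_add(1), None);

    // The Wrapping<T> type makes the arithmetic operators themselves wrap,
    // so intent is visible in the type rather than at each call site.
    assert_eq!((Wrapping(255u8) + Wrapping(1)).0, 0);

    println!("ok");
}
```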
I'd add memory leaks to the list. Sometimes you feel compelled to wrap your data in an Rc or Arc (reference counted pointers for those unfamiliar) to appease the borrow checker. With capture semantics of closures and futures and such it's quite easy to fall into a referential cycle, which won't be freed when dropped.
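A minimal sketch of such a cycle, using the two ingredients (a recursive Rc field plus interior mutability); the `Node` type is made up for illustration:

```rust
use std::cell::RefCell;
use std::rc::Rc;

// A node that can point at another node of the same type, with
// interior mutability so the link can be set after construction.
struct Node {
    next: RefCell<Option<Rc<Node>>>,
}

fn main() {
    let a = Rc::new(Node { next: RefCell::new(None) });
    let b = Rc::new(Node { next: RefCell::new(Some(Rc::clone(&a))) });

    // Close the cycle: a -> b -> a. Each node now has two strong
    // references, so when `a` and `b` go out of scope the counts only
    // drop to one and the allocations are never freed, in entirely
    // safe code.
    *a.next.borrow_mut() = Some(Rc::clone(&b));

    assert_eq!(Rc::strong_count(&a), 2);
    assert_eq!(Rc::strong_count(&b), 2);
    println!("ok");
}
```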
I don't think it's a common concern in Rust.
It used to be a problem in Internet Explorer. It's a footgun in Swift, but Rust's exclusive ownership and immutability make cycles very difficult to create by accident.
If you wrap a Future in Arc, you won't be able to use it. Polling requires exclusive access, which Arc disables. Most combinators and spawn() require exclusive ownership of the bare Future type. This is verified at compile time.
Making a cycle with `Arc` is impossible unless two other criteria are met:
1. You have to have a recursive type. `Arc<Data>` can't be recursive unless `Data` already contains `Arc<Data>` inside it, or some abstract type that could contain `Arc<Data>` in it. Rust doesn't use dynamic types by default, and most data types can be easily shown to never allow such a cycle.
It's difficult to make a cycle with a closure too, because you need to have an instance of the closure before you can create an Arc, but your closure can't capture the Arc before it's created. It's a catch-22 that needs extra tricks to work around, which is not something that you can just do by accident.
2. Even if a type can be recursive, it's still not enough, because the default immutability of Arc allows only trees. To make a cycle you need the recursive part of the type to also be in a wrapper type allowing interior mutability, so you can modify it later to form a cycle (or use `Arc::new_cycle` helper, which is an obvious red flag, but you still need to upgrade the reference to a strong one after construction).
It's common to have an Arc-wrapped Mutex. It's possible to have recursive types. But having both together at the same time is less common, and even then you still need to make the cycle yourself, and dodge all the ownership and borrow checking issues required to poll a future in such a type.
This is a very nice article; it objectively lists possible pitfalls.
It's however not quite that simple.
I am in favor of removing `as`, but then try_from needs to work with more types. For example, try converting u64 into f32 without using `as`. It turns out to be very hard; TryInto does not work in a single step.
An important aspect is performance. HPC code needs to be able to opt out of math checks.
Also, Results and Options are very costly, as they introduce a lot of branching. Panic is just faster. Hopefully one day Rust will use something like IEX by default https://docs.rs/iex/latest/iex/
It has the same benefits as Results, but if an error is returned in less than 15% of function calls, then IEX is much faster.
Btw., is allocation failure in std returning Result anytime soon?
> Surprising Behavior of Path::join With Absolute Paths
> I was not the only one who was confused by this behavior. Here’s a thread on the topic, which also includes an answer by Johannes Dahlström:
> > The behavior is useful because a caller […] can choose whether it wants to use a relative or absolute path, and the callee can then simply absolutize it by adding its own prefix and the absolute path is unaffected which is probably what the caller wanted. The callee doesn’t have to separately check whether the path is absolute or not.
> And yet, I still think it’s a footgun. It’s easy to overlook this behavior when you use user-provided paths. Perhaps join should return a Result instead? In any case, be aware of this behavior.
Oh, hey, that's me! I agree that it's a footgun, for what it's worth, and there should probably be a dedicated "absolutize" method for getting the "prefix this if relative, leave as is if already absolute" semantics.
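For anyone who hasn't hit this yet, the surprising behavior in two lines:

```rust
use std::path::{Path, PathBuf};

fn main() {
    // Joining a relative path works as you'd expect:
    assert_eq!(Path::new("/usr").join("share"), PathBuf::from("/usr/share"));

    // But joining an *absolute* path silently replaces the base entirely,
    // which is easy to overlook with user-provided paths:
    assert_eq!(Path::new("/usr").join("/etc/passwd"), PathBuf::from("/etc/passwd"));

    println!("ok");
}
```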
Here's an important one: never use `mem::size_of_val(&T)`. Rust as a language strongly steers you towards ignoring double (or even triple)-referenced types because they're implicitly auto-dereferenced in most places, but the moment you try to throw one of those into this API it returns the size of the referenced reference `&T` which is very much not the same as `T`. I've been burned by this before, particularly in unsafe contexts; I only use `size_of::<T>()` now.
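A small demonstration of the trap; `u8` is chosen so the sizes visibly differ:

```rust
fn main() {
    let x: u8 = 0;

    // Measures the u8 itself: 1 byte.
    assert_eq!(std::mem::size_of_val(&x), 1);

    // Passing a reference-to-reference measures the inner &u8 instead,
    // i.e. a pointer (8 bytes on a 64-bit target), not the u8.
    assert_eq!(std::mem::size_of_val(&&x), std::mem::size_of::<&u8>());

    println!("ok");
}
```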
That was my first impression as well. So much of Rust’s language and standard library enforces correctness, that gaps start to feel way more visible.
“as” is a good example. Floats are pretty much the only reason PartialEq exists, so why can’t we have a guaranteed-not-NaN-nor-inf type in std and use that everywhere? Why not make wrapping integers a panic even in release mode? Why not have proper dependent types (e.g. to remove bound checks), and proper linear types (to enforce that object destructors always run)?
It’s easy to forget that Rust is not an ideal language, but rather a very pragmatic one, and sometimes correctness loses in favour of some other goals.
I have been following rust very closely since 2013.
As Rust is both evolving and spreading wide, we, the programmers and users of Rust, are also leveling up in how we approach correctness and design with it.
Maybe the next evolution will be something like Haskell but fast like Rust is fast like C without the pain of C++.
But it takes a while for the world to catch up, and for everybody to explore and find ways to work with or around the abstractions that helps with correctness.
It's a bit like the evolution from a pointer to some malloc memory, then the shared/unique pointer of C++, to the fully safe box/(a)rc of Rust.
It might be obvious today how much more efficient it is programming with those abstractions.
I see some similarities with functional programming, which still seems so niche even though the enlightened swear by it. And now we actually seem to be slowly merging the best parts of functional and imperative together somehow.
So maybe we are actually evolving programming as a species. And Rust happens to be one of the best scaffolds at this point in history.
There is hardly any evolution from pointer to malloc, C is one of the few systems languages, including those that predated it, where one needs math to allocate heap memory.
I do agree that the evolution is most likely a language that combines automatic resource management with affine/linear/effects/dependent/proofs.
Or AIs improve to the point to render all existing programming languages a thing from the past, replaced by regular natural languages and regular math.
There is some movement towards deprecating "as", and lints that will recommend using alternatives when possible, but there are a couple of cases, such as intentional truncation, where there isn't a stable alternative yet.
I don't remember every argument in there but it seemed that there are good reasons not to add it unlike a NonZero integer type which seems to have no real downsides.
The other option would be to change how floating point works. IEEE specifies operations, not names, so it would be totally valid to have <= on floats be a total order (using integer cpu instructions), and make a function called IEEEAreIdiotsWhoThinkThisIsFloatingPointLessThan which is the partial order that sucks.
For purposes of sorting, Rust does offer a non-IEEE order as f64::total_cmp. You can easily build a wrapper type that uses that for all comparisons, or use a crate that does it for you
total_cmp is precisely IEEE's separately specified total order for floats. It's just that the more common operators do something different, and that's perhaps better for most uses where NaN are inherently unexpected and generally indicate that some kind of error condition has occurred.
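A quick sketch of using the total order for sorting, where the partial-order comparison would fail to compile with `sort`:

```rust
fn main() {
    let mut v = vec![3.0_f64, f64::NAN, 1.0, 2.0];

    // f64 is not Ord, so v.sort() doesn't compile; total_cmp supplies
    // IEEE 754's total order, in which (positive) NaN sorts last.
    v.sort_by(|a, b| a.total_cmp(b));

    assert_eq!(&v[..3], &[1.0, 2.0, 3.0]);
    assert!(v[3].is_nan());
    println!("ok");
}
```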
Some of these don't strike me as particularly pragmatic. E.g. are overflow checks really that expensive, given that it's a well-known footgun that is often exploitable? Sure, you don't want, say, 10% overhead in your number-crunching codec or whatever, but surely it's better to have those cases opt in for better perf as needed, as opposed to a default behavior that silently produces invalid results?
> Some of these don't strike me as particularly pragmatic. E.g. are overflow checks really that expensive
Did you read the article? Rust includes overflow checks in debug builds, and then about a dozen methods (checked_mul, checked_add, etc.) which explicitly provide for checks in release builds.
Pragmatism, for me, is this help when you need it approach.
TBF Rust forces certain choices on one in other instances, like SipHash as the default Hasher for HashMap. But again opting out, like opting in, isn't hard.
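For completeness, opting back in to overflow panics for release builds is a one-line Cargo profile setting:

```toml
# Cargo.toml: keep overflow checks on even in optimized builds.
[profile.release]
overflow-checks = true
```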
I'd prefer for Rust to opt for correctness/bug-freeness over performance, even in release builds. If you are doing number crunching you should have to opt out of these checks.
> I'd prefer for Rust to opt for correctness/bug-freeness over performance, even in release builds. If you are doing number crunching you should have to opt out of these checks.
But I think the behavior on overflow is to "panic!()" (terminate immediately)? So -- I guess from my POV I wouldn't in release mode. I just think that tradeoff isn't generally worth it, but again, you can turn that behavior on.
"This allows a *program to terminate immediately* and provide feedback to the caller of the program."
Now, I don't think so, because program death is usually what this type of panic means.
And my point remains, without more, this probably isn't the behavior one wants in release mode. But, yes, also perhaps an even better behavior is turning on checks, catching the panic, and logging it with others.
I don't disagree that it could use revising, but it's technically correct: it allows but does not require. If you've configured panic=abort, it will abort the program instead of unwind, but that's not the default.
Compared to C/C++, "as" feels so much safer. Now that Rust and we the programmers have evolved with it, I too feel that "as" for narrowing conversion is a small footgun.
I'm struggling to see how you would implement narrowing conversion in a way that is harder for programmers to misuse when they aren't being mindful, while also being pleasant to use when you really do want to just drop higher bits. Like, you could conceivably have something like a "try_narrow" trait which wraps the truncated value inside an Err when it doesn't fit, and it would probably be harder to accidentally misuse, but that's also really cumbersome to use when you are trying to truncate things.
I don't really want narrowing conversion to be harder, I just want checked conversion to be at least nearly as convenient. `x as usize` vs `x.try_into().unwrap()` becomes `x tiu usize` or something even. I'm not picky. It's kind of funny that this is the exact mistake C++ made, where the safe version of every container operation is the verbose one: `vector[]` vs `vector.at()` or `*optional` vs `optional.value()`, which results in tons and tons of memory problems for code that has absolutely no performance need for unchecked operations.
It's useful to have something that does the job of "as", but I dislike how the most dangerous tool for type conversions has the nicest syntax.
Most of the time I want the behavior of `.try_into().unwrap()` (with the compiler optimizing the checks away if it's always safe), or would even prefer a version that only works if the conversion is safe and lossless (something I can reason about right now, but want to ensure even after refactorings). The latter is really hard to achieve, and `.try_into().unwrap()` is 20 characters where `as` is 2. Not a big deal to type with autocomplete, but a lot of visual clutter.
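Side by side, the two behaviors being discussed:

```rust
fn main() {
    let x: u64 = 300;

    // `as` silently truncates: 300 mod 256 == 44.
    assert_eq!(x as u8, 44u8);

    // try_into surfaces the loss as an error instead...
    let narrowed: Result<u8, _> = x.try_into();
    assert!(narrowed.is_err());

    // ...and succeeds when the value actually fits.
    let ok: u8 = 200u64.try_into().unwrap();
    assert_eq!(ok, 200);

    println!("ok");
}
```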
The question is not whether the language should include such a facility, but whether 'as' should be the syntax for it. 'as' is better than the auto conversions of C but it's still extremely obscure. It would be better to have some kind of explicit operator marking this kind of possibly unintended modulo conversion. Rust will gain safe transmute operations in the near future so that will perhaps be a chance to revise this whole area as well.
Can you expand on your thoughts here? What is the root issue with unsigned integers? Is your complaint primarily based on the implications/consequences of overflow or underflow? I’m genuinely curious as I very often prefer u32 for most numeric computations (although in a more ‘mathematics’ domain signed ints will often be the correct choice).
Unsigned integers are appealing because they make a range of invalid values impossible to represent. That's good! Indices can't be negative so simply do not allow negative values.
The issues are numerous, and benefits are marginal. First and foremost it is extremely common to do offset math on indices whereby negative values are perfectly valid. Given two indices idxA and idxB if you have unsigned indices then one of (idxB - idxA) or (idxA - idxB) will underflow and cause catastrophe. (Unless they're the same, of course).
The benefits are marginal because even though unsigned cannot represent a value below the valid range it can represent a value above container.size() so you still need to bounds check the upper range. If you can't go branchless then who cares about eliminating one branch that can always be treated as cold.
On a modern 64-bit machine doing math on smaller integers isn't any faster and may in fact be slower!
Now it can be valuable to store smaller integers. Especially for things like index lists. But in this case you're probably not doing any math so the overflow/underflow issue is somewhat moot.
Anyhow. Use unsigned when doing bitmasking or bitmanipulation. Otherwise default to signed integer. And default to i64/int64_t. You can use smaller integer types and even unsigned. Just use i64 by default and only use something else if you have a particular reason.
I'm kinda rambling and those thoughts are scattered. Hope it was helpful.
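The offset-math problem in concrete form (index names are made up):

```rust
fn main() {
    let idx_a: usize = 2;
    let idx_b: usize = 5;

    // One direction is fine...
    assert_eq!(idx_b - idx_a, 3);

    // ...but the other direction would panic in debug builds and wrap
    // to a huge value in release builds. checked_sub makes the failure
    // explicit instead:
    assert_eq!(idx_a.checked_sub(idx_b), None);

    // Or compute the difference as a signed offset up front:
    let offset = i64::try_from(idx_a).unwrap() - i64::try_from(idx_b).unwrap();
    assert_eq!(offset, -3);

    println!("ok");
}
```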
Overflow errors absolutely do happen. They're just no longer UB. It doesn't make them non-errors though. If your bank account balance overflowed, you'd be pretty upset.
The vast majority of code that does arithmetic will not produce a correct result with two's complement. It is simply assuming that the values involved are small enough that it won't matter. Sometimes it is a correct assumption, but whenever it involves anything derived from inputs, it can go very wrong.
This is something that's always bugged me because, yes, this is a real problem that produces real bugs. But at the same time if you really care about this issue then every arithmetic operation is unsafe and there is never a time you should use them without overflow checks. Sometimes you can know something won't overflow but outside of some niche type systems you can't really prove it to the compiler to elide the check in a way that is safe against code modifications— i.e. if someone edits code that breaks the assumption we needed to know it won't overflow it will err.
But at the same time in real code in the real world you just do the maths, throw caution to the wind, and if it overflows and produces a bug you just fix it there. It's not worth the performance hit and your fellow developers will call you mad if you try to have a whole codebase with only checked maths.
I think this is very much a cultural issue rather than a technical one. Just look at array bounds checking: widespread in the mainframe era even in systems languages, relegated to high-level languages for a very long time on the basis of unacceptable perf hit in low-level code, but more recently seeing more acceptance in new systems languages (e.g. Rust).
Similarly in this case, it's not like we don't have languages that do checked arithmetic throughout by default. VB.NET, for example, does exactly that. Higher-level languages have other strategies to deal with the problem; e.g. unbounded integer types as in Python, which simply never overflow. And, like you say, this sort of thing is considered unacceptable for low-level code on perf grounds, but, given the history with nulls and OOB checking, I think there is a lesson here.
For any arithmetic expression that involves only + - * operators and equally-sized machine words, two's complement will actually yield a "correct" result. It's just that the given result might be indicating a different range than you expect.
I'm a big fan of liberal use of saturating_mul/add/sub whenever there is a conceivable risk of coming withing a couple orders of magnitude of overflow. Or checked_*() or whatever the best behavior in the given case is. For my code it happens to mostly be saturating.
Overflow bugs are a real pain, and so easy to prevent in Rust with just a function call. It's pretty high on my list of favorite improvements over C/C++
You obviously have to decide it on a case-by-case basis. But anything that is only used in a comparison is usually fine with saturating. And many things that measures values or work with measurements are fine with saturating if it's documented. Saturating is how most analog equipment works too, and in non-interactive use cases "just pick the closest value we can represent" is often better than erroring out or recording nothing at all.
Of course don't use saturating_add to calculate account balance, there you should use checked_add.
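To make the case-by-case distinction concrete (`volume` and `balance` are just illustrative names):

```rust
fn main() {
    // Saturating is fine for a bounded dial: clamp at max, don't wrap.
    let volume: u8 = 250;
    assert_eq!(volume.saturating_add(10), 255);

    // For an account balance, silently clamping would corrupt data;
    // checked_add surfaces the overflow so the caller can reject it.
    let balance: u64 = u64::MAX - 5;
    assert_eq!(balance.checked_add(10), None);

    println!("ok");
}
```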
Some of this advice is wrongheaded. Consider array indexing: usually, an out of bounds access indicates a logic error and should fail fast to abort the program so it doesn't go further off the rails. Encouraging people to use try-things everywhere just encourages them to paper over logic bugs and leads to less reliable software in the end. Every generation has to learn this lesson anew through pain.
Try-things have the benefit of accurately representing the thing you're describing. Leave it to the caller to decide whether to panic or resize the data structure or whatever.
That's also not the only choice in the design space for correct array accesses. Instead of indices being raw integers, you can use tagged types (in Rust, probably using lifetimes as the mechanism if you had to piggy back on existing features, but that's an implementation detail) and generate safe, tagged indices which allow safe access without having to bounds check on access.
However you do it, the point is to not lie about what you're actually doing and invoke a panic-handler-something as a cludgy way of working around the language.
I think what you are saying is that there must be an informed decision between crashing the program vs returning an error, instead of returning an error for everything that happens to be a logic error at a given level of abstraction.
I think that example comes from the talk "Safety in an Unsafe World" [0, slides at 1].
There are some crates which implement lock ordering as well (e.g., [2, 3]). lock-ordering states it's inspired by the technique discussed in the talk as well, for what it's worth.
Golang will panic with a runtime error index out of range if you index out of bounds. There doesn't seem to be a nice built-in way to do `arr.get(3)` like in Rust.
  slice := []int{1, 2, 3}
  i := slice[3]
  fmt.Println(i)
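The Rust counterpart, where the caller gets to choose between panicking and handling the miss:

```rust
fn main() {
    let arr = [1, 2, 3];

    // Plain indexing out of bounds panics, like Go's runtime error:
    // let _ = arr[3]; // would panic at runtime

    // `get` returns an Option instead, letting the caller decide:
    assert_eq!(arr.get(2), Some(&3));
    assert_eq!(arr.get(3), None);

    println!("ok");
}
```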
Both have their place but after writing both extensively, I much prefer Rust - despite the pitfalls.
My biggest criticism of Rust (in comparison to Go) is the lack of a powerful standard library, while Go's standard library is outstanding. I would also like to see standardized interfaces in Rust (like AsyncWrite), and in general the async story could be better, though I appreciate the versatility.
Title is slightly misleading but the content is good. It's the "Safe Rust" in the title that's weird to me. These apply to Rust altogether, you don't avoid them by writing unsafe Rust code. They also aren't unique to Rust.
A less baity title might be "Rust pitfalls: Runtime correctness beyond memory safety."
It is consistent with the way the Rust community uses "safe": as "passes static checks and thus protects from many runtime errors."
This regularly drives C++ programmers mad: the statement "C++ is all unsafe" is taken as some kind of hyperbole, attack or dogma, while the intent may well be to factually point out the lack of statically checked guarantees.
It is subtle but not inconsistent that strong static checks ("safe Rust") may still leave the possibility of runtime errors. So there is a legitimate, useful broader notion of "safety" where Rust's static checking is not enough. That's a bit hard to express in a title - "correctness" is not bad, but maybe a bit too strong.
No, the Rust community almost universally understands "safe" as referring to memory safety, as per Rust's documentation, and especially the unsafe book, aka Rustonomicon [1]. In that regard, Safe Rust is safe, Unsafe Rust is unsafe, and C++ is also unsafe. I don't think anyone is saying "C++ is all unsafe."
You might be talking about "correct", and that's true, Rust generally favors correctness more than most other languages (e.g. Rust being obstinate about turning a byte array into a file path, because not all file paths are made of byte arrays, or e.g. the myriad string types to denote their semantics).
[1] https://doc.rust-lang.org/nomicon/meet-safe-and-unsafe.html
Mostly, there is a subculture that promotes tainting everything that could be used incorrectly as unsafe, instead of only memory-safety-related operations.
That subculture is called “people who haven’t read the docs”, and I don’t see why anyone would give a whole lot of weight to their opinion on what technical terms mean
I don't see why people would drop the "memory" part of "memory safe" and just promote the false advertising of "safe rust"
It sounds like you should read the docs. It's just a subject-specific abbreviation, not an advertising trick.
but it is false advertising when it's used all over the internet with: rust is safe! Telling the whole world to rtfm for your co-opting of the generic word "safe" is like advertisers telling you to read the fine print: a sleazy tactic.
It's not that either, and you are validating the GP's point. Rust has a very specific 'unsafe' keyword that every Rust developer interprets implicitly and instinctively as 'potentially memory-unsafe'. Consequently, 'safe' is interpreted as the opposite: 'guaranteed memory-safe'. Using that word as an abbreviation among Rust developers is therefore not uncommon.
However, while speaking about the Rust language in general, all half-decent Rust developers specify that it's about memory safety. Even the Rust language homepage has only two instances of the word: 'memory-safety' and 'thread-safety'. The accusations of sleaziness and false advertising are disingenuous at best.
Someone tell that to the standard library. No memory safety involved in non-zero numbers https://doc.rust-lang.org/std/num/struct.NonZero.html#tymeth...
There is, since the zero is used as a niche value optimisation for enums, so that Option<NonZero<u32>> occupies the same amount of memory as u32.
But this can be used with other enums too, and in those cases, having a zero NonZero would essentially transmute the enum into an unexpected variant, which may cause an invariant to break, thus potentially causing memory unsafety in whatever required that invariant.
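The niche optimization and the checked constructor, in a few lines:

```rust
use std::num::NonZeroU32;

fn main() {
    // The all-zeros bit pattern is free to represent Option::None, so
    // the Option wrapper adds no space: the "niche" optimization.
    assert_eq!(
        std::mem::size_of::<Option<NonZeroU32>>(),
        std::mem::size_of::<u32>()
    );

    // The safe constructor checks for zero at runtime; new_unchecked
    // is `unsafe` because a zero value would corrupt that same niche.
    assert_eq!(NonZeroU32::new(0), None);
    assert_eq!(NonZeroU32::new(5).map(|n| n.get()), Some(5));

    println!("ok");
}
```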
> which may cause an invariant to break, thus potentially causing memory unsafety in whatever required that invariant
By that standard anything and everything might be tainted as "unsafe", which is precisely GP's point. Whether the unsafety should be blamed on the outside code that's allowed to create a 0-valued NonZero<…> or on the code that requires this purported invariant in the first place is ultimately a matter of judgment, that people may freely disagree about.
EDIT: A summary of this is that it is impossible to write a sound std::Vec implementation if NonZero::new_unchecked is a safe function. This is specifically because creating a value of NonZero which is 0 is undefined behavior which is exploited by niche optimization. If you created your own `struct MyNonZero(u8)`, then you wouldn't need to mark MyNonZero::new_unchecked as unsafe because creating MyNonZero(0) is a "valid" value which doesn't trigger undefined behavior.
The issue is that this could potentially allow creating a struct whose invariants are broken in safe rust. This breaks encapsulation, which means modules which use unsafe code (like `std::vec`) have no way to stop safe code from calling them with the invariants they rely on for safety broken. Let me give an example starting with an enum definition:
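The enum definition appears to have been lost in formatting; a hedged reconstruction consistent with the field names the comment goes on to describe might look like this (the exact types are assumptions):

```rust
use std::num::NonZeroU8;

// Hypothetical reconstruction: First's fields are laid out so that
// capacity/length/arena line up with the internals of Second's Vec.
enum Example {
    First {
        discriminator: NonZeroU8,
        capacity: usize,
        length: usize,
        arena: *mut u8,
    },
    Second {
        vec: Vec<u8>,
    },
}

fn main() {
    let e = Example::Second { vec: vec![1, 2, 3] };
    match e {
        Example::Second { vec } => assert_eq!(vec.len(), 3),
        Example::First { .. } => unreachable!(),
    }
    println!("ok");
}
```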
Now assume the compiler has used niche optimization so that if the byte corresponding to `discriminator` is 0, then the enum is `Example::Second`, while if the byte corresponding to `discriminator` is not 0, then the enum is `Example::First` with discriminator being equal to its given non-zero value. Furthermore, assume that `Example::First`'s `capacity`, `length`, and `arena` fields are in the same position as the fields of the same name for `Example::Second.vec`. If we allow `fn NonZero::new_unchecked(u8) -> NonZero<u8>` to be a safe function, we can create an invalid Vec. So if we allowed new_unchecked to be safe, then it would be impossible to write a sound definition of Vec.

Yeah, anything can (and should) be marked unsafe if it could lead to memory safety problems. And so if it potentially breaks an invariant which is relied on for memory safety, it should be marked unsafe (conversely, code should not rely on an unchecked, safe condition for memory safety). That's basically how it works: Rust has the concept of unsafe functions so that libraries can communicate to users about what can and can't be relied on to keep memory safety without manual checking. This requires a common definition of 'safe', but it then means there isn't any argument about where the bug is: if the invariant isn't enforced by the compiler in safe code, then other code should not rely on it. If it is, then the bug is in the unsafe code that broke the invariant.
> Whether the unsafety should be blamed on the outside code that's allowed to create a 0-valued NonZero<…> or on the code that requires this purported invariant in the first place is ultimately a matter of judgment, that people may freely disagree about.
It's not, though. NonZero<T> has an invariant that a zero value is undefined behavior. Therefore, any API which allows for the ability to create one must be unsafe. This is a very straightforward case.
Because of the cult-like belief structures growing up around Rust, it's clear as day for us on the outside. I see it from the evangelists in the company I work for: "rust is faster and safer to develop with when compared to c++". I'm no c++ fan, but it's obviously nonsense.
I feel people took the comparison of rust to c and extrapolated to c++ which is blatantly disingenuous.
The cult that I see growing online a lot are those who are invested in attacking Rust for some reason, though their arguments often indicate that they haven't even tried it. I believe that we're focusing so much on Rust evangelists that we're neglecting the other end of the zealotry spectrum - the irrational haters.
The Rust developers I meet are more interested in showing off their creations than in evangelizing the language. Even those on dedicated Rust forums are generally very receptive to other languages - you can see that in action on topics like goreleaser or Zig's comptime.
And while you have already dismissed the other commenter's experience of finding Rust nicer than C++ to program in, I would like to add that I share their experience. I have nothing against C++, and I would like to relearn it so that I can contribute to some projects I like. But the reason why I started with Rust in 2013 was because of the memory-safety issues I was facing with C++. There are features in Rust that I find surprisingly pleasant, even with 6 additional years of experience in Python. Your opinion that Rust is unpleasant to the programmer is not universal and its detractions are not nonsense.
I appreciate the difficulty in learning Rust - especially getting past the stage of fighting the borrow checker. That's the reason why I don't promote Rust for immediate projects. However, I feel that the knowledge required to get past that stage is essential even for correct C and C++. Rust was easy for me to get started in, because of my background in digital electronics, C and C++. But once you get past that peak, Rust is full of very elegant abstractions that are similar to what's seen in Python. I know it works because I have trained js and python developers in Rust. And their feedback corroborates those assumptions about learning Rust.
Care to explain the obvious, then? Rust is quite a lot nicer to write than C++ in my experience (and in fact, it seems like rust is most attractive to people who were already writing C++: people who still prefer C are a lot less likely to like Rust).
There is nothing attractive about c++ or rust, I really don't understand how anyone can think so, it has to be some sort of Stockholm syndrome. Think about it, before you started programming what about your experiences would make you appreciate the syntax soup of rust and c++?
I dunno, there's not much about my previous experience that would indicate much one way or the other. I have found, though, that I tend to prefer slightly denser, heterogeneous code and syntax than average. Low-syntax languages like Haskell and Lisps make my head hurt because the code is so formless it becomes hard for me to parse, while languages with more syntax and symbols are easier (though there is a limit; APL, K, etc. are a little far, I find).
I see this subculture far more in online forums than with fellow Rust developers.
Most often, the comments come from people who don’t even write much Rust. They either know just enough to be dangerous or they write other languages and feel like it’s a “gotcha” they can use against Rust.
Formally the team/docs are very clear, but I think many users of Rust miss that nuance and lump memory safety together with all the other features that create the "if it compiles it probably works" experience
So I agree with the above comment that the title could be better, but I also understand why the author gave it this title
I agree with most of your assertions.
> ... with all the other features that create the "if it compiles it probably works" experience
While it's true that Rust's core safety feature is almost exclusively about memory safety, I think it contributes more to the overall safety of the program.
My professional background is more in electronics than in software. So when the Rust borrow checker complains, I tend to map them to nuances of the hardware and seek work-arounds for those problems. Those work-arounds often tend to be better restructuring of the code, with proper data isolation. While that may seem like hard work in the beginning, it's often better towards the end because of clarity and modularity it contributes to the code.
Rust won't eliminate logical bugs or runtime bugs from careless coding. But it does encourage better coding practices. In addition, the strict, but expressive type system eliminates more bugs by encoding some extra constraints that are verified at compile time. (Yes, there are other languages that do this better).
And while it is not guaranteed, I find Rust programs to just work if it compiles, more often than in the other languages I know. And the memory-safety system has a huge role in that experience.
If a C++ developer decides to use purely containers and smart pointers when starting a new project, how are they going to develop unsafe code?
Containers like std::vector and smart pointers like std::unique_ptr seem to offer all of the same statically checked guarantees that Rust does.
I just do not see how Rust is a superior language compared to modern C++
The commonly given response to this question is two-fold, and both parts have a similar root cause: smart pointers and "safety" being bolted-on features developed decades after the fact. The first part is the standard library itself. You can put your data in a vec for instance, but if you want to iterate, the standard library gives you back a regular pointer that can be dereferenced unchecked, and is liable to be invalidated while still held in the event of a mutation. The second part is third party libraries. You may be diligent about managing memory with smart pointers, but odds are any library you might use probably wants a dumb pointer, and whether or not it assumes responsibility for freeing that pointer later is at best documented in natural language.
This results in an ecosystem where safety is opt-in, which means in practice most implementations are largely unsafe. Even if an individual developer wants to be proactive about safety, the ecosystem isn't there to support them to the same extent as in Rust. By contrast, safety is the defining feature of the Rust ecosystem. You can write code, and the language and ecosystem support you in doing so rather than being a barrier you have to fight back against.
Yep. Safe rust also protects you from UB resulting from incorrect multi-threaded code.
In C++ (and C#, Java, Go and many other “memory safe languages”), it’s very easy to mess up multithreaded code. Bugs from multithreading are often insanely difficult to reproduce and debug. Rust’s safety guardrails make many of these bugs impossible.
This is also great for performance. C++ libraries have to decide whether it’s better to be thread safe (at a cost of performance) or to be thread-unsafe but faster. Lots of libraries are thread safe “just in case”. And you pay for this even when your program / variable is single threaded. In rust, because the compiler prevents these bugs, libraries are free to be non-threadsafe for better performance if they want - without worrying about downstream bugs.
I've written some multithreaded rust and I've gotta say, this does not reflect my experience. It's just as easy to make a mess, as in any other language.
Safe rust prevents you from writing data races. All concurrent access is forced to be guarded by synchronization primitives. Eliminating an entire class of bugs.
You can still create a mess with logical race conditions, deadlocks and similar bugs, but you won't get segfaults because, after the tenth iteration, you forgot to diligently manage the mutex.
Personally I feel that in rust I can mostly reason locally, compared to say Go when I need to understand a global context whenever I touch multithreaded code.
Me too. I agree that its not a bed of roses - and all the memory safety guarantees in the world don't stop you from making a huge mess. But I haven't run into any of the impossible-to-debug crashes / heisenbugs in my multithreaded rust code that I have in C/C++.
I think rust delivers on its safety promise.
Most likely because all that multi-threaded code accesses in-memory data structures internal to the process memory, which is the only scenario in multi-threaded systems that Rust has some support for.
Make those threads access external resources simultaneously, or memory mapped to external writers, and there is no support from the Rust type system.
> Make those threads access external resources simultaneously, or memory mapped to external writers, and there is no support from Rust type system.
I don’t think that’s true.
External thread-unsafe resources like that are similar in a way to external C libraries: they’re sort of unsafe by default. It’s possible to misuse them to violate rust’s safe memory guarantees. But it’s usually also possible to create safe struct / API wrappers around them which prevent misuse from safe code. If you model an external, thread-unsafe resource as a struct that isn’t Send / Sync then you’re forced to use the appropriate threading primitives to interact with the resource from multiple threads. When you use it like that, the type system can be a great help. I think the same trick can often be done for memory mapped resources - but it might come down to the specifics.
If you disagree, I’d love to see an example.
Shared memory, shared files, hardware DMA, shared database connections to the same database.
You can control safety as much as you like on the Rust side; there is no way to validate that the data coming into the process memory doesn't get corrupted by the other side while it is being read from the Rust side.
Unless, that is, access is built in a way that all parties accessing the resource have to play by the same validation rules before writing into it, via OS IPC resources like shared mutexes, semaphores, and critical sections.
The kind of typical readers-writers algorithms in distributed computing.
What mainstream language has type system features that make multi-threaded access to external resources safe?
Managing something like that is a design decision of the software being implemented not a responsibility of the language itself.
None, however the fearless concurrency sales pitch usually leaves that scenario as footnote.
The standard library doesn't give you a regular pointer, though (unless you specifically ask for that). It gives you an iterator, which is pointer-like, but exists precisely so that other behaviors can be layered. There's no reason why such an iterator can't do bounds checking etc, and, indeed, in most C++ implementations around, iterators do make such checks in debug builds.
The problem, rather, is that there's no implementation of checked iterators that's fast enough for release build. That's largely a culture issue in C++ land; it could totally be done.
VC++ checked iterators are fast enough for my use cases, not everyone is trying to win a F1 race when having to deal with C++ written code.
Unfortunately, operator[] on std::vector is inherently unsafe. You can potentially try to ban it (using at() instead), but that has its own problems.
There’s a great talk by Louis Brandy called “Curiously Recurring C++ Bugs at Facebook” [0] that covers this really well, along with std::map’s operator[] and some more tricky bugs. An interesting question to ask if you try to watch that talk is: How does Rust design around those bugs, and what trade offs does it make?
[0]: https://m.youtube.com/watch?v=lkgszkPnV8g
Thank you for sharing. Seems I still have more to learn!
It seems the bug you are flagging here is a null reference bug - I know Rust has Optional as a workaround for “null”
Are there any pitfalls in Rust when Optional does not return anything? Or does Optional close this bug altogether? I saw Optional pop up in Java to quiet down complaints about null pointer bugs, but remained skeptical whether it was better to design around the fact that there could be the absence of "something" when it should have been initialized.
It's not so much Optional that deals with the bug. It's the fact that you can't just use a value that could possibly be null in a way that would break at runtime if it is null - the type system won't allow you, forcing an explicit check. Different languages do this in different ways - e.g. in C# and TypeScript you still have null, but references are designated as nullable or non-nullable - and an explicit comparison to null changes the type of the corresponding variable to indicate that it's not null.
I think sum types in general and Option<T> in particular are nicer. But the reason C# has nullability isn't that they disagree with me; it's that fundamentally the CLR has the same model as Java: all these types can be null. Even though in the modern C# language you can say "no, null is never OK here", at runtime on the CLR, too bad, maybe it's null anyway.
For example if I write a C# function which takes a Goose, specifically a Goose, not a Goose? or similar - well, too bad the CLR says my C# function can be called by this obsolete BASIC code which has no idea what a Goose is, but it's OK because it passed null. If my code can't cope with a null? Too bad, runtime exception.
In real C# apps written by an in-house team this isn't an issue, Ollie may not be the world's best programmer but he's not going to figure out how to explicity call this API with a null, he's going to be stopped by the C# compiler diagnostic saying it needs a Goose, and worst case he says "Hey tialaramex, why do I need a Goose?". But if you make stuff that's used by people you've never met it can be an issue.
> For example if I write a C# function which takes a Goose, specifically a Goose, not a Goose? or similar - well, too bad the CLR says my C# function can be called by this obsolete BASIC code which has no idea what a Goose is, but it's OK because it passed null. If my code can't cope with a null? Too bad, runtime exception.
That's actually no different to Rust still; if you try, you can pass a 0 value to a function that only accepts a reference (i.e. a non-zero pointer), be it by unsafe, or by assembly, or whatever.
Disagreeing with another comment on this thread, this isn't a matter of judgement around "whose bug is it? Should the callee check for null, or the caller?". Rust's win is in clearly articulating that the API takes non-zero, so the caller is buggy.
As you mention, it can still be an issue, but there should be no uncertainty around whose mistake it is.
The difference is that C# has well-defined behavior in this case - a non-nullable annotation is really "not-nullable-ish", and there are cases even in the language itself where code without any casts in it will observe null values of such types. It's just a type system hole they allow for convenience and back-compat.
OTOH with Rust you'd have to violate its safety guarantees, which if I understand correctly triggers UB.
> which if I understand correctly triggers UB.
Yes, your parent's example would be UB, and require unsafe.
Rust’s Optional does close this altogether, yes. All (non-unsafe) users of Optional are required to have some defined behavior in both cases. This is enforced by the language in the match statement, and most of the “member functions” on Optional use match under the hood.
This is an issue with the C++ standardization process as much as with the language itself. AIUI when std::optional (and std::variant, which has similar issues) were defined, there was a push to get new syntax into the language itself that would’ve been similar to Rust’s match statement.
However, that never made it through the standardization process, so we ended up with “library variants” that are not safe in all circumstances.
Here’s one of the papers from that time, though there are many others arguing different sides: https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/p00...
> whether or not it was better to design around the fact that there could be the absence of “something” coming into existence when it should have been initialized
So this is actually why "no null, but optional types" is such a nice spot in the programming language design space. Because by default, you are making sure it "should have been initialized". In Rust, if a Point has plain numeric fields x and y, you know that x and y can never be null: you can't construct a Point without those numbers existing. By contrast, if the fields are Option-typed, the coordinates are allowed to be missing, and you know by looking at the type whether it's ever possible for something to be absent.

> Are there any pitfalls in Rust when Optional does not return anything?

So, Rust will require you to handle both cases: a match on an Option that doesn't cover both Some and None is a compile-time error. It's not so much "pitfalls" exactly, but you can choose to do the same thing you'd get in a language with null: you can choose not to handle the None case. That works fine while the value is Some, but if it's None, we get a panic, and our current thread dies. Because this pattern is useful, there's a method on Option called `unwrap()` that does exactly this.

And so, you can argue that Rust doesn't truly force you to do something different here. It forces you to make an active choice, to handle it or not to handle it, and in what way. Another option, for example, is to return a default value instead, written out by hand or with the convenience method `unwrap_or()`. And you have other choices, too. These are just two examples.

--------------
But to go back to the type thing for a bit, knowing statically you don't have any nulls allows you to do what some dynamic language fans call "confident coding," that is, you don't always need to be checking if something is null: you already know it isn't! This makes code more clear, and more robust.
If you take this strategy to its logical end, you arrive at "parse, don't validate," which uses Haskell examples but applies here too: https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-va...
To add on another pitfall: iterator invalidation. In C++ you generally aren't allowed to modify a container while you're iterating through it, because it may re-allocate the memory and leave dangling pointers in the iterator, but the compiler doesn't check this. Rust's lifetime analysis closes this particular issue.
(Basically, the 'newer' C++ features do help a little with memory safety, but it's still fairly easy to trip up even if you restrict your own code from 'dangerous' operations. It's not at all obvious that a useful memory-safe subset of C++ exists. Even if you were to re-write the standard library to correct previous mistakes, it seems likely you would still need something like the borrow checker once you step beyond the surface level).
Here's a program that uses only std::unique_ptr:
Clang 20 compiles this code with `-std=c++23 -Wall -Werror`. If you add `-fsanitize=undefined`, it will print a runtime error report (or similar). C++ devs need to understand the difference between the unchecked `operator[]` and the bounds-checked `at()` method.

Even the `at` method isn't statically checked. If you want static checking, you probably need to use `std::array`. Many also need to learn that there are configuration settings on their compilers that make those two cases the same, enabling bounds checking on `operator[]()`.
Sure, but at() is guaranteed to throw an exception and operator[] can throw an exception when you go out of bounds. C++26 is tweaking this, but it's still going to differ implementation to implementation.
At least that's my understanding of the situation. Happy to be corrected though.
The problem with the title is that the phrase "pitfalls of safe rust" implies that these pitfalls are unique to, or made worse by, safe rust. But they aren't. They are challenges in any programming language, which are no worse in rust than elsewhere.
It's like if I wrote an article "pitfalls of Kevlar vests" which talked about how they don't protect you from being shot in the head. It's technically correct, but misleading.
> This regularly drives C++ programmers mad
I thought the C++ language did that.
It certainly used to, but tbh C++ since 17 has been pretty decent and continually improving.
That said, I still prefer to use it only where necessary.
Safe Rust code doesn't have accidental remote code execution. C++ often does. C++ people need to stop pretending that "safety" is some nebulous and ill-defined thing. Everyone, even C++ people, knows perfectly damn well what it means. C++ people are just miffed that Rust built it while they slept.
Accidental remote code execution isn't limited to just memory safety bugs. I'm a huge rust fan but it's not good to oversell things. It's okay to be humble.
RCEs are almost exclusively due to buffer overruns, sure there are examples where that’s not the case but it’s not really an exaggeration or hyperbole when you’re comparing it to C/C++
Almost exclusively isn't the same as exclusively.
Notably the log4shell[1] vulnerability wasn't due to buffer overruns, and happened in a memory safe language.
[1]: https://en.m.wikipedia.org/wiki/Log4Shell
The recent postgresql sql injection bug was similar. It happened because nobody was checking if a UTF8 string was valid. Postgres’s protections against sql injection assumed that whatever software passed it a query string had already checked that the string was valid UTF8 - but in some languages, this check was never being performed.
This sort of bug is still possible in rust. (Although this particular bug is probably impossible - since safe rust checks UTF8 string validity at the point of creation).
This is one article about it - there was a better write up somewhere but I can’t find it now: https://www.rapid7.com/blog/post/2025/02/13/cve-2025-1094-po...
Rust’s static memory protection does still protect you against most RCE bugs. Most is not all. But that’s still a massive reduction in security vulnerabilities compared to C or C++.
In fact "exclusively" doesn't belong in the statement at all. A very small number of successful attacks use exploits at all, and of those, most target (often simple command) injection vulnerabilities like Log4Shell.
If you think back to the big breaches over the last five years, though -- SolarWinds, Colonial Pipeline, Uber, Okta (and through them Cloudflare), Change Healthcare, etc. -- all of these were basic account takeovers.
To the extent that anyone has to choose between investing in "safe" code and investing in IT hygiene, the correct answer today is IT hygiene.
Can you back up your 'very small number " with some data? I don't think it lines up with my own experience here. It's really not an either or matter. Good security requires a multifaceted approach. Memory safety is definitely a worthwhile investment.
What do you count as data? I can keep naming big breaches that didn't involve exploits, like the Caesars and MGM ransomware attacks, or Russia getting deep into Microsoft. There aren't good public data sets, though.
As an example of a bad data set for this conversation, the vast majority of published CVEs have never been used by an attacker. CISA's KEVs give a rough gauge of this, with a little north of 1300 since 2021, and that includes older CVEs that are still in use, like EternalBlue. Some people point to the cardinality of CVE databases as evidence of something, but that doesn't hold up to scrutiny of actual attacks. And this is all before filtering down to memory safety RCE CVEs.
Probably the closest thing to a usable data set here would be reports from incident response teams like Verizon's, but their data is of course heavily biased towards the kinds of incidents that require calling in incident response teams. Last year they tagged something like 15% of breaches as using exploits, and even that is a wild overestimate.
> Memory safety is definitely a worthwhile investment.
In a vacuum, sure, but Python, Java, Go, C#, and most other popular languages are already memory safe. How much software is actively being written in unsafe languages? Back in atmosphere, there's way more value in first making sure all of your VPNs have MFA enabled, nobody's using weak or pwned passwords, employee accounts are deactivated when they leave the company, your help desk has processes to prevent being social engineered, and so on.
> How much software is actively being written in unsafe languages?
Well, let's see. Most major operating system kernels for starters. Web browsers. OpenSSL. Web servers/proxies like Apache, Nginx, HAProxy, IIS, etc. GUI frameworks like Gtk, Qt, parts of Flutter. And so on.
Research I've seen seems to say that 70-80% of vulnerabilities come from memory safety problems[0]. Eliminating those is of course a huge improvement, but is rust doing something to kill the other 20-30%? Or is there something about RCE that makes it the exclusive domain of memory safety problems?
[0] For some reason I'm having trouble finding primary sources, but it's at least referenced in ex. https://security.googleblog.com/2024/09/eliminating-memory-s...
Rust also provides guarantees that go beyond mere memory safety. You get data-race safety as well, which avoids certain kinds of concurrency issues. You also get type safety, which is a step up when it comes to parsing untrusted input, at least compared to C, for example. If untrusted input can be parsed into your expected type system, it's more likely to not cause harm by confusing the program about what's in the variables. Rust doesn't straight up eliminate all sources of error, but it makes major strides forward in areas that go beyond mere memory safety.
If english had static checks this kind of runtime pedantry would be unnecessary. Sometimes it's nice to devote part of your brain to productivity rather than checking coherence.
[flagged]
For integer overflows and array out of bounds I'm quite optimistic about Flux
https://github.com/flux-rs/flux
I haven't actually used it but I do have experience of refinement types / liquid types (don't ask me about the nomenclature) and IMO they occupy a very nice space just before you get to "proper" formal verification and having to deal with loop invariants and all of that complexity.
> refinement types / liquid types (don't ask me about the nomenclature)
There's a nice FOSDEM presentation "Understanding liquid types, contracts and formal verification with Ada/SPARK" by Fernando Oleo Blanco (Irvise): https://fosdem.org/2025/schedule/event/fosdem-2025-4879-unde.... One of the slides says:
I find it strange that the article doesn't talk about the alternative to checked arithmetic: explicit Wrapping [0] and Saturating [1] types, also provided as methods on numeric types (e.g. `usize::MAX.saturating_add(1)`).
Regarding `as` casting, I completely agree. I am trying to use safe `From::from` instead. However, this is a bit noisy: `usize::from(n)` vs `n as usize`.
[0] https://doc.rust-lang.org/std/num/struct.Wrapping.html [1] https://doc.rust-lang.org/std/num/struct.Saturating.html
> I am trying to use safe `From::from` instead. However, this is a bit noisy: `usize::from(n)` vs `n as usize`.
If there's enough information in the surrounding code for type inference to do its thing, you can shorten it to `n.into()`.
Another limitation I face is the impossibility of using `usize::from(n)` in `const` context.
True, I should add the wrapping types. They are actually quite useful if you know that you have a fixed range of values and you can't go above the min/max. Like a volume dial that just would stay at "max" if you turn up the volume; it wouldn't wrap around.
I'd add memory leaks to the list. Sometimes you feel compelled to wrap your data in an Rc or Arc (reference counted pointers for those unfamiliar) to appease the borrow checker. With capture semantics of closures and futures and such it's quite easy to fall into a referential cycle, which won't be freed when dropped.
I don't think it's a common concern in Rust. It used to be a problem in Internet Explorer. It's a footgun in Swift, but Rust's exclusive ownership and immutability make cycles very difficult to create by accident.
If you wrap a Future in Arc, you won't be able to use it. Polling requires exclusive access, which Arc disables. Most combinators and spawn() require exclusive ownership of the bare Future type. This is verified at compile time.
Making a cycle with `Arc` is impossible unless two other criteria are met:
1. You have to have a recursive type. `Arc<Data>` can't be recursive unless `Data` already contains `Arc<Data>` inside it, or some abstract type that could contain `Arc<Data>` in it. Rust doesn't use dynamic types by default, and most data types can be easily shown to never allow such cycle.
It's difficult to make a cycle with a closure too, because you need to have an instance of the closure before you can create an Arc, but your closure can't capture the Arc before it's created. It's a catch-22 that needs extra tricks to work around, which is not something that you can just do by accident.
2. Even if a type can be recursive, it's still not enough, because the default immutability of Arc allows only trees. To make a cycle you need the recursive part of the type to also be in a wrapper type allowing interior mutability, so you can modify it later to form a cycle (or use the `Arc::new_cyclic` helper, which is an obvious red flag, but you still need to upgrade the reference to a strong one after construction).
It's common to have Arc-wrapped Mutex. It's possible to have recursive types, but having both together at the same time are less common, and then still you need to make a cycle yourself, and dodge all the ownership and borrow checking issues required to poll a future in such type.
[dead]
This is a very nice article; it objectively lists possible pitfalls. It's however not quite that simple. I am in favor of removing as, but then try_from needs to work with more types, for example try converting u64 into f32 without using as. It turns out to be very hard. TryInto does not work in a single step.
Important aspect is performance. HPC code needs to be able to opt out of math checks.
Also Results and Options are very costly, as they introduce a lot of branching. Panic is just faster. Hopefully one day Rust will use something like IEX by default: https://docs.rs/iex/latest/iex/ It has the same benefits as Results, but if an error is returned in less than 15% of function calls, then IEX is much faster.
Btw, is allocation failure in std going to return Result anytime soon?
> Surprising Behavior of Path::join With Absolute Paths
> I was not the only one who was confused by this behavior. Here’s a thread on the topic, which also includes an answer by Johannes Dahlström:
> > The behavior is useful because a caller […] can choose whether it wants to use a relative or absolute path, and the callee can then simply absolutize it by adding its own prefix and the absolute path is unaffected which is probably what the caller wanted. The callee doesn’t have to separately check whether the path is absolute or not.
> And yet, I still think it’s a footgun. It’s easy to overlook this behavior when you use user-provided paths. Perhaps join should return a Result instead? In any case, be aware of this behavior.
Oh, hey, that's me! I agree that it's a footgun, for what it's worth, and there should probably be a dedicated "absolutize" method for getting the "prefix this if relative, leave as is if already absolute" semantics.
(link to the thread: https://users.rust-lang.org/t/rationale-behind-replacing-pat...)
It's the same in at least Python, so it's not a Rust idiosyncratic behavior. The "absolutize" method you're asking for exists since 1.79. https://doc.rust-lang.org/stable/std/path/fn.absolute.html
Yeah, but I meant a method that would take a custom prefix, like join does now. (In the hypothetical situation where join had different semantics.)
> price.checked_mul(quantity)
nope, I won't do that. I'd rather waste away at a bank using early 2000s oracle forms . Make it a compiler flag, detect it statically or gtfo.
Here's an important one: never use `mem::size_of_val(&T)`. Rust as a language strongly steers you towards ignoring double (or even triple)-referenced types because they're implicitly auto-dereferenced in most places, but the moment you throw one of those into this API, it returns the size of the reference `&T` itself, which is very much not the same as the size of `T`. I've been burned by this before, particularly in unsafe contexts; I only use `size_of::<T>()` now.
Is "as" an uneccesary footgun?
That was my first impression as well. So much of Rust’s language and standard library enforces correctness, that gaps start to feel way more visible.
“as” is a good example. Floats are pretty much the only reason PartialEq exists, so why can’t we have a guaranteed-not-NaN-nor-inf type in std and use that everywhere? Why not make wrapping integers a panic even in release mode? Why not have proper dependent types (e.g. to remove bound checks), and proper linear types (to enforce that object destructors always run)?
It’s easy to forget that Rust is not an ideal language, but rather a very pragmatic one, and sometimes correctness loses in favour of some other goals.
I have been following rust very closely since 2013.
As Rust is both evolving and spreading wide, we; the programmers, users of Rust; are also leveling up in how we approach correctness and design with it.
Maybe the next evolution will be something like Haskell but fast like Rust is fast like C without the pain of C++.
But it takes a while for the world to catch up, and for everybody to explore and find ways to work with or around the abstractions that helps with correctness.
It's a bit like the evolution from a pointer to some malloc memory, then the shared/unique pointer of C++, to the fully safe box/(a)rc of Rust.
It might be obvious today how much more efficient it is programming with those abstractions.
I see some similarities with functional programming that still seems so niche. Even though the enlightened swear by it. And now we actually seem to be slowly merging the best parts of functional and imperative together somehow.
So maybe we are actually evolving programming as a species. And Rust happens to be one of the best scaffold at this point in history.
Thank you for reading my essay.
There is hardly any evolution from pointer to malloc, C is one of the few systems languages, including those that predated it, where one needs math to allocate heap memory.
I do agree that the evolution is most likely a language that combines automatic resource management with affine/linear/effects/dependent/proofs.
Or AIs improve to the point to render all existing programming languages a thing from the past, replaced by regular natural languages and regular math.
Sorry, my wording was not great. You got it right. I was saying that the evolution started with (a pointer from) malloc, then uniqueptr, then box.
There is some movement towards deprecating "as", and lints that will recommend using alternatives when possible, but there are a couple of cases, such as intentional truncation, where there isn't a stable alternative yet.
Regarding the not-NAN float type, there was actually a proposal for it which was shot down: https://github.com/rust-lang/libs-team/issues/238.
I don't remember every argument in there but it seemed that there are good reasons not to add it unlike a NonZero integer type which seems to have no real downsides.
The other option would be to change how floating point works. IEEE specifies operations, not names, so it would be totally valid to have <= on floats be a total order (using integer cpu instructions), and make a function called IEEEAreIdiotsWhoThinkThisIsFloatingPointLessThan which is the partial order that sucks.
For purposes of sorting, Rust does offer a non-IEEE order as f64::total_cmp. You can easily build a wrapper type that uses that for all comparisons, or use a crate that does it for you
https://doc.rust-lang.org/std/primitive.f64.html#method.tota...
total_cmp is precisely IEEE's separately specified total order for floats. It's just that the more common operators do something different, and that's perhaps better for most uses where NaN are inherently unexpected and generally indicate that some kind of error condition has occurred.
Some of these don't strike me as particularly pragmatic. E.g. are overflow checks really that expensive, given that it's a well-known footgun that is often exploitable? Sure, you don't want, say, 10% overhead in your number-crunching codec or whatever, but surely it's better to have those cases opt in for better perf as needed, as opposed to a default behavior that silently produces invalid results?
> Some of these don't strike me as particularly pragmatic. E.g. are overflow checks really that expensive
Did you read the article? Rust includes overflow checks in debug builds, and then about a dozen methods (checked_mul, checked_add, etc.) which explicitly provide for checks in release builds.
Pragmatism, for me, is this help when you need it approach.
TBF Rust forces certain choices on one in other instances, like SipHash as the default Hasher for HashMap. But again opting out, like opting in, isn't hard.
I'd prefer for Rust to opt for correctness/bug-freeness over performance, even in release builds. If you are doing number crunching you should have to opt out of these checks.
> I'd prefer for Rust to opt for correctness/bug-freeness over performance, even in release builds. If you are doing number crunching you should have to opt out of these checks.
You can turn those checks on, in release mode, of course: https://doc.rust-lang.org/rustc/codegen-options/index.html#o...
But I think the behavior on overflow is to "panic!()" (terminate immediately)? So -- I guess from my POV I wouldn't turn them on in release mode. I just think that tradeoff isn't generally worth it, but again, you can turn that behavior on.
panics do not terminate immediately; they unwind the stack, and if they’re not caught, they terminate the current thread, not the process.
> panics do not terminate immediately; they unwind the stack, and if they’re not caught, they terminate the current thread, not the process.
I don't disagree though this point is a little pedantic. I suppose the docs also need an update? See: https://doc.rust-lang.org/std/macro.panic.html
Now, I don't think so, because program death is usually what this type of panic means. And my point remains: without more, this probably isn't the behavior one wants in release mode. But, yes, perhaps an even better behavior is turning on checks, catching the panic, and logging it with others.
I don't disagree that it could use revising, but it's technically correct: it allows but does not require. If you've configured panic=abort, it will abort the program instead of unwind, but that's not the default.
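The catch-and-log behavior mentioned above can be sketched with `std::panic::catch_unwind` (this assumes the default `panic=unwind`; it won't work under `panic=abort`):

```rust
fn main() {
    // catch_unwind intercepts an unwinding panic (here, a failed overflow
    // check) so it can be logged rather than killing the thread.
    let result = std::panic::catch_unwind(|| {
        let x: u8 = 255;
        x.checked_add(1).expect("overflow!")
    });
    if let Err(_) = result {
        eprintln!("arithmetic overflow caught and logged");
    }
}
```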
> guaranteed-not-NaN-nor-inf
Nor negative zero
Compared to C/C++, "as" feels so much safer. Now that Rust and we the programmers have evolved with it, I too feel that "as" for narrowing conversion is a small footgun.
I'm struggling to see how you would implement narrowing conversion in a way that is harder for programmers to misuse when they aren't being mindful, while also being pleasant to use when you really do want to just drop higher bits. Like, you could conceivably have something like a "try_narrow" trait which wraps the truncated value inside an Err when it doesn't fit, and it would probably be harder to accidentally misuse, but that's also really cumbersome to use when you are trying to truncate things.
I don't really want narrowing conversion to be harder, I just want checked conversion to be at least nearly as convenient. `x as usize` vs `x.try_into().unwrap()` becomes `x tiu usize` or something even. I'm not picky. It's kindof funny that this is the exact mistake C++ made, where the safe version of every container operation is the verbose one: `vector[]` vs `vector.at()` or `*optional` vs `optional.value()`, which results in tons and tons of memory problems for code that has absolutely no performance need for unchecked operations.
let foo: u8 = bar<u64>.truncate_to()?
I wouldn't say so. I quite like "as". It can have sharp edges but I think the language would be significantly worse off without it.
It's useful to have something that does the job of "as", but I dislike how the most dangerous tool for type conversions has the nicest syntax.
Most of the time I want the behavior of ".try_into().unwrap()" (with the compiler optimizing the checks away if it's always safe), or would even prefer a version that only works if the conversion is safe and lossless (something I can reason about right now, but want to ensure even after refactorings). The latter is really hard to achieve, and ".try_into().unwrap()" is 20 characters where "as" is 2. Not a big deal to type with autocomplete, but a lot of visual clutter.
The question is not whether the language should include such a facility, but whether 'as' should be the syntax for it. 'as' is better than the auto conversions of C but it's still extremely obscure. It would be better to have some kind of explicit operator marking this kind of possibly unintended modulo conversion. Rust will gain safe transmute operations in the near future so that will perhaps be a chance to revise this whole area as well.
When fitting larger types into smaller ones? Yes.
> Overflow errors can happen pretty easily
No they can’t. Overflows aren’t a real problem. Do not add checked_mul to all your maths.
Thankfully Rust changed overflow behavior from “undefined” to “well-defined two's complement”.
What makes you think this is the case?
Having done a bunch of formal verification I can say that overflows are probably the most common type of bug by far.
Yeah, they're so common they've become a part of our culture when it comes to interacting with computers.
Arithmetic overflows have become the punchline of video game exploits.
Unsigned underflow is also one of the most dangerous types. You go from one of the smallest values to one of the biggest values.
Unsigned integers were largely a mistake. Use i64 and call it a day. (Rust's refusal to allow indexing with i64 or isize is a huge mistake.)
Don’t do arithmetic with u8 or probably even u16.
Can you expand on your thoughts here? What is the root issue with unsigned integers? Is your complaint primarily based on the implications/consequences of overflow or underflow? I’m genuinely curious as I very often prefer u32 for most numeric computations (although in a more ‘mathematics’ domain signed ints will often be the correct choice).
> What is the root issue with unsigned integers?
If I play "Appeal to Authority" you can read some thoughts on this from Alexandrescu, Stroustrup, and Carruth here: https://stackoverflow.com/questions/18795453/why-prefer-sign...
Unsigned integers are appealing because they make a range of invalid values impossible to represent. That's good! Indices can't be negative so simply do not allow negative values.
The issues are numerous, and the benefits are marginal. First and foremost, it is extremely common to do offset math on indices whereby negative values are perfectly valid. Given two indices idxA and idxB, if you have unsigned indices then one of (idxB - idxA) or (idxA - idxB) will underflow and cause catastrophe. (Unless they're equal, of course.)
The benefits are marginal because even though unsigned cannot represent a value below the valid range it can represent a value above container.size() so you still need to bounds check the upper range. If you can't go branchless then who cares about eliminating one branch that can always be treated as cold.
On a modern 64-bit machine doing math on smaller integers isn't any faster and may in fact be slower!
Now it can be valuable to store smaller integers. Especially for things like index lists. But in this case you're probably not doing any math so the overflow/underflow issue is somewhat moot.
Anyhow. Use unsigned when doing bitmasking or bitmanipulation. Otherwise default to signed integer. And default to i64/int64_t. You can use smaller integer types and even unsigned. Just use i64 by default and only use something else if you have a particular reason.
I'm kinda rambling and those thoughts are scattered. Hope it was helpful.
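The offset-math hazard described above is easy to demonstrate; `checked_sub` or a cast to a signed type are the usual escape hatches:

```rust
fn main() {
    let idx_a: usize = 2;
    let idx_b: usize = 5;
    // idx_a - idx_b would panic in debug builds and wrap to a huge
    // value in release builds. checked_sub makes the failure explicit:
    assert_eq!(idx_a.checked_sub(idx_b), None);
    assert_eq!(idx_b.checked_sub(idx_a), Some(3));
    // Casting to a signed type is the other common fix for offset math.
    let diff = idx_a as i64 - idx_b as i64;
    assert_eq!(diff, -3);
}
```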
Overflow errors absolutely do happen. They're just no longer UB. It doesn't make them non-errors though. If your bank account balance overflowed, you'd be pretty upset.
On the other hand, there’s a solid use case for underflow.
The vast majority of code that does arithmetic will not produce a correct result with two's complement. It is simply assuming that the values involved are small enough that it won't matter. Sometimes it is a correct assumption, but whenever it involves anything derived from inputs, it can go very wrong.
This is something that's always bugged me because, yes, this is a real problem that produces real bugs. But at the same time if you really care about this issue then every arithmetic operation is unsafe and there is never a time you should use them without overflow checks. Sometimes you can know something won't overflow but outside of some niche type systems you can't really prove it to the compiler to elide the check in a way that is safe against code modifications— i.e. if someone edits code that breaks the assumption we needed to know it won't overflow it will err.
But at the same time in real code in the real world you just do the maths, throw caution to the wind, and if it overflows and produces a bug you just fix it there. It's not worth the performance hit and your fellow developers will call you mad if you try to have a whole codebase with only checked maths.
I think this is very much a cultural issue rather than a technical one. Just look at array bounds checking: widespread in the mainframe era even in systems languages, relegated to high-level languages for a very long time on the basis of unacceptable perf hit in low-level code, but more recently seeing more acceptance in new systems languages (e.g. Rust).
Similarly in this case, it's not like we don't have languages that do checked arithmetic throughout by default. VB.NET, for example, does exactly that. Higher-level languages have other strategies to deal with the problem; e.g. unbounded integer types as in Python, which simply never overflow. And, like you say, this sort of thing is considered unacceptable for low-level code on perf grounds, but, given the history with nulls and OOB checking, I think there is a lesson here.
For any arithmetic expression that involves only + - * operators and equally-sized machine words, two's complement will actually yield a "correct" result. It's just that the given result might be indicating a different range than you expect.
I'm a big fan of liberal use of saturating_mul/add/sub whenever there is a conceivable risk of coming within a couple orders of magnitude of overflow. Or checked_*() or whatever the best behavior in the given case is. For my code it happens to mostly be saturating.
Overflow bugs are a real pain, and so easy to prevent in Rust with just a function call. It's pretty high on my list of favorite improvements over C/C++
If you saturate you almost never ever want to use the result. You need to check and if it saturates do something else.
You obviously have to decide it on a case-by-case basis. But anything that is only used in a comparison is usually fine with saturating. And many things that measures values or work with measurements are fine with saturating if it's documented. Saturating is how most analog equipment works too, and in non-interactive use cases "just pick the closest value we can represent" is often better than erroring out or recording nothing at all.
Of course don't use saturating_add to calculate account balance, there you should use checked_add.
If you're going to check then you shouldn't be saturating, you should just be checking.
Some of this advice is wrongheaded. Consider array indexing: usually, an out-of-bounds access indicates a logic error and should fail fast to abort the program so it doesn't go further off the rails. Encouraging people to use try-things everywhere just encourages them to paper over logic bugs and leads to less reliable software in the end. Every generation has to learn this lesson anew through pain.
Try-things have the benefit of accurately representing the thing you're describing. Leave it to the caller to decide whether to panic or resize the data structure or whatever.
That's also not the only choice in the design space for correct array accesses. Instead of indices being raw integers, you can use tagged types (in Rust, probably using lifetimes as the mechanism if you had to piggy back on existing features, but that's an implementation detail) and generate safe, tagged indices which allow safe access without having to bounds check on access.
However you do it, the point is to not lie about what you're actually doing and invoke a panic handler as a kludgy way of working around the language.
I think what you are saying is that there must be an informed decision between crashing the program vs returning an error, instead of returning an error for everything that happens to be a logic error at a given level of abstraction.
Golang might be better for writing robust software, if that is the goal. Robust services that don't go down.
I don't think so. Rust has much stronger typing than Go which allows you to prevent more classes of bugs than just memory errors.
The coolest one I've heard is that Fuchsia's network stack managed to eliminate deadlocks.
But even on a basic level Rust has that "if it compiles it works" experience which Go definitely doesn't.
> The coolest one I've heard is that Fuchsia's network stack managed to eliminate deadlocks.
Is there a write up on this? That's very cool
I think that example comes from the talk "Safety in an Unsafe World" [0, slides at 1].
There are some crates which implement lock ordering as well (e.g., [2, 3]). lock-ordering states it's inspired by the technique discussed in the talk as well, for what it's worth.
[0]: https://youtu.be/qd3x5MCUrhw?t=1001 (~16:41 in case the timestamp link doesn't work)
[1]: https://joshlf.com/files/talks/Safety%20in%20an%20Unsafe%20W... (deadlock prevention example starting slide 50)
[2]: https://github.com/akonradi/lock-ordering
[3]: https://github.com/alaric/lock_order
IIRC it is just having locks with exclusive constructors, which take previous locks’ guards (by ownership?).
That way you can never lock lock B if you have not received a guard aka lock from lock A prior. Ensured on the type level.
I suppose doing this at scale is a real challenge.
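A rough sketch of the guard-as-capability idea described above (the `LockA`/`LockB` names and API are made up for illustration; crates like lock-ordering are far more complete):

```rust
use std::sync::{Mutex, MutexGuard};

struct LockA(Mutex<u32>);
struct LockB(Mutex<u32>);

impl LockB {
    // B can only be locked by a caller that already holds A's guard,
    // so the A-before-B ordering is enforced at the type level.
    fn lock<'a, 'b>(&'a self, _held: &MutexGuard<'b, u32>) -> MutexGuard<'a, u32> {
        self.0.lock().unwrap()
    }
}

fn main() {
    let a = LockA(Mutex::new(1));
    let b = LockB(Mutex::new(2));
    let ga = a.0.lock().unwrap();
    let gb = b.lock(&ga); // compiles only because A is held first
    assert_eq!(*ga + *gb, 3);
}
```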
The general term for this is "Session types." The Par crate is probably the most mature attempt at this to date.
https://github.com/faiface/par
Golang will panic with a runtime error index out of range if you index out of bounds. There doesn't seem to be a nice built in way to do `arr.get(3)` like in Rust.
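For comparison, the Rust API the comment refers to: indexing panics on out-of-bounds, while `get` returns an `Option` the caller must handle.

```rust
fn main() {
    let arr = [10, 20, 30];
    // get returns Option instead of panicking
    assert_eq!(arr.get(1), Some(&20));
    assert_eq!(arr.get(3), None);
    // arr[3] would panic: "index out of bounds: the len is 3 but the index is 3"
}
```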
Hell, just go full erlang at that point and you get loads of services for "free".
First they have to improve the memory model: data races on shared slices are possible because of their fat-pointer implementation.
Both have their place but after writing both extensively, I much prefer Rust - despite the pitfalls.
My biggest criticism of Rust (in comparison to Go) is the lack of a powerful standard library, while Go's standard library is outstanding. I would also like to see standardized interfaces in Rust (like AsyncWrite), and in general the async story could be better - though I appreciate the versatility.
That's a weird thing to say about a language that doesn't have null safety.
Not to be outdone here, Go introduces multiple null values that are considered distinct, yet they all exhibit the same problem.
And why is that? I already don’t agree, but I’d love to hear your take.
Mentioning golang in a rust article comment section is just bait. People just love comparing the two even though it's a somewhat boring comparison.