> It is more like providing an "axiom" as the compiler won't be able to check it and instead has to assume that it is true.
I don't think one should see this as an axiom, although as you mention one could.
If your unsafe code allows safe Rust to exhibit undefined behavior, Rust makes no guarantees whatsoever about what your program will do.
The convention in the standard library is to provide an actual proof that this won't happen as a comment to each unsafe block.
Right now, there is no tooling to check these proofs, but there is a lot of ongoing research into how to do that, and the standard library already has some tooling that enforces parts of this convention.
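For illustration, a minimal sketch of what such a comment looks like (the function and names here are made up, not actual standard-library source):

```rust
// Hypothetical example of the SAFETY-comment convention described above.
fn first_byte(bytes: &[u8]) -> Option<u8> {
    if bytes.is_empty() {
        None
    } else {
        // SAFETY: we just checked that `bytes` is non-empty, so index 0
        // is in bounds for `get_unchecked`.
        Some(unsafe { *bytes.get_unchecked(0) })
    }
}
```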
> Normally full machine checking doesn't scale to large projects, but if one had to restrict to writing proofs only for the small subset of unsafe code and let the simpler type/lifetime system handle the rest of the code it might be manageable.
That's the idea.
It is not as simple as just proving each unsafe block independently, but rather all unsafe blocks within a Rust module must be proven together.
From this point of view, it would make sense to implement only one unsafe abstraction per module, and keep these modules as small as possible, providing most of the implementation of the abstraction outside the module using only the safe API.
Turns out, this is already how Rust code is written, since this is not only useful for writing proofs (either via assistants or using pencil and paper), but also helps humans reason about the correctness of the unsafe code by keeping all the "fluff" away.
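A minimal sketch of that shape, assuming a toy fixed-capacity stack (all names illustrative): the invariant is maintained entirely inside one small module, and everything outside touches only the safe API.

```rust
mod byte_stack {
    use std::mem::MaybeUninit;

    /// Invariant (upheld only inside this module): `len <= N`, and the
    /// first `len` slots of `buf` are initialized.
    pub struct ByteStack<const N: usize> {
        buf: [MaybeUninit<u8>; N],
        len: usize,
    }

    impl<const N: usize> ByteStack<N> {
        pub fn new() -> Self {
            ByteStack { buf: [MaybeUninit::uninit(); N], len: 0 }
        }

        pub fn push(&mut self, b: u8) -> bool {
            if self.len == N {
                return false; // full; invariant preserved
            }
            self.buf[self.len] = MaybeUninit::new(b);
            self.len += 1;
            true
        }

        pub fn pop(&mut self) -> Option<u8> {
            if self.len == 0 {
                return None;
            }
            self.len -= 1;
            // SAFETY: every slot below the old `len` was written by
            // `push`, so this slot is initialized.
            Some(unsafe { self.buf[self.len].assume_init() })
        }
    }
}
```

Everything outside the module can only go through `new`/`push`/`pop`, so the proof obligation stays local to those few lines.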
>Using unsafe is like providing a "proof" that the code you write won't cause safe Rust code to exhibit undefined behavior
It is more like providing an "axiom" as the compiler won't be able to check it and instead has to assume that it is true.
I wonder if there is scope to extend the language in the future to actually add machine checked proofs.
Normally full machine checking doesn't scale to large projects, but if one had to restrict to writing proofs only for the small subset of unsafe code and let the simpler type/lifetime system handle the rest of the code it might be manageable.
> the existence of unsafe Rust means that Rust as a whole isn't safe by construction
That's true, and it's a good reason to be very clear about when the Safe Rust subset is all that's been used. Unfortunately it's rare to hear the phrase "written in Safe Rust", but that's how we ought to refer to code written purely in the safe subset. It should be a point of pride. (Corollary: making needlessly excessive use of Rust's unsafe features should be a point of shame.)
> We also need to be careful about what we mean by "safe". Rust's definition of safety isn't exotic
Rust does a fine job of defining safety as they use it:
> If all you do is write Safe Rust, you will never have to worry about type-safety or memory-safety. You will never endure a dangling pointer, a use-after-free, or any other kind of Undefined Behavior. [0]
This definition, the total absence of undefined behaviour, sounds about right to me. I think Safe Rust also guarantees against reading indeterminate data from an uninitialized variable, but by the total-absence-of-undefined-behaviour criterion, that's not strictly relevant to whether it's safe.
> it doesn't cover the gamut of all possible safety concerns.
Sure, but that's not really what safe language means.
> But to prove whether your unsafe code is, in fact, safe to use -- you're back to formal verification again, or (more typically) code reviews.
At the risk of sounding pedantic, code reviews do not prove anything about unsafe code, they only lower the odds of a defect making it through your process.
> You can (often) avoid the use of unsafe Rust altogether, but then you're using a subset of the language -- roughly like the case with MISRA C.
I'd flip that round though, as you generally shouldn't be using Rust's unsafe features. I'd rather say that in the rare case that you really need it, Rust offers additional unsafe features beyond the Safe Rust language.
Safe Rust is intended to have excellent ergonomics, very much unlike MISRA C. If Rust is doing its job, its unsafe features should seem like inline assembly in C. C programmers don't generally feel constrained by an absence of assembly code in their codebases, as that kind of code just isn't necessary very often.
(Also, MISRA C isn't a truly safe language, for what that's worth. C is not so easily tamed.)
Exactly. But many Rust proponents do not communicate that clearly. They often make it sound like unsafe blocks contain the undefined behavior and prevent it from spreading to the rest of the program, which they don't.
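To make that concrete, here is a deliberately broken sketch: the UB is created inside the `unsafe` block, but it detonates in the safe code around it.

```rust
// Deliberately wrong: do not write this.
fn dangling() -> &'static i32 {
    let x = 42;
    let p = &x as *const i32;
    // The raw-pointer dereference launders away the borrow checker's
    // knowledge that `x` dies at the end of this function.
    unsafe { &*p }
}

fn main() {
    let r = dangling();  // this caller is 100% "safe" Rust...
    println!("{}", *r);  // ...yet this read is undefined behavior
}
```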
This seems like a weird statement to make to me when defending Rust simultaneously. The whole selling point of Rust is that it's supposed to let you write provably safe code. When "unsafe" is used, what you're writing can no longer be proven as safe by the compiler. Maybe you've convinced yourself it's actually safe via some other argument, but people have been convincing themselves that their code is memory safe for decades and they're often wrong. More insidiously, relying on anything in Rust that uses "unsafe" makes it seem like the compiler is proving your own code is safe, but that is no longer true.
With that said, I don't really disagree with what I think your main point is. But seeing unsafe code in Rust is similar to seeing unsafePerformIO in Haskell libraries. Both of these are much more common than the languages' respective evangelists would have you believe, and at some point it actually does start to call into question whether the languages are successful at their basic design goals. (Spoiler: I believe that they are successful more than they are not, and even in the face of the proliferation of "unsafe" code in both languages, they're still worth using.)
> That is, if you write entirely 100% safe code, and use some library that uses 'unsafe' code at a low level (this is very common; for example, an C binding or memory optimization), then rust does NOT guarantee safety.
What do you consider a "guarantee"? Do you consider pure JavaScript to "guarantee" safety, even though there have been exploitable bugs in the JIT and interpreter? Do you consider pure-computation, non-allocating Rust to be safe, even though LLVM may contain codegen bugs that allow for undefined behavior? Do you consider seccomp-isolated code to be safe, even though there have been, and will likely be in the future, kernel bugs that allow a sandboxed process to escape?
The point is that instead of talking about "100% safe", which is never the case in the real world, I think it's better to talk about trusted computing bases. That is, instead of saying "X is safe" or "X is unsafe", we talk about what effect X has on the memory-unsafe surface area of a program. Rust's goal is to reduce the trusted computing base of an application written in it to the hardware, compiler, unsafe blocks, and type system. This gives us essentially the safety properties of "managed languages" (except I don't like that term because it tends to imply garbage collection and lack of fine-grained control over memory, which don't apply to Rust).
A specific example I like to give: Sure, we could hardwire (for instance) Vec into the language, and therefore reduce the number of lines of unsafe code in the libraries. But doing so would just be moving the unsafe code from the unsafe blocks in the libraries to the compiler itself. There would be no net gain in safety from it—compiler code can have bugs just as library code can—and there would be a decrease in maintainability and flexibility.
You can think of Rust's safety properties as establishing a type of sandbox if you'd like. All sandboxes have trusted computing bases, and Rust's compile-time sandbox is no exception. But "safety" as applied to a sandbox still has an important meaning. Chromium/Firefox OS's sandbox, for example, is a safe sandbox, even though its security depends on strong assumptions about the trustworthiness of the OS kernel and the IPC layer between the trusted and untrusted processes.
> There is a runtime cost to the drop checker; it's not particularly large, but it's certainly not just a sequence of drop operations.
After the drop optimizations are done, this will only be true if you conditionally move an object on one or more branches. It's already true today if the object gets SROA'd (scalar replacement of aggregates) on the stack, as LLVM can then optimize out the drop flags. We've talked about adding a lint so the compiler can optionally warn and allow you to fix it if this performance cost is a concern to you. (It's a very minor cost: one test and branch on a stack byte, and again only if the object is conditionally moved.)
Note that C++ move semantics incurs essentially the same cost.
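A sketch of the conditional-move case being described:

```rust
fn consume(s: String) {
    drop(s);
}

fn example(flag: bool, s: String) {
    if flag {
        consume(s); // `s` is moved away on this branch only
    }
    // Whether `s` still needs dropping here depends on `flag`, a
    // runtime value, so the compiler tracks a hidden drop flag and
    // tests it before running the destructor at the end of scope.
}
```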
> No, it isn't, because it may call code that uses unsafe constructs. (That's why you don't have to put an unsafe block around your code every time you push a value into a vector.)
For the last time: if you have proven the module containing the unsafe code correct, you do not have to treat code calling it as unsafe, and safe code calling it is guaranteed to be memory-safe and free of data races.
> No, only the code that uses unsafe constructs needs to be treated as unsafe. You could easily make a tool that automatically inserted 'unsafe' tags around C++ code that used unsafe constructs such as pointer arithmetic. The fact that this tool is built into the Rust compiler is just a convenience.
That wouldn't solve anything at all. C++'s type system cannot and does not enforce safe usage of APIs like string_view; it doesn't matter whether the safe code is using "unsafe constructs." If you choose to mark all C++ or C libraries that rely on invariants that can't be proven safe by the type system as unsafe, you will quickly find that essentially your entire program is covered in an unsafe block. The difference in Rust is not that it has unsafe blocks, but that its type system can safely encapsulate interesting APIs whose proof of safety is nontrivial.
> The checker needs a few simple theorems, such as "valid_array(A,0,n) and valid_element(A[n+1]) implies valid_array(A,0,n+1)" to check this.
This is basically dependent types. Rust doesn't really want to go that far at the language level.
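For what it's worth, the quoted induction step is roughly the reasoning a `Vec`-style push already relies on; a rough sketch (illustrative names, not real `Vec` internals):

```rust
/// # Safety
/// Caller must guarantee `valid_array(ptr, 0, *len)` and that the
/// allocation behind `ptr` has room for at least `*len + 1` elements.
unsafe fn push_unchecked<T>(ptr: *mut T, len: &mut usize, value: T) {
    unsafe {
        ptr.add(*len).write(value); // establishes valid_element at index *len
    }
    *len += 1; // induction step: valid_array(ptr, 0, *len) holds again
}
```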
> Handling these two classes of unsafe code takes care of a sizable fraction of the unsafe code really needed in Rust.
Not really. All the other data structures in the stdlib would still need unsafe. Not really a good metric.
The RustBelt project at MPI-SWS is doing formal verification of Rust including unsafe code, and that approach will be far better IMO. It won't let you eliminate unsafe from your code, but might make it easier to prove that your unsafe code is performing safe operations.
> The major difficulty here is that in general, unsafe pieces of code cannot be safely composed, even if the unsafe pieces of code are individually safe. This allows you to bypass runtime safety checks without unsafe code just by composing "safe" modules that internally use unsafe code in their implementation.
This comment suggests you don't have much domain knowledge about how `unsafe` in Rust works, so I'm surprised you speak with such confidence. Your comment is flatly wrong: users using only safe code are not responsible for guaranteeing the composed safety of the components they use (whether or not those are implemented with unsafe code).
Interfaces marked safe must uphold Rust's safety guarantees, or they are incorrect. An interface that has additional untyped invariants which must be maintained to guarantee its safety is simply wrong unless it is marked `unsafe`.
Because they cannot depend on untyped invariants, any correct implementation with a safe interface can be composed with any other. This ability to create safe abstractions over unsafe code which extend the reasoning ability of the type system is a fundamental value proposition of Rust.
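A small sketch of that contract (illustrative type, not from any real library):

```rust
struct Buffer {
    data: Vec<u8>,
}

impl Buffer {
    /// Safe interface: upholds Rust's guarantees no matter how it is called.
    fn get(&self, i: usize) -> Option<u8> {
        self.data.get(i).copied()
    }

    /// # Safety
    /// The caller must guarantee `i < self.data.len()`. Because the type
    /// system cannot express that invariant, the function is marked `unsafe`.
    unsafe fn get_unchecked(&self, i: usize) -> u8 {
        unsafe { *self.data.get_unchecked(i) }
    }
}
```

Any caller can rely on `get` unconditionally; `get_unchecked` pushes a stated obligation onto the caller, and the `unsafe` marker is what keeps that obligation visible.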
That's an interesting way to look at it. But that's not how it works in practice. Almost no one in industry is going to write proofs for their unsafe code. It didn't happen for C or C++ and it won't happen for Rust.
The `unsafe` keyword in Rust isn't about preventing general security vulnerabilities though, it's about denoting parts of the code where proof of memory safety has to be determined by hand.
> You just need primitives which can be used in asserts such as
Sounds like you're going along the path of a dependent type system (in this specific case)? Yes, that could be done, and would perhaps let you reduce a couple of unsafe blocks in the implementation of Vec and other buffer-based abstractions (but not get rid of all of them).
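To illustrate the gap: today the closest you get is stating the invariant as a runtime assertion next to the `unsafe` block (hypothetical helper, not a proposed language feature):

```rust
use std::ptr;

/// Copy `len` bytes from `src..src+len` to `dst..dst+len` within `buf`.
fn copy_bytes_within(buf: &mut [u8], src: usize, dst: usize, len: usize) {
    // With dependent types these facts could be discharged statically;
    // today they are only checked at runtime.
    assert!(len <= buf.len() && src <= buf.len() - len && dst <= buf.len() - len);
    let p = buf.as_mut_ptr();
    // SAFETY: both ranges were asserted in-bounds above, and `ptr::copy`
    // permits overlapping ranges (memmove semantics).
    unsafe {
        ptr::copy(p.add(src), p.add(dst), len);
    }
}
```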
FWIW there is active work going on for formal verification of Rust (both safe and unsafe code), in the RustBelt project.
In general making unsafe blocks run formal verification would be an interesting thing to do (and would solve this problem completely). I don't think it deserves language support, however (nice-to-have, not must-have). This is a very different goal from your original point of adding a few language features that ease writing lower level abstractions.
--------
Ultimately, you're right. While pcwalton did mention "There's no way to solve that problem without just forbidding unsafe code entirely.", this is a possible alternative: have language support for scoped formal verification that allows you to use "unsafe" operations safely. I think this is an extreme solution to what I consider to be a mostly nonexistent problem.
For really security sensitive code this would indeed be very useful (and is probably a big motivator behind the RustBelt project). Or just use SPARK or something.
But for most Rust users I think the current system is pretty robust and provides enough primitives to write clean, easy-to-verify abstractions with (verifiable) safe API boundaries (when I say "verify" here I mean it in the informal sense). I haven't come across unsafe code doing contortions, and I have had the (mis?)fortune of going through a lot of unsafe code. The only rough edges are with FFI, and these are mostly due to a lot of things about unsafe code being underspecified (which don't crop up as often in pure Rust unsafe code, but do crop up when you throw some C/C++ FFI into the mix). There is active work on specifying the exact semantics of unsafe code, however, so that should be fixable once it happens.
> This would also be unsound. Multiple people develop software in a library. Any future developer on your library will automatically assume that any safe Rust function can be called with any parameters and not invoke undefined behavior. That is the default assumption of any Rust developer.
That's quite the broad generalization. Obviously, if you're at all sane, you're documenting your internal invariants and keeping the section of code upholding them as small and tightly packed as possible. Rust's unsafe keyword is simply a tool that lets us use interfaces without being fully familiar with their implementations.
It's generally a great idea to encapsulate unsafe code into a sound internal interface whenever possible, but this doesn't necessarily require that every single function be its own standalone interface. What matters is that it's clear to the maintainers that the callers (which should best be nearby in the code) are bound by its requirements.
> Again, if you can invoke undefined behavior by calling ANY safe function with ANY form of input, then the function is unsound. This is simply how Rust assumptions work.
Suppose I have a function that internally creates an Option<T> in such a way that a certain invariant is upheld. I call Option::map() on it, passing it an inner function with an unsafe block assuming that the invariant is upheld. The inner function is called nowhere else. By necessity, this inner function is not marked unsafe, since Option::map() requires a safe FnOnce.
Is there unsoundness in my code? I say not, since the inner function is not a standalone interface; it is an implementation detail inseparable from its outer function. There is absolutely no way to invoke my code as it stands so as to cause UB.
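Something like this (toy example, illustrative names):

```rust
static DATA: [u8; 4] = [10, 20, 30, 40];

fn lookup(i: usize) -> Option<u8> {
    // Invariant established here: we only produce indices < DATA.len().
    let idx = if i < DATA.len() { Some(i) } else { None };

    // `Option::map` requires a safe `FnOnce`, so the closure cannot be
    // marked `unsafe`, even though its body leans on the invariant.
    idx.map(|i| {
        // SAFETY: `i` comes only from the check above; this closure is
        // an implementation detail, called from nowhere else.
        unsafe { *DATA.get_unchecked(i) }
    })
}
```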
> I suggest you re-learn how to program Rust, personally. You've picked up some really bad habits that don't line up with general consensus on how to code in Rust.
I would prefer that you not make baseless allegations about my code. I mention that pitfall only from experience reviewing others' unsafe code. Again, if you're sane, you keep the parts of the code manually upholding your invariants as small as possible. Marking your internal functions unsafe and clearly documenting their preconditions is a perfectly fine way to go, and that's how I write my own code, being a sane person.
But having an invariant held across multiple functions does not automatically make your code unsound. In certain situations (internal implementations of external traits, callbacks from external code), it is even necessary to have an unsafe precondition pass through the call of a function or closure marked safe. All it takes is the recognition that not every single function is its own standalone interface.
> 1. An unsafe block tells the compiler "trust me, I know this is unsafe and I know how to do it safely"
Right, so aren't you giving up the biggest advantage of Rust, that your code will be safe by virtue of it getting through the Rust compiler's safety checking?
> Do you validate every single unsafe code in every single library you use? Even the standard library? Every time?
We don't need to? Rust's main soundness theorem says that if a safe Rust API is sound, then safe Rust code using it is sound.
So the only thing we need to check is that the unsafe code that _we_ write is sound, and we 100% do this all the way for the standard library. In the standard library _every_ unsafe block has a comment explaining why the "user-provided soundness proof" that they represent is correct. We even have a linter that actually rejects PRs that add unsafe blocks to the standard library without such comments.
> Do you validate every single unsafe code in every single library you use? Even the standard library? Every time?
Soundness is IMO Rust's main feature and its main core value, and is more important than other Rust core values like, e.g., zero-overhead abstractions, which are also extremely important. Soundness is what enables "Hack without fear", "no segfaults", "refactor without fear", cargo ("large scale software without fear"), crates.io ("using other people's libraries without fear")... and it's what sets Rust apart from unsound-by-design languages like D, Nim, Zig, C, C++, etc.
I haven't read the book in a long time, but the documentation I do read (nomicon, spec, unsafe code guidelines, issue tracker, internals) makes it very clear that making sure that unsafe Rust code is correct is critical for the ecosystem and for users to benefit from the main advantages Rust has to offer.
Perhaps you can clarify - if you're saying that unsafe Rust which performs undefined behaviour is unsafe regardless of the safe bits, then you'll get no disagreement from me. If on the other hand you're suggesting that it's possible to have well-defined unsafe Rust that exposes a lack of safety across the unsafe boundary, then you're going to have to explain a bit more...
The point that I'm trying to make here is that you cannot make any assumptions about an unsafe block. Anything can happen, including really terrible undefined behavior. But the fact that anything can happen is why Rust is as powerful as C in this area.
Rust doesn't unconditionally ensure memory safety or safe concurrency. It ensures memory safety - including data race safety, just one part of safe concurrency - assuming you never use any unsafe code and that the standard library is free of bugs. I'm happy to assume the standard library is free of memory safety bugs, because you have to trust something.
But I'm not happy trusting that dependencies aren't using unsafe code, and I'm not happy claiming that Rust ensures safety, when it ensures safety only if you assume that unsafe blocks aren't unsafe.
The problem is that you can't check unsafe blocks locally. Checking that each individual unsafe block doesn't have undefined behaviour requires checking the entire program.
It's better than nothing, without a doubt, but it isn't safe.
We must be talking past each other, because I'm still not certain what you're trying to say.
If you're trying to say "there exists Rust code that contains `unsafe` blocks that is memory-safe", then this is obviously (hopefully!) correct, because 100% of the time we hope that our `unsafe` blocks are correctly implemented. In the same sense that "valid" C code does not contain undefined behavior, "valid" Rust code doesn't either; therefore the correctness of `unsafe` blocks in Rust must be tautologically (if uselessly) guaranteed. But that's not a very useful statement. Bugs happen.
Conversely, if you're trying to say "there exists code that can only be written in `unsafe` blocks but that can never cause memory unsafety despite any modification to the surrounding code", then I'd say this is trivially refutable; I can modify any unsafe block to exhibit memory unsafety.
Finally, if you're just saying "static analysis must, by its very definition, reject some correct programs in order to guarantee that the programs it does accept are correct, and Rust's static analyses are no different" then, of course, this is true (again, by the nature of static analysis), but again this is not an especially useful statement, especially since this isn't what anyone here is disputing. Your original comment was made in reply to this statement by rabidferret: "`unsafe` literally means 'this might violate memory safety'". That is a true statement, both in a social context (see my original comment) and in an implementation context (see the second example from this comment).
The bottom line is, if you see an `unsafe` block, assume memory safety is at risk unless you're absolutely sure the author knows what they're doing.
> The obvious implementation is for unsafe to be infectious like const. You have unsafe code, your crate is unsafe. You depend on an unsafe crate, your crate becomes unsafe.
That would mean everything is unsafe, since every crate depends on core (or on std which depends on core), which has "unsafe" code.
The design of "unsafe" in Rust, instead, is to allow building safe abstractions on top of unsafe code (or be able to clearly mark when the abstraction itself is unsafe). That way, for instance, users of `Vec::push` do not have to worry that it uses uninitialized memory (which is unsafe).