So You Want to Rust the Linux Kernel? (paulmck.livejournal.com)
2 points by lukastyrychtr | 2021-10-11 03:58:08 | 331 comments




I couldn't back out of this page on mobile.

Works fine for me on Chrome

Also hit the back button trap with Firefox on Android.

Ew


Is this an ad for Chrome or are you disputing the comment?

I realize chrome is a sensitive subject here, but I'm just narrowing down the problem :)

If it's Firefox on Android, long-pressing the back button gives you a history list that allows going back. It's a bug that happens on many sites, where the history gets a bunch of duplicate entries of the current page for some reason.

I think it is not a bug but abuse of history. Especially when landing on some shady site via Google - good luck backing out. Right-clicking on "BACK" helps, as it shows the history. Sometimes with MANY duplicate entries.

It's not a bug; a huge number of sites are doing this now, or hijacking the back button completely to send you to an ad or their homepage.

There are probably sites doing this maliciously but the cases I've seen are weird things like only some subreddits and even then not always. I suspect there's a bug somewhere in a commonly used web framework or in Firefox mobile itself. This site for example doesn't have the issue in the desktop version of the site, only the mobile one.

Me too. Looks like I got redirected two or three times there. Not sure if it's a bug or marketing crap.

> On the other hand, might the rumored sudden merge of the ksmbd driver (https://lwn.net/Articles/871098/) have been due to the implicit threat of its being rewritten in Rust?

That links to a message saying this:

> "ksmbd was then merged unbeknownst to me and my worries confirmed to be true as some out-of-bounds bugs (which are impossible in Rust) surfaced immediately."

Sad.


I would like a link to the bugs in question.

The only bugs I found have a long email chain with people being surprised that the kernel module even passed any tests at all, with extensive changes needed to fix logic issues in the path conversion code. The devs even added an unload/reload cycle to the tests because the module was in a bad enough state that it wouldn't consistently run over several iterations. I wouldn't be surprised if they would simply turn to unsafe if they had to write it in Rust; errors are apparently to be worked around, not fixed. [1]

[1] https://lore.kernel.org/lkml/CAH2r5mvu5wTcgoR-EeXLcoZOvhEiMR...


> I wouldn't be surprised if they would simply turn to unsafe if they had to write it in Rust; errors are apparently to be worked around, not fixed.

They probably would. But usages of unsafe are easier to find through grep or the like than the equivalent hazards in C, because Rust's syntax requires unsafe code to be explicitly marked. To use one of my own codebases as an example, I can look at the instances of the following regex patterns:

    * "\bfn\b" (function declarations), 138 matches.
    * "\bunsafe\s+fn\b" or "\bunsafe\s+extern\b" (unsafe function declarations), 23 total.
    * "\bunsafe\s*\{" (opening an unsafe block), 41 matches.
    * "\bunsafe\s+impl" (unsafe trait implementation), 0 matches.
We can also look at the total lines of code written (2269), and ask whether these numbers feel reasonable for what it's doing. Obviously that will require familiarity with Rust, what's being implemented, and the expected amount of unsafe based on previous usage in the kernel, but it would at least give an overall feel. And, of course, if you want to examine any of them, you know exactly where they are.

Wait wait, so somebody panicked that something might be rewritten in Rust and they rushed to merge C code which was then proven to have memory safety bugs?!

Is that what happened?


Genuine question for Rust programmers:

Memory safety without GC is a big selling point of Rust. However, sometimes your program’s ownership model does not fit with what the Rust borrow-checker can verify. Not to worry, just use “unsafe”. But if “unsafe” is used, can we still claim that Rust code is safe?


No, but it makes it explicit regarding what's unsafe, so that you'll more easily know where to put resources to make the unsafe code as bomb proof as possible. In C that's 100% of the code, in Rust you can keep it to a low percentage.

Generally, the idea is that you use a small amount of unsafe code to build a safe abstraction to do what you need, then use that abstraction in many places. So now you only have a few unsafe sections you have to verify for correctness, instead of the entire program like you'd need in C/C++.
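
A classic small illustration of this pattern (mirroring the standard library's split_at_mut; shown only as a sketch of how a safe API wraps unsafe internals):

    use std::slice;

    // Safe API over unsafe internals: the invariant is checked before
    // the unsafe code runs, so safe callers cannot misuse it.
    fn split_at_mut(values: &mut [i32], mid: usize) -> (&mut [i32], &mut [i32]) {
        let len = values.len();
        let ptr = values.as_mut_ptr();
        assert!(mid <= len);

        // SAFETY: `mid <= len` was just checked, and the two halves
        // don't overlap, so bounds and aliasing rules are respected.
        unsafe {
            (
                slice::from_raw_parts_mut(ptr, mid),
                slice::from_raw_parts_mut(ptr.add(mid), len - mid),
            )
        }
    }

    fn main() {
        let mut data = [1, 2, 3, 4, 5];
        let (a, b) = split_at_mut(&mut data, 2);
        a[0] = 10;
        b[0] = 30;
        assert_eq!(data, [10, 2, 30, 4, 5]);
    }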

Ideally the unsafe parts are restricted to modules which have been proven safe to an appropriate degree of confidence. ('Unsafe' in Rust really means 'not guaranteed by the type system to be safe', so it's not a contradiction in terms to show that 'unsafe' code is safe.)

One might respond that you can do the same kind of modularization in C++. However, I think a Rust advocate would argue that it's much more difficult in practice to modularize unsafe code in C++ to the same degree, since unsafe idioms are deeply embedded in the language and library ecosystem.


unsafe code blocks are required to uphold the invariants of the safe Rust code surrounding them. So yes, if you fail to uphold those invariants (and that's proven to be pretty tricky), your program is not safe.

My experience is that you hardly ever need unsafe Rust code. If your ownership model doesn't fit the borrow checker, there are of course also other avenues, like refcounts or even an actual garbage collector (which you can choose to apply only to values whose ownership situation is not aligned with the stack). Only when those options are unable to give you the performance you want will you need to reach for unsafe code.

Other scenarios where you'll need unsafe might include things like SIMD (although safe wrappers exist) or FFI, but hopefully we'll evolve the Rust ecosystem over time to concentrate the unsafe code in a few places by building abstractions, where they are thoroughly reviewed and well tested.


> My experience is that you hardly ever need unsafe Rust code.

I doubt that experience applies equally to kernel development.


It may apply better than one might think: the example SPI driver posted at https://lwn.net/Articles/863459/ only has two cases of unsafe.

Of course, the kernel:: modules it uses undoubtedly have more, but I imagine the idea is to provide abstractions that allow actual driver code not to use unsafe.

And then there's stuff like file system code, that—I imagine—would require no more uses of unsafe than a database server written in Rust.


Because it's normal (indeed expected) to build safe abstractions on top of necessarily unsafe code, a lot of the kernel by volume shouldn't need to be unsafe at all.

For example suppose that we're back in the late 1990s and so for some reason dozens of different yet roughly equivalent 100Mbps fast Ethernet PCI cards are available, each offering the same functionality but with slightly different register layouts, interrupts and quirks. We are of course writing drivers for them in Rust.

Obviously at some fundamental level "configure PCI bus" isn't safe - if you get it wrong that'll be bad. But as the author of mediocre Ethernet driver #42 you aren't implementing "configure PCI bus" you're just calling a safe abstraction that worked for the other 41 drivers.

Handling userspace calls to twiddle the Ethernet interface is potentially unsafe too, but likewise that will have been factored out into common code and you're just talking to a safe abstraction.

Rust's type system and borrow checking makes it easier to write a safe abstraction for relatively low level ideas like "Only one device can own this MMIO register" or even "this can be 0x03 or 0x06 or even 0x18 but it can't be all zeroes, that's not a thing".

This sort of stuff is boring, and in principle it's already not particularly unsafe in C today, where this sort of abstraction is already done - but of course in C you've no practical way to get the promises Rust offers for safe code.
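
To make the register example concrete, here's a minimal sketch (hypothetical values for an imaginary NIC, not from any real driver): an enum with explicit discriminants makes "all zeroes" unrepresentable, so driver code never needs to validate it.

    // Hypothetical link-mode values for an imaginary Ethernet chip.
    // There is no variant for 0x00, so safe code cannot construct it.
    #[repr(u8)]
    #[derive(Clone, Copy, Debug)]
    enum LinkMode {
        HalfDuplex10 = 0x03,
        FullDuplex10 = 0x06,
        FullDuplex100 = 0x18,
    }

    fn write_link_mode(mode: LinkMode) {
        let raw = mode as u8;
        // a real driver would write `raw` to the control register here
        println!("writing {raw:#04x}");
    }

    fn main() {
        write_link_mode(LinkMode::FullDuplex100);
    }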


Code using unsafe might be safe, it just can't be proved to be safe by the compiler.

Just as code written in C might be safe, it just can't be proved to be safe by the compiler! Rust's advantage over C is that the machine-unprovability is confined to small, defined zones, where hopefully human programmers can, formally or informally, prove the code safe.


Unsafe in Rust means "you have to enforce the usage contract manually, as opposed to the compiler doing it for you." So yes, if the contract is violated there, the entire program is contaminated.

Yes. The idea is to isolate the use of unsafe and prove that it is safe.

Because the compiler cannot prove that what you're doing is safe, it is your responsibility to do so. Once you've _proven_ that your unsafe code is safe, it cannot have propagating effects in the rest of the code.

A lot of people get this confused. They think unsafe means you can do whatever you want. In reality it means you'll play by the rules even if the compiler cannot verify them all. In truth it is not as hard as it sounds; we do it in C all the time.


Should have been called "trust_me" instead of "unsafe".

It's never really going to be perfect. E.g. your idea doesn't really work with unsafe functions, which can only be called from unsafe contexts but are also themselves an unsafe context.

`unsafe` in rust really means that some of the compiler guardrails are disabled:

* you can call unsafe (and extern) functions

* you can dereference raw pointers (this has significant consequences such as being able to create mutable references out of thin air)

* you can access or modify mutable statics

* you can access `union` contents
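
A tiny sketch of the first two points (illustrative only): calling an unsafe function and dereferencing a raw pointer both require an enclosing unsafe context.

    unsafe fn read_through(p: *const i32) -> i32 {
        *p // raw-pointer dereference: not allowed in safe code
    }

    fn main() {
        let x = 42;
        let p: *const i32 = &x; // creating a raw pointer is safe
        // SAFETY: `p` points to a live, properly aligned i32.
        let v = unsafe { read_through(p) };
        assert_eq!(v, 42);
    }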


It could, and it would be accurate, but they're trying to implicitly teach the idea to "avoid these." trust_me blocks lack that implication.

It is very easy, particularly coming from another language, to butt heads with the language's design and then try to solve it with "trust_me" blocks. Throwing everything in unsafe feels like you're doing it wrong, and rightly so.


My above comment was meant to be an illustration rather than a proposal. I think calling it "unsafe" is a good design choice as you explained.

The downside is: every time this discussion comes up, someone has to explain what "unsafe" actually means. Every single tech talk that touches on "unsafe" prefixes the talk with this explanation too.

A funnier idea instead of "trust_me" would be "safe_because[...]" where "..." is a rationale about the code plus your social security number and biometrics data.

For any control freaks and regulators out there: If you take the above seriously and try to implement such a thing you will be burnt at the stake by cyber elves.


A less extreme variation would be "unsafe_because "X" " where X must be a string of length at least 50.

You want to do something that the compiler can't understand but you know how it works? Ok, write down _why_ it works so others can know that, too.

Would it be possible to write the above as a Rust macro, and mark invocations of the ordinary "unsafe" operator as a warning / linter alert?
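
A rough sketch of what the macro half could look like (entirely hypothetical, not an existing crate; the 50-character minimum is enforced with a const assertion, while warning on plain `unsafe` would need a lint on top):

    macro_rules! unsafe_because {
        ($reason:literal, $body:block) => {{
            // Compile-time check that the justification is at least
            // 50 characters long, per the suggestion above.
            const _: () = assert!($reason.len() >= 50);
            unsafe { $body }
        }};
    }

    fn main() {
        let v = [1u8, 2, 3];
        let first = unsafe_because!(
            "the array `v` is a non-empty local, so reading index 0 is always in bounds",
            { *v.as_ptr() }
        );
        assert_eq!(first, 1);
    }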

PS: I actually wrapped most F# unsafe (= partial) functions in our codebase in such a way. `Option.get x`, which extracts a possibly-null value and crashes otherwise, is deprecated and replaced by `Option.getOrFail fmt args x`, where you say why you are sure the value can't be null and provide context variables to diagnose a possible failure.

e.g.

     let title =
       textLines
       |> List.headOrFail "We generated the line lists by subgrouping the original list %A, so we know they're non-empty" originalList

The style guide for most projects is to include a comment on the unsafe block's header. Clippy has a lint for unsafe fns [1], and with either changes to the Rust AST (to not discard comments) or to the acceptability of doc attributes on unsafe blocks (to allow the use of a /// comment on unsafe blocks), that lint could also work for unsafe blocks.

[1]: https://rust-lang.github.io/rust-clippy/master/#missing_safe...
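
For illustration, the documentation convention that lint checks for looks roughly like this (a minimal sketch):

    /// Returns the first element without bounds checking.
    ///
    /// # Safety
    ///
    /// `v` must be non-empty, otherwise the read is out of bounds.
    unsafe fn first_unchecked(v: &[u8]) -> u8 {
        *v.as_ptr()
    }

    fn main() {
        let v = [7u8, 8, 9];
        // SAFETY: `v` is non-empty, satisfying the documented contract.
        let x = unsafe { first_unchecked(&v) };
        assert_eq!(x, 7);
    }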


Yes, from the compiler's point of view, that's the idea, and it is right there in the manual: "Trust me, I know what I'm doing."

But the keyword "unsafe" has been used in C# for quite a while, and probably in other languages too. So by the time Rust came out, it was already a well understood concept.

In the C# manual: "the execution engine works to ensure that unsafe code cannot be executed in an untrusted environment". Still that notion of trust.

One problem I see with "trust_me" or maybe "trusted" is that it gives the opposite message from a user point of view. Trusted code is code the compiler should trust but that you should not trust.


"Trust me, I know what I'm doing" is exactly what every C programmer thinks when writing code.

D still has some issues with the safety story, but it has three safety attributes: @system (unsafe code; may do pointer arithmetic, may call C code), @safe (safe code; not allowed to do those things), and @trusted (unsafe code, verified by the programmer, which can be called by @safe).

The main issue is that it came late in D's lifetime, so until the situation improves, @system is the default, because there's too much std/legacy code which still assumes @system as the default. Many new projects however start with @safe as the default.


Yes, exactly: it provides a bounded area to review, whereas in C or C++ we can have unsafe stuff everywhere. And if you do find a leak or crashes, it's likely to be in unsafe code.

The way I view it is that the unsafe keyword acts as a flag for auditors that they would need to thoroughly check the following block for memory errors.

> However, sometimes your program’s ownership model does not fit with what the Rust borrow-checker can verify.

This tends to be a beginner issue for C/C++ programmers. 99% of the time you just need to rethink your design into something better.


unsafe Rust code is not safe, but the rest of the program is, and the unsafe blocks provide very nice benefits:

- limit the size and scope of the unsafe code

- make the unsafe part stand out, "here be dragons", so you know when to be particularly alert

- act as a safe way to encapsulate unsafe code so that it can be used by safe code, and used without the lack of safety propagating outside of the unsafe zone


Yes, we can claim so because there are other ways of proving your program to be safe. See: https://plv.mpi-sws.org/rustbelt/

By the way, several parts of the std (Mutex and Arc for example) are written in unsafe Rust. They have to be, by necessity, because they allow for aliasing and mutation.


unsafe {} does not disable type checks or ownership rules. It (basically) only allows dereferencing a raw pointer. It's a way of telling the compiler "in this specific place, for this specific variable I manually verified correctness of dereferencing that pointer".

C/C++ is not just "everything is unsafe", it's also lacking many powerful features that exist even inside unsafe{} blocks: ergonomic enums with associated data, type-checked generics, ergonomic moves etc.


First, there's a misconception to be cleared :)

Unsafe code is not the only solution to borrow checking problems, as Rust supports reference counting, which is an implementation of shared ownership.

Whether this defeats the point of Rust is subjective, but regardless, it's a solution to at least certain ownership problems.

A very typical area where you can observe this is bidirectional trees. They're ugly in Rust! :)
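
For instance, a tree with parent pointers can be written in safe Rust with Rc for the owning direction and Weak for the back-edges; a minimal sketch:

    use std::cell::RefCell;
    use std::rc::{Rc, Weak};

    struct TreeNode {
        value: i32,
        // Non-owning back-edge: upgrading yields None once the parent
        // is gone, instead of a dangling pointer.
        parent: RefCell<Weak<TreeNode>>,
        children: RefCell<Vec<Rc<TreeNode>>>,
    }

    fn main() {
        let root = Rc::new(TreeNode {
            value: 0,
            parent: RefCell::new(Weak::new()),
            children: RefCell::new(vec![]),
        });
        let leaf = Rc::new(TreeNode {
            value: 1,
            parent: RefCell::new(Rc::downgrade(&root)),
            children: RefCell::new(vec![]),
        });
        root.children.borrow_mut().push(Rc::clone(&leaf));

        // Walk back up through the Weak reference.
        let parent = leaf.parent.borrow().upgrade().unwrap();
        assert_eq!(parent.value, 0);
    }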

Now, memory unsafety is not a black and white matter. In my experience with the community, you'll be surprised at how much pragmatism there is. Nobody will scream sacrilege if one expresses the necessity to use memory unsafe code¹.

The connection to your question is that, overall, the Rust memory safety model is not all-or-nothing; rather, it's a spectrum of safety above the conventional entirely-unsafe approach. With this in mind, I'd say that "98% Rust-safety" is well beyond the "100% unsafe" approach of traditional low-level languages.

As a matter of fact, I suppose that in the future, we'll see languages developing intermediate memory safety features (I think Zig was doing something like that).

[¹] The actix-web case is a very interesting one. I haven't looked at it, but my personal suspicion is that it made use of _wildly_ unsafe code, and that usage of unsafe code was not the problem; rather, unnecessary usage was.


> The actix-web case

IIRC it was mostly the initial reluctance of the author to merge PRs that rewrote unsafe to safe Rust. Some escalation, some emotions, and then everything moved forward in harmony and with more safety.


> IIRC it was mostly the initial reluctance of the author to merge PRs that rewrote unsafe to safe Rust.

No, not really. Spinning the actix-web episode as the author's reluctance to accept PRs is a gross misrepresentation of the problem faced by the project.

The author even went to great lengths to explain that his use of unsafe code was indeed safe.

For context, here's what the author had to say about that regrettable episode.

https://github.com/fafhrd91/actix-web-postmortem


> No, not really. Spinning the actix-web episode as the author's reluctance to accept PRs is a gross misrepresentation of the problem faced by the project.

No, not really. The author, a professional at Microsoft, was a hobbyist with Rust who (by his own confession) wished to explore the limits of performance using various techniques. People made a mountain out of the fact that his project rocketed up to the top of the TechEmpower benchmarks -- as if this obligated him to conform to their flash-mob wishes regarding sanitized code, whereas the project very explicitly made it clear that production use was at one's own risk.

What happened to him was a hit-job by a bunch of entitled bullies. No less than Steve Klabnik acknowledged this ("far, far, far over the line" [0]). The fact that ntex exists today [1] and the world has somehow not exploded demonstrates how overblown the criticisms were.

[0] https://steveklabnik.com/writing/a-sad-day-for-rust

[1] https://github.com/ntex-rs/ntex


As an outsider, as it was clear that the author wasn’t that interested in reviewing safety patches, wouldn’t a “safe-actix” fork maintained by someone else have made everyone happy?

> No, not really. The author, a professional at Microsoft, was a hobbyist with Rust who (...)

It's very hard to believe that "a hobbyist" would, by mere chance, write the absolute best performing framework in the world.

https://www.techempower.com/benchmarks/

Your comment not only lacks credibility but also sounds extremely petty.


The problem is that actix-web (at the time, at least) was also quite benchmark-oriented -- it was doing a lot of things you don't necessarily want to do in production in order to get better benchmarks.

And that was especially true of the benchmark code itself.

TechEmpower's methodology, additionally, is not perfect. IIRC they support HTTP 1.0 only, which is not what most modern servers are focusing on, but actix-web did.


You can't boil what happened down into a black-or-white simplification. Because what happened is more complicated than that. Some people went over the line, absolutely, but that doesn't mean any criticism of the project was overblown or inappropriate. Lodging criticism against a project doesn't have to imply any sort of obligation on anyone.

There was some pretty bad misuse of unsafe in actix-web. And as I recall it, the author was resistant to fixing it, even when patches were provided. This is the author's right. It's their project. But it's anyone else's right to make this sort of thing visible, and to advocate against holding up actix-web (at the time) as a good example of what Rust programmers could deliver.

A lot of people can't or won't communicate effectively, and often blur these lines. Either assuming there is entitlement where there is none, or expressing entitlement when they didn't mean to. But there is a reasonable position underlying what happened, and unfortunately some took it too far.


> The author even went to great lengths to explain that his use of unsafe code was indeed safe.

Some places in the code were actually unsound (which means: a function containing unsafe was marked safe even though it didn't uphold the invariants of the unsafe block, meaning you could trigger UB from safe code; this is a big no-go in Rust).

That being said, the whole drama with people mobbing the author on every social media platform, calling him names and so on, is indeed regrettable.


> The author even went to great lengths to explain that his use of unsafe code was indeed safe.

That's what any author of unsafe code is supposed to do right? Explain why the unsafe code is in fact safe (enough).


It is, yes. The parent is defending them, not trying to say that they were wrong.

With actix, the author was both misusing unsafe and being completely unreceptive to PRs that resolved it with safe variants, claiming it was not dangerous in its current internal state and that the safe variant would shave off percentage points and possibly cost it its spot as an ultra-fast web library.

Copy paste from https://this-week-in-rust.org/blog/2021/10/06/this-week-in-r...

There's a common trope among people unfamiliar with Rust where they assume that if you use unsafe at all, then it's just as unsafe as C and Rust provided no benefit. Comparing C's approach to safety vs Rust's is like comparing an open-world assumption to a closed-world assumption in formal logic systems. In C, you publish your API if it's possible to use correctly (open world). In Rust, you publish a safe API if it's impossible to use incorrectly (closed world). Rust's key innovation here is that it enables you to build a 'bridge' from an open world (unsafe) to a closed world (safe), a seemingly impossible feat that feels like somehow pairwise reducing an uncountable infinity with a countable infinity. Rust's decision to design an analogous closed-world assumption for safe code is extremely powerful, but it seems very hard for old-school C programmers to wrap their heads around it.


Interestingly enough, Java has the same question. There is a sun.misc.Unsafe class (https://blogs.oracle.com/javamagazine/post/the-unsafe-class-...) which is used internally to provide the safe code you can use in normal Java code. This is needed since at some point you will always have to go down to something which a machine cannot verify (e.g. "this port will hold the result provided by an external machine, and the machine operator guarantees me it will always be a non-null address"), and the guideline is the same as in Rust: if you have to use this, you are responsible for upholding all invariants, and if you do so and therefore "guarantee" it (usually it's done by tests, so it's not really guaranteed, but you get the idea), people using the safe abstractions can be sure that their code will work.

My point being: unsafe does not invalidate the safety of the overall code; it shows you places where you have to be extra careful. On the other hand, if you yourself do not use unsafe and all the libraries you use are of high quality, you can be sure your code will be correct. Which is the best you can do in reality.


Three things here: 1. Restricting unsafe to a limited number of places is a lot better than having it as the default everywhere. 2. The idea is that if you have to use unsafe, you (or some other module author) would wrap it in an interface that is safe to use. Overall this may not be perfect (in the sense of "it's 100% verified by the compiler"), and finding bugs in supposedly safe abstractions over unsafe does happen, but Rust moves this from "it's the norm" to "it's very, very rare". 3. Unsafe lifts many restrictions, but not all of them, so there still is some validation.

Keep in mind that the GC and large parts of the low-level code fundamental to languages that have GC are written in C, and we are still finding safety bugs there.


Using unsafe is like providing a "proof" that the code you write won't cause safe Rust to exhibit undefined behavior.

Writing unsafe requires a certain degree of skill, because one needs a deep understanding about what guarantees safe Rust provides, to ensure that your abstraction does not break them.

If you read the standard library docs, every unsafe block has a comment explaining / proving why it can't cause safe Rust code to exhibit undefined behavior.

Many of these blocks, particularly those in libcore, have been proved formally correct in proof assistants.

There is a theorem that proves that "if a safe Rust wrapper over unsafe code is proven correct, then extending safe Rust with that wrapper is still sound".

So you basically can infinitely extend the safe Rust subset of Rust with these abstractions.

> But if “unsafe” is used, can we still claim that Rust code is safe?

So the answer to this is "no, you can't claim that, you have to actually go and prove it".

Often, very often actually, the proofs are trivial.

For example, to index into a slice:

    fn index_slice<T: Copy>(slice: &[T], i: usize) -> T {
        // SAFETY: if the index is not in bounds, we panic
        assert!(i < slice.len());
        unsafe {
            *slice.as_ptr().add(i)
        }
    }
suffices.

Why? Because code that creates a slice with a ptr and a len field that do not point to a valid allocation with len valid elements already exhibits UB. That is, for any valid Rust program, we are guaranteed here that these two fields are "ok".

So the only thing we need to make sure of is that the index "i" is inbounds, and that's trivial to do with an assert that panics if this is not the case. That is, at runtime, no program for which the index is not inbounds will reach the ptr.add(i) method, so there is no way to offset this pointer out of bounds, much less dereference it.

Basically, because all safe Rust code can rely on all other safe Rust code upholding the rules, most of the proofs are really easy. The fact that you can prove safe abstractions over unsafe code in isolation from other abstractions is one of the main features of Rust.

---

For synchronization primitives, proving the absence of data-races is particularly hard, and requires a lot of expertise. The subset of Rust programmers that actually need to do this is very small. Most people just use those primitives that have already been proven today, of which there are many.


> Using unsafe is like providing a "proof" that the code you write won't cause safe Rust code to exhibit undefined behavior.

My take is that Rust is proposed based on the value proposition of its Safe Rust feature, but in the Linux kernel that feature has limited use.

Therefore, if Safe Rust is out of the table then what's the point of writing/rewriting parts of the Linux kernel in Rust? What's the value proposition?


> My take is that Rust is proposed based on the value proposition of its Safe Rust feature, but in the Linux kernel that feature has limited use.

Where does your take come from?

Everyone I've heard of proposing Rust for the Linux kernel does so almost exclusively by making the argument that Rust can create safe performant abstractions over unsafe code.

That is, instead of requiring all Linux kernel programmers to be deeply familiar with RCU, you can have a small group of RCU experts create a safe abstraction over RCU, that then all other Linux kernel programmers can just use, without having to understand very deeply how it works. If they make a mistake, their code won't compile, and the error message explains why and how to fix it.
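
As a sketch of the shape such an abstraction could take (invented names, not the actual Rust-for-Linux bindings): tie everything read under RCU to a read-side guard, so holding data past the critical section becomes a compile error.

    struct RcuReadGuard;

    fn rcu_read_lock() -> RcuReadGuard {
        // a real binding would call the kernel's rcu_read_lock() here
        RcuReadGuard
    }

    impl Drop for RcuReadGuard {
        fn drop(&mut self) {
            // ...and rcu_read_unlock() here
        }
    }

    struct RcuCell<T> {
        value: T, // a real cell would hold an atomically published pointer
    }

    impl<T> RcuCell<T> {
        // The returned reference borrows from the guard, so it cannot
        // outlive the read-side critical section.
        fn read<'g>(&'g self, _guard: &'g RcuReadGuard) -> &'g T {
            &self.value
        }
    }

    fn main() {
        let cell = RcuCell { value: 42 };
        let guard = rcu_read_lock();
        let v = cell.read(&guard);
        println!("{v}");
        drop(guard);
        // println!("{v}"); // rejected: `v` cannot outlive `guard`
    }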

This is a huge productivity boost for large complex projects, and the main reason I've seen people use to introduce Rust into the Linux kernel.

The fact that Rust does this with minimal - often zero - runtime overhead is a requirement for the kernel. But even for abstractions for which this is not the case, it still allows people to only have to become an expert when they want to improve performance, and any use of unsafe immediately triggers a requirement for such code to be reviewed by the actual experts.

This is also a huge boon for these experts, since it substantially reduces the code they have to review. E.g. now they only have to review uses of the "unsafe RCU" APIs, instead of also having to review the uses of the safe RCU APIs, which is the case today.


> Where does your take come from?

From this very discussion, for starters.

> Everyone I've heard of proposing Rust for the Linux kernel does so almost exclusively by making the argument that Rust can create safe performant abstractions over unsafe code.

The whole point is that the same thing (creating safe abstractions over unsafe code) is what developers have been doing with C for close to half a century.

Therefore, there must be a value proposition regarding the use of Rust other than Safe Rust. I haven't heard one so far. In fact, I've read the exact opposite, such as compilers needing rewrites to accommodate that use case.

Thus, with Safe Rust out of the table, what exactly is Rust's value proposition?

> This is a huge productivity boost for large complex projects, and a huge safety boost since these abstractions prevent a large set of security vulnerabilities, and these are the main reason I've seen people use to introduce Rust into the Linux kernel.

I feel those baseless assertions require some supporting evidence. Is there actually any tangible and concrete evidence that Rust, especially Unsafe Rust, improves productivity and safety? I'd love to read about it.


> The whole point is that the same thing (creating safe abstractions over unsafe code) is what developers have been doing with C for close to half a century.

They have been trying to do this, but it doesn't work because C's support for this is very poor. Not even C++ managed to do this.

Rust makes it easy to do this in a reliable and efficient way that is 100% guaranteed to work.

> I feel those baseless assertions require some supporting evidence. Is there actually any tangible and concrete evidence that Rust, specially Unsafe Rust, improves productivity and safety?

Yes, see: https://prev.rust-lang.org/en-US/whitepapers.html and https://www.rust-lang.org/production , and well https://dl.acm.org/action/doSearch?AllField=Rust (there are a bunch of peer-reviewed papers there that show this; help yourself).


> Using unsafe is like providing a "proof" that the code you write won't cause safe Rust code to exhibit undefined behavior

It is more like providing an "axiom", as the compiler won't be able to check it and instead has to assume that it is true.

I wonder if there is scope to extend the language in the future to actually add machine checked proofs.

Normally full machine checking doesn't scale to large projects, but if one had to restrict to writing proofs only for the small subset of unsafe code and let the simpler type/lifetime system handle the rest of the code it might be manageable.


> It is more like providing an "axiom", as the compiler won't be able to check it and instead has to assume that it is true.

I don't think one should see this as an axiom, although as you mention one could.

If your unsafe code allows safe Rust to exhibit undefined behavior, Rust makes no guarantees whatsoever about what your program will do.

The convention in the standard library is to provide an actual proof that this won't happen as a comment to each unsafe block.

Right now, there is no tooling to check these proofs, but there is a lot of ongoing research about how to do that, and the standard library already has some tooling to check things about this.

> Normally full machine checking doesn't scale to large projects, but if one had to restrict to writing proofs only for the small subset of unsafe code and let the simpler type/lifetime system handle the rest of the code it might be manageable.

That's the idea.

It is not as simple as just proving each unsafe block independently, but rather all unsafe blocks within a Rust module must be proven together.

From this point of view, it would make sense to only implement one unsafe abstraction per module, and keep these modules as small as possible, providing most of the implementation of the abstraction outside the module using only the safe API.

Turns out, this is already how Rust code is written, since this is not only useful for writing proofs (either via assistants or using pencil and paper), but also helps humans reason about the correctness of the unsafe code by keeping all the "fluff" away.


In respect of ownership models the borrow checker doesn't natively understand specifically, the Rust standard library provides a bunch of obvious things you would want, such as a Reference Counted type and the idea of Weak references. These are of course unsafe internally, but assuming you believe they are correctly implemented, it's safe to use them and so Rust code which uses Rc or Arc (an atomic reference counted type) is safe.

Now, beyond obvious things a beginner would want, you start to get into some pretty cool structures, and many of those are not provided by Rust's standard library, however, if you aren't excited about implementing them yourself others have likely done so. For example maybe you've read about "Hazard pointers" for very concurrent systems where you can't afford to take a lock when destroying shared resources but you want to only destroy resources that are no longer used. Rust doesn't provide Hazard pointers in its standard library, but several crates offer different takes on this.

Again, the implementations are unsafe internally, but if you trust that the programmer got them right, your use of their library is safe.


They have a data structure that will do reference counting just like what most garbage collection systems use.

You see it used every so often.


Ref counting is slow; it causes too many extra memory requests. Also, when the program is multithreaded, access to cache lines recently modified by another CPU core is expensive, even more so than a cache miss.

For these reasons, most garbage collection systems are using mark and sweep, or something similar.


Reference counting, man. When the program's ownership model doesn't adhere to Rust's borrow checker.

Another approach would be to go all in on improving a kernel [1] completely written in Rust, like in RedoxOS [2].

[1] https://gitlab.redox-os.org/redox-os/kernel

[2] https://www.redox-os.org/


One of the main value propositions of Linux compared to any such alternative is the thousands of drivers for almost any device that you can find on the market. Unless you find somebody willing to port them, I see very little practical sense in doing this (unless your goal is to have fun).

Moreover, for many critically important pieces of hardware, such as ARM Mali GPUs, the drivers are only available for Linux and Android.

Why not include support for running those 'legacy' C drivers in that new Rust kernel? Sure, you'd be giving up some of the memory guarantees, but that's no different from when you include some Rust into the current kernel.

There's no stable ABI for drivers. So much so that even the Linux kernel itself isn't compatible with its own drivers without constant maintenance.

There's not even a stable API for drivers. The interfaces they use get changed and rewritten all the time.

My pet conspiracy theory is they do this to discourage kernel forks, so that Linux doesn't become fragmented.

I think supporting Linux drivers was already tried in other kernels (I remember a talk Andrew Tanenbaum gave at a FOSDEM): he tried to incorporate them in Minix v3, and he and his team gave up in frustration.

I don't think we need as many kernel drivers as we have. Many of those drivers should never have been in the kernel in the first place. Userspace would have been adequate.

If something can be done from userspace, it should be done from userspace.

Legacy Linux drivers could be sandboxed in user space inside a fake Linux kernel environment, a bit like how captive NDIS used to provide a fake NT kernel for network drivers run under Linux.


Sounds like you’re suggesting a microkernel architecture. It’s not a terrible idea, but it’s not a new one either and it hasn’t gained a huge amount of traction.

Not practical. The Linux driver ABI is unstable. At best, your hypothetical microkernel will work on a subset of obsolete hardware.

Say you have a USB coffee maker with a custom kernel driver and you want to put that driver in userspace. Just take one specific kernel version that has that driver, build it as a microkernel (so the kernel runs as a user mode process) including that driver, and then make that driver talk to the rest of the system over a stable interface like dbus. (Think of something like UML.)

The fact that the kernel ABI is unstable is irrelevant because you're not relying on ABI stability. You're using a specific version of the core kernel paired with a specific version of the driver.


Ahh ic. That's very different from the example you gave, NDISwrapper (which does rely on the ABI stability of Windows drivers).

Yeah your kernel could totally support USB coffee makers using USB passthrough to a Linux VM as long as (1) you write a Linux daemon for each device that lets you talk to the hardware over RPCs from the host and (2) all application software that wants to talk to said hardware uses your dbus API. That's a hell of an ask, but it's doable.

But what about the xHCI drivers for USB? You still have to write that for every USB chip on the market.

What about file systems? I've been burned by FUSE performance.

What about GPU drivers? This sucks on Linux and it'll suck even harder on your hypothetical kernel.

Or any device that sits on the PCIe bus for that matter. There are so many drivers that rely on running in a shared address space with data structures and hardware state shared with the rest of the kernel. At the very least you'll have to support the 2 new motherboard chipsets that come out every year. To be clear, I'm just talking about desktops. Laptops and phones will be a whole new world of pain.


Aside from what others said, a microkernel is much less performant than a monolithic kernel.

I thought Fuchsia was a microkernel? Not to counter your point.

I think post meltdown / spectre the idea of a trade off between security and performance has gained more traction. Perhaps it’s not such a problem.

Shame about the license.

I'm almost certain at this point, Linus will eventually kill the idea of Rust in the kernel. The toxic culture of Rust developers will eventually ruin what was once good about Linux.

> The toxic culture of Rust developers

What a strong and damning claim, it would be a shame if it were left entirely unsubstantiated...


Not the OP, but here's an explanation anyway: the Rust community is full of superficially kind and polite language, enforced ruthlessly, that masks a high level of politics, intrigue, and backstabbing. The problem is worsened by people being unable to talk about it in the open due to the Rust community's enforced superficial politeness --- it's against the rules, you see, to accuse someone who's using such helpful and kind language of doing something wrong.

Example: see the drama surrounding https://news.ycombinator.com/item?id=28513130

The Linux kernel people are completely different. They're direct, aggressive, and have arguments directly, out in the open. Linus will call you an idiot in public, which is a big no-no in "nice" places like Rust, but if you fix whatever it is that's making him call you an idiot, Linus will then take your code and think nothing more about the incident.

By contrast, in Rust-land, disputes might simmer forever because nobody is allowed to be "mean" and bring them out into the open. Things will just mysteriously not happen for you and you'll get more and more frustrated that nothing seems to be making any sense.

In other words: Linux has an old-school hacker culture, and Rust has the culture of a college campus.

People used to the Rust style think of the Linux community as a bunch of toxic aggressive assholes. People used to the Linux style think of the Rust people as a bunch of toxic passive-aggressive assholes.

(I personally much prefer the Linux style.)


> see the drama surrounding https://news.ycombinator.com/item?id=28513130

That was drama, yes. I didn't see any intrigue or backstabbing or superficial politeness.

By superficial politeness do you mean keeping certain things private?


Interestingly, most of the personal websites and journals I've seen of various developers that primarily use Rust in their projects have this exact undertone to them. I'm still not certain why this is, but I can definitely sympathize with your description.

> I'm still not certain why this is

It's mostly sheltered Americans who misinterpret Linus's way of doing things as "toxic". Said sheltered Americans also coasted their way through life without ever having a Linus-esque reality check, so again they perceive that kind of concise communication as a personal material attack on them.


The biggest differentiation is that C is not an opinionated language, whereas Rust is. There are right and wrong answers in Rust (look at Actix, for christ's sake), where often the friction generated is because of a fundamental misunderstanding in how it works. C is much more freeform, where you could probably write 500 incorrect solutions and 5 perfectly safe ones. That sort of programming style worked great for the early kernel org, but now we're shifting our focus towards security and correctness. C still works, but Rust is simply a lot more ergonomic for the majority of situations.

I've never seen people refrain from engaging in technical debates in the Rust community because they were scared of seeming unkind or impolite. I'm sure that one could argue that other programming language communities have issues with this one way or the other, but the complexity of things that are decided wrt. Rust is such that the way people phrase things just doesn't affect much. There's simply no harm in being kind and polite about this stuff, having good arguments is ultimately what matters.

Classic example of filter bubble. I've seen what you describe about the Rust community in many others -- and only occasionally in the Rust one.

Super strange comment, yours. It goes into "Rust devs can't take criticism" land and has no leg to stand on at that.


There's previous discussion on Reddit at https://www.reddit.com/r/rust/comments/pzo1v9/so_you_want_to... with important points.

Reading through the descriptions of the fast and sophisticated memory synchronization techniques in Linux that Rust cannot express in safe code, I have to wonder how many optimization opportunities would have gone undiscovered, forever, if the kernel had been written in Rust from the start.

Language shapes how we think about the world. Ideas that are hard to describe in the language of the mind are hard to imagine in the first place. What potential avenues for improvement might we lose by shifting computing to safe but constricting languages like Rust?


Rust is not confining at all; you always have an escape hatch through unsafe. But even without that, programs that fail to compile would mostly all be faulty ones.

> Rust is not confining at all,

Yes, it is, and the linked blog posts talk about how. Did you read them?

My point isn't that some ideas can't be expressed in Rust. It's true that with sufficient effort, Rust can express anything.

My point is that some ideas are harder and more awkward to express in Rust compared to C, and that as a result, people might have been less likely to invent those difficult-to-imagine ideas if they'd been working in Rust and not C.

I'm basically arguing for the "weak" variant of the Sapir-Whorf hypothesis as applied to programming languages. The weak form of Sapir-Whorf is obviously true with respect to human languages.

To the responses below:

Yes, it's true that Rust's higher level of abstraction allows for optimizations difficult to express in C. It's worth pointing out that C++ has similar expressiveness but doesn't constrain access to the machine in the same way.

> Sure some things might be harder to express in Rust, but the tradeoff is well worth it.

You're missing my point. For some types of code, the safety Rust provides might be "worth it". I'm just suggesting that there really are performance-enhancing techniques that would have gone uninvented in a Rust-only world.

You seem to be agreeing with me and adding on top that we don't need that stinking performance anyway, so nothing I said presents a problem. Can you see how someone might disagree with that perspective?


Yeah, some ideas are harder in Rust than in C, but I argue that most of those should not be followed in the first place, and are the way they are due to C's own Sapir-Whorf effect, whereby it can't abstract away jack shit. E.g. the insane overuse of linked lists: they're pretty much not useful on any modern computer, yet they're used all over C codebases because the language can't abstract something as simple as a vector. C strings are also abominations, and are inefficient on top of that!

All of these are possible in an elegant/comfortable to use way in “higher-level” languages, like Rust and C++.


> Yes, it is

Ok, let's assume it is confining (something that I disagree with). So what? Seat belts are confining as well.

Sure some things might be harder to express in Rust, but the tradeoff is well worth it.

I'd rather take the confinement of Rust than the pervasive paranoia of wondering whether I'm going to hit C UB anywhere.


You do well to point this out. Can you also see that the original claim was therefore overblown? (not an attack -- we all do it sometimes)

The opposite also holds true. Because Rust supports much higher level abstractions than C, it can be much faster and easier to experiment with different designs than with C. Parallelism is the classic example, but I also really like Bryan Cantrill's blog post from 2018 [1], where he was able to use a BTreeMap as a drop-in replacement for a HashMap, and gain a lot of performance that way.

[1] http://dtrace.org/blogs/bmc/2018/09/28/the-relative-performa...
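
The mechanics of that kind of experiment are cheap; a minimal sketch (not Cantrill's actual code) of why the swap is usually a one-line change, since both maps share the same basic API (provided the key type is Ord):

    use std::collections::BTreeMap;

    fn main() {
        // was: HashMap::new() -- only the type and constructor change
        let mut counts: BTreeMap<&str, u32> = BTreeMap::new();
        for word in ["the", "quick", "the"] {
            *counts.entry(word).or_insert(0) += 1;
        }
        assert_eq!(counts["the"], 2);
    }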


Regardless of anything else you've said:

> The weak form of Sapir-Whorf is obviously true with respect to human languages.
No, it's not. Nothing is "obviously true", least of all a pretty grandiose theory, and I can't find any source that empirically backs up the idea that the weak version of the hypothesis is true at all.

Regarding performance, the opposite side of your question is just as relevant: how many optimization opportunities have been skipped because the C API is too hard to guarantee correct usage of [1]? Rust has already surpassed C on the benchmark game; can that be used as data points?

[1]: http://dtrace.org/blogs/bmc/2018/09/28/the-relative-performa...


The Rust they write for those benchmarks looks like this:

https://benchmarksgame-team.pages.debian.net/benchmarksgame/...

Just as safe and fast as C++, but with more work done to micro optimize.


You should read the author's other post: http://dtrace.org/blogs/bmc/2018/09/18/falling-in-love-with-...

They weren't trying to game benchmarks in their analysis.


> Rust has already surpassed C on the benchmark game…

Surpassed?

Looks kind-of similar.


> Given that DEC Alpha famously had difficulty with RCU, it is only reasonable to ask how Rust will do with it.

> Including some wild speculation about how Rust's ownership model might be generalized

I recommend starting with the latest post in the series and the conclusions (https://paulmck.livejournal.com/64209.html and https://paulmck.livejournal.com/65056.html).

From the snippets at the beginning, it looks like the author documented how they started learning about how to solve these problems in Rust, and it does not take too long until the comment section starts pointing the author at crossbeam-rs, GhostCell, etc. which only start appearing in the series in the last couple of sections.

Learning Rust by implementing a linked list is pretty hard, but "learning Rust by implementing RCU" has to be, by far, the hardest way to learn Rust that I know of. Kudos to the author for pushing through this so quickly and thoroughly. The posts are all excellent.

I wish the author would go a little bit more into crossbeam-rs. One thing the author does not consider is that a safe Rust interface to RCU is essentially a "proof of correctness" for RCU, i.e., it proves that using RCU via that interface cannot introduce undefined behavior. (Or maybe more technically, the contracts on the correctness of unsafe code within the safe wrapper provide a potentially incorrect "proof" of this.)

AFAIK no such correctness proof for the C APIs exists (there are a couple of linters that catch some issues, but there is no proof that say that if you stick by those warnings your code is UB free).

The author seems to be assuming that the C APIs to RCU have been proved correct, which AFAIK is incorrect, and by ignoring the safe Rust APIs to RCU, the author is missing out on potentially finding issues on the correctness of the C APIs that Rust would discover.

Basically, the question "Can we expose Linux RCU APIs to Rust?" is definitely worth asking, but the question "Can Rust prove Linux RCU APIs correct?" is much more interesting. If it can't, then why not? The fact that we have such proofs for other RCU and hazard-pointer APIs in Rust shows that this is at least possible in general. And if Linux RCU can't be exposed in safe Rust, precisely understanding why would definitely be illuminating (Can Linux RCU be proven correct outside Rust, but not in Rust? Or can the Linux RCU APIs actually be proven incorrect by Rust?).


Note: If someone wants to learn Rust by implementing linked lists, then https://rust-unofficial.github.io/too-many-lists/ is a must-read :)

Arguably, that's just too hardcore.

You can implement doubly-linked lists in safe Rust by using Rc and Weak, Option, and RefCell (or Arc and Mutex for a safe concurrent doubly-linked list).

Learning about safe Rust is gentler and more useful for a beginner, than starting to learn Rust by learning about "unsafe code".

You have to know what safe Rust allows so that you can appropriately know which contract unsafe Rust must adhere to.
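
A minimal sketch of that safe doubly-linked idea: the forward direction owns its node via Rc, while the backward direction is a Weak reference, so the two directions don't form an ownership cycle that leaks.

    use std::cell::RefCell;
    use std::rc::{Rc, Weak};

    struct Node {
        value: i32,
        next: Option<Rc<RefCell<Node>>>,
        prev: Option<Weak<RefCell<Node>>>,
    }

    fn main() {
        let first = Rc::new(RefCell::new(Node { value: 1, next: None, prev: None }));
        let second = Rc::new(RefCell::new(Node { value: 2, next: None, prev: None }));

        // Wire the nodes together in both directions.
        first.borrow_mut().next = Some(Rc::clone(&second));
        second.borrow_mut().prev = Some(Rc::downgrade(&first));

        // Traverse backwards through the Weak pointer.
        let prev = second.borrow().prev.as_ref().unwrap().upgrade().unwrap();
        assert_eq!(prev.borrow().value, 1);
    }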


> You can implement doubly-linked lists in safe Rust by using Rc and Weak, Option, and RefCell (or Arc and Mutex for a safe concurrent doubly-linked list).

You can, but not without introducing runtime overhead relative to C and C++, e.g., by forcing reference-counting where none would be necessary in C.

Rust's safety checks do in fact block some safe and zero-overhead abstractions familiar to people working in C and C++, and denying that isn't helpful. Stating that these techniques can be implemented in Rust with overhead is missing the point.


> You can, but not without introducing runtime overhead relative to C and C++, e.g., by forcing reference-counting where none would be necessary in C.

The runtime overhead is minimal. 96% of it comes from the pointer indirection in the linked list, and this you have either way.

Last time I benchmarked, the checks made the list 4% slower, which is acceptable for many, since doubly-linked lists are already very slow.

This 4% is buying you a correctness proof that your list API cannot introduce undefined behavior. If you write this doubly-linked list abstraction in safe Rust, and make a mistake, the resulting code will still not have undefined behavior.

C and C++, even paying this 4% performance cost, do not provide this guarantee, which is valuable to many.

Unsafe code that removes this 4% runtime overhead is an optimization.

The fact that you can remove this overhead by providing a safe wrapper over unsafe code is a selling point of Rust, but learning how to write such optimized code without learning first what it takes to make it correct IMO completely misses the point of Rust.

It is very easy to write broken safe Rust abstractions over broken unsafe code. At that point, you are in a worse place than C or C++, since you are paying many costs for introducing Rust in a project, but leaving the most valuable feature off the table (correctness, lack of segfaults, lack of data-races, etc.).


Reference counting also adds per-element space overhead.

Also, linked lists are often [1] used in C and C++ for intrusive chaining of nodes otherwise owned by other data structures, and the reference counting would either impose overhead on those other data structures or be pointless.

[1] In fact I would say this is very common at least in C++, as linked lists make otherwise very poor data structures by themselves.


I know. So? How does that change anything?

A doubly-linked list is already space inefficient, it adds 2 pointers per element.

Using Rc + Weak adds one extra word per element to store the two (strong and weak) refcounts. So for a 64-bit type, a Vec takes 1×N space, a list takes 3×N, and a ref-counted list 4×N.

Performance wise, the extra space doesn't change anything, since that word is allocated with the element in the same cache line and would have been read anyways (even if you don't access it, the hardware does access it).

If you are using a doubly-linked list, you really don't care that much about any of this, and the only thing that makes a real difference, is the guarantee that your list implementation is correct.


The additional refcount takes space that could be used for the payload that now either needs to spill into the next cacheline or needs to be compressed further.

You can still use double linked lists in high performance code, as long as the list traversal doesn't happen on the critical path (or at all).


If you are concerned about high performance/space overhead/cache awareness you wouldn't want to use a linked list in the first place, instead you'd use an arena.

An arena is not a replacement for linked lists. In some cases you can use it as a replacement, but in many cases you can't. So can the idiomatic rust apologists please stop with this? I know you love it the language and wants more people to use it, but these sort of statements doesn't help your case.

"Performance is almost the same in real world cases"

"You don't really need that data structure anyway"

"You can learn how to optimize code later"

People come and want to use Rust for some fast code. They know what they want the code to do and want it to be as fast as the code they wrote before, and see how Rust would help them achieve that. The above statements are then wrong or patronizing or completely misses the point.


Linked lists are not good for cache locality; it has nothing to do with the language. An arena would be my advice over linked lists even if you were doing it in C. Rust does make doubly-linked lists in safe code without reference counting impossible, but I don't buy the argument that you'd use one due to performance or memory overhead.

> Linked lists are not good for cache locality

That doesn't matter if a solution with good cache locality doesn't exist for your problem.


They are not, that's why you would use a flat layout for your primary iteration order. But for any secondary ordering/indexing you might need, if you also need fast insert/removal, chaining in a list can be an option.

The question is fair.

Why would you want to use a non-intrusive doubly-linked list?

AFAIK there is only one answer: you have very big objects and you need O(1) splice.

If you don't have very big objects, or you don't need O(1), pretty much any other data-structure in existence is going to give you much much better performance than a doubly-linked list.

For the only use case that non-intrusive doubly-linked lists are good at, however, there exists no hardware you can buy on which you can measure a performance difference between ref-counting and not ref-counting.

This has nothing to do with Rust. The same applies to C++, or Java, or Python, or even C. You can do ref counting on any language, and all these languages run pretty much everywhere.

The only thing that Rust gives you over Java, Python, or C here is the possibility to implement a doubly-linked list in safe Rust that will perform the same but that the compiler will prove for you to be memory and thread safe.

Sure, Rust also allows you to use unsafe code, and then prove that safe and create a safe wrapper. It even does this for you and provides the result as std::collections::LinkedList. But what value does this add? It's more work and it doesn't perform better, which is why a significant number of people have argued in the past that adding it to the standard library was a mistake in one form or another.


> Why would you want to use a non-intrusive doubly-linked list?

Or you have a list of objects with many references to parts inside the list, and you want to rearrange those parts via those references in O(1) time.

Let's say I have pointers to two elements A and B in the list. I now want to say that B should come after A; I can do this easily in O(1) time with no overhead, and all references see this update instantly. Let's say I put integers in these, so the data is embedded in the list. You have many other places referring to elements inside the list and you want those references to track the objects' positions.

How would you solve this using a Rust compatible data structure?
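
(For concreteness, the O(1) relink described above looks roughly like this; a sketch in unsafe Rust with illustrative names, assuming a circular list with a sentinel node, so prev/next are never null, and a != b:)

    struct Node {
        prev: *mut Node,
        next: *mut Node,
        data: i64,
    }

    // Unlink `b` from wherever it is and splice it in right after `a`.
    // No other element moves, and outside pointers to nodes stay valid.
    unsafe fn move_after(a: *mut Node, b: *mut Node) {
        // Unlink b.
        (*(*b).prev).next = (*b).next;
        (*(*b).next).prev = (*b).prev;
        // Relink b right after a.
        (*b).prev = a;
        (*b).next = (*a).next;
        (*(*a).next).prev = b;
        (*a).next = b;
    }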

> The question is fair.

No it is not. Experienced people almost surely know their use case better than you do.

> there exists no hardware you can buy for which you can measure a performance difference between ref-counting and not ref-counting.

Are you kidding me? Seriously, I don't see how you could believe this unless you never tried to use ref counting to replace other code and compare the performance difference.


> Let's say I have pointers to two elements A and B in the list. I now want to say that B should come after A; I can do this easily in O(1) time with no overhead, and all references see this update instantly.

You can do this with a vector of pointers as well, at lower space complexity cost (1 pointer per element instead of two), and a lower runtime cost (1 swap of two pointers instead of swapping 4 pointers).

Such a data-structure has also other advantages, like higher cache efficiency, etc.


You can't move an element in a vector in O(1) time. Swapping isn't the same as moving; you'd have to shift n elements to make room for the moved one.

You mentioned changing the order of two elements in a list, such that one comes after the other.

With a vector of pointers to the elements (C++ std::vector<T*>) this can be easily done in O(1) as I explained above.

If, now that I've proven you wrong, you want to change the problem to something else, feel free to state your new problem, and I'll proceed to prove you wrong again.


> You mentioned changing the order of two elements in a list, such that one comes after the other.

Putting B after A is not swapping them; it is moving the element B so it is right after A. You could technically interpret it as you did, but only a contrarian would do that. If you want more people to support Rust then you should stop being a contrarian. If you want me and others to assume that the Rust community is full of hard-to-work-with people then please continue, but that won't make people more likely to pick up Rust.

If instead of behaving like you do here people would just say "Sure thing, in order to get that performance in Rust you just do X and Y!", I bet people would be way more supportive of Rust. But if working in the language means that people like you will come and argue like this, then why not just write a C library, and then someone will write a Rust wrapper and people are happy? Whereas if they'd written it in unsafe Rust, people would come and complain like hell.


> You could technically interpret it as you did, but only a contrarian would do that.

Or you could have been clearer.

I didn't intend any animosity, but you clearly do.

> If you want more people to support rust

I don't care whether more people support Rust or not. I am just fighting the spectacular amount of disinformation in this thread by malicious actors that have never used Rust.

> Putting B after A is not swapping them, it is putting the element B so it is right after A.

Then you should have written that instead. Take the pointer in the vector right after A, and swap it with B. That way, A is now right before B, and B is now right after A.


> Take the pointer in the vector right after A, and swap it with B. That way, A is now right before B, and B is now right after A.

This is a joke, right? I don't see how it can't be.

But in case it isn't, in almost all problems like this it is assumed that you don't want to change the order of the other elements.

For example, if you wrote a function MoveToDirectlyAfter(a, b), and it did move elements other than a or b people would say that your function contains bugs or that it doesn't do what it says.


In a real life system what is the usual use case for this?

Let's say a user reorders documents with drag and drop. What's the low-level equivalent? Could you help me come up with an algorithm that depends on this?

If write speed is important, I'd just use an in-memory LevelDB-like thing (so a log structured merge tree).

If the data is not much, then I'd optimize for code simplicity.


Before I delve more, are you talking about `rotate` ?

https://en.cppreference.com/w/cpp/algorithm/rotate

Is that what you want?


The objects don't have to be big. They have to be expensive to copy. These are related but not always identical. Objects that refer to things in the real world (or in a complex model) can be either actually non-copyable or just hard to copy.

> The above statements are then wrong or patronizing or completely misses the point.

I think what's more going on is that longtime Rust programmers can get frustrated by the sheer volume of people who have never used Rust but dismiss it because they hypothesize that they will need to regularly write things like linked lists in Rust, just because those are the easiest data structures to write in C.

When in reality, the number of high performance data structures in the standard library and crate ecosystem means it's extremely unlikely they'll even need to write any data structures in the first place. Programming in Rust versus C or C++ is a very different experience in numerous ways, and it is unproductive to expect that not to be the case.


But C++ programmers have ergonomic vectors in the standard library. There is no reason to implement data structures yourself unless you need them for performance reasons.

And correct use of linked lists is not uncommon at all. Binary trees are linked structures, for example: would you implement a binary tree as a vector pointing to vectors rather than as a node containing pointers to similar nodes?


Yes, I would avoid implementing my own binary tree using linked lists (unless it was just an educational exercise). But also yes, if I did decide that I had a production use case for creating my own binary tree implementation, I would be very likely to look for approaches that use contiguous rather than linked data structures.

I think this perspective you're espousing is "sometimes you just want to knock out a simple bespoke binary tree implementation", but I just don't think that use case makes sense. Either you want to use one from a good library or you have a really good reason to roll your own and it is worth the effort to do it well.


> I think this perspective you're espousing is "sometimes you just want to knock out a simple bespoke binary tree implementation", but I just don't think that use case makes sense. Either you want to use one from a good library or you have a really good reason to roll your own and it is worth the effort to do it well.

There is no library for generic tree data structures, you have to implement those yourself. There are libraries implementing a table interface using ordered binary trees but that is just one of many use cases.


Coincidentally I was just watching this talk from Bryan Cantrill: https://youtu.be/vvZA9n3e5pc. Around the 58 minute mark he shares an anecdote that is relevant to this thread. It boils down to him doing something where he would have written a simple bespoke balanced binary tree in C, but when writing a reimplementation in Rust used a "naive" library approach and was surprised to see a 35% speedup, which boiled down to the library using a btree instead of a balanced binary tree.

This doesn't invalidate your point about there being use cases for binary trees that may necessitate you to write your own, I take you at your word that you've often had strong use cases for this, but I do think it illustrates something I believe to be true, which is that people reach for linked data structures too often, when contiguous ones are a better default.


No, I probably would use the BTreeMap in std instead, assuming I want a search tree. Which is even better, because the primary reason you use binary trees in C is that they are easy to make intrusive. But if I want a priority queue instead there's also BinaryHeap. If I wanted to store data on the branch nodes I might use intrusive_collections. Or maybe if I was dealing with a more complex graph I'd pull in a library like petgraph. If all that fails I could still arena-allocate the nodes to get around the lifetime issues, if I really absolutely wanted to, which I don't see happening any time soon, personally. I'd need a more concrete example there.

I said binary tree, not "Map interface implemented using an ordered binary tree". They are not the same thing.

I am aware, as you might be able to gather by reading past the first sentence of my comment. Of course if you'd just said what you needed the binary tree for, I could have given you a better answer.

If you don't have a use case in mind, I'd suggest just trying to actually use Rust for something instead of spending time trying to construct problems where Rust would be bad at the solution.


You can't even explain how doing something as simple as a binary tree with a parent pointer is possible in Rust, so my assumption is that it isn't possible and I'd have to make workarounds. I can make workarounds just fine in C++, you don't have to teach me those, but I don't see why I'd learn a language where I'd be forced to make those workarounds rather than use the version that fits my problem best.

It might be possible that you can write nice versions, etc.; if so I'd like to see them, but it seems everyone is hell-bent on just saying "You shouldn't do it!".


> You can't even explain how doing something as simple as a binary tree with a parent pointer is possible in Rust, so my assumption is that it isn't possible

A beginner learns this in the first week:

    struct Node<'a> {
        left: Option<&'a Node<'a>>,
        right: Option<&'a Node<'a>>,
    }
Many people don't know how to read or write, but you don't hear them every day claiming that "therefore it must be impossible", yet here we are.
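
(For the parent pointer specifically, the usual safe-Rust shape is Rc for the child edges plus Weak for the back edge so the cycle doesn't leak; a minimal sketch with illustrative names:)

    use std::cell::RefCell;
    use std::rc::{Rc, Weak};

    struct TreeNode {
        value: i32,
        // Weak doesn't keep the parent alive, so parent<->child links don't leak.
        parent: RefCell<Weak<TreeNode>>,
        left: RefCell<Option<Rc<TreeNode>>>,
        right: RefCell<Option<Rc<TreeNode>>>,
    }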

> but I don't see why I'd learn a language where I'd be forced to make those work arounds rather than use the version that fits my problem best.

> but it seems everyone is hell bent on just saying "You shouldn't do it!".

The claim was and still is "there are infinitely more effective - as in faster - ways to learn Rust than by starting with how to write doubly-linked lists".

This whole thread is you failing hard at third-grader logic and concluding:

- therefore it is impossible to write doubly-linked lists in Rust

- therefore doubly-linked lists in Rust are inefficient

- therefore it is impossible to learn Rust

- ...and many other things...

I've taught Rust to a 9-year-old, but I don't think I could have taught Rust to a 2-year-old, so arguably, I think you are right, in that probably _FOR YOU_ it is impossible to learn Rust: you wouldn't be able to grok how to write doubly-linked lists in Rust properly, you wouldn't be able to master Rust to write efficient code in it, etc.

That's ok, don't be frustrated by it, everybody is different.

Just try to stop projecting your limitations onto other people. Particularly in this forum, or when talking with Linux kernel developers, where most people are smart, experienced programmers who would learn Rust well in a day and master it in a week.


> The additional refcount takes space that could be used for the payload that now either needs to spill into the next cacheline or needs to be compressed further.

It could, but since the cache lines are adjacent the prefetcher will pull subsequent cache lines anyway, and since the objects typically used in doubly-linked lists are very large (much larger than 64 bits), this doesn't matter in any practical application that I am aware of. Our result of a 4% performance increase is for the worst case that we found. Usually it is 1% or less.

If you have a real-world benchmark that shows otherwise, can you please share it?

AFAIK there does not exist any hardware for which the performance model for:

- non-intrusive doubly-linked lists,

- typical object sizes used on these lists (>> 64-bit)

- typical operations done on these lists (splice, etc.)

would predict a performance difference, and in our case, having replaced unsafe lists with safe lists in many large applications, we were never able to measure a difference in application performance, only in the synthetic worst-case micro-benchmarks.

So I am truly interested to learn about your use case.


You're ignoring the case of a list being densely packed and embedded inside another data structure. If you bloat a data structure with a useless machine word, there are all sorts of ways for that machine word to hurt the resulting program, and you can't just handwave that away.

The Linux kernel is made by people who spend weeks to shave one bit off the size of a data structure. Do you really think they'd respond positively to your saying they don't need to do that?

You really need to stop assuming what kind of computers other people are programming for.

The larger point here is you don't get to decide whether other people's use cases are legitimate. Either Rust gives you complete control of the machine or it does not. It does not, not in safe mode, but Rust people keep doing this annoying motte and bailey thing about it.


This conversation started where folks were talking about where to begin learning Rust. Beginning by learning unsafe{} implementations of a doubly-linked list to save 1-4% of the performance isn't where I would recommend anyone start.

If they do need to shave a single bit off, they can do that, that's a neat feature too.


> You're ignoring the case of a list being densely packed and embedded inside another data structure.

No, I am not. This conversation is exclusively about non-intrusive doubly-linked lists.

If you want to have a conversation about intrusive doubly-linked lists we can have it, but it is a very different data-structure.

> You really need to stop assuming what kind of computers other people are programming

Every week I touch x86, arm, ppc64, riscv and gpu assembly.

I write code for apps daily that run from 16-bit micro controllers to the largest super computers in the world, going through phones, desktops, cloud, etc.

I am not assuming what others are programming, but rather talking about what I program for, which include most hardware that anyone can buy today, and quite a bit of hardware that almost no one can even buy, as well as hardware that's not for sale.


> A doubly-linked list is already space inefficient, it adds 2 pointers per element.

...unless you use the XOR trick to store two pointers in one field?
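
For anyone who hasn't seen it, a hedged sketch of the XOR trick (shown in unsafe Rust here; the idea is the same in C): each node stores prev XOR next in a single word, and traversal recovers the next address from the node you came from.

    struct XorNode {
        link: usize, // (prev as usize) ^ (next as usize)
        data: u32,
    }

    // Given the node we came from and the current node, recover the next one.
    // Traversal has to start from an end, where the "previous" pointer is null.
    unsafe fn step(prev: *const XorNode, cur: *const XorNode) -> *const XorNode {
        ((*cur).link ^ prev as usize) as *const XorNode
    }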


> learning how to write such optimized code without learning first what it takes to make it correct IMO completely misses the point of Rust.

This culture is why Rust will never replace C++ and why C++ will never replace C. You can write the same computations in C++ as in C, and in Rust as in C++, but it isn't "idiomatic", so people are really afraid to do it because it will get harshly rejected by the community.


The premise that Rust does not intend to replace C and C++ is not true. Rust is designed to mesh well with existing C and C++ code bases, which is why many large projects support writing code in Rust (Chrome, Firefox, Linux, Windows, etc.).

Your claim that you need to learn all low level details first is also not true.

There are ~20 million programmers in the world, and about ~16 million of those are Javascript programmers. Many of them don't know and don't need to know the difference between the stack and the heap.

Many of them regularly optimize their Javascript hotspots by re-writing them in Rust and compiling it to webassembly. And almost all of them benefit from Javascript libraries that do this internally.

Rust is for many of them the ramp up into lower-level systems programming, and this is one of the reasons the Rust project has so many contributors. Rust enables Javascript programmers to actually hack on the Rust compiler, Firefox, etc.

This is something that C and C++ never achieved, and one of the main reasons for the Linux kernel to want to use Rust (they want to attract more junior developers to increase the developer base and make the project more accessible).

Your suggestion that the large majority of programmers in the world are somehow "doing programming wrong" by learning higher-level and safe languages first, and delving into low-level details as they need to, sounds very elitist and I personally find it disgusting.


> The premise that Rust does not intend to replace C and C++ is not true.

My point was that Rust won't replace C and C++ as long as the Rust community is as it is now. And from your comments it seems like Rust isn't intended to replace C or C++, but to act as a low level language for people who don't know how to write low level code, like JavaScript programmers.

> Your tone that the large majority of programmers in the world are somehow "doing programming wrong" by learning higher-level and safe languages first, and delving into low-level details as they need to, sounds very elitist and I personally find it disgusting.

I didn't say that everyone has to learn Rust by learning how to write the low level parts. I am saying that if someone wants to learn how to write the low level parts of Rust rather than the safe parts, because they are used to writing performance critical bits of code in C or C++, then you shouldn't discourage them from doing that. Please don't put words in my mouth.


It isn't "don't know how to write low level code" in my view, it is more "don't think it is productive to reimplement basic data structures over and over again". All the stuff you're saying you can't write in idiomatic Rust is stuff that has already been written and battle-tested in libraries, which are perfectly idiomatic to use. I don't need to write my own linked list and binary tree implementations over and over again. I did that in school; I can go check the code in the libraries I'm using to confirm it makes sense to me, and then just use the library and go about doing what I'm actually trying to do.

> All the stuff you're saying you can't write in idiomatic rust is stuff that has already been written and battle tested in libraries

It is fine for you to not be able to think of any scenarios where the battle-tested standard libraries wouldn't be sufficient. But as someone who wrote low level libraries running in production at Google, that doesn't apply to me; for me, knowing what it would take to eke out performance in Rust compared to C++ is absolutely necessary, and given how hostile people are, my guess is that the Rust wouldn't look pretty, so there is no reason for me to even try to pick up Rust.

And you can't say that I am not your intended user; if you chase away people like me from the Rust ecosystem then there is no way that Rust can replace C++ in any reasonable timeframe.

What I'd like to see is a lot of examples and such showing how to write very complex things using unsafe Rust. Because without that there is no way Rust can compete; there are already plenty of people who know how to write highly performant code using unsafe C++.


You are definitely (one of) the intended audiences of rust, just to dispense with any confusion there.

I don't at all think that there are not "any scenarios where the battle tested standard libraries wouldn't be sufficient". I spent a number of words on exactly that in another comment below. My point in that comment was that the use case I don't believe actually exists in production code is "I'm just gonna knock out a simple bespoke implementation of a binary tree". I'm not saying people don't do that, but they shouldn't. If what you need is simple, there is a good library for it. And if the library for it is not sufficient, then it is important enough to invest time into writing a good implementation.

Maybe here's a way to put it: simple bespoke implementations of linked data structures are easy in both c++ and rust using ref counting, slightly better but low-investment implementations are still pretty easy in c++ but are not really possible in rust, and really good implementations are possible in both languages if the investment is worthwhile for the use case. It is true that you would have to learn how to write such a thing in rust and that it would be non trivial to do so, but that is just part of learning a new programming language to an expert level.

I think one misconception in here is that the rust "wouldn't look so pretty". You could do a line for line port of your c++ code using raw pointers and declare it safe, and the two implementations would have similar aesthetics. If your c++ implementation were memory safe to begin with, your rust implementation will be as well. The same process of thinking through the invariants and implementing them correctly is necessary in both languages.

I think the pushback you get that you find off-putting is against the sometimes-expressed idea that it is a big knock against Rust that you can't do linked data structures with zero-cost abstractions in safe Rust. This is why people say things like "yeah, but you can do minimal-cost implementations in safe Rust and zero-cost implementations in unsafe Rust". The pushback is against the idea that this isn't a good enough solution. But it totally is; it's certainly no worse than the situation in C++, where the division between a safe and unsafe subset of the language does not exist. Unsafe Rust is no different than generic C++ code.

To your last point, there's a great book / reference called The Rustonomicon that is all about writing unsafe rust code well.


A link to the aforementioned Rustonomicon: https://doc.rust-lang.org/nomicon/

(I seem to recall a comment that it hasn't yet been fully updated for recent versions.)


I just saw this and thought of you, "Learn Rust the Dangerous Way": http://cliffle.com/p/dangerust/

"LRtDW is a series of articles putting Rust features in context for low-level C programmers [...] the sort of people who work on firmware, game engines, OS kernels, and the like. Basically, people like me."

Also, it seems like the author probably also wrote some low level libraries at Google... :)


Fully agree.

I have heard claims that Rust would be performance-equivalent so many times, and each time I bothered to check it turned out to be wrong. Especially in video game AI, indirection, pointers, and linked lists are common. They are usually used in the performance-critical input-to-frame part of the game, and there, fitting things nicely into CPU cache lines is crucial for performance.

The end result is that I now have this vague feeling that Rust is a religious cult and I shouldn't take their claims at face value.


C++ is better for performance because it’s old and tremendous effort has been put into optimizing it. Rust is a very young language. It benefits from the clang tool chain and gets some of that optimization for free, but there is a lot of room to make Rust faster.

Unlike GC’d or dynamic languages, there is nothing intrinsic about Rust that makes it a fundamentally harder language to optimize. It’s really like a modern provably safe C++.

It also depends on how you code. If you make heavy use of functional constructs and complex types your Rust code will probably not end up being optimized as well. For tight algorithms it’s good to write “thin” Rust. The exact same is true for C++. You would not want to use a lot of STL or functional stuff in a rendering pipeline core. High performance C++ looks like C.


Which optimizations is the rust compiler missing?

A lot of the optimizations required for fast code, like manual layout of custom data structures, are done by the programmer, not by the compiler. I don't doubt that Rust can also express these optimizations given the similar low level control provided by the language, but what the OP and the parent are saying is that some of these are not idiomatic in Rust and are harder or impossible to express in the safe subset of the language.

> You would not want to use a lot of STL or functional stuff in a rendering pipeline core. High performance C++ looks like C.

FWIW that's not at all my experience.


Agreed. My favorite example of C++ vs C performance is C++’s `std::sort` which is intrinsically faster than anything similar in C because C lacks the language features (templates) to support performant code reuse like that.

TBH, while I can think of many layout optimizations that Rust does and C does not, I can't think of any that C or C++ do that Rust does not.

In particular when it comes to unions which are heavily used in low-level code, Rust does optimizations that C and C++ can't even dream of (like all the niche optimizations).


which optimizations on unions does rust do?

In C++, you have `std::optional<T>`. That's a type that either contains a `T` or contains nothing.

In C++, sizeof(optional<T>) > sizeof(T) because the discriminant has to be stored somewhere.

This is true even if, e.g., you do something like `optional<T&>`. You know that T& is a non-null pointer, you only have two variants, and one of the variants has no state, so you could technically encode 0x0 as the "no reference" variant and any non-zero value as the reference variant, and have `optional<T&>` be the same size as `T&`.

Rust does these layout optimizations of compressing the discriminant into gaps in the values of discriminated unions automatically.

So:

    enum Option<T> {
        Some(T),
        None
    }
for `Option<&T>` has the same size as `&T` in Rust, as opposed to C++.
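
This is easy to check on any recent rustc (for Option around a reference it is even a documented guarantee):

    use std::mem::size_of;

    fn main() {
        // None is represented by the null pointer, so no extra discriminant byte.
        assert_eq!(size_of::<Option<&u32>>(), size_of::<&u32>());
    }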

In C, an example would be:

    struct DU {
        enum { A, B } discriminant;
        union {
            bool A;
            bool B;
        }; /* anonymous unions (C11) need this semicolon */
    };
You could encode that into two bits (i.e. have sizeof(DU) == 1), but instead you'll have at least sizeof(DU) == 2, because you need one byte for the discriminant, and one byte for the payload.

It is very easy to create values with gaps in Rust, but C doesn't really support doing this.

Another optimization are alignment optimizations. In C, if you write:

   struct S {
       uint8_t a;
       uint32_t b;
       uint8_t c;
   };
that ends up being 12 bytes long. In Rust, by default, that gets reordered as uint32_t, uint8_t, uint8_t, so it only ends up being 8 bytes.

If you want that instead to be laid out like in C, you can write:

    #[repr(C)]
    struct S {
       a: u8,
       b: u32,
       c: u8
     }
and then you get the same 12 bytes as in C. There are many `repr(...)` options supported for algebraic data types; e.g., you can use repr(u32) for the Rust enum above to store the discriminant in a u32 and get the same layout as DU in C.
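
A quick way to see both layouts side by side (a sketch; the 8 holds on current rustc for typical targets, but the default repr(Rust) layout is unspecified, so only the repr(C) size is guaranteed):

    use std::mem::size_of;

    #[repr(C)]
    struct CLayout { a: u8, b: u32, c: u8 }    // laid out like the C struct

    struct RustLayout { a: u8, b: u32, c: u8 } // compiler is free to reorder

    fn main() {
        assert_eq!(size_of::<CLayout>(), 12);
        assert_eq!(size_of::<RustLayout>(), 8);
    }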

Right, C++ has no builtin discriminated union, and std::optional<T&> is actually not valid in C++ (unfortunately).

But discriminant-less custom optional classes with 'zero' value support via traits are easy to do (and I have done them many times). You do not have to rely on the compiler identifying a safe empty state, and you can define any application-specific one. For example, if for one specific (and common) use case strings are always non-null, optional<string, non_null_trait> has an obvious implementation.

So, yes, these sorts of optimizations are not done by the compiler, as it has no notion of discriminated types (one day, maybe...), but they can be done generically by the programmer. In fact you could in principle optimize multiple layers of variant<optional<...>, ...> and collapse everything into one discriminant, as long as you do not provide reference access to the sub-variants; the required metaprogram is not going to be pretty though.

One of the advantages of C++ is that it allows this sort of control.


> One of the advantages of C++ is that it allows this sort of control.

But I believe what the OP showed is that Rust also allows you to have that level of control. However, the default behavior for rust is to do the optimization that you have to go out of your way to manually implement in C++.

Rust isn't taking away control from what you can do in C++, instead, it's made the idiomatic approach one that is well optimized by the compiler.


There are compiler directives for packed structs in C++. struct __attribute__ ((packed))

Rust also has a packed attribute; this is something different. The re-ordering still includes padding, whereas packed removes the padding.

Depending on the values in b, you could do bit-packing in C++ and munch it down to just 1 byte.

You can only do bitpacking if individual fields of a struct cannot be accessed via reference (or 'borrowed', in Rust terms). While C/C++ includes the 'register' keyword which forbids reference access, Rust does not.

Rust uses LLVM so optimizations occur at the AST level. Therefore it should be able to take advantage of the same optimizations as Clang C++ which uses the same approach.

They happen at a lower level than an AST, but you're right that any front end can in theory generate the same IR and therefore get the same optimizations. The trick is if the language semantics allow for it; this bites both ways.

> Rust is a very young language. It benefits from the clang tool chain and gets some of that optimization for free, but there is a lot of room to make Rust faster.

I feel that this sort of belief relies too heavily on a presumption that performance is somehow proportional to the age of a programming language, and in the process ignores the fact that a) it's already using a highly optimized toolchain that benefits from decades of research and development, and b) there is absolutely no suggestion that any potential performance gains are relevant or exclusive to Rust's front-end.


There is always a long tail of optimizations and tweaks that must be done to get a language the last 5% of the way to parity with a language like C. Building on LLVM gets them 95% of the way.

BTW Rust has some features that make optimization a lot easier, like the ability to assume that references don't alias in more cases. Its safety features also make it easier to write zero-copy, copy-on-write, and other optimization patterns without making mistakes. This enables some bold performance optimizations at the code level that would require serious bravery in C.

Long term I expect that Rust will be faster in some cases.


You can also align data to cache lines if you want to in Rust, without unsafe [1]. It is not C/C++ specific.
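
For example, a minimal sketch (64 bytes is a common, but not universal, cache line size):

    // Safe Rust: force the whole struct to cache-line alignment.
    #[repr(align(64))]
    struct CacheAligned {
        counter: u64,
    }

    fn main() {
        assert_eq!(std::mem::align_of::<CacheAligned>(), 64);
    }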

For linked lists, you can use an unsafe implementation if that is performance critical. An unsafe implementation can be wrapped in a safe API for others to use, and this is how the standard library is implemented.

There is also a new paper about structures with internal sharing such as graphs without runtime overhead that Rc/RefCell would cause [2].

[1]: https://doc.rust-lang.org/reference/type-layout.html#the-c-r...

[2]: http://plv.mpi-sws.org/rustbelt/ghostcell/


> I now have this vague feeling that Rust is a religious cult

I had this feeling from the very beginning just by the tone of its zealots here and elsewhere. My theory is that since the language is very hard to learn but doesn't provide a lot of tangible benefits (memory safety has existed in most mainstream languages for decades), the sunk cost fallacy hits very hard. Then the only way to get some benefit for the time spent is to recruit newcomers in a sort of pyramid scheme structured around experience in the language. This way the zealots become priests and are compensated in the form of social status. Of course, the whole scheme crumbles if nobody joins, which is why the community is so aggressive (RIIR, vocal advocating, etc.) with the non-believers.


People like rust for a lot of different reasons; that is, different people see different tangible benefits to it, and that has resulted in both a large and fast growing community of people who like it, and (predictably) backlash to that.

On the side of people coming from other memory safe languages, the benefit is much like Go, which has also become very widely used very quickly, so it seems worth taking seriously that the existing memory safe languages were maybe not satisfying people in some way. I think the popularity of both languages in this space can be boiled down to lowering the amount of abstraction without giving up memory safety, and simplifying the toolchain. Both Go and Rust do both of those things (in my view). I prefer rust because I don't like how the Go type system requires so much repetition of implementation due to lacking generics, but both languages make some amount of sense in this niche IMO.

Other people like rust because they came from C or C++ and got tired of some of their weaknesses but did not want to give up many of their strengths. Of course whether rust requires giving up the strengths of these languages is debatable, and this is where most of the backlash comes from. But rust is one of a very, very few languages (maybe also D and Zig?) that can plausibly make this claim at all, and I think it has become the most actively developed and attracted the largest community of those, which does matter.

For me, it's a huge advantage to be able to plausibly use the same language for everything, from kernel drivers to giant applications. The only real competitor in that space is C++, and I think its cracks show more on both extremes, that is, I think rust is both a better C replacement at the very low end (see people like Linus Torvalds and Bryan Cantrill who are C diehards who dislike C++ but are interested or all in on rust) and is much better at the high level because of easy memory safety.

So I dunno, maybe there's a cult, but there are also very straightforward reasons why a lot of people are attracted to the language.


> I now have this vague feeling that Rust is a religious cult

I'd say even if many of the technical claims about Rust are true, it still seems very much like a cult.


I've also heard your claim many times, and every time I checked, replacing a doubly-linked list with something else improved performance by orders of magnitude.

Particularly in video games.

---

> The end result is that I now have this vague feeling that Rust is a religious cult and I shouldn't take their claims at face value.

Also, you realize that Rust allows writing efficient doubly-linked lists, right? And that the Rust standard library provides one for you, right?

The whole argument of the OP is that people learning programming need to start by learning how to write doubly linked lists, because performance trumps all, and that all children, elderly, and even university students who learn programming with JavaScript are lesser human beings.


The linked list provided by the standard library is unfortunately not very useful. It doesn't provide O(1) operations on the middle of the list, making it more or less a worse VecDeque. Of course there are crates that provide better implementations but people really need to stop pointing people toward the one in std.

The lack of goto also bars it from the domain of very fast interpreters. Computed gotos are a common technique used in C/C++ interpreters which aren't possible in Rust.

To be fair, the last time I checked, LLVM was doing an okay job of generating jump tables from big switches and threading the cases for best jump target prediction --- same thing you'd do with computed goto.

But yes, lack of goto and computed goto are deficiencies, not strengths.
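
For reference, the match-based dispatch being described looks something like this toy sketch; LLVM will usually lower a dense match like this to a jump table (the per-case threading mentioned above is up to the optimizer):

    #[derive(Clone, Copy)]
    enum Op {
        Push(i64),
        Add,
        Halt,
    }

    // Toy bytecode loop: the dense match typically compiles to a jump table,
    // which is what computed goto builds by hand in a C interpreter.
    fn run(code: &[Op]) -> i64 {
        let mut stack: Vec<i64> = Vec::new();
        let mut pc = 0;
        loop {
            match code[pc] {
                Op::Push(v) => stack.push(v),
                Op::Add => {
                    let b = stack.pop().unwrap();
                    let a = stack.pop().unwrap();
                    stack.push(a + b);
                }
                Op::Halt => return stack.pop().unwrap_or(0),
            }
            pc += 1;
        }
    }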


> Rust's safety checks do in fact block some safe and zero-overhead abstractions familiar to people working in C and C++

The problem of course is that while these abstractions are zero-overhead and can sometimes be used safely, they aren't compositional in general. That is, they impose requirements on outside code which aren't easily captured by Rust's type system. This is exactly what the 'unsafe' facility in Rust was made for. Note also that more recently-developed abstractions of the "Qcell" and "GhostCell" type can in fact implement linked lists safely, and once these are better understood a variety of them will likely be included in the Rust standard library.


Doesn't GhostCell preclude deletion, and effectively grow forever?

Would anybody use a doubly-linked list if they care about performance? Cache misses are where performance lies in modern systems and linked lists are maximally bad for this.

You can also say almost the same things you are saying here about C++. Type punning is almost always undefined behavior. There is no way to construct an array of variable size at an address that you specify without a pointer indirection. But with ASM you've got no such problems!


> Would anybody use a doubly-linked list if they care about performance?

That's probably the only reason to use them. If you don't care about performance there are simpler data-structures available.

If you need to implement an algorithm for which you need O(1) splice, then doubly-linked lists are a data-structure that give you that. If your objects are very big the cache misses might not matter that much, and neither would ref counting.

If you go one step higher, and can modify your object data types, and are careful with how you allocate your data, then intrusive doubly-linked lists can give you equivalent performance to a vector with better algorithmic complexity for many insertion / removal / splice operations, etc.

The stars do however need to align a lot for a non-intrusive doubly-linked list, like the one being discussed above, to be the best answer for whatever performance / algorithmic problem you are having.


> The stars do however need to align a lot for a non-intrusive doubly-linked list, like the one being discussed above, to be the best answer for whatever performance / algorithmic problem you are having.

That O(1) splice must also be accompanied by an iteration. Which, in my experience, is really rare. If that iteration step wasn't already a part of the splice requirement, then it can often be faster to do the splice via a memcpy.

My favorite algorithmic mistake was someone at my company using binary search to maintain sort order on a linked list. IIRC, that turns each insertion into something like an O(n log n) operation (every probe is an O(n) traversal), whereas the search is O(log n) on an array.


> That O(1) splice must also be accompanied with an iteration.

I don't think it necessarily must, but it tends to be.

People tend to keep pointers to elements of the list all over the place, so if you have the right 3 pointers (the begin and end of the list you want to insert, and the position in another list), then you can do it in O(1).

If not, and you need to traverse the lists... as you mentioned there are other data-structures that might be much better.


With a circular list, you don't even need the extra pointers.

> If you want good performance but don't care about precise control over the machine, write Java or C# or something high level like that. You'll be just as safe

No, you aren't.

None of these languages catch data races, so writing multi-threaded code is pretty much as hard and error prone as in C. Other higher-level languages like Python avoid this by holding a global mutex, so using multiple threads doesn't really buy you that much there, since threads always execute sequentially to avoid data races.

This is in contrast to Rust, which allows anybody, even people without experience in low-level programming to accelerate their applications using multiple threads, without introducing bugs.
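
A small illustration (a sketch using std::thread::scope, stable since Rust 1.63): the two halves are disjoint borrows, so this compiles, while any aliased mutable access across the threads would be rejected at compile time.

    fn parallel_sum(data: &[u64]) -> u64 {
        let (left, right) = data.split_at(data.len() / 2);
        std::thread::scope(|s| {
            // Each thread borrows only its own half; the compiler checks this.
            let handle = s.spawn(|| left.iter().sum::<u64>());
            let right_sum: u64 = right.iter().sum();
            handle.join().unwrap() + right_sum
        })
    }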

Actually, Mozilla tried to multi-thread Firefox multiple times using C++, and failed, over and over again, because every attempt would introduce subtle bugs. Rust was created to address this issue.


> None of these languages catch data-races, so writing multi-threaded code is pretty much as hard and error prone as in C.

All real world languages have facilities for various kinds of safe concurrency, e.g. actors and other kinds of message passing. When it comes to concurrency, Rust isn't anything special. It's not even that good.

Rust has one killer feature: memory safety without GC. Except for this feature, Rust is mediocre. If you don't need the memory safety without GC, you can use almost anything else and be better off.


Rust only catches integer overflows in debug builds by default; my C# programs catch them everywhere. The C# compiler setting is off by default, but it is available.

In Rust, many libraries, especially the standard one, have lots of unsafe code, with a history of security bugs there [1]. In C#, the standard library is written in almost 100% safe code.

About threading: while it doesn't catch all data races automatically, C# implements many useful things at the VM level. The Monitor class, or the memory model guarantees, are very hard to implement in languages that compile to native code.

[1] https://shnatsel.medium.com/how-rusts-standard-library-was-v...


> I'm really tired of this motte and bailey stuff. Rust proponents say Rust gives you fine grained machine control and safety with no performance compromises, and then when you point out that the language doesn't quite live up to that promise, Rust proponents start telling you that you didn't really need that performance anyway.

Citation needed: which part of my comment says this?


I wonder if that's still a relevant way to learn Rust, as it seems to have been written in 2018.

The changes since 2018 have been very minor

Yep, the core concepts did not change. It's still relevant.

I can't access the article at $JOB, but if I understand correctly, Rust would guarantee that usage of the RCU API is correct, but it would be silent on the API implementation itself, as it would be written with unsafe (as in crossbeam-rs). On the other hand, the author is probably saying that the kernel RCU C API itself has been proven correct, either manually or by machine checking (directly on the C code or on a translation of it in a proof assistant); usages of the API might not be proven correct though.

Also note that the author of the article is the actual inventor of RCU.


> Also note that the author of the article is the actual inventor of RCU.

I know, so?

> On the other hand the author is probably saying that the kernel RCU C API itself has been proven correct,

But this is not true right?

There is no formal proof, e.g., in Coq, that proves a useful subset of the RCU C API correct (as in, if you stick to this useful subset, you can't introduce UB).

OTOH, a safe Rust API could be proved correct to use at least, and an implementation of it that uses unsafe, like the one in crossbeam-rs, is typically very amenable to theorem provers (much more than C).

So what I am saying is that the author is ignoring the value of providing a safe Rust API for RCU, like crossbeam-rs.

If the Linux RCU APIs can't be mapped to it, then maybe they are incorrect to use, and this is why we can't prove them correct.

Doing the work of mapping them, and understanding any issues, could lead to changes to those APIs that might allow us to prove them correct.


A quick search shows a bunch of papers about RCU possibly containing proofs. I don't know if there is a coq proof, but there are other dedicated tools to model lock-free algorithms, or possibly the proofs were done by hand.

Of course that's still not proving the actual C code; the translation probably still needs to be checked by hand.


> an implementation of it that uses unsafe, like the one in crossbeam-rs, is typically very amenable to theorem provers (much more than C)

Just wondering, how much of the unsafe code in commonly used libraries (standard library, tokio, etc.) is proved using theorem provers?


Don't have comprehensive enough data to know, but part of the standard library was proven via the RustBelt project, and it found some places where there were bugs, as well as some places where things could safely be loosened up.

Most of the unsafe code in libcore is proved in Iris.

Some of the unsafe code in the standard library also.

Some parts of crossbeam have pencil&paper proofs documented.


RCU - Read-Copy-Update - a strategy for readers and writers, where the readers do not exclude writers (best as I can tell).

... without any sort of locking.

>AFAIK no such correctness proof for the C APIs exists

There's a whole proven correct kernel: http://sel4.systems/


SeL4 is not the Linux kernel and it doesn't use RCU.

So yes, there is a correctness proof for a completely different operating system that does not use the one thing we are talking about here.


> that a safe Rust interface to RCU is essentially a "proof of correctness" for RCU, i.e., it proves that using RCU via that interface cannot introduce undefined behavior. (Or maybe more technically, the contracts on the correctness of unsafe code within the safe wrapper provide a potentially incorrect "proof" of this).

Adding an unsafe block around a call to C code doesn't prove anything. Rust does not provide proofs of the code you write in it, except if you do not use unsafe at all.

RCU is an API and a general technique. It can be implemented in a ton of different ways (including, as the author noted, as a single instruction if Linux is built with pre-emption disabled). You must not conflate the soundness of an API with the soundness of its implementation. 'Can Rust prove Linux RCU APIs correct?' is nearly completely without meaning.

A similarly meaningless project would be to prove that the idea of malloc/free is correct. There's nothing that could even be incorrect or correct about it, and you don't need Rust wrappers to tell you that. Now prove glibc's malloc is correct; very different question.


> Adding an unsafe block around a call to C code doesn't prove anything.

I didn't say otherwise.

What I said is that writing unsafe code is writing a proof that the code is correct.

If the proof is incorrect, the behavior is undefined, and Rust makes no guarantees.

unsafe literally means "I've proven this code correct".

As mentioned, most uses of unsafe in the standard library are accompanied by a comment that documents the proof.
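
The convention looks roughly like this (an illustrative sketch, not actual std code): each unsafe block carries a SAFETY comment stating why the invariants hold.

    fn first_byte(buf: &[u8]) -> u8 {
        assert!(!buf.is_empty());
        // SAFETY: the assert above guarantees `buf` is non-empty,
        // so index 0 is in bounds.
        unsafe { *buf.get_unchecked(0) }
    }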


`unsafe { ... }` means "I hope you are convinced by my arguments in the surrounding comments, or better still in my POPL21 paper, that this is correct". Other than that, not much. The keyword itself has essentially zero relationship with the existence of a formal proof. You are talking about proof in a very loose sense.

The narrower point I think you're making is that hopefully attempting to create a safe API around the RCU pattern makes someone think about RCU's correctness even harder. I'm not sure it will produce much, given how much attention RCU has already received in the past 20 years. Any innovation you get from that process will probably be concentrated in describing/encoding the ownership pattern in Rust terms, since that's non-obvious. But the epoch GC implemented in Crossbeam is pretty similar overall to RCU, so it has much to build on in that regard.


> `unsafe { ... }` means "I hope you are convinced by my arguments in the surrounding comments, or better still in my POPL21 paper, that this is correct".

I disagree here (and that's ok).

The unsafe keyword tells Rust "this code is sound", i.e., there exist no program execution for which this code will cause safe Rust to exhibit undefined behavior.

Rust will parse it, type check it, compile it, link it, etc. "as-if" it was sound.

And Rust only makes any guarantees about what your program does if the unsafe code was actually sound.

From the point of view of the compiler, the unsafe block itself is a user-provided proof that whatever that code does is sound.

If you write unsafe { *ptr }, that's a proof that dereferencing the pointer is sound: the pointer is non-null, it points to an aligned allocation able to hold a value of the pointee type, the value at that memory location is a valid value of that type (i.e. the pointer is dereferenceable), etc.

The Rust compiler accepts your proof and generates code as if your proof is correct. In particular, it will compile code around that code under that assumption, it will eliminate null pointer checks after that code, etc.

If the proof that an unsafe block provides is incorrect, then the behavior is undefined, and Rust makes no guarantees.

I agree with you that given some unsafe code, actually proving it correct is not something that the compiler does, and that creating a tool that does this is hard. If it were easy, the compiler would already do it, and we wouldn't need unsafe at all.

Unsafe exists because there are things that the compiler can't prove correct, and this escape hatch allows the user to provide a proof that the compiler can integrate into how it compiles the program.

There is a whole repository of Iris proofs for unsafe blocks in the Rust standard library, for which we have formal proofs. But these are not integrated in any way with the Rust code one writes. They have been written by humans aiming to get confirmation that the proofs expressed by the unsafe blocks are indeed correct. Some were not, and had to be fixed, and pretty much every fix has generated a paper.


> I agree [...] actually proving it correct is not something that the compiler does

Yes. Unsafe blocks are bare assertions, no more. Everything you described about what happens in Rustc when you use it is better described as making a bare assertion to the compiler. Everything beyond that is socially constructed. Saying it's a proof sounds even more silly when you recall that Rust programmers treat it as quite the opposite -- radioactive markers of places where the program might cause UB.

You were trying to characterise it as a proof because you're claiming that "Can Rust prove Linux RCU APIs correct?" is an interesting question. But you are actually asking "Will Rust compile RCU code if we assert to it that using the RCU API is safe?", and the answer is unconditionally yes. It is not an interesting question at all. Your misuse of the word proof led you there, so I am suggesting you stop calling the unsafe keyword a proof.

In the broader scheme of things, yes, the way in which Rust encourages you to isolate unsafety is helpful, and so if you wanted something formally verified Rust is a good language for that. That doesn't really translate to RCU, because the hypothetical Rust API would not be an implementation of RCU. All the Iris proofs that have come out of the RustBelt project are bad examples here, because they were actual implementations of things, not APIs around black box implementations that change under your feet when you set kconfigs.

As a final thought, here is a supposedly safe-encapsulated Rust API for RCU, I would invite anyone to suggest how writing down this very simple API or similar + a few calls to extern "C" wrappers for the RCU macros could possibly help "prove Linux RCU APIs correct", whatever that means: https://docs.rs/rcu-clean/0.1.6/rcu_clean/struct.ArcRcu.html

(Little edit: if you're interested in what circumstances will lead Rust wrappers of C code to elucidate safety, look into rlua, a project to create a safe wrapper for Lua (https://github.com/amethyst/rlua). Arguably the main outcome is that we now know that the Lua API cannot be wrapped safely with zero cost, in no small part due to the use of longjmp. Some of RCU's APIs require you do not sleep or block in any way while using it, lest you wake up with the GC train having left the station. For this reason I imagine you won't be able to encode all the usage requirements in the Rust type system, let alone make an actual safe zero-cost API, let alone turn that into a proof of anything.)


I think you are not using 'proof' in the way that most people would understand it. A proof is usually considered a series of steps someone can take to verify that something is true themselves (i.e. there is no need to trust that whoever produced the proof isn't malicious or mistaken about what they are trying to prove; if you can follow the proof you can see for yourself). Proving something to a compiler therefore is generally understood to mean giving something to the compiler which it actually checks is sound (even if it could not produce the proof on its own). This is effectively how type-checking and borrow-checking work (your program is a kind of proof the compiler verifies). 'unsafe' in this case is not a proof; it is more like a promise (not in the async sense) or assertion (not in the debug sense). The compiler trusts that you have not made a mistake; it does not have any way to verify that you haven't. You may have a proof to the person reading the code that there is no mistake (or you may not, either because your proof is incorrect or the person cannot understand it), but it is not a proof to the compiler.

(In fact, there was much bikeshedding at one point about renaming the 'unsafe' keyword to make this explicit: suggestions included 'assumed_safe', 'trustme', 'trusted', as well as less serious ones like 'yolo' and 'hold_my_beer_and_watch_this': https://github.com/rust-lang/rfcs/pull/117)


> I think you are not using 'proof' in the way that most people would understand it.

I have corrected many math exams of students over the years.

A "significant" amount of the "proofs" provided by the students in those exams were wrong in one way or another.

Many students continued the exams under the assumption that what they just incorrectly proved, was actually correct.

They had no way to verify whether their proofs were correct or not. No theorem provers were allowed in the exam.

Yet all of them called what they wrote "a proof".

---

This is pretty similar to how Rust works.

Without unsafe, Rust proves your programs sound for you.

Unsafe lets you supply your own soundness proofs for parts of the program that the compiler does not know how to prove (or disprove).

If the proofs are not correct, Rust makes no guarantees.

The claim that because "Rust does not prove these for you, they are not proofs" is unfair.

That's like claiming that any "proof" that has not been proved in a theorem prover does not deserve to be called "a proof".

I respect your way to see this. I just don't see it this way.


What makes you think this kind of proof work hasn’t already been done for this core C code in the kernel?

I've looked at the literature every couple of years, and to the best of my knowledge, there is no paper proving that a subset of the Linux kernel RCU APIs is sound.

If you are aware of one, I am genuinely interested in this topic.



Yes, I want!

Rust in core Linux would probably be a horrible idea, given all the complexities of the language.

For drivers it could be fine.


This is HN.... every day there are articles about "$(this old existing software) written in $(language of the week)"

$(language of the week) changes, and people continue using the old software, written in c/c++/...


except C++ is not welcome in the Linux kernel.

torvalds said recently: “Probably next year, we’ll start seeing some first intrepid modules being written in Rust, and maybe being integrated in the mainline kernel.” [1]

the rust community is a bit over-excited, but if you can look past that, there's substance in the hype.

[1] https://thenewstack.io/linus-torvalds-on-community-rust-and-...


> $(language of the week)

Rust is 10 years old. See, such knee-jerk reactions are exactly why many people have trouble taking the anti-Rust arguers seriously. It's like you people defend religion or something. How about we stick to facts?


> Rust in core Linux would probably be a horrible idea, given all the complexities of the language.

C is a very complex language as well. It's just that when you're not aware of all the complexities in Rust, the compiler will refuse to compile; in C, you'll get a security vulnerability or weird crashes at runtime. Yes, this is a generalization, but only slightly.

C is deceptively simple – its complexities are well hidden.


No, it's not a complex language. Come on.

Then why do seasoned veteran C programmers keep introducing memory security vulnerabilities all the time? Some of the projects considered by many to be idiomatic C keep introducing terrible bugs... SQLite, Redis, etc.

I don't think bugs are introduced due to the language being very complex, but the problem/solution being complex. To prevent (some of) these, you may need a more complex solution, like what Rust provides.

Would you like to address the arguments in my post instead of going "Na-ah, you're wrong, I'm right"?

I get your point, but I'm squeamish about how far you stretch the definition of the word "complex". C has these problems because it's simple, i.e. it provides a rather thin abstraction layer over machine language.

Rust's corresponding layer is much thicker due to its memory requirements and historical origins.


> it provides a rather thin abstraction layer over machine language.

No, it doesn't, and it hasn't for quite some time – not since C compilers started exploiting undefined behavior for optimization opportunities. It used to be that if you wrote

    a + b
where a and b are int values, the compiler would emit an ADD instruction in assembly. This isn't necessarily true anymore, because compilers are allowed to assume that signed overflow doesn't happen, which makes writing correct overflow-detection so difficult and non-intuitive. Even worse, code which was working correctly for 20 years might start exhibiting subtle bugs from one day to the next, just because you've upgraded your compiler from version 9.4.2 to 9.4.3.

Undefined behavior is one major reason why writing correct C code is complex and difficult in practice.
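
To make the contrast concrete, here is a small sketch (mine, with arbitrary values) of how Rust pins the same corner case down: overflow behavior is something you name explicitly, and it is never undefined.

    fn main() {
        let a = i32::MAX;
        let b = 1;

        assert_eq!(a.checked_add(b), None);                 // detect overflow
        assert_eq!(a.wrapping_add(b), i32::MIN);            // two's-complement wrap
        assert_eq!(a.overflowing_add(b), (i32::MIN, true)); // wrap plus a flag

        // A bare `a + b` panics in debug builds and wraps in release builds,
        // but it is never undefined behavior.
    }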


> where a and b are int values, the compiler would emit an ADD instruction in assembly. This isn't necessarily true anymore, because compilers are allowed to assume that signed overflow doesn't happen, which makes writing correct overflow-detection so difficult and non-intuitive.

it's the exact opposite - if you want your compiler to just emit an ADD no matter which platform you are on, then it cannot be the language that defines the semantics of "+" in the corner cases (thus UB). If you define it as overflow on 2's complement, then on a DSP with saturating arithmetic your compiler will have to emit additional instructions to detect if you are saturating and wrap instead.


The C11 standard is 701 pages. The "Language" section (ignoring the Environment, definitions of terms, standard library, and appendices) is 145 pages. And it's hilariously underspecified even without ignoring all that; there's tons of implementation-defined behavior (each compiler picks a behavior and documents its choice), unspecified behavior (compilers must pick from a set of possibilities but need not document the choice), and undefined behavior (no requirements on any compiler at all).

Before anyone takes this post at its word, I encourage you to go look at the C11 standard. It's called "ISO/IEC 9899:201x" and those of you who have seen an ISO standard before will already know where I'm going with this!

> 145 pages

Yes, and (within that 145) they take a whole page to define the word "scope". Not even in the context of C!! Just the general language-agnostic term "Scope". That's hardly meaningful...


Which page within the 145 (chapter 6)? The section on scopes of identifiers (6.2.1, pgs 35-36) is entirely in the context of the C language.

Exactly that one. I'll let people read it to come to their own determination as to how "entirely in the context of the C language" it is.

Rust is complex, and C's complexity is hidden - if it weren't hidden, C would be less complex.

But, does that in turn make C more complex? It definitely makes it less safe. But I don't know if it automatically makes it more complex.


> C's complexity is hidden - if it weren't hidden, C would be less complex.

If it weren't hidden, it would be a different language. Would that language be less complex? That depends on what that language would look like.

> But, does that in turn make C more complex?

More complex than what? I'm not saying that C is more or less complex than Rust – what I'm saying is that C is not simple – its considerable complexity isn't visible at first glance, but it's there.


> Yes, this is a generalization, but only slightly

Sounds like a total hand-wave to me. Can you provide examples of "well hidden" complexities?


Manipulating pointers as numbers. And addressing arrays out of bounds.
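
For contrast, a minimal sketch (mine, assuming nothing beyond the standard library) of how the out-of-bounds case surfaces in Rust: a defined panic or an Option you must handle, not silent corruption.

    fn main() {
        let v = [1, 2, 3];
        // The checked accessor turns an out-of-bounds access into a value
        // you are forced to handle:
        assert_eq!(v.get(10), None);
        assert_eq!(v.get(1), Some(&2));
        // Plain indexing with a bad index panics immediately and loudly;
        // it never reads or writes memory it shouldn't.
    }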

Manipulating pointers as numbers (using the ++ operator for instance) is actually really simple, but most people do a bad job of explaining it.

You are severely missing the point. Of course it's simple. Many things are simple on their face. Until one rainy day you're slightly distracted and make a mistake.

Be honest with yourself: you NEVER made a mistake that made you facepalm sometime in the future? Not once in your life?

People just refuse to believe they can make mistakes. That's why those discussions are impossible to win; they are about personal maturity and not about programming prowess. On one side you have the people confidently saying their famous last words: "it's simple, you all dumb or something?" and on the other you have those that are like "I know I can screw up even if I'm the best ever so let me use a language that will catch part of my mistakes".

But yeah, the latter group can never convince the former. You all just repeat the same things every time and always miss the forest for a single tree. Very frustrating.


What, beyond the device drivers, even is so valuable in the Linux kernel that we would want to add Rust to it rather than develop a new kernel from scratch in Rust?

Is that a hint of Pythonic hubris?

It's the decades of perception of the Linux kernel as an insanely bloated pile of legacy code in a legacy language (I would rather go modern C++ the way SerenityOS did, not necessarily even Rust) + the driver problem (new OSes have little option besides borrowing what little there is in NetBSD, which seemingly has the best driver model) + optimism caused by numerous alternative-OS projects + some functional programmer hubris. I have tried quite a number of alternative OSes and the only thing I can see a problem with is hardware drivers.

Maybe a new OS project with a focus on directly funding things like driver ports. Funnelling donations to something like Ethlance.

Because at best that new kernel would be as popular as BSD?

Hm. Okay. Why not? I would even prefer a world of less popular kernels to one single kernel everybody uses. In fact some "default Linux" with systemd et al. becoming too popular already seems worrying.

To be fair -- there was no evidence anyone should have thought any differently about Linus's little minix rip-off project back in 1991.

I don't want rust in the kernel. As a small time developer making a hardware product, the investment to understand the kernel and how to build drivers is already huge. I have no interest in learning rust, and it will make it harder for people with limited resources to use linux in their projects.

To be honest, I don't think Linux being the "One True Kernel" is really worthwhile, and the benefits of adding Rust to the kernel outweigh the downsides.

There are plenty of other kernels available for embedded work, and my understanding of Rust in the Linux kernel is that it's mostly drivers, which seems like it would be optional for your use case.

That said, I've not done much embedded, I may be misunderstanding something.


Rust is a language that forces you to do things in a way that makes code more robust at the cost of a little upfront thought and inconvenience.

The parent comment is complaining that Rust will make kernel dev too hard? Are they joking? It is already complicated.

I would rather see a more formally constructed language used for the kernel than risk adding more C-related bugs.


> It is already complicated

And specifically, one of the ways in which it is complicated is that it's mostly done in a language that doesn't natively enforce memory-safe constructs, forcing developers to walk a best-practices and tooling tightrope to avoid writing code that breaks.


Why the heck is this comment grey?! At least reply and explain where they are wrong?

you know what they say - something about nature creating a better fool

You mean like not incorporating formal verification theories? Yeah: fools, all of 'em.

the kernel and linux is about many important issues, not about chasing the latest language fad

That's your misconception: you believe Rust is a fad. And there's also nothing "latest" about it; it's 10 years old.

The OP might be OK with a kernel that was just in Rust, but having a kernel with a mix of Rust and C adds complexity, both in having to master two languages to understand it well, and in a more complex build system.

exactly

What's so bad in having to know two languages for developing drivers? Especially when knowing the second one will rid you of all memory safety bugs just because your code compiles.

And to call Rust's build system complex is laughable. Issuing `cargo build` could not be simpler (and it includes automatic downloading of dependencies, too).


There is also so much about Rust that makes it nearly ideal for the prospect of writing a driver.

For example, how many bytes are in an `int`? What is the endianness of your 32-bit `int`?

Rust puts up front a lot of the information crucial for cross platform development of hardware drivers.
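
For instance, a small sketch (mine, with an arbitrary register value) of both points: the width is in the type name, and endianness is an explicit, named conversion.

    fn main() {
        // No platform-dependent `int`: the width is in the type name.
        let reg: u32 = 0xDEAD_BEEF;

        // Endianness is handled by explicit conversions, which is
        // exactly what register and wire formats need:
        assert_eq!(reg.to_le_bytes(), [0xEF, 0xBE, 0xAD, 0xDE]);
        assert_eq!(reg.to_be_bytes(), [0xDE, 0xAD, 0xBE, 0xEF]);
        assert_eq!(u32::from_be_bytes([0xDE, 0xAD, 0xBE, 0xEF]), reg);
    }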

Rust has few undefined edges which is exactly where C gets into trouble. C, by design, has a bunch of undefined edges.

Honestly, the biggest downside to Rust in the kernel is that Rust is backed by the LLVM, which doesn't have as many supported targets as GCC does. (And, AFAIK, GCC is generally the preferred compiler for the kernel).


> Honestly, the biggest downside to Rust in the kernel is that Rust is backed by the LLVM, which doesn't have as many supported targets as GCC does. (And, AFAIK, GCC is generally the preferred compiler for the kernel).

Fortunately, there are now two competing approaches to making GCC a viable compiler for Rust [1], and it's moving really fast. I think the LLVM-only situation won't get in the way for more than a few more years.

[1]: https://lwn.net/SubscriberLink/871283/c437c1364397e70e/


> For example, how many bytes are in an `int`?

Rust and Linux both have the same name for, say, the 32-bit unsigned integer type: they both call it u32. This is of course a coincidence, though not an unlikely one, and it probably doesn't hurt for Linux people getting familiar with Rust.


it is already complicated. so i don’t want to learn another language. i want to become BETTER at using what is there, not have a language forced onto me. i want to make my tool set smaller, not learn more tools.

also, being able to use linux in an embedded system as a resource constrained developer is hugely important today. adding more tools means more dev cost.


> i want to make my tool set smaller, not learn more tools.

Ouch. Good luck with that.


Sounds like it's time to exit programming, buddy, because the world has some very bad news for you. Yeah, you'll have to keep learning tools. It's included in the big paycheck.

If everyone had your attitude, humanity would have never left the trees. There are all sorts of legitimate reasons to take it slow on adding Rust to the kernel and adopting Rust generally. The language is far from perfect.

But "I don't want to learn something new"? No, that's not a legitimate reason.

But Christ almighty, you don't get to block progress of all humanity (and Linux is a humanity-scale project now) because you're too lazy to learn some new syntax. Memory safety is important. Memory safety bugs cause billions of dollars of damage every single year. The project of solving them is too important to get tangled up in the weeds of people who don't have "resources" to RTFM.


If Rust would have just used a C-style syntax instead of this wacky quirky stuff that's needlessly hard to read, it would probably be much more widely adopted. I gave it a shot myself but I was too turned off by the friction of getting used to the syntax.

Wacky, quirky stuff?

    fn sum_odd_numbers(up_to: u32) -> u32 {
why fn? why switch the order around? fn and u32 and all that as if keystrokes are such a scarce resource, yet u32 f(u32 a) saves so much more than fn f(a: u32) -> u32.

Closures use pipes instead of any of the existing syntaxes we're used to.

Why couldn't they just use the familiar concept of classes instead of whatever is going on with their strange NIH traits? NIH describes the whole language, every little thing has to be different somehow.

C with Rust's safety guarantees, OOP style classes, and a few more bells and whistles could have taken over the world a lot faster, but instead we have something with a high friction to learning that will be adopted much more slowly by either programmers with a fresh start to whom every language is equally weird, or the relative minority of experienced programmers with the will and free time to push past the friction.


There are good reasons for it, and I'm certain you will gain a better understanding as you gain more experience as a developer.

> There are good reasons for it, and I'm certain you will gain a better understanding as you gain more experience as a developer.

The people pointing out problems with Rust have more experience as developers in their little fingers than Rust's internet fan club has in their whole bodies combined.

In a mature, intelligent discussion, if you have a point, you make it. On HN, if you have a point, you say "there are reasons; I'm not going to give them; also, you're a newbie".

It's comments like yours that make HN unbearable nowadays.


None of this stuff is NIH, it's just from a different lineage than C.

Interesting perspective. FWIW a lot of modern languages now put the type after the name for a good reason: it's a lot easier to parse. Plus for a reader, at least in Rust, wherever you see a `:` you know that a type is coming after. For the other stuff, I don't know, I got used to it pretty quickly. Not that much of a big deal in my opinion

The "symbol: Type" notation is the "scientific" one for programming languages since "forever".

It's not that "modern" languages are doing something new. Everybody was doing it like that, besides the "C language family". They're the outlier, not the other way around.


Thanks for educating me. I am affected by recency bias

To answer the first paragraph (as I understand it, i'm no expert here):

Parsing! It is faster to parse and probably uses less memory too. This is not only important for compile time but also for editors with code suggestions.

C has a relatively long compile time. If you look at golang, which has a goal of a fast compile time, you will see similar syntax.

PS.

Note that Ken is one of the fathers of both C and golang.


> Parsing speed

Parsing is ridiculously fast nowadays. Compilers spend far more time in phases after AST construction (like register allocation, SSA transforms, various vectorization passes) than they do constructing their ASTs. Parsing speed is a nonsense argument these days. It's the same level of rigor as painting a car red to go faster.

> If you look at golang, which has a goal of a fast compile time, you will see similar syntax.

Rust critics: "Some of Rust's language choices are clearly inspired by Go and may not have been good ideas"

Rust people: "We didn't copy Go! We independently through a rigorous analysis arrived at our syntax!"

Also Rust people: "If you look at golang, which has a goal of a fast compile time, you will see similar syntax."

> Note that Ken is one of the fathers of both C and golang.

Creating one of these things makes Ken imperfect. Creating both of these things makes Ken a menace.


I don't think Rust is particularly weird. If you want, you can pretty much write in an imperative style, with objects, etc. It's just got some practical thinking thrown on top of that: i.e. make everything immutable by default, use tooling to ensure people aren't holding on to dangling pointers, etc.

Some of these things have tradeoffs, but I think once you get to grips with what the borrow checker is asking of you (doesn't take long), it's a pretty easy language - certainly simpler than C[0], far simpler than C++.

[0]: Non-buggy C, that is. I wrote a pretty basic program in C the other day (I'm not a C programmer) and I literally spent 80% of the time looking at output from -fsanitize=address to catch stupid off-by-one errors.


> Closures use pipes instead of any of the existing syntaxes we're used to.

Who is “we”? Pipes are one of the existing closure syntaxes I was used to pre-Rust. And, I mean, I’m probably not alone: lots of current and ex-Rubyists around.


Rust didn't invent any of these syntaxes. Pascal was invented around the same time as C and had functions that start with "function" and types that use the "binding: type" notation. And lots of languages (Ada and ML come to mind) followed in Pascal's footsteps that predate Rust.

It has benefits. It's easier to parse for both humans and machines and it allows for easier type inference.

>Closures use pipes instead of any of the existing syntaxes we're used to.

Closures in Ruby use pipes. It's a common syntax.

>Why couldn't they just use the familiar concept of classes instead of whatever is going on with their strange NIH traits? NIH describes the whole language, every little thing has to be different somehow.

Needless to say, traits aren't NIH either, and there's good reasons for avoiding class-style polymorphism in a language like Rust.

Here's some reading to help explain. https://stevedonovan.github.io/rust-gentle-intro/object-orie...

It also happens to include this choice quote:

>I once attended a Java user group meeting where James Gosling (Java's inventor) was the featured speaker. During the memorable Q&A session, someone asked him: "If you could do Java over again, what would you change?" "I'd leave out classes," he replied. After the laughter died down, he explained that the real problem wasn't classes per se, but rather implementation inheritance (the extends relationship). Interface inheritance (the implements relationship) is preferable. You should avoid implementation inheritance whenever possible


And clever use of blanket implementations makes traits even nicer for this purpose.

For example, suppose you just invented the type Qux and you always know how to turn any Baz into a Qux.

Rust's standard library provides four interesting conversion traits that may be applicable for different purposes, From, Into, TryFrom, and TryInto. Oh no, implementing all of them sounds like a lot of work and who knows which anybody would need?

Just implement From<Baz> for Qux

A blanket implementation of Into<U> for T when From<T> for U gets you Into, then a blanket implementation of TryFrom<T> for U when Into<U> for T gives you TryFrom, and finally a blanket implementation of TryInto<U> for T when TryFrom<T> for U gets you TryInto as well.

So you only wrote one implementation but anybody who needs any of these four conversions gets the one they needed.
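
A minimal sketch of that chain, reusing the hypothetical Baz and Qux from above:

    use std::convert::{TryFrom, TryInto};

    struct Baz(u32);
    struct Qux(u64);

    // The only conversion we write by hand:
    impl From<Baz> for Qux {
        fn from(b: Baz) -> Qux {
            Qux(u64::from(b.0))
        }
    }

    fn main() {
        let a: Qux = Qux::from(Baz(1));          // the impl we wrote
        let b: Qux = Baz(2).into();              // free, via the Into blanket impl
        let c = Qux::try_from(Baz(3)).unwrap();  // free, via the TryFrom blanket impl
        let d: Qux = Baz(4).try_into().unwrap(); // free, via the TryInto blanket impl
        let _ = (a, b, c, d);
    }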

Another more obvious example is the blanket implementation of IntoIterator for any Iterator. This makes it easy when you want to be passed a bunch of Stuff you're going to iterate over: just give your parameter a trait bound saying it must be IntoIterator with Item = Stuff. You needn't care if the user of your function passes you an array, a container type, or an iterator, since all of them are IntoIterator.

It would have been so easy to overlook this, and end up with a language where people find themselves collecting up iterators into containers over and over to make more iterators, or else constantly turning containers into iterators to call functions.
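
A small sketch (my own toy function, not from any real API) of that trait-bound pattern:

    // Callers can pass an array, a container, or an iterator; the
    // IntoIterator bound (plus its blanket impl for iterators) accepts all three.
    fn total<I: IntoIterator<Item = u32>>(stuff: I) -> u32 {
        stuff.into_iter().sum()
    }

    fn main() {
        assert_eq!(total([1, 2, 3]), 6);               // array
        assert_eq!(total(vec![4, 5]), 9);              // container
        assert_eq!(total((1..=3).map(|n| n * 2)), 12); // iterator
    }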


That's just enable_if from C++, not some amazing new Rust invention.

First up let's be absolutely clear - there aren't many (any?) "amazing new Rust inventions". Rust isn't a proof of concept, it's the mass market product and so it just stole clever ideas wholesale from languages we don't care about, and in some cases haven't even heard of. The various papers saying C++ should do things that Rust did aren't because Rust invented those things but because it has popularized them and now people writing C++ papers want them.

It is true that SFINAE means you can write very general templates in C++ and just not worry about whether they compile with any particular parameters, which is superficially similar to a blanket implementation of a Rust trait. But, because Rust's traits have semantics not just syntax the actual effect is different.

And that makes the blanket implementations very comfortable in Rust, whereas the equivalent pile of enable_if templates in C++ would be trouble.

Remember, C++ comes with a template requirements language that is explicitly labelled as having unenforced semantic value. It's for documentation: a human reading it can see, for example, that you expect this parameter type to have full equivalence. However the compiler only cares about syntax, and the syntax just says the type has an equals operator.

This is a great contrast with Rust, whose traits have semantics and so core::cmp::Eq and core::cmp::PartialEq are different traits and the compiler cares which one you asked for even though the syntax is the same.
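
One illustration of that semantic difference (my example, not the parent's): f64 implements PartialEq but not Eq, because NaN != NaN breaks full equivalence, so an API that demands Eq rejects floats even though the == syntax is identical.

    use std::collections::HashSet;

    fn main() {
        // HashSet requires Eq (full equivalence), not merely PartialEq.
        let ints: HashSet<i32> = [1, 2, 3].into_iter().collect(); // fine: i32 is Eq
        // let floats: HashSet<f64> = [1.0].into_iter().collect();
        // ^ rejected: f64 is only PartialEq (NaN != NaN), so it is not Eq.
        assert!(ints.contains(&2));
    }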


> But, because Rust's traits have semantics not just syntax the actual effect is different.

Don't concepts give C++ the same semantic power? What am I missing? Concepts give C++ generic code exactly the same nominative constraints you're describing, yes?

Besides: even without concepts, you can use tag types and traits to similar effect. Not everything needs to be duck-typed, even in C++17!

One of my main problems with using Rust instead of C++ is losing metaprogramming power. I really like the template system.


> Don't concepts give C++ the same semantic power?

Not really. But, it's on purpose, and, in their context it makes sense. These aren't idiots, they know there are however many bajillion lines of C++ code and if Concepts doesn't work with that code it's useless.

In Rust, you have to explicitly write implementations of traits. If the author of stupid::Bicycle doesn't implement nasty::Smells and the author of nasty::Smells didn't implement it for stupid::Bicycle either, then a stupid::Bicycle just isn't nasty::Smells.

Even "marker traits" where the implementation is empty, are still explicit. If you have a type that implements Clone but you didn't say it is Copy, then it isn't Copy, even though maybe the type would qualify as Copy if you had asked. Rust will always move this type, because you didn't say copying it was OK.

This is where the semantic strength comes from, and it was fine in Rust because Rust "always" did this. There isn't a bajillion lines of Rust code that doesn't know about traits, and so it's OK to require this.

In C++ they realistically couldn't do that. So, next best thing is to say that a Concept is modelled on syntax. You mentioned duck-typing, and that's what concepts are assumed to be for.

Now, of course you can use tag types, for a home grown Concept it actually might be viable, if you've only got two users and they both know you personally you can just tell them to use the tag type and they'll both add it. But for the C++ standard library Concepts that was not practical. So, that's not how Concepts is being taught or has been used in practice, unlike Rust's traits.


I have extensive experience in OOP. Classes are a bad way to achieve polymorphism, and their popularity is an indictment on our profession.

Tell me you have basically no understanding of the subject matter without telling me you have basically no understanding of the subject matter.

I agree that Rust differs needlessly from previous curly-brace languages. Rust could have achieved its core goals --- memory safety and robust concurrency --- as a dialect of C++ instead of as a whole new language with the latest fashion sense (*cough* Go *cough*) thrown in.

It's also a shame that your comment is being downvoted into oblivion. I remember when HN was for discussion, not for karma-enforced orthodoxy.


> could have achieved its core goals --- memory safety and robust concurrency --- as a dialect of C++ instead

The C++ Standard committee disagree with you - and not for lack of trying, either. The C++ Core Guidelines are the best they could come up with, and these are not rigorously enforced, unlike the safe subset of Rust.


Are you talking about [1]? All that paper says is that you can't add a borrow checker to C++ as a library feature. You can add such a thing as a language feature.

I hate that paper. People misrepresent it all the time.

[1] https://docs.google.com/document/d/e/2PACX-1vSt2VB1zQAJ6JDMa...



What do you mean? The core guidelines weren't an attempt to add lifetimes to C++.

Several aspects of it pertain to lifetimes https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines...

Nevertheless, the core guidelines were not an attempt to add lifetimes to C++ at a syntactic level. AFAIK, that's never been tried, and I think the effort would be a valuable complement to Rust.

> If Rust would have just used a C-style syntax instead of this wacky quirky stuff that's needlessly hard to read

A lot of the "whacky, quirky" stuff is to avoid the clusterfuck that is C and C++ parsing.

People want IDEs. They also want their IDE to do smart things. Having to compile the universe to get the IDE to do something smart doesn't work very well.

Take a "simple" task like "Is this is a variable name? That is a type definition?". C and C++ have to practically do a full compile to figure that out (see: typedef and templates). Rust, on the other hand, just has to read a few tokens and it knows definitively the answer to that question.

And, you may call this stuff "quirky", but practically all the languages designed in the last 10 years have converged to similar goals and syntaxes to Rust. Language designers know that people don't like "different from C" but they do it anyway.

Think about whether your arguments are stronger than those who are designing new programming languages that they want adopted.
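
A rough illustration (mine) of the few-tokens point: in Rust, the leading keyword settles what a line introduces before the parser has resolved a single name.

    struct Meters(f64);            // `struct` => a type definition, always

    fn speed(d: Meters) -> f64 {   // `fn` => a function; after `:` comes a type
        d.0 / 9.58
    }

    fn main() {
        let dist = Meters(100.0);  // `let` => a variable binding, always
        println!("{:.2} m/s", speed(dist));
    }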


Intellisense and the rest of Visual Studio work perfectly with C#, which is very much C style.

> Many were increasingly of the opinion that they'd all made a big mistake in coming down from the trees in the first place. And some said that even the trees had been a bad move, and that no one should ever have left the oceans

> I don't want rust in the kernel. As a small time developer making a hardware product, the investment to understand the kernel and how to build drivers is already huge.

There's a crucial part missing: the investment [...] to build (memory) unsafe drivers.

I find it very troubling when memory unsafe language programmers don't weigh memory safety at all in their views.

(note that this doesn't necessarily refer to Rust; there's people who swear that one can build drivers also in Golang, so assuming that is realistic, the same reasoning applies)


Wouldn't it make more sense to use a much more mature safety-centric language with a proper language spec like Ada?

No.

Ada maybe isn't very well adapted to kernel code? From what I know of Ada, doesn't it get its memory safety from either a GC or not freeing memory entirely? That is -- ownership rules are pretty new to Ada/SPARK.

C-style syntax and community interest also favor Rust.

This spec argument has always seemed like a red-herring to me. Can you explain why the Ada spec would be a significant factor in this instance?


>Ada maybe isn't very well adapted to kernel code? From what I know of Ada, doesn't it get its memory safety from either a GC or not freeing memory entirely? That is -- ownership rules are pretty new to Ada/SPARK.

I think GC is optional with Ada, as far as I know the memory safety comes from raising exceptions (or refusing to compile) when it detects memory-unsafe operations (array bounds checking etc).

>C-style syntax and community interest also favor Rust.

That's fair, C-like languages are instantly familiar to software people and Rust seems to have a more hip image than old man fuddy-duddy Ada.

>This spec argument has always seemed like a red-herring to me. Can you explain why the Ada spec would be a significant factor in this instance?

I'm not arguing for Ada over Rust (I'm not a software guy) so I don't mean it as a red herring, but wouldn't a suite of static verification tools and a formally verified compiler require a spec to be tested against?


Re: spec, not if the spec is defined by the implementation (and by the project's values). I mean -- the Rust project has a ridiculous # of tests which define language behavior without having an ISO standard. Yes, implementation-defined behavior in C is usually a place where C compiler engineers trade safety for speed. Yes, that's usually a bad trade. However, I'd look at this situation re: Rust vs. C in a different way: the Rust project's values are why defining a standard is less important.

I think a spec is often a red herring because a bunch of folks living in the slum of C, when asked if they would all like to move into Rust's nice 3 bedroom by the park, instead always seem to ask: Wouldn't it be better to form a committee about building us a cathedral? Ada might be better. Someone should try it, but until then I'll take Rust.


It's always fun to read people's drive-by opinions about the only decent language IMO. Ada is highly compatible with C code. For dynamic allocations with "ownership" there is the concept of memory pools, which CAN use a GC if that's how you choose to implement the pool allocator.

Haha, I'm sorry if it seemed like I ever knew what I was talking about. I thought asking questions would make it clear that I don't.

I remain interested in the potential advantages of Ada compared to Rust for Linux kernel development, if you would care to point me in the right direction.


Honestly if Ada hasn't caught on in the intervening 30 years (despite a lengthy government issued monopoly) then it probably isn't a good choice. A more mature Rust would be nice, but it's gravy at this point.

(caveat: I've read a bit about Ada but don't have any deep familiarity with it)

Ada is safe compared to the other languages of its day, but I'm not sure it compares favorably with Rust. IIRC it does have more of a focus on safe arithmetic rather than memory safety.


The focus of Ada is not on safe arithmetic only, it's on functional safety at large: the code does what it is specified to do and nothing else.

Ada shines in its specification power, how developers can express what the code is supposed to do (strong typing, ranges, contracts, invariants, generics, etc.). And then you can either check your code at "compile time" with SPARK [1], which provides a mathematical proof that your code follows the specification (SPARK also proves that you don't have buffer overflows or division by zero, for instance), or you can have checks inserted in the run-time code, which greatly improves the benefits of testing, as every deviation from the specification will be detected, not only the ones you decided to check in your tests.

In terms of memory safety, Ada always had an edge on C/C++ because of the lower usage of pointers (see parameter modes [2]) and the emphasis on stack allocation. Now with the introduction of ownership in SPARK it's getting on par with Rust on that topic.

[1] https://learn.adacore.com/courses/intro-to-spark/chapters/05... [2] https://learn.adacore.com/courses/intro-to-ada/chapters/subp...


Interesting. I wasn't aware that safety in Ada had developed to this point. Maybe I'll have to take another look at the language!

I remember writing Ada (long ago) and my take was: it's nice to work on existing Ada code, but I thought it was tedious to write new code.

there was also a lot of unchecked conversion under the hood.

(disclaimer: I haven't really paid attention to newer versions of Ada)


Note that Rust's safety proposition is different than Go's in a way that's significant for Linux beyond the obvious.

(Safe) Rust is Data Race Free. You can mistakenly write a data race in Go, just as you could in C, and, even though under normal circumstances Go feels like it's memory safe, once you've got a data race all bets are off and you potentially no longer have memory safety. In (safe) Rust you can't write a data race, so, this can't happen.

If you write simple serial programs, you don't care, your programs never have data races because they have no concurrency and therefore can't possibly have concurrent memory writes. But obviously Linux is not a serial program.
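
To sketch what "can't write a data race" means in practice (example mine, nothing kernel-specific): handing threads an unsynchronized mutable reference simply won't compile, so shared mutation has to be spelled out, e.g. with Arc<Mutex<...>>.

    use std::sync::{Arc, Mutex};
    use std::thread;

    fn main() {
        // A plain `&mut i32` shared across threads would not compile;
        // Arc + Mutex makes the sharing and the locking explicit.
        let counter = Arc::new(Mutex::new(0));
        let handles: Vec<_> = (0..4)
            .map(|_| {
                let counter = Arc::clone(&counter);
                thread::spawn(move || {
                    *counter.lock().unwrap() += 1;
                })
            })
            .collect();
        for h in handles {
            h.join().unwrap();
        }
        assert_eq!(*counter.lock().unwrap(), 4);
    }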


It's also important to note that for most systems, security-critical memory safety is a REALLY big deal, but safety from crashes due to data races is generally not a big deal. Rust has trade-offs to accomplish this.

We've proven time and time again that we can't write safe C/C++ code from a security perspective. I don't think we've proven that crashing now and then to fix our data races is a bad outcome. There are always bugs to be fixed.


The consequence of a data race isn't (necessarily) limited to "crashing now and then".

The big problem is SC/DRF. Most languages promise (at best) your program is sequentially consistent if data race free. And it turns out we can't debug non-trivial programs unless they're sequentially consistent because it hurts our brains too much.

Causality is something we're all used to in the real world. The man hits the ball with the bat, then the ball's trajectory changes, because it struck the bat. Sequentially consistent programs are the same. X was five, then we doubled it, now it's ten.

But without sequential consistency that's not how your program works at all. The ball goes flying off over the wall around the field, then the pitcher throws the ball, the ball lands in the catcher's glove, the umpire calls it, the batter hits the ball, and yet at the same time misses the ball... We doubled X, now it's five, but before we doubled it that was nine or sixteen? What?


> It's also important to note that for most systems, security-critical memory safety is a REALLY big deal, but safety from crashes due to data races is generally not a big deal.

It must be noted, though, that data races can lead to memory safety issues in languages which are otherwise free of them, Go being a prime example of that: the built-in hashmap is not thread-safe, and does exhibit memory corruption issues under unprotected concurrent writes and accesses (only concurrent unprotected reads are OK).

There is a “best effort detection” of concurrent misuse, but it’s just that.


read the kernel docs and follow best practices. the code is reviewed by thousands of developers, so chances are what makes it into the kernel tree is pretty good. i don’t need rust. thanks!

Anecdotes from kernel maintainers are that memory-unsafe code gets past reviewers all the time. There was even that ethically questionable study that tried, and succeeded, in getting vulnerabilities inserted into the kernel.

First, I've generalized to memory-safe languages, not necessarily Rust.

Second, this is exactly the overconfidence that I find troubling. Vulnerabilities caused by memory unsafety are real, pervasive, and inevitable.

A few resources you should look at:

- the official reports/posts by Mozilla and Microsoft about the percentage of vulnerabilities caused by memory unsafety

- the mailing lists with the Linux kernel issues (I don't imply any severity; just the amount)

There are many summarized posts for the lazier, and I'm talking about high profile authors (or authors writing on behalf of high profile companies). Here's a semi-random one:

https://msrc-blog.microsoft.com/2019/07/18/we-need-a-safer-s...


not everyone using linux to make a device cares about vulnerabilities. there is no root password on my device and it is not networked.

I personally hope that linux kernel development won’t be constricted by the needs of hardware devs…

Isn't a big chunk of a kernel's job providing a standard abstraction for hardware access?

I'd assume naively that such a construct would, of necessity, involve the needs of hardware devs.


Absolutely!

I was emphatically not suggesting that the needs of hardware devs should be ignored, just that those needs should not constrict kernel development to the detriment of other concerns.


(empathetically*)

The problem is, it isn't "needs of hardware devs" vs "rust". The kernel exists to abstract over hardware. Almost everything in the kernel* is written by "hardware devs". So any benefit Rust brings will be FOR the benefit of hardware devs.

* - There is some cryptography stuff in the kernel (that probably doesn't need to be in there anyway) and that isn't written by hardware devs I suppose.


Rust in itself is easier to understand and reason about than the equivalent kernel C code in my experience.

The actual cognitive cost does not come from Rust itself but from mixing the two memory models and the interop between them, which IME is probably a bad idea in general.

Should just boil the ocean and write the entire LK with Rust only.


In the immortal words of GEOTUS, "Everything woke turns to shit."

This is just a political assault by the wokistan to infiltrate and dominate all aspects of software.

They already started this a long time ago with the "Linux Foundation". Anything with 'foundation' in it is meant to exert political control by those who can't code.

They are very up front about this too - go look at Steve's twitter. He 100% supports violence to obtain control.


I wish someone would rewrite the QNX kernel, which is tiny, in Rust. No one seems to have saved a copy of the sources before they went proprietary with the Blackberry acquisition.

QNX was never open source, so I'm not sure how that would have been legally feasible.

It's a shame that a high-quality microkernel OS like QNX has never been made, though, except as proprietary commercial products. QNX has by most accounts been quite successful, but it's still a niche product.


Fuchsia might do it.

For a few years, after the first acquisition but before the second by Blackberry, the source was online. Not as free software, but you could look at it.

I used to tell the QNX sales rep, "You worry about your product being pirated. Worry more about it being ignored." The major customers are big industrial firms and auto companies, and if they use it, you can find them. Besides, they want the support contract.


This author is very intelligent, well versed and wonderfully dry at the same time as being slightly amusing:

> For all of these references, I give a big "Thank You!!!" to my co-authors.

...

> This pointer is now a "zombie pointer" that has come back from the dead, or that has at least come to have a more entertaining form of invalidity.

...

> Furthermore, even given unanticipated universal acclamation of Rust within the Linux kernel community combined with equally unanticipated advances in C-to-Rust translation capabilities, a significant fraction of the existing tens of millions of lines of Linux-kernel C code will persist for some time to come.

and his parting words, not that he is negative or pessimistic about Rust but rather highlighting some unsolved challenges and their possible solutions:

> In short, choose wisely and be very careful what you wish for! ;-)


I can't remember the last time I went to a Livejournal site. Maybe sometime around 2003?

I wonder how many people in this discussion have actually looked at kernel code.

If you take an average chunk of kernel code, and try to reason about how to rewrite it in Rust, you'll note that the result looks surprisingly like C.

Working with hardware registers, instructions, memory mapping, and trying to do all that with optimal runtime performance puts a lot of constraints on you.

Rust is all about memory safety. Well, what does it mean when the entire memory map is not only accessible to you, but DEFINED by you?

