
> It's the same issue as in C# and many others: someone has to write and support the bindings, which adds cost.

If C was on the table anyway, then you can use bindgen to generate the bindings.

But yeah, safe/native-feeling bindings require someone to go through and annotate everything.
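For illustration, this is roughly what the bindgen route looks like in a build script (a sketch; "wrapper.h" stands in for whatever C header you're binding, and bindgen would be a build-dependency):

    // build.rs -- sketch: generate raw FFI bindings from a C header.
    fn main() {
        let bindings = bindgen::Builder::default()
            .header("wrapper.h") // hypothetical header for the C library
            .generate()
            .expect("failed to generate bindings");

        let out = std::path::PathBuf::from(std::env::var("OUT_DIR").unwrap());
        bindings
            .write_to_file(out.join("bindings.rs"))
            .expect("failed to write bindings");
    }

Everything it emits is still unsafe to call, which is exactly why the hand-annotated, native-feeling layer on top is where the real cost lives.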

> In VM-based languages which don't use unsafe code, vulnerabilities like this are impossible by design:

The VM itself is still going to have unsafe code at some point.




The bindings are safe, not the C++ they bind to

> Some projects pride themselves to not use "unsafe" code

This is normal? Not having unsafe code does not guarantee the absence of bugs, but at least it isolates the problematic bits in the sections marked as unsafe.

If you need unsafe, you can always use unsafe.

Battle-tested codebases like BIND or OpenSSH, and many others regardless of implementation language, have had bugs in the past. It always helps to have extra assurances, and the safe/unsafe split is one more tool for getting them.
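To make the isolation concrete, a minimal sketch: the dangerous operation is confined to one small, greppable block, and everything outside it stays under the compiler's full checking.

    fn main() {
        let x = 42u32;
        let p: *const u32 = &x;
        // SAFETY: `p` points at `x`, which is alive and initialized here.
        let y = unsafe { *p };
        assert_eq!(y, 42);
    }

An auditor can grep for `unsafe` and review those blocks first, instead of treating the whole program as suspect.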


> The proof of this is that several core concepts that are considered "safe" have "unsafe" portions that make them work

I think this is actually a common pattern in many different domains. For example, the OS kernel's virtual memory subsystem does many things with "unsafe" primitives -- it has direct access to page tables and can map anything anywhere -- yet it provides a "safe" abstraction of isolated process address spaces.

The way I think about it is that you have to define basic building blocks somewhere. It's not reasonable to build a static analysis that understands the N different varieties of smart pointers, and refcounting, and dynamically-checked mutability (RefCell), and custom allocators (TypedArena), and all that. It's much more elegant to separate concerns, have only raw pointers and lifetimes/borrowing built-in, and put those pieces together in "blessed" ways with all the unsafe code in one place in the library. If you try to build that understanding into the compiler instead, you're just moving the same unsafe-has-to-be-correct algorithm down one level, and unnecessarily complicating things.
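The canonical small example of that layering is something like the standard library's split_at_mut: an unchecked primitive wrapped behind a safe signature, with the invariant that justifies it established up front. A sketch modeled on the real thing:

    // Safe interface, unsafe interior: the function itself upholds the
    // invariant (two disjoint halves of one allocation), so callers can
    // never create aliasing mutable references through it.
    fn split_at_mut(slice: &mut [i32], mid: usize) -> (&mut [i32], &mut [i32]) {
        let len = slice.len();
        let ptr = slice.as_mut_ptr();
        assert!(mid <= len);
        // SAFETY: [0, mid) and [mid, len) do not overlap, and both lie
        // within the original borrow.
        unsafe {
            (
                std::slice::from_raw_parts_mut(ptr, mid),
                std::slice::from_raw_parts_mut(ptr.add(mid), len - mid),
            )
        }
    }

The borrow checker alone can't see that the two halves are disjoint; the unsafe block encodes that one fact in one place, and everything built on top inherits the guarantee.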


> many other compiled languages will do with easier coding

At the expense of safety.


> How about just writing safe code

This is a bad argument -- our tools shouldn't make the "easy path" one that's littered with subtle bugs! It should be easy to do the right thing. See: hand-coded PHP apps that routinely led to SQL injection versus Python's DB-API, which makes those mistakes harder; Signal versus some PGP GUI; etc. People make mistakes when their tools make it easy to do the wrong thing.

I have the same complaint against much of the standard C library too, but for a language that the user interacts with on a daily basis, this is unacceptable.


> Nope, you can very much make them safer. It might not make them the "safest" it can be, but it can totally make them safer.

This one might come down to conflicting definitions of "safe". The author seems to define safety purely in relation to the number of language features with which you can contrive to shoot yourself in the foot. You seem to also take into account the number of language features you can use to make it harder to shoot yourself in the foot.

I think there's some value in both definitions. The author's perspective captures a lot of what worries me about languages like C++. But it also fails to give C++ any credit for all the language improvements that have materially improved the overall quality of most C++ code over the past couple decades.


> 'Safe' programming languages will improve security, presumably at the cost of usability through decreased performance. A bit like how a Ferrari is faster than a Volvo, but not as safe.

This is fundamentally untrue. Safety of the kind described in the article does not come at the cost of performance -- all type checks, invariant conditions, and general formal proofs of correctness are determined at compile time. The produced code is indistinguishable from a correctly written, efficient C program. You are allowed to play as much as you like with direct memory accesses and use all the dirty tricks you want, as long as you can prove that the resulting program is formally correct.

What could be argued is that the price you pay for all this is the difficulty of writing such programs. But definitely not their performance.
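A small illustration of that trade in Rust (a sketch, not a recommendation): you can drop down to unchecked memory accesses as long as you establish the invariant yourself, and what you pay is the burden of the proof, not runtime checks.

    // The bounds proof is established once by the assert; the hot loop
    // then uses an unchecked access, so no per-iteration bounds check
    // remains.
    fn sum_first_n(xs: &[u64], n: usize) -> u64 {
        assert!(n <= xs.len()); // the one-time proof obligation
        let mut total = 0u64;
        for i in 0..n {
            // SAFETY: i < n <= xs.len(), guaranteed by the assert above.
            total = total.wrapping_add(unsafe { *xs.get_unchecked(i) });
        }
        total
    }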


Aaah, that's what you meant by unsafe. I thought you meant that using a native code call was technically unsafe. Thanks for the clarification. (and that actually sounds pretty awesome for game development, I'll have to look into it)

"But I disagree that it's the responsibility of the language to keep potentially dangerous tools out of the hands of developers."

That is pretty much entirely the point of higher-level programming languages.

Like preventing you from allocating and freeing memory on your own, because you might screw it up.

Or removing pointer arithmetic.

Or reducing the scope of mutability.

Or preventing access to "private" object variables.

Many programming language features are basically guards to make it less likely you'll cut your hand off.
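Two of those guards in one small Rust sketch (names are illustrative): immutability by default, and private fields that can't be reached from outside their module.

    mod counter {
        pub struct Counter {
            value: u32, // private field: invariants can't be bypassed
        }
        impl Counter {
            pub fn new() -> Self { Counter { value: 0 } }
            pub fn increment(&mut self) { self.value += 1; } // needs &mut
            pub fn value(&self) -> u32 { self.value }
        }
    }

    fn main() {
        let mut c = counter::Counter::new();
        c.increment();
        // c.value = 999; // rejected at compile time: `value` is private
        assert_eq!(c.value(), 1);
    }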


So is it possible to write safe software in a native/unsafe/unmanaged language if you know what you're doing? Wow, that's exactly what I tell people, but it isn't what you read on the first page of any number of Java programming books (of course, those books tell you that you must do manual resource management in C or C++, which is not, in fact, true; e.g., http://www.apachetutor.org/dev/pools and http://www.artima.com/intv/modern3.html ).

And, really, who wants to use a language designed for people who don't know what they're doing?

I'm not saying that it's impossible to write a JVM in a managed language. Just that it's funny to hear people talk about native languages with a hint of fear, and then run their "safe" programs on a VM that is implemented with a half million lines of C and C++.


More generally, I don't understand this argument. Assuming you can trust the C compiler (big if, but at least some validated (large subset of) C compilers exist; see CompCert), I don't get why this would be worse than generating machine code in a safe language.

C# actually has unsafe too. And it's pretty common in both of these languages for there to be C libraries involved somewhere in the stack.

It doesn't follow that anything but the lowest-level wrappers that call into the OS APIs have to be unsafe. Those wrappers can be a tiny fraction of the runtime code.

>It is a great language but there is very little you can do safely.

Even extremely performance-oriented and low-level projects such as the standard library or the Redox kernel are less than 5% unsafe code. There's lots you can do safely.


"If C requires me to only code in certain styles and only use certain tools to only get some increase in safety... Why not just use a language that builds safety in from the beginning?"

What language actually meets the need? GC'd languages haven't proven themselves useful for things like OSs, databases, or libraries that are deployed extremely widely in many environments.

So what safe, non-GC language are you advocating?


I totally agree. That's the kind of complexity and BS that developers often have to deal with, and it's what I'm referring to. Hard to imagine an unsafe-by-default language not breaking eventually.

Pretty much every safe language has unsafe features though, whether by calling out to C or via something like sun.misc.Unsafe in Java.
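The C escape hatch is usually just a declaration away; a minimal Rust sketch calling libc's strlen (the compiler can't verify anything about the foreign side, hence the mandatory unsafe):

    use std::ffi::CString;
    use std::os::raw::c_char;

    // Nothing about this declaration is checked by the compiler, so
    // every call site must be marked `unsafe`.
    extern "C" {
        fn strlen(s: *const c_char) -> usize;
    }

    fn main() {
        let s = CString::new("hello").unwrap();
        // SAFETY: `s` is a valid NUL-terminated string that outlives the call.
        let len = unsafe { strlen(s.as_ptr()) };
        assert_eq!(len, 5);
    }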

> You can always smuggle unsafe past the compiler, it can't stop you even in principle.

The linked updated library uses a different method: it literally smuggles the "unsafe" keyword past the safety checks by removing the space character from "un safe".

This can and should be caught by the compiler -- it has full access to the syntax tree at every intermediate stage of compilation! Instead, the Cargo tool and the rustc compiler are simply keyword-searching for the literal string "unsafe", and are hence easily fooled.

Note that this updated method is not the same thing as the Linux process memory mapping and doesn't rely on OS APIs in any way. It is a purely compile-time hack, not a runtime one.

What I'd love to see is a healthy ecosystem of open-source packages that are truly safe: using a safe language, without any escape hatches, etc.

E.g.: Libraries for decoding new image formats like JPEG XL ought to be 100% safe, but they're not! This hampers adoption, because browsers won't import huge chunks of potentially dangerous code.

Now we can't have HDR photos on the web because of this specific issue: The Chromium team removed libjxl from the code base because of legitimate security concerns. A million other things are also going to be "unsupported" for a long time because of the perceived (and real!) dangers of importing third-party dependencies.

We'll be stuck with JPG, GIF, and PNG forever because there is no easy way to share code that is truly safe in the "pure function with no side-effects" sense.

PS: This type of issue is also the root cause of problems like Log4j and the various XML-decoder vulnerabilities. By default, most languages allow arbitrary side effects even in libraries that ought to perform "pure transformations" of one data format into another -- reaching out to LDAP servers over TCP/IP sockets, as in the case of Log4j. Similarly, XML decoders by default use HTTP to retrieve referenced files, which is madness. Even "modern" formats like YAML make this mistake, with external file references on by default!

> What you probably want is a runtime VM, like WASM.

Sure, that's one way to sandbox applications, but in principle it ought to be entirely possible to have a totally safe ahead-of-time compiled language and/or standard library.

Rust is really close to this goal, but falls just short because of tricks like this macro backdoor. (It would also need a safe subset of the standard library.)


So why is `unsafe` even in the language, then? It would seem to me that it simultaneously undermines safety guarantees, and it can't even ensure performance improvements.
