Hacker Read

wokwokwok · 2021-05-16 00:15:38

If you’re applying arbitrary invariants that aren’t codified in the type system, you may as well just use raw pointers.

Offsets cause bugs when you do things like assume your strings are of a particular length; that’s why they’re a bad choice.

Use a crate that provides a safe abstraction rather than rolling it by hand. The safe abstraction can be implemented in whatever way you want, but it’s not safe if it doesn’t enforce the invariants.

uxcn | karma 235 | avg karma 0.85 · | 2015-09-14 17:54:18+00:00

I was recently bitten using a technique similar to the original code for an intrusive type in C. Using manual, non-typesafe raw offsets can definitely lead to nasty bugs.

Automating the calculation of offsets and assigning pointers definitely eliminates a lot of potential bugs, but I do wonder if this isn't a bigger deficiency in C++. Why force storing an extra pointer per array? I am still not sure why C++ doesn't allow FAMs [1].

There's probably a question of which would be more efficient, storing pointers in the object and calculating the offset from the pointer, or dynamically calculating the offset within the object. Still, it doesn't seem like a problem developers should need to solve.

[1] https://en.wikipedia.org/wiki/Flexible_array_member

skohan | karma 9090 | avg karma 3.19 · | 2021-09-01 07:47:32

Yeah, for me it's more about the ergonomics (and imprecision) of using indexes.

It's a perfectly workable approach, but passing around offsets feels like I'm breaking the contract a bit. The compiler doesn't know which index goes with which data structure, I'm just asking the compiler to trust me that I'm pairing them correctly.
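
One partial mitigation for the pairing problem, sketched in Rust under some assumptions: wrap each kind of index in its own newtype so the compiler rejects an edge index where a node index is expected. The names `NodeId`, `EdgeId`, and `Graph` are hypothetical and exist only for this illustration. The sketch does not stop an index from one `Graph` being used with another `Graph`, so lookups still return `Option`.

```rust
/// Index into `Graph::nodes`. Hypothetical newtype for illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct NodeId(usize);

/// Index into `Graph::edges`. A distinct type, so it cannot be passed as a `NodeId`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct EdgeId(usize);

#[derive(Default)]
struct Graph {
    nodes: Vec<String>,
    edges: Vec<(NodeId, NodeId)>,
}

impl Graph {
    /// Stores a node and returns the typed index that refers to it.
    fn add_node(&mut self, name: &str) -> NodeId {
        self.nodes.push(name.to_string());
        NodeId(self.nodes.len() - 1)
    }

    /// Stores an edge between two nodes and returns its typed index.
    fn add_edge(&mut self, from: NodeId, to: NodeId) -> EdgeId {
        self.edges.push((from, to));
        EdgeId(self.edges.len() - 1)
    }

    /// Bounds-checked lookup: an index from a different graph may be out of range.
    fn node(&self, id: NodeId) -> Option<&str> {
        self.nodes.get(id.0).map(String::as_str)
    }

    /// Bounds-checked lookup for edges.
    fn edge(&self, id: EdgeId) -> Option<(NodeId, NodeId)> {
        self.edges.get(id.0).copied()
    }
}
```

With this in place, a call such as `g.node(e)` where `e` is an `EdgeId` fails to compile, which covers the "which index goes with which data structure" half of the concern but not the extra boilerplate.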

Also it tends to be more boilerplate than just having a direct pointer stored inside the data structure.

Don't get me wrong, I understand all the problems with pointers, but the UX is better in a lot of cases.

coldacid | karma 1243 | avg karma 1.59 · | 2022-09-29 15:22:12

Offsets are important in a programming language where you are regularly dealing with memory addresses, not nearly so much in one where memory access is almost completely abstracted away.

gpderetta | karma 12081 | avg karma 1.83 · | 2021-04-07 20:05:07+00:00

There is no problem using raw pointers for non owning pointers. Also a lot of safe abstractions can be built on top of raw pointers (and smart pointers are obviously an example).

Pragmatism beats dogmatism in practice.

tedunangst | karma 26000 | avg karma 2.74 · | 2010-12-12 19:41:37+00:00

I don't like offsets because I don't think they are that straightforward. First, the offset is useless without the base pointer. So that either needs to be global or you need to store that too (and now you have fat pointers). But the base can change, so you really need to store a reference to it. Yuck.

And you still can't delete anything when using offsets. Forgetting about random deletions (which are important), an array works for a stack, but not a FIFO, and FIFOs are at least as common.

IshKebab | karma 13023 | avg karma 1.29 · | 2023-05-30 01:11:49

The offset addressing modes all start from zero. You could hack around it in some places by storing the pointer to an element before but that's clearly going against how it is intended to be used.

For example if you have an array that you cast between different sized types (e.g. uint64 to uint8) then you have to change the pointer value!

ithkuil | karma 7515 | avg karma 2.06 · | 2021-01-23 15:14:57+00:00

If you use offsets instead of pointers you're doing relocations "on the fly"

vvanders | karma 14177 | avg karma 4.33 · | 2018-02-23 22:50:18+00:00

If your statics have containers (yes, I know it's a bad idea, but it's not forbidden) then you can easily have invalid pointers with that approach.

Really you're going to have to go through so many gyrations that your GC code may start looking a lot like manual memory management.

amluto | karma 19119 | avg karma 3.85 · | 2023-05-13 08:41:13

> Others don't as it breaks things like mmap of prebuilt hashtables.

Can you elaborate? An mmap of prebuilt hash tables doesn’t work well in practice if the mmapped area contains pointers, regardless of provenance, and an mmap of a hashtable that uses integer offsets doesn’t involve pointers.

The only real issue I see is if the mmap contains objects but isn’t itself laid out like an object in the language in question, and you need to generate a pointer to one of those objects. (So mmapping an array of structs that don’t contain pointers is fine, but mmapping a mess that contains integer offsets referencing various things in the mmap that don’t nicely line up like an array is harder.)

But I imagine that a pointer provenance system could have an operation that takes as input an mmap, an offset and a type and returns a pointer to the object with the type in question at the offset in question. It would check that the type makes sense (no pointers!) and could, if needed for the degree of safety required, also check for invalid aliasing.
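
A rough sketch of what such an operation could look like in Rust, with everything here being an assumption rather than an existing API: `PlainData` is a hypothetical marker trait standing in for the "no pointers, any bit pattern is valid" check, `view_at` is a hypothetical helper, and a plain byte slice (wrapped in `Aligned` to get a known alignment) stands in for the mmapped region. It checks bounds and alignment but does not attempt the aliasing check, and it assumes nothing else writes to the region while the reference is alive.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical marker: the type holds no pointers and every bit pattern is valid.
/// Unsafe to implement because the compiler cannot verify that promise.
unsafe trait PlainData: Copy {}
unsafe impl PlainData for u32 {}
unsafe impl PlainData for u64 {}

/// Example record as it might be laid out inside the mapped file.
#[derive(Clone, Copy)]
#[repr(C)]
struct Entry {
    key: u32,
    value: u32,
}
unsafe impl PlainData for Entry {}

/// Stand-in for an mmapped region with a known 8-byte alignment.
#[repr(C, align(8))]
struct Aligned([u8; 16]);

/// Returns a reference to a `T` at `offset` bytes into `region`,
/// or `None` if the object would be out of bounds or misaligned.
fn view_at<T: PlainData>(region: &[u8], offset: usize) -> Option<&T> {
    let end = offset.checked_add(size_of::<T>())?;
    let bytes = region.get(offset..end)?; // bounds check
    let ptr = bytes.as_ptr();
    if (ptr as usize) % align_of::<T>() != 0 {
        return None; // alignment check
    }
    // SAFETY: the range is in bounds and aligned, `T: PlainData` promises that
    // any bit pattern is a valid `T`, and the result borrows from `region`.
    Some(unsafe { &*(ptr as *const T) })
}
```

The returned reference borrows from the region, so it cannot outlive the mapping in this sketch; a real mmap shared with other processes would need stronger guarantees than a `&[u8]` provides.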

95014_refugee | karma 459 | avg karma 2.22 · | 2024-03-21 20:20:43

You... haven't worked on a large codebase, have you?

The part that baffles me about this entire post is that it's trivial to obtain pointers to member functions legally, without the fragility associated with guessing VTable offsets.

einpoklum | karma 3425 | avg karma 0.93 · | 2021-06-22 22:53:26+00:00

You can also use an offset instead of a pointer (or a pointer to the object and an offset into its heap storage, or whatever).

gpderetta | karma 12081 | avg karma 1.83 · | 2024-03-12 19:19:55

Well, the point is safety for existing code. If you can annotate pointers to match them with their bounds, you can just as easily replace them with a span and avoid needing compiler heroics.

Edit: unless you absolutely need the change to be ABI stable, but even then there are ways around that.

Const-me | karma 5797 | avg karma 1.88 · | 2016-06-25 16:37:37

About raw pointers: no, you can’t do the same as in C++. There’s no malloc/free, and placement new is still unstable. Also they feel foreign to the language, and sometimes I can’t even get them from the standard collections (like when I need 2 mutable pointers to different elements).

About the safety, it’s important to understand there’s a price. In some cases, it makes simple things hard to implement, or costs performance.
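
For the slice case specifically, two mutable references to different elements can be obtained in safe Rust through the standard library's `split_at_mut`. The following is a minimal sketch; `two_mut` is a hypothetical helper name, not a standard function, and it only covers slices (and anything that derefs to one, such as `Vec`), not every standard collection.

```rust
/// Returns mutable references to the elements at `i` and `j`, in that order,
/// or `None` if the indices are equal or out of bounds.
fn two_mut<T>(items: &mut [T], i: usize, j: usize) -> Option<(&mut T, &mut T)> {
    if i == j || i >= items.len() || j >= items.len() {
        return None;
    }
    if i < j {
        // `left` covers [0, j) and `right` starts at j, so they cannot overlap.
        let (left, right) = items.split_at_mut(j);
        Some((&mut left[i], &mut right[0]))
    } else {
        // `left` covers [0, i) and `right` starts at i.
        let (left, right) = items.split_at_mut(i);
        Some((&mut right[0], &mut left[j]))
    }
}
```

The cost is the kind of workaround described here: an extra helper and a runtime check for `i == j`, in exchange for the borrow checker accepting the two references.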

For example, there’re algorithms out there that modify items of the same collection at the same time, both single threaded like sorting, and parallel.

Can such an algorithm cause a data race? Yes.

Is it good the compiler tells me about that? Sure.

Is it good the compiler prevents me from doing that at all, and forces me to code some workarounds, that will cost me time to implement, will be ugly, and/or will sacrifice runtime performance? Not sure, I like having a choice.

hinkley | karma 39933 | avg karma 2.46 · | 2022-06-30 11:28:16

Those change the type annotations for a memory location. Also if you tell me there are no concurrency gotchas for this then I’ll tell you that you have a single threaded interpreter or you’re lying.

We are talking about retaining the type and swapping the pointers from an independent object to a set of offsets into a struct of arrays.

lordnacho | karma 27890 | avg karma 4.8 · | 2015-06-02 12:06:50+00:00

What about using smart pointers? There's a whole class of potential errors related to using raw pointers.

xfer | karma 1113 | avg karma 1.61 · | 2021-03-18 14:47:16+00:00

Are you talking about Go-style explicit pointers? I don't know if it is worth adding them with different semantics and dealing with nulls when you can have value types with an attribute. Pointers as an abstraction also leak implementation details.

shultays | karma 1649 | avg karma 1.84 · | 2022-11-28 09:50:02

I would be surprised if C guarantees that those x and y have the same offsets.

Plus you can't cast a pointer of one type to the other and dereference it. That would be UB.

AstralStorm | karma 5566 | avg karma 0.91 · | 2018-03-07 18:00:41+00:00

I suppose you shouldn't be using pointers if not required, not even smart(er) ones.

They are not free and prevent certain kinds of optimization as well as allow accidentally sharing state.

afc | karma 553 | avg karma 4.01 · | 2023-12-01 02:42:34

And when he manages to show that smart pointers can be misused and lead to obscure crashes, what's his solution? Just use raw pointers.

Those can't be misused ... right?
