
> You are allowed to dereference a null pointer

Can you cite me something for that? The very first mention of dereferencing in the C++03 standard - ISO/IEC 14882:2003 1.9 Program execution ¶ 4 (page 5) - would seem to disagree:

> Certain other operations are described in this International Standard as undefined (for example, the effect of dereferencing the null pointer). [Note: this International Standard imposes no requirements on the behavior of programs that contain undefined behavior. ]

EDIT: As does 8.3.2 References ¶ 4 (page 136):

> [...] [Note: in particular, a null reference cannot exist in a well-defined program, because the only way to create such a reference would be to bind it to the “object” obtained by dereferencing a null pointer, which causes undefined behavior. As described in 9.6, a reference cannot be bound directly to a bit-field. ]




The standard doesn't say dereferencing a null pointer is invalid. The fact that it gives it as an example of undefined behaviour is a defect in the standard. In the discussion on core language issue #232, the intent has been explicitly stated:

> Notes from the October 2003 meeting:

> ...

> We agreed that the approach in the standard seems okay: p = 0; *p; is not inherently an error. An lvalue-to-rvalue conversion would give it undefined behavior.

http://open-std.org/JTC1/SC22/WG21/docs/cwg_active.html#232

WRT your edit: no, that says dereferencing a null pointer and binding a reference to the result produces undefined behaviour. That agrees with what I was saying. "which" refers to the whole of "to bind it to the "object" obtained by dereferencing a null pointer", not just to "dereferencing a null pointer".


I guess it counts as dereferencing. Here is what the standard says:

"A reference shall be initialized to refer to a valid object or function. Note: in particular, a null reference cannot exist in a well-defined program, because the only way to create such a reference would be to bind it to the “object” obtained by dereferencing a null pointer, which causes undefined behavior"


It is a bug, but it's a bug in the spec. Saying that a common mistake like dereferencing the null pointer is undefined and therefore your program can do anything is not useful behavior. The only sane design is for any attempt to dereference the null pointer to cause the program to signal an error somehow. Exactly how that happens can be left unspecified, but that it must happen cannot be unspecified in a sane design. I don't see how any reasonable person could possibly dispute this.

> could you or him point me to any page in the C99 standard document where it is explicitly stated along the lines that the compiler "should assume" the pointer to be not NULL after an undefined dereferencing in line

You won't find a line that states it explicitly, but the standard does allow it.

If the pointer is null, de-referencing it invokes undefined behavior, so the program is allowed to do literally anything. That includes doing whatever it would have done if the pointer wasn't null. So the compiler is allowed to assume that the pointer is not null.
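
To make that concrete, here is a minimal sketch (the function names are made up for illustration). In `deref_then_check` the load happens before the test, so a compiler may reason that `p` cannot be null on any execution with defined behavior and drop the check; whether a given compiler actually does so depends on the compiler and its flags. In `check_then_deref` the test comes first, so it has to stay.

```cpp
#include <cassert>

// Hypothetical example: the dereference comes first, so on every execution
// with defined behavior p is already known to be non-null at the check.
// An optimizer is permitted (not required) to delete the check.
int deref_then_check(int* p) {
    int value = *p;          // undefined behavior if p is null
    if (p == nullptr) {      // may be treated as unreachable and removed
        return -1;
    }
    return value;
}

// The check comes first, so it cannot be assumed away.
int check_then_deref(int* p) {
    if (p == nullptr) {
        return -1;
    }
    return *p;               // only reached when p is non-null
}

// Usage that stays within defined behavior.
void demo_null_check_order() {
    int x = 42;
    assert(deref_then_check(&x) == 42);
    assert(check_then_deref(&x) == 42);
    assert(check_then_deref(nullptr) == -1);
}
```

Note that the demo never passes a null pointer to `deref_then_check`; doing so would be exactly the undefined case under discussion.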


I have a clarification: Dereferencing a null pointer in C++ doesn’t reliably crash anymore, unfortunately, and if you still believe it does, then I do not want to run your C++ code. I assume that the author understands this and is only trying to simplify. My problem is that this is already such a widespread and dangerous myth that I’m sad to see it perpetuated in an otherwise great article.

For anyone who’s wondering, I’m referring to “UB” here (short for Undefined Behavior; don’t be misled by the everyday English meaning, it’s a precise technical term in the spec). Skipping the details, there’s a surprising (and growing) number of situations where a null pointer access already leads to silent incorrect code execution instead of a crash, with standard compilers and CPUs. C++ programmers need to deal with that on any platform. As I’m sure the author is aware, what the WASM compilers do here is well within the spec.


> The problem is that dereferencing a null pointer does not actually have undefined semantics, it has system defined semantics. The compiler should compile source code in such a way as to produce machine code that does whatever that system does when a null pointer is dereferenced.

Null pointer dereference is just a special case of invalid pointer dereference. And that does not have consistent results anyways (may segfault or may just return garbage)


> The argument comes down to when the undefined behavior occurs: is it at the dereference to create the reference, or is it on the first memory access using the reference? The language pedants will say the former, but in practice, it's the latter.

The undefined behavior always occurs when the null reference is created. The issue will _usually_ manifest when you first access memory through the reference, but the undefined behavior was creating the null reference in the first place.


How do C++ references not protect you from NULL ?

From the standard:

> A reference shall be initialized to refer to a valid object or function. [Note: in particular, a null reference cannot exist in a well-defined program, because the only way to create such a reference would be to bind it to the “object” obtained by dereferencing a null pointer, which causes undefined behavior. As described in 9.6, a reference cannot be bound directly to a bit-field. ]
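
A minimal sketch of how to stay on the well-defined side when all you have is a pointer: test it before binding the reference. `checked_ref` is a hypothetical helper, not something from the standard library, and throwing `std::invalid_argument` is just one possible policy (aborting or returning an error would do as well).

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical helper: test the pointer before forming the reference,
// so a null pointer is never dereferenced.
int& checked_ref(int* p) {
    if (p == nullptr) {
        throw std::invalid_argument("checked_ref: null pointer");
    }
    return *p;
}

// Usage: a valid pointer yields a usable reference, a null one throws.
void demo_checked_ref() {
    int value = 5;
    int& r = checked_ref(&value);
    r = 6;
    assert(value == 6);

    bool threw = false;
    try {
        checked_ref(nullptr);
    } catch (const std::invalid_argument&) {
        threw = true;
    }
    assert(threw);
}
```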


> Dereferencing NULL, in a C program, is ALWAYS undefined behavior.

1. you're not required to use C

2. an extension to the C standard can decide to define that behaviour


That depends though. In Java/C# dereferencing a null reference is a runtime error and you can pinpoint exactly where it happened. In C++/Rust, dereferencing null pointers is undefined behavior since the compiler makes assumptions that a dereferenced pointer cannot be null.

Yep. :-)

The article doesn't make much sense at times: Dereferencing a NULL Pointer: >>>contrary to popular belief<<<, dereferencing a null pointer in C is undefined.

I guess C programmers are not counted in the "popular" group.


> Suppose we want to make division by zero and null pointer dereference defined.

A good example is WebAssembly*—address 0x00000000 is a perfectly fine and well-defined address in linear memory. In practice though, most code you’ll come across targeting WebAssembly treats it as if dereferencing it is undefined behavior.

* Of course WebAssembly is a compiler target rather than a language, but it serves as a good example of the point you’re making.


> The problem is that dereferencing a null pointer does not actually have undefined semantics, it has system defined semantics.

No it does not. Look in the standard.

Seriously though, I get what you mean but what is the point of having a systems-defined NULL dereference? The semantics of the NULL pointer (in C) is such that you are not meant to dereference it. If you dereference it, you have a bug no matter what the manifestation on a specific system is.

And there are probably actual systems where it would be hard to make the behaviour defined. For example, on MMU-less systems you can dereference byte 0, but it might not be statically clear what kind of value is stored there.

Maybe C allows a systems-defined behaviour to be implemented as undefined behaviour? But then the distinction is kinda moot.


Better to try it than to read standards (and inevitably misread them, or read them differently than the compiler authors do)...

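    #include <stdio.h>

    // Takes a reference; nothing is read through x until the second printf.
    void test(int& x)
    {
        printf("Hello, world\n"); fflush(stdout);
        // First actual read through the reference.
        printf("and the number is: %d\n", x);
    }
    int main(void)
    {
        int *x = NULL;
        // Undefined behavior per the standard: binds a reference to *x with x null.
        test(*x);
        return 0;
    }

It's just a syntactic discipline. Null references are undefined in C++, just as NULL dereferences are undefined in C.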

I'm sorry, but that is not what the current standard, no matter its weaknesses, requires. The standard says that dereferencing a null pointer results in undefined behavior. The compiler writer chose to (a) deduce that the pointer dereference was not an oversight and (b) then produce a pointless "optimization" that compounded the error. Compiling the dereference to a move instruction is perfectly compatible with the standard.

> NULL is a handy way to enforce that fundamentally, in that dereferencing a null pointer guarantees a crash

In the context of C and C++, dereferencing a null pointer does not guarantee a crash, or anything else for that matter. In fact, not only does it not give you any guarantees, it nullifies any guarantees you might otherwise have had, because the behaviour of a program that dereferences a null pointer is undefined. Now, an implementation is of course free to make guarantees about programs that have undefined behaviour, but none I know of does.

> Part of programming is about memory management. You can hide it with a sophisticated compiler but it's always going to be there, and there are always going to be two types of memory: that which is available, and that which is not.

This is not true. It is perfectly possible to write useful programs that never have to deal with unavailable memory or even memory management at all past compile time, even in C. This is quite common (as are implementations that don't crash programs which dereference null pointers) when working with embedded systems, for example.


Dereferencing a null pointer and then immediately "undoing" it by taking its address is actually legal, I believe. I think the undefined behavior here is the member access instead of the magic sequence &* which is supposed to cancel out.

Dereferencing a null pointer to convert it to a reference causes undefined behavior; there's nothing safe about it!

"Note: in particular, a null reference cannot exist in a well-defined program, because the only way to create such a reference would be to bind it to the “object” obtained by dereferencing a null pointer, which causes undefined behavior."


> Dereferencing a runtime-known NULL pointer is UB in the standard, but is well defined to trigger a segfault in pretty much every arch with an MMU.

This is incorrect for C/C++ though. Modern compilers definitely treat null dereferences as UB, with real consequences (e.g. eliminating redundant null pointer checks). The compiler is part of the architecture.
