Hacker Read

volta83 | karma 1480 | avg karma 3.42 · 2021-10-11 09:55:38

In C++, you have `std::optional<T>`. That's a type that either contains a `T` or contains nothing.

In C++, sizeof(optional<T>) > sizeof(T) because the discriminant has to be stored somewhere.

This is true even if, e.g., you do something like `optional<T&>`. You know that a T& is a non-null pointer, that there are only two variants, and that one of the variants has no state, so you could technically encode 0x0 as the "no reference" variant and any value != 0x0 as the reference variant, giving `optional<T&>` the same size as `T&`.

Rust does these layout optimizations of compressing the discriminant into gaps in the values of discriminated unions automatically.

So:

    enum Option<T> {
        Some(T),
        None
    }

has the same size as `&T` when instantiated as `Option<&T>` in Rust, as opposed to C++.
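The size claim is easy to check with `std::mem::size_of`. The following is a minimal sketch; the helper name `reference_sizes` and the choice of `u64` as the pointee are made up for illustration.

```rust
use std::mem::size_of;

/// Returns (size of `&u64`, size of `Option<&u64>`).
/// `reference_sizes` is a hypothetical helper used only in this sketch.
fn reference_sizes() -> (usize, usize) {
    // A reference is never null, so `None` can be stored as the null
    // pointer and no separate discriminant is needed.
    (size_of::<&u64>(), size_of::<Option<&u64>>())
}
```

Both numbers should come out equal to the pointer size of the target. The standard library documents the same guarantee for `Option<Box<T>>`.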

In C, an example would be:

    #include <stdbool.h>

    struct DU {
        enum { A, B } discriminant;
        union {          /* anonymous union, C11 */
            bool A;
            bool B;
        };
    };

You could encode that in 2 bits (one for the discriminant, one for the bool), i.e. have sizeof(DU) == 1, but instead you'll have at least sizeof(DU) == 2, because you need one byte for the discriminant and one byte for the payload (and typically more, since the enum is usually int-sized).

It is very easy to create values with gaps in Rust, but C doesn't really support doing this.
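As a rough illustration of "values with gaps", the sketch below compares a few payload types with and without unused bit patterns. The helper name `niche_sizes` is hypothetical. Only the `NonZeroU32` case is a documented guarantee; the `bool` result is what current compilers produce, not something the language promises.

```rust
use std::mem::size_of;
use std::num::NonZeroU32;

/// Returns (size of T, size of Option<T>) for three payload types.
/// `niche_sizes` is a hypothetical helper used only in this sketch.
fn niche_sizes() -> [(usize, usize); 3] {
    [
        // `bool` only uses the byte values 0 and 1, so spare values are
        // available to represent `None`.
        (size_of::<bool>(), size_of::<Option<bool>>()),
        // `NonZeroU32` never holds 0, so 0 can stand for `None`.
        (size_of::<NonZeroU32>(), size_of::<Option<NonZeroU32>>()),
        // `u8` uses all 256 bit patterns, so a separate discriminant
        // byte is required.
        (size_of::<u8>(), size_of::<Option<u8>>()),
    ]
}
```

The first two pairs should show no size increase from wrapping in `Option`, while the `u8` case grows because there is no gap to reuse.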

Another optimization is reordering fields to reduce alignment padding. In C, if you write:

    struct S {
        uint8_t a;
        uint32_t b;
        uint8_t c;
    };

that ends up being 12 bytes long. In Rust, the default representation lets the compiler reorder the fields, in practice as u32, u8, u8, so the struct ends up being only 8 bytes.

If you want that instead to be laid out like in C, you can write:

    #[repr(C)]
    struct S {
        a: u8,
        b: u32,
        c: u8,
    }

and then you get the same 12 bytes as in C. Many `repr(...)` options are supported for algebraic data types; e.g., you can use repr(u32) on a Rust enum equivalent to DU above to store the discriminant in a u32 and get the same layout as DU in C.
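A small sketch for comparing these layouts follows. The type names `Reordered`, `DeclarationOrder`, and `Du`, and the helper `layout_sizes`, are invented for this example. It assumes a typical target where `u32` has 4-byte alignment. The default layout is unspecified, so the first number is whatever the compiler chooses.

```rust
use std::mem::size_of;

// Default representation: the compiler is free to reorder the fields.
#[allow(dead_code)]
struct Reordered {
    a: u8,
    b: u32,
    c: u8,
}

// C representation: declaration order is kept, with C's padding rules.
#[allow(dead_code)]
#[repr(C)]
struct DeclarationOrder {
    a: u8,
    b: u32,
    c: u8,
}

// Enum with an explicit u32 discriminant, mirroring the C `DU` struct.
#[allow(dead_code)]
#[repr(u32)]
enum Du {
    A(bool),
    B(bool),
}

/// Returns (default struct size, repr(C) struct size, repr(u32) enum size).
/// `layout_sizes` is a hypothetical helper used only in this sketch.
fn layout_sizes() -> (usize, usize, usize) {
    (
        size_of::<Reordered>(),
        size_of::<DeclarationOrder>(),
        size_of::<Du>(),
    )
}
```

On such a target the `repr(C)` struct is 12 bytes and the `repr(u32)` enum is 8 (a 4-byte tag, a 1-byte payload, and padding). Current compilers typically report 8 for the default-layout struct.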

gpderetta | karma 12081 | avg karma 1.83 · 2021-10-11 10:09:24

Right, C++ has no built-in discriminated union, and std::optional<T&> is actually not valid in C++ (unfortunately).

But discriminant-less custom optional classes with 'zero' type support via traits are easy to write (and I have done it many times). You do not have to rely on the compiler identifying a safe empty state; you can define any application-specific one. For example, if strings are always non-null in one specific (and common) use case, optional<string, non_null_trait> has an obvious implementation.

So, yes, these sorts of optimizations are not done by the compiler, as it has no notion of discriminated types (one day maybe...), but they can be done generically by the programmer. In fact you could in principle optimize multiple layers of variant<optional<...>, ...> and collapse everything into one discriminant, as long as you do not provide reference access to the sub-variants; the required metaprogram is not going to be pretty though.
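To make the trait-based idea concrete, here is a minimal sketch of a discriminant-less optional. It is written in Rust to match the other examples in the thread, not in the C++ the comment describes. The names `Sentinel` and `CompactOption`, and the choice of -1 as the reserved value for `i32`, are all assumptions made for illustration.

```rust
use std::mem::size_of;

/// Hypothetical trait: a type names one of its own values as "empty".
trait Sentinel: PartialEq + Sized {
    fn sentinel() -> Self;
}

/// Optional with no separate discriminant: the sentinel encodes "none".
struct CompactOption<T: Sentinel>(T);

impl<T: Sentinel> CompactOption<T> {
    fn none() -> Self {
        CompactOption(T::sentinel())
    }

    fn some(value: T) -> Self {
        // The reserved value must never be stored as a real payload.
        assert!(value != T::sentinel(), "sentinel is reserved for the empty state");
        CompactOption(value)
    }

    fn get(&self) -> Option<&T> {
        if self.0 == T::sentinel() {
            None
        } else {
            Some(&self.0)
        }
    }
}

// Application-specific assumption: -1 is never a valid value here.
impl Sentinel for i32 {
    fn sentinel() -> Self {
        -1
    }
}

/// Returns (size of CompactOption<i32>, size of Option<i32>).
fn compact_sizes() -> (usize, usize) {
    (size_of::<CompactOption<i32>>(), size_of::<Option<i32>>())
}
```

Because `i32` uses every bit pattern, the built-in `Option<i32>` needs extra space for its discriminant, while the sentinel version stays at 4 bytes. The cost is that the programmer, not the compiler, is responsible for the reserved value never occurring as real data.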

One of the advantages of C++ is that it allows this sort of control.

cogman10 | karma 11442 | avg karma 3.93 · 2021-10-11 13:36:35

> One of the advantages of C++ is that it allows this sort of control.

But I believe what the OP showed is that Rust also allows you to have that level of control. However, Rust's default behavior is to do the optimization that you have to go out of your way to implement manually in C++.

Rust isn't taking away any of the control you have in C++; instead, it has made the idiomatic approach one that the compiler optimizes well.

lenkite | karma 3979 | avg karma 1.77 · 2021-10-11 10:56:46

There are compiler directives for packed structs in C++, e.g. `struct __attribute__((packed))` in GCC and Clang.

steveklabnik | karma 91260 | avg karma 5.08 · 2021-10-11 11:32:42

Rust also has a packed attribute; this is something different. The re-ordering still includes padding, whereas packed removes the padding.
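For comparison, a sketch of the packed variant of the same struct. The names `Packed` and `packed_layout` are made up for this example.

```rust
use std::mem::{align_of, size_of};

// Packed C layout: declaration order is kept and all padding is removed,
// so `b` may sit at an unaligned offset.
#[allow(dead_code)]
#[repr(C, packed)]
struct Packed {
    a: u8,
    b: u32,
    c: u8,
}

/// Returns (size, alignment) of the packed struct.
/// `packed_layout` is a hypothetical helper used only in this sketch.
fn packed_layout() -> (usize, usize) {
    (size_of::<Packed>(), align_of::<Packed>())
}
```

The packed struct is exactly the sum of its field sizes (6 bytes) with alignment 1, whereas reordering only minimizes padding while keeping every field aligned.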

fxtentacle | karma 18712 | avg karma 5.34 · 2021-10-11 13:35:01

Depending on the values in b, you could do bit-packing in C++ and munch it down to just 1 byte.

zozbot234 | karma 19616 | avg karma 2.16 · 2021-10-11 14:43:36

You can only do bitpacking if individual fields of a struct cannot be accessed via reference (or 'borrowed', in Rust terms). While C includes the 'register' storage class, which forbids taking an object's address (C++ deprecated it and removed it in C++17), Rust has no equivalent.
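A hand-rolled version of this in Rust might look like the sketch below. The type name `PackedByte` and the bit widths (3, 3, and 2 bits) are assumptions chosen for illustration; they only work if the application's values really fit in those ranges. The accessors return copies, which is exactly the restriction described above: there is no addressable field to borrow.

```rust
/// Three small fields in one byte: bits 0-2 = a, bits 3-5 = b, bits 6-7 = c.
/// The widths are assumptions made for this sketch.
#[derive(Clone, Copy, Debug, PartialEq)]
struct PackedByte(u8);

impl PackedByte {
    fn new(a: u8, b: u8, c: u8) -> Self {
        assert!(a < 8 && b < 8 && c < 4, "value does not fit its bit width");
        PackedByte(a | (b << 3) | (c << 6))
    }

    // Getters return values, not references: a 3-bit field has no address.
    fn a(self) -> u8 {
        self.0 & 0b111
    }

    fn b(self) -> u8 {
        (self.0 >> 3) & 0b111
    }

    fn c(self) -> u8 {
        self.0 >> 6
    }

    /// Returns a copy with `b` replaced, leaving `a` and `c` untouched.
    fn with_b(self, b: u8) -> Self {
        assert!(b < 8, "value does not fit its bit width");
        PackedByte((self.0 & !(0b111 << 3)) | (b << 3))
    }
}
```

Updates go through methods like `with_b` rather than through a `&mut` to the field, since the field does not occupy whole bytes.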