> In both cases, the compiler supposedly looks up the name on the right to figure out which struct type you intended. So foo->bar if foo is an integer would do something like ((Foo)foo)->bar where Foo is a struct that contains a member named bar, and foo.bar would be like ((Foo)&foo)->bar. The text doesn’t say how ambiguity is resolved (i.e., if there are multiple structs that have a member with the given name).
This is because in pre-ANSI C the names of struct fields were not internal to their enclosing struct but global identifiers that simply represented a type and offset. There was no ambiguity because different structs could not have the same member names. As raimue pointed out this is why some *nix struct members are still prefixed with a namespace like st_ for struct stat.
By the way, in the original C language by Dennis Ritchie, struct member names were in a kind of global namespace of all such names. So for instance if we had a
struct stat { ... int st_ino; ... }
struct dirent { ... int d_ino; ... } d;
It was possible to access:
d.st_ino;
The compiler would just use the st_ino offset from struct stat, and generate an access into the struct dirent d.
This is the main reason members have funny prefixes in Unix structures. Once upon a time they had to, to prevent clashes.
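To make the "name is just a type and an offset" idea concrete, here is a rough emulation in modern C. The struct layouts and the helper `read_at_st_ino` are made up for illustration; this is a sketch of the behaviour described above, not how any historical compiler was implemented.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical layouts, loosely modeled on the example above. */
struct stat_like   { int st_dev; int st_ino; };
struct dirent_like { int d_ino;  int d_off;  };

/* In early C a member name stood for little more than a type and an
   offset. We imitate that with an explicit constant. */
enum { ST_INO_OFFSET = offsetof(struct stat_like, st_ino) };

/* Read an int at the st_ino offset of any object, which is roughly how
   `obj.st_ino` would have been treated regardless of obj's struct type. */
int read_at_st_ino(const void *obj)
{
    int value;
    memcpy(&value, (const char *)obj + ST_INO_OFFSET, sizeof value);
    return value;
}
```

Passing a `struct dirent_like` to `read_at_st_ino` returns whatever int sits at that offset (here `d_off`), with no complaint about the mismatched struct type.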
// C
typedef struct
{
    int a;
    struct
    {
        double x;
    } bar;
} Foo;
// C3
struct Foo
{
    int a;
    struct bar
    {
        double x;
    }
}
Very confused by this. The C code declares an anonymous struct type, then aliases the typename "Foo" to that anonymous struct type. The C3 code seems to declare a named struct type "Foo" -- why isn't the C equivalent here just "struct Foo"?
But then within the struct it gets weirder... the C code declares a second anonymous struct, and then declares a member variable of that type. The C3 code... declares a struct named "bar" and also a member variable with name matching the type? Except the primer says that these are equivalent, so the C3 code is declaring an anonymous struct and a member of that type? Using the same syntax as the outer declaration did to declare a named type but no (global) variable?? Is this case sensitive?
I don't think I can get further into the primer than this... even taking the author at their word that the two snippets are equivalent, I don't understand what's in play (case sensitivity? declarations where variable name must match type name?) to make this sane, and there's zero rationale given for these decisions.
This is true. It also explains why a lot of old-time Unix struct member names (including a lot that made it into POSIX or are otherwise in use today) have redundant-seeming prefixes. eg. struct stat has st_mode and not simply mode. struct sockaddr_in has sin_addr and not simply addr.
The first part about the name is just like C++: you use the name without `struct`, unlike C, where structs have their own namespace. That's what it's meant to illustrate.
The second question is more subtle. In C, the syntax is `struct { ... } [optional member name] ;`. Because there is no anonymous struct at the top level, anonymous structs inside a struct have a different syntax that also drops the final `;`: `struct [optional member name] { ... }`. If the C syntax were kept, a final `;` would be required. This syntax change comes from C2.
No, you misunderstand. There is no `struct Foo foo`. Unlike C `struct Foo` would only ever be valid at the top level.
Neither `struct Foo foo` nor `struct bar Bar` works.
The reason why `Bar` is a type and `bar` is a variable or function is to resolve the ambiguity of C syntax without having to rely on unlimited lookahead or the lexer hack. Basically it allows tools to parse the code much more easily than they would C.
You can declare multiple fields in a struct, e.g. `struct Foo { int x, y; }`, but you can't write `int x, y = 123;` like in C. This is because it would create ambiguities in the extended for/while/if syntax C3 adds.
Once upon a time, when += was spelled =+, struct members in C were global names whose "values" were the offsets from the beginning of the struct[1]; a.b is simply a[ptrtab[b]] and a->b is a[0][ptrtab[b]].
[1]: This is why all of the names of struct members in unix are "needlessly" prefixed; i.e. struct stat.st_mode is not struct stat.mode because that could conflict with struct foo.mode until the early 1980s.
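For contrast, a small sketch of what modern C allows: each struct has its own member namespace, so two structs can both have a `mode` member at different offsets. The struct and function names here are invented for the example.

```c
#include <stddef.h>

/* Two unrelated structs, each with a member called `mode`. */
struct file_info { int mode; int size; };
struct tty_info  { char pad; int mode; };

/* The same name resolves to a different offset depending on the struct type. */
_Static_assert(offsetof(struct file_info, mode) != offsetof(struct tty_info, mode),
               "mode lives at different offsets in the two structs");

int file_mode(const struct file_info *f) { return f->mode; }
int tty_mode(const struct tty_info *t)   { return t->mode; }
```

Under the old global-offset scheme this pair of declarations would have clashed, which is what prefixes like `st_` were guarding against.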
But "XX XX;" (the example was given in the article) is valid, and the parser needs to know the difference.
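A minimal C illustration of that point, with `shadow_demo` being a made-up function name: the same identifier is a type at the start of the declaration and a variable afterwards, so a parser has to track which one it currently means.

```c
typedef int XX;

int shadow_demo(void)
{
    XX XX = 21;     /* first XX: the typedef name; second XX: a new int variable */
    return XX * 2;  /* from here on XX names the variable, so this is a multiplication */
}
```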
The article says that this code is "evil", but I disagree. There is a valid pattern that I use quite often (I described it in an article for ACCU a few years ago, too). It's an example from C++ but I think I could apply it to C as well. Consider (with a modification of the example I used in that article):
PirateShip s;
printf("Name of the first cannon: %s\n", s.Cannons.cannon1);
which is very readable and idiomatic (once you get used to it ;) )
More abstract, it provides an idiomatic way to group collections of objects/structs at the source code/syntactic level.
(one could name the struct and the member differently, I realize; the benefit of using the same name is when you define constant members in the inner struct, that way you can reference constants and properties with the same name and you never have to think about which one to use.)
(also again I use this from C++, there may be things that are different in C, but afaik this part is the same).
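The struct definition itself isn't shown above, so here is one guess at the kind of declaration that makes `s.Cannons.cannon1` work, written in C. The member list, the cannon names and the helper `make_ship` are assumptions for illustration only, not the code from the ACCU article.

```c
/* Hypothetical reconstruction: the inner struct tag and the member
   share the name `Cannons`. In C, tags and members live in separate
   namespaces, so this does not clash. */
struct PirateShip {
    struct Cannons {
        const char *cannon1;
        const char *cannon2;
    } Cannons;
};

/* Build a ship with made-up cannon names. */
struct PirateShip make_ship(void)
{
    struct PirateShip s = { { "Long Tom", "Old Bess" } };
    return s;
}
```

With this in place, `struct PirateShip s = make_ship();` followed by the `printf` line above would print the first cannon's name.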
When folks talk about namespaces, structs, & typedefs in C, it's to call out that `struct X` and `typedef int X;` can both exist at the same time. In other words, nothing prevents one from doing a `typedef struct X { int f; } X;` or similar, using `X` as the name in both places.
It is true that it's more correct to say "struct names have their own namespace", though I think we can infer what's actually meant here fairly easily.
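A short C sketch of both cases mentioned above; the names `X`, `Y`, `sum_f` and `combine` are only illustrative. This relies on C's separate tag namespace, so the `Y` part would not compile as C++.

```c
/* Same name in both namespaces, referring to the same type. */
typedef struct X { int f; } X;

/* Same name in both namespaces, referring to different types:
   `struct Y` is the struct, plain `Y` is int. */
struct Y { int g; };
typedef int Y;

int sum_f(struct X a, X b)   { return a.f + b.f; }
int combine(struct Y s, Y n) { return s.g + n; }
```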
I wasn't aware of that last point. Is this the answer why we have uniquely prefixed struct members in POSIX? Such as st_* in struct stat or d_* in struct dirent? Is that just a coincidence?
> Now you have a named struct bla_t which is typedef’ed to a type alias bla_t, and the named struct can be properly forward-declared.
The article doesn't mention it, but C11 allows typedef redefinitions which means instead of forward declaring "struct bla_t" and referring to it with the struct keyword, you can instead forward declare it as "typedef struct bla_t bla_t" and refer to it _without_ the struct keyword.
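Roughly what that looks like, assuming a C11 (or later) compiler; `bla_value` and the `value` member are invented for the example.

```c
/* Forward declaration via typedef, as one header might do it. */
typedef struct bla_t bla_t;
/* The same typedef again, as another header might do it. Repeating an
   identical typedef is allowed since C11. */
typedef struct bla_t bla_t;

/* bla_t can be used through a pointer while still incomplete,
   without writing the struct keyword. */
int bla_value(const bla_t *b);

/* The full definition can come later. */
struct bla_t { int value; };

int bla_value(const bla_t *b) { return b->value; }
```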
> In particular, there is no shortcut for the zero value for a struct value (not a pointer) — you have to specify the struct name followed by an empty set of braces
There's no such thing as a struct name. Structs are unnamed. The only way to have a zero valued struct... is, obviously, to instantiate it with its members as zero values. Exactly what struct{members}{} is.
`type X struct { age int }` just makes X refer to a particular type of a struct, but this also works:
I disagree with the "always typedef structs" part. Personally I often find it nicer to have the "struct" there as part of the name. Partly because people sometimes typedef a pointer to the underlying struct type, and that can cause some issues.
The part about the author's take on RAII is interesting, but it would really merit a separate post with code examples.
> It's much more difficult to read since you have to know the position of elements of the struct
At least the Go editor I use (GoLand by JetBrains, one of the more popular ones) will display the names of all the fields in positional initializers inline with the code (just as they would be in a named initializer), so it's pretty much the same readability-wise. There's a special color for these automatic labels that indicates they aren't actually part of the document.
> in the case you add and remove a field of the same type, it's entirely possible the program will compile fine
True, although I think it's easy to work around this (by fixing the compile errors from removing the old field before adding the new one).
I think it's aimed at C programmers. foo is a struct, so it's a type, it's not a variable. The point is just that struct bar is also defined by the definition of struct foo.
Oh gosh... digging further on the same page under "Identifiers" it looks like case sensitivity is the key here. So "struct Foo" declares a type "Foo", and "struct foo" declares a variable "foo" of new anonymous type. I assume "struct Foo foo" and "struct bar Bar" do exactly what you (don't) expect, and maybe even "struct foo bar baz {}" to be the equivalent of the C code "struct {} foo, bar, baz"... yikes.
Edit: "Declaring more than one variable at a time is not allowed." So there's no equivalent to the C code "struct {} foo, bar, baz"... not clear if "struct IDontNeedANameButTheLanguageIsForcingMe foo {}; IDontNeedANameButTheLanguageIsForcingMe bar; IDontNeedANameButTheLanguageIsForcingMe baz;" is legal (modulo that some of those semicolons are illegal I think?).