C99/C11 dynamic array that mimics C++'s std::vector (solarianprogrammer.com)
84 points by AlexeyBrin | 2017-01-07 | 165 comments




> What if we want to be able to store more than integers in our dynamic array ? [...] The array_push_back function also needs to be refactored in order to account for the size and type of what we store in the data buffer. A possible approach is to use a macro, instead of the original function:

This is a situation where C++ really shines: you can use C++ with templates and not only have much cleaner code, but also avoid forcing the compiler to inline every single call to array_push_back. And you don't even need to use anything more than structs and functions, you don't need to go "full OO".
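For readers who haven't seen it, here's a rough sketch of what the template version might look like - not the article's code, just an illustration using a plain struct and a free function (error handling and non-trivially-copyable element types are glossed over):

    #include <cstdlib>
    #include <cstddef>

    // Illustrative only: a generic dynamic array as a plain struct plus a
    // free function template; no classes, inheritance or "full OO" needed.
    template <typename T>
    struct array {
        T *data;
        std::size_t size;
        std::size_t capacity;
    };

    template <typename T>
    void array_push_back(array<T> &a, const T &value) {
        if (a.size == a.capacity) {
            std::size_t new_cap = a.capacity ? a.capacity * 2 : 8;
            // realloc is only OK for trivially copyable T; a fuller version
            // would allocate new storage and move-construct the elements.
            T *p = static_cast<T *>(std::realloc(a.data, new_cap * sizeof(T)));
            if (!p) return; // allocation failure handling elided
            a.data = p;
            a.capacity = new_cap;
        }
        a.data[a.size++] = value; // type-checked, no void* casts or sizeof bookkeeping
    }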

I wish C programmers would be more open to C++ but it seems like they're for the most part pretty closed. That's probably the C++ community's fault, but I'm not sure how to make amends.


In that regard, I also think C++ programmers should be more open to C programmers who just need a few new language constructs.

It is not an all or nothing approach. It's fine to write code that looks and feels mostly like C, but still takes advantage of a few nice C++ features.


I agree, which is why I mentioned that you can do what this article suggests without using anything except functions and structs. No classes or class features required.

There are perfectly legitimate use-cases for such a "C+" style. The disadvantage is that it won't be considered "good C++ style" and would limit the attractiveness of the project to C++ programmers that enjoy modern C++.

If one were to include vector, string, array, smart pointers and maybe simple template code in "C+", it would already have a big safety advantage over plain C.


> would limit the attractiveness of the project to C++ programmers that enjoy modern C++

As Linus wrote when someone told him on the mailing lists that Git should've been done in C++:

"""

Quite frankly, even if the choice of C were to do nothing but keep the C++ programmers out, that in itself would be a huge reason to use C.

"""


Yet he had to move to Qt for Subsurface due to issues with Gtk.

Well, Linus is a great C programmer, so that's something he knows well and likes to protect and embrace.

Sounds like a bitter old fossil. On my first look at Linux, there were 200 sound-card drivers in the source base, mostly identical except for the layout of bits in the control register. Any change to the API or conventions meant 200 files had to be edited. An ideal place to refactor as a 'sound card' base class and derive classes to handle the (minuscule) differences.

"Keeping new ideas away" is a good working definition of an old coot.


You don't necessarily need C++ to implement a good driver design with base classes and minimal code duplication. It is perfectly possible in C.
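For anyone who hasn't seen the pattern, the usual C shape is a struct of function pointers plus per-driver state, roughly like this (the names are made up for illustration, not taken from any real driver API):

    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical "base class" interface: generic code only ever sees this. */
    struct sound_card_ops {
        int (*set_volume)(void *priv, int level);
        int (*play)(void *priv, const int16_t *samples, size_t count);
    };

    struct sound_card {
        const struct sound_card_ops *ops; /* shared interface ("vtable")       */
        void *priv;                       /* per-driver registers, state, etc. */
    };

    /* Generic helper: dispatches through the ops table, so adding a new card
     * means providing one new ops struct, not editing 200 files. */
    static int sound_card_play(struct sound_card *card,
                               const int16_t *samples, size_t count)
    {
        return card->ops->play(card->priv, samples, count);
    }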

That's pretty much how I use C++. Granted, my knowledge of it is very minimal. I was forced into C++ because of my Broker's api so...

One of my largest gripes making the switch is the OO paradigm. I get it, I understand it, and I really want to love it. But the theory of it vs. the implementations that I've seen, ugh. And the number of ways you can initialize variables etc... just makes no sense to me. It's like they just keep adding new ways to do things, for no other reason than because they can. What's wrong with only having one way to do simple things like that?

I don't know, maybe it's just me but I strive to keep my code as simple, concise as I possibly can. I have enough complexity to deal with solving problems than having to wrestle with my language on top of it. Just to note, I'm strictly talking about having to use other people's code merged with my own. If it was solely me writing, I'd just use C++ for some of the nice things built in and toss classes and all that into the fire. YMMV.


We need a book with modern best practices, like what happened in the JavaScript world with "JavaScript: The Good Parts", from which we can glean a clean, effective dialect, leading to a new C++ uptake and renaissance similar to what happened in the JavaScript world in 2008.


"However, no new edition of The C Programming Language has been issued to cover the more recent standards."

The book should be read by every C programmer, but it is really outdated and has many bad practices including buffer overflows etc. in examples. For learning "good parts" it certainly isn't the right thing.


Which section(s) contain buffer overflows? Also, do you remember the details on any other bad practices from the book? Thanks!

I don't think it will ever happen, at least in the enterprise space.

Companies are quite happy to use more productive, safer languages like the ones on top of the JVM and CLR, with C++'s role being left to infrastructure code.

The OSes and tooling from Apple, Google and Microsoft are good examples of it.

C++ is there on the lower levels, for hardware support, low level graphics, language runtimes but everything else ends up in Objective-C, Swift, Java, VB.NET, C# and F#.

Of all those OSes, UWP is the only one where C++ enjoys parity with the remaining languages, and apparently it isn't that much used.

Even Microsoft does most of their UWP presentations in C#, and despite the ongoing work on C++/WinRT and C++/CX, I doubt it will change the situation that much.


They're not really equivalent, but Scott Meyers' series of books on C++ are some of my favorite technical writing ever, and his most recent book "Effective Modern C++" should be on the shelf of every working C++ programmer.

I agree with this - I finished reading it just before last year and I cannot recommend this book enough. It really is excellent (covering topics like reference collapsing, type deduction in different scenarios, and the idiosyncrasies of async tasks). A great book and essential.

"JavaScript: The Good Parts" is outdated and should be no longer recommended. Even when I was first reading this book I had issues with some parts (adding methods to global classes, module pattern)

The 2008 revival in JS is not because of this book but because of node.js, npm, jQuery and browsers becoming more capable. There were other factors, like the fact that there is no alternative to JS on the client side and that HTML was lagging behind before HTML5. Both Java applets and Flash became abandonware, leaving no alternatives.

There is Rust; there should be a stronger push towards that language. C++ is like PHP: even with heroic efforts it cannot be fully fixed. There is too much cruft, arcane syntax, platform compatibility issues and, finally, the lack of a module system.


Classes get useful once you're tired of adding "this" pointers to all your "methods" and all the "this->" inside them...

Once a C program gets large enough, it seems to become rather "OO" naturally: there's going to be data grouped into structures, and code that operates on that data. That's what classes and methods are useful for. But you don't have to obsess over "forcing" your design to be OO, i.e. things like debating between A.foo(B) or B.foo(A), because sometimes foo(A,B) or even foo(A,B,C) is the answer.

I agree completely with the complexity argument --- and I'll add that hidden ("encapsulated", "abstracted", whatever you want to call it) complexity is still complexity that can get in the way of debugging and efficiency both in terms of machine and programmer time. The latter point is something that a lot of "modern C++" seems to miss, finding more complex ways to do simple things, which look simpler on the surface but are actually far more complex in total.


> Once a C program gets large enough, it seems to become rather "OO" naturally

Very well put. One of my current projects started in plain C but gradually evolved to the point that it just "became" a C++ project. Once you write the same code a few times in a few different places, it's blindingly obvious what would be less painful if represented as objects. And so on with other language constructs (inheritance, templates, etc.).


The main point I'm trying to get across is to let it happen naturally, instead of trying to force everything to become objects and methods. There will still be parts that are "not OO", but they are not OO because they don't need to be; and that's perfectly fine.

Out of curiosity, what methods of initialisation are you confused with?

We have default constructors, copy construction, move construction, braced initialisation and std::initializer_list.

For my part I am working at work on a C++2003 codebase and although I can initialise my C-style array with braced initialisers, I would love to be able to initialise my vector in the same way, but I cannot as it is C++2003. eg.

    int myArray[] = { 0, 1, 2, 3 };
    vector<int> myVector = { 0, 1, 2, 3 }; // This is impossible in C++03

It may be that the OO implementations you have seen are bad because they don't understand encapsulation, or haven't designed their classes from the outside in (i.e., designed the interfaces first, so that they are expressed in the vocabulary and from the simple point of view of the user, not of whoever designed the class).

I have worked with horrible C++ that had a hierarchy of inheritance yet failed to understand the point of virtual functions so would cast to child elements from a base class to decide which function to call on the child object, or would have functions in the base class that would dynamic_cast to every possible type of possible child to work out what to do. This was truly horrible. (The correct solution is to use a virtual or pure virtual function on the base and implement this in the children so that the right function gets called). This was within an apparently "object-oriented" system, so if this is your sort of experience I can understand why you would detest it.

The other horrible thing is writing C++ like it is C - pointers everywhere, no understanding or use of RAII, C-style casts everywhere (do you really know better than the compiler what a type is???), writing a million functions to do the same thing instead of a single template function etc. etc. And no const correctness, no use of STL algorithms, putting everything in a C array instead of using the correct STL container for the job, etc. etc. the list is endless


I think people are starting to get annoyed at how these are all equivalent, and the switch in a lot of books seems to be towards the third version with braces.

    int x = 0; // makes sense
    int x(0); // sure
    int x{0}; // braces?

As a C++ programmer I'm more annoyed that they're not equivalent.

> these are all equivalent

I'm almost positive that isn't the case. Need to find my copy of Effective Modern C++ and get back to you. It's been a while


Except they are not all equal, hence why there are 3 ways to do it.

(Optimizations can change things, but per the spec, they are not the same)

the first creates a new variable with a default value and then copies 0 into it. (This is trivial with an int, but not so with a more complex type)

the second case creates x using the copy constructor

the third uses an initializer list, and works similarly to #2


> the first creates a new variable with a default value and then copies 0 into it.

No, no default constructor is invoked. In this specific example, until C++17, a temporary int object is created then x is copy constructed from it [1]. The compiler is explicitly allowed to omit the temporary+copy and directly construct from the parameter as per int x(0), but the constructor must be non-explicit.

From C++17 on this is actually required, and additionally a copy constructor is not required to exist. In practice it is equivalent to #2 except for the non-explicit requirement.

Pedantic, I know, but as long as we are trying to clarify the rules it is better to be clear.

[1] note: this is different from default-initialize then assign.
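For anyone following along, a small example where the difference between the three spellings is actually visible (std::vector is just a convenient type to show it with; this is separate from the copy-elision point above):

    #include <vector>

    int main() {
        int a = 0; // copy-initialization
        int b(0);  // direct-initialization
        int c{0};  // list-initialization; also rejects narrowing,
                   // e.g. `int d{0.5};` does not compile

        // With a class type the three forms stop looking interchangeable:
        std::vector<int> v1(3); // three elements, all zero
        std::vector<int> v2{3}; // one element with the value 3

        return a + b + c + static_cast<int>(v1.size() + v2.size());
    }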


> It's like they just keep adding new ways to do things, for no other reason than because they can. What's wrong with only have one way to do simple things like that?

Precise control over those things is a feature in C(++).

If you weren't unfortunately forced to use C++, a principle of Python is to (try to) have one obvious way of doing things. https://www.python.org/dev/peps/pep-0020/


> One of my largest gripes making the switch, is the OO paradigm.

C++ supports object-oriented programming, but OO is not intrinsic to the language. OO is just one of the many tools. You can write good functional and procedural C++. If you try to shoehorn every problem into the OO model, you're probably doing it wrong.

From Bjarne Stroustrup himself:

http://www.stroustrup.com/oopsla.pdf

and again here:

https://isocpp.org/blog/2014/12/myths-1

"C++ supports OOP and other programming styles, but is deliberately not limited to any narrow view of “Object Oriented.” It supports a synthesis of programming techniques including object-oriented and generic programming."


The thing is, we get disappointed when the code just looks like "C compiled with a C++ compiler", with all the security exploits that entails.

A good example is how Turbo Vision, Object Windows Library, Visual Components Library, Qt look like and the concessions Microsoft had to make to Afx so that it got rebranded into MFC and appealed to the Windows C developers.


The idiosyncrasies in Qt etc. are because they were first designed when C++ was very young, and many compilers had poor support for the STL.

You misunderstood me.

I was first talking about the set of frameworks that dared to embrace the then-current C++ best practices. All of them are older than C++98, or even predate the STL being considered.

Then I mentioned the fact that when Microsoft presented their Afx framework to their beta testers, mostly Windows C developers, it was considered too high level and it was rebuilt as MFC.


The term "C with classes" is almost used as an insult in some circles. I wonder if "C++ without classes" would be a better idea.

C++ without polymorphism or inheritance is how the people that I think are worth listening to seem to do things.

How would that be a good idea? I don't understand how I could write effective software without polymorphism or inheritance, other than in tiny tiny applications.

Inheritance is syntactic sugar for composition. But, I think of it as reverse-SS, because I find composition much more readable.

I've written entire components for networked and multiplayer modes for games, and an algorithmic trading system that uses ML in C++ using just structs, unions, templates, functions and composition (no polymorphism or inheritance).

Think about this: what does it give you besides generic execution and data when the size of the data varies?

Templates already let you build generic data structures and algorithms when the size of the data doesn't vary.


without runtime polymorphism.

/pedantic++


It's always just one or two more features and before you know it - C++

The thing is, everyone has a different idea of what those one or two features should be! Even me,

https://digitalmars.com/articles/b44.html


Yes, despite all the problems in C++ it's just a much better language than C for defining new types that are on par with "native types". std::vector has so many benefits over any possible C implementation that there is no real comparison. Genericity is one, no need to remember to free after use is another.

By the way, why is this C implementation something that requires C99 or C11 features? It looked just like standard old-fashioned ANSI C to me.


I like my C alternative better. I pass growth factors/increments as parameters to the vector macros so that I can affect how it grows on capacity exhaustion during each call, and I have macros for creating and closing uninitialized gaps. C++ loses on many potential optimizations by insisting that its types always be in fully well-defined states except inside methods. Moreover, the particular GNU implementation of the STL on Linux completely fails to turn certain vector methods into memsets, memcpies and memmoves where it could, which pessimizes those particular ops by like two decimal orders of magnitude. The insistence on using new/delete based allocators instead of reallocs is a significant pessimization too. Reallocs perform, on average, several tens of percent better than new allocs followed by copies. An even more noticeable pessimization is in the compilation times. Including `<vector>` adds good chunks of a second to the build time of an average-length translation unit. In comparison, working with C, even with all my generics included, on top of a good build system makes me feel as if I was working with a scripting language. No lags.

Your criticisms of vector are well put, but if anything they're all really problems with a particular implementation of the standard library, not the C++ language. If you make your own libraries you're free to do whatever you want, and template metaprogramming can help you gain even more performance by easily using optimized implementations for specific types, etc.

These are fair points and issues like these are why many larger projects have their own custom vector-like classes. One thing I did want to call attention to is your point that C++ best practices require that objects be in well-defined states except inside methods. That's true, although it's not a language requirement, but in many cases you can get the best of both worlds through the use of lambdas. A simple example to give the flavor:

  template <typename Function>
  void my_vector::munge(Function initialize_gap) {
    create_uninitialized_gap();

    // Some of our elements now contain uninitialized memory.
    // We don't want to expose this to arbitrary code, but
    // it's OK to expose this to the function the user
    // provided specifically to handle this.
    initialize_gap(gap_begin(), gap_end());

    // The gap is now initialized, so when we return from
    // this method, we'll be in a well-defined state.
  }
This is the standard pattern you use for such things in functional languages, and it works very well. The idea is to ensure that intermediate states are only seen by code that explicitly expects to handle them.

More like the clusterfuck that the language is than the community.

In my codebase, I use expression macros (with a little bit of __typeof__-based type checking) to do it fully generically. Much of what you think you need C++ for can be done almost just as succinctly on top of plain C (I'm talking automated scope cleanups, semi-automated error handling, many generic things) and the compile times fly. I agree C++ has had some good ideas. In fact, I originally wanted to use it. But I came to the conclusion that it's too bloated, and more importantly, fundamentally broken in certain ways (RAII, exceptions, even templates and namespaces), and I ended up emulating what I think the good parts of C++ are on top of plain C. When C++ programmers think C, they usually think lack of generics and lots of explicit manual micromanagement and pointer arithmetic, but it can be a lot more than that.

I'm not familiar with this approach, but I think it would entail code size increase for each use of the macros, and a runtime overhead for the typeof checks, correct?

If so, C++ has a nice advantage in that all the type checks are at compile-time so you don't pay any more for those, and also you don't duplicate binary code with macros.


__typeof__ is a compile-time feature. It's like C++'s decltype. (You can make C++'s auto out of it too.) There is a potential for code size increases with this approach, that's true. But the potential also exists with inlinable vector methods.
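For readers unfamiliar with it, this is the kind of thing meant (GNU extension; the macro name is made up):

    /* __typeof__ yields the type of an expression at compile time, much like
     * C++'s decltype, so a made-up AUTO_VAR macro can emulate C++'s auto: */
    #define AUTO_VAR(name, init) __typeof__(init) name = (init)

    /* usage (GNU C):
     *   AUTO_VAR(count, 42u);        // unsigned int count = 42u;
     *   AUTO_VAR(half, count / 2.0); // double half = count / 2.0;
     */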

Yes, but for the vector methods, you leave that decision up to the compiler which should be able to take the best route.

True. However, I think in the case of the dynamic array, this effect is minor. The macros are small, and most of the time, they end up wrapped in a function anyway. But, admittedly, it doesn't scale well to very big generics, which should always be wrapped in a function. (Eventually I plan to shove a simple transpiler in front of all my C code and do this, along with namespacing, how I think it should be done.)

Unfortunately __typeof__ is non-standard. Despite its usefulness it always gets left out of the standard (and with weird excuses, I think last time it was lack of implementations - despite the fact that eg. both GCC and LLVM implement it).

Interestingly, C11 defines _Generic selection, a kind of a switch for types, but its usefulness is hindered by the fact that it cannot be used in conjunction with sizeof.
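As a quick illustration of what _Generic looks like (a toy example, unrelated to the vector code being discussed):

    #include <stdio.h>

    /* C11 _Generic picks an expression based on the static type of its
     * controlling argument - essentially a switch over types. */
    #define type_name(x) _Generic((x), \
            int:     "int",            \
            double:  "double",         \
            char *:  "char *",         \
            default: "something else")

    int main(void) {
        printf("%s %s\n", type_name(1), type_name(1.0)); /* prints: int double */
        return 0;
    }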


Expression macros `({, })` aren't standard either. But I sort of go with, if all gcc/clang/tinycc support it and it's a highly useful feature, then it's part of C as far as I'm concerned. It would certainly be nice if ({,}),__typeof__, and __label__ became part of the C standard, though.

My Google skills are failing me here. Is `({,})` a special construct, or are the '{' and '}' metacharacters and you're just referring to variadic macros, which are standard as of C11?

It's a common (gcc/clang/tcc) compiler extension (chapter 6.1 of the gcc manual) that allows you to use parentheses around a compound statement to turn the compound statement into an expression whose value is the last statement in the compound statement. For example `({ int _r; if((_r=foo())) bar(); _r; })` behaves like an inline function that returns `_r`. If you want to do generic vector ops purely with macros and you want to do it robustly, you effectively need for the macros to "return" a value signalling whether the potential realloc that might have happened inside the macro succeeded or not. It's a very powerful feature because it allows you to have macros that behave like ducktyped, value-returning inline functions. (Sometimes you'll need to tone down the ducktyping a little; for example, it's probably a good idea to use some __typeof__-based typechecks (http://stackoverflow.com/questions/41250083/typechecking-in-...) to make the compiler warn you if you attempt to memcpy doubles into an int vector from your vector__insert method.)
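Putting those pieces together, a value-returning generic push built this way might look roughly like the following. This is only a sketch: the struct layout with data/len/cap members is an assumption, and it relies on the GNU extensions described above.

    #include <stdlib.h>

    /* Assumed vector shape: struct { T *data; size_t len, cap; }.
     * VEC_PUSH "returns" 1 on success, 0 if the realloc failed. */
    #define VEC_PUSH(v, x)                                             \
        ({                                                             \
            int _ok = 1;                                               \
            if ((v)->len == (v)->cap) {                                \
                size_t _ncap = (v)->cap ? (v)->cap * 2 : 8;            \
                __typeof__((v)->data) _p =                             \
                    realloc((v)->data, _ncap * sizeof *(v)->data);     \
                if (_p) { (v)->data = _p; (v)->cap = _ncap; }          \
                else    { _ok = 0; }                                   \
            }                                                          \
            if (_ok) (v)->data[(v)->len++] = (x);                      \
            _ok;                                                       \
        })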

Thank you! I've never seen this before; looks handy.

What is fundamentally broken about RAII? To me that is one of the killer features of C++ over C, which requires manual goto statements to achieve something approaching RAII.

Yes I cannot fathom how you could write successful and maintainable C++ code without using RAII. How can you guarantee safe deterministic lifetimes of objects without using RAII?

By not using exceptions, and putting a delete at the end of the scope where your RAII object would have been cleaned up.

What about early return? In practice this logic becomes very error-prone.

I know three solutions for this:

* (Classic): use goto so that instead of returning directly, all branches jump to a cleanup label. This is tricky to do well, but possible (a minimal sketch follows after this list).

* (Nonstandard): GCC offers a nonstandard function attribute to register a function as a destructor for this kind of thing. I'm sure they claim it was inspired by Scheme's dynamic-wind concept.

* (Memory pooling): Apache APR relies heavily on memory pools. One feature of APR's memory pools is the ability to register functions to run (in, I believe, non-deterministic order) on data when the pool is eventually destroyed.
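A minimal sketch of the classic goto-cleanup shape from the first point (names are purely illustrative):

    #include <stdio.h>
    #include <stdlib.h>

    /* Single exit point; each failure jumps to the label that releases
     * exactly what has been acquired so far. */
    int process_file(const char *path)
    {
        int ret = -1;
        char *buf;

        FILE *f = fopen(path, "rb");
        if (!f)
            goto out;

        buf = malloc(4096);
        if (!buf)
            goto out_close;

        if (fread(buf, 1, 4096, f) == 0)
            goto out_free;

        ret = 0; /* success */

    out_free:
        free(buf);
    out_close:
        fclose(f);
    out:
        return ret;
    }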


Aargh! I should look at nonstandard features that I don't use before I say they solve problems. GCC's constructors and destructors (outside of C++ code) are limited to globals and statics. Marking a function a constructor guarantees it's automatically called before main(), and marking a function a destructor guarantees it's automatically called after main() completes.

If you are writing high-reliability C (think military, aerospace, industrial control systems), early return is usually prohibited by coding standards. Then again, dynamic memory allocation usually is too.

Personally, I think early return is lazy programming because not cleaning up non-RAII objects is only one class of bugs that it can introduce. Things like vector renormalization, update notification and even later sections of algorithms can all be missed by early return. Not having early return means you need to explicitly skip later code if you don't want it, rather than essentially turning on "skip-all".


Naked new and deletes should be seen very little in code post-C++11.

Containers and resource handles are Stroustrup's advice in his blue C++ book.

Manual deletion is something I would never really want to see outside a destructor.


I hope this comment is some sort of weird joke.

What's broken about namespaces? If I create a class named Bitmap whilst using Windows.h, will I conflict with GDI::Bitmap???

RAII - how can you possibly guarantee safe lifetimes of objects without RAII?

Templates - do you like writing the same thing a million times and having to update it and fix it in those million places?


> What's broken about namespaces?

The difficulty I have with them is they are open, not closed. Any piece of code can crack open a namespace and insert more names into it. Thus, you can have problems with "hijacking", where foo(int) is inserted, while unknown to you someone else inserted foo(unsigned) that does something completely different, and the compiler decides the latter is a better overload match.

Complicating things further, the names visible in a namespace are dependent on where the compiler is lexically in the code - more names can be inserted further down.
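To make the hijacking concrete, a contrived sketch (lib and foo are made-up names):

    #include <iostream>

    namespace lib {
        void foo(int) { std::cout << "foo(int)\n"; }
    }

    // Possibly in a different header, someone reopens the namespace later:
    namespace lib {
        void foo(unsigned) { std::cout << "foo(unsigned)\n"; }
    }

    int main() {
        lib::foo(3u); // used to convert and call foo(int); now it silently
                      // resolves to the newly inserted foo(unsigned)
    }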

As a counterpoint, many have suggested to me that this openness is a critical capability they need for their code. My answer is that using namespaces in that manner offers little improvement over just using a prefix on the identifiers.

And so the debate goes on!


Well yes, it is a prefix, but it is a prefix that doesn't need to be used when you are already inside the namespace or if you explicitly 'using namespace' it. If I had to type the full namespace of functions and enums all the time I'd go insane.

well yes they are open by design. If you want closed namespaces just put your functions (as statics) inside a struct.

Btw, visibility rules in a namespace (but not in a struct/class) are the same as for the global namespace, so no, a function in a namespace won't see another declared after it.


I want something like std::vector, but I don't want exceptions. How does C++ help in this case?

std::vector is a container in the C++ standard library. C++ is a language. You're mixing apples and oranges.

What I'm saying is that you can use C++ without having to use STL and you can use C++ just fine to implement your own exceptionless container if you want (and many such implementations exist already).

And you can also compile with exceptions disabled. Should your code throw it will terminate the process.


So, garbage like "And you can also compile with exceptions disabled. Should your code throw it will terminate the process." is in the plus, and the parent comment is downvoted.

The C++ community at its finest.


It will help you either use std::vector with a custom allocator that does not throw exceptions, or else write your own non-throwing vector class with templates, which would, I hope, be more maintainable and easier to use correctly than an equivalent written in C.

Compile with -fno-exceptions

You can use any part of C++ without having to use the rest of it. For example, you don't have to write your own std::allocator to use std::vector, nor do you have to use exceptions.

You can use which bits of C++ that you want - that's its great strength, permitting low-level behaviour whilst supporting high level abstractions (and as simple or as complex as you like).


While technically you don't need exceptions and can disable it, the reality is the design of the STL is such that exceptions are required. It makes no affordance to signal errors in any other way. You can get farther if you decide OOM is a crash - but otherwise much of the STL is out of bounds.

Are there many applications that recover from OOM?

I haven't seen it in practice, but OOM is conceivably recoverable in certain cases (e.g., exponential vector growth with large capacities fails on a 32 bit system, but switching to linear growth as a recovery strategy succeeds in allocating sufficient additional memory). But OOM errors aren't the only type of errors that you can get in ctors. Disabling exceptions makes all those errors either fatal or your class invariants become silently broken, which makes exceptions and RAII kind of a package deal in robust software.
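A sketch of the kind of recovery described above, assuming exceptions are enabled (the linear chunk size is arbitrary, and no particular library is claimed to do this for you):

    #include <cstddef>
    #include <new>
    #include <vector>

    // Try the usual geometric growth first; if that allocation fails, fall
    // back to a modest linear increment instead of giving up immediately.
    template <typename T>
    bool grow_for_append(std::vector<T> &v, std::size_t linear_chunk = 1024) {
        try {
            v.reserve(v.capacity() ? v.capacity() * 2 : 16);
            return true;
        } catch (const std::bad_alloc &) {
            try {
                v.reserve(v.size() + linear_chunk);
                return true;
            } catch (const std::bad_alloc &) {
                return false; // genuinely out of memory
            }
        }
    }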

The point was that the STL is unusable without exceptions because of OOM. If you're not using exceptions anywhere in your codebase and you don't plan to recover from OOM, is this still an issue? (I don't have much experience with exception handling in large codebases; I've mostly worked on codebases that disabled exceptions.)

Realistically, even our operating systems aren't OOM safe. Linux will happily overcommit swap but kill you when you try to use it.

Windows actually provides a stronger guarantee here. But if any of your dependencies are not OOM safe then neither are you. And much of Windows userland is not OOM safe.


A solution might be a minimal subset of C++: STL containers, some syntactic sugar, easier and faster compilation (modules), etc.

Basically what Go and Rust have achieved, except you keep most of the C/C++ syntax "taste", and you remove overly high-level stuff like templates and inheritance. Honestly I don't think templates are so useful, since most programmers already re-use STL containers, which are templates, but nobody really writes relevant template classes.

> That's probably the C++ community's fault

The problem with C++ is backward compatibility with C and big corporations trying not to break their codebases. It makes it very hard for the language to evolve. For example, D is a very good language, but it can't gain momentum as long as it doesn't have a real way to gain presence. We just have to wait for corporations to clean up their codebases to make room for C++ compilers to adapt, and then things should improve.


You can't have the STL containers without templates though. If you want to special-case them in the compiler, then what about the algorithms library? If you want to keep the algorithms library, then what about your own code that implements a higher-level algorithm using the lower-level ones in the algorithms library? Without templates you have to write your algorithm once for every type of container, or else use indirect function calls which would not be good. So this won't really work well I think.

For me, the only selling point of C++ is templates, and I've never written a nontrivial program that didn't use them (and I like to think that I don't just use them frivolously).


Parametric polymorphism is a powerful abstraction. Makes for much more readable programs than ad-hoc polymorphism a la vanilla C.

You do realise that STL containers fundamentally rely on templates, right?

We use loads of template classes where I work. Our code would not be maintainable without them.


I know, but that doesn't change the fact that other languages don't have templates and still have containers, meaning maps, vectors, etc.

They have simplified templates, aka generics.

The biggest difference is that templates are more expressive, allowing for compile time meta-programming.


I'd rather say templates are a hack to get generics in C++, because C++ lacks real generics, contrary to e.g. Java, C#, Haskell or Scala. Templates used as generics are limited in that the typechecker can't reason about generic types, only concrete types, which leads to horrendous error messages and makes it impossible to prove the type safety of generic library code. Type templates are templates, not types.

On the other hand templates used as a metaprogramming tool fall behind any modern macro system.


Rust is getting there but still lacks type-level integers, and macros are waiting for macros 2.0. It would also be neat to have the ability to run any function at compile time, but I don't think that's in the cards.

True, but as someone that spends the majority of his programming time between Java and C#, with C++ only when required, I do miss their power.

Also it is way better with latest standard improvements.


Sure, and those without templates, or at least generics, aren't type-safe.

The chief benefits to pure C IMO are that it's more portable, compiles faster, and it's a simple and stable language.

Writing C takes more effort in some cases, so it's a trade-off.


It's goofy, but one of the reasons I'll pick C for a quick program is that both Windows and Linux will give better error messages for standard functions that fail and set errno than for equivalent C++ code that throws an exception.

This isn't a question about the language, it's definitely a quality-of-implementation issue for standard exceptions to have what() give a useful message.

Just to clarify perror will generally give a better error message in:

    FILE* fh = fopen("foo.txt", "w+");
    if (!fh) {
        perror("unable to open foo.txt");
        return 1;
    }
Than what() will in:

    std::fstream fs;
    fs.exceptions(std::ifstream::failbit | std::ifstream::badbit);
    try {
        fs.open("foo.txt");
    }
    catch (const std::exception& ex)
    {
        std::cerr << "unable to open foo.txt: "
           << ex.what() << '\n';
    }
(Ignoring other issues such as whether exceptions are better than return codes, etc.)

A lot of people seem to use stdio even in C++.

Having comprehensible error messages is related to being a simple language, IMO. C++ compile-time errors are notorious for being unreadable. I'm not familiar with the output of what(), but I'd wager that the problem is a consequence of all the abstraction.


>This is a situation where C++ really shines

I agree but passing in the size is really not much of a deterrent.


> you can use C++ with templates and not only have much cleaner code, but also avoid forcing the compiler to inline every single call to array_push_back

Actually, you really, really, really want to inline every call to push_back for std::vector. Here is a relevant presentation by Chandler Carruth https://www.youtube.com/watch?v=s4wnuiCwTGU . But, hey, you don't want to make the compiler work for you :).


I gave up on C++ right after I learned it. At the time, this was before the STL was part of the standard, it looked really cool but I could tell that it was easy to get into trouble and it would take years and years to be really good at it. When Java became popular, it looked like it solved a lot of the problems that I was afraid of with C++. But it was slow and really verbose compared to C++ and C. Recently, I heard someone say that today's C++ is not your grandfather's C++ and the speaker made some interesting comments about how much better it was. So I'm intrigued by it and definitely open to looking into the new C++.

Yep, worth giving another look: I couldn't stand C++ before C++11, and now I can't stand C anymore :)

With the use of defer http://pastebin.com/EXZuRAdT you could create it w/o the need to use array_free.

The dynamic memory allocation of the Array struct itself does not mimic std::vector; it's an extra indirection that C++ does not pay for. You can make it a non-opaque struct in C and copy it around.

std::vector does indeed have at least one pointer member (otherwise you couldn't have a std::vector with automatic storage duration because the size would be unknowable) so there is some indirection. Maybe you're thinking of std::array?

kzrdude is referring to the allocation of the Array struct on the heap. It should be something like this instead:

    Array array_create(size_t size, size_t sizeof_data) {
        Array result;
        result.size = size;
        result.capacity = size;
        if(size) {
            result.data = malloc(size * sizeof_data);
        } else {
            result.data = NULL;
        }
        return result;
    }

I think kzrdude is actually referring to the void* pointers that the individual elements of the array live behind. Each of those requires an allocation to insert them, and an extra pointer traversal to read them. In C++ they would live side by side instead, as in regular C arrays. In C you'd need the struct to be redefined for each element type (maybe with another macro) if you wanted the same efficiency.

I'm not seeing that?

In ARRAY_PUSH_BACK, it just inserts the element directly into the buffer, provided there is enough capacity. There is no separate allocation for an element.

You don't need to redefine the struct for each element type, since the macro casts 'data' (type void *) to whatever the array's type is.


You're totally right, I don't know what I was reading >.<

Yes exactly.

This C code is more like `std::unique_ptr<std::vector<T>>`, so now you might see where the extra indirection is located.

Probably more efficient to store the data array inline, with a flexible array member. That way creating takes only one malloc, and destruction only one free.
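Roughly what that looks like with a C99 flexible array member (a sketch for a single element type; growing still means reallocating the whole block):

    #include <stdlib.h>

    /* One allocation holds both the bookkeeping and the elements. */
    typedef struct {
        size_t size;
        size_t capacity;
        int    data[]; /* C99 flexible array member */
    } IntArray;

    IntArray *int_array_create(size_t capacity)
    {
        IntArray *a = malloc(sizeof *a + capacity * sizeof a->data[0]);
        if (!a)
            return NULL;
        a->size = 0;
        a->capacity = capacity;
        return a; /* later, a single free(a) releases header and elements */
    }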

Those are great. I haven't used C a lot in a long time, but I remember back when I wrote C code for a living that I ran into the exact situation flexible array members are made for around the time I also learned of them.

"That will solve my problem elegantly", I thought, but unfortunately, the compiler we used only understood C89, so my hands were tied.


but unfortunately, the compiler we used only understood C89, so my hands were tied.

You can do it in C89 too, just allocate sizeof(header) + n * sizeof(element).


Yes, but it's not the same. :( { It is fun, though - I once did it to write/read a data structure with two levels of indirection (i.e. an array of arrays) - Slurp the whole thing to memory, then adjust the pointers, and voila. }

In C89 it's trickery and a little bit of black magic (at least a whiff of it), while in C99, it is an officially supported convention.

On x86, where the C code I wrote ran, it wouldn't make much of a difference, but allocating header + array manually also means - if the code needs to run across a variety of CPU architectures - that one needs to look at alignment issues.


IIRC You don't have to play with pointers if you use the null array trick (put an array of size zero at the end of your struct and use it anyway)

Yes, but you can't use flexible array members with a void array. So you will need to have a specialized structure for every data type (or use some macro system to generate these).

It is an excellent alternative for numerics when you usually work with simple types like int and double.


This sort of approach is a pain to use, because you keep having to cast when you're in the debugger, and there's zero type safety. And I'm afraid I don't have much positive to say about something like "((Vector2i * )arr3->data)[0].x = 333".

You can do better than this!

What is an array? It's 3 variables: base, length and capacity. So why not decide that an array is just that. 3 variables of the right size and type.

    #define ARRAY(T,S) T S;size_t S##_length;size_t S##_capacity
Then you can make one like this:

    ARRAY(int,xs);
You'll also need to initialise and destroy these array "objects".

    #define ARRAY_INIT(S)     \
        do {                  \
            S=NULL;           \
            (S##_length)=0;   \
            (S##_capacity)=0; \
        } while(0)

    #define ARRAY_DESTROY(S) \
        do {                 \
            Array_Free(S);   \
            ARRAY_INIT(S);   \
        } while(0)

    
And you'll probably want to add an item to an array too.

    #define ARRAY_ADD(S,X)                     \
        do {                                   \
            if((S##_length)>=(S##_capacity)) { \
                S=Array_Grow(S,                \
                             sizeof *S,        \
                             &(S##_length),    \
                             &(S##_capacity)); \
            }                                  \
            S[S##_length++]=(X);               \
        } while(0)
So you might use them like this:

    ARRAY(int,xs);
    ARRAY_INIT(xs);
    for(int i=0;i<100;++i)
        ARRAY_ADD(xs,i);
    ARRAY_DESTROY(xs);
Array_Free is very simple, and Array_Grow is barely more complicated (however I wrote it off the cuff, so of course it could still be wrong). Both of these mainly exist just to keep stdlib.h out of the header.

    void Array_Free(void *p) {
        free(p);
    }

    void *Array_Grow(void *base,size_t stride,size_t *length,size_t *capacity) {
        *capacity+=*capacity/2;
        *capacity=MAX(*capacity,MAX(MIN_CAPACITY,*length));
        return realloc(base,*capacity*stride);
    }
Array accesses and iteration and the like are just done in the traditional way:

    for(size_t i=0;i<xs_length;++i) {
        printf("%d\n",xs[i]);
    }
Even performs nicely with -O0.

For a full implementation you'll probably also need a way of generating a static array. (I mainly found myself needing this for test code, which uses globals for convenience; most arrays I create normally are locals, or parts of structs.)

You'll also need a parameters list for use in a function declaration or definition, and a macro that expands to all 3 variables.

    #define ARRAY_PARAMS(T,S) T *S,size_t S##_length,size_t S##_capacity
    #define ARRAY_ARG(S) S,S##_length,S##_capacity
Like then you might have a function that takes a pointer to an "array":

    void FunctionThatTakesAnArray(ARRAY_PARAMS(T,*p));
And you call it like this:

    ARRAY(T,myarray);
    FunctionThatTakesAnArray(ARRAY_ARG(&myarray));
(I found this cropped up often enough that I needed the macro, but it was less common than I thought.)

There's more you can do, but the above is the long and the short of it.

This might all look terrible - or perhaps it sort of looks OK, but you're just not sure that it would actually work - but I've used this in a prototype project and thought it worked out well. (I've been using C for 20+ years, so hopefully even if I've got no taste, I've at least got a rough feel for what works out OK and what's going to end up a disaster.)


This gets messy in a couple of ways; for example, what if you want to pass two of them to a function? Then the names of the parameters generated by the ARRAY_ARG macro will clash and you'll have to add a counter to it, etc. (Also I'm not sure that you can concatenate `* p` with `_length` in `S##_length` where `S` is `* p`, and the same thing for the other, but I understand what you meant.) You'll also have potentially very confusing errors for the users of your library when they happen to create a variable whose name collides with one that the macro generates. And those are just the cursory observations.

This is what the macros expand to.

From:

    $ cat test.c
    #define ARRAY(T,S) T S;size_t S##_length;size_t S##_capacity
    #define ARRAY_INIT(S)   \
            do {                \
                S=NULL;         \
                S##_length=0;   \
                S##_capacity=0; \
            } while(0)

    #define ARRAY_DESTROY(S) \
            do {                 \
                Array_Free(S);   \
                ARRAY_INIT(S);   \
            } while(0)
    #define ARRAY_ADD(S,X)                                 \
            do {                                               \
                if(S##_length>=S##_capacity)                   \
                    S=Array_Grow(S,sizeof &S,&S##_length,&S##_capacity); \
                S[S##_length++]=(X);                           \
            } while(0)


    void Array_Free(void *p) {
    	free(p);
    }

    void *Array_Grow(void *base,size_t stride,size_t *length,size_t *capacity) {
    	*capacity+=*capacity/2;
    	*capacity=MAX(*capacity,MAX(MIN_CAPACITY,*length));
    	return realloc(base,*capacity*stride);
    }

    #define ARRAY_PARAMS(T,S) T *S,size_t S##_length,size_t S##_capacity
    #define ARRAY_ARG(S) S,S##_length,S##_capacity


    void test(ARRAY_ARG(p), ARRAY_ARG(a))
    {
    }

    int main() {
    	ARRAY(int, myarray);
    	ARRAY(int, ourarray);
    	test(ARRAY_ARG(myarray), ARRAY_ARG(ourarray));
    }
To:

    $ gcc -E test.c
    # 1 "test.c"
    # 1 "<built-in>"
    # 1 "<command-line>"
    # 31 "<command-line>"
    # 1 "/usr/include/stdc-predef.h" 1 3 4
    # 32 "<command-line>" 2
    # 1 "test.c"
    # 22 "test.c"
    void Array_Free(void *p) {
        free(p);
    }

    void *Array_Grow(void *base,size_t stride,size_t *length,size_t *capacity) {
        *capacity+=*capacity/2;
        *capacity=MAX(*capacity,MAX(MIN_CAPACITY,*length));
        return realloc(base,*capacity*stride);
    }

    void test(p,p_length,p_capacity, a,a_length,a_capacity)
    {
    }

    int main() {
        int myarray;size_t myarray_length;size_t myarray_capacity;
        int ourarray;size_t ourarray_length;size_t ourarray_capacity;
        test(myarray,myarray_length,myarray_capacity, ourarray,ourarray_length,ourarray_capacity);
    }

Looks like fairly reasonable code.

Edit: I would like to see it take a type in the Params and a __typeof__ while passing the args to make sure you know what is going where.


Thanks for trying it out - my post was assembled by copying bits out of the (rather gnarlier) code that I actually used, so I'm glad it mostly survived the process ;)

Interesting point about the type checking; my thinking was that the compiler could check they matched, and that this would suffice - and sure enough it worked absolutely fine in practice. But now that I'm made to think about it again, I think it's probably still not quite good enough to be perfect, because you could do this:

    void f(ARRAY_PARAMS(const char *,*xs)) {
        ARRAY_ADD(*xs,"fred");
    }

    ...
    ARRAY(char *,xs);
    ARRAY_INIT(xs);
    f(&xs);
And now you've got a char * that points to a const string. Erm... that's not good!

EDIT: maybe I see what you're getting at with ARRAY_ARG now. The intention is that you use ARRAY_PARAMS to generate the text for the function declaration or definition (that's why it has the type in it), and ARRAY_ARG to generate the text for the code where you pass one to a function so declared (that's why it's just 3 names - they're intended to be expressions, not names for function parameters). That means test should (?) be like this:

    void test(ARRAY_PARAMS(int, p), ARRAY_PARAMS(int, a))
    {
    }

Hopefully that makes sense.

Maybe they'd have been better off with the common C terminology of formal and actual parameters. Then you'd have ARRAY_FORMAL_PARAMS for ARRAY_PARAMS, and ARRAY_ACTUAL_PARAMS for ARRAY_ARG.


ARRAY_PARAM (as I assume you mean?) has no problem with two arrays. You just give them different names. Suppose you do this:

    void CopyIntArray(ARRAY_PARAMS(int,*dest),ARRAY_PARAMS(int,src))
Now you end up with this:

    void CopyIntArray(int **dest,size_t *dest_length,size_t *dest_capacity,
                      int *src,size_t src_length,size_t src_capacity);
And you can call it like this:

    ARRAY(int,xs);
    ARRAY(int,ys);
    CopyIntArray(ARRAY_ARG(&xs),ARRAY_ARG(ys))
Your point about token pasting with * p is a very good one, and I don't think that had occurred to me... but neither clang, gcc or VC++ seems to mind (and I used a number of different versions of each). I need to go and look up what the C standard has to say about this now.

I do note that I didn't use ARRAY_PARAM or ARRAY_ARG all that much in my code, though - but I don't remember whether this is because I found some problem with them in practice, or whether it just ended up that way.

(I'm on OS X right now and I just tried my code with clang. Probably-relevant compile flags were "-std=c1x -Wall -Wuninitialized -Winit-self -pedantic -Werror=implicit-function-declaration -Wsign-conversion -Wunused-result -Werror=incompatible-pointer-types -Werror=int-conversion -Werror=return-type -Wno-overlength-strings -Wunused-parameter".)


Ah good catch, I misread it.

> #define ARRAY(T,S) T S;size_t S##_length;size_t S##_capacity

I don't understand why you don't wrap this in a struct, something like (not tested):

    #define MAKE_ARRAY_T(T) typedef struct array_##T { T *data; size_t length; size_t capacity; } array_##T;
(This could be generalized for types that don't paste cleanly with ##, requiring the user to specify an extra type name.)

This would buy you several advantages:

- shallow copies of arrays using =

- easier parameter passing

- easier declarations due to real type names: array_int my_integer_array;


This way you either have to declare every type you want to use this with at file scope (which is fine and some people do this, but it's a little irksome), or not be able to pass them as parameters to functions. Also it doesn't solve the problem of function genericity.

Shallow copies aren't important to me, and after years of using C++ templates and C# generics I've come to quite like having the type where I can see it!

As for why I didn't use this approach in general, a couple of reasons:

- I often have arrays of pointers. Now you need a second parameter, to give the struct its name...

- You need to decide in advance all the possible array types you're going to have, and keep the list up to date. I wanted this to feel a lot more like using std::vector

- You can't redeclare a struct, even with the same declaration. So if you have MAKE_ARRAY_T(int) in one header, because your object has an array of int, and MAKE_ARRAY_T(int) in another, because ditto, you can't include both headers from the same file. So again you need to be able to give the type a name

- Structs can't be mixed and matched as flexibly as variables. So say you've got MAKE_ARRAY_T(int,Module1Ints) in the header for module 1 and MAKE_ARRAY_T(int,Module2Ints) in the header for module 2 - both are arrays of ints, and you avoid the naming problems I described above. But now you can't use one in place of the other, even when this would make sense

It also doesn't help with passing your arrays into your generic array functions, since you don't have a single type that you can pass around. So you're restricted to passing in the 3 individual pieces, or inlining snippets of code via macro, like mine does.

Something I did try in the past was using anonymous structs:

    #define ARRAY(T) struct { T *p; size_t len, cap; }
However, anonymous structs are always different, even if they're structurally equivalent, so... no go. (I also dimly remember Visual Studio not even being able to show you the struct in the debugger!) But I guess trying that out was probably a stepping stone on the way to my thinking of the code I've put here.

> It also doesn't help with passing your arrays into your generic array functions, since you don't have a single type that you can pass around. So you're restricted to passing in the 3 individual pieces, or inlining snippets of code via macro, like mine does.

When I did something like this, the type definition macro also defined strongly typed implementations of all the array functions for the given type T. Then a call is a real function call and is only inlined if the compiler chooses to do it.

The other points you mention were not problems in the particular application I was working on, but yes, these are interesting trade-offs.


I think the main drawback for every C implementation, including this one, is that you still have to write a new version of every algorithm for every type that you want to use your array struct with, unless you put all your functions in macros too. And even then, they'll only work with pointers, not with other data structures like the C++ algorithms will, unless you start doing dynamic dispatch which is what we want to avoid.

I think C++ solves this in a neater way (not saying it's good, just better) with templates, the iterator idea, and the algorithms library because you only write things once and the code is only generated for each type (not each use of the function like it would with macros).
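For example, the same algorithm call written once works across containers (a trivial illustration):

    #include <algorithm>
    #include <array>
    #include <iterator>
    #include <vector>

    int main() {
        std::vector<int> v = {3, 1, 2};
        std::array<int, 3> a = {3, 1, 2};
        int c[] = {3, 1, 2};

        // One std::sort, written once, instantiated per iterator type:
        std::sort(v.begin(), v.end());
        std::sort(a.begin(), a.end());
        std::sort(std::begin(c), std::end(c));
    }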


Instead of putting it all in macros you could write the algorithms once in a "template include file" array_template_impl.h:

    #define CONCAT2(a, b) a##b
    #define CONCAT(a, b) CONCAT2(a, b)
    #define ARRAY CONCAT(array_,ARRAY_ELEMENT_TYPE)
    #define ARRAY_FUNC(name) CONCAT(CONCAT(array_,ARRAY_ELEMENT_TYPE), CONCAT(_,name))

    typedef struct {
        size_t size;
        size_t capacity;
        ARRAY_ELEMENT_TYPE *data;
    } ARRAY;

    ARRAY *ARRAY_FUNC(create)(size_t size) {
        ARRAY *p = malloc(sizeof(ARRAY));
        p->size = size;
        p->capacity = size;
        if(size) {
            p->data = malloc(size * sizeof(ARRAY_ELEMENT_TYPE));
        } else {
            p->data = NULL;
        }
        return p;
    }

    //...

    #undef ARRAY
    #undef ARRAY_FUNC
    #undef ARRAY_ELEMENT_TYPE
Then you can use them with the #include code generation trick like this:

    #include <stdlib.h>

    #define ARRAY_ELEMENT_TYPE int
    #include "array_template_impl.h"

    #define ARRAY_ELEMENT_TYPE float
    #include "array_template_impl.h"

    int main() {
        array_int *ai = array_int_create(0);
        array_float *af = array_float_create(0);

        for(int i = 0; i < 10; ++i) {
            array_int_push_back(ai, i);
            array_float_push_back(af, i * 0.3f);
        }

        array_int_free(ai);
        array_float_free(af);
        return 0;
    }
(In practice you might want even more indirection files: Put #define and #include in array_int.c. and array_float.c, and create also array_int.h, array_float.h and array_template_decl.h with just the usual declarations.)

This is a cool technique, but I tried compiling the first block of code you posted (with ARRAY_ELEMENT_TYPE being defined to be int* ) and got "error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘*’ token" from gcc 6.2 using the flags that to3m specified in one of the parent posts posts. Is there a special trick I can use to get that to work?

You'd need a second #define, something like:

    #define ARRAY_ELEMENT_NAME int_ptr
    #define ARRAY_ELEMENT_TYPE int*
    #include "array_template_impl.h"
And then use ARRAY_ELEMENT_NAME only in the definitions of ARRAY and ARRAY_FUNC, and use ARRAY_ELEMENT_TYPE instead everywhere else.

Actually now that I think about it, it might be better to just make a typedef and then you can go back to using the same name everywhere. I think that would solve the problem.

This technique seems to take care of the bulk of the use of templates in C++, i.e. simple data structure or function definitions. But it's not a full replacement, because I don't think you can use this to do things like pass an integer to one of these templates, then have that template use another template and pass a calculation based on that number to the other template like you could do in C++. Something like this:

    template<int X>
    struct foo_t { /* ... */ };
     
    template<int Y> 
    struct bar_t {
       foo_t<Y / 2> foo;
    };
so that bar_t<50> and bar_t<24> are different types which contain a foo_t<25> and foo_t<12> respectively.

But it's cool nevertheless.


Ah yes, using a typedef would work too, nice.

It's for sure not quite the same as C++ templates, but if you can tolerate crazy things you can do a lot with the preprocessor. See http://www.boost.org/libs/preprocessor/ (supports both C++ and C).

If you have foo_template.h:

    #define FOO_T CONCAT(CONCAT(foo_,X),_t)
    struct FOO_T { /* ... */ };
and bar_template.h:

    #define X Y/2
    #include "foo_template.h"

    #define BAR_T CONCAT(CONCAT(bar_,Y),_t)
    struct BAR_T {
       FOO_T foo;
    };
It might almost work, but you'd need to pull out some more tricks I'm sure. Maybe BOOST_PP_DIV(Y,2) would help. In practice I'd prefer something sane. :)

I think it's cleaner to allocate the header information before the actual pointer (or, conversely, to return the pointer sizeof(array_header) into the allocation). Then it's trivial to pass around.

  struct vec_header_t {
      size_t length;
      size_t capacity;
  };

  static inline struct vec_header_t *vec_to_header(void *vec)
  {
      return ((struct vec_header_t *)vec) - 1;
  }

  #define vec_length(vec) (vec_to_header(vec)->length)

  static inline void vec_free(void *vec)
  {
      if (vec)
          free(vec_to_header(vec));
  }

  #define vec_foreach(vec, iter) \
      for ((iter) = (vec); (iter) < ((vec) + vec_length(vec)); ++(iter))
It's slightly more cognitive overhead when, for example, debugging, but the vast improvement in usability (no special macros for normal/static declaration, trivial passing to functions, etc) is worth it IMHO.
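For completeness, creation in this scheme might look something like the following (a sketch to go with the snippet above; vec_new is a made-up name):

  #include <stdlib.h>

  /* Allocate header + elements in one block and hand back the pointer just
   * past the header, so callers index it like a plain array. */
  static void *vec_new(size_t capacity, size_t elem_size)
  {
      struct vec_header_t *h = malloc(sizeof *h + capacity * elem_size);
      if (!h)
          return NULL;
      h->length = 0;
      h->capacity = capacity;
      return h + 1; /* vec_to_header() steps back by one to find it again */
  }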

Compared to the code in the linked post, I guess I'm a lot more strongly in favour of type safety, which this solution has too. But as a bear of very little brain, I really must insist on no extra cognitive overhead while debugging!

Sadly there's no really good way of doing this in C, so whatever you do there's a tradeoff somewhere. You just have to decide what type of crap you want to put up with, and code accordingly ;)

(If you're working on your own, I do say this is a fair argument for at least giving C a go when you might otherwise have gone with C++. If you can come up with some stuff that doesn't annoy you in any way you care about, and you don't mind the fact that C lacks a few of C++'s creature comforts, you'll reap the benefit of hilariously better build times.)


Is there a simple way to write the ARRAY_PUSH_BACK without a macro?

Probably, if you use memcpy and store the size of the data type as an extra field in the containing struct.

I was thinking about that but I don't know what to do with the type of the param you're accepting. Could you do a const void* ref? Might just be easier to make it variadic.

See this comment https://news.ycombinator.com/item?id=13346432 . At this point creating a function like:

    array_push_back(int nr_arguments, ...);
that will replace the original macro ~~should be possible~~. Actually it won't work without using non-standard compiler extensions like typeof.

If you have a pointer-only array you could just make it based on void*. Primitives and by-value arrays make it a lot harder.
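To make the memcpy suggestion above concrete, here's a rough sketch (the struct layout and names are made up for illustration, not taken from the article):

    #include <stdlib.h>
    #include <string.h>

    typedef struct {
        size_t size;       /* elements in use */
        size_t capacity;   /* elements allocated */
        size_t elem_size;  /* sizeof the stored type, fixed at creation */
        char *data;
    } generic_array;

    /* Copy one element from src into the array. The caller passes the element
       by address, so any type works, but there is no compile-time type check. */
    static int generic_array_push_back(generic_array *arr, const void *src)
    {
        if (arr->size == arr->capacity) {
            size_t new_cap = arr->capacity ? arr->capacity * 2 : 8;
            char *p = realloc(arr->data, new_cap * arr->elem_size);
            if (!p)
                return -1;
            arr->data = p;
            arr->capacity = new_cap;
        }
        memcpy(arr->data + arr->size * arr->elem_size, src, arr->elem_size);
        arr->size++;
        return 0;
    }
Called as int x = 42; generic_array_push_back(&a, &x); (with a created with elem_size set to sizeof(int)). The downside is that nothing stops you from pushing a double into an array created for ints.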

I wrote pretty much this exact implementation of a dynamic array once. For several data types.

And I had that same idea: let's use macros to fake generic programming. But while I admire the trickery some people pull off using the C preprocessor, I admire them from afar. My coworkers would not have let me get away with that, anyway.

I am not a C++ programmer, but templates are immensely powerful, and after learning about them (a little, at least), I found statically typed languages without some form of type-generic programming to be very bothersome.

Looking at C++ as "C with Templates" instead of "C with Classes" gives a very different picture (plus, Classes and such are still around in case they are needed, anyway). Every other year or so, I try to get my C++ up to usable standards, but I do not need it for work (except for that one time about three years ago), so I eventually lose interest. Maybe approaching C++ as "C with Templates" is a more promising route.


> Maybe approaching C++ as "C with Templates" is a more promising route.

Genius. I'm doing this from now on.


For many years now I've treated C++ as "C with destructors and the STL". Examples:

1. RAII: https://github.com/akkartik/mu/blob/61fb1da0b6/010vm.cc#L484

2. STL: https://github.com/akkartik/mu/blob/61fb1da0b6/020run.cc#L50

I really don't use anything else.


Oh yes, don't forget RAII as I did! (The name is super-awkward, but the concept is mind-blowingly awesome when it fully sinks in.)

Other languages have begun to pick up on this: think of Python's with statements and C#'s using blocks (using (var x = new SomeClass()) { ... }). But C++ still makes it easier to take advantage of this feature.

Unless you play around with setjmp/longjmp. But to do that, you have to be ... special enough to not care about deterministic invocation of destructors in the first place. ;-)


The author could have just used glib.

Also, if one is not restricted because of license conditions, the Judy Array comes to mind: https://en.wikipedia.org/wiki/Judy_array

The API is very easy, and it's really fast.


You could add some function pointers to the struct, initialize them, and then you could do:

    Array a;
    a.add(&a, item);
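A rough sketch of that idea (everything here is hypothetical, just to show the shape):

    #include <stdlib.h>

    typedef struct Array Array;

    struct Array {
        size_t size, capacity;
        int *data;
        /* stored "method" pointer, so call sites read as a.add(&a, item) */
        int (*add)(Array *self, int item);
    };

    static int array_add(Array *self, int item)
    {
        if (self->size == self->capacity) {
            size_t new_cap = self->capacity ? self->capacity * 2 : 8;
            int *p = realloc(self->data, new_cap * sizeof *p);
            if (!p)
                return -1;        /* leave the array untouched on failure */
            self->data = p;
            self->capacity = new_cap;
        }
        self->data[self->size++] = item;
        return 0;
    }

    static Array array_init(void)
    {
        Array a = {0};            /* empty array, no storage yet */
        a.add = array_add;        /* wire up the "method" */
        return a;
    }
You still have to pass &a explicitly, so it mostly buys call-site syntax at the cost of an extra pointer per instance and an indirect call.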


I wrote a similar thing for C99 [1], but "safe" (bounds-checked), with both stack and heap allocation, and many other array/vector functions [2], including integer-optimized sort (in-place MSD binary radix sort, which is available in typical C++ sort implementations, but not in C, where the default qsort() relies on comparison functions). With some benchmarks, too [3]

[1] https://github.com/faragon/libsrt

[2] https://faragon.github.io/svector.h.html

[3] https://github.com/faragon/libsrt/blob/master/doc/benchmarks...



For cases where malloc checks complicate the example, it doesn't hurt to use assert instead.

AFAIK when the array is created, p->size should be set to 0, not to the size argument.

This mimics C++ vector, e.g.:

    vector<int> vec(5);
    vec.push_back(11);
    vec.push_back(12);
Now you have in vec:

    0, 0, 0, 0, 0, 11, 12
But your suggestion is better from an API point of view.

Oh... thanks! I was not aware.

PS: please note that there is no memset nor any other means to zero-initialize the elements. So the code appears to have a bug anyway.

Correct, this is a bug. calloc will better mimic the way C++ std::vector works.
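i.e. in the create function, something along these lines (the field names here are assumed, not copied from the article):

    /* calloc gives zeroed elements, matching vector<int> vec(n);
       malloc would leave them indeterminate */
    p->data = calloc(capacity, sizeof(int));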

I wrote something similar in C99 as well for another project. Initially, I used the same form as in this blog post; however, it's easy to see that the ergonomics for accessing the data are pretty terrible.

I eventually moved to a solution where I prepended the capacity and size to the block returned to the caller and then wrote helper functions that accessed/modified these values. This way the caller can access values in the returned array just as they would one returned from malloc.

The code (note, the `vec` type is just a typedef'd `void*`): https://github.com/crossroads1112/marcel/blob/master/src/ds/...


> A possible approach is to use a macro

When macros start being used for metaprogramming in C, it's time to reconsider using C++.


When it's time to reconsider using C++, it's time to consider using something else :-)

Like the D programming language? :-)

I do see people extending C++ using the preprocessor for more complex metaprogramming. Those should consider D.

Why does array_push_back need three parameters? Couldn't the macro just use sizeof on the second param?

Not without some serious modifications; see line 6 of the macro:

    data_type *pp = arr->data;\
this is expanded to something like (when data_type is double):

    double *pp = arr->data;
You can get rid of the third parameter. Just store the size of the data type in the container struct and use memcpy. Something like this (probably slower than the original):

    char *pp = arr->data;\
    memcpy(pp + (size - 1) * arr->size_of_data, &(x), arr->size_of_data);\

I would recommend David R. Hanson's "C Interfaces and Implementations: Techniques for Creating Reusable Software" (Addison-Wesley Professional Computing Series, 1997, ISBN 0-201-49841-3).

https://github.com/kev009/cii/blob/master/src/array.c - this leaves resizing to the caller, but that could be retrofitted in. Most important is how the book explains everything.


Ugh. Multi-line macro.

How about

    #define PUSH_BACK(a,x,t) push_back(a,&x,sizeof(t))

No multi-evaluation problems or other madness.

Edit: Actually that won't work for expressions. So

    #define PUSH_BACK(a,x,t) do { t tmp = (x); push_back(a, &tmp, sizeof(t)); } while(0)

is slightly better.
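With that, the call site can take arbitrary expressions, e.g. (assuming arr is whatever handle push_back expects):

    PUSH_BACK(arr, i * 2 + 1, int);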

