CppCon 2022 Best Practices Every C++ Programmer Needs to Follow – Oz Syed (isocpp.org)
2 points by mikece | 2023-05-19 | 77 comments




Recommending counting new/delete operations in 2023? What about smart pointers?

I would imagine that nowadays you should use new/delete only if you have to account for every CPU cycle to literally get every drop of performance...
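For reference, DIY "counting" would look something like overloading the global allocation functions. A minimal sketch (names are made up), and exactly the kind of manual bookkeeping that smart pointers, or ASan/Valgrind, make unnecessary:

    // Count live heap allocations by replacing global operator new/delete.
    #include <atomic>
    #include <cstdio>
    #include <cstdlib>
    #include <new>

    static std::atomic<long> g_live_allocs{0};

    void* operator new(std::size_t size) {
        ++g_live_allocs;
        if (void* p = std::malloc(size)) return p;
        throw std::bad_alloc{};
    }

    void operator delete(void* p) noexcept {
        if (p) { --g_live_allocs; std::free(p); }
    }

    int main() {
        int* leak = new int(42);                 // never deleted
        (void)leak;
        std::printf("live allocations: %ld\n",   // prints 1: the leak above
                    g_live_allocs.load());
    }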


His larger point was to focus on memory management. He mentioned new and delete in reference to ensuring they match up to avoid memory leaks. He also mentioned smart pointers. If you wanted every drop of CPU performance, you probably wouldn't be doing lots of memory allocation. You would probably look at every operation in your critical path/loop and ask 1) do I have to do this at all? 2) do I have to do this right here/now? Cut out, defer, and pre-allocate until you get as fast as you can.

I was surprised to see no talk of RAII when it came to memory management

He says "use smart pointers, if you have to" at around 1:33.

But anyway, if performance matters at all you shouldn't heap-allocate in any hot code path in the first place. E.g. if memory management overhead is showing up in profiling, don't start looking for a faster general-purpose allocator, but instead rethink your memory management strategy ... and "I'll just put every C++ object behind a smart pointer" isn't really a memory management strategy ;)


Junior C++ dev here, but I feel like "use smart pointers, if you have to" is not the sentiment I usually hear.

Isn’t it usually “use raw pointers, if you have to”?


Yes. The main problem with the talk is that it's just too general. It gives good advice, but it's only really good advice if you understand what it's trying to say, at which point you don't need the advice. Kinda impossible to cover "what every programmer needs to know" in 5 minutes.

So, "use smart points, if you have to", should have been: "if you need to allocate memory dynamically, use smart pointers".


> if you need to allocate memory dynamically, use smart pointers

Oooh. Okay. :p I was like, "As opposed to what, raw pointers?!"


As always the answer is "it depends". Smart pointers kind of 'encourage' a style where each C++ object lives in its own heap allocation, but this quickly becomes a problem once you deal with tens or hundreds of thousands of small C++ objects that way, and on top of that a high number of those objects being frequently created and destroyed (for instance with std::make_unique or std::make_shared). I call this a "Java-style C++ code base".

Technically this works and it's reasonably safe, because RAII takes care of memory management, just like GC takes care of memory management in Java.

But the problem there is deeper: it's not smart pointers vs raw pointers, but that each small object lives in its own heap allocation.

That's the typical scenario where you will see memory management becoming a performance problem (for at least two reasons: (1) heap allocation and especially deallocation isn't free, and (2) lots of cache misses when accessing objects that are spread more or less randomly over the address space).

Ideally you'd group many similar objects into long-lived and compact arrays, process the data in those arrays in tight loops, minimize pointer indirections, and minimize allocations (ideally you'd only allocate those arrays once at program start, or when those arrays need to grow). Once you've done all those things, memory management suddenly becomes a non-problem (because you only have a handful of long-lived allocations to care about), and RAII becomes a lot less useful (because the 'objects' in those arrays will most likely just be C-style plain-old-data structs). This is basically the 'antithesis' to OOP. And before you know it you're back to writing C code, like it happened to me ;P
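To make the contrast concrete, a sketch (the Particle type is made up):

    #include <cstddef>
    #include <memory>
    #include <vector>

    struct Particle { float x, y, vx, vy; };

    // "Java-style": one heap allocation per object, pointer chasing everywhere.
    std::vector<std::unique_ptr<Particle>> java_style;

    // Data-oriented: one allocation up front, contiguous and cache-friendly.
    std::vector<Particle> particles;

    void init(std::size_t n) {
        particles.reserve(n);              // allocate once...
        for (std::size_t i = 0; i < n; ++i)
            particles.push_back({});       // ...no further heap traffic
    }

    void update(float dt) {
        for (Particle& p : particles) {    // tight loop over contiguous data
            p.x += p.vx * dt;
            p.y += p.vy * dt;
        }
    }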

TL;DR: there is no silver bullet for automatic memory management if performance matters.


I might be totally missing something here, but wouldn’t using raw pointers include using new/delete, so all those small C++ objects would also live on the heap.

Are you saying smart pointers and their overhead make the heap allocations even bigger and that's the cause of the cache misses?


I guess this got lost in the wall of text :)

> But the problem there is deeper: it's not smart pointers vs raw pointers, but that each small object lives in its own heap allocation.

Smart pointers do encourage creating many small heap allocations though (because unlike raw pointers they manage ownership).


You might want to watch the "GoingNative 2013 C++ Seasoning" video, or at least check its slides [1]. It may sound old, but the part about pointers is essential for anyone who wants to write C++.

IMHO from gamedev: HN is not going to get it ever. They either have moved to high level languages or do not need performance at all.

1. https://sean-parent.stlab.cc/presentations/2013-09-11-cpp-se...


One developer's "best practice" is another developer's nightmare. Much of it depends on context.

Examples?

goto.

> In this session, learn some of the best practices that every C++ programmer needs to ensure successful completion of a project.

I find it annoying when people say you “need” some best practice. The things you actually need are generally enforced by the tools (compiler errors in this case?). Everything else is subjective and/or depends on your use case.


I would disagree. In most programming languages, compiler errors and even linters lag by orders of magnitude behind the practices that will actually let you write code that stands a chance of being safe and secure.

I haven't looked at this specific list yet, but I am yet to see a C++ project in which code works just by following what the tools tell you.


I find tools very helpful for reminding me of rules I forgot to apply, but they can only detect a limited subset of the places where I forget to apply them.

Maybe I’m arguing semantics, but my point is “need” seems a little subjective here. You can write insecure/unsafe/unreliable code and it’ll still run and compile just fine. You probably don’t want to, but you don’t “need” every program to be safe.

That's not trivial, in particular with C++. However, the talk actually presents setting up tooling as early as you can as a best practice: multiple different compilers, regression test suites on all targeted platforms, and so on.

> The things you actually need are generally enforced by the tools (compiler errors in this case?).

No. This is specifically C++ which has IFNDR ("Ill-formed, No Diagnostic Required") which means the ISO document says there are things (a lot of things it turns out, WG21 appears to have given up even trying to enumerate them) which aren't valid C++ - and so you mustn't do them because the resulting program is meaningless - and yet the compiler isn't expected to detect them and reject your program.

Henry Gordon Rice wrote an important PhD thesis about computation in like 1951. Rice's Theorem says that all non-trivial semantic properties are Undecidable. But we want semantic properties! So, if your programming language is going to have non-trivial semantic properties then you have two practical options:

1. Accept all programs which may have the desired semantic properties, since you can't always decide which those are, you give up and accept programs where you aren't sure, these programs are nonsense, but too bad. That's C++ with IFNDR

2. Reject all programs which may not have the desired semantic properties, since you can't always decide which those are either, you give up and spit out a compiler error when you aren't sure. If a program is rejected maybe the human programmer will rewrite it so that it's acceptable, which is a burden on them. That's what Rust does.


Could you give a concise example of IFNDR in C++? I'm having trouble picturing what that would look like.

The most classic example of IFNDR is ODR the One Definition Rule, which goes like this: In the final C++ program, there can only be one definition of any particular thing, however, C++ is actually built by taking individual source code files and just pasting in stuff (using #include) and when you do that, obviously you'll often be defining the same stuff, each time. So, what if the definitions are different?

For example maybe when mainprogram.cpp is compiled, the variable max_princesses is defined as the literal 10, but when the princess.cpp file is compiled, perhaps several minutes later, the variable max_princesses is now defined as the literal 0.5 - that's not even the same type! What happens? The ODR means that's IFNDR, so instead of C++ needing to somehow guarantee that this definitely is caught by compilers [these days some compilers will catch some ODR violations but it's not all of them and not always] the standard just says too bad, that's not a valid C++ program so whatever it does is your problem.
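In code, that scenario looks roughly like this (two translation units; the sketch uses a C++17 inline variable so a header may legally define it):

    // common.h as #included by mainprogram.cpp:
    inline int max_princesses = 10;

    // common.h as #included by princess.cpp, recompiled minutes later
    // after someone edited the header:
    inline double max_princesses = 0.5;

    // Linking mainprogram.o and princess.o gives two different definitions
    // of the same entity: an ODR violation. No compiler or linker is
    // required to diagnose it; the program is IFNDR, so whatever the
    // resulting binary does is officially nobody's problem but yours.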

A really shiny modern example of IFNDR is C++ 20 Concepts semantic requirements. See, functionally C++ 20 Concepts are just syntax matching, but their names imply semantic value, the standard says they do have semantic value, but it's not actually enforced by the tooling, so syntactically float (a floating point number) matches the concept std::totally_ordered - but of course floats aren't actually totally ordered, that's silly. How do they square this circle? IFNDR. Using a type which matches the syntax, but doesn't fulfil the semantic criteria means your C++ program is ill-formed, but the compiler wasn't expected to tell you about that, your program is just gibberish, it might do anything - after all, floats aren't in fact a totally ordered type.
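Concretely (C++20; the largest() function is just an invented illustration):

    #include <algorithm>
    #include <concepts>
    #include <limits>
    #include <vector>

    // The constraint below is checked purely syntactically: float has
    // <, <=, ==, etc., so it satisfies std::totally_ordered...
    template <std::totally_ordered T>
    T largest(const std::vector<T>& v) {
        return *std::max_element(v.begin(), v.end());
    }

    int main() {
        float nan = std::numeric_limits<float>::quiet_NaN();
        std::vector<float> v = {1.0f, nan, 2.0f};
        // ...but NaN compares false against everything, so float does not
        // actually model a total order. Violating a concept's semantic
        // requirements is IFNDR: this compiles cleanly, and nothing
        // guarantees what it does.
        float m = largest(v);   // may be 2.0f, may be NaN, may be anything
        (void)m;
    }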


Thanks.

So of your two examples, here's what I would expect. In the first one, I would expect that either everything would work, or I'd get a syntax error, depending on which definition was actually used at the time I compiled the line in question. Are there examples where anything else happens besides those two options? (OK, I guess there's also the option that two different compilation units have two different definitions at the time I compile them, and then I link them together. Best case the linker catches it; worst case I'm doing floating point operations on what is sometimes an integer variable, and I can see some very weird things happening from there.)

And in the second example, I'd expect everything to work right up until I tried to sort (or whatever) the floating point numbers, at which point it would either work, or fail to sort, or infinite loop, depending on the exact floating point values that the program was operating on. Here I could see there being other options, depending on exactly what the program was trying to do, but not dramatically different. And, would it do anything but work as expected if there were no NANs or negative zeroes or something exotic like that?

That is: Despite the "it can do anything" statements, in practice, with production compilers, does it do completely unreasonable things? Or does the "anything" it does have some reasonableness to it?

Yeah, I know, I'm not guaranteed that. In practice, I don't actually care how my code might break on Windows 3000 with its new 197-bit bytes. I care some about problems on platforms and compilers that it's reasonably likely to need to run on someday. (I'd care more if I were writing library code, and even more if I were writing code for the STL. But I'm not, and while portability is desirable, it's not the only input to decisions.)


Well, sort of. The problem is that it's not up to you what counts as reasonable. As you observed stuff will get pretty crazy if the actual machine code executed tries to do floating point arithmetic on integer values for example. Because of the ODR C++ washes its hands of responsibility, no matter why it happened (e.g. re-used the name of a macro, forgot to use a dummy do-while loop) the ODR itself means that's not a C++ program so it's not their problem. You are not entitled to an error. Some tools might make best efforts to report ODR violations but they are all fallible.

The compilers aren't intentionally breaking your code, but they have no responsibility for what happens once you break the rules. I care about Correctness too much for this to be acceptable. If I wrote 100 programs and ten are faulty, I'd rather have twenty errors, half of which are false positives and half catch my ten mistakes, than five errors and five of the "working" programs compile but have mysterious bugs because they're faulty.


How does modern C++ compare to Rust? As in, if someone is already familiar with C++, and is willing and able to use c++20, and maybe beyond, features, are there still some compelling reasons to switch to Rust?

Rust does have a few drawbacks (custom allocators, some people miss SFINAE, I don't). However, the Rust standard library is, IMHO, vastly better designed than the C++ stdlib, in particular when it comes to concurrency. Also, the toolchain is quite good. And of course, linting is miles ahead of C++ and will presumably stay there forever, as Rust code is much easier to analyze than C++.

So, I would say yes, it is worth it. But truly, it probably depends on what you're coding.


IMO a major exception is that if you’re interfacing with complex C++ libraries, it’s probably better to stay in C++ land rather than try to make it work from Rust. (Fixing bugs involving FFI is truly the worst, and C++ is particularly hairy to wrap from other languages.)

One strategy is to put a minimal application-specific C layer around your C++ library, and call that from the Rust app. I’ve done that in the past and it kept the Rusties on the team happy enough. But you need to be fairly certain that the API can remain small enough and high-level enough even as needs grow, because nobody wants to be maintaining a C wrapper of hundreds of API calls a few years down the road.
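A sketch of what that C layer can look like (the library and all names here are hypothetical):

    /* shim.h -- the narrow C API that the Rust side actually sees. */
    #ifdef __cplusplus
    extern "C" {
    #endif

    typedef struct Engine Engine;    /* opaque handle; layout never exposed */

    Engine* engine_create(const char* config_path);
    int     engine_step(Engine* e);  /* 0 on success, nonzero on error */
    void    engine_destroy(Engine* e);

    #ifdef __cplusplus
    }
    #endif

On the C++ side, each function is a thin wrapper around the real library, with a try/catch in every function because exceptions must never unwind across the C boundary. On the Rust side you generate bindings from shim.h and wrap the raw calls in a safe type whose Drop impl calls engine_destroy.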


Assuming that you also make use of static analysis as part of the language, mostly developer culture.

In C++ circles, safety conscious developers are always fighting against C like code, or disabling bounds checking.

If you are in a team that values bounds checking enabled by default, and where C style coding is forbidden unless for FFI reasons, then it is another matter.

See Jason Turner's starter packs.

https://youtu.be/ucl0cw9X3e8


Not a video fan, and you have another comment mentioning static analysis that I replied to. I have a hunch you have something important to say about C++ static analysis, and I'm curious, as it could be a worthy topic for people to discuss. Tell us more if you can, even at a high level.

Answered on the other thread.

Basically it isn't perfect as a solution, but by enabling them as if they were part of the language, we get to clean up wrong defaults and force best practices, especially if they are integrated into a CI/CD pipeline, including breaking the build when they aren't followed.

It isn't a fullproof solution, but better than nothing, and plenty of domains aren't going to switch to anything else anyway.

It is kind of ironic that even C's authors saw the need for such tooling and created lint in 1979, but apparently the FOSS world hardly cared about this kind of tooling until clang came to be.


FWIW it's foolproof not "fullproof". The correct analysis is that this is proof against fools, in the same way that e.g. a rainproof electrical box is proof against rain, or a bulletproof armoured vehicle is proof against bullets.

This is an example of an eggcorn: humans acquire language by exposure, so you misunderstand a word or saying, analyse its apparent meaning based on what you thought you understood, and then apply that analysis. It's a completely normal part of human language use. In some cases, eggcorns become normalised enough that it's reasonable to say they're just a variant use, but in most cases they'd be regarded as errors, and so it's probably useful to know if you've picked up any eggcorns. https://en.wikipedia.org/wiki/Eggcorn


Fair enough, thanks for the correction.

Maintaining memory safety, thread safety, and avoiding UB is still much easier in Rust than C++. In C++ you have to be vigilant to avoid a mistake slipping in. In Rust, you get a compile error if you make a mistake.

All rust code is UB because there is no spec. I don't mean this as a knock against rust, but against UB fear.

That is not what UB means. Undefined Behaviour is behaviour that the compiler is allowed to assume will never happen, and which can consequently cause miscompilations due to optimisation passes gone wrong if it does in fact occur in the source code.

It's true that Rust does not have a written specification that clearly delineates what is and isn't UB in a single place. But:

1. UB is impossible in safe code (modulo bugs in unsafe code)

2. There are resources such as the Rustonomicon (https://doc.rust-lang.org/nomicon/) that provide a detailed guide on what is and isn't allowed in unsafe code.

In practice, it's much easier to avoid UB in Rust than it is in C++.


I am familiar with UB as a result of memory unsafety, but the way it is talked about, it sounds like the only way to ever cause UB is memory unsafety.

Based on that definition it feels like it should be possible to have UB outside of memory violations, is there really no UB in languages like Java/Haskell/Go?


You can have it for reasons other than memory safety; for example, signed integer overflow is UB in C and C++ (but not in Rust). However, higher level languages typically go to great lengths to avoid it. For example, in Java you will get a NullPointerException rather than a null pointer actually being dereferenced, which immediately rules out any UB due to a pointer being dereferenced where doing so is not allowed.

Wow, signed overflow is UB? I would have assumed it was defined, it just allows overflow.

And I am assuming something like the NullPointerException comes with a huge performance hit? Otherwise I assume every systems language would do something similar.


> Wow signed overflow is UB? I would have assumed it was defined, it just allows overflow.

Presumably it's not defined because the behaviour depends on the signed-integer representation.


I cannot think of a useful way to define signed overflow. I can make it do something, but at the end of the day no matter how you define it, if it happens in the real world your program has a bug.

Since we can be sure if it ever happens your code has a bug, making it undefined is a good thing: the compiler can then assume it doesn't happen and so back track to prove some other things can't happen and so make your program run a little faster.
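The classic demonstration, which mainstream compilers really do perform at -O2 (a minimal sketch):

    #include <limits>

    // Because signed overflow is UB, the compiler may assume x + 1 > x
    // always holds and fold this whole function to "return true".
    bool plus_one_is_bigger(int x) {
        return x + 1 > x;
    }

    int main() {
        // Mathematically false at INT_MAX, but GCC and Clang typically
        // optimize the call below to true: the "impossible" overflow
        // case has been reasoned away.
        return plus_one_is_bigger(std::numeric_limits<int>::max());
    }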


I'd much rather have a bug in my program than UB. At least the bug is easy to track down and fix, and is limited in scope to the line of code that contains the error.

You sacrifice speedy code for this case that probably won't even happen and so you probably won't have to debug anyway. Is it really worth it?

But Rust has kind of a spec: https://doc.rust-lang.org/reference/ Sure, it's not as well-specified as C++; so one could say it's "not a real spec".

But C++ also isn't perfect, there are plenty of programs for which no two compiler developers can agree on whether they have UB. The C++ spec language is just too ambiguous and underspecified in several areas.

If you want to be sure, you need an actual machine-checkable formal specification. Neither C++ nor Rust have that.

In the end, what really matters is the contract between the programmer and the compiler: are compilers allowed to break a program in weird ways because the programmer forgot about one of the arcane rules in the spec? For C++ and unsafe Rust, the answer is yes (we don't know how to build optimizing compilers for low-level languages otherwise). But for safe Rust, the answer is no. That's a big deal.



"modern c++" is an expression people have been using for 20 years.

The reality is that C++ can be used very well and very badly. What makes C++ code good or not is not defined by the fact it's more or less C++, but rather by who wrote it.


Much more than 20 years. Andrei Alexandrescu wrote "Modern C++ Design" at the turn of the century and it was published in February 2001.

how is that "much more" than 20 years?

> As in, if someone is already familiar with C++, and is willing and able to use c++20, and maybe beyond, features, are there still some compelling reasons to switch to Rust?

imho Rust is modern C++ done right (i.e. without all the baggage from the C++ past due to backwards compatibility) -> less mental overhead, still safer, more convenient features, less debugging, proper tooling...


I think Rust is "C done right". It feels more like C than C++.

To do C++ right, you'd need to do "marriage of OOP and low level programming" right, which is probably impossible because it's the most cursed combination of ideas ever conceived.

WAIT, I take it back -- D improves on C++, so it's "C++ done right". :D


> To do C++ right, you'd need to do "marriage of OOP and low level programming" right,

not really, C++ is a multiparadigm language, so is Rust


The main compelling reason is that rust is much simpler and easier to use and understand. The “rust is difficult” meme came from people who don’t primarily use c++.

I used to use primarily C++ (haven't for a few years, at least professionally). Non-trivial Rust is not much simpler and easier to use or understand. It has different kinds of complexity that can easily make it as inscrutable as non-trivial C++ code.

One feature of Rust that I don't see talked about as much is the ability for research to make its way into language improvements. Lots of PhD research in non-GC languages goes like, "assume C or C++ but written in a really specific way", where the research is interesting but there are too many corner cases to deal with when applied to a real project.

Rust makes so many guarantees that I expect some of that Ph.D research will start to turn into concrete language improvements in ways it never could before. It'll be very exciting to watch the next 5-10 years of Rust as the best overlooked research ideas start to have a viable path to production projects.


I think it's better to just browse the 'CppCoreGuidelines' for the aspects that you are working on. C++ is multi-paradigm and has an extremely broad user base; no one size fits all, so select what you need.

https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines...


Not only browsing: enable them in the static analysis tooling of your preference.

Which C++ static analysis tools are considered the best currently?

I need to work on a large C++ code base soon and haven't in 2-3 years, and tbh I have neglected leveraging such tools properly in the past, so I'm hoping for some good pointers on a good setup on Linux.


Stuff like clang-tidy, or the built-in analysers in CLion and Visual Studio (the latter works better alongside SAL), or commercial tooling like these ones,

https://www.incredibuild.com/blog/top-9-c-static-code-analys...


This is a lightning talk of the category "someone on the internet has an opinion".

The level is "count your new and delete and see that you have the same amount".


Every C++ dev has their own C++ that is idiomatic. Good luck matching that up with reality.

How do you define reality then?

How do we know that your reality, my reality, and his/her reality aren't three different things?

Exactly. If "every C++ dev has their own C++ that is idiomatic", which is the reality you'd want to match it up with?

You can skip straight to the video: https://www.youtube.com/watch?v=xhTINjoihrk

It's 5 minutes of very basic and generic statements, like "don't forget to free your memory", "follow rule of five" and "test your code".


C++ isn't a "needs to follow" kind of language. It's more "Here are some guidelines. If they don't fit what you need to do, feel free to not follow them, but you should probably know why."

I would recommend this video from Herb Sutter instead: https://www.youtube.com/watch?v=ELeZAKCN4tY&list=PLHTh1Inhhw...

There are no "Best Practices", this is a toxic myth.

Like engineering, methodology and design are always tradeoffs, highly dependent on context.

Business constraints, company culture, legacy code, team strength and even individual preferences, all of that should be taken into account, there are different appropriate styles for different cases.

The only "best practice" is a wide knowledge of what can be done, and the ability to pick and mix to optimize given the context.


I don’t think having many options with trade-offs and “best practices” are mutually exclusive.

Best practices, at a minimum, can be guidelines for how to build, and at their best they describe exactly what you should do based on the trade-offs and your current situation.


There are best practices in a given context that can be worst practices in another one.

The coding style that is used in safety critical embedded software or high frequency trading would not be appropriate for gamedev or web, and vice versa.


In what circumstances would not using a version control system, for example, be the right choice?

If you are an 8-year-old kid learning to program, maybe.

This has been a revelation to me. I have always wanted to test in prod but best practices stopped me. Now is my time.

Some things are indisputable - eg avoid `new` and `delete`.

New developers often take them too far, but I wouldn't call it a toxic myth. Everyone eventually figures out they're not sacred, just very useful. Not everything is a major tradeoff; sometimes there are just multiple ways to do the same thing, and one is clearly better than the others for most purposes. Unless you know there's a good reason to use one of the other ways, you can save tons of time not having to think about it, especially when there are many ways to choose from.

Sometimes the "good reason" is that a certain way won't work with the giant code base that already exists and the current way isn't actively causing any bug or major security risks. It can be hard to convince team members that want everything to be the "best" way of that, but that's a problem with people, not a problem with the concept of "Best Practices".

Every major life-or-death software system has "Best Practices". Hopefully, everyone that works on those systems understands it's a fluid concept that will change as new things are learned, but it's important to understand and follow them unless there's a valid, peer-reviewed reason not to.


I agree that there are no unified "best practices". But, I'm sure most understand that what we mean are "some good practices, and definitely try to avoid some problematic ones".

Posts like these are examples of what I find a bit disappointing about Hacker News. It looks interesting on the surface, but it ends up being generic and completely lacking in substance. I wonder why these end up on the front page.
