
> I'm not saying it's impossible to write a bad printf line and never test it, only to have it fail years later in production. It's absolutely possible and it has happened to me. Lessons learned.

A modern compile-time type-checked formatter would have prevented this mistake. You are deliberately choosing to use poor tools and calling this "pragmatism" because it sounds better than admitting you're bad at this and you don't even want to improve.

In fact C++ even shipped a pair of functions here. There's a compile-time type-checked formatter, std::format, which almost everybody should use almost always (std::println does the same compile-time checking), and there's also a runtime type-checked formatter, std::vformat, for those few cases where you absolutely can't know the format string until the last moment. That is a hell of a thing: if you need one of those, I have to admit nobody else has one today with equal ergonomics.




> Maybe fmt fixes these problems, I don't know.

Yeah, it looks like you did a lot of guesswork in that comment, and a lot of those guesses were inaccurate. Not really trying to be hostile here, but you did acknowledge that you were unfamiliar with std::format.

The part that fmtlib / std::format has, which is printf-like, is the idea of having a format string and arguments, rather than having a bunch of separate, piecemeal strings.

  // Old printf code, works ok for most people
  std::printf("failed to clone %s from %s", target, src);
  // <iostream>
  std::cout << "failed to clone " << target << " from " << src;
  // New std::format / fmtlib
  std::print("failed to clone {} from {}", target, src);

You can see that you don't need to remember what kind of format specifier you need. This is C++, and that kind of problem is solved with overloading.

The std::print interface works equally well with a FILE* or a std::ostream, or whatever you want. This is C++, so you can just use a templated output iterator (via std::format_to), or one of the overloads that creates one automatically.

There are a lot of problems with <iostream>. I think it’s telling that lots of languages have copied printf, but nobody (or almost nobody) thought <iostream> was good enough to copy. There are just too many serious design flaws with <iostream>. It would be one thing if <iostream> were just annoying to use, but it poses problems for localization, thread-safety, accidental misuse through its statefulness, and its operator overloading syntax is bad.


> a feature C manages to have, but here's Rust's

This is clearly disingenuous. C's string formatting is a completely different thing from that Rust snippet or your snide remark about std::format.

I'd rather use printf than cout, I think iostream is clunky, but printf is definitely a footgun & a fairly common source of bugs itself. So much so that compilers had to add support for specifically recognizing printf() calls & validating the inputs.

Also don't forget C still doesn't have standard placeholders for sized types. Gotta use that derpy PRId64 & friends which makes printf() almost as ugly looking as iostreams.
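For anyone who hasn't had the pleasure, this is roughly what the PRId64 dance looks like. A minimal sketch; format_i64 is a made-up helper name, and it uses snprintf into a fixed buffer only so the result can be returned as a string.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// PRId64 expands to the correct length modifier plus "d" for int64_t on the
// current platform, and is pasted into the format string by literal
// concatenation.
std::string format_i64(std::int64_t value) {
    char buf[32];  // "value=" plus at most 20 characters fits comfortably
    int n = std::snprintf(buf, sizeof buf, "value=%" PRId64, value);
    assert(n > 0 && static_cast<std::size_t>(n) < sizeof buf);
    return std::string(buf, static_cast<std::size_t>(n));
}
```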


> C Printf never was, never could be and never will be a suitable way to output data from C++. Now excuse me while I go through the list of thousands of predefined format macros to find out which I need to use to output a uint_fast16_t without making the compiler vomit nonsense.

   printf("%d\n", (int) myfast16_t);

Not that terrible for a type that I've never used, nor seen used.

> The second point (how to print an integer) is about something way less easily discoverable than in the previous example. You need to learn about how to print, then about `fmt` and from there the only authoritative place that contains information about format specifiers is the doc comment of `std.fmt.format`.

I don't know if ZSF can guide this or if we can incept it into Andy's head, but if there were any coordinated plan to "solidify certain parts of the stdlib sooner than other parts" the std.fmt.format would be high on my list. And document it. Hell, document it, even if it changes around all the time. I'm often forgetting how that beautiful bastard works and having to jump to the comment in the stdlib is kind of annoying.


> However, in C++98, how else would you handle when you have an arbitrary number of things to print in a type-safe way? e.g.

If I were the one designing the language, I'd have simply fixed printf to be type-safe. (e.g. via adding macro support a la Lisp)


> Are we seriously talking about nested pointers, alignment and CPU registers before we've even introduced printf?

No, printf is introduced in the very first hello world example. But the concept of format strings is not explained, nor are variadic functions, and no documentation for printf is linked.

Also, the example

    #define MY_TITLE "Hello world"
 
    printf("This is: %n", MY_TITLE);

is really quite broken. %n needs an int* argument and writes the number of characters written so far to it. A string constant is not only of the wrong type, even more importantly it is not writable (though not const).

Of course a modern compiler with -Wall would catch this, but the tutorial never mentions any flags you might want to pass to your compiler...
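Presumably the tutorial meant %s. A sketch of the corrected line, written as C++ and wrapped in a made-up helper (title_line) that formats into a buffer so the result is easy to inspect; building with -Wall should be warning-free for this version.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

#define MY_TITLE "Hello world"

// %s takes a pointer to a NUL-terminated string, which is what a string
// literal decays to. (%n, by contrast, expects an int* to write into.)
std::string title_line() {
    char buf[64];
    int n = std::snprintf(buf, sizeof buf, "This is: %s", MY_TITLE);
    assert(n > 0 && static_cast<std::size_t>(n) < sizeof buf);
    return std::string(buf, static_cast<std::size_t>(n));
}
```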


> A test about pointers is not stupid. A test about printf formatters is.

Obviously you did not have the joy of working with people who write their own byzantine 1k LoC formatting code for cases that would have been perfectly covered by printf (and yes, this was on a system where a printf implementation was available).


> But what use case do you have for printf that can't be implemented with println!/format! and friends?

I'm guessing they mean that the format! machinery is rather complex and brings in a lot of code. That it's also somewhat slow is a recurring concern.

Have none of the embedded folks created a specialised version of format! which is not generic, and basically only supports the types and styles which C's printf supports?

edit: there's ufmt though it seems somewhat inactive and I've no idea how good it is: https://github.com/japaric/ufmt

> Optimized for binary size and speed (rather than for compilation time)

> No dynamic dispatch in generated code


> you have to run a single time to know it works

A common use case for cstdio/iostreams/std::format is logging. It's not at all uncommon to have many, many log statements that are rarely executed because they're mainly for debugging and therefore disabled most of the time. There you go, lots of rarely used and even more rarely 'tested' formatting constructs.

I don't want things to start blowing up just because I enabled some logging, so I'm going to do what I can to let the compiler find as many problems as possible at compile time.


"Consider the hoops one needs to jump through to make std::print work with a custom type when compared to the old stream operators."

As an old timer, this kind of made me laugh. I remember when people found the old stream operators burdensome.


> I find that the dumbest code is the best code

Not always, though.

See every bug and exploit with C arrays or pointers that exists because C devs think even minimal attempts at safety are too complicated or slow, or old-style PHP code that builds SQL queries out of printf strings directly from POST values, or probably countless other examples in other languages. C++ code that uses raw pointers instead of references or that uses std::vector but never actually bothers to bounds-check anything.

It's entirely possible for code to be too dumb for its own good.


> I prefer using C-style casts for brevity,

There are strong reasons to avoid them altogether in C++, not the least is that you can sometimes get a reinterpret_cast where you expected a static_cast semantically - usually because you have forgotten to include the header that has the definition of the type, and are dealing with an incomplete type (explicit static_cast will fail in that case; but C-style cast will be treated as a reinterpret_cast, even if it would have been a static_cast if the definition was visible and type was complete).
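A small sketch of that failure mode, using made-up types. The forward declarations stand in for the forgotten header. Strictly speaking the standard leaves it unspecified which interpretation a C-style cast gets when a class type is incomplete; the comments describe what mainstream compilers do in practice.

```cpp
#include <cassert>

struct Base1;
struct Base2;
struct Derived;  // incomplete here, as if the defining header were missing

// With Derived incomplete, the compiler cannot see the inheritance, so the
// C-style cast behaves like reinterpret_cast and keeps the address as-is.
// Writing static_cast<Base2*>(d) at this point would not compile.
Base2* c_style(Derived* d) { return (Base2*)d; }

struct Base1 { int a = 1; };
struct Base2 { int b = 2; };
struct Derived : Base1, Base2 {};

// With the definition visible, static_cast adjusts the pointer so that it
// refers to the Base2 subobject inside Derived.
Base2* checked(Derived* d) { return static_cast<Base2*>(d); }
```

Given a Derived object d, checked(&d) points at the Base2 part, while c_style(&d) typically still points at the start of the object, so reading ->b through it would be reading the wrong member.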

> Who does that?

The typical example is when people pass std::strings to printf, either because they expect them to work with %s, or because they simply forgot to call data(). Unfortunately, on compilers that don't try to detect structs-as-varargs, this often works by accident.
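The fix is a one-liner, shown here as a minimal sketch with a made-up greet helper. It uses c_str(), which is guaranteed to return a NUL-terminated const char*.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

std::string greet(const std::string& name) {
    char buf[64];
    // Passing `name` itself to %s is undefined behavior: printf-style
    // functions expect a const char*, which c_str() provides.
    int n = std::snprintf(buf, sizeof buf, "hello %s", name.c_str());
    if (n < 0) return std::string();  // encoding error
    return std::string(buf);          // snprintf always NUL-terminates buf
}
```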


> In the 1990s there was only one string type - NUL-terminated characters.

Indeed. If I could wave a magic wand, I would add a fat pointer to the language that we use to represent strings and arrays, and that we can automatically convert to/from const char* at compile time. This would replace string_view and span. All of the functionality of std::string would be available to it, and a call to strlen, or strcpy on it would work via the length, rather than the null terminator.

But alas, no magic wand.


> The few times I run into memory corruption, a memory leak, a segfault, dereferencing a shit pointer, undefined behavior, and so on, it's always, always because I'm doing something I shouldn't be doing. Like working with raw pointers or pointers to pointers to pointers, or traversing an array of bytes to do something there's already a library that does, or manually calling delete on something, or using reinterpret_cast<>, or using one of the many footguns C++ happily gives me.

You've never used an out-of-bounds index? Accidentally used an object that was on the stack beyond the function call? Let an integer overflow? The problem with C++ is that all of these things don't look like unsafe operations, and people end up making mistakes with them that are not obvious.


> Will you ever add / have you considered adding sane formatting options for fixed length variables in printf? Say %u32 or %s64 ?

I'm not certain about the historical answer to this, but I do know that we're currently considering a proposal to introduce an exact bit-width integer type '_ExtInt(N)' to the language, and how to handle format specifiers for it is part of those discussions, so we are considering some changes in this area.

> Have you considered adding access to structure members by index or by string name? Have you considered dynamic structures?

I don't recall seeing any such proposals. I'm not familiar with the term "dynamic structures", what do you have in mind there?


> why exactly you should be aware about the differences?

Because they aren't assured to be the same. They might happen to be the same on your current platform.

C++ isn't like Java. It's the programmer's job to be very aware of these sorts of platform-specific points of variation.

I don't see why you'd deliberately write undefined behaviour when there's no benefit to doing so. This isn't a purely academic concern - I've seen a malformed printf call wreak absolute havoc, presumably because it was stripping too much or too little data off the stack. Was a nightmare to track it down.

I don't want to think about which particular families of dangerous type-based mistakes are permissible in my codebase -- I'd rather write code which is correct.

> The only cases when I encounter these differences, they look like broken C++ compiler

If you write dangerous, non-portable code that invokes UB, you may well find that it works on plenty of real-world platforms. Your code is still wrong. For the specific case of assuming that `int` is 32-bit, you'll probably get away with this sort of clumsiness, yes, but 'ILP64' targets exist, even if they're rare.

Additionally, using `int` as a dangerous alias for `int32_t` is not expressive when it comes to readability.

> Technically UB, practically works fine in 100% of cases

UB can bite you in bizarre, unpredictable, intermittent ways, especially with optimising compilers.

Why argue that a bad habit isn't that bad, when you could just avoid doing it?

> If you ask “but how do you know it works?” I’ll answer “because I read assembly”.

That's a very fragile assurance of correctness. Unless you explicitly embed that assembly code, the compiler is free to generate different code at any point in future. Can you guarantee that you'll never change or upgrade your compiler, tweak its flags, or target another platform? Can you guarantee the optimiser will never change its mind and generate different code for that function?


> random habit of mine to put std:: in front of these C functions

And did you learn your lesson about making random changes that "shouldn't matter" without proving they don't matter? :)

I find that once I spend the time to make these changes correctly, they are not worth the time to make correctly.


> It's also ugly to up-cast everything to the largest potential size whenever you use format strings.

I'm not sure that is true. It accurately captures the reality that you don't know the size of the type, but that you have determined what the maximum size can be and hopefully made considerations for it.

I should think there isn't even necessarily a performance cost, as it wouldn't be hard to trick out a compiler to recognize what was going on and optimize accordingly.
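For reference, the up-cast approach looks roughly like this for time_t. A sketch that assumes time_t is an integer type, which is true on common platforms but not strictly required; format_time is a made-up helper name.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <ctime>
#include <string>

std::string format_time(std::time_t t) {
    char buf[32];
    // time_t may be 32 or 64 bits wide, so widen to intmax_t and use the
    // matching %jd conversion instead of guessing a length modifier.
    std::snprintf(buf, sizeof buf, "%jd", static_cast<std::intmax_t>(t));
    return std::string(buf);
}
```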

> What type would you use if you wanted to print uint128_t? %llld ?

IIRC, there is no standard portable format string length modifier for 128 bits (I think some platforms used %q for it, but that's definitely not portable), so literally nothing. Format strings suck.

> Finally, I think rejecting a standard C header file because it is "ugly" and coming up with your own solution is unnecessarily fragmenting things, especially when it isn't clearly better (IMO it is clearly worse).

Note that as the presentation points out, the better thing to do is whatever is going to be easily adopted. In this case, where people are already using format strings, and already working with a time_t that might be only 32 bits wide, this might actually be that solution.


> I honestly don't know how C++ programmers typically handle this situation.

The verbose one. A few extra lines rarely matters. I think the number of times this has come up in my code base is very very small, maybe a few dozen extra lines across hundreds of thousands.
