
C++ before C++11 didn't. I've fixed errors in projects which were due to string literals not being std::string. After C++11 things are more murky due to user-defined literals[1]. I'd lean towards saying the language still doesn't, but yeah, murky.

Zig and Rust I don't know enough about.

And I'm not pretending. C has string literals which are of a non-distinct type. You can't distinguish between a string literal and an array of characters. This is the crucial bit.

The result is that the standard library, and lots of other code, relies on convention alone to pass strings around. This has been and continues to be the source of countless serious bugs. The kind of bugs which are a total non-issue in languages which have strings.

[1]: https://en.cppreference.com/w/cpp/language/user_literal
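To make the "non-distinct type" point concrete, here is a minimal sketch in C++ (in C the literal's type is `char[N]` rather than `const char[N]`, but the idea is the same). The helper name `same_type_as_literal` and the array `not_a_string` are made up for illustration. The type of a literal is just "array of characters", so nothing in the type says whether a terminator is actually there.

```cpp
#include <type_traits>

// A string literal "abc" is an lvalue of type const char[4] (3 chars + '\0').
static_assert(std::is_same<decltype("abc"), const char (&)[4]>::value,
              "a literal is just an array of const char");

// An ordinary character array with no terminator at all.
const char not_a_string[4] = {'a', 'b', 'c', 'd'};

// Same element type and extent as the literal, so the type system
// cannot tell the terminated one from the unterminated one.
static_assert(std::is_same<decltype(not_a_string), const char[4]>::value,
              "same type as the literal's array type");

// Hypothetical helper: true when T is exactly the array type of "abc".
template <typename T>
constexpr bool same_type_as_literal() {
    return std::is_same<T, const char[4]>::value;
}
```

Passing `not_a_string` to anything that expects a terminated string compiles fine, which is exactly the "convention alone" problem.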




C++ has string support in the standard library.

It doesn't have the same breadth of features as, say, Python's string class, but it's ok.

See, e.g., https://en.cppreference.com/w/cpp/string


This just further confirms, for me, that there is no feature C++ is not willing to include. Seems like C++ is getting closer and closer to the day when any randomly typed string of characters has a decent chance of being syntactically valid C++.

To the contrary, modern C++ has solved very few, if any, of the problems described in the article.

Being generous:

* There's now `std::string_view` to address some of the problems with `std::string`, but the rest are still there. There are some attempts to specify the encoding now, at least.

* Lambdas and `std::function` pretty much solve the function pointer complaints, with some added complexity.

* Containers still do silly things when you use `c[..]` syntax with no element there. (Both when trying to insert and when trying to retrieve!)

* The general level of language size and complexity, especially around templates, has only gotten worse. Concepts will finally help in some ways here.
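On the `c[..]` bullet, a small sketch of what that looks like with `std::map` (the function names here are made up for illustration). `operator[]` inserts a default value on a missing key, while `at()` throws instead. For `std::vector`, out-of-range `operator[]` is undefined behavior, which is why it isn't exercised here.

```cpp
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>

// std::map::operator[] default-constructs and inserts a value when the key is missing.
inline std::size_t size_after_lookup_with_brackets() {
    std::map<std::string, int> counts;
    int value = counts["missing"];  // silently inserts {"missing", 0}
    (void)value;
    return counts.size();           // the "read" grew the map
}

// at() does not insert; it throws std::out_of_range for a missing key.
inline bool at_throws_for_missing_key() {
    std::map<std::string, int> counts;
    try {
        (void)counts.at("missing");
    } catch (const std::out_of_range&) {
        return counts.empty();      // still empty: the lookup had no side effect
    }
    return false;
}
```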


Story time: Chrome is a Frankenstein of C and C++ code (both directly and through called libraries, static or dynamic). Now C++ has `std::string` obviously. C of course has `const char *`. To facilitate interoperability C++ has various methods to implicitly or explicitly convert between the two.

At one point it was found that every keypress in the Omnibox resulted in 25,000 string copies.

So my point is that you can write C++ as carefully as you like, but on any sufficiently complex code base you're going to need a pointer to something, and then you've really lost all control and safety. So the safety in C++ is a bit of an illusion.
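This is not Chrome's code, just a minimal sketch of how such copies sneak in (function names are hypothetical). A `const std::string&` parameter looks free, but handing it a `const char*` constructs a temporary `std::string`, so the callee never sees the caller's buffer. A C++17 `std::string_view` parameter does.

```cpp
#include <string>
#include <string_view>

// Taking const std::string& looks cheap, but passing a const char* builds a
// temporary std::string, copying the characters (and possibly allocating).
inline bool sees_callers_buffer(const std::string& s, const char* original) {
    return s.data() == original;
}

// std::string_view (C++17) only wraps pointer + length; no copy is made.
inline bool view_sees_callers_buffer(std::string_view s, const char* original) {
    return s.data() == original;
}
```

Called with the same `const char*` for both arguments, the first returns false (the callee got a copy) and the second returns true. Multiply the first case across a few layers of C/C++ boundary and the copy count adds up quickly.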


Strings are a known problem in C++. Wish the standard committee could end this string nonsense once and for all. Until then I'll just stick to std::string and char *.

This implies that C++'s std::string is mutable. I think I'll continue to run the other way, to avoid C++ (and C as well) whenever I can. Mutable strings are insane.

C++ has pretty lousy string handling, too. std::string is a step up from char*, but why can't I do std::string(10)? Or str = nErrors + std::string(" errors"); ? Why do I have to go through the hassle of a std::stringstream just because I want to have a number in my string? Parsing stinks, where's the std::string::split() function?

Boost has a bunch of string algorithms, but that's design overkill; I don't really need to have the algorithms work on C strings because I wouldn't be using C strings if there were a simple, useful string class like Java's, or Qt's.
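For what it's worth, C++11 added `std::to_string`, which covers the number-in-a-string case without a `std::stringstream`. There is still no `std::string::split()`, so the sketch below hand-rolls one; `error_summary` and `split` are hypothetical helpers, not standard library functions.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// std::to_string (C++11) handles the "number in my string" case.
inline std::string error_summary(int n_errors) {
    return std::to_string(n_errors) + " errors";
}

// Hypothetical helper: split on a single delimiter character, keeping empty fields.
inline std::vector<std::string> split(const std::string& text, char delim) {
    std::vector<std::string> parts;
    std::size_t start = 0;
    while (true) {
        std::size_t end = text.find(delim, start);
        if (end == std::string::npos) {
            parts.push_back(text.substr(start));  // last (or only) field
            break;
        }
        parts.push_back(text.substr(start, end - start));
        start = end + 1;                          // skip past the delimiter
    }
    return parts;
}
```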


> Do you know many languages where literal strings come with a big warning sign saying "probably not what you want, use this (rather opaque) alternative syntax instead"?

Haskell? Python 2?

(Also, C/C++ string literals are perfectly safe to use: they're guaranteed-null-terminated immutable arrays that implicitly convert to `string` and `string_view`. It's more that it's safer in modern C++ for function parameters to be `string_view`/`string`/`const char(&)[N]` instead of `const char*`.)
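A quick sketch of why those parameter types are safer than `const char*`, assuming C++17 (the function names are made up). An array-reference parameter keeps the extent `N`, and a `string_view` carries its length along, so neither has to trust a bare pointer.

```cpp
#include <cstddef>
#include <string_view>

// Binding to const char(&)[N] keeps the array extent, so the length is known
// at compile time and no scan for the terminator is needed.
template <std::size_t N>
constexpr std::size_t literal_length(const char (&)[N]) {
    return N - 1;  // exclude the terminating '\0'
}

// A string_view parameter accepts literals, std::string, and const char* alike.
constexpr std::size_t view_length(std::string_view s) {
    return s.size();
}

static_assert(literal_length("hello") == 5, "extent minus terminator");
static_assert(view_length("hello") == 5, "view measures up to the terminator");
```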


Unfortunately, in C++ even the notion of the string type itself is not part of the language...

Before C++11, std::string with non-ASCII characters was terrible. Old programming languages/standards tend to disregard i18n.

Sadly, the C++ standard bodies are actually breaking old code in new versions of the language.

For example, C++11 introduced UTF-8 string literals. Great feature which does what you’d expect: declare a string literal in source code, get a const pointer to a null-terminated array of UTF-8 bytes.

Then a decade later, in C++20, they changed these UTF-8 literals to evaluate to const pointers to a new, incompatible data type, char8_t.

It’s so bad that compiler developers had to implement switches to disable the new BS. Unfortunately, these switches are incompatible across compilers, -fno-char8_t in gcc, /Zc:char8_t- in msvc.
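A minimal sketch of the breakage, using the standard `__cpp_char8_t` feature-test macro so it builds either way (`utf8_bytes` is a hypothetical name). The `#else` branch is the pre-C++20 code that stops compiling once `char8_t` is enabled.

```cpp
#include <string>

#if defined(__cpp_char8_t)
// C++20: u8"" literals are arrays of const char8_t, so getting back to the
// char-based APIs everyone already has requires a cast.
inline std::string utf8_bytes() {
    const char8_t* s = u8"\u00e9";
    return std::string(reinterpret_cast<const char*>(s));
}
#else
// C++11 through C++17: u8"" literals are plain arrays of const char.
inline std::string utf8_bytes() {
    const char* s = u8"\u00e9";  // this line is an error under C++20
    return std::string(s);
}
#endif
```

Either way the result is the two bytes 0xC3 0xA9 for U+00E9; only the type changed.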


What exactly is a problem with C++’s strings, or especially Rust’s? Everything you mentioned can be controlled as explicitly as you want.

The only problem is C is simply not expressive enough to have proper abstractions like that.


The problem with std::string is, it was useless in the real world because its interface is basically unsuitable for handling Unicode. That's why every project that cares about i18n had to define its own string class to get anything done.

Only with C++11 did they add std::u16string, which is at least suitable for the (horrible) UTF-16 encoding. Maybe we'll get a UTF-8-capable std::string class in another 20 years.
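To illustrate the interface mismatch: `std::string::size()` counts bytes, not characters, so even "how long is this?" needs a helper once the contents are UTF-8. A rough sketch, assuming valid UTF-8 input (`count_code_points` is a hypothetical helper, and code points are still not the same as user-perceived characters):

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: count UTF-8 code points by skipping continuation bytes
// (those of the form 10xxxxxx). Assumes the input is valid UTF-8.
inline std::size_t count_code_points(const std::string& utf8) {
    std::size_t count = 0;
    for (unsigned char byte : utf8) {
        if ((byte & 0xC0) != 0x80) {
            ++count;  // lead byte or ASCII byte starts a new code point
        }
    }
    return count;
}
```

For a string holding "naïve" encoded as UTF-8, `size()` reports 6 while this helper reports 5.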


How is the support for strings in C++? Presumably better than in C, but is it good enough when compared to other compiled languages - Go, Rust etc?

There are user-defined string literals (1) in C++11. They can do pretty much whatever you want.

Also, using templates it is trivial to distinguish between arrays of characters and character pointers.

Finally, there is a proposal for a string_view (2), which could be used to represent string literals, no copies needed.

(1) http://en.cppreference.com/w/cpp/language/user_literal
(2) http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n360...
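A small sketch of the first two points, with made-up names (`Literal`, `_lit`, `is_char_array`). The user-defined literal gives literals their own type, and the template tells arrays from pointers by taking its argument by reference so the array doesn't decay.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical distinct type meaning "this came from a literal".
struct Literal {
    const char* data;
    std::size_t size;
};

// User-defined literal (C++11): "abc"_lit yields a Literal, not a bare array.
constexpr Literal operator""_lit(const char* s, std::size_t n) {
    return Literal{s, n};
}

// Taking the argument by const reference prevents array-to-pointer decay,
// so T is deduced as char[N] for arrays and const char* for pointers.
template <typename T>
constexpr bool is_char_array(const T&) {
    return std::is_array<T>::value;
}
```

Note that the array check only says "this is an array", not "this is a literal"; a local `char buf[16]` passes it too.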


"Lessons learned, their strings work"

Except that they don't. Either they are 'complete' but massive and thus slow, or they start as 'array of byte' and then their designers spend 10 years implementing a more 'complete' string type that is still fast enough and end up as #1 anyway.

Of course the C++ way where there is no string type that everyone uses sucks too, it's just that strings are almost impossible to get 'right' because there is no real 'right' and so many special cases that aren't apparent at first sight.


Even an article as shallow as this raises some red flags about the code base:

* A common base class is rarely a good pattern in C++. Such classes have a tendency to bloat.

* A custom string implementation is a cliché for new C++ adopters. Everybody thinks they need to reimplement std::string. Most are wrong.

* The code as shown separates declaration and initialization of local variables, which decreases readability and safety of the code.


Because it sucks to deal with strings in C & C++, and the web is mostly strings.

Kind of weird how he wrote his own string class, as well as rewriting some other perfectly good standard library stuff.

Not that weird: Qt and others do it as well. Not sure why though, maybe because at the time they started there was a lack of a decent std::string implementation on all platforms?
