Hacker Read

einpoklum · 2021-03-04 10:54:31+00:00

1. People often use set instead of unordered_set (and same for map) despite not needing order. This slows things down.

2. The C++ standard library's maps and sets are known to be rather slow. See, for example:

https://stackoverflow.com/q/42588264/1593077

when you have string values, it's even worse, as you describe. But it's not clear that an overly-clever implementation, which caches numeric ranks of strings etc., is a good idea to have.

reply

MauranKilom | karma 4838 | avg karma 2.94 · | 2021-03-15 13:23:50+00:00

Regarding speed of the C++ version: The fundamental mistake is to use std::unordered_map. The standard library map and set implementations (both ordered -> tree-based and unordered -> hash-based) are almost never the correct choice if you care about performance.

That's definitely a flaw in the language, but like with C you would just use a different implementation for this purpose.

reply

einpoklum | karma 3425 | avg karma 0.93 · | 2023-03-02 02:16:05

Both std::map and std::unordered_map are slow, ,actually. See this discussion on StackOverflow:

https://stackoverflow.com/q/42588264/1593077

reply

twic | karma 21889 | avg karma 3.14 · | 2018-06-16 18:46:46

std::unordered_map is a bit of a red-headed stepchild in the C++ world. I'm not surprised that it hasn't had nearly the amount of tuning that hashmaps have had in other languages.

When i asked some C++ers about it, they warned me off using it, for reasons i didn't fully understand, but i got the impression that there are structural reasons why it can never be really fast, so anyone who needs a really fast hashmap uses some non-standard one anyway.

reply

panic | karma 27935 | avg karma 9.26 · | 2023-03-02 01:45:30

std::unordered_map is forced to be slow and make lots of allocations due to iterator invalidation requirements. It's usually a good choice to avoid it (and use a different hash table implementation) if you care about performance.

dragontamer | karma 30240 | avg karma 2.84 · | 2021-11-23 11:15:35

Looking at the code more carefully, I'd actually bet its string that's slow, and not actually map.

unordered_map is actually known to have relatively bad performance on GCC and some other open-source implementations. I don't know if there's a spec-problem that favors a slower-hash table (rather than the modern open-addressing scheme, which is faster on modern computers), but I can believe unordered_map has an issue on common standard libraries today.

------------

> ret[string("id.")+to_string(i%500000)] = string("val.")+to_string(i);

Lets count the mallocs.

> string("id.")

One

> to_string(i%500000)

Two

> string("id.")+to_string(i%500000)

Three, then two frees() as the previous two strings are deallocated.

> ret[string("id.")+to_string(i%500000)]

Four. The ret[] operator searches for the string. Since it doesn't exist, it now has to malloc the key and insert the value in there. std::move() might solve that issue and cut this one out easily?

> string("val.")

Five

> to_string(i)

Six

> string("val.")+to_string(i)

Seven + 2x frees.

> ret[string("id.")+to_string(i%500000)] = string("val.")+to_string(i);

Eight, to place the value in there. Again, probably solved with std::move. +1 more free.

So eight mallocs and 6 frees?

---------

Yeah, C++ strings shoot yourself in the foot performance-wise. But they're easy to use and free of stupid malloc/free bugs, but it means that they will malloc/free a LOT more than you might guess with the untrained eye.

I know that Perl / PHP / etc. etc. were explicitly designed to be more favorable towards string manipulation. And I'd bet that there's some old C++98 stuff that might lock in the performance of string to this older, less performant model.

EDIT: I see you managed to avoid the problem of excessive mallocs/frees in the Rust code with "format". The C++ equivalent might be snprintf, though that dips down into C-style strings.

reply

panic | karma 27935 | avg karma 9.26 · | 2017-06-26 11:38:35+00:00

For instance, the unordered_hash_map in C++ is based around a linked list of key/value pair buckets. This means iterating through the keys or values is very slow (lots of cache misses!), but insertion is fast.

Has this changed recently? Last time I used unordered_map, insertion was also slow because it had to allocate a linked list entry for every item in the hash table.

reply

einpoklum | karma 3425 | avg karma 0.93 · | 2021-08-19 05:57:01

Well, std::unordered_map is faster than than std::map, but it is still slow, since it uses lists for its buckets, with lots of dynamic allocation. See also:

https://stackoverflow.com/a/42588384/1593077

and the link there. Not sure that's what GP meant though.

reply

TonyTrapp | karma 3275 | avg karma 5.06 · | 2023-03-02 06:12:11

I rarely need to use maps, but everytime I had to, std::unordered_map was measurably slower than std::map. In particular in debug mode, it was really slow, and if you choose an unfortunate hashing function (I think older MSVC versions had a really bad one), it can easily become a bottleneck. In particular with small data or data that has trivial comparison functions, std::map might often be better and should be the go-to choice.

sharpneli | karma 2667 | avg karma 4.39 · | 2014-12-23 16:54:29+00:00

C++ unordered map does not maintain the order of the elements, which PHP does. So it is not a fair comparison.

creata | karma 1853 | avg karma 2.87 · | 2023-06-29 11:27:00

Can someone explain why std::unordered_map has to be slow?

I vaguely remember that it's because it isn't allowed to invalidate references when inserting elements.

reply

zelly | karma 2630 | avg karma 2.16 · | 2020-08-13 07:38:29+00:00

To be fair std::unordered_map is one of the slowest hash tables in any programming language.

codr7 | karma 2545 | avg karma 2.02 · | 2023-11-30 10:13:38

It's ordered and predictable, which is far from rubbish.

In most cases std::unordered_map will be faster, but hashtables have nasty edge cases and are usually more expensive to create.

I can pretty much guarantee it's been optimized to hell and back.

reply

adev_ | karma 1564 | avg karma 3.51 · | 2019-03-14 05:04:00+00:00

Not even considering that unordered_map is not the fastest hash map implementation for C++. There is plenty of the flat hash map implementation (swiss tables) around that beats unordered_map by an other factor....

https://abseil.io/blog/20180927-swisstables

reply

likpok | karma 1450 | avg karma 2.52 · | 2018-09-07 18:58:26+00:00

Ordered maps require other tradeoffs. For example, c++ has both: std::map and std::unordered_map. std::map is a treemap, and so has logn access/insertion times. It also does a ton of small allocations, and scatters your data across memory, leading to poor cache performance at scale.

But it has ordered output, and doesn't invalidate iterators on insertion (hashmaps might, because they sometimes need to rehash).

Generally, the suggestion that I've heard is to avoid using std::map unless you really really what it specifically provides, because it's expensive and it's hard to know if you can safely relax those constraints.

reply

jb1991 | karma 16037 | avg karma 8.92 · | 2023-03-02 02:43:32

The thing is, it’s not actually effortless in many situations to switch from one map to the other. In C++, if you switch to an unordered map, you now have to contend with the annoying boilerplate of implementing a hash function for many key types. Using an ordered map may sometimes give you a slight hit in performance, but it offers significant benefits in developer convenience, and for small maps, the binary tree could actually be faster anyway.

usefulcat | karma 3745 | avg karma 2.73 · | 2021-03-15 15:05:42

Although std::unordered_map is definitely not nearly as fast as a hash map can be in c++, that's probably not the main problem here. I would bet on iostreams and/or tolower being bigger problems.

Note that tolower is called for every /character/. I don't know everything it does, but I know it's doing some semi-esoteric locale stuff, and in general it's doing a lot more than a naive ascii implementation would. I seem to recall it's even doing a dynamic_cast in there somewhere. For each character of input.

I was curious, so I ran callgrind on it. Here are the top 10 offenders (by Ir, 'instructions retired'). For clarity I have elided many details. Total Ir is 5,647,099,795, so the stuff below accounts for greater than 50% of all Ir.

    1,079,594,033  std::istream& operator>>(std::istream&, std::string&)
      553,657,533  std::istream::sentry::sentry(std::istream&, bool)
      550,159,646  __cxxabiv1::__vmi_class_type_info::__do_dyncast(...)
      426,992,280  __dynamic_cast
      413,240,330  std::_Hash_bytes(void const*, unsigned long, unsigned long)
      383,294,340  tolower 
      230,374,111  std::string::_M_append(char const*, unsigned long) 
      223,653,549  /usr/include/c++/9/bits/stl_algo.h:main
      172,438,098  __strcmp_avx2
      164,226,680  std::ctype const& std::use_facet(std::locale const&)

UncleMeat | karma 13669 | avg karma 2.5 · | 2021-06-22 23:59:59

This has its own problems. Pointer stability limits the internal implementation so much that unordered_map is embarrassingly slow given what we know about modern hashmap design. There is a good reason why the swisstables that Google built dropped this behavior.

danpat | karma 580 | avg karma 3.09 · | 2016-09-09 23:11:27+00:00

For once, C++ got some usability right and created std::map and std::unordered_map which have explicit key ordering behaviour.

Now, back to waiting for my project to compile.....

reply

habibur | karma 2373 | avg karma 3.62 · | 2021-11-23 10:45:06

unordered_map performs even worse than map. Tried it.