> Single-threaded performance is a toss up and depends on work load.
I would actually say the exact opposite is true. Single threaded performance is much more reliable and every single application can use it. Multithreaded performance is much more workload dependent, and there are many applications that can’t fully utilize it.
> This win will be constrained by Amdahl's law. A single program/process that is multithreaded can effectively use only a finite number of cores.
This assumes that the workload is fixed. A classical example would be compiling source code: you have a fixed number of source files and the compiler is only able to parallelize so much.
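To put a number on the "fixed workload" case: Amdahl's law says the speedup from N cores is 1 / ((1 - p) + p/N), where p is the fraction of the work that parallelizes. A tiny sketch (the 90% parallel fraction is an illustrative guess, not a measurement of any real build):

    // Amdahl's law for a fixed workload, e.g. a build with a serial link step.
    #include <cstdio>

    int main() {
        const double p = 0.9;  // assumed: 90% of the work parallelizes
        for (int cores : {1, 2, 4, 8, 16, 64}) {
            double speedup = 1.0 / ((1.0 - p) + p / cores);
            std::printf("%2d cores -> %4.1fx speedup\n", cores, speedup);
        }
        // Even with 64 cores the speedup is capped near 1/(1 - 0.9) = 10x.
    }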
On the other hand, there are applications where more threads allow you to increase the workload itself. For example, a modern DAW (digital audio workstation) is able to render tracks in parallel; if you have more threads, you can use more tracks with more plugins. In a computer game, more threads might mean more complex simulations, more objects, better graphics, etc.
Having multiple threads does not mean that they are all doing equally useful work. Single threaded performance is absolutely critical for a desktop machine.
Even in multithreaded desktop applications, it's rare to see them effectively use more than 8 threads.
> only relevant for workloads that can't be parallelized
This ends up dominating anyway. Partly because the easy stuff does get parallelized. So you're left with things like "browser layout engine" being single threaded. Almost all systems have single-threaded UIs, because trying to do anything else becomes unreasonably hard to reason about. The UI thread will delegate work to other threads, but there is nearly always a single "UI thread" bottleneck.
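A toy sketch of that shape (no real GUI toolkit here, the "UI" is just a printf loop, and the polling a real event loop does is simplified away): workers do the heavy lifting, but every result is marshalled back onto the one thread that owns the UI state.

    // Workers compute; only the UI thread touches UI state.
    #include <cstdio>
    #include <mutex>
    #include <queue>
    #include <string>
    #include <thread>

    std::mutex ui_mutex;
    std::queue<std::string> ui_events;   // only the UI thread pops from this

    void post_to_ui(std::string msg) {
        std::lock_guard<std::mutex> lk(ui_mutex);
        ui_events.push(std::move(msg));
    }

    int main() {
        std::thread worker([] {                    // e.g. layout, decoding, network...
            post_to_ui("page 1 laid out");
            post_to_ui("page 2 laid out");
        });
        worker.join();                             // simplification: a real UI polls while workers run
        while (true) {                             // the single UI-thread event loop
            std::lock_guard<std::mutex> lk(ui_mutex);
            if (ui_events.empty()) break;
            std::printf("UI thread handles: %s\n", ui_events.front().c_str());
            ui_events.pop();
        }
    }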
> How would this make any difference for a single threaded program?
Coz can slow down individual sections, not just whole threads; that lets it show you how your program would behave with specific lines sped up, regardless of how many threads there are. A modern computer is also a distributed system, so it has the same counter-intuitive performance behaviours, even when you just look at the memory hierarchy.
Huh? Of course it has some benefits. Unfortunately everything is just trade-offs. One benefit of single-threadedness is that you don't have to deal with certain pitfalls of multi-threadedness.
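For reference (going from memory of the coz README, so treat the details as approximate): you drop a progress point into the code with the macros from coz.h, run under `coz run --- ./your-program`, and it reports how much throughput would improve if individual lines were sped up. Something along these lines, with made-up stand-in work:

    // Single-threaded program instrumented with a Coz progress point.
    #include <coz.h>

    static long parse(long i)   { return i * 3 + 1; }   // stand-in for real work
    static long process(long v) { return v % 1000; }    // stand-in for real work

    int main() {
        long checksum = 0;
        for (long i = 0; i < 50'000'000; ++i) {
            long v = parse(i);        // coz estimates the effect of speeding up lines like this
            checksum += process(v);
            COZ_PROGRESS;             // progress point: "one item of work finished"
        }
        return checksum == 0 ? 1 : 0; // keep the loop from being optimized away
    }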
> just spawning as many threads as there are cores is the optimal amount of threads in a thread pool for every use case
Absolutely not.
Any task with any amount of I/O will spend a significant amount of time blocked. A GPU kernel may take microseconds or milliseconds to respond. RDMA (remote direct memory access over the network) may take many microseconds.
Having multiple threads per core would be more efficient: it gives the cores something to do while waiting for the SSD, RDMA, or GPUs. Remember: even the earliest single-core time-sharing systems from the 1970s ran multiple threads on one core; it's just more efficient to have multiple terminals to read from and write to at a time.
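A toy illustration of the waiting (not a benchmark; the 5 ms sleep is a stand-in for an SSD, RDMA, or GPU round trip): the same batch of fake requests finishes much sooner when the pool is oversubscribed, because the waits overlap instead of serializing on idle cores.

    #include <chrono>
    #include <cstdio>
    #include <thread>
    #include <vector>

    using namespace std::chrono;

    void run_batch(int total_requests, int num_threads) {
        auto start = steady_clock::now();
        std::vector<std::thread> workers;
        int per_thread = total_requests / num_threads;
        for (int t = 0; t < num_threads; ++t)
            workers.emplace_back([per_thread] {
                for (int i = 0; i < per_thread; ++i)
                    std::this_thread::sleep_for(5ms);   // blocked, exactly like real I/O
            });
        for (auto& w : workers) w.join();
        auto ms = duration_cast<milliseconds>(steady_clock::now() - start).count();
        std::printf("%4d threads: %lld ms\n", num_threads, (long long)ms);
    }

    int main() {
        int cores = (int)std::thread::hardware_concurrency();
        if (cores == 0) cores = 8;        // hardware_concurrency may report 0
        run_batch(cores * 64, cores);     // one thread per core: ~320 ms of mostly idle waiting
        run_batch(cores * 64, cores * 4); // oversubscribed: ~80 ms, the waits overlap
    }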
--------
One software thread per hardware thread (hardware thread rather than core, since SMT8 machines like POWER9/POWER10 exist) is only efficient in the most computationally intensive situations, which are in fact a rarity in today's world. Your typical program will be waiting on the network interface or the SSD.
IIRC there are professional thread pools out there that default to 1.5 threads per hardware thread, and then scale up/down depending on how computationally expensive the work looks. That is: a 64-core/128-thread Threadripper would be oversubscribed with 192 threads, under the assumption that at least 33% of them would be waiting on I/O at any given time.
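In C++ terms the default would look something like this (the 1.5x factor is just the heuristic I'm recalling, not a universal constant):

    #include <algorithm>
    #include <thread>

    unsigned default_pool_size() {
        unsigned hw = std::thread::hardware_concurrency(); // e.g. 128 on a 64c/128t Threadripper
        if (hw == 0) hw = 1;                                // may legitimately report 0
        return std::max(1u, hw + hw / 2);                   // 128 -> 192 worker threads
    }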
The part of Armstrong's argument I was (explicitly) referring to was not about relative gains from multi-threading but about presumed absolute losses of single-thread performance. My argument against this is not refuted by relative gains from multi-threading.
Of course you realize even bigger gains on many common workloads using parallelism, but this part of his argument doesn't need the first part, which was wrong.
Because people still buy new computers and use those applications on them – that's why single-thread performance is (for the moment at least) still important.
I do understand where you're coming from, but real-world performance is important, especially when that world is imperfect.
Depends on the program. If it needs to, say, synchronize hundreds of thousands of entities every 16ms, it's probably better to go single-threaded instead...
> you're talking about one thread per core, not one thread per unit of work?
Yes, a thread pool consisting of one thread per core/compute unit. The units of work are then scheduled across the threads. A unit of work here is some kind of I/O, e.g. servicing an HTTP request.
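A minimal sketch of that shape (the names are mine, and a real server would likely sit on top of an existing framework rather than a hand-rolled queue):

    // One worker thread per hardware thread, pulling units of work off a shared queue.
    #include <algorithm>
    #include <condition_variable>
    #include <functional>
    #include <mutex>
    #include <queue>
    #include <thread>
    #include <vector>

    class WorkQueuePool {
    public:
        explicit WorkQueuePool(unsigned n = std::thread::hardware_concurrency()) {
            for (unsigned i = 0; i < std::max(n, 1u); ++i)
                workers_.emplace_back([this] { worker_loop(); });
        }
        ~WorkQueuePool() {
            { std::lock_guard<std::mutex> lk(m_); done_ = true; }
            cv_.notify_all();
            for (auto& w : workers_) w.join();
        }
        void submit(std::function<void()> unit_of_work) {   // e.g. "service this HTTP request"
            { std::lock_guard<std::mutex> lk(m_); q_.push(std::move(unit_of_work)); }
            cv_.notify_one();
        }
    private:
        void worker_loop() {
            for (;;) {
                std::function<void()> task;
                {
                    std::unique_lock<std::mutex> lk(m_);
                    cv_.wait(lk, [this] { return done_ || !q_.empty(); });
                    if (done_ && q_.empty()) return;
                    task = std::move(q_.front());
                    q_.pop();
                }
                task();   // run the unit of work on whichever thread picked it up
            }
        }
        std::mutex m_;
        std::condition_variable cv_;
        std::queue<std::function<void()>> q_;
        std::vector<std::thread> workers_;
        bool done_ = false;
    };

Usage is then just pool.submit([]{ /* handle one request */ }); the pool size stays fixed at the core count while the number of outstanding units of work varies.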
> but if you want to spin up a few hundred thousand of them...
Hm. I thought there was a limit on how much work can be done concurrently by the CPU, based on the number of cores/hyper-threads available. I found this piece on threads and I/O performance [1]; it seems to make the same point.
What kind of workload is commonly spread over so many threads (on the same machine)? Does the OS switch efficiently between hundreds of threads on regular CPUs? Genuinely interested.
Fire up your activity monitor or task manager and you can see that most applications have much more than one thread. Many even have tens of threads.