GC is more than just pauses. Throughput matters too; in fact, it often matters more than latency (for example, when writing batch jobs like compilers).
You actually want GC pauses in a challenge like this, because it's a pure batch problem with no latency constraints at all (like a compiler). When that's the case you can use GC algorithms that are optimized for throughput rather than pause times, and things will go faster.
It probably won't make much difference on this problem for various reasons, but normally that's the case.
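To make the throughput side of this concrete, here is a rough sketch in Python. CPython has no selectable collectors, so this is only an analogy: it raises the cyclic collector's first threshold so that collections run less often, which is the same "fewer, bigger collections" trade a throughput-oriented GC makes. The function name `count_collections` and the workload are made up for illustration.

```python
import gc


def count_collections(threshold0, iterations=20_000):
    """Count cyclic-GC runs while creating `iterations` reference cycles.

    `threshold0` is the allocation count that triggers a young collection.
    Hypothetical helper, not part of any library.
    """
    runs = 0

    def on_gc(phase, info):
        nonlocal runs
        if phase == "start":
            runs += 1

    old_threshold = gc.get_threshold()
    was_enabled = gc.isenabled()
    gc.collect()  # start from a clean slate
    gc.set_threshold(threshold0, *old_threshold[1:])
    gc.enable()
    gc.callbacks.append(on_gc)
    try:
        for _ in range(iterations):
            # Two lists pointing at each other: only the cyclic GC can free them.
            a, b = [], []
            a.append(b)
            b.append(a)
    finally:
        # Restore the interpreter's previous GC configuration.
        gc.callbacks.remove(on_gc)
        gc.set_threshold(*old_threshold)
        if not was_enabled:
            gc.disable()
    return runs


frequent = count_collections(700)        # small threshold: many short collections
rare = count_collections(1_000_000)      # large threshold: few or no collections
```

With the larger threshold the collector runs far less often during the loop, at the cost of letting more garbage pile up in between. That is fine for a batch job and exactly what you would avoid in a latency-sensitive one.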
I think people generally assume systems programming requires low latency and that using a GC implies high latency. That's mostly, but not strictly, true. If you're writing a web server, as opposed to a missile guidance system, the occasional GC pause is probably not a dealbreaker.
Pause-free GCs have existed for years, for what it’s worth. They do still have some CPU overhead in many cases, but it’s a question of throughput, not latency.
I mostly agree with what you're saying, but I'll also add that GC pauses are mostly a problem of yesteryear unless you're either managing truly enormous amounts of memory or have hard real-time requirements (and even then it's debatable). Modern GCs, as seen in Go, Java 11+, and .NET 4.5+, guarantee sub-millisecond pauses on terabyte-large heaps (I believe the JS GC does as well, but I'm less sure).
Let me add: the key point of per-process GC is that when heaps are small, pause times are also small. And since GC work is counted against a process's reductions, processes that GC frequently get swapped out regularly and do not hurt the latency of other processes.
Yes, these days, GC might easily be the biggest problem to the point that focusing on anything else would be a waste of time and premature optimization.
GC pauses can be anywhere from 300 ms to 30 seconds or more once they start becoming an issue.
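Whether pauses land anywhere near that range for a given workload is something to measure rather than assume. As a minimal sketch, assuming CPython: the standard `gc.callbacks` hook fires at the start and end of each cyclic collection, so the gap between the two can be timed. `PauseRecorder`, `measure_pauses`, and `cyclic_garbage` are illustrative names, and this only sees the cyclic collector, not the cost of ordinary reference-count frees.

```python
import gc
import time


class PauseRecorder:
    """Records the wall-clock duration of each cyclic-GC run."""

    def __init__(self):
        self.pauses = []
        self._start = None

    def __call__(self, phase, info):
        # The gc module calls this with phase "start" and then "stop".
        if phase == "start":
            self._start = time.perf_counter()
        elif phase == "stop" and self._start is not None:
            self.pauses.append(time.perf_counter() - self._start)
            self._start = None


def measure_pauses(workload):
    """Run `workload()` and return the list of GC pause durations in seconds."""
    recorder = PauseRecorder()
    gc.callbacks.append(recorder)
    try:
        workload()
    finally:
        gc.callbacks.remove(recorder)
    return recorder.pauses


def cyclic_garbage(n=50_000):
    """Toy workload: create reference cycles, then force one full collection."""
    for _ in range(n):
        a, b = [], []
        a.append(b)
        b.append(a)
    gc.collect()


pauses = measure_pauses(cyclic_garbage)
worst = max(pauses)  # the number you would plot next to response times
```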
This should reduce pauses, I think. However, it's important to note that latency (short pauses) and raw throughput are antagonistic criteria for GCs.
A concurrent GC means that you can collect garbage while other code is running, but to get it working you need to introduce overhead both when reading and writing, which means that time spent on GC-related things will go up, even if latency goes down.
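A toy sketch of where that overhead comes from, assuming an incremental mark phase with a Dijkstra-style insertion write barrier (one of several barrier designs, and only the write side; read barriers are not shown). `ToyHeap` and `run` are made up for illustration, and for simplicity the barrier here fires on every store, whereas real collectors typically enable it only while marking is in progress.

```python
class ToyHeap:
    """Object graph as {object id: [referenced ids]} with incremental marking."""

    def __init__(self, refs, use_barrier=True):
        self.refs = {obj: list(children) for obj, children in refs.items()}
        self.use_barrier = use_barrier
        self.marked = set()    # objects the collector knows are reachable
        self.grey = []         # marked objects whose children are not scanned yet
        self.barrier_hits = 0  # extra work paid by the running program

    def _shade(self, obj):
        if obj not in self.marked:
            self.marked.add(obj)
            self.grey.append(obj)

    def start_marking(self, roots):
        for root in roots:
            self._shade(root)

    def mark_step(self):
        """Scan one grey object; return True once marking has finished."""
        if self.grey:
            obj = self.grey.pop()
            for child in self.refs[obj]:
                self._shade(child)
        return not self.grey

    def write(self, src, dst):
        """Program stores a reference src -> dst, going through the barrier."""
        self.refs[src].append(dst)
        if self.use_barrier:
            self.barrier_hits += 1
            self._shade(dst)

    def unlink(self, src, dst):
        self.refs[src].remove(dst)


def run(use_barrier):
    heap = ToyHeap({"a": ["b"], "b": ["c"], "c": []}, use_barrier)
    heap.start_marking(["a"])
    heap.mark_step()        # scans "a"; "b" is grey, "c" not yet seen
    heap.write("a", "c")    # the program keeps running mid-collection...
    heap.unlink("b", "c")   # ...and moves the only path to "c"
    while not heap.mark_step():
        pass
    return heap
```

Without the barrier, `"c"` is still reachable from `"a"` but never gets marked, so it would be freed while in use. With the barrier it survives, and the price is bookkeeping on every pointer store.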
In fact, if memory serves, the best GC strategy for raw throughput is actually stop-the-world, since there's so little overhead for memory accesses (basically you can just follow pointers without any bookkeeping). Of course, ten-second (or multi-minute) GC pauses are probably unacceptable for most tasks. =)
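For comparison, a minimal sketch of a stop-the-world mark-and-sweep over a toy heap, assuming the program is fully paused while it runs. The heap is modeled as a plain dict and `mark_sweep` is an illustrative name. Note that the mark loop just follows pointers, with no barriers or per-access bookkeeping.

```python
def mark_sweep(heap, roots):
    """Collect a toy heap in place while the 'program' is paused.

    heap:  dict mapping object id -> list of object ids it references
    roots: iterable of object ids that are directly reachable
    Returns the set of surviving object ids.
    """
    # Mark: plain graph traversal from the roots.
    marked = set()
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if obj in marked:
            continue
        marked.add(obj)
        stack.extend(heap[obj])

    # Sweep: drop everything the traversal never reached.
    for obj in list(heap):
        if obj not in marked:
            del heap[obj]
    return marked


# "d" and "e" reference each other but are unreachable from the root "a".
toy_heap = {"a": ["b"], "b": ["c"], "c": [], "d": ["e"], "e": ["d"]}
survivors = mark_sweep(toy_heap, roots=["a"])
```

The catch is the one mentioned above: the pause lasts as long as the whole traversal, so it grows with the amount of live data.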
In throughput-optimized GC setups (not latency optimized) that's the case. Though free(), despite being seemingly "constant", can have unpredictable pause times as well when you hit a bad point.
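One related effect that is easy to demonstrate (this is not the allocator-internal behavior of `free()` itself, just a neighboring case): under reference counting, dropping a single reference can trigger an arbitrarily long cascade of releases. A small sketch, assuming CPython's reference counting; `Node` and `release_cost` are hypothetical names.

```python
class Node:
    freed = 0  # class-level counter of finalized instances

    def __del__(self):
        Node.freed += 1


def release_cost(n):
    """Return how many objects get finalized by one `del` of their owner."""
    Node.freed = 0
    owner = [Node() for _ in range(n)]
    del owner  # one release; CPython frees every element right here
    return Node.freed


small = release_cost(10)
large = release_cost(100_000)
```

The `del` looks like one constant-time operation at the call site, but the work it triggers scales with the size of the structure being released.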
ZGC's pause latency is at a point where it rivals, or is about to rival, nondeterministic pauses by the OS. So unless you're running on a realtime kernel, GC is not an issue any more as far as latency is concerned. The only real, serious cost for modern GCs is RAM overhead.
I don't write software at FAANG scale and I've been bitten by GC pauses in other languages. Even when it doesn't cause issues, you can see it in the response-time charts.
Not having to worry about that is very nice.
Using less memory is not terribly important, I agree, but it is a nice bonus that comes in handy when working on embedded systems, video games, or in constrained environments (for example, when you can set a lower memory limit on your containers and run more of them on the same machine).
Very good point. Performance folds in latency (determinism) and throughput, and there is usually a trade-off between the two. GC might handle throughput reasonably well, but latency is a bit tougher. You are sort of left tweaking knobs on a black box, hoping to get good results in the end.
Digression here, but I like the "(nearly)". This bit always amuses me about garbage collection wonks. The pauseless bit is a real time requirement. Saying your GC has great latencies or is "(nearly) pauseless" is tantamount to telling a real time engineer your system only fails some of the time. It makes you look dumb.
GC is great. GC makes a ton of things simpler. GC as implemented in popular environments still sucks for real time use.