I think the rock-paper-scissors dance between throughput, latency, and efficiency that a GC plays is a pretty fascinating topic. It really brings out our biases from the kinds of work we do.
If a service reduces its latency, that only increases the throughput of the overall system. The CPU/memory efficiency of a single process is often much less important because there's usually headroom, and a GC can help put that underutilized CPU and memory to work.
The GC is low latency, but its throughput isn’t great. The key is avoiding the heap by mostly brute forcing trivial data structures on the stack (which is why you see so many repetitive O(n) loops).
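To make the "avoid the heap" point concrete, here's a rough sketch of the idea in Java (the example names are mine): a tiny lookup done as a linear scan over primitive arrays instead of a heap-allocated map, so nothing is allocated on the hot path and the GC never gets involved.

    // Brute-force a trivial "map" with plain arrays. It's O(n) per lookup,
    // but n is small and the loop allocates nothing, so it creates no GC pressure.
    final class SmallLookup {
        private final int[] keys;
        private final int[] values;

        SmallLookup(int[] keys, int[] values) {
            this.keys = keys;
            this.values = values;
        }

        int get(int key, int defaultValue) {
            for (int i = 0; i < keys.length; i++) {  // the repetitive O(n) loop
                if (keys[i] == key) {
                    return values[i];
                }
            }
            return defaultValue;
        }
    }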
I think people generally assume systems programming requires low latency and that using a GC implies high latency. That's mostly, but not strictly, true. If you're writing a web server, as opposed to a missile guidance system, the occasional GC pause is probably not a dealbreaker.
Very good point. Performance folds in both latency (determinism) and throughput, and there is usually a trade-off between the two. GC might handle throughput reasonably, but latency is a bit tougher. You're sort of left tweaking knobs on a black box, hoping to get good results in the end.
Also, throughput. But latency and throughput are almost universally opposite ends of the same axis — that’s why it’s great that Java allows for choosing a GC implementation.
It's perfectly passable for web serving and other such latency-tolerant tasks. In a GC, the goals of throughput and latency are diametrically opposed -- optimizing for one makes the other worse. The present simplistic GC is a typical throughput-oriented design.
You didn't have a bad experience with GC in the past; you had a bad experience with a single GC implementation, one that was almost certainly optimized for throughput rather than latency, in a language that pushes you toward GC pressure by default. :)
GC is more than just pauses. Throughput matters too; in fact, it often matters more than latency (for example, when writing batch jobs like compilers).
GC is a memory management technique with tradeoffs like all the others.
GC has many different implementations, with widely ranging properties. The JVM itself, for example, currently supports at least three different GC implementations. There are also different types of GC: in a generational garbage collection system you'll typically see two or three generations, and objects are promoted between them based on how many GC cycles they have survived. The shortest collections in those systems are usually a couple of milliseconds, while the longest can take many seconds.
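For example, on a recent JDK you can pick a collector with a single flag and watch its behavior in the GC log. A rough sketch (exact flags and log format vary by JDK version; app.jar is just a placeholder):

    # throughput-oriented collector
    java -XX:+UseParallelGC -Xlog:gc -jar app.jar

    # latency-oriented collectors
    java -XX:+UseG1GC -XX:MaxGCPauseMillis=50 -Xlog:gc -jar app.jar
    java -XX:+UseZGC -Xlog:gc -jar app.jar

With a generational collector like G1, the log shows frequent short young-generation pauses and much rarer, longer full collections, which is the pattern described above.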
GC isn't always a problem. If your application isn't latency sensitive, it's not a big deal, though if you tune your network timeouts too low, even something that isn't really latency sensitive can have trouble when GC causes network connections to time out. Even if it is a latency-sensitive application, it can be OK if the GC's "stop the world" pauses (pauses that stop program execution) are short.
One reason you'll see people say GCs are bad is those latency-sensitive applications. For example, I previously worked on distributed datastores where low-latency responses were critical. If our 99th-percentile response times jumped over, say, 250ms, customers would call our support line in massive numbers. These datastores ran on the JVM, where at the time G1GC was the state-of-the-art low-latency GC. If the systems were overloaded or had badly tuned GC parameters, GC times could easily spike into the seconds range.
Other considerations are GC throughput and CPU usage. GC systems can use a lot of CPU; that's often the tradeoff you see with these low-latency GC implementations. GCs can also put a cap on memory throughput. The question tends to be: how much memory can the GC implementation examine, at what CPU cost, and with how much stop-the-world time?
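If you want to put rough numbers on that cost in a JVM process, the standard management beans expose cumulative per-collector counters. A minimal sketch (the class name is mine):

    import java.lang.management.GarbageCollectorMXBean;
    import java.lang.management.ManagementFactory;

    // Print how many collections each collector has run and how much
    // time (ms) it has spent in them since the JVM started.
    public final class GcOverhead {
        public static void main(String[] args) {
            for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
                System.out.printf("%s: %d collections, %d ms total%n",
                        gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
            }
        }
    }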
Isn't one of the benefits of reference counting (RC) better (lower and more predictable) latency than with GC? Basically, you can predict when (and how much) garbage will be collected, and if it turns out that's too much, you can fix your code... good luck doing that with GC.
The downside is, of course, that it requires much more care (to avoid cycles).
Yeah, I'm being unfair in naming Go & Java specifically. But these stories of 'fixing' garbage collection come up all too often.
I wonder when we'll see a further GC update that trades latency for throughput...
The problem seems to be that no matter how you tweak GC, you will always have a class of program that it performs terribly for (and it seems to impact a large group of programs, never just some obscure corner case). So I suspect that this latest GC tweak will have unexpected results on some other class of program, leading to another tweak, and so on...
You don't really have a choice in the matter. If you're writing high-throughput or low-latency applications, you are dependent on the JVM's GC behavior, period.
Digression here, but I like the "(nearly)". This bit always amuses me about garbage collection wonks. The pauseless bit is a real-time requirement. Saying your GC has great latencies or is "(nearly) pauseless" is tantamount to telling a real-time engineer your system only fails some of the time. It makes you look dumb.
GC is great. GC makes a ton of things simpler. GC as implemented in popular environments still sucks for real-time use.
Often such flak ignores the differences between throughput and latency.
For long-lived processes, you'll end up writing some kind of garbage collection system.