One thing that I hadn't fully understood until recently is that garbage collectors can actually allow you to write more efficient code.
Previously, I had the general understanding that you were trading convenience (not thinking about memory management or dealing with the related bugs) in exchange for performance (GC slows your program down).
That's still true broadly, but there's an interesting class of algorithms where GC can give you a perf. improvement: immutable data structures, typically used in high-concurrency situations.
Consider a concurrent hash map: when you add a new key, the old revision of the map is left unchanged (so other threads can keep reading from it), and your additions create a new revision. Each revision of the map is immutable, and your "changes" to it are really creating new, immutable copies (with tricks, to stay efficient).
These data structures are great for concurrent performance, but there's a problem: how do you know when to clean up the memory? That is: how do you know when all users are done with the old revisions, and they should be freed?
Using something like a reference count adds contention to this high-concurrency data structure, slowing it down. Threads have to fight over updating that counter, so you have now introduced shared mutable state, which was the very thing you were trying to avoid.
But if there's a GC, you don't have to think about it. And the GC can choose a "good time" to do its bookkeeping in bulk, rather than making all of your concurrent accesses pay a price. So, if done properly, it's an overall performance win.
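To make that concrete, here's a minimal Java sketch of the pattern (the class name is made up, and the full-copy-on-write is deliberately naive; real persistent maps share structure between revisions to stay efficient):

    import java.util.Collections;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.concurrent.atomic.AtomicReference;

    // Each "revision" is an immutable Map. Writers publish a new revision with a
    // single CAS; readers just dereference the current revision. Old revisions
    // become unreachable once the last reader drops them, and the GC reclaims
    // them -- no refcounts, no explicit freeing.
    final class CowMap<K, V> {
        private final AtomicReference<Map<K, V>> current =
                new AtomicReference<>(Collections.<K, V>emptyMap());

        V get(K key) {
            return current.get().get(key);            // lock-free read of one revision
        }

        void put(K key, V value) {
            while (true) {
                Map<K, V> old = current.get();
                Map<K, V> next = new HashMap<>(old);  // naive full copy of the old revision
                next.put(key, value);
                if (current.compareAndSet(old, Collections.unmodifiableMap(next))) {
                    return;                           // old revision is garbage once readers move on
                }
            }
        }
    }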
Interestingly, a performant solution without using GC is "hazard pointers," which are essentially like adding a teeny tiny garbage collector devoted just to that data structure (concurrent map, or whatever).
Yeah, but it's actually deeper than just adding refcounts. The algorithms themselves can change in some cases.
The issue is, the hardware can usually only do atomic/interlocked operations at the word level. If you have a GC, then you can atomically update a pointer from one thing to another and not think about the thing that was being pointed to previously: an object becomes unreachable atomically due to the guarantees provided by the GC (either via global pauses or write barriers or both). If you don't have that, then you need to update both a pointer and a refcount atomically, which goes beyond what the hardware can easily do without introducing locks, and locks in turn create new problems like needing an ordering.
Most JVMs take advantage of a thread local "bump" allocator[1] as well to avoid having to cross JVM or kernel boundaries to allocate memory, which can result in huge speedups for memory-intensive use cases.
Bump allocators are incredibly fast, and they're especially efficient in generational GCs where compaction is cheap. However, almost all (maybe all) modern languages don't usually cross kernel boundaries when allocating memory either, including C++'s malloc.
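For anyone unfamiliar with the term, a bump allocator just advances an offset into a pre-reserved block. A conceptual Java sketch (class name made up; this is not the JVM's actual TLAB code, which lives inside the VM):

    // Conceptual bump allocator: carve regions out of one big buffer by advancing
    // an offset. This is roughly what a TLAB does for object allocation: no free
    // lists, no syscalls, just a pointer bump per allocation.
    final class BumpArena {
        private final byte[] buffer;
        private int offset = 0;

        BumpArena(int capacity) {
            this.buffer = new byte[capacity];
        }

        // Returns the start index of a freshly "allocated" region, or -1 if full
        // (a real TLAB would request a new chunk from the shared heap).
        int allocate(int size) {
            if (offset + size > buffer.length) {
                return -1;
            }
            int start = offset;
            offset += size;        // the "bump"
            return start;
        }

        // Freeing is wholesale: reset the whole arena. A young-gen collection is
        // similar in spirit: evacuate survivors, then reuse the region.
        void reset() {
            offset = 0;
        }
    }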
Performance and GC is a tricky topic, partly because there aren't many GC'd languages explicitly designed for performance above usability (maybe D would count? maybe Go?). GC is normally chosen for usability reasons, and then the language has other usability features that reduce performance, and it gets difficult to disentangle them.

Immutability is a common problem. GC makes allocating lots of objects easy, so people make immutable types (e.g. Java's String type), and that forces you to allocate lots of objects, which causes lots of cache misses as the young-gen pointer constantly moves forward, and that slows everything down, whereas a C++ dev might shorten a string by just inserting a NULL byte into the middle of it. Functional programming patterns are a common culprit because of their emphasis on immutability. You bleed performance in ways that don't show up on profiles because they're smeared out all over the program.
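The String point in miniature (a toy example, but it's exactly the allocation pattern that ends up smeared across real programs):

    public class StringGarbage {
        public static void main(String[] args) {
            // Strings are immutable, so each += allocates a brand-new String
            // (plus a temporary builder) and the old one becomes garbage.
            String s = "";
            for (int i = 0; i < 10_000; i++) {
                s += i;
            }

            // The mutable equivalent reuses one growing buffer and allocates
            // a tiny fraction of the objects.
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 10_000; i++) {
                sb.append(i);
            }
            System.out.println(s.equals(sb.toString()));   // true
        }
    }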
Another complication is that people talk about the performance of languages, when often it's about the performance of an implementation. The most stunning example of this is TruffleRuby in which the GraalVM EE Ruby runtime often runs 50x faster or more than standard Ruby. Language design matters a lot, but how smart your runtime is matters a lot too.
A final problem is that many people associate GC with scripting languages like Python, JavaScript, Ruby, PHP, etc., which often have poor or non-existent support for multi-threading. That of course makes it hard to get good performance on modern hardware, and it gets generalized to all GC languages.
Well, there's still truth to it in other cases, I think. One terrible thing GCs can do is make your performance unpredictable. In some performance-sensitive situations (e.g. video games), your worst-case performance is more important than your average case. Adding a GC can mess with that worst-case behavior, and in unpredictable ways.
That being said, modern GCs are much better (less "stop the world" stuff), and more configurable. But it's still a real concern.
GCs also have very different memory access behavior, and so can be worse on a system with swap (or use more power on a system without one) even if theoretically they're more efficient.
Yes, you can't push Discord-levels of messages with a GC language (as their famous "we moved from Go to Rust" post told us all).
But you're not going to be at those levels any time soon; GC pauses are just fine in a good 99.99% of cases.
This kind of thinking is a type of premature optimization: you skip whole categories of technology (garbage collection) because you "feel" it's slow based on what you read about an edge case. =)
Except WhatsApp, at least, used to run on Erlang, which definitely was pushing that kind of data.
GC languages can have much better performance than non-GC languages (for many reasons, several of them mentioned here), with the main disadvantage being unpredictability. However, the unpredictability is not going to be a problem for a chat application.
99.9% of those posts from startup engineers saying "we migrated from X to Y" exist because they wanted to play with technology Y and rationalised their decision.
IIRC they fixed a worst-case latency of like 300ms in their LRU cache. These were periodic spikes. I doubt users really noticed a change from this. Maybe it saved them some money...
To beat a GC, you need really tight memory management. It can mean things like buffer reuse, bulk allocations, custom allocators, etc... But if you do, depending on your workload, you can get big performance improvement. Video games commonly use these techniques.
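Buffer reuse, for what it's worth, is the same trick in any language; a minimal sketch in Java (the pool class is hypothetical), where it shows up as a way to cut allocation and GC pressure on hot paths:

    import java.util.ArrayDeque;

    // Minimal buffer pool: hand out fixed-size byte[] buffers and take them back,
    // so the hot path stops allocating (and stops producing garbage).
    final class BufferPool {
        private static final int BUFFER_SIZE = 64 * 1024;
        private final ArrayDeque<byte[]> free = new ArrayDeque<>();

        synchronized byte[] acquire() {
            byte[] buf = free.poll();
            return buf != null ? buf : new byte[BUFFER_SIZE];
        }

        synchronized void release(byte[] buf) {
            free.push(buf);   // caller must not touch buf after releasing it
        }
    }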
The downside, obviously, is that it's a lot of effort. It is also unsafe.
And in the end, that's what makes GC languages "slow". It is not really that the GC is slow; it is that programmers using GC languages tend to prefer productivity and safety over performance, so they will usually not optimize their code as much, and they will generally avoid unsafe features, if they are available at all. Programmers using non-GC languages will usually focus more on performance, which is usually the reason why they chose a non-GC language in the first place.
As I always say, C usually tops the chart when it comes to language performance. But it is not because of some kind of C magic; it is that it forces you to micromanage your code, and especially your memory. If you use some kind of framework that gives you GC-like features in C, you will be on the same level as other GC languages, sometimes slower, because that GC-like thing may not be as optimized as a true native GC.
As far as using GC-like features from C goes, one benefit is that it isn't all or nothing. If you use Java, you basically need to have everything be GC'd. If you use C (or something similar like C++ or D), you can pick and choose from the most appropriate memory management techniques for different parts of your code.
C++ has more potential (indeed thanks to metaprogramming), but the average idiomatic C program is often faster. C++ tends to do a lot of implicit allocation, or initialize values even if not necessary. Also, smart pointers are not free (at least shared_ptr and weak_ptr are not).
If you have a good understanding of how C++ works, eschew convenient but costly C++ features, and care to optimize, you may get better performance than C. I guess that's part of the reason why most game engines are made in C++.
It's important to note, though, that you will only get that speed if you turn on compiler optimizations, which is actually a problem, as debug builds may end up unusably slow.
GC just describes a set of algorithms that track and manage resources at runtime. GC-languages would be languages that use a GC for all or nearly-all memory management.
But non-GC languages can absolutely use GC algorithms. It's actually pretty common - Chromium uses GC in C++, for example. This lets you selectively choose where your GC applies and tune it for not just the application but for that specific bit of the application.
There are lots of interesting algorithms for this that solve exactly the problem of "I have a concurrent data structure and I had to swap out a pointer" without a refcount - for example, epoch based collection.
Well put. I find it fascinating to watch memory-safe runtimes converge on automatic memory management (via GC or ARC) and owner/borrower models. I'm just not sure which I like better, or if I'm thinking too imperatively.
> But if there's a GC, you don't have to think about it. And the GC can choose a "good time" to do its bookkeeping in bulk, rather than making all of your concurrent accesses pay a price. So, if done properly, it's an overall performance win.
In many environments, you can explicitly force a GC collection from application code. I've got a few situations where explicitly running GC helps reduce latency/jitter, since I can decide precisely where and how often it occurs.
In my environment, calling GC.Collect more frequently than the underlying runtime would on its own typically results in the runtime-induced collections taking less time (and occurring less frequently). But there is a tradeoff in that you are stopping the overall world more frequently (i.e. every frame or simulation tick), and theoretical max throughput drops off as a result.
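The Java-flavoured equivalent, hedged accordingly: System.gc() is only a request to the JVM (and can be disabled with -XX:+DisableExplicitGC), unlike .NET's GC.Collect, but the scheduling idea is the same. The class and method names here are made up:

    // Ask for collections in the slack time between ticks, so pauses land where
    // the application can afford them instead of in the middle of a tick.
    public class TickLoop {
        public static void main(String[] args) throws InterruptedException {
            final long tickMillis = 100;
            for (int tick = 0; tick < 1_000; tick++) {
                long start = System.currentTimeMillis();
                doSimulationTick(tick);

                // Trade some throughput for steadier per-tick latency.
                System.gc();

                long elapsed = System.currentTimeMillis() - start;
                if (elapsed < tickMillis) {
                    Thread.sleep(tickMillis - elapsed);
                }
            }
        }

        private static void doSimulationTick(int tick) {
            // placeholder for the real per-tick work
        }
    }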
Batching is the best way to do GC, but it is sometimes catastrophic for the UX.
One advantage of these kinds of "localised" garbage collection approaches compared to typical GC is that the cost scales with the size of the specific collection, as opposed to with the total allocation rate / memory usage of the entire program.
No, because if you have to spend e.g. 80% of your lifetime building the application, and having no GC would require you to spend an additional 30% of your lifetime on memory management, then tough luck.
Writing C++ code that can outperform GC memory management is practically idiomatic at this point. It is muscle memory for people that do it for a living. There is nothing a GC can do that cannot be done as well or better in modern C++.
The real disadvantage of a GC is that it makes schedule-based optimizations difficult-to-impossible, which are a major optimization class in modern systems code. Even if it performed identically to manual memory management (which it doesn't), the loss of scheduling control would adversely impact performance of codes that make heavy use of schedule-based optimizations, e.g. modern database kernels.
This is done all the time, I don't know where people get the idea that a GC is some sort of magic that enables lock free algorithms. There are tons of modern C++ lock free algorithms out there.
> Unfortunately Modern C++ is mostly found in conference talks.
I would suggest that you might be projecting your experience on the rest of the industry. I’ve seen nothing but modern C++ at several companies for the last decade. I haven’t seen legacy C++ anywhere in a really long time, thankfully. I occasionally see very early C++11, which is definitionally modern though dated by modern standards. Baselines tend toward C++17 these days on average.
Good luck implementing lock-free algorithms without the help of a GC
The magic is called productivity
that doesn't mean the authors had any fun
You went from implying that it can't be done, to saying a GC is about productivity, to saying it's about "fun". The goal posts are all over the place here.
Meanwhile, I've done it over and over in C++. The reality is that memory allocation shouldn't be a big part of a fast data structure anyway, let alone a good lock free algorithm, because if you're relying on allocation that much you are hopping around in memory, not to mention leaving a big part of your data structure to a black box.
Unfortunately Modern C++ is mostly found in conference talks. Most of the real-life code looks quite Orthodox C++ to me, including from big-name ISO members in their SDKs.
I don't know what either of these is supposed to mean.
If you want to be explicit, tell me what algorithms need a garbage collector, because this all sounds a lot more like something you want to be true or want other people to agree with than something learned from extensive experience.
The problem I see with the GC and with immutable data structures is somewhat unbounded memory consumption.
In the former, and general, case because it takes an uncertain amount of time for objects to be released and memory to be reclaimed.
In the latter case because you're essentially creating, and then keeping, snapshots of the same but slightly different data, which is also going to be released at some uncertain moment.
In high-concurrency scenarios, for both of these cases, peak memory consumption is going to present a problem. Imagine a database kernel servicing hundreds, if not thousands, of requests at the same time. The machine will eventually run out of memory under such high pressure, and at that moment you will either swap or die. Choose your poison.
But to be honest, you don't need either of those to run into such unbounded memory consumption problems. It's enough to litter your codebase with unwieldy use of std::shared_ptr and it will have the same effect.
Probably; I don't know if a GC can be that precise and timely. But nonetheless, my point was that this is more likely to happen with immutable data structures than with a lock-free or mutex-ey sharded design.
I think that's the other side of the same coin - GC lets you write more efficient code (for certain definitions of efficient) by default, but they're always possible to beat if you're more aware of lifetimes and manage them explicitly. GC is an easily achieved local maximum; badly written or unoptimized non-GC code (e.g., making everything a shared_ptr in C++) will be worse, while well-written non-GC code will be better. That may well be the right tradeoff most of the time, where either the delta between the absolute and local maxima is small or the effort it would take to surpass GC is too great, but there are occasionally workloads or patterns that are very easy for a human to optimize but hard for a generic GC.
> how do you know when to clean up the memory?
Right, you have to pick a time that's optimal for your application. It's not trivial, but a GC would only be able to take a guess, whereas the developer can know based on the program's logic when an optimal time is - for instance, at the end of a request in an RPC server (https://developers.google.com/protocol-buffers/docs/referenc...).
> Using something like a reference count adds contention to this high-concurrency data structure, slowing it down
I rarely find this to be true. If a thread is going to be repeatedly requesting access to a structure, it can usually hold the reference throughout. If there are lots of ephemeral threads that independently need to grab refcounts, usually there's a higher-level coordinator that's spawning them that can own a single reference. If you really need to do something like a mark-and-sweep, there's usually a custom way to do it that's more efficient than a generic GC would be (e.g., with generational immutable data structures, if you can guarantee that versions are referenced in order, then if the min version referenced is N, you can immediately drop all versions <N without needing to do a full sweep).
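A sketch of that last idea (all class and method names hypothetical): snapshots are keyed by version, readers pin the version they're using, and releasing recomputes the minimum pinned version and drops everything older, with no sweep:

    import java.util.HashMap;
    import java.util.Map;
    import java.util.concurrent.ConcurrentSkipListMap;
    import java.util.concurrent.atomic.AtomicLong;

    final class VersionedStore<T> {
        private final ConcurrentSkipListMap<Long, T> versions = new ConcurrentSkipListMap<>();
        private final Map<Long, Integer> pinned = new HashMap<>();   // version -> reader count
        private final AtomicLong nextVersion = new AtomicLong();

        long publish(T snapshot) {
            long v = nextVersion.incrementAndGet();
            versions.put(v, snapshot);
            return v;
        }

        // Reader pins the latest version until it calls release().
        // Assumes at least one publish() has happened first.
        synchronized long acquireLatest() {
            long v = versions.lastKey();
            pinned.merge(v, 1, Integer::sum);
            return v;
        }

        T read(long version) {
            return versions.get(version);
        }

        synchronized void release(long version) {
            pinned.compute(version, (k, n) -> (n == null || n <= 1) ? null : n - 1);
            // Everything older than the oldest pinned version is no longer
            // reachable (versions are acquired in order), so drop it right now.
            long min = pinned.isEmpty()
                    ? versions.lastKey()
                    : pinned.keySet().stream().min(Long::compare).get();
            versions.headMap(min).clear();   // headMap(min) excludes min itself
        }
    }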
The developer can only know this to the extent that they control everything happening in memory. The moment libraries maintained by disparate teams enter the picture, especially if they are binary-only or otherwise encapsulated, defensive coding starts, because you can't reason about lifetimes across the whole program anymore. It also massively complicates any API surface, as now every API must declare its refcounting and ref-holding rules. GC gets rid of all of that complexity because the GC actually does see the whole program.
Speaking as someone who has benchmarked the shit out of such things, I can tell you it is most certainly not free. Building and maintaining the live graph is an insane task to do from the perspective of performance, and it shows up in CPU metrics.
For one particular task we went as far as writing the equivalent in C and the single-threaded C version was twice as fast as the Java version in wall time and 20 times as fast in total CPU usage.
Love Java for the productivity. But don’t be fooled that it is remotely optimal. Just realize that you aren’t going to rewrite that thing in C or Rust, and be content. Java is a nice Mercedes with a warranty and a service plan. You can’t afford a Bugatti. But don’t pretend the Mercedes is faster.
While I agree Java is almost never the fastest, I'll still object it's extremely hard to meaningfully benchmark these things. It's hard to benchmark even within the same language.
There is so much hidden state in the JVM. Like, are you running out of TLAB space? How aggressively has the code been optimized? Does the JVM elide any object creations? It may be tempting to try to compensate for these things, to isolate the benchmark from the statefulness of the JVM, but that also isolates the results from any sort of practical applicability, because real-world code exists in the context of a stateful JVM.
It's also difficult to generalize benchmarks of allocation costs alone because it's rare for code to only be allocating stuff and this isn't something anyone is optimizing for. The GC will choke if you don't give it some breathing room. You'll get resource contention that realistically wouldn't be there in real-world scenarios because real code doesn't (typically) just allocate objects for minutes on end.
OP>Using something like a reference count adds contention to this high-concurrency data structure, slowing it down.
So instead we'll have an engine that walks the DAG all the time; that trashes the CPU caches; in a language where there are no nested structures or stack allocations, so that what might be a single allocation in C or C++ or Rust (if an allocation happened at all) ends up being 28 for a single API call. Then you've either got to traverse the DAG completely every cycle, or you've got to write somewhere else that the object has changed and that thing has the same (or worse) access cost as a reference count. I know that there are very clever people working on very clever algorithms, but at this point, most of these algorithms have crossed into optimizations that work great 99% of the time, and it is the 1% that brings your app down in production.
And I'm not even complaining: if I have to architect my application so that it submits to the rules of the GC algo just so I can use Java, well that's often a fair trade. But I'm not pretending it's free.
OP>Threads have to fight over updating that counter,
I've done a lot of work that involved exactly this, and measured it, and they just don't: most of the time they just don't because it is incredibly unlikely for two threads to be hitting the same object at the same time. Some languages (Swift) even identify that a refcount doesn't need to be taken at all. If you do have a "central" object with such contention then you switch to a different locking model.
A lot of the difference is that .NET has actual value types so is less dependent on GC performance. Java came from a brief period when a lot of people thought GCs were magic and so they made their GC problems 10x worse than any other remotely fast language.
Actually, many didn't: except for Smalltalk and SELF, all other languages around that time with automatic memory management did have support for value types.
Nowadays, with modern Java collectors, you really don't touch any knobs outside of -Xmx (max heap), or, if you have special constraints, choosing a different GC implementation like -XX:+UseZGC.
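i.e. something like (heap size and jar name hypothetical):

    java -Xmx8g -XX:+UseZGC -jar my-service.jar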
.NET GCs have plenty of knobs to turn as well, and since it allows C++ level of memory management, knowing when to use value types and native heap allocations is also part of those knobs.
Static variables. Know that you should be extremely cautious with these, especially if you're using them to store big maps or dictionaries or things like that. Beyond statics, honestly, I've never run into anything else that's not cured by changing your command line a bit.
Because a static reference causes objects referenced by it to never be GC'd. As a simple hack, someone might add a singleton implemented as a static field of a class. That reference could be/have a structure/collection that accumulates and never gets freed.
You can find these things using the Java Memory Analyzer and the OQL it supports. Sometimes easy, sometimes tricky.
Be nice to your garbage collector. Don't leave references to objects that are no longer needed. Once you're done with them, and the enclosing scope is going to stick around, null them.
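A minimal illustration of both points (the static root, and the explicit "nulling" done here by removing the map entry); the class and method names are made up:

    import java.util.HashMap;
    import java.util.Map;

    public class LeakyCache {
        // Static root: anything reachable from here can never be collected, so
        // this map grows for the lifetime of the process unless entries are
        // removed explicitly.
        private static final Map<String, byte[]> CACHE = new HashMap<>();

        static void remember(String key, byte[] payload) {
            CACHE.put(key, payload);
        }

        // The "null it when you're done" step: drop the entry so the GC can
        // reclaim the payload once nothing else points at it.
        static void forget(String key) {
            CACHE.remove(key);
        }
    }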