Hacker Read

Hacker Read top | best | new | newcomments | leaders | about | bookmarklet

login

		Why Discord is switching from Go to Rust (blog.discordapp.com) similar stories update story
		1582.0 points by Sikul \| karma 956 \| avg karma 15.17 2020-02-04 17:30:40+00:00 \| hide \| past \| favorite \| 671 comments

view as:

johnmc408 | karma 39 | avg karma 1.08 2020-02-04 18:06:37 | [–] similar comments

Non programmer here, but would it make sense to add a keyword (or flag) to Go to manually allocate a piece of memory (ie not use GC). That way, for some use cases, you could use avoid GC for the critical path. Then when GC happened, it could be very fast as there would be far less to pause-and-scan (in this use case example). Obviously this would have to be optional and discouraged...but there seems to be no way to write an intensive real-time app with a GC based language. (again non-programmer that is writing this to learn more ;-)

vips7L | karma 2543 | avg karma 1.69 2020-02-04 18:26:26+00:00 | [–] similar comments

So like D?

https://dlang.org/spec/attribute.html#nogc

bluebasket | karma 5 | avg karma 0.45 2020-02-04 18:43:40+00:00 | [–] similar comments

i thought go has a GOGC=off option or something? did they remove it?

wwarner | karma 2361 | avg karma 3.19 2020-02-04 19:20:44+00:00 | [–] similar comments

no it's there. https://golang.org/pkg/runtime/debug/#SetGCPercent

oconnor663 | karma 6967 | avg karma 4.71 2020-02-04 18:35:53+00:00 | [–] similar comments

There are two things you'd have to do at the same time that make this complicated:

- You'd have to ensure that your large data structure gets allocated entirely within the special region. That's simple enough if all you have is a big array, but it gets more complicated if you've got something like a map of strings. Each map cell and each string would need to get allocated in the special region, and all of the types involved would need new APIs to make that happen.

- You'd have to ensure that data structures in your special region never hold references to anything outside. Since the whole point of the region is that the GC doesn't scan it, nothing in the region will be able to keep anything outside the region alive. Any external references could easily become dangling pointers to freed memory, which is the sort of security vulnerability that GC itself was designed to prevent.

All of this is doable in theory, but it's sufficiently difficult, and it comes with sufficiently many downsides, that it makes more sense for a project with these performance needs to just use C or Rust or something.

zozbot234 | karma 19616 | avg karma 2.16 2020-02-04 19:16:27+00:00 | [–] similar comments

> Since the whole point of the region is that the GC doesn't scan it, nothing in the region will be able to keep anything outside the region alive.

You can treat external references as GC roots.

int_19h | karma 21203 | avg karma 1.69 2020-02-04 23:15:59+00:00 | [–] similar comments

How do you know that they exist, if you're not scanning that memory?

zozbot234 | karma 19616 | avg karma 2.16 2020-02-05 00:29:36+00:00 | [–] similar comments

The data structure code can take care of this by registering GC roots with the garbage collector (and de-registering them if an external reference changes). It's no different in principle than any other smart pointer.

tatersolid | karma 1166 | avg karma 1.35 2020-02-05 02:56:39+00:00 | [–] similar comments

> You can treat external references as GC roots.

Which brings you back to doing an expensive scan of the large off-heap data structure you were trying to avoid.

jimbo1qaz | karma 221 | avg karma 1.37 2020-02-04 16:39:42 | [–] similar comments

Both of these requirements kinda remind me of Microsoft's Verona language.

nanny | karma 428 | avg karma 2.73 2020-02-04 18:37:03+00:00 | [–] similar comments

>would it make sense

Probably, yeah. But the Golang team would never add such a feature because of their philosophy of keeping the language simple.

correct_horse | karma 466 | avg karma 3.38 2020-02-04 18:40:25+00:00 | [–] similar comments

I think the bigger problem with Go is a lack of GC options. Java, on the other end of the spectrum has multiple GC algorithms (i.e. the Z garbage collector, Shenandoah, Garbage-First/G1) each with tunables (max heap size, min heap size, for more see [1]). Java other issues, but it solves real business problems by having so many garbage collector tunables. Go's philosophy on the matter seems to be that the programmer shouldn't have to worry about such details (and GC tunables are hard to test). Which is great, until the programmer does have to worry about them.

[1] https://docs.oracle.com/javase/9/gctuning/garbage-first-garb...

geodel | karma 6768 | avg karma 2.18 2020-02-04 18:58:29+00:00 | [–] similar comments

> Java other issues, but it solves real business problems by having so many garbage collector tunables.

That real business problem is Java generates boat load of garbage so GC needs a lot more performance tuning to make application run normal.

ozten | karma 2165 | avg karma 3.3 2020-02-04 18:45:16+00:00 | [–] similar comments

Yes and no. You can get very clever by pre-allocating memory and ensuring it is never garbage collected, but at that point you're opening yourself up to new types of bugs and other performance issues as you try to scale your hack.

As you fight your language, you're GC avoidance system will become larger and larger. At some point you might re-evaluate your latency requirements, your architecture, and which are the right tools for the job.

sascha_sl | karma 1963 | avg karma 1.73 2020-02-04 18:55:12+00:00 | [–] similar comments

Go has a tool for this job.

https://golang.org/pkg/sync/#Pool

jhgg | karma 3220 | avg karma 6.78 2020-02-04 18:58:05+00:00 | [–] similar comments

checked in objects in a sync pool gets cleaned up on GC. It used to clean the whole pool, but now I think it does half each GC cycle. If you want to say "objects checked in should live here forever and not free themselves unless I want them to" sync pool is not the tool for the job.

sascha_sl | karma 1963 | avg karma 1.73 2020-02-05 12:11:42+00:00 | [–] similar comments

Well, it's a start. In fact the existing interface lends itself really well to a rolling window on-demand (de)allocator, especially with that New function you can supply.

Just pool could've mitigated your problem at least partially, is what I'm saying.

unlinked_dll | karma 2533 | avg karma 3.72 2020-02-04 18:46:46+00:00 | [–] similar comments

GC isn't "slow" insofar as it's non deterministic. Modern garbage collectors are extremely fast, in fact.

sorokod | karma 4501 | avg karma 1.96 2020-02-04 19:02:16+00:00 | [–] similar comments

Running every two minutes sounds pretty deterministic.

jhgg | karma 3220 | avg karma 6.78 2020-02-04 19:42:08 | [–] similar comments

It was perhaps too deterministic. What's not mentioned in the blog is that after running for long enough, the cluster would line up it's GCs, and each node would do the 2 minute GC at exactly the same time causing bigger spikes as the entire cluster would degrade. I'm guessing all it takes is a few day night cycles combined with a spike in traffic to make all the nodes reset their forced GC timers to the same time.

cellularmitosis | karma 2902 | avg karma 2.69 2020-02-04 20:08:56+00:00 | [–] similar comments

Interesting, sounds like 2 minutes + random fuzz might avoid the thundering herd. Might be worth submitting a patch to the golang team!

zys5945 | karma 19 | avg karma 1.73 2020-02-04 18:55:43+00:00 | [–] similar comments

That would imply a drastic change to the language design. Essentially you are asking for 2 code generators (one for code managed by the go runtime and one managed by the programmer). It might be possible but it's most likely not gonna happen.

carllerche | karma 1590 | avg karma 14.45 2020-02-04 18:32:44+00:00 | [–] similar comments

Tokio author here (mentioned in blog post). It is really great to see these success stories.

I also think it is great that Discord is using the right tool for the job. It isn't often that you need the performance gains that Rust & Tokio so pick what works best to get the job done and iterate.

joseluisq | karma 1349 | avg karma 5.37 2020-02-04 18:35:29+00:00 | [–] similar comments

Basically because of:

> Rust is blazingly fast and memory-efficient: with no runtime or garbage collector, it can power performance-critical services, run on embedded devices, and easily integrate with other languages.

Polyisoprene | karma 101 | avg karma 1.44 2020-02-04 22:53:55+00:00 | [–] similar comments

No offense to Tokio and Rust, I really like Rust, but having someone rewriting their app because of performance limitations in their previous language choice, isn’t really someone picking the right tool for the job necessary.

I’m not so sure they would have done the rewrite if the Go GC was performing better, and the choice of Rust seems primarily based on prior experience at the company writing performance sensitive code rather than delivering business value.

acheron9383 | karma 129 | avg karma 2.22 2020-02-04 23:27:50 | [–] similar comments

Right tool for the job should also take into account the experience of the devs you have at your disposal. For an omniscient Dev, is Rust the best tool for the job? Unsure. But for them with already significant rust experience? Sounds like it.

qaq | karma 6423 | avg karma 2.08 2020-02-05 00:04:30+00:00 | [–] similar comments

too much focus on "business value" often ends-up with codebase in a state that makes delivery of that business value pretty impossible. Boeing was delivering a lot of business value with MAX ...

sneezetheory | karma -1 | avg karma -1.0 2020-02-05 11:41:22+00:00 | [–] similar comments

THIS !!! this is so underrated

say_it_as_it_is | karma 1888 | avg karma 2.11 2020-02-05 00:06:52+00:00 | [–] similar comments

Correct. They wouldn't have considered Rust if the GC was performing better. They also wouldn't have even adopted Go if Elixir was sufficient. This team seems to have an incredible talent pool who is willing to push further for the sake of, as you say, delivering business value. Improving UX, investing in capabilities for growth, are valid business reasons why they're iterating over so many solutions. It's really impressive to see what they're accomplishing.

_--___-___ | karma 11 | avg karma 1.0 2020-02-04 18:37:36+00:00 | [–] similar comments

"We want to make sure Discord feels super snappy all the time" is hilarious coming from a program that is infamous for making you read 'quirky' loading lines while a basic chat application takes several seconds to start up.

Don't really know about Go versus Rust for this purpose, but don't really care because read states (like nearly everything that makes Discord less like IRC) is an anti-feature in any remotely busy server. Anything important enough that it shouldn't be missed can be pinned, and it encourages people to derail conversations by replying out of context to things posted hours or days ago.

anchpop | karma 2639 | avg karma 3.11 2020-02-04 18:40:13+00:00 | [–] similar comments

I don't see why that's hilarious. Lots of programs take a second or two to load and it only happens once on boot for me. "Read states" is just discord telling you which channels and servers you have unread messages in

wvenable | karma 19014 | avg karma 3.37 2020-02-04 19:09:42 | [–] similar comments

Discord takes longer to start up than Microsoft Word.

Desktop development is a total wasteland these days -- there isn't nearly as much effort put into optimization as server side. They're not paying for your local compute, so they can waste as much of it as they want.

penagwin | karma 2802 | avg karma 2.49 2020-02-04 19:45:08 | [–] similar comments

I feel that it's not really fair to expect them to natively implement their app on every platform and put tons of resources into it's client performance - anecdotally discord is a very responsive app - see [0].

But think of it this way, all the effort they put into their desktop app works on all major OSes without a problem. They even get to reuse most of the code for access from the browser, with no installation required.

Now imagine approaching your PM and saying "Look I know we put X effort into making our application work on all the platforms, but it would be even faster if we instead did 4x effort for native implementations + the browser".

[0] From what I've seen in the "gamer community" is that most gamers don't care that much about that kind of extra performance. Discord itself doesn't feel slow once it's started. Joining a voice channel is instant, quickly switching to a different server and then to a chat channel to throw in a some text is fast and seamless (Looking at you MS Teams!!!).

Sure Mumble/Teamspeak are native and faster, but where are their first party mobile apps and web clients? One of the incredible things Discord did to really aid in adoption was allow for web clients, so when you found some random person on the internet, you didn't have to tell them to download some chat client, they could try it through their browser first.

tl;dr

Yes electron apps can be slow, but discord IMO has fine client side performance, and they clearly do put resources into optimizing it. Yes it "could be faster" with native desktop apps, but their target community seems perfectly content as is.

jhgg | karma 3220 | avg karma 6.78 2020-02-04 21:06:56+00:00 | [–] similar comments

A lot of the startup cost right now is in our really ancient update checker. There are plans to rewrite all of this, now that we understand why it's bad, and have some solid ideas as to what we can do better.

I do think it's reasonable to get the startup time of Discord to be near what VS Code's startup times are. If we remove the updater, it actually starts pretty fast (chromium boot -> JS loaded from cache) is <1s on a modern PC. And there's so much more we can do from there, for example, loading smaller chunks in the critical "load and get to your messages path" - being able to use v8 heap snapshots to speed up starting the app, etc...

The slow startup time is very much an us problem, and not an electron problem and is something I hope we'll be able to address this year.

penagwin | karma 2802 | avg karma 2.49 2020-02-04 22:37:09+00:00 | [–] similar comments

When you guys do address it could I pretty please request you do a blog article about it?

Electron startup time as well as v8 snapshots have been a hot topic for a looooong time. I actually started a pull request for it in 2015 [0]. My pull request was focusing on source code protection, but ANY information on how you use v8 snapshots, etc. would be awesome!

[0] https://github.com/electron/electron/issues/3041

dancemethis1 | karma -7 | avg karma -0.5 2020-02-05 12:26:11+00:00 | [–] similar comments

The thing that's really bad is that it's proprietary.

graphememes | karma 548 | avg karma 1.66 2020-02-04 19:47:00+00:00 | [–] similar comments

Microsoft Word isn't patching the application on startup. That's the difference.

Once it's loaded, how much slower than Word is it?

wvenable | karma 19014 | avg karma 3.37 2020-02-04 21:26:25+00:00 | [–] similar comments

You're telling me Discord is patching itself on every single launch and this somehow a valid excuse for slow startup performance?

Almost every single app I run auto-updates itself in some form.

graphememes | karma 548 | avg karma 1.66 2020-02-04 22:20:44 | [–] similar comments

In the case of Discord, yes. That's a valid argument, whether or not it's truly important, I'm not sure. It certainly is a waste of time to invest improving when their current system works perfectly fine.

wvenable | karma 19014 | avg karma 3.37 2020-02-04 22:36:42+00:00 | [–] similar comments

They're investing in server-side project that are also perfectly fine. In this case, re-writing an entire module in a different language to eek out a tiny bit more performance!

But on the client side, it's arguably the slowest to launch application I have installed even among other Electron apps. Perfectly fine.

This completely re-enforces my original statement: "Desktop development is a total wasteland these days -- there isn't nearly as much effort put into optimization as server side" Desktop having horrible startup performance is "fine" but a little GC jitter on the server requires a complete re-write from the ground up.

jhgg | karma 3220 | avg karma 6.78 2020-02-05 00:41:57+00:00 | [–] similar comments

I think this is a statement that is ignorant to our development efforts, and how our team is staffed, and what our objectives are.

First and foremost, we do care deeply about desktop performance. We shipped this week a complete rewrite of our messages components, that come with a boatload of performance optimizations, in addition to a new design. We spent a lot of time to do that rewrite in addition to applying new styles, because given what we know now (and what's state of the art in React world), we can write the code better than we did 3+ years ago. In terms of total engineering time spent, the rewrite of messages actually took much longer than the rewrite of this service from go to rust.

That being said, the desktop app does load much slower than we'd like (and honestly than I'd like personally.) I commented in another thread why that is. That being said, the person who is writing backend data-services, is not the one who's going to be fixing the slow boot times (our native platform team). These are efforts that happen independently.

As for our motivations for using rust, I think saying that "a little GC jitter on the server requiring a complete rewrite" is one of many reasons we wanted to give rust a shot. We have some code that we know works in a Golang. We want to investigate the viability of Rust to figure out how it'd look like to write a data service in rust. We have this service that is rather trivial, and has some GC jitters (that we've been fine with for a year.) So, an engineer (the author of this blog post) spends some time last year to see what it'd look like to write an equivalent service in rust, how it'd perform, how easy it'd be, and what the general state of the ecosystem is like in practice.

I think it's easy to forget that a lot of work we do as engineers isn't all about what's 100% practical, but also about learning new things in order to explore new viable technologies. In this case, this project had a very clear scope and set of requirements (literally rewrite this thing that we know works), and a very well defined set of success criteria (should perform as-good or better, see if a lack of GC will improve latencies, get a working understanding of the state of the ecosystem and how difficult it would be to write future data services in rust vs go.) Given the findings in our rewrite of this service, running it in production, and now using features that have stabilized in rust, we're confident in saying that "in places where we would have used golang, we consider rust viable, and here's why, given our exercise in rewriting something from go to rust."

penagwin | karma 2802 | avg karma 2.49 2020-02-04 18:46:33+00:00 | [–] similar comments

To be fair it really doesn't take that long, and often it's because it's auto updating, but it's not more then a couple seconds.

The big thing IMO is that once started I normally leave discord running, and most actions within discord itself feel very snappy - E.g. You click on a voice channel and you're instantly there. I think that's what they mean, they're trying to keep the delay for such an action low. Sometimes you click a voice Chanel and there's a few seconds of delay, those for some reason more annoying then the long (ish) startup time

StreamBright | karma 4217 | avg karma 1.31 2020-02-04 18:37:42+00:00 | [–] similar comments

Pretty amazing write up from Jesse. I really like how they maxed out Go first before even thinking about a rewrite in Rust. It turns out no-GC has pretty significant advantages in some cases.

Rapzid | karma 5572 | avg karma 1.75 2020-02-05 04:32:43+00:00 | [–] similar comments

Unsafe or it doesn't count ;)

correct_horse | karma 466 | avg karma 3.38 2020-02-04 18:43:22 | [–] similar comments

I've heard lots of hot takes on "what Go really is". Here's mine.

Go is what would have happened if Bell Labs wrote Java.

kick | karma 11443 | avg karma 5.07 2020-02-04 18:54:37 | [–] similar comments

Minor nitpick: That already happened, Limbo is what happened when Bell Labs wrote Java.

monocasa | karma 27236 | avg karma 2.94 2020-02-04 19:52:39+00:00 | [–] similar comments

And go is very very derived from plan 9. It could be considered a sibling of limbo in a lot of ways.

anthk | karma 3504 | avg karma 0.6 2020-02-04 20:00:59+00:00 | [–] similar comments

More like Limbo and Inferno.

kick | karma 11443 | avg karma 5.07 2020-02-04 20:17:39 | [–] similar comments

Limbo doesn't only run on Inferno; anything with the Dis VM will work.

correct_horse | karma 466 | avg karma 3.38 2020-02-05 00:51:20 | [–] similar comments

Huh. I managed to hear about Inferno, but not remember the Limbo part.

In that case, Go is Bell Labs' second attempt at Java.

kick | karma 11443 | avg karma 5.07 2020-02-05 03:19:07 | [–] similar comments

Third, there was also a language I can't remember the name of that happened at the same time as Alef.

steveklabnik | karma 91260 | avg karma 5.08 2020-02-05 04:22:31 | [–] similar comments

Newsqueak?

kick | karma 11443 | avg karma 5.07 2020-02-05 20:56:32 | [–] similar comments

That may have been what I was thinking of (or Squeak, for that matter, if my sense of time was off), but I'm not sure!

_ph_ | karma 9798 | avg karma 2.44 2020-02-04 20:23:22+00:00 | [–] similar comments

Interesting comment, as 2 of the main Go creators (Ken Thompson and Rob Pike) did work at the Bell Labs. So while I doubt they tried to write Java, Go in a sense was written by the Bell Labs :).

(And Kernighan was their floor-mate too, that must have been a stunningly great environment)

correct_horse | karma 466 | avg karma 3.38 2020-02-05 00:35:34+00:00 | [–] similar comments

I was taking that into consideration when I authored my comment!

cdelsolar | karma 1453 | avg karma 1.64 2020-02-04 21:18:03 | [–] similar comments

OK boomer

dang | karma 18142 | avg karma 0.25 2020-02-05 04:09:53+00:00 | [–] similar comments

Please don't do this here.

cdelsolar | karma 1453 | avg karma 1.64 2020-02-14 16:45:42 | [–] similar comments

ok

flavio81 | karma 2332 | avg karma 2.08 2020-02-05 02:54:56+00:00 | [–] similar comments

>Go is what would have happened if Bell Labs wrote Java

And Unix is what happened when Bell Labs wrote an operating system -- something that was born outdated from the start.

Just like Golang.

makapuf | karma 2574 | avg karma 2.33 2020-02-05 05:27:06 | [–] similar comments

Yet simple enough to be understood and its features actually used, widespread and that stood the test of time.

buboard | karma 6145 | avg karma 1.29 2020-02-04 18:44:35 | [–] similar comments

maybe next year : why discord is switching to C

flafla2 | karma 744 | avg karma 4.89 2020-02-04 18:45:32+00:00 | [–] similar comments

> After digging through the Go source code, we learned that Go will force a garbage collection run every 2 minutes at minimum. In other words, if garbage collection has not run for 2 minutes, regardless of heap growth, go will still force a garbage collection.

> We figured we could tune the garbage collector to happen more often in order to prevent large spikes, so we implemented an endpoint on the service to change the garbage collector GC Percent on the fly. Unfortunately, no matter how we configured the GC percent nothing changed. How could that be? It turns out, it was because we were not allocating memory quickly enough for it to force garbage collection to happen more often.

As someone not too familiar with GC design, this seems like an absurd hack. That this 2-minute hardcoded limitation is not even configurable comes across as amateurish even. I have no experience with Go -- do people simply live with this and not talk about it?

JBReefer | karma 2559 | avg karma 3.13 2020-02-04 18:50:54 | [–] similar comments

Go always feels like an amateur language to me, I’ve given up on it. This feels right in line - similar to the hardcode GitHub magic.

reificator | karma 6682 | avg karma 3.76 2020-02-04 19:07:46 | [–] similar comments

I could be wrong, but I don't believe there is "hardcoded[d] GitHub magic".

IIRC I have used GitLab and Bitbucket and self-hosted Gitea instances the same exact way, and I'm fairly sure there was an hg repo in one of those. Don't recall doing anything out of the ordinary compared to how I would use a github URL.

heinrich5991 | karma 1879 | avg karma 3.46 2020-02-04 19:27:48+00:00 | [–] similar comments

There are a couple of hosting services hardcoded in Go. I believe it was about splitting the URL into the actual URL and the branch name.

Nullabillity | karma 4768 | avg karma 1.8 2020-02-04 19:58:40+00:00 | [–] similar comments

https://github.com/golang/go/blob/e6ebbe0d20fe877b111cf4ccf8...

Ouch, Go never ceases to amaze. The Bitbucket case[0] is even more crazy, calling out to the Bitbucket API to figure out which VCS to use. It has a special case for private repositories, but seems to hard-code cloning over HTTPS.

If only we had some kind of universal way to identify resources, that told you how to access it...

[0]: https://github.com/golang/go/blob/e6ebbe0d20fe877b111cf4ccf8...

reificator | karma 6682 | avg karma 3.76 2020-02-04 23:04:37+00:00 | [–] similar comments

Thanks for the reference to prove me wrong.

Wow, that's sad. I'm glad it works seamlessly, don't get me wrong, but I was assuming I could chalk it up to defacto standards between the various vendors here.

ptrincr | karma 359 | avg karma 3.63 2020-02-04 18:52:01 | [–] similar comments

You are able to disable GC with:

  GOGC=off

As someone mentions below.

More details here: https://golang.org/pkg/runtime/

singron | karma 1870 | avg karma 3.26 2020-02-04 19:13:39 | [–] similar comments

Keeping GC off for a long running service might become problematic. Also, the steady state might have few allocations, but startup may produce a lot of garbage that you might want to evict. I've never done this, but you can also turn GC off at runtime with SetGCPercent(-1).

I think with that, you could turn off GC after startup, then turn it back on at desired intervals (e.g. once an hour or after X cache misses).

It's definitely risky though. E.g. if there is a hiccup with the database backend, the client library might suddenly produce more garbage than normal, and all instances might OOM near the same time. When they all restart with cold caches, they might hammer the database again and cause the issue to repeat.

ignoramous | karma 17311 | avg karma 3.91 2020-02-04 19:57:17+00:00 | [–] similar comments

> ...all instances might OOM near the same time.

CloudFront, for this reason, allocates heterogeneous fleets in its PoPs which have diff RAM sizes and CPUs [0], and even different software versions [1].

> When they all restart with cold caches, they might hammer the database again and cause the issue to repeat.

Reminds me of the DynamoDB outage of 2015 that essentially took out us-east-1 [2]. Also, ELB had a similar outage due to unending backlog of work [3].

Someone must write a book on design patterns for distributed system outages or something?

[0] https://youtube.com/watch?v=pq6_Bd24Jsw&t=50m40s

[1] https://youtube.com/watch?v=n8qQGLJeUYAt=39m0s

[2] https://aws.amazon.com/message/5467D2/

[3] https://aws.amazon.com/message/67457/

singron | karma 1870 | avg karma 3.26 2020-02-05 23:01:49+00:00 | [–] similar comments

Google's SRE book covers some of this (if you aren't cheekily referring to that). E.g. chapters 21 and 22 are "Handling Overload" and "Addressing Cascading Failures". The SRE book also covers mitigation by operators (e.g. manually setting traffic to 0 at load balancer and ramping back up, manually increasing capacity), but it also talks about engineering the service in the first place.

This is definitely a familiar problem if you rely on caches for throughput (I think caches are most often introduced for latency, but eventually the service is rescaled to traffic and unintentionally needs the cache for throughput). You can e.g. pre-warm caches before accepting requests or load-shed. Load-shedding is really good and more general than pre-warming, so it's probably a great idea to deploy throughout the service anyway. You can also load-shed on the client, so servers don't even have to accept, shed, then close a bunch of connections.

The more general pattern to load-shedding is to make sure you handle a subset of the requests well instead of degrading all requests equally. E.g. processing incoming requests FIFO means that as queue sizes grow, all requests become slower. Using LIFO will allow some requests to be just as fast and the rest will timeout.

ignoramous | karma 17311 | avg karma 3.91 2020-02-09 14:46:50 | [–] similar comments

Your comment reminds me of this excellent ACM article by Facebook on the topic: https://queue.acm.org/detail.cfm?id=2839461

I've read the first SRE book but having worked on large-scale systems it is impossible to relate to the book or internalise the advice/process outlined in it unless you've been burned by scale.

I must note that there are two Google SRE books in-circulation, now: https://landing.google.com/sre/books/

marrs | karma 730 | avg karma 1.51 2020-02-04 19:26:18+00:00 | [–] similar comments

How does Go allow you to manage memory manually? Malloc/free or something more sophisticated?

masklinn | karma 65147 | avg karma 3.36 2020-02-04 20:18:14 | [–] similar comments

It doesn't. If you disable the GC… you only have an allocator, the only "free" is to run the entire GC by hand (calling runtime.GC())

tsimionescu | karma 17553 | avg karma 2.17 2020-02-04 20:30:47+00:00 | [–] similar comments

It doesn't. You could start and stop the GC occasionally, maybe?

lostcolony | karma 9464 | avg karma 3.54 2020-02-04 23:46:59 | [–] similar comments

So other comments didn't mention this, per se, but Go gives you tools to see what memory escapes the stack and ends up being heap allocated. If you work to ensure things stay stack allocated, it gets freed when the stack frees, and the GC never touches it.

But, per other comments, there isn't any direct malloc/free behavior. It just provides tools to help you enable the compiler to determine that GC is not needed for some.

_bxg1 | karma 112 | avg karma 0.03 2020-02-04 18:56:40+00:00 | [–] similar comments

It does sound like Discord's case was fairly extraordinary in terms of the degree of the spike:

> We kept digging and learned the spikes were huge not because of a massive amount of ready-to-free memory, but because the garbage collector needed to scan the entire LRU cache in order to determine if the memory was truly free from references.

So maybe this is one of those things that just doesn't come up in most cases? Maybe most services also generate enough garbage that that 2-minute maximum doesn't really come into play?

spullara | karma 9487 | avg karma 3.26 2020-02-04 19:06:16 | [–] similar comments

Heap caches that keep things longer than a GC cycle are terrible under GC unless you have a collector in the new style like ZGC, Azul or Shenandoah.

sitkack | karma 14538 | avg karma 1.82 2020-02-04 19:16:25+00:00 | [–] similar comments

Systems with poor GC and the need to keep data for lifetimes greater than a request should have an easy to use off heap mechanism to prevent these problems.

Often something like Redis is used as a shared cache that is invisible to the garbage collector, there is a natural key with a weak reference (by name) into a KV store. One could embed a KV store into an application that the GC can't scan into.

spullara | karma 9487 | avg karma 3.26 2020-02-04 19:24:28 | [–] similar comments

100%. In Java, you would often use OpenHFT's ChronicleMap for now and hopefully inline classes/records in Java 16 or so.

cgh | karma 3034 | avg karma 2.81 2020-02-04 19:59:44+00:00 | [–] similar comments

Ehcache has an efficient off-heap store: https://github.com/Terracotta-OSS/offheap-store/

Doesn't Go have something like this available? It's an obvious thing for garbage-collected languages to have.

mappu | karma 5142 | avg karma 3.21 2020-02-04 23:27:29 | [–] similar comments

You can usually resort to `import C` and using `C.malloc` to get an unsafe.Pointer.

mappu | karma 5142 | avg karma 3.21 2020-02-04 23:26:19 | [–] similar comments

What feature of "the new style" makes them more suitable in this case?

spullara | karma 9487 | avg karma 3.26 2020-02-05 19:20:31+00:00 | [–] similar comments

They have very short pause times even for very large heaps with lots of objects in them as they don't have to crawl the entire live tree when collecting.

wongarsu | karma 24397 | avg karma 4.14 2020-02-04 19:16:09+00:00 | [–] similar comments

Games written in the Unity engine are (predominately) written in C#, a garbage collected language. Keeping large amounts of data around isn't that unusual since reading from disk is often prohibitively slow, and it's normal to minimize memory allocation/garbage generation (using object pools, caches etc), and manually trigger the GC in loading screens and in other opportune places (as easy as calling System.GC.Collect()). At 60 fps each frame is about 16ms. You do a lot in those 16ms, adding a 4ms garbage collection easily leads to dropping a frame. Of course whether that matters depends on the game, but Unity and C# seem to handle it well for the games that need tiny or no GC pauses.

But (virtually) nobody is writing games in Go, so it's entirely possible that it's an unusual case in the Go ecosystem. Being an unsupported usecase is a great reason to switch language.

_bxg1 | karma 112 | avg karma 0.03 2020-02-04 19:19:23+00:00 | [–] similar comments

Right; Go is purpose-built for writing web services, and web services tend to be pretty tolerant of (small) latency spikes because virtually anyone who's calling one is already expecting at least some latency

latch | karma 13550 | avg karma 5.73 2020-02-05 02:04:50 | [–] similar comments

> Go is purpose-built for writing web services

Is this true? Go was built specifically for C++ developers, which, even when Go was first release, was a pretty unpopular language for writing web services (though maybe not at Google?). That a non-trivial number of Ruby/Python/Node developers switched was unexpected. (1)

https://commandcenter.blogspot.com/2012/06/less-is-exponenti...

blondin | karma 1244 | avg karma 1.43 2020-02-05 02:44:23+00:00 | [–] similar comments

your quote is just corroborating what the reply is saying. go was written for web services which were written in c++ at google.

latch | karma 13550 | avg karma 5.73 2020-02-05 06:34:18+00:00 | [–] similar comments

The linked article doesn't say anything about web services. Just C++. I believe Rob Pike was working on GFS and log management, and Go was always initially pitched at system programming (which is not web services).

> Our target community was ourselves, of course, and the broader systems-programming community inside Google. (1)

(1) http://www.informit.com/articles/article.aspx?p=1623555

erk__ | karma 2323 | avg karma 3.24 2020-02-04 20:35:07+00:00 | [–] similar comments

C# uses a generational gc iirc so it may be better suited for a system where you have a relativly stable collection that does not need to be fully garbage collected all the time and have a smaller and more volitile set of objects that will be gc'ed more often. I don't think the current garbage collector in go does anything similar to that.

danbolt | karma 2777 | avg karma 2.58 2020-02-04 20:58:16+00:00 | [–] similar comments

This might have changed with more recent updates, but I was under the impression that the Mono garbage collector in Unity was a bit dated and not as up-to-date as a C# one today.

martindevans | karma 175 | avg karma 3.07 2020-02-04 22:56:58+00:00 | [–] similar comments

Unity has recently added the "incremental GC" [1] which spreads the work of the GC over multiple frames. As I understand it this has a lower overall throughput, but _much_ better worst case latency.

[1] https://blogs.unity3d.com/2018/11/26/feature-preview-increme...

to11mtm | karma 1108 | avg karma 1.59 2020-02-08 02:04:54+00:00 | [–] similar comments

Yeah, that's the ideal pattern in C#. You have to be smart-ish about it, but writing low GC pressure code can be easier than you think. Keep your call stacks shallow, avoid certain language constructs (i.e. LINQ) or at least know when they really make sense for the cost (async).

IDK if this is true for earlier versions, but as of today C# has pretty clear rules: 16MB in desktop or 64MB in server (which type is used can be set via config) will trigger a full GC [1]. Note that less than that may trigger a lower level GC, but those are usually not the ones that are noticed. I'm guessing at least some of that is because of memory locality as well as the small sizes.

On the other hand, in a lot of the Unity related C# posts I see on forums/etc, passing structs around is considered the 'performant' way to do things to minimize GC pressure.

[1] https://docs.microsoft.com/en-us/dotnet/standard/garbage-col... [2] https://blog.golang.org/ismmkeynote

mindajar | karma 70 | avg karma 2.19 2020-02-05 01:31:50+00:00 | [–] similar comments

If there's an example of getting great game performance with a GC language, Unity isn't it. Lots of Unity games get stuttery, and even when they don't, they seem to use a lot of RAM relative to game complexity. Kerbal Space Program even mentioned in their release notes at one point something about a new garbage collector helping with frame rate stuttering.

I started up KSP just now, and it was at 5.57GB before I even got to the main menu. To be fair, I hadn't launched it recently, so it was installing its updates or whatever. Ok, I launched it again, and at the main menu it's sitting on 5.46GB. (This is on a Mac.) At Mission Control, I'm not even playing the game yet, and the process is using 6.3GB.

I think a better takeaway is that you can get away with GC even in games now, because it sucks and is inefficient but it's ... good enough. We're all conditioned to put up with inefficient software everywhere, so it doesn't even hurt that much anymore when it totally sucks.

loeg | karma 21759 | avg karma 2.57 2020-02-04 19:40:07 | [–] similar comments

A GC scan of a large LRU (or any large object graph) is expensive in CPU terms because the many of the pointers traversed will not be in any CPU cache. Memory access latency is extremely high relative to how fast CPUs can process cached data.

You could maybe hack around the GC performance without destroying the aims of LRU eviction by batching additions to your LRU data structure to reduce the number of pointers by a factor of N. It's also possible that a Go BTree indexed by timestamp, with embedded data, would provide acceptable LRU performance and would be much friendlier on the cache. But it might also not have acceptable performance. And Go's lack of generic datastructures makes this trickier to implement vs Rust's BtreeMap provided out of the box.

jerf | karma 85298 | avg karma 5.28 2020-02-04 19:41:57+00:00 | [–] similar comments

Yes, this is a maximally pessimal case for most forms of garbage collection. They don't say, but I would imagine these are very RAM-heavy systems. You can get up to 768GB right now on EC2. Fill that entire thing up with little tiny objects the size of usernames or IDs for users, or even merely 128GB systems or something, and the phase where you crawl the RAM to check references by necessity is going to be slow.

This is something important to know before choosing a GC-based language for a task like this. I don't think "generating more garbage" would help, the problem is the scan is slow.

If Discord was forced to do this in pure Go, there is a solution, which is basically to allocate a []byte or a set of []bytes, and then treat it as expanse of memory yourself, managing hashing, etc., basically, doing manual arena allocation yourself. GC would drop to basically zero in that case because the GC would only see the []byte slices, not all the contents as individual objects. You'll see this technique used in GC'd languages, including Java.

But it's tricky code. At that point you've shucked off all the conveniences and features of modern languages and in terms of memory safety within the context of the byte expanses, you're writing in assembler. (You can't escape those arrays, which is still nice, but hardly the only possible issue.)

Which is, of course, where Rust comes in. The tricky code you'd be writing in Go/Java/other GC'd language with tons of tricky bugs, you end up writing with compiler support and built-in static checking in Rust.

I would imagine the Discord team evaluated the option of just grabbing some byte arrays and going to town, but it's fairly scary code to write. There are just too many ways to even describe for such code to end up having a 0.00001% bug that will result in something like the entire data structure getting intermittently trashed every six days on average or something, virtually impossible to pick up in testing and possibly even escaping canary deploys.

Probably some other languages have libraries that could support this use case. I know Go doesn't ship with one and at first guess, I wouldn't expect to find one for Go, or one I would expect to stand up at this scale. Besides, honestly, at feature-set maturity limit for such a library, you just end up with "a non-GC'd inner platform" for your GC'd language, and may well be better off getting a real non-GC'd platform that isn't an inner platform [1]. I've learned to really hate inner platforms.

By contrast... I'd bet this is fairly "boring" Rust code, and way, way less scary to deploy.

[1]: https://en.wikipedia.org/wiki/Inner-platform_effect

_bxg1 | karma 112 | avg karma 0.03 2020-02-04 13:51:21 | [–] similar comments

> I don't think "generating more garbage" would help

To be clear: I wasn't suggesting that generating garbage would help anyone. Only that in a more typical case, where more garbage is being generated, the two minute interval itself might never surface as the problem because other things are getting in front of it.

ericflo | karma 6355 | avg karma 10.28 2020-02-04 18:57:19 | [–] similar comments

It comes from a desire to run in the exact opposite direction as the JVM, which has options for every conceivable parameter. Go has gone through a lot of effort to keep the number of configurable GC parameters to 1.

erik_seaberg | karma 4377 | avg karma 1.56 2020-02-04 18:59:20+00:00 | [–] similar comments

Anyone who pushes the limits of a machine needs tuning options. If you can't turn knobs you have to keep rewriting code until you happen to get the same effect.

ericflo | karma 6355 | avg karma 10.28 2020-02-04 19:05:15 | [–] similar comments

There's definitely a happy medium. One setting may indeed be too few, but JVM's many options ends in mass cargo-cult copypasta, often leading to really bad configurations.

Polyisoprene | karma 101 | avg karma 1.44 2020-02-04 23:05:41+00:00 | [–] similar comments

Haven’t really seen anyone trying to use JVM options to get performance benefits without benchmarks for their specific use case the last 10 years or so.

tomc1985 | karma 11141 | avg karma 2.76 2020-02-04 19:15:15 | [–] similar comments

YeAh BuT wE dOnT wAnT tO hAvE tO tEsT eVeRy oPtIoN!!!

- lazy devs and product managers, everywhere

hombre_fatal | karma 18663 | avg karma 3.79 2020-02-04 20:24:55+00:00 | [–] similar comments

This was the first time I've seen that annoying cAsE meme on HN and I pray it's the last. It is a lazy way to make your point, hoping your meme-case does all the work for you so that you don't have to say anything substantial.

Or do you think it adds to the discussion?

tomc1985 | karma 11141 | avg karma 2.76 2020-02-05 01:23:12+00:00 | [–] similar comments

It indicates a mocking, over-the-top tone to indicate the high level of contempt I have for my originally-stated paraphrase (and the people who have caused software dev decisionmaking to be that way). So yes, I think it does add to the discussion.

boring_twenties | karma 2309 | avg karma 1.5 2020-02-05 04:33:05+00:00 | [–] similar comments

It's annoying to read, so while it does get across the mocking tone, the reaction of annoyance at the author is far stronger.

tannhaeuser | karma 10947 | avg karma 3.1 2020-02-04 19:20:37+00:00 | [–] similar comments

That might be true, but from a language design PoV it isn't convincing to have dozens of GC-related runtime flags a la Java/JVM. If you need those anyway, this might point to pretty fundamental language expressivity issues.

kllrnohj | karma 11052 | avg karma 3.34 2020-02-04 20:11:10 | [–] similar comments

Tuning options don't work well with diverse libraries, though. If you use 2 libraries and they both are designed to run with radically different tuning options what do you do? Some bad compromise? Make one the winner and one the loser? The best you can do is do an extensive monitoring & tuning experiment, but that's quite involved as well and still won't get you the maximum performance of each library, either.

At least with code hacking around the GC's behavior that code ends up being portable across the ecosystem.

There doesn't seem to really be a good option here either way. This amount of tuning-by-brute-force (either by knobs or by code re-writes) seems to just be the cost of using a GC.

Scarbutt | karma 1633 | avg karma 0.85 2020-02-04 19:04:41+00:00 | [–] similar comments

Surely tuning some GC parameters is less effort that having to do a rewrite in another language.

dickeytk | karma 1273 | avg karma 3.26 2020-02-04 19:02:45+00:00 | [–] similar comments

I think for most applications (especially the common use-case of migrating a scripting web monolith to a go service), people just aren't hitting performance issues with GC. Discord being a notable exception.

If these issues were more common, there would be more configuration available.

[EDIT] to downvoters: I'm not saying it's not an issue worth addressing (and it may have already been since they were on 1.9), I was just answering the question of "why this might happen"

biomcgary | karma 2289 | avg karma 3.76 2020-02-04 19:08:14 | [–] similar comments

Or, in the case of latency, just wait a few months because the Go team obsesses about latency (no surprise from a Google supported language). Discord's comparison is using Go1.9. Their problem may well have been addressed in Go1.12. See https://golang.org/doc/go1.12#runtime.

Nyra | karma 59 | avg karma 4.21 2020-02-04 19:03:26 | [–] similar comments

Funnily enough, something similar happened at Twitch regarding their API front end written in Go: https://blog.twitch.tv/en/2019/04/10/go-memory-ballast-how-i...

mrpotato | karma 307 | avg karma 3.45 2020-02-04 19:19:33+00:00 | [–] similar comments

Interesting, they went a totally different route.

> The ballast in our application is a large allocation of memory that provides stability to the heap.

> As noted earlier, the GC will trigger every time the heap size doubles. The heap size is the total size of allocations on the heap. Therefore, if a ballast of 10 GiB is allocated, the next GC will only trigger when the heap size grows to 20 GiB. At that point, there will be roughly 10 GiB of ballast + 10 GiB of other allocations.

firethief | karma 1941 | avg karma 2.26 2020-02-04 20:10:24+00:00 | [–] similar comments

Wow, that puts Discord's "absurd hack" into perspective! I feel like the moral here is a corollary to that law where people will depend on any observable behavior of the implementation: people will use any available means to tune important performance parameters; so you might as well expose an API directly, because doing so actually results in less dependence on your implementation details than if people resort to ceremonial magic.

tick_tock_tick | karma 3482 | avg karma 2.72 2020-02-04 20:29:05 | [–] similar comments

I mean if you read Twitch's hack they intentionally did it in code so they didn't need to tune the GC parameter. They wanted to avoid all environment config.

firethief | karma 1941 | avg karma 2.26 2020-02-04 20:37:08+00:00 | [–] similar comments

I missed that part. I thought they would use a parameter if it were available, because they said this:

> For those interested, there is a proposal to add a target heap size flag to the GC which will hopefully make its way into the Go runtime soon.

What's wrong with the existing parameter?

I'm sure they aren't going this far to avoid all environment config without a good reason, but any good reason would be a flaw in some part of their stack.

robocat | karma 11778 | avg karma 2.08 2020-02-04 20:47:53+00:00 | [–] similar comments

summary: Go 1.5, memory usage (heap) of 500MB on a VM with 64GiB of physical memory, with 30% of CPU cycles spent in function calls related to GC, and unacceptable problems during traffic spikes. Optimisation hack to somewhat fix problem was to allocate 10GiB, but not using the allocation at all, which caused a beneficial change in the GC behaviour!

nickserv | karma 1302 | avg karma 2.94 2020-02-04 19:05:26 | [–] similar comments

This is in line with Go's philosophy, they try to keep the language as simple as possible.

Sometimes it means an easy thing in most other languages is difficult or tiresome to do in Go. Sometimes it means hard-coded values/decisions you can't change (only tabs anyone?).

But overall this makes for a language that's very easy to learn, where code from project to project and team to team is very similar and quick to understand.

Like anything, it all depends on your needs. We've found it suits ours quite well, and migrating from a Ruby code base has been a breath of fresh air for the team. But we don't have the same performance requirements as Discord.

andai | karma 8456 | avg karma 2.45 2020-02-04 19:11:15+00:00 | [–] similar comments

Offtopic but what are you missing when you have to use tabs instead of spaces? I can understand different indentation preferences but I can change the indentation width per tab in my editor. And then everyone can read the code with the indentation they prefer, while the file stays the same.

nickserv | karma 1302 | avg karma 2.94 2020-02-04 19:15:47+00:00 | [–] similar comments

It's just an example of something that the Go team took a decision on, and won't allow you to change. I mean, even Python lets you choose. I don't really have a problem with it however, even if I do prefer spaces.

giancarlostoro | karma 13964 | avg karma 2.4 2020-02-04 19:23:27+00:00 | [–] similar comments

Python chose spaces a la PEP8 by the way.

dragonwriter | karma 118260 | avg karma 2.17 2020-02-04 19:26:04+00:00 | [–] similar comments

PEP8 isn't a language requirement, but a style guide. There are tools to enforce style on Python, but the language itself does not.

giancarlostoro | karma 13964 | avg karma 2.4 2020-02-04 19:31:59+00:00 | [–] similar comments

Same thing with Go... Tabs aren't enforced, but the out of the box formatter will use tabs. PyCharm will default to trying to follow PEP8, and GoLand will do the same, it will try to follow the gofmt standards.

See:

https://stackoverflow.com/questions/19094704/indentation-in-...

marsokod | karma 239 | avg karma 1.61 2020-02-04 19:29:07+00:00 | [–] similar comments

You can use tabs to indent your python code. Ok, you might be lynched if you share your code but as long as you don't mix tabs and spaces, it is fine.

giancarlostoro | karma 13964 | avg karma 2.4 2020-02-04 19:33:55 | [–] similar comments

Same with Go, you can use spaces.

_wldu | karma 5420 | avg karma 7.83 2020-02-04 19:56:03 | [–] similar comments

Or none at all: https://play.golang.org/p/VeEaJbJ6sYT

monocasa | karma 27236 | avg karma 2.94 2020-02-04 19:36:10 | [–] similar comments

There's a difference between making decisions that are really open to bikeshedding, and making sweeping decisions in contexts that legitimately need per app tuning like immature GCs.

The Azul guys get to claim that you don't need to tune their gc, golang doesn't.

geodel | karma 6768 | avg karma 2.18 2020-02-04 19:57:49+00:00 | [–] similar comments

Hmm ..this is why Azul's install and configure guide run in hundreds of pages. All the advanced tuning, profiling and configuring OS commands, setting contingency memory pools are perhaps for GCs which Azul does not sell.

monocasa | karma 27236 | avg karma 2.94 2020-02-04 20:05:08+00:00 | [–] similar comments

I mean, they'll let you because the kind of customers want to be able to are the kinds of customers that Azul targets. But everything I've heard from their engineers is that they've solved a lot of customer problems by resetting things to defaults and just letting it have a giant heap to play with.

Not sure how that makes the golang position any better.

tasty_freeze | karma 6721 | avg karma 5.13 2020-02-04 19:27:27+00:00 | [–] similar comments

> everyone can read the code with the indentation they prefer, while the file stays the same.

Have you ever worked in a code base with many contributors that changed over the course of years? In my experience it always ends up a jumble where indentation is screwed up and no particular tab setting makes things right. I've worked on files where different lines in the same file might assume tab spacing of 2, 3, 4, or 8.

For example, say there is a function with a lot of parameters, so the argument list gets split across lines. The first line has, say, two tabs before the start of the function call. The continuation line ideally should be two tabs then a bunch of spaces to make the arguments line up with the arguments from the first line. But in practice people end up putting three or four tabs to make the 2nd line line up with the arguments of the first line. It looks great with whatever tab setting the person used at that moment, but then change tab spacing and it no longer is aligned.

Will_Parker | karma 551 | avg karma 3.36 2020-02-04 19:36:22+00:00 | [–] similar comments

> In my experience it always ends up a jumble where indentation is screwed up and no particular tab setting makes things right.

Consider linting tools in your build.

_ph_ | karma 9798 | avg karma 2.44 2020-02-04 19:37:32+00:00 | [–] similar comments

On the good side, the problem of mixing tabs and spaces does normally not appear in Go sources, as gofmt always converts spaces to tabs, so there is no inconsistant indentation. Normally I prefer spaces to tabs because I dislike the mixing, but gofmt solves this nicely for me.

tasty_freeze | karma 6721 | avg karma 5.13 2020-02-04 19:44:30+00:00 | [–] similar comments

Please explain to me how this works for the case I outlineed, eg:

        some_function(arg1, arg2, arg3, arg4,
                      arg5, arg6);

For the sake of argument, say tabstop=4. If the first line starts with two tabs, will the second line also have two tabs and then a bunch of spaces, or will it start with five tabs and a couple spaces?

steveklabnik | karma 91260 | avg karma 5.08 2020-02-04 19:53:03+00:00 | [–] similar comments

You wouldn't use an alignment-based style, but a block-based one instead:

  some_function(
      arg1,
      arg2,
      arg3,
      arg4,
      arg5,
      arg6,
  );

(I don't know what Go idiom says here, this is just a more general solution.)

masklinn | karma 65147 | avg karma 3.36 2020-02-04 20:21:36+00:00 | [–] similar comments

Checking the original code on the playground, Go just reindents everything using one tab per level. So if the funcall is indented by 2 (tabs), the line-broken arguments are indented by 3 (not aligned with the open paren).

rustfmt looks to try and be "smarter" as it will move the argslist and add linebreaks to it go not go beyond whatever limit is configured on the playground, gofmt apparently doesn't insert breaks in arglists.

Uristqwerty | karma 46 | avg karma 1.53 2020-02-04 19:56:23 | [–] similar comments

In an ideal world, I'd think you would put a "tab stop" character before arg1, then a single tab on the following line, with the bonus benefit that the formatting would survive automatic name changes and not create an indent-change-only line in the diff. Trouble being that all IDEs would have to understand that character, and compilers would have to ignore it (hey, ASCII has form feed and vertical tab that could be repurposed...).

vthriller | karma 601 | avg karma 3.88 2020-02-05 01:46:02+00:00 | [–] similar comments

Or you could use regular tab stop characters to align parts of adjacent lines. That's the idea behind elastic tabstops: http://nickgravgaard.com/elastic-tabstops/

Not all editors, however, support this style of alignment, even if they support plugins (looks at vim and its forks).

dennisgorelik | karma 2312 | avg karma 1.24 2020-02-04 20:59:17+00:00 | [–] similar comments

You should NOT do such alignment anyway, because if you rename "some_function" to "another_function", then you will lose your formatting.

Instead, format arguments in a separate block:

    some_function(
        arg1, arg2, arg3, arg4,
        arg5, arg6);

When arguments are aligned in a separate block, both spaces and tabs work fine.

My own preference is tabs, because of less visual noise in code diff [review].

Uristqwerty | karma 46 | avg karma 1.53 2020-02-04 19:47:34+00:00 | [–] similar comments

I don't know about anyone else, but I like aligning certain things at half-indents (labels/cases half an indent back, so you can skim the silhouette of both the surrounding block and jump targets within it; braceless if/for bodies to emphasize their single-statement nature (that convention alone would have made "goto fail" blatantly obvious to human readers, though not helped the compiler); virtual blocks created by API structure (between glBegin() to glEnd() in the OpenGL 1.x days)).

Thing is, few if any IDEs support the concept, so if I want to have half-indents, I must use spaces. Unfortunately, these days that means giving up and using a more common indent style most of the time, as the extra bit of readability generally isn't worth losing automatic formatting or multi-line indent changes.

mntmoss | karma 697 | avg karma 2.51 2020-02-05 05:11:55+00:00 | [–] similar comments

You can use empty scope braces for this task in most languages. It's not a "half-indent" but it gives you the alignment and informs responsible variable usage.

takeda | karma 6128 | avg karma 2.11 2020-02-06 03:24:42+00:00 | [–] similar comments

So you are the person that ruins it for everyone (are you emacs user by any chance?). Tabs are more versatile, you can even use proportional fonts with them. Projects end up using tabs because many people end up mixing them together (unknowingly or in your case knowingly using configuration that is unavailable in many IDEs).

BTW when you mix spaces with tabs you eliminate all benefits that tabs give (for example you no longer can dynamically change tab size without ruining formatting.

Uristqwerty | karma 46 | avg karma 1.53 2020-02-07 00:56:54+00:00 | [–] similar comments

If I were an emacs user, I'd figure out how to write a plugin to display tab-indented code to my preferences.

No, I used to be a notepad user (on personal projects, not shared work) (you can kinda see it in the use of indentation to help convey information that IDEs would use font or colour to emphasize), and these days use tabs but longingly wish Eclipse, etc. had more options in their already-massive formatting configuration dialogues.

takeda | karma 6128 | avg karma 2.11 2020-02-07 19:03:13 | [–] similar comments

The reason I asked is that I believe this behavior is what Emacs does by default (actually don't know if by default, but saw this from code produced by Emacs users) e.g.

<tab>(inserts 4 spaces)<tab>(replaces 4 spaces into a tab that is 8 columns)<tab>(adds 4 spaces after the tab)<tab>(replaces with two tabs and so on)

Unless I misunderstood what formatting you were using.

kerkeslager | karma 10034 | avg karma 2.74 2020-02-04 19:48:44+00:00 | [–] similar comments

"Simple" when used in programming, doesn't mean anything. So let's be clear here: what we mean is that compilation occurs in a single pass and the artifact of compilation is a single binary.

These are two things that make a lot of sense at Google if you read why they were done.

But unless you're working at Google, I struggle to guess why you would care about either of these things. The first requires sacrificing anything resembling a reasonable type system, and even with that sacrifice Go doesn't really deliver: are we really supposed to buy that "go generate" isn't a compilation step? The second is sort of nice, but not nice enough to be a factor in choosing a language.

The core language is currently small, but every language grows with time: even C with its slow-moving, change-averse standards body has grown over the years. Currently people are refreshed by the lack of horrible dependency trees in Go, but that's mostly because there aren't many libraries available for Go: that will also change with time (and you can just not import all of CPAN/PyPy/npm/etc. in any language, so Go isn't special anyway).

If you like Go for some aesthetic of "simplicity", then sure, I guess I can see how it has that. But if we're discussing pros and cons, aesthetics are pretty subjective and not really work talking about.

papaf | karma 2097 | avg karma 2.95 2020-02-04 20:19:39+00:00 | [–] similar comments

I don't agree with your definition of simplicity.

I like Go and I consider it a simple language because:

1. I can keep most of the language in my head and I don't hit productivity pauses where I have to look something up.

2. There is usually only one way to do things and I don't have to spend time deciding on the right way.

For me, these qualities make programming very enjoyable.

kerkeslager | karma 10034 | avg karma 2.74 2020-02-04 20:50:46+00:00 | [–] similar comments

> I don't agree with your definition of simplicity.

You mean where I explicitly said that "simple" didn't mean anything, so we should talk about what we mean more concretely?

> 1. I can keep most of the language in my head and I don't hit productivity pauses where I have to look something up.

The core language is currently small, but every language grows with time: even C with its slow-moving, change-averse standards body has grown over the years.

> 2. There is usually only one way to do things and I don't have to spend time deciding on the right way.

Go supports functional programming and object-oriented programming, so pretty much anything you want to do has at least two ways to do it--it sounds like you just aren't familiar with the various ways.

The problem with having more than one way to do things isn't usually choosing which to use, by the way: the problem is when people use one of the many ways differently within the same codebase and it doesn't play nicely with the way things are done in the codebase.

This isn't really a criticism of Go, however: I can't think of a language that actually delivers on there being one right way to do things (most don't even make that promise--Python makes the promise but certainly doesn't deliver on it).

sk0g | karma 1482 | avg karma 2.14 2020-02-04 21:18:13+00:00 | [–] similar comments

Does Go support functional programming? There's no support for map, filter, etc. It barely supports OOP too, with no real inheritance or generics.

I've been happy working with it for a year now, though I've had the chance to work with Kotlin and I have to say, it's very nice too, even if the parallelism isn't quite easy/ convenient to use.

kerkeslager | karma 10034 | avg karma 2.74 2020-02-04 22:16:52 | [–] similar comments

It supports first-class functions, and it supports classes/objects. Sure, it doesn't include good tooling for either, but:

1. map/filter are 2 lines of code each. 2. Inheritance is part of mainstream OOP, but there are some less common languages that don't support inheritance in the way you're probably thinking (i.e. older versions of JavaScript before they caved and introduced two forms of inheritance). 3. Generics are more of a strong type thing than an OOP thing.

abraxas | karma 2908 | avg karma 3.94 2020-02-04 19:10:43+00:00 | [–] similar comments

Seems like Go is more suitable for the “spin up, spin down, never let the GC run” kind of scenario that is being pushed by products like AWS Lambda and other function as a service frameworks.

_ph_ | karma 9798 | avg karma 2.44 2020-02-04 19:26:34+00:00 | [–] similar comments

Why do you think it is? Go has a really great gc which mostly runs in parallel to your program with gc stops only in the doman of less than milliseconds. Discord ran into a corner case where they did not create enough garbage to trigger gc cycles, but had a performance impact due to scheduled gc cycles for returning memory to the OS (which they wouldn't need to do either).

abraxas | karma 2908 | avg karma 3.94 2020-02-04 19:36:19 | [–] similar comments

Because many services eventually become performance bottlenecked either via accumulation of users or accumulation of features. In either case eventually performance becomes very critical.

_ph_ | karma 9798 | avg karma 2.44 2020-02-04 20:16:31 | [–] similar comments

Sure, but that doesn't make Go unsuitable for those tasks on a fundamental basis. Go is very high performance. Whether Go or another language is the best match very much depends on the problem at hand and the especial requirements. Even in the described case they might have tweaked the GC to fit their bill.

abraxas | karma 2908 | avg karma 3.94 2020-02-05 00:08:02+00:00 | [–] similar comments

GC pauses aside can Go match the performance of Rust when coded properly? Would sorting an array of structs in Go be in the same ballpark as sorting the same sized array of structures in Rust? I don't know a whole lot about how Go manages the heap under the covers.

cdoxsey | karma 3151 | avg karma 7.15 2020-02-05 01:34:42 | [–] similar comments

Sure. That kind of code is very similar in both languages.

Rust will probably be faster because it benefits from optimizations in LLVM that Go likely doesn't have.

Go arrays of structs are contiguous, not indirect pointers.

recuter | karma 3865 | avg karma 2.57 2020-02-04 19:18:23+00:00 | [–] similar comments

If you want to force it you can call "runtime.GC()" but that's almost always a step in the wrong direction.

It is worth it to read and understand: https://blog.twitch.tv/en/2019/04/10/go-memory-ballast-how-i...

_ph_ | karma 9798 | avg karma 2.44 2020-02-04 19:19:30+00:00 | [–] similar comments

With recent Go releases, GC pauses have become neglible for most applications. So this should not get into your way. However, it can easily tweaked, if needed. There is runtime.ForceGCPeriod, which is a pointer to the forcegcperiod variable. A Go program, which really needs to change this, can do it, but most programs shouldn't require this.

Also, it is almost trivial to edit the Go sources (they are included in the distribution) and rebuild it, which usually takes just a minute. So Go is really suited for your own experiments - especially, as Go is implemented in Go.

nemo1618 | karma 3894 | avg karma 4.66 2020-02-04 19:25:51+00:00 | [–] similar comments

runtime.ForceGCPeriod is only exported in testing, so you wouldn't be able to use it in production. But as you said, the distribution could easily be modified to fit their needs.

_ph_ | karma 9798 | avg karma 2.44 2020-02-04 19:33:17 | [–] similar comments

Thanks, didn't catch that this is for testing only.

calcifer | karma 3833 | avg karma 5.87 2020-02-04 19:56:33+00:00 | [–] similar comments

> especially, as Go is implemented in Go.

Well, parts of it. You can't implement "make" or "new" in Go yourself, for example.

_ph_ | karma 9798 | avg karma 2.44 2020-02-04 20:13:33+00:00 | [–] similar comments

You have to distinguish between the features available to a Go program as the user writes it and the implementation of the language. The immplementation is completely written in Go (plus a bit of low-level assembly). Even if the internals of e.g. the GC are not visible to a Go program, the GC itself is implemented in Go and thus easily readeable and hackeable for experienced Go programmers. And you can quickly rebuild the whole Go stack.

calcifer | karma 3833 | avg karma 5.87 2020-02-04 22:10:04+00:00 | [–] similar comments

> You have to distinguish between the features available to a Go program as the user writes it and the implementation of the language.

I do, I'm just objecting to "Go is implemented in Go".

dodobirdlord | karma 3155 | avg karma 2.5 2020-02-05 04:35:17+00:00 | [–] similar comments

This reminds me of the ongoing saga of RUSTC_BOOTSTRAP[0][1]

The stable compiler is permitted to use unstable features in stable builds, but only for compiling the compiler. In essence, there are some Rust features that are supported by the compiler but only permitted to be used by the compiler. Unsurprisingly, various non-compiler users of Rust have decided that they want those features and begun setting the RUSTC_BOOTSTRAP envvar to build things other than the compiler, prompting consternation from the compiler team.

[0] https://github.com/rust-lang/cargo/issues/6627 [1] https://github.com/rust-lang/cargo/issues/7088

estebank | karma 4736 | avg karma 3.39 2020-02-05 06:15:06 | [–] similar comments

This is not entirely correct. These things that "can only be used by the compiler" are nightly features that haven't been stabilized yet. Some of them might never be stabilized, but you could always use them in a nightly conpiler, stability assurances just fly out the window then. This is also why using that environment variable is highly discouraged: it breaks the stability guarantees of the language and you're effectively using a pinned nightly. This is reasonable only in a very small handful of cases.

giornogiovanna | karma 496 | avg karma 3.12 2020-02-05 06:53:15+00:00 | [–] similar comments

Yep. Beyond that, there is at least one place[0] where the standard library uses undefined behavior "based on its privileged knowledge of rustc internals".

[0]: https://doc.rust-lang.org/src/std/io/mod.rs.html#379

dodobirdlord | karma 3155 | avg karma 2.5 2020-02-05 07:25:12+00:00 | [–] similar comments

I don't see what is incorrect? Perhaps I was insufficiently clear that when I said "the compiler" I meant "the stable compiler" as opposed to more generally all possible versions of rustc. The stable compiler is permitted to use unstable features for its own bootstrap, example being the limited use of const generics to compile parts of the standard library.

_ph_ | karma 9798 | avg karma 2.44 2020-02-05 07:01:45+00:00 | [–] similar comments

But on what basis? What part of Go isn't implemented in Go?

calcifer | karma 3833 | avg karma 5.87 2020-02-05 09:14:18+00:00 | [–] similar comments

I gave an example a bit further up, here [1].

[1] https://news.ycombinator.com/item?id=22240223

_ph_ | karma 9798 | avg karma 2.44 2020-02-05 09:43:06+00:00 | [–] similar comments

But this isn't a contradiction to the statement, that Go is implemented in Go. If you look at the sources of the Go implementation, the source code is 99% Go, with a few assembly functions (most for optimizations not performed by the compiler) and no other programming language used.

calcifer | karma 3833 | avg karma 5.87 2020-02-05 12:50:16+00:00 | [–] similar comments

That's not correct. The implementation of "make", for example, looks like Go but isn't - it relies on internal details of the gc compiler that isn't part of the spec [1]. That's why a Go user can't implement "make" in Go.

[1] https://golang.org/ref/spec

_ph_ | karma 9798 | avg karma 2.44 2020-02-05 13:11:04 | [–] similar comments

In which language do you think is "make" implemented?

scott_s | karma 34069 | avg karma 3.96 2020-02-05 15:38:10 | [–] similar comments

If I may interject: I believe you are both trying to make orthogonal points. calcifer's is trying to say that some features of Go are compiler intrinsics, and cannot be implemented as a library. You are making a different point, which is that those intrinsics are implemented in Go, the host language. Both statement can be true at the same time, but I agree that the terms were not used entirely accurately, causing confusion.

slrz | karma 455 | avg karma 1.19 2020-02-04 21:32:10 | [–] similar comments

And yet, maps and slices are implemented in Go.

https://golang.org/src/runtime/map.go

https://golang.org/src/runtime/slice.go

I don't see why you couldn't do something similar in your own Go code. It just won't be as convenient to use as the compiler wouldn't fill in the type information (element size, suitable hash function, etc.) for you. You'd have to pass that yourself or provide type-specific wrappers invoking the unsafe base implementation. More or less like you would do in C, with some extra care to abide by the rules required for unsafe Go code.

calcifer | karma 3833 | avg karma 5.87 2020-02-04 22:12:00+00:00 | [–] similar comments

Nothing you wrote contradicts what I said. You can't implement "make" in Go. The fact that you can implement some approximation of it with a worse signature and worse runtime behaviour (since it won't be compiler assisted) doesn't make it "make".

robocat | karma 11778 | avg karma 2.08 2020-02-04 21:06:32+00:00 | [–] similar comments

You still may have significant CPU overhead from the GC e.g. the twitch article (mentioned elsewhere in comments) measured 30% CPU used for GC for one program (Go 1.5 I think).

Obviously they consider spending 50% more on hardware is a worthwhile compromise for the gains they get (e.g. reduction of developer hours and reduced risk of security flaws or avoiding other effects of invalid pointers).

_ph_ | karma 9798 | avg karma 2.44 2020-02-04 21:15:14+00:00 | [–] similar comments

In this case, as they were running into the automatic GC interval, their program did not create much, if any garbage. So the CPU overhead for the GC would have been quite small.

If you do a lot of allocations, the GC overhead rises of course, but also would the effort of doing allocations/deallocations with a manual managing scheme. In the end it is a bit trade-off, what fits the problem at hand best. The nice thing about Rust is, that "manual" memory management doesn't come at the price of program correctness.

pkolaczk | karma 2452 | avg karma 1.99 2020-02-05 10:01:07+00:00 | [–] similar comments

Languages that have GC frequently rely on heap allocation by default and make plenty of allocations. Languages with good manual memory management frequently rely on stack allocation and give plenty of tools to work with data on the stack. Automatic allocation on the stack is almost always faster than the best GC.

_ph_ | karma 9798 | avg karma 2.44 2020-02-05 12:10:31+00:00 | [–] similar comments

GC languages often do and also often do not. Most modern GC languages have escape analysis. So if the compiler can deduct that an object does not escape the current scope, it is stack allocated instead of heap allocated. Modern JVMs do this and Go does this also. Furthermore, Go is way more allocation friendly than e.g. Java. In Go an array of structs is a single item on the heap (or stack). In Java, you would have an array of pointers to separately allocate on the heap (Java is just now trying to rectify this with the "record" types). Also, structs are passed by value instead of reference.

As a consequence, the heap pressure of a Go program is not necessarily significantly larger than that of an equivalent C or Rust program.

pkolaczk | karma 2452 | avg karma 1.99 2020-02-05 18:06:09 | [–] similar comments

Escape analysis is very limited and what I found in practice, it often doesn't work in real code, where not all the things are inlined. If a method allocates an object and returns it 10 layers up, EA can't do anything.

In contrary, in e.g. C I can wrap two 32-bit fields in a struct and freely pass then anywhere with zero heap allocations.

Also, record types are not going to fix the pointer chasing problem with arrays. This is promised by Valhalla, but I'be been hearing about it for 3 years or more now.

andoriyu | karma 59 | avg karma 0.34 2020-02-04 21:16:24 | [–] similar comments

> Also, it is almost trivial to edit the Go sources (they are included in the distribution) and rebuild it, which usually takes just a minute. So Go is really suited for your own experiments - especially, as Go is implemented in Go.

Ruby 1.8.x wants to say "Hello"

tedunangst | karma 26000 | avg karma 2.74 2020-02-04 19:34:23+00:00 | [–] similar comments

Typically a GC runtime will do a collection when you allocate memory, probably when the heap size is 2x the size after the last collection. But this doesn't free memory when the process doesn't allocate memory. The goal is to return unused memory back to the operating system so it's available for other purposes. (You allocate a ton of memory, calculate some result, write the result to a file, and drop references to the memory. When will it be freed?)

justadudeama | karma 678 | avg karma 6.0 2020-02-04 18:46:42+00:00 | [–] similar comments

> Changing to a BTreeMap instead of a HashMap in the LRU cache to optimize memory usage.

Can someone explain to me how BTreeMap is more memory efficient than a HashMap?

jhgg | karma 3220 | avg karma 6.78 2020-02-04 18:51:01 | [–] similar comments

This is a bit unclear. The root map is still a hash map, but it's a "map of maps" the inner map is a BTreeMap - this is for memory efficiency, as the inner map is relatively smaller and we wouldn't have to deal with the growth factor of a hash map (and having to manually manage that.) where as the root hash map is pre allocated to its max size.

afranchuk | karma 302 | avg karma 2.5 2020-02-04 18:57:32 | [–] similar comments

A BTreeMap should typically have O(n) memory usage, whereas a HashMap (depending on load factor) will usually have O(kn) memory usage, where k > 1. This is because a HashMap allocates the table into which it will store hashed values upfront (and when the load is too great), so it can't anticipate how many values may be added nor what sorts of collisions may occur at this time. Yes, collisions are typically stored as some allocate-per-item collection, but the desire of a HashMap is to avoid such collisions. A BTreeMap allocates for each new value.

Note that this explanation is a bit handwavy, as both data structures have numerous optimizations in production scenarios.

cesarb | karma 14181 | avg karma 3.67 2020-02-04 19:53:40+00:00 | [–] similar comments

> collisions are typically stored as some allocate-per-item collection

Rust's HashMap stores the collisions in the same table as the non-collisions (open addressing), not in a separate collection.

afranchuk | karma 302 | avg karma 2.5 2020-02-04 20:11:21 | [–] similar comments

This is true, thanks for the specifics. I was answering the question from a more generic perspective, but failed to mention that many implementations rehash on collision...

nybble41 | karma 1836 | avg karma 1.05 2020-02-04 19:54:31+00:00 | [–] similar comments

There is no difference between O(n) and O(kn), if k is a constant. The notation deliberately ignores constant factors. (That's why you can say a BTreeMap requires O(n) memory independent of the size or type of data being stored, provided there is some finite upper bound on the sizes of the keys and values.)

afranchuk | karma 302 | avg karma 2.5 2020-02-04 20:08:58+00:00 | [–] similar comments

Yeah I know, it was just the fastest way to indicate that the constant factor was almost definitely larger for HashMaps. But thank you for clarifying!

mperham | karma 3264 | avg karma 6.46 2020-02-04 18:48:48+00:00 | [–] similar comments

Better title: "One Discord microservice with extremely high traffic is moving to Rust"

jhgg | karma 3220 | avg karma 6.78 2020-02-04 18:54:25 | [–] similar comments

This is one of multiple, we did not blog about this one, but switching a Python http service for analytics ingest that was purely CPU bound to rust resulted in a 90% reduction in compute required to power it. However, that's not too interesting because it's known that Python is slow haha.

We have 2 golang services left, one of them has a rewrite in rust in PR as of last week (as a fun side project an engineer wanted to try out.)

Additionally, as we move towards a more SOA internally, we plan to write more high velocity data services, and rust will be our language of choice for that.

okgood288 | karma -4 | avg karma -4.0 2020-02-04 19:09:36 | [–] similar comments

Well sure when it’s a micro service that probably has more lines of infra config than biz logic LOC.

This isn’t exactly “Linux kernel: now in Rust!”

Glad you’re making tech for you all better.

We get to take up the externalized runtime costs of the mess that is the Electron app.

Engineers are super efficient at offloading the last mile of effort.

snazz | karma 5427 | avg karma 2.68 2020-02-04 19:18:11 | [–] similar comments

Let's not start the Electron debate again. That's been argued to death already.

jodrellblank | karma 12920 | avg karma 1.96 2020-02-04 23:51:17+00:00 | [–] similar comments

Maybe if we keep at it, we can argue it until Electron's death?

onebot | karma 458 | avg karma 3.16 2020-02-04 19:28:52+00:00 | [–] similar comments

Think replacing elixir with Rust would ever be a consideration? Rust isn't there yet, but if you are NIF'ing a bunch of stuff, seems like it could make sense at some point?

meowface | karma 10977 | avg karma 2.45 2020-02-04 23:47:26 | [–] similar comments

I Googled around, but couldn't find the answer. What is "NIF"?

dnautics | karma 17907 | avg karma 2.14 2020-02-05 00:06:57+00:00 | [–] similar comments

https://erlang.org/doc/man/erl_nif.html

Short: ffi for erlang.

Lev1a | karma 864 | avg karma 2.32 2020-02-05 00:45:30 | [–] similar comments

> but switching a Python http service for analytics ingest that was purely CPU bound to rust resulted in a 90% reduction in compute required to power it. However, that's not too interesting because it's known that Python is slow

Kinda like this: https://blog.sentry.io/2016/10/19/fixing-python-performance-... ?

tybit | karma 776 | avg karma 2.86 2020-02-04 21:40:47+00:00 | [–] similar comments

Given the rampant misuse of Microservices, this was a really nice read about a seemingly well designed system.

They were able to rewrite their hot spot in a new language without having to rewrite all their business logic in a new language. Not that there wouldn’t have been solutions with a monolith, but this certainly seems elegant and precise.

kardianos | karma 1175 | avg karma 4.26 2020-02-04 18:51:35+00:00 | [–] similar comments

I'm glad they found a good solution (rust) to solve their problem!

Also note this was with Go1.9. I know GC work was ongoing during that time, I wonder if this time of situation would still happen?

biomcgary | karma 2289 | avg karma 3.76 2020-02-04 19:04:22+00:00 | [–] similar comments

I know latency for GC with large heaps improved in Go1.12. See: https://golang.org/doc/go1.12#runtime

kerkeslager | karma 10034 | avg karma 2.74 2020-02-04 19:55:03+00:00 | [–] similar comments

The article states specifically that part of the problem was heaps were never large.

EDIT: Actually, no it didn't, I misunderstood it.

typical182 | karma 628 | avg karma 7.95 2020-02-04 20:48:30+00:00 | [–] similar comments

Where does it say that?

It says things like:

“We were not creating a lot of garbage.”

... but that statement there doesn’t say anything about the heap size, including the size and count of live objects (i.e., not garbage).

It also says:

“There are millions of Users in each cache. There are tens of millions of Read States in each cache.”

Large is often in the eye of the beholder, but I missed it if it said anything specifically about not having a large heap size.

kerkeslager | karma 10034 | avg karma 2.74 2020-02-04 21:19:57+00:00 | [–] similar comments

> ... but that statement there doesn’t say anything about the heap size, including the size and count of live objects (i.e., not garbage).

Not sure why you got downvoted, you're actually right, I'm wrong: I misread that and/or assumed one meant the other.

That said, this is a case that should be ideal for generational GC, which Go specifically eschewed at one point. I'm not sure this is still the case, however--I have yet to wade through this[1] to update my knowledge here.

[1] https://blog.golang.org/ismmkeynote

dilyevsky | karma 5634 | avg karma 2.11 2020-02-04 19:26:36+00:00 | [–] similar comments

LOL that should’ve been at the top. The improvement in gc between 1.9 and 1.12 is absolutely massive. They could’ve just upgraded go toolchain.

kerkeslager | karma 10034 | avg karma 2.74 2020-02-04 19:56:44+00:00 | [–] similar comments

The GC changes in 1.12 supposedly target large heaps, which is not Discord's situation.

dilyevsky | karma 5634 | avg karma 2.11 2020-02-04 20:11:37+00:00 | [–] similar comments

This post needed a lot more depth to really understand what was going on. Statements like

> During garbage collection, Go has to do a lot of work to determine what memory is free, which can slow the program down.

read like blogospam to me (which it is).

For comparison sake - similar post from Twitch has a lot more technical detail and generally makes me view their team in a lot better light than Dicord’s after reading both.

lostcolony | karma 9464 | avg karma 3.54 2020-02-04 23:50:10+00:00 | [–] similar comments

Really? My take was that it was. They mention a bunch of cached data, that rarely got ejected (so not generating a lot of garbage), but that took a long time to traverse (so when GC DID occur, it took a long time), which implies it being large.

ascv | karma 21 | avg karma 1.62 2020-02-04 19:30:31 | [–] similar comments

It's surprising they didn't test upgrading to 1.13.

topspin | karma 5853 | avg karma 3.27 2020-02-04 21:45:26+00:00 | [–] similar comments

> It's surprising they didn't test upgrading to 1.13.

It isn't surprising to me. It's stated elsewhere they tried 4 difference version of Go, up through 1.10 apparently, and had performance problems with all of them. At some point you can't suffer garbage collector nonsense anymore and since they'd already employed Rust on other services they tried it here.

It worked on the first try.

That's not surprising either.

What would be surprising is if any of these "but version such and such is Waaay better and they should just use that" actually panned out. The best case would be that the issue just manifests as some other garbage collector related performance problem. That's the deal you sign up for when you saddle yourself with a garbage collector.

ascv | karma 21 | avg karma 1.62 2020-02-05 01:00:54+00:00 | [–] similar comments

It's still a huge whoosh. You're starting at 1.9 and you're testing 4 micro versions to 1.10.. what is the point of that? None of those non-major versions are going to significantly change how the GC works.

They could have tried 1 other version (not 4) and picked either the latest (1.13) or the version that contains the GC improvements (1.12) to test. Usually when you are looking to upgrade something you skim the release notes so testing 1.12 or 1.13 is obvious especially when 1.12 seems to specifically address their performance concern.

If upgrading something avoids a service re-write that is usually the way to go unless you were looking for an excuse to re-write the service in the first place which may have been the case.

edit:

It turns out they did exactly what my comment stated: they tested the latest version (1.10). It's just that this article was published recently but the events happened quite a while back.

topspin | karma 5853 | avg karma 3.27 2020-02-05 01:48:24+00:00 | [–] similar comments

> 1.12 seems to specifically address their performance concern

Maybe it does, maybe it doesn't. Maybe 1.14 has a performance regression with their specific load and they get screwed.

We'll never know. You know why? They permanently solved their GC problems by eliminating GC.

That's the smart play.

ascv | karma 21 | avg karma 1.62 2020-02-05 02:34:06+00:00 | [–] similar comments

Actually it turns out that they DID test the latest version which at the time was 1.10 (see my edit). I guess you should be surprised now? :D

topspin | karma 5853 | avg karma 3.27 2020-02-05 05:35:47+00:00 | [–] similar comments

Since I actually cited 1.10 as one they tested I'm not surprised at all.

ascv | karma 21 | avg karma 1.62 2020-02-14 22:05:27+00:00 | [–] similar comments

Except that you literally said it wasn't surprising b/c gc sucks. You were "not surprised" in response to the assumption that they DIDN'T test the latest version. However this was just a mis-understanding and you're re-casting your comment to make it seem like you were right all along. If you knew they tested the latest version all along then you couldn't have been surprised or not surprised to something that didn't happen.

It seems like my comment was just an entry point for you to shit on gc which, ironically, I mostly agree with in this context.

lathiat | karma 4116 | avg karma 3.07 2020-02-05 06:29:02+00:00 | [–] similar comments

According to another comment they did this back in May 2019 when 1.10 was the latest. They are only blogging about it now which I guess is slightly unfortunate but never the less.

faitswulff | karma 8583 | avg karma 4.66 2020-02-04 21:20:54 | [–] similar comments

From /u/DiscordJesse on reddit:

> We tried upgrading a few times. 1.8, 1.9, and 1.10. None of it helped. We made this change in May 2019. Just getting around to the blog post now since we've been busy.

https://www.reddit.com/r/programming/comments/eyuebc/why_dis...

cesarb | karma 14181 | avg karma 3.67 2020-02-05 00:32:32 | [–] similar comments

Another interesting comment in the same reddit thread, from /u/brian-discord (https://old.reddit.com/r/programming/comments/eyuebc/why_dis...):

> Another Discord engineer chiming in here. I worked on trying to fix these spikes on the Go service for a couple weeks. We did indeed try moving up the latest Go at the time (1.10) but this had no effect.

> For a more detailed explanation, it helps to understand what is going on here. It is not the increased CPU utilization that causes the latency. Rather, it's because Go is pausing the entire world for the length of the latency spike. During this time, Go has completely suspended all goroutines which prevents them from doing any work, which appears as latency in requests.

> The specific cause of this seems to be because we used a large free-list like structure, a very long linked list. The head of the list is maintained as a variable, which means that Go's mark phase must start scanning from the head and then pointer chase its way through the list. For whatever reason, Go does (did?) this section in a single-threaded manner with a global lock held. As a result, everything must wait until this extremely long pointer chase occurs.

> It's possible that 1.12 does fix this, but we had tried upgrading a few times already on releases that promised GC fixes and never saw a fix to this issue. I feel the team made a pragmatic choice to divest from Go after giving the language a good attempt at salvaging the project.

earthboundkid | karma 6376 | avg karma 2.89 2020-02-05 02:25:40+00:00 | [–] similar comments

Ugh, linked lists fuck everything up. It’s never the right data structure. Use a vector!

anarazel | karma 5793 | avg karma 3.95 2020-02-05 04:03:23 | [–] similar comments

That's only true if you actually need to access more than a few elements at once. And if you never need to insert/delete to/from anywhere but the end.

earthboundkid | karma 6376 | avg karma 2.89 2020-02-05 12:17:11+00:00 | [–] similar comments

Even if you need middle inserts but not a B-tree (weird), it’s still better to use a vector in most cases. Time to find the insertion point will dominate.

giornogiovanna | karma 496 | avg karma 3.12 2020-02-05 12:57:22 | [–] similar comments

The two main cases when linked lists are better are (a) when you want to guarantee a low cost per insert, since vector insertion is only O(1) amortized, and (b) when you want to insert in the middle, but somehow found that middle without scanning the list.

Anyway, in this case, I guess they're using a free list because (1) it's simpler since you don't need an external collection keeping a list of unused stuff, and (2) reason (a) above.

RcouF1uZ4gsC | karma 17613 | avg karma 3.88 2020-02-04 18:52:19 | [–] similar comments

> Changing to a BTreeMap instead of a HashMap in the LRU cache to optimize memory usage.

Collections are one of the big areas where Go's lack of generics really hurts it. In Go, if one of the built in collections does not meet your needs, you are going to take a safety and ergonomic hit going to a custom collection. In Rust, if one of the standard collections does not meet your needs, you (or someone else) can create a pretty much drop-in replacement that does that has similar ergonomic and safety profiles.

correct_horse | karma 466 | avg karma 3.38 2020-02-04 18:54:40+00:00 | [–] similar comments

I'm not sure what you mean by standard collections, but BTreeMap is in Rust's standard library.

pdpi | karma 11546 | avg karma 4.83 2020-02-04 19:00:32+00:00 | [–] similar comments

I think the point the GP is trying to make is that there’s no reason why BTreeMap couldn’t be an external crate, while only the core Go collections are allowed to be generic.

A corollary to this is that adding more generic collections to Go’s standard library implies expanding the set of magical constructs.

The_rationalist | karma -7 | avg karma -0.0 2020-02-04 19:52:14+00:00 | [–] similar comments

Rust has it's lot of weird hacks too. E.g array can take traits impls only if they have less than 32 elements... https://doc.rust-lang.org/std/array/trait.LengthAtMost32.htm...

jolux | karma 6163 | avg karma 2.58 2020-02-04 19:54:41+00:00 | [–] similar comments

That's a completely different and much more minor issue (red herring, more or less) than eschewing the one core language feature that makes performant type-safe custom data structures possible.

steveklabnik | karma 91260 | avg karma 5.08 2020-02-04 20:01:00+00:00 | [–] similar comments

This is purely temporary; it used to be less hacky, but in order to move to the no-hacks world, we had to make it a bit more hacky to start.

masklinn | karma 65147 | avg karma 3.36 2020-02-04 20:09:00+00:00 | [–] similar comments

That's… not that at all. You can absolutely implement traits for arrays of more than 32 elements[0].

It is rather that due to a lack of genericity (namely const generics) you can't implement traits for [T;N], you have to implement them for each size individually. So there has to be an upper bound somehow[1], and the stdlib developers arbitrarily picked 32 for stdlib traits on arrays.

A not entirely dissimilar limit tends to be placed on tuples, and implementing traits / typeclasses / interfaces on them. Again the stdlib has picked an arbitrary limit, here 12[2], the same issue can be seen in e.g. Haskell (where Show is "only" instanced on tuples up to size 15).

These are not "weird hacks", they're logical consequences of memory and file size not being infinite, so if you can't express something fully generically… you have to stop at one point.

[0] here's 47 https://play.rust-lang.org/?version=stable&mode=debug&editio...

[1] even if you use macros to codegen your impl block

[2] https://doc.rust-lang.org/src/core/fmt/mod.rs.html#2115

kibwen | karma 47766 | avg karma 6.38 2020-02-04 23:18:22 | [–] similar comments

Also worth noting that Rust's const generics support has progressed to the point that the stdlib is already using them to implement the standard traits on arrays; the 32-element issue still technically exists, but only because the stdlib is manually restricting the trait implementation so as to not accidentally expose const generics to stable Rust before const generics is officially stabilized.

zerr | karma 4218 | avg karma 1.52 2020-02-04 19:55:11+00:00 | [–] similar comments

In Go, standard collections are compiler's magic while in Rust or e.g. C++ - they are implemented as libraries.

Cthulhu_ | karma 26640 | avg karma 2.23 2020-02-12 12:07:35 | [–] similar comments

I like to think it's a tradeoff; limit the language and standard library and you limit the amount of things you have to consider. That is, 99% of applications probably won't need a BTree.

(anecdotal: in Java I've never needed anything else than a HashMap or an ArrayList)

unlinked_dll | karma 2533 | avg karma 3.72 2020-02-04 18:54:07 | [–] similar comments

It'd be cool to look at more signal statistics from the CPU plot.

It appears that Go has a lower CPU floor, but it's killed by the GC spikes, presumably due to the large cache mentioned by the author.

This is interesting to me. It suggests that Rust is better at scale than Go, and I would have thought with Go's mature concurrency model and implementation would have been optimized for such cases while Rust would shine in smaller services with CPU bound problems.

Great post!

arnsholt | karma 1042 | avg karma 5.48 2020-02-04 19:09:37+00:00 | [–] similar comments

My first guess for the slightly higher CPU floor of the Rust version is that the Rust code has to do slightly more work per request, since it will free memory as it gets dropped, whereas the Go code doesn't do any freeing per request, but then gets hit with the periodic spike every two minutes where the entire heap has to be traversed for GC.

jhgg | karma 3220 | avg karma 6.78 2020-02-04 19:36:35+00:00 | [–] similar comments

tokio 0.1 was definitely less efficient, when we compare go to 0.2, tokio uses less cpu consistently, even when compared to a cluster of the same size almost a year later with our growth over the time since we switched over.

pixel_fcker | karma 419 | avg karma 2.57 2020-02-04 23:11:40 | [–] similar comments

Go's CPU floor is lower compared to the naive Rust port (roughly 20% vs 23% from eyeballing). Their optimized Rust version is shown in the next series of graphs as being ~12%.

bradhe | karma 1280 | avg karma 1.53 2020-02-04 18:57:16 | [–] similar comments

Replatforming to solve this problem was a bit silly in my opinion. The solution to the problem was "do fewer allocations" which can be done in any language.

pjmlp | karma 109153 | avg karma 1.76 2020-02-04 19:00:28+00:00 | [–] similar comments

Yeah, but this way one can add experience in yet another language to the CV.

jhgg | karma 3220 | avg karma 6.78 2020-02-04 19:03:46+00:00 | [–] similar comments

Your reply misses the point. We were already doing so few allocations that the GC only ran because it "had to" at every 2 minute mark. The issue was the large heap of many long lived objects.

_ph_ | karma 9798 | avg karma 2.44 2020-02-04 20:53:26+00:00 | [–] similar comments

Did you try to change that interval to a much larger time?

jhgg | karma 3220 | avg karma 6.78 2020-02-04 21:01:10+00:00 | [–] similar comments

When we investigated, there was no way to change that that we could find - barring compiling go from source (something we could have done, but wanted to avoid.)

_ph_ | karma 9798 | avg karma 2.44 2020-02-04 21:11:21+00:00 | [–] similar comments

Yes, you have to rebuild go, but that is literally done in a minute. It also would be interesting, if you happen to have some conclusive benchmarks, how the latest Go runtime would perform in this sense.

_ph_ | karma 9798 | avg karma 2.44 2020-02-05 07:04:25+00:00 | [–] similar comments

I don't get, why this is downvoted without comments. Compared to a rewrite, this would have been a miniscule change. Furthermore, considering that you wrote that long blog post (which I quite appreciate, as it contains interesting information), it would have been important knowledge, whether the setting of the parameter was the real culprit - and if it was, a good reason to shout out to the Go implementors to look closer at it.

Jweb_Guru | karma 2749 | avg karma 2.24 2020-02-05 08:01:50+00:00 | [–] similar comments

All I'm going to say is that if you think maintaining your own version of a compiler is the reasonable option compared to a rewrite in another language, you are probably deeply invested in the former language. This also applies to kernels and databases.

_ph_ | karma 9798 | avg karma 2.44 2020-02-05 09:38:56+00:00 | [–] similar comments

Well, in this case, "maintaining your own version compiler" concerns a single value change in the code base. At least, as I wrote, it should have been tried to identify the root cause for the observed behavior. If this "fix" significantly improves the behavior, it would have been a good data point to reach out to the Go developers to resolve this issue.

The problem to get down to the core of these issues are test cases. It seems, that neither the Go developers nor many other people have run into this as an issue - I only remember noticing the regular GC some years ago, but it was not an issue for me. As they have a real-life test case exposing this problem, they are possibly the only ones, who could verify a potential fix for the problem.

So, while it is great that they identified the problem and wrote a thorough blog piece about it, the only thing we learn from this is, that in Go 1.9 there was a latency issue every 2 minutes with their style of application/heap usage. Unfortunately, we don't know whether this problem was already addressed in later Go versions, and if not, whether there should be a way to control the automatic gc intervals to address this.

xyzzyz | karma 10696 | avg karma 2.87 2020-02-04 19:04:14+00:00 | [–] similar comments

You haven't read the post carefully. Their garbage collection in Go was spiking every 2 minutes precisely because they were doing too few allocations to have it run more often.

staticassertion | karma 12534 | avg karma 3.12 2020-02-04 19:09:25 | [–] similar comments

A) They had spent a lot of time optimizing the Go service

B) They weren't allocating a lot, and Go was enforcing a GC sweep every 2 minutes, and it was spending a lot of time on their LRU cache. To "reduce allocations" they had to cut their cache down, which negatively impacted latency.

nabla9 | karma 38285 | avg karma 4.67 2020-02-04 19:17:41 | [–] similar comments

I wonder if they attempted manual memory allocation in Go?

In many languages with GC you can actually do manual memory management relatively easily with few helper functions. You write your own allocate() and free() functions/methods. When you allocate, you check the free list first, if nothing is available, you do normal allocation. When you call free you add the object into a free list. If you memory management leaks, it triggers GC.

Usually you need to do that kind of stuff to only in few places and few data structures to cut GC 90%.

qidydl | karma 9 | avg karma 1.5 2020-02-04 19:44:50 | [–] similar comments

They addressed this in the article:

> These latency spikes definitely smelled like garbage collection performance impact, but we had written the Go code very efficiently and had very few allocations. We were not creating a lot of garbage.

The problem was due to the GC scanning all of their allocated memory and taking a long time to do so, regardless of it all being necessary and valid memory usage.

h2odragon | karma 15145 | avg karma 2.39 2020-02-04 18:58:10+00:00 | [–] similar comments

Excellent write up, and effective argument for Rust in this application and others. My cynical side sums it up as:

"Go sucked for us because we refused to own our tooling and make a special allocator for this service. Switching to Rust forced us to do that, and life got better"

staticassertion | karma 12534 | avg karma 3.12 2020-02-04 19:01:54+00:00 | [–] similar comments

I'm confused. Build a special allocator for Go you mean? That feels like going well beyond typical "own your tooling".

h2odragon | karma 15145 | avg karma 2.39 2020-02-04 23:46:35 | [–] similar comments

I'm outdated. I used to have 4 different python interpreter builds, for different purposes, where the modern world would be using lua as a glue language. I had nothing like the scale, staff, or budget of Discord; all I had was need and tools that could bend to fill it.

I think this is a great write up of why they chose a different tool. I don't say it was the wrong decision, they make that argument pretty well too. I'm still surprised that either Go isn't malleable enough to have bent around the need, or they didn't feel it worth more effort than parameter tweaking to bend it so.

monocasa | karma 27236 | avg karma 2.94 2020-02-04 19:54:55+00:00 | [–] similar comments

They were already not allocating, they were just stuck with a GC cycle that'd scan, not find any garbage, and scan again in two minutes.

thedance | karma 613 | avg karma 1.32 2020-02-04 18:59:45 | [–] similar comments

These kinds of posts would be much more interesting if they discussed alternatives considered and rejected. For example why did they choose Rust over C++?

therockhead | karma 547 | avg karma 3.04 2020-02-04 19:35:54 | [–] similar comments

The article mentioned that they have already used Rust successfully in house, so when you consider that Rust is inherently safer than C++, it seems like they picked the right language.

The_rationalist | karma -7 | avg karma -0.0 2020-02-04 19:54:30+00:00 | [–] similar comments

The most pressing undiscussed alternative is: why didn't they update their 3 years old Go version yet had the double standard of using rust nightly... This blog post is a scam and their only reason to use rust should be assumed: it's because they wanted to.

thedance | karma 613 | avg karma 1.32 2020-02-04 19:55:35+00:00 | [–] similar comments

That's exactly the trolling I was looking for :-) . It just reads like they thought it would be cool to use Rust, so they did, which is fine.

_bxg1 | karma 112 | avg karma 0.03 2020-02-04 12:59:57 | [–] similar comments

It's always good to see a case-study/anecdote, but nothing in here is surprising. It also doesn't really invalidate Go in any way.

Rust is faster than Go. People use Go, like any other technology, when the tradeoffs between developer iteration/throughput/latency/etc. make sense. When those cease to make sense, a hot path gets converted down to something more efficient. This is the natural way of things.

kerkeslager | karma 10034 | avg karma 2.74 2020-02-04 19:53:53+00:00 | [–] similar comments

> It's always good to see a case-study/anecdote, but nothing in here is surprising. It also doesn't really invalidate Go in any way.

Well, sure, because categorizing languages as "valid/invalid" doesn't make any sense.

But it does show yet another example of how designing a language to solve Google's fairly-unique problems doesn't result in a general-purpose language suitable for solving most people's problems.

kikimora | karma 107 | avg karma 0.78 2020-02-04 20:26:23+00:00 | [–] similar comments

Long GC pauses caused by large collections/caches are decade long problem with no real wide spread solution so far. With Java and .NET you can resort to off-heap data. Not sure if this is possible with Go.

kerkeslager | karma 10034 | avg karma 2.74 2020-02-04 14:41:48 | [–] similar comments

> Long GC pauses caused by large collections/caches are decade long problem with no real wide spread solution so far.

This is arguable, but the fact is that "large collections/caches" isn't Discord's situation.

dnautics | karma 17907 | avg karma 2.14 2020-02-04 21:23:46+00:00 | [–] similar comments

Erlang's basically solved it (and, arguably, solved it decades ago); relevant as discord uses erlang VM in places.

bsder | karma 16587 | avg karma 1.93 2020-02-04 15:51:17 | [–] similar comments

Erlang "solved" the problem by breaking having lots of little heaps so a GC can blast through the entire heap extremely quickly.

Would that actually work in this instance? It seems like that LRU cache they're talking about is kind of large.

kerkeslager | karma 10034 | avg karma 2.74 2020-02-04 15:58:28 | [–] similar comments

> Would that actually work in this instance? It seems like that LRU cache they're talking about is kind of large.

I can't say for sure without knowing what the contents of that heap is, but I suspect that yes, it would work.

However, the reason the heaps are so small is that they're each a lightweight thread, and in Erlang, spinning up new threads is a way of life. It would be hard to overstate what a fundamentally different architecture this is.

throwaway34241 | karma 548 | avg karma 1.67 2020-02-05 00:47:00+00:00 | [–] similar comments

Yes, it's not particularly difficult with Go either. The default array/list type (slices) can point to unmanaged memory so it's easy to feed unmanaged data directly into most APIs. Also you can directly use off heap objects to support interfaces and pass references to them around just like objects allocated on the Go heap.

Of course there are no generics yet so doing things like re-using a custom hash table implementation will be less convenient.

vogre | karma 185 | avg karma 1.06 2020-02-04 15:56:25 | [–] similar comments

Keeping LRU cache that large with these performance requirements is not a "most people's problem".

Go is actually great to solve most people's problem with web servers, while Rust is better for edge cases.

kerkeslager | karma 10034 | avg karma 2.74 2020-02-04 22:09:34+00:00 | [–] similar comments

> Keeping LRU cache that large with these performance requirements is not a "most people's problem".

Sure, but that's not what I said.

Any program of sufficient complexity will run into at least one critical problem that isn't a "most people's problem". A well-written general-purpose language implementation will have been written in such a way that that problem isn't totally intractable.

> Go is actually great to solve most people's problem with web servers, while Rust is better for edge cases.

Most people's problem with web servers is writing a CRUD app, which is going to be easiest in something like Python/Django/PostGres/Apache. It's not the new shiny, but it includes all the usual wheels so you don't have to reinvent them in the name of "simple". Similar toolsets exist for Ruby/Java/.NET. Give it a few years and similar toolsets will be invented for Go, I'm sure.

majewsky | karma 15323 | avg karma 2.15 2020-02-05 00:08:06+00:00 | [–] similar comments

> Give it a few years and similar toolsets will be invented for Go, I'm sure.

Go 1.0 is nearly 8 years old at this point, and these toolsets have existed for years.

d1zzy | karma 1343 | avg karma 2.22 2020-02-04 22:06:25+00:00 | [–] similar comments

> But it does show yet another example of how designing a language to solve Google's fairly-unique problems doesn't result in a general-purpose language suitable for solving most people's problems.

How much of Google's infrastructure actually runs on Go tho? :)

spenczar5 | karma 4531 | avg karma 6.58 2020-02-04 17:11:03 | [–] similar comments

No, I don't think so. You would need to demonstrate that this application is representative of "most people's problems," which doesn't seem clear to me.

This shows only a single example where Go is not very suitable, but it doesn't prove a general case on its own.

calcifer | karma 3833 | avg karma 5.87 2020-02-04 13:59:44 | [–] similar comments

This is a weirdly defensive comment, fighting against a strawman. The article doesn't claim it's surprising, that "invalidates" Go or that it isn't the "natural way of things".

_bxg1 | karma 112 | avg karma 0.03 2020-02-04 20:13:56+00:00 | [–] similar comments

I'm not pushing back against the article, but against the comments that tend to appear below articles like this. The headline in particular, to someone who doesn't read the article, could be taken as "Discord has decided that Rust is better than Go and here's why", and run with.

spenczar5 | karma 4531 | avg karma 6.58 2020-02-04 23:12:08+00:00 | [–] similar comments

There's a long history of Go and Rust partisans arguing this point. It's unfortunate, but I think it's reasonable to fight that strawman preemptively.

Symmetry | karma 18650 | avg karma 3.07 2020-02-04 14:16:29 | [–] similar comments

I think the way I'd put it is that languages with manual memory management, like Rust, have more scope for optimization than languages that don't. You can just use the gc crate in Rust and have almost the same ease of development but the same performance problems you do in Go.

nicoburns | karma 22847 | avg karma 3.29 2020-02-04 21:40:53+00:00 | [–] similar comments

Notably, it didn't sound like the development in Rust was particularly difficult anyway. That's certainly been my experience in any case.

Symmetry | karma 18650 | avg karma 3.07 2020-02-05 02:21:16+00:00 | [–] similar comments

It also sounded like the developers had already thought carefully about their memory usage patterns and had been optimizing for performance as much as they could within the scope Go allowed them. Personally I've found Rust has a higher cognitive overhead than Go when I'm just banging something out only worrying about correctness but if you're thinking carefully about memory usage patterns in a way you need to to optimize performance there's no penalty.

highfrequency | karma 1287 | avg karma 5.67 2020-02-04 19:02:45+00:00 | [–] similar comments

Curious about their definition of “response time” in the graph at the end. They’re quoting ~20 microseconds so I assume this doesn’t involve network hops? Is this just the CPU time it takes a Read State server to do one update?

jhgg | karma 3220 | avg karma 6.78 2020-02-04 19:15:44 | [–] similar comments

Correct. This is internal time it takes to process the message. Since once a node is "warm" thanks to their large caches, it's mostly in memory operations and queueing for persistence which happens in the background.

Sikul | karma 956 | avg karma 15.17 2020-02-04 19:24:35+00:00 | [–] similar comments

Also worth noting: Most requests to the service have to update many Read States. For instance, when you @everyone in the Minecraft server we have to update over 500,000 Read States.

joseluisq | karma 1349 | avg karma 5.37 2020-02-04 19:03:18 | [–] similar comments

That's why the {blazing-fast} term is becoming popular.

Rust won again.

mangatmodi | karma 193 | avg karma 1.91 2020-02-04 19:03:47+00:00 | [–] similar comments

Why would they switch to rust, rather than upgrading from 3 years old version?

jhgg | karma 3220 | avg karma 6.78 2020-02-04 19:13:09 | [–] similar comments

This blog post perhaps is a bit "after the fact" we had made the switch over mid 2019, and wanted to try out rust as well for services like this, due to adoption elsewhere in the company. Also, after upgrading between 4 golang versions on this service and noticing it didn't materially change performance, we decided to just spend our time on the rewrite (for fun, and latency) and to get a head start into the asynchronous rust ecosystem.

This blog post kinda internally matches our upgrade to std::futures and tokio 0.2, away from futures 0.1.

The_rationalist | karma -7 | avg karma -0.0 2020-02-04 19:23:35+00:00 | [–] similar comments

Out of curiosity, why didn't you choose Kotlin? It can reuse the Java ecosystem which allow you to save tons of money, and give you advanced features and scalability. It is a sexier and more ergonomic language too. And with e.g ZGC, you can have a GC that is fine tunable, and that has very low latency.

By choosing rust you will suffer a great deal of the limitations of it's poor, not production ready, ecosystem. I'm not even talking about the immaturity of the async await support.

therockhead | karma 547 | avg karma 3.04 2020-02-04 19:37:43+00:00 | [–] similar comments

> By choosing rust you will suffer a great deal of the limitations of it's poor, not production ready, ecosystem.

Why do you think that? Seems like Rust is a great choice for this type of high performance work.

ncmncm | karma 13951 | avg karma 1.02 2020-02-04 21:16:53 | [–] similar comments

Rust does best when the number of lines of code that must be parsed in an edit-compile-test loop is small. When the sources that must be parsed get large, coders suffer.

It is doubtful that this will improve, much, without breaking changes to the language. The range of code over which type inference operates, or at least programmers' reliance on it, would need to contract by quite a lot. There would be Complaints.

steveklabnik | karma 91260 | avg karma 5.08 2020-02-04 21:29:23 | [–] similar comments

Type inference only operates within function bodies. It's also not the thing that causes compilation to be slow.

jhgg | karma 3220 | avg karma 6.78 2020-02-04 19:50:24 | [–] similar comments

More people at our company know Rust than Kotlin. It's used across multiple teams (from our game SDK, native encoder/capture pipeline, chat infra team for Erlang rust NIFs.) where as Kotlin is only used by our android team.

We are willing to adopt early technologies we think are promising, and contribute to or fund projects to continue to advance the ecosystem. Yes, this means the path less traveled, but in the case of rust (and in the past Elixir, and even React Native) we think the trade offs are worth it.

Also the tokio team uses Discord for their chat stuff, so it's nice to pop in to be able to ask for and offer help.

mping | karma 964 | avg karma 1.75 2020-02-04 19:57:45+00:00 | [–] similar comments

Ecosystem is a real problem, but motivated engineers will make it work no matter what. I mean, banks run on COBOL.

Besides, I am willing to bet idiomatic rust is 2x-10x faster than idiomatic kotlin.

codehalo | karma 232 | avg karma 1.08 2020-02-04 20:57:30+00:00 | [–] similar comments

Then that motivation should be applicable to Go as well.

kaoD | karma 3508 | avg karma 2.35 2020-02-04 20:09:02+00:00 | [–] similar comments

> you can have a GC

But can I not have it?

qw | karma 1741 | avg karma 3.68 2020-02-05 00:31:10 | [–] similar comments

Yes. You can have a no-op collector

jen20 | karma 5666 | avg karma 2.04 2020-02-04 22:53:10+00:00 | [–] similar comments

> I'm not even talking about the immaturity of the async await support.

The one that is 100% more mature than the Java async/await support?

The_rationalist | karma -7 | avg karma -0.0 2020-02-04 23:37:32+00:00 | [–] similar comments

Kotlin has coroutines, actors, lazy and reactive programming.

graphememes | karma 548 | avg karma 1.66 2020-02-04 19:46:00+00:00 | [–] similar comments

Shiny toy syndrome, basically.

The_rationalist | karma -7 | avg karma -0.0 2020-02-04 19:58:20+00:00 | [–] similar comments

Also, after upgrading between 4 golang versions on this service and noticing it didn't materially change performance, we decided to just spend our time on the rewrite So you basically don't read release changelogs of the slow iterating language like go, yet have the double standard of keeping up with rust nightly? Because Go 1.12 explicitly mention performance improvements to it's GC.

You just wanted to do it "for fun" (but is rust and it' s immature ecosystem with all it's issues that fun?). This blog is dishonest and show amateurism at discord.

BTW it's not too late, prove us right or wrong by benchmarcking latest Go GC vs rust.

crystaldev | karma 351 | avg karma 4.68 2020-02-04 20:13:17+00:00 | [–] similar comments

> to get a head start into the asynchronous rust ecosystem.

Sounds resume-driven.

typical182 | karma 628 | avg karma 7.95 2020-02-04 21:12:03+00:00 | [–] similar comments

Do you have any load tests or synthetic benchmarks that are still capable of producing this?

It would be interesting to see what a more modern Go would do given there have been a bunch of tail latency GC improvements since your older 1.9 Go version... and in an ideal world, it would be nice to file an issue on the tracker if you were still seeing this.

(Maybe that ends up later helping another one of your Go services, or maybe it just helps the community, or maybe it’s a topic for another interesting blog...).

In any event, thanks for taking the time to write up and share this one.

jamra | karma 1096 | avg karma 2.21 2020-02-05 03:35:16+00:00 | [–] similar comments

This comment doesn’t make sense. Didn’t rust not have async back then?

The timelines don’t appear to fit

gmjosack | karma 895 | avg karma 5.63 2020-02-05 03:46:26+00:00 | [–] similar comments

Tokio and Futures have existed since 2016. I worked on the initial loqui implementation that powers rpc at Discord in raw Futures/Tokio in late 2018. async/await was also on nightly back then. Jesse finished it and migrated it to async/await and later used that as the basis for Read States. The timelines make perfect sense.

jamra | karma 1096 | avg karma 2.21 2020-02-05 04:15:04+00:00 | [–] similar comments

Alright. I stand corrected. I used Go back when it was beta, but it never stuck with me. I still like it for small script like tasks. I also happen to think Rust is amazing. The learning curve kept me away for a while.

It would still be interesting to see them post how go > 1.12 would do since it no longer has stop the world garbage collection.

donatj | karma 17336 | avg karma 4.57 2020-02-04 19:13:36 | [–] similar comments

I feel like from the definition of the service, the entire thing could easily be replaced with a Redis cluster.

Sikul | karma 956 | avg karma 15.17 2020-02-04 19:19:23+00:00 | [–] similar comments

We originally cached this data with a Redis cluster but we hit scaling issues. The Read States service only exists because Redis had issues.

donatj | karma 17336 | avg karma 4.57 2020-02-04 19:20:54+00:00 | [–] similar comments

Hah, well now I feel like a dufus. Good info!

Sikul | karma 956 | avg karma 15.17 2020-02-04 19:32:28+00:00 | [–] similar comments

No worries, we could have mentioned that in the post as part of the service history :)

hopia | karma 254 | avg karma 1.12 2020-02-05 02:29:42+00:00 | [–] similar comments

We use Elixir too and ironically, Redis became our bottleneck as well. What a useless dependency it is when running on BEAM.

_ph_ | karma 9798 | avg karma 2.44 2020-02-04 19:13:53 | [–] similar comments

If you have a problem at hand which does not really benefit from the presence of a garbage collector, switching to an implementation without a garbage collector has quite a potential to be at least somewhat faster. I remember myself to run onto this time trigger for garbage collection long in the past - though I don't remember why and mostly forgot about ever since until I read this article. As also written in the article, even if there are no allocations going on, Go forces a gc every two minutes, it is set here: https://golang.org/src/runtime/proc.go#L4268

The idea for this is (if I remember correctly) to be able to return unused memory to the OS. As returning memory requires a gc to run, it is forced in time intervals. I am a bit surprised that they didn't contact the corresponding Go developers, as they seem to be interested in practical use cases where the gc doesn't perform well. Besides that newer Go releases improved the gc performance, I am a bit surprised that they didn't just increase this time interval to an arbitrary large number and checked, if their issues went away.

KMag | karma 6889 | avg karma 2.73 2020-02-04 20:32:18+00:00 | [–] similar comments

Not only is there good potential for a speed improvement, but languages built around the assumption of pervasive garbage collection tend not to have good language constructs to support manual memory management.

To be fair, most languages without GCs also don't have good language constructs to support manual memory management. If you're going to make wide use of manual memory management, you should think very carefully about how the language and ecosystem you're using help or hinder your manual memory management.

geodel | karma 6768 | avg karma 2.18 2020-02-04 19:15:18 | [–] similar comments

Makes sense write most efficient stuff for in-house and give resource hog Electron apps to users.

jrockway | karma 72069 | avg karma 3.74 2020-02-04 19:25:39+00:00 | [–] similar comments

Discord pays for their servers, but not for their users's computers.

Hamuko | karma 9515 | avg karma 2.14 2020-02-04 20:10:13+00:00 | [–] similar comments

That's fine as long as you ignore the fact that the users are the customers.

nemo1618 | karma 3894 | avg karma 4.66 2020-02-04 19:22:26+00:00 | [–] similar comments

I wonder if it would be feasible to rewrite the LRU cache (either fully or in part) in a way that does not require the GC to scan the entire cache.

kerkeslager | karma 10034 | avg karma 2.74 2020-02-04 20:06:43 | [–] similar comments

Yes, it's possible: that's generational garbage collection. But last I heard, Google decided writing a modern GC was too complicated.

They're probably right, because Google doesn't need it. But for everyone else who decided to use a language designed to solve Google's fairly-unique problems as if it were a general-purpose language: that kind of sucks, doesn't it?

terminaljunkid | karma 29 | avg karma 0.28 2020-02-05 09:36:17+00:00 | [–] similar comments

The fact seems to be that the go team is not so well funded as it seems. Go is not Google's language in the sense C# is MS' language or Java was Sun language.

ssoroka | karma 28 | avg karma 2.15 2020-02-04 22:00:45 | [–] similar comments

Check out this post, that describes exactly that process. https://blog.gopheracademy.com/advent-2018/avoid-gc-overhead...

the-alchemist | karma 443 | avg karma 2.93 2020-02-04 19:23:47+00:00 | [–] similar comments

Looks like the big challenge is managing a large, LRU cache, which tends to be a difficult problem for GC runtimes. I bet the JVM, with its myriad tunable GC algorithms, would perform better, especially Shenandoah and, of course, the Azul C4.

The JVM world tends to solve this problem by using off-heap caches. See Apache Ignite [0] or Ehcache [1].

I can't speak for how their Rust cache manages memory, but the thing to be careful of in non-GC runtimes (especially non-copying GC) is memory fragmentation.

Its worth mentioning that the Dgraph folks wrote a better Go cache [2] once they hit the limits of the usual Go caches.

From a purely architectural perspective, I would try to put cacheable material in something like memcache or redis, or one of the many distributed caches out there. But it might not be an option.

It's worth mentioning that Apache Cassandra itself uses an off-heap cache.

[0]: https://ignite.apache.org/arch/durablememory.html [1]: https://www.ehcache.org/documentation/2.8/get-started/storag... [2]: https://blog.dgraph.io/post/introducing-ristretto-high-perf-...

pshc | karma 1944 | avg karma 2.89 2020-02-04 19:48:45+00:00 | [–] similar comments

Great comment and thanks for the reading material.

Now I'm wondering if there's a Rust library for a generational copying arena--one that compacts strings/blobs over time.

steveklabnik | karma 91260 | avg karma 5.08 2020-02-04 19:50:00 | [–] similar comments

Generational arenas yes, but copying, I'm not aware of one. It's very hard to get the semantics correct, since you can't auto-re-write pointers/indices.

ithkuil | karma 7515 | avg karma 2.06 2020-02-04 20:34:56+00:00 | [–] similar comments

Perhaps such a library could help you record the location of the variables that contain pointers to the strings and keep that pointer up to data as the ownership of the string moves from variable to variable?

I'm other words, doing some of the work a moving compacting collector would do during compaction but continuously during normal program execution.

steveklabnik | karma 91260 | avg karma 5.08 2020-02-04 20:41:57 | [–] similar comments

There's no way to hook into the move, so I don't see how it would be possible, or at least, not with techniques similar to compacting GCs.

masklinn | karma 65147 | avg karma 3.36 2020-02-04 20:47:42+00:00 | [–] similar comments

Maybe by reifying the indirection? The compacting arena would hand out smart pointers which would either always bounce through something (to get from an indentity to the actual memory location, at a cost) or it'd keep track and patch the pointers it handed out somehow.

Possibly half and half, I don't remember what language it was (possibly obj-c?) which would hand out pointers, and on needing to move the allocations it'd transform the existing site into a "redirection table". Accessing pointers would check if they were being redirected, and update themselves to the new location if necessary.

edit: might have been the global refcount table? Not sure.

steveklabnik | karma 91260 | avg karma 5.08 2020-02-04 20:54:41 | [–] similar comments

Yeah so I was vaguely wondering about some sort of double indirection; the structure keeps track of "this is a pointer I've handed out", those pointers point into that, which then points into the main structure.

I have no idea if this actually a good idea, seems like you get rid of a lot of the cache locality advantages.

masklinn | karma 65147 | avg karma 3.36 2020-02-04 21:00:41 | [–] similar comments

I don't know that the cache locality would be a big issue (your indirection table would be a small-ish array), however you'd eat the cost of doubling the indirections, each pointer access would be two of them.

mypalmike | karma 2344 | avg karma 2.16 2020-02-04 22:29:17 | [–] similar comments

This sounds a lot like classic MacOS (pre-Darwin) memory allocation. You were allocated a handle, which you called Lock on to get a real pointer. After accessing the memory, you called Unlock to release it. There was definitely a performance hit for that indirection.

cesarb | karma 14181 | avg karma 3.67 2020-02-04 23:04:15+00:00 | [–] similar comments

It's the same on classic Windows (pre-Windows 95) memory allocation. GlobalAlloc with GMEM_MOVEABLE or LocalAlloc with LMEM_MOVEABLE returned a handle, which you called GlobalLock or LocalLock on to get a real pointer. After accessing the memory, you called GlobalUnlock or LocalUnlock to release it. Of course, this being Microsoft, you can still call these functions inherited from 16-bit Windows even on today's 64-bit Windows. (See Raymond Chen's "A history of GlobalLock" at https://devblogs.microsoft.com/oldnewthing/20041104-00/?p=37...).

e_y_ | karma 354 | avg karma 1.93 2020-02-05 09:14:52+00:00 | [–] similar comments

On top of the cost of the extra pointer lookup, you also run into cache coherency issues when dealing with threading. So then you need to use atomic ops or locks or cache flushing which makes it even more expensive.

Rust is better suited to deal with it since there's a similar issue with refcounting across threads, so you might be able to get away with doing it for objects that are exclusive to one thread.

konstmonst | karma 85 | avg karma 1.49 2020-02-05 15:08:49+00:00 | [–] similar comments

I would give out handles and have a Guard Object, that allows you to get smart pointers from handles as long as Guard Object is in scope. Then when Guard Object is out of scope, the smart pointers would get invalidated.

Jweb_Guru | karma 2749 | avg karma 2.24 2020-02-05 07:46:59+00:00 | [–] similar comments

It would be possible (and not that hard) for Copy types, but you'd need some custom derive traits if you had any pointers to managed objects.

dochtman | karma 8267 | avg karma 10.48 2020-02-04 21:07:50+00:00 | [–] similar comments

One the one hand, yes. On the other hand, all of this sounds much more complex and fragile. This seems like an important point to me:

"Remarkably, we had only put very basic thought into optimization as the Rust version was written. Even with just basic optimization, Rust was able to outperform the hyper hand-tuned Go version."

novok | karma 6611 | avg karma 2.4 2020-02-04 21:42:59+00:00 | [–] similar comments

A C app will tend to outperform a Java or Golang app by 3x, so it isn't too surprising.

georgebarnett | karma 1336 | avg karma 4.48 2020-02-04 21:58:34 | [–] similar comments

Could you please provide a source for this?

Java is very fast and 3X slower is a pretty wild claim.

squarefoot | karma 13102 | avg karma 2.95 2020-02-04 22:24:04+00:00 | [–] similar comments

3x might be a bit too much today, but it's definitely slower than C. Also to be considered is the VM overhead, not just the executed code.

Here are some benchmarks; I'll leave to the experts out there to confirm or dismiss them.

https://benchmarksgame-team.pages.debian.net/benchmarksgame/...

robocat | karma 11778 | avg karma 2.08 2020-02-04 23:13:39 | [–] similar comments

However the memory usage difference is astonishing for some of those benchmarks - using 1000x more memory is only acceptable for some situations.

iainmerrick | karma 5858 | avg karma 2.53 2020-02-05 09:33:21 | [–] similar comments

This may be overly cynical, but don’t you have it backwards? 1000x greater memory usage is acceptable for most applications.

There are only a few applications for which it isn’t acceptable.

Just look at all the massive apps built on Electron. People wouldn’t do that if it wasn’t effective.

wool_gather | karma 2993 | avg karma 2.31 2020-02-05 16:47:51+00:00 | [–] similar comments

Wouldn't you say there's a difference between "effective" and "getting away with it"? If non-technical users see that their daily computing lives are made more complicated (because of lowered performance) by having n Electron apps running at the same time, they may not understand the reasons, but they will certainly choose a different solution that has the features they need, where available.

iainmerrick | karma 5858 | avg karma 2.53 2020-02-05 21:15:40+00:00 | [–] similar comments

Wouldn't you say there's a difference between "effective" and "getting away with it"?

I used to think that but now I’m not so sure. :(

igouy | karma 1139 | avg karma 0.41 2020-02-05 17:34:09+00:00 | [–] similar comments

— default JVM allocation ~35 MB looks astonishing for tiny programs that don't allocate memory

— memory usage is less different for tiny programs that do allocate memory: reverse-complement, k-nucleotide, binary-trees, mandelbrot, regex-redux

squarefoot | karma 13102 | avg karma 2.95 2020-02-05 19:00:11+00:00 | [–] similar comments

Agreed, and ironically the most widely used Java platform (Android), despite its VM optimizations, is the one which would benefit the most from running only native code.

I mean, those 1GB RAM 7 years old slow as molasses phones getting dust into drawers or being littered into landfills would scream if they didn't have to run everything behind a VM.

yc12340 | karma 234 | avg karma 2.27 2020-02-06 03:19:44+00:00 | [–] similar comments

Make no mistake — Android isn't memory hungry because of Java alone. Android 4 used to comfortably fit in 1Gb of RAM, but Android 10 no longer can run on such devices. Both old and new versions use JVM, but newer Android has a lot of "helpful" system services, such as "Usage Statistics" service [1] and "Autofill Servide" [2].

Google's spyware is becoming more invasive and thus more memory-hungry.

1: https://developer.android.com/reference/android/app/usage/Us...

2: https://developer.android.com/reference/android/service/auto...

anarazel | karma 5793 | avg karma 3.95 2020-02-05 01:19:45+00:00 | [–] similar comments

Really depends on the domain. There's some things that are a lot easier to make scale up in a language with a great concurrent gc, because that makes writing some lock free data structures quite fundamentally easier (no complicated memory reclamation logic, trivial to avoid ABA problems).

pkolaczk | karma 2452 | avg karma 1.99 2020-02-05 07:36:18+00:00 | [–] similar comments

GC makes it easier to write, but not necessarily better. Modern state-of-the-art Java GCs operate a global heap, so you often pay for what you do not use. In languages like Rust or C++ your can build small locally GCed heaps, where you can limit GC overhead to just a few particular structures that need GC, not everything. Therefore other structures like caches don't affect GC pause times.

And the "hardness" of writing lockless structures is strongly offset by libraries, so unless you're doing something very exotic, it is rarely a real problem.

BenFrantzDale | karma 346 | avg karma 2.14 2020-02-06 02:25:38+00:00 | [–] similar comments

“GC makes it easier to write.” I disagree. GC means I don’t get deterministic destruction which means I can’t use RAII properly.

pkolaczk | karma 2452 | avg karma 1.99 2020-02-06 08:27:29+00:00 | [–] similar comments

The post was about writing lockless structures. With lockless RAII is of no use. See ABA problem. RAII is awesome in other places.

kllrnohj | karma 11052 | avg karma 3.34 2020-02-05 06:08:27+00:00 | [–] similar comments

> 3x might be a bit too much today, but it's definitely slower than C.

If anything the gap is increasing not shrinking. JVM is terrible at memory access patterns due to the design of the language, and designing for memory is increasingly critical for maximum performance on modern systems. All the clever JIT'ing in the world can't save you from the constant pointer chasing, poor cache locality, and poor prefetching.

The gap won't shrink until Java has value types. Which is on the roadmap, yes, but still doesn't exist just yet.

The problem with those benchmarks is if you look at the Java code you'll see it's highly non-idiomatic. Almost no classes or allocations. They almost all exclusively use primitives and raw arrays. Even then it still doesn't match the performance on average of the C (or similar) versions, but if you add the real-world structure you'd find in any substantial project that performance drops off.

wool_gather | karma 2993 | avg karma 2.31 2020-02-05 16:43:38+00:00 | [–] similar comments

When you say "value types" in this context, you mean "copy-only types"?

kllrnohj | karma 11052 | avg karma 3.34 2020-02-05 17:05:21 | [–] similar comments

I mean "value types" https://www.jesperdj.com/2015/10/04/project-valhalla-value-t... & http://cr.openjdk.java.net/~jrose/values/values-0.html

I don't know what you mean by "copy-only types." I'm not finding any reference to that terminology in the context of language design.

wool_gather | karma 2993 | avg karma 2.31 2020-02-06 16:22:24+00:00 | [–] similar comments

Ah, thanks for the link; I wasn't sure what it meant in the context of Java, since it's possible to get value semantics using a class.

Sorry about the confusion. I meant for the quotes around "copy-only" to indicate that it wasn't really a standard term, but I marked "value types" the same way, so that didn't really work. By "copy-only" I meant something you couldn't have more than one reference to: every name (variable) to which you assign the data would have its own independent copy.

kllrnohj | karma 11052 | avg karma 3.34 2020-02-06 17:57:31+00:00 | [–] similar comments

> By "copy-only" I meant something you couldn't have more than one reference to: every name (variable) to which you assign the data would have its own independent copy.

That's not really a requirement of value types, no. C# has value types (structs) and you can have references to them as well (ref & out params).

In general though yes it would be typically copied around, same as an 'int' is. Particularly if Java doesn't add something equivalent to ref/out params.

wool_gather | karma 2993 | avg karma 2.31 2020-02-07 16:29:37+00:00 | [–] similar comments

I know, that's why I asked for clarification originally. :) Anyways, I appreciate the details.

igouy | karma 1139 | avg karma 0.41 2020-02-07 18:47:15+00:00 | [–] similar comments

> Almost no classes or allocations.

Plenty of `inner static classes`. Where's the up-to-date comparison showing that `static` makes a performance difference for those tiny programs?

Plenty of TreeNode allocations.

----

> The problem with …

The problem with saying "in any substantial project that performance drops off" is when we don't show any evidence.

marcosdumay | karma 27273 | avg karma 1.67 2020-02-04 22:28:13+00:00 | [–] similar comments

Those people have a really good claim to have the most optimized choice on each language. They've found Java to be 2 to 3 times slower than C and Rust (with much slower outliers).

https://benchmarksgame-team.pages.debian.net/benchmarksgame/...

On the real world, you won't get things as optimized in higher level languages, because optimized code looks completely unidiomatic. A 3x speedup from Java is a pretty normal claim.

bcrosby95 | karma 9525 | avg karma 3.64 2020-02-04 23:09:56+00:00 | [–] similar comments

Speaking of, I wish there were an "idiomatic code benchmarks game". Some of us want to compare language speed for common use cases vs trying to squeeze every last piece of performance from it.

totalperspectiv | karma 922 | avg karma 4.04 2020-02-05 00:06:30+00:00 | [–] similar comments

Check out this: https://github.com/frol/completely-unscientific-benchmarks

tormeh | karma 7507 | avg karma 2.28 2020-02-05 00:34:35+00:00 | [–] similar comments

D, Nim and Crystal all do very well on all metrics. Author finds Rust pretty close but not as maintainable. Interesting that the top 3 (performance close to C++, but more maintainable) all are niche languages that haven't really broken into the mainstream.

marcosdumay | karma 27273 | avg karma 1.67 2020-02-05 00:52:42+00:00 | [–] similar comments

Rust won't get a maintainability prize for short programs with simple algorithms.

It's Haskell-like features only help on large and complex programs.

vnorilo | karma 1889 | avg karma 3.77 2020-02-05 01:27:55+00:00 | [–] similar comments

They do help library writers, and therefore turn large and complex programs into shorter programs.

corford | karma 2276 | avg karma 2.01 2020-02-05 01:18:51+00:00 | [–] similar comments

I really wish Intel or MS or someone would fund D so it could give Go and Rust a run for their money. It's as fast (or faster), expressive, powerful and, subjectively, easier to pick up and code in than Rust. It just needs some backers with muscle.

jholman | karma 3913 | avg karma 3.03 2020-02-05 04:44:25 | [–] similar comments

Maybe some big FAANG company. Start at the beginning of the acronym, I guess. I wonder if anyone could persuade anyone at Facebook to do a little bit of D professionally. I bet if even one serious Facebook engineer made a serious effort to use D, its adoption woes would be over.

vips7L | karma 2543 | avg karma 1.69 2020-02-05 05:22:42 | [–] similar comments

Walter Bright the author of D worked at Facebook writing a fast preprocessor for C/C++ in D.

jholman | karma 3913 | avg karma 3.03 2020-02-05 05:26:56 | [–] similar comments

Alexandrescu, too.

They also talked about it pretty frequently.

Maybe ggp's claim that D just needs a heavy hitter backing it might be misplaced.

vips7L | karma 2543 | avg karma 1.69 2020-02-05 18:46:05+00:00 | [–] similar comments

I don't think it's misplaced. Facebook wasn't really backing D by hiring them. And it doesn't seem like that specific project is active anymore.

dom96 | karma 6422 | avg karma 3.49 2020-02-05 09:48:06+00:00 | [–] similar comments

Walter too? AFAIK it was just Andrei.

Cyph0n | karma 8463 | avg karma 2.62 2020-02-05 05:23:43+00:00 | [–] similar comments

I thought Facebook used to “unofficially” support D? Or am I mixing languages up?

Right now, I think eBay is one of the big companies that uses D.

Edit: Sibling comment beat me to it.

sophiebits | karma 9979 | avg karma 58.7 2020-02-05 07:17:24+00:00 | [–] similar comments

Facebook is unlikely to move to D at this point and already has multiple serious projects in Rust (that they’re happy about, AFAIK).

nnq | karma 4559 | avg karma 2.61 2020-02-05 07:18:20+00:00 | [–] similar comments

You probably have Swift in the same niche... and more elegant, not completely ignoring the last few decades of language research etc. If you want something more minimalistic there's Zig. D is just "C++ + hindsight", nothing special, only extra fragmentation of dev ecosystem by bringing in another language.

Ofc, Apple is not MS, so Swift developer experience and docs kind of suck (if you're not in the "iOS bubble" where it's nice and friendly), and multiplatform dev experience especially sucks even worse...

And "easier to pick up and code in" might not necessarily be an advantage for a systems language - better start slowly putting the puzzle together in your head, and start banging up code others need to see only after you've internalized the language and its advanced features and when (not) to use them! It helps everyone in the long run. This is one reason why I'd bet on Rust!

wool_gather | karma 2993 | avg karma 2.31 2020-02-05 16:38:21 | [–] similar comments

"completely ignoring the last few decades"

Well, for fairness, D is quite a bit older than Swift. (It's nearly as much older than Swift as it is younger than C++!) But what do you think pushes Swift out of the "C++ with hindsight" basket?

pjmlp | karma 109153 | avg karma 1.76 2020-02-05 14:47:57+00:00 | [–] similar comments

That is what MS is doing with C#/F# and .NET Native/Xamarin AOT/CoreRT, with the experience taken from Midori.

So I doubt they would sponsor D.

marcosdumay | karma 27273 | avg karma 1.67 2020-02-05 17:30:45 | [–] similar comments

C# is an attempt of making Java good, F# is an attempt of making a subset of Haskell popular. .Net Native/Xamarin/CoreRT are UI frameworks. There is nothing there that would compete with C++.

I don't think MS has any interest in improving C++ (look at their compiler). But that's not because of competing activities.

pjmlp | karma 109153 | avg karma 1.76 2020-02-05 19:02:05+00:00 | [–] similar comments

Except the lessons learned from Midori and .NET Native on UWP.

Visual C++ is the best commercial implementation of C++ compilers, better not be caught using xlc, aCC, TI, IAR, icc and plenty of other commercial offerings.

If C++ has span, span_view, modules, co-routines, core guidelines, lifetime profile static analyser, is it in large part to work started and heavily contributed by Microsoft on ISO, and their collaboration with Google.

As for competing with C++, it is quite clear, specially when comparing the actual software development landscape with the 90's, that C++ has lost the app development war.

Nowadays across all major consumer OSes it has been relegated to the drivers and low level OS services layer like visual compositor, graphics engine, GPGPU binding libraries.

Ironically, from those mainstream OSes, Microsoft is the only one that still cares to provide two UI frameworks directly callable from C++.

Which most Windows devs end up ignoring in favour of the .NET bindings, as Kenny Kerr mentions in one of his talks/blog posts.

Back to the D issue, Azure IoT makes use of C# and Rust, and there is Verona at MSR as well, so as much I would like to see them spend some effort on D, I don't see it happening.

Kuinox | karma 2251 | avg karma 2.39 2020-02-05 19:53:34+00:00 | [–] similar comments

CoreRT is an UI frameworks, what ?

systems | karma 2030 | avg karma 2.25 2020-02-05 02:27:02+00:00 | [–] similar comments

I strongly believe, that people should consider the ecosystem part of maintainability risk and claim

Rust have by far the better ecosystem compared to those 3

no_wizard | karma 8769 | avg karma 3.61 2020-02-05 05:00:12+00:00 | [–] similar comments

That’s highly subjective claim to bare without evidence. All three languages are actively maintained and are growing.

My refute it simply: Rusts web development story isn’t out of the box clean like Crystal Lang’s which ships with an HTML language out of the box. So it could be categorized as a poor choice in comparison to Crystal

nindalf | karma 16080 | avg karma 6.1 2020-02-05 10:27:16 | [–] similar comments

> ships with an HTML language

Did you mean HTTP server? If so, there are at least 3 good ones in Rust that are only a `cargo add` away. If you've already taken the trouble to set up a toolchain for a new language, surely a single line isn't asking too much.

igouy | karma 1139 | avg karma 0.41 2020-02-05 17:17:19+00:00 | [–] similar comments

— Expressiveness "keep in mind that this is a subjective metric based on the author's experience!"

— Maintenance Complexity "keep in mind that this is a subjective metric based on the author's experience!"

UtherII | karma 14 | avg karma 0.93 2020-02-05 09:34:59+00:00 | [–] similar comments

The problem would be to get a definition of idiomatic code.

Would it be fair to accept an optimization on a language and refuse it on another because it is not idiomatic ?

igouy | karma 1139 | avg karma 0.41 2020-02-05 17:04:50 | [–] similar comments

>> I wish there were an "idiomatic code benchmarks game"

Help make one:

— contribute programs that you consider to be "idiomatic"

https://salsa.debian.org/benchmarksgame-team/benchmarksgame/...

— use those measurement and web scripts to publish your own "common use cases" "idiomatic code" benchmarks game

----

>> compare language speed for common use cases vs trying to squeeze every last piece of performance

— please be clear about why you wish to compare the speed of programs written as if speed did not matter

----

??? What do you think is not "idiomatic" about programs such as:

https://benchmarksgame-team.pages.debian.net/benchmarksgame/...

cryptos | karma 1628 | avg karma 3.0 2020-02-05 20:03:20+00:00 | [–] similar comments

There is a lot more than the language/compiler what influences the results, but at least these benchmarks are closer to real world than solving math puzzles in micro benchmarks.

https://www.techempower.com/benchmarks/

gameswithgo | karma 10455 | avg karma 3.71 2020-02-04 22:32:50+00:00 | [–] similar comments

It depends greatly on the problem domain. The difference might be near zero, or you might be able to get ~16x better performance (using say, AVX-512 intrinsics). Then again, is intrinsics really C? Not really, but you can do it. What if you have to abandon using classes when you want to, in order to get the memory layout you want in Java, are you still using Java?

the_duke | karma 16666 | avg karma 6.89 2020-02-05 01:27:01 | [–] similar comments

VMs with JIT like the JVM are only ever really fast/competitive with C in small numerical micro-benchmarks where the code can be hyper-optimized.

Most code will be considerably slower due to a lot of factors.

Java in particular is a very pointer-heavy language, made up of pointers to pointers to pointers everywhere, which is really bad for our modern systems that often are much more memory latency than CPU constrained.

A factor of 2-4x to languages like C++ or Rust for most code seems plausible (and even low) unless the limiting factor is external, like network or file system IO.

thu2111 | karma 811 | avg karma 0.34 2020-02-05 09:01:39+00:00 | [–] similar comments

This stuff is really hard to pin down though. I've been reading these sorts of debates forever.

It's true that pointer chasing really hurts in some sorts of program and benchmark. For sure. No argument. That's why Project Valhalla exists.

But it's also my view that modern C++ programming gets away with a lot of slow behaviours that people don't really investigate or talk about because they're smeared over the program and thus don't show up in profilers, whereas actually the JVM fixes them everywhere.

C++ programs tend to rely much more heavily on copying large structures around than pointer-heavy programs. This isn't always or even mostly because "value types are fast". It's usually because C++ doesn't have good memory management so resource management and memory layout gets conflated, e.g. std::vector<BigObject>. You can't measure this because the overheads are spread out over the entire program and inlined everywhere, so don't really show up in profiling. For the same reasons C++ programs rely heavily on over-specialised generics where the specialisation isn't actually a perf win but rather a side effect of the desire for automatic resource management, which leads to notorious problems with code bloat and (especially) compile time bloat.

Another source of normally obscured C++ performance issues is the heap. We know malloc is very slow because people so frequently roll their own allocators that the STL supports this behaviour out of the box. But malloc/new is also completely endemic all over C++ codebases. Custom allocators are rare and restricted to very hot paths in very well optimised programs. On the JVM allocation is always so fast it's nearly free, and if you're not actually saturating every core on the machine 100% of the time, allocation effectively is free because all the work is pushed to the spare cores doing GC.

Yet another source of problems is cases where the C++ programmer doesn't or can't actually ensure all data is laid out in memory together because the needed layouts are dynamically changing. In this case a moving GC like in the JVM can yield big cache hit rate wins because the GC will move objects that refer to each other together, even if they were allocated far apart in time. This effect is measurable in modern JVMs where the GC can be disabled:

https://shipilev.net/jvm/anatomy-quarks/11-moving-gc-localit...

And finally some styles of C++ program involve a lot of virtual methods that aren't always used, because e.g. there is a base class that has multiple implementations but in any given run of the program only one base class is used (unit tests vs prod, selected by command line flag etc). JVM can devirtualise these calls and make them free, but C++ compilers usually don't.

On the other hand all these things can be obscured by the fact that C++ these days tends only to be used in codebases where performance is considered important, so C++ devs write performance tuned code by default (or what they think is tuned at least). Whereas higher level languages get used for every kind of program, including the common kind where performance isn't that big of a deal.

barrkel | karma 34063 | avg karma 3.87 2020-02-05 10:42:00 | [–] similar comments

The costs of languages like C++ also get worse the older a program is.

Without global knowledge of memory lifetimes, maintainers make local decisions to copy rather than share.

scott_s | karma 34069 | avg karma 3.96 2020-02-05 16:41:20+00:00 | [–] similar comments

> We know malloc is very slow because people so frequently roll their own allocators that the STL supports this behaviour out of the box. But malloc/new is also completely endemic all over C++ codebases. Custom allocators are rare and restricted to very hot paths in very well optimised programs. On the JVM allocation is always so fast it's nearly free, and if you're not actually saturating every core on the machine 100% of the time, allocation effectively is free because all the work is pushed to the spare cores doing GC.

Allocation in a C++ program is going to be about the same speed as in a Java program. Modern mallocs are doing basically the same thing on the hot-path: bumping the index on a local slab allocator.

dotaheor | karma 31 | avg karma 2.82 2020-02-05 05:38:43+00:00 | [–] similar comments

This really depends on what job a program does and how well the C program is implemented.

danesparza | karma 2471 | avg karma 3.07 2020-02-05 13:35:59 | [–] similar comments

I call utter bullshit especially when dealing with threads. I think you'll spend so much time debugging pointers, the stack and your memory allocations that switching to a more modern language could save you significant debugging time.

But now I sound like a Geico (insurance) commercial. Sorry about that.

_ph_ | karma 9798 | avg karma 2.44 2020-02-06 09:03:17+00:00 | [–] similar comments

This statement is definitely wrong in this generic blank fashion. Also I would lay upon you the burden of proof for it :).

Tight, low level code in Java and Go is roughly as fast as average C code. The Go compiler is know to be less good at optimizing code than e.g. GCC, but this in many cases creates little practical difference, while the Java JIT compilers have become excellent to a point where they often beat GCC, especially as they can use run time profiling for code optimization. So they can optimize the code for the actual task at hand.

Where the languages differ in "speed" is their runtime environment. Java and Go are languages with garbage collection, which of course means that some amount of CPU is required to perform GC. But as the modern garbage collectors run in parallel with the program, this CPU effort often enough is no bottleneck. On the other side, manual memory management has different performance trade-offs, which in many cases can make it quite slow on its own.

chubs | karma 1762 | avg karma 3.2 2020-02-04 22:02:25+00:00 | [–] similar comments

I found similarly when I ported an image resizing algorithm from Swift to Rust: I'm experienced in swift thus was able to write in an idiomatic way, and have little Rust experience thus I wrote it in a naive way; yet still the rust algorithm was twice(!) as fast. And swift doesn't even have a GC slowing things down!

dgellow | karma 16126 | avg karma 4.42 2020-02-04 22:15:10+00:00 | [–] similar comments

ARC, used by Swift, has its own cost.

1propionyl | karma 1033 | avg karma 3.29 2020-02-05 07:48:56+00:00 | [–] similar comments

True, but it's generally better than most full GC solutions (for processes running for relatively short times without the benefit of profile-guided optimization), and worse than languages with fully statically analyzable memory usage.

Note: that parenthetical is a very big caveat, because properly profile-optimized JVM executables can often achieve exceptional performance/development cost tradeoffs.

In addition however, ARC admits a substantial amount of memory-usage optimization given bytecode, which is now what developers provide to Apple on iOS. Not to mention potential optimizations by allowing Apple to serve last-minute compiled microarchitecture optimized binaries for each device (family).

To satiate the pedants... ARC is more of less GC where calls into the GC mechanism are compiled in statically and where there are at worst deterministic bounds on potential "stop the world" conditions.

While this may not be presently optimal because profile-guided approaches can deliver better performance by tuning allocation pool and collection time parameters, it's arguably a more consistent and statically analyzable approach that with improvement in compilers may yield better overall performance. It also provides tight bounds on "stop the world" situations, which also exist far less frequently on mobile platforms than in long running sever applications.

Beyond those theoretical bounds, it's certainly much easier to handle when you have an OS that is loading and unloading applications according to some policy. This is extremely relevant as most sensible apps are not actually long running.

ernst_klim | karma 1085 | avg karma 1.95 2020-02-05 11:40:00+00:00 | [–] similar comments

> but it's generally better than most full GC solutions

I doubt that. It implies huge costs without giving any benefits of GC.

A typical GC have compaction, nearly stack-like fast allocation [1], ability to allocate a bunch of objects at once (just bump the heap pointer once for a whole bunch).

And both Perl and Swift do indeed perform abysmally, usually worse than both GC and manual languages [2].

> ARC is more of less GC

Well, no. A typical contemporary GC is generational, often concurrent, allowing fast allocation. ARC is just a primitive allocator with ref/deref attached.

[1] http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.49....

[2] https://github.com/ixy-languages/ixy-languages

pkolaczk | karma 2452 | avg karma 1.99 2020-02-05 18:55:07+00:00 | [–] similar comments

It is nowhere near stack-like. Stack is hot in cache. Heap memory in tlab is cold. Bringing the lines into cache is the major cost, not bumping the pointer.

ernst_klim | karma 1085 | avg karma 1.95 2020-02-06 07:16:43+00:00 | [–] similar comments

> Stack is hot in cache. Heap memory in tlab is cold.

What? This doesn't make any sense. From the cache's POV stack and bump-allocated heap are the same thing. Both are continuous chunks of memory where the next value is being allocated right after the previous one.

The only difference between the stack and the bump-allocated heap is that the former has hardware support for pointer bumping and the latter has not. That's all.

pkolaczk | karma 2452 | avg karma 1.99 2020-02-06 08:08:40+00:00 | [–] similar comments

You're missing the fact that the tlab pointer is only ever moved forward, so it always points to recently unused memory. Until the reset happens and it points back to the same memory again, the application managed to allocate several megabytes or sometimes hundreds of megabytes, and most of that new-gen memory does not fit even in L3 cache.

The stack pointer moves both directions and the total range of that back-and-forth movement is typically in kilobytes, so it may fit fully in L1.

Just check with perf what happens when you iterate over an array of 100 MB several times and compare that to iterating over 10 kB several times. Both are contiguous but the performance difference is pretty dramatic.

Besides that, there is also an effect that the faster you allocate, the faster you run out of new gen space, and the faster you trigger minor collections. These are not free. The faster you do minor collections, the more likely it is for the objects to survive. And the cost is proportional to survival rate. That's why many Java apps tend to use pretty big new generation size, hoping that before collection happens, most of young objects die.

This is not just theory - I saw this just too many times, when reducing allocation rate to nearly zero caused significant speedups - by order of magnitude of more. Reducing memory traffic is also essential to get good multicore scaling. It doesn't matter each core has a separate tlab, when their total allocation rate is so high that they are saturating LLC - main memory link. It is easy to miss this problem by classic method profiling, because a program with such problem will manifest by just everything being magically slow, but no obvious bottleneck.

ernst_klim | karma 1085 | avg karma 1.95 2020-02-06 08:55:04 | [–] similar comments

> You're missing the fact that the tlab pointer is only ever moved forward, so it always points to recently unused memory. Until the reset happens and it points back to the same memory again, the application managed to allocate several megabytes or sometimes hundreds of megabytes, and most of that new-gen memory does not fit even in L3 cache.

Yes, you are right about stack locality. It indeed moves back and forward making effective used memory region quite small.

> These are not free. The faster you do minor collections, the more likely it is for the objects to survive. And the cost is proportional to survival rate.

Yes, that's true. Immutable languages are doing way better here having small minor heaps (OCaml has 2MB on amd64) and very small survival rates (with many object being directly allocated on older heap if they are known to be lasting in advance).

Now I understand your point better and I agree.

terminaljunkid | karma 29 | avg karma 0.28 2020-02-05 02:03:02 | [–] similar comments

> Swift doesn't have a GC slowing things down

This Apple marketing meme needs to die. Reference counting incurs arguably more cost than GC, or at least the cost is spread through-out processing.

pkolaczk | karma 2452 | avg karma 1.99 2020-02-05 06:31:16 | [–] similar comments

It incurs some cost, but whether it is higher is very debatable. This is very much workload dependent. A smart compiler can elide most reference updates.

abjKT26nO8 | karma 730 | avg karma 3.95 2020-02-05 08:45:21 | [–] similar comments

It would seem that the Swift compiler is far from smart[1].

[1]: <https://media.ccc.de/v/35c3-9670-safe_and_secure_drivers_in_...

pjmlp | karma 109153 | avg karma 1.76 2020-02-05 14:45:34 | [–] similar comments

No it can't, not in Swift's current implementation.

https://github.com/ixy-languages/ixy-languages

arcticbull | karma 30340 | avg karma 3.27 2020-02-05 06:34:02+00:00 | [–] similar comments

Apple's ARC is not a GC in the classic sense. It doesn't stop the world and mark/sweep all of active memory. It's got "retain" and "release" calls automatically inserted and elided by the compiler to track reference counts at runtime, and when they hit zero, invoke a destructor. That's not even close to what most people think of when they think "gc". Of course it's not free, but it's deterministic.

trevor-e | karma 1766 | avg karma 6.2 2020-02-05 07:04:08+00:00 | [–] similar comments

I think you are being borderline pedantic here. ARC _is_ GC in the classic sense: https://en.wikipedia.org/wiki/Garbage_collection_(computer_s...

I agree with you that most people tend to associate GC with something more advanced nowadays, like mark and sweep as you said in another comment, but it seems pointless to argue that ARC is not a form of GC.

barrkel | karma 34063 | avg karma 3.87 2020-02-05 10:37:33+00:00 | [–] similar comments

You're using a hearsay, folklore definition of GC, not a CS one.

Refcounting in the presence of threads is usually non-deterministic too.

pkolaczk | karma 2452 | avg karma 1.99 2020-02-05 18:56:47 | [–] similar comments

It is non deterministic, but at a much different scale.

giornogiovanna | karma 496 | avg karma 3.12 2020-02-05 06:10:37+00:00 | [–] similar comments

Reference counting is a (low-throughput, low-latency) form of garbage collection.

arcticbull | karma 30340 | avg karma 3.27 2020-02-05 06:32:33 | [–] similar comments

Yes and no. From a theoretical perspective, I suppose that's true, but "garbage collection" tends to mean a non-deterministic collector that does its own thing, and you don't have to think at all about memory. That does not apply to Swift, as of course, you need to understand the difference between strong and weak references. It's unfairly simplistic to couple the two.

staticassertion | karma 12534 | avg karma 3.12 2020-02-05 07:58:48+00:00 | [–] similar comments

But it actually is just garbage collection.

Sean1708 | karma 1208 | avg karma 1.74 2020-02-05 09:20:36 | [–] similar comments

It is but in the context of this discussion it's very clear that they meant a tracing garbage collector, which has a very different cost than atomic reference counting. Or to put it another way: you're technically correct, the worst kind of correct.

barrkel | karma 34063 | avg karma 3.87 2020-02-05 10:34:48+00:00 | [–] similar comments

No, RC is GC.

Most people think of Python as GCed language, and it uses mostly RC.

Any runtime that uses mark & sweep today may elect to use RC for some subset of the heap at some point in a future design, if that makes more sense. The mix of marking GC vs refcounting GC shouldn't affect the semantics of the program.

_ph_ | karma 9798 | avg karma 2.44 2020-02-06 08:55:14 | [–] similar comments

The low-latency part might not even be true. RC means that you don't have CPU consuming heap scans, but if you free the last reference to a large tree of objects, freeing them can take quite a lot of time, causing high latencies.

pjmlp | karma 109153 | avg karma 1.76 2020-02-05 14:44:13+00:00 | [–] similar comments

Swift's form of GC is one of the slowest one, no wonder porting to Rust made it faster, specially given that most tracing GC outperform Swift's current implementation.

If one goes with reference counting as GC implementation, then one should take the effort to use hazardous pointers and related optimizations.

stingraycharles | karma 11619 | avg karma 4.34 2020-02-04 21:21:13 | [–] similar comments

> The JVM world tends to solve this problem by using off-heap caches. See Apache Ignite [0] or Ehcache [1].

For those who care, I was interested how off-heap caching works in Java and I did some quick searching around the Apache Ignite code.

The meat is here:

- GridUnsafeMemory, an implementation of access to entries allocated off-heap. This appears to implement some common Ignite interface, and invokes calls to a “GridUnsafe” class https://github.com/apache/ignite/blob/53e47e9191d717b3eec495...

- This class is the closest to the JVM’s native memory, and wraps sun.misc.Unsafe: https://github.com/apache/ignite/blob/53e47e9191d717b3eec495...

- And this, sun.misc.Unsafe, is what it’s all about: http://www.docjar.com/docs/api/sun/misc/Unsafe.html

It’s very interesting because I did my fair share of JNI work, and context switches between JVM and native code are typically fairly expensive. My guess is that this class was likely one of the reasons why Sun ended up implementing their (undocumented) JavaCritical* etc functions and the likes.

winrid | karma 4020 | avg karma 1.39 2020-02-04 21:57:42+00:00 | [–] similar comments

The idea is that that call is still less expensive than going over the wire and MUCH less expensive than having the GC go through that heap now and then.

stingraycharles | karma 11619 | avg karma 4.34 2020-02-04 22:04:47 | [–] similar comments

Yes sorry I should have elaborated, those Critical JNI calls avoid locking the GC and in general are much more lightweight. This is available for normal JNI devs as well, its just not documented. They were primarily intended for some internal things that Sun needed.

I’m now guessing that this might actually have been those Unsafe classes as an intended use case. It makes total sense and I can see how that will be very fast.

chrisseaton | karma 36438 | avg karma 2.64 2020-02-04 22:12:08+00:00 | [–] similar comments

> context switches between JVM and native code are typically fairly expensive

Aren't these Unsafe memory read and write methods intrinsified by any serious compiler? I don't believe they're using JNI or doing any kind of managed/native transition, except in the interpreter. They turn into the same memory read and write operations in the compiler's intermediate representation as Java field read and writes do.

jfim | karma 4355 | avg karma 2.96 2020-02-05 03:58:59 | [–] similar comments

They are optimized, yes, but from what I recall from reading the JVM code a few years ago, some optimizations don't get applied to those reads/writes. For example, summing two arrays together will be vectorized to use SSE instructions while doing so through Unsafe won't [0].

[0] https://cr.openjdk.java.net/~vlivanov/talks/2017_Vectorizati...

sreque | karma 1338 | avg karma 3.34 2020-02-04 23:32:57+00:00 | [–] similar comments

Unsafe lets you manipulate memory without any JNI overhead other than when allocating or de-allocating memory, and that is usually done in larger chunks and pooled to avoid the overhead at steady state. Netty also takes advantage of Unsafe to move a lot of memory operations off the java heap.

Unsafe was one of the cooler aspects to Java that Oracle is actively killing for, well, no good reason at least.

kllrnohj | karma 11052 | avg karma 3.34 2020-02-05 06:11:44+00:00 | [–] similar comments

> Unsafe was one of the cooler aspects to Java that Oracle is actively killing for, well, no good reason at least.

I mean, there's the obvious reason that it breaks the memory safety aspect that Java in general guarantees. The whole point of the feature is to subvert the language & expectations.

I'm not saying they should remove it, but it's pretty hard to argue there's "no good reason" to kill it, either. It is, after all, taking the worst parts of C and ramming it into a language that is otherwise immune from that entire class of problems.

jayd16 | karma 10887 | avg karma 2.09 2020-02-05 07:11:14+00:00 | [–] similar comments

C# seems to have a neat middle ground for this kind of stuff with their Span<T> api.

to11mtm | karma 1108 | avg karma 1.59 2020-02-08 01:38:17+00:00 | [–] similar comments

True, but we had our own version of unsafe for a much longer time. MS was just pragmatic enough to allow it across the ecosystem.

I'm guessing at least some of that was a side effect of wanting to support C++; not having pointers as an option would have killed C++/CLI from the get go.

thu2111 | karma 811 | avg karma 0.34 2020-02-05 09:06:30+00:00 | [–] similar comments

They aren't killing it. They're steadily designing safe and API stable replacements for its features, with equal performance. That is a very impressive engineering feat!

For instance fast access to un-GCd off heap memory is being added at the moment via the MemoryLayout class. Once that's here apps that upgrade won't need to use Unsafe anymore. MemoryLayout gives equivalent performance but with bounds checked accesses, so you can't accidentally corrupt the heap and crash the JVM.

They've been at it for a long time now. For instance VarHandle exposes various low level tools like different kinds of memory barriers that are needed to implement low level concurrency constructs. They're working on replacements for some of the anonymous class stuff too.

pjmlp | karma 109153 | avg karma 1.76 2020-02-05 14:49:48+00:00 | [–] similar comments

Unsafe is being replaced by less error prone APIs, not killed.

Project Panama is what is driving that effort.

tsimionescu | karma 17553 | avg karma 2.17 2020-02-04 21:50:19+00:00 | [–] similar comments

> I can't speak for how their Rust cache manages memory, but the thing to be careful of in non-GC runtimes (especially non-copying GC) is memory fragmentation.

As far as I know, a mark-and-sweep collector like Go's doesn't have any advantage over malloc/free when it comes to memory fragmentation. Am I missing some way in which Go's GC helps with fragmentation?

lossolo | karma 2553 | avg karma 2.06 2020-02-04 22:52:34 | [–] similar comments

Go GC implementation uses memory allocator that was based on TCMalloc (but derived from it quite a bit). They use a free list of multiple fixed allocatable size-classes, which helps in reducing fragmentation. That's why Go GC is non-copying.

haimez | karma 899 | avg karma 2.45 2020-02-05 00:31:35+00:00 | [–] similar comments

I’m not sure I follow. GC implementations that don’t copy (relocate) are inherently subject to the performance cost of “fragmentation” (in the sense of scattering memory accesses over non-adjacent regions). This is a very high price to pay when you’re dealing with modern hardware.

andoriyu | karma 59 | avg karma 0.34 2020-02-05 01:31:30+00:00 | [–] similar comments

Allocator underneath is keeping track of freed memory, so next allocation has high chance of being squeezed into memory region that has been used before. It's obviously not as good as say GC that relocates after sweep, but at least it doesn't leave gaping holes.

haimez | karma 899 | avg karma 2.45 2020-02-05 02:57:45 | [–] similar comments

Indeed, but it also doesn’t maintain locality of access nearly as well for young objects (the most commonly manipulated ones) and even older ones that survive.

jstrong | karma 353 | avg karma 2.04 2020-02-05 04:28:34+00:00 | [–] similar comments

one related point: the article mentions utilizing rust's BTreeMap, which manages its heap allocations with cache efficiency in mind: https://doc.rust-lang.org/std/collections/struct.BTreeMap.ht....

The guts of BTreeMap's memory management code is here: https://github.com/rust-lang/rust/blob/master/src/liballoc/c.... (warning: it is some of the most gnarly rust code I've ever come across, very dense, complex, and heavy on raw pointers. this is not a criticism at all, just in terms of readability). Anecdotally I've had very good results using BTreeMap in my own projects.

In terms of how the "global" allocator impacts performance, I'd expect it to play a bigger role in terms of Strings (I mean, it's a chat program), and possibly in terms of how the async code "desugars" in storing futures and callbacks on the heap (just guessing, I'm not an expert on the rust async internals).

senderista | karma 1402 | avg karma 1.93 2020-02-05 05:58:35+00:00 | [–] similar comments

I was a bit surprised that BTreeMap turned out to be more memory-efficient than HashMap; can anyone shed some light on this?

One of these days I’ll get around to turning my Rust set implementation[0] into a full-blown map (it’s already <50% the size of BTreeSet for ints)...

[0] https://github.com/senderista/rotated-array-set

tsimionescu | karma 17553 | avg karma 2.17 2020-02-05 06:28:42+00:00 | [–] similar comments

In the current context, fragmentation refers more to the problem of consuming extra memory through fragmentation, which malloc implementations like the one Go (or Rust, or glibc) uses can often mitigate.

tsimionescu | karma 17553 | avg karma 2.17 2020-02-05 06:24:24+00:00 | [–] similar comments

Sure, but the malloc implementation in Rust probably does something similar.

What I wanted to understand is what is the difference in fragmentation between a non-copying, non-compacting GC and a non-GC runtime.

wpietri | karma 58013 | avg karma 4.11 2020-02-05 01:31:35+00:00 | [–] similar comments

Maybe I've missed this, but why do they need a particularly large LRU cache? Surely this isn't all one process, so presumably they could reduce spikes by splitting the same load across yet more processes?

Techies4Trump | karma -15 | avg karma -0.47 2020-02-05 03:08:03+00:00 | [–] similar comments

Larger cache = faster performance and less load on the database.

I only glossed over the article but the problem they had with Go seems to be the GC incurred from having a large cache. Their cache eviction algorithm was efficient, but every 2 minutes there was a GC run which slowed things down. Re-implementing this algorithm in Rust gave them better performance because the memory was freed right after the cache eviction.

Splitting it across more processes will result in more cache misses and more DB calls.

wpietri | karma 58013 | avg karma 4.11 2020-02-05 15:22:30+00:00 | [–] similar comments

I am of course talking about the same amount of total cache RAM, just split among more processes. Depending on distribution of the calls, you might get more cache misses, but I don't think it's guaranteed, and if it is, I don't think we can assume it's significant. Heck, you could even use more cache RAM; the cost of a total rewrite plus ongoing maintenance in a new language covers a fair bit of hardware these days.

tlrobinson | karma 30498 | avg karma 4.18 2020-02-05 04:23:27 | [–] similar comments

> From a purely architectural perspective, I would try to put cacheable material in something like memcache or redis, or one of the many distributed caches out there. But it might not be an option.

Can you speak to why using something like memcache or redis may not be an option?

otterley | karma 9624 | avg karma 2.68 2020-02-05 04:32:13+00:00 | [–] similar comments

For latency-sensitive services, having to traverse the network to access a shared cache may be too slow. To use the current story as an example, you'd be trading off an occasional 100-millisecond latency spike every 2 minutes for an added 1-2ms of latency for every request.

cft | karma 7556 | avg karma 4.23 2020-02-05 07:27:50+00:00 | [–] similar comments

> From a purely architectural perspective, I would try to put cacheable material in something like memcache or redis

You cannot use a caching server at that scale with those latency requirements. It has to be embedded

toxicafunk | karma 1 | avg karma 1.0 2020-02-05 11:32:27 | [–] similar comments

Something like rocksdb (https://rocksdb.org/) then

cft | karma 7556 | avg karma 4.23 2020-02-05 13:38:07+00:00 | [–] similar comments

You cannot use that either, due to I/O block device saturation, even on enterprise PCIe NVMe. Not enough parallel IOPS. Only RAM can be used.

adev_ | karma 1564 | avg karma 3.51 2020-02-05 07:31:06+00:00 | [–] similar comments

> The JVM world tends to solve this problem by using off-heap caches. See Apache Ignite [0] or Ehcache [1].

Yeah, but I really do not bite your argument.

When you are reduced to do manual memory management and fight the GC of your language, maybe you should simply not use a language with GC in the first place.

They are right to use rust ( or C/C++) for that. It's not for nothing that redis (C) is so successful in the LRU domain.

> It's worth mentioning that Apache Cassandra itself uses an off-heap cache.

And still ScyllaDB (C++) is able to completely destroy Cassandra in term of AVG latency [0]

[0]: https://www.scylladb.com/product/benchmarks/

mrjn | karma 888 | avg karma 2.55 2020-02-05 07:35:08+00:00 | [–] similar comments

Don't know enough about Rust, but I think Go would benefit immensely by allowing its users to disable GC and allow de-allocating memory by hand. GC is great for simpler applications, but more complex projects end up fighting so much with memory and GC in Go that all the benefits of automatic de-allocations are negated. Love every other aspect of Go.

PeterCorless | karma 1395 | avg karma 2.01 2020-02-05 07:54:26 | [–] similar comments

Wow. We literally just published why to not put a cache in front of your server to mask its bad performance behind a layer of complexity. tl;dr: make sure you have a solid DB to begin with. (Forgive the gated asset, but it's a good read!)

https://go.scylladb.com/7-reasons-no-external-cache-database...

adamnemecek | karma 57571 | avg karma 6.54 2020-02-04 19:28:50+00:00 | [–] similar comments

Rust is maturing. I legit don't think there are too many good reasons to use Go over Rust. You can call Rust from Go but not vice versa.

steveklabnik | karma 91260 | avg karma 5.08 2020-02-04 20:14:56 | [–] similar comments

(You can call Go from Rust: https://blog.arranfrance.com/post/cgo-sqip-rust/ )

kerkeslager | karma 10034 | avg karma 2.74 2020-02-04 19:31:17+00:00 | [–] similar comments

Go is not a general-purpose language. It's a Google language designed to solve Google's problems. If you aren't Google, you probably have different problems, which Go isn't intended to solve.

EDIT: Currently at -4 downvotes. Would downvoters care to discuss their votes?

Corrado | karma 5144 | avg karma 3.45 2020-02-04 20:16:21 | [–] similar comments

I agree. One of Go's design goals was to be simple enough for thousands of developers to use it simultaneously across a huge monorepo. To me this is in the same class as companies use k8s; unless your Google (or Facebook or Netflix ...) you probably shoudn't be using it.

kerkeslager | karma 10034 | avg karma 2.74 2020-02-04 20:53:29+00:00 | [–] similar comments

> To me this is in the same class as companies use k8s; unless your Google (or Facebook or Netflix ...) you probably shoudn't be using it.

I'll actually even say, that if you're Facebook or Netflix, you still shouldn't use Go, because you can write your own tools that solve your problems.

cmrdporcupine | karma 19360 | avg karma 3.11 2020-02-04 21:29:45 | [–] similar comments

As a Googler, I don't consider this accurate. I've been here 8 years and have yet to work on a Go code base. Yes, there are projects in Go. Certainly not a majority, nor even a significant minority, honestly.

No, I wouldn't say Go is specific to Google's problems, though I'm sure some of the engineers had them in mind. I see Go used far more outside of Google than in.

takeda | karma 6128 | avg karma 2.11 2020-02-04 22:14:53+00:00 | [–] similar comments

Isn't that indication of a failure? It seems like Go aimed to replace Python and Java code at Google.

cmrdporcupine | karma 19360 | avg karma 3.11 2020-02-05 00:36:21+00:00 | [–] similar comments

My impression (and this was pre-Google and I haven't paid attention since I got here, so) is that it was Rob Pike's and Ken Thompson's project coming out of their long experience with Plan 9 and Inferno/Limbo. That it happened to meet some requirements for some Google projects -- I'm sure that might have been an intent. But that feels a bit like an explanation after the fact, since Go very obviously shows the biases and philosophy from the projects that the original authors had in their previous work.

kerkeslager | karma 10034 | avg karma 2.74 2020-02-04 22:37:23+00:00 | [–] similar comments

Well, that's pretty interesting.

I don't know if that disproves that Go was intended to solve Google's problems, though. I think from the early writings of the authors of the language in its infancy, it was pretty clear that they intended it to solve problems they were having at Google (i.e. the single-pass compilation design was intended to help with the compilation of their gigantic codebase). If it hasn't gained traction at Google, that only proves that it failed to solve a lot of Google's problems.

That's still not to say it's a failure in an absolute sense: it may have solved the problems it was intended to solve.

ben0x539 | karma 4321 | avg karma 3.33 2020-02-04 22:08:49 | [–] similar comments

I downvoted. "Go is not a general-purpose language" is a statement I could see myself agreeing with, so I started reading your comment excited to read a brief outline of what use-cases Go is specifically aimed at and how that makes it sub-optimal for Discord's use-cases.

But "it's for Google, and you aren't Google" isn't a novel perspective, doesn't leave me with new insights, and isn't really actionable for either Google or people who aren't Google.

Usually this criticism is leveled at Go's dependency management story, with the implication being that it's suited to Google's monorepo but not normal people's repo habits. It's not clear to me how the criticism relates to the issues discussed in the article, which seem to be more about the runtime and GC behavior.

Your comment also doesn't come off as amusing or otherwise entertaining, so it feels like you're just dunking on Go users without really aiming to make anyone's day better.

Disclaimer: I use Go at work and think it's incredibly frustrating at times.

vogre | karma 185 | avg karma 1.06 2020-02-04 22:15:00+00:00 | [–] similar comments

I am not downvoter, but you should learn the history of the language. Most of the concepts in the language were first implemented long before Google even existed, for systems that were very different from modern ones.

It was made by people who had been designing languages for about 40 years now. While some design choices seem weird, they usually have very strong argumentation and solid experience behind them.

Also if you read the list of problems tha Go is intended to solve, you will be surprised how common they are in software development.

kerkeslager | karma 10034 | avg karma 2.74 2020-02-04 22:33:49+00:00 | [–] similar comments

> I am not downvoter, but you should learn the history of the language.

What makes you think I haven't been following Go since its inception?

> Most of the concepts in the language were first implemented long before Google even existed, for systems that were very different from modern ones.

Yes, some of the languages which created those concepts are languages which I've used and which I feel did it better, which is why I am particularly frustrated that Go has gained such popularity with so little substance.

> It was made by people who had been designing languages for about 40 years now. While some design choices seem weird, they usually have very strong argumentation and solid experience behind them.

Yes. Most of the strong argumentation is Google specific.

> Also if you read the list of problems tha Go is intended to solve, you will be surprised how common they are in software development.

Such as?

vogre | karma 185 | avg karma 1.06 2020-02-05 09:36:00+00:00 | [–] similar comments

> What makes you think I haven't been following Go since its inception?

Your statement that Go is Google's language. In fact it's Rob Pike's and his team's language.

> Such as?

build speed, cross-platform builds, performance, simplicity of deployment, uniformity of large codebases and documentation, concurrency, learning speed

mister_hn | karma 569 | avg karma 0.96 2020-02-04 19:34:27+00:00 | [–] similar comments

Why not C++, if performance was an issue?

wmf | karma 46152 | avg karma 2.46 2020-02-04 19:41:39 | [–] similar comments

Why C++? Why would you want the same performance as Rust with less safety?

loeg | karma 21759 | avg karma 2.57 2020-02-04 19:47:20+00:00 | [–] similar comments

Why would you pick C++ for a new codebase in 2019 or 2020 if Rust met your needs?

terminaljunkid | karma 29 | avg karma 0.28 2020-02-05 09:33:04 | [–] similar comments

Programmer productivity Library support

nuclx | karma 292 | avg karma 1.8 2020-02-05 12:14:16+00:00 | [–] similar comments

Compilation times.

Narishma | karma 3253 | avg karma 1.67 2020-02-05 12:37:44 | [–] similar comments

In my experience, C++ is slower to compile than Rust.

bluGill | karma 18458 | avg karma 1.43 2020-02-04 21:27:43+00:00 | [–] similar comments

Modern C++ is the right choice if you have an existing code base in C++, or you need to use features that only exist in a third party C++ library - there is a large collection of C++ libraries to choose from.

Their use case doesn't seem to have either consideration (note that even when these are considerations a hybrid of languages is often a good idea) so there isn't a compelling reason to choose C++. That doesn't mean C++ is wrong, just that there is nothing wrong with rust. Maybe a great C++ programmer can get a few tenths of a percent faster code (mostly because compiler writers spend more effort figuring out how to optimize C++ - rust uses the same llvm optimizer but it might sometimes do something less optimal because it assumed C++ input), but in general if the difference matters in your environment you are too close to the edge and need to scale.

Rust might be easier/faster to write than modern C++. If so that is a point in favor of rust. They seem to have people who know rust, which is important. There might be more people who know C++, but I can take any great programmer and make them good in any programming language in a few weeks in the worst case (worst case would be writing a large program in intercal or some such intentionally hard language) - not to be confused with expert which takes more experience.

tiffanyh | karma 7280 | avg karma 4.69 2020-02-04 19:37:04+00:00 | [–] similar comments

It should also be noted that Rust interoperates extremely well with Erlang, which is the basis of Discord (via Rustler).

https://github.com/rusterlium/rustler

https://blog.discordapp.com/scaling-elixir-f9b8e1e7c29b

The_rationalist | karma -7 | avg karma -0.0 2020-02-04 19:37:50+00:00 | [–] similar comments

Borrowed from a comment:

Garbage collection has gotten a lot of updates in the last 3 years. Why would you not take the exceedingly trivial step of just upgrading to the latest Go stable in order to at least try for the free win? From the go 1.12 release notes: “Go 1.12 significantly improves the performance of sweeping when a large fraction of the heap remains live. This reduces allocation latency immediately following a garbage collection.” ¯\_(?)_/¯ This sounds like “we just wanted to try Rust, ok?” Which is fine. But like, just say that.

jrockway | karma 72069 | avg karma 3.74 2020-02-04 19:39:18+00:00 | [–] similar comments

This seems like a nice microservices success story. It's so easy to replace a low-performing piece of infrastructure when it is just a component with a well-defined API. Spin up the new version, mirror some requests to see how it performs, and turn off the old one. No drama, no year-long rewrites. Just a simple fix for the component that needed it the most.

thijsvandien | karma 993 | avg karma 2.52 2020-02-04 23:10:29+00:00 | [–] similar comments

You don't need microservices for that, though. One might as well have moved that piece into a library.

kccqzy | karma 13168 | avg karma 3.58 2020-02-05 00:39:08+00:00 | [–] similar comments

And then deal with cross-language FFI boundaries and cross-language builds.

SlowRobotAhead | karma 1033 | avg karma 0.56 2020-02-05 02:53:27+00:00 | [–] similar comments

This is what clicked for me on microservices years back. That the language wasn’t important and if I couldn’t do it in python or C, someone else could in Go or Java or etc.

Compared to if I wrote something in house entirely in C... lolno

Tomis02 | karma 766 | avg karma 1.58 2020-02-05 04:16:18+00:00 | [–] similar comments

Landing in a shop that uses N programming languages for N microservices would be a pretty miserable experience.

tcbasche | karma 463 | avg karma 1.27 2020-02-05 08:59:38+00:00 | [–] similar comments

At a previous job we used Python for all microservices, except for 'legacy' systems which were in Groovy / Rails. That was a context switch if I ever experienced one.

koffiezet | karma 621 | avg karma 1.39 2020-02-05 11:36:52 | [–] similar comments

I've seen quite a few environments, and usually there's only a limited current set of tech the devs are allowed to use, and if that's not the case, I try to enforce this, but this set should evolve depending on the needs.

The main issue however is manpower. At my current client, one of the technologies still actively being used for this reason is PHP (which is a horrible fit for microservices for a lot of reasons), because they have a ton of PHP devs employed, and finding a ton of (good) people with something more fitting like Go or Rust knowledge is hard and risky and training costs a lot of money (and more importantly: time)...

qaq | karma 6423 | avg karma 2.08 2020-02-05 14:22:31+00:00 | [–] similar comments

I can buy this for Rust but if people have issues picking up Go quickly ...

koffiezet | karma 621 | avg karma 1.39 2020-02-05 16:09:03 | [–] similar comments

Well, picking up the language itself is one thing (and I agree, that's quite easy with Go), but getting familiar with the ecosystem, best practices and avoiding habits from other languages? That's an entirely different thing.

And that's also how management usually sees it, and if they're smart they also realise that the first project using an unfamiliar technology is usually one to throw away.

dodobirdlord | karma 3155 | avg karma 2.5 2020-02-05 04:20:29 | [–] similar comments

Once you experience protocol buffers it becomes hard to go back.

rswail | karma 3347 | avg karma 2.51 2020-02-05 11:22:10+00:00 | [–] similar comments

How is a serde IDL relevant to microservices? XDR would do just as well, so would JSON.

dodobirdlord | karma 3155 | avg karma 2.5 2020-02-06 05:40:58 | [–] similar comments

Because protocol buffers are actually pleasant to use.

ATsch | karma 3283 | avg karma 4.33 2020-02-05 19:55:39 | [–] similar comments

RPC is pretty much also a "cross-language FFI boundary", except it can fail.

Of course it has some advantages, but it's hardly universally better.

echopom | karma 283 | avg karma 6.29 2020-02-04 19:42:07 | [–] similar comments

This was an extremely interesting read.

I'm quiet disappointed though they did not update their Go Version to 1.13[0][1] which would normally have remove the spike issue and thus he latency before they move to Rust...

Rust seems more performant with proper usage ( tokio + async ) but I'm more worried about the ecosystem that doesn't seem has mature has Go.

We could quote the recent[2] Drama with Actix...

[0]https://golang.org/doc/go1.13#runtime [1]https://golang.org/doc/go1.12#runtime [2]https://github.com/fafhrd91/actix-web-postmortem

chc | karma 20862 | avg karma 3.0 2020-02-04 19:46:55+00:00 | [–] similar comments

Why would you want to bring up the Actix author's drama? That doesn't seem like something that should reflect on a language one way or the other.

deweller | karma 2211 | avg karma 3.91 2020-02-04 19:59:59 | [–] similar comments

As an outsider to both the Go and Rust cultures, I read the Actix news and walked away with the impression that the Rust ecosystem is less mature.

faitswulff | karma 8583 | avg karma 4.66 2020-02-04 20:13:40+00:00 | [–] similar comments

Every community has it. The dep vs. vgo drama gave me the same impression of Go at the time: https://news.ycombinator.com/item?id=17063724

cies | karma 7348 | avg karma 2.3 2020-02-04 20:20:44+00:00 | [–] similar comments

Go's is more pragmatic. Rust's is more purist, and that reflects on the language features (more functional, more free in allowing you to use it for any purpose where Go is network-app specific, more strict in typing), the licensing and the attitude towards collaboration.

That collaboration thing is why Actix exploded I think. While mostly an isolated incident it does show some clash between the author's values (and possibly the author's employer's (MSFT) values) and the values of the general Rust community. I would not say that reflects on the maturity of the langues or ecosystem.

In Go a lot of stuff is Google dictated. In Rust it's a true open governance innovation project (looking to become a non-profit). Since the Go is a very specific language --made for networked apps and only has one way to do concurrency-- and Rust very broad --a true general purpose prog lang-- it is easy to see how Go mature so quickly (not much to mature) and also why it got a bit old so quickly as well (ignores most innovations in computer science of the last decades).

nemothekid | karma 13576 | avg karma 4.48 2020-02-04 20:43:29 | [–] similar comments

The Go community has a very similar story, where someone released a web framework, with an unorthodox set of features, and was flamed to the point where he abandoned the project and quit OSS.

https://github.com/go-martini/martini

fjp | karma 517 | avg karma 2.46 2020-02-04 21:13:20 | [–] similar comments

What was so unorthodox/upsetting to people there?

nemothekid | karma 13576 | avg karma 4.48 2020-02-04 22:29:50 | [–] similar comments

Martini used the service injection pattern and made use of reflection to do so. It was a very popular framework and one of the first in Go (it currently has ~10k stars), and the use of reflection became a very contentious viewpoint in the community.

rob74 | karma 12601 | avg karma 3.76 2020-02-04 20:57:26+00:00 | [–] similar comments

> Go is network-app specific

Just because it is used most for "network apps" doesn't mean it's limited to that. On the other hand, you could argue that Rust is a wrong fit for anything _except_ performance-critical applications, because for anything else it's not worth to saddle yourself with the added complexity.

> and Rust very broad --a true general purpose prog lang-- it is easy to see how Go mature so quickly (not much to mature) and also why it got a bit old so quickly as well

This simplicity is the thing Go opponents like to point out (or mock) most, and what Go fans actually would tell you is one of the best features of the language. It's actually refreshing to have one language that doesn't try to be everyone's darling by implementing every conceivable feature - we already have enough of those, Rust, C++, Java etc. etc. But you don't have to take my word for it, you can also read the first sentences from this blog post: https://bradfitz.com/2020/01/30/joining-tailscale - he puts it better than I could...

cies | karma 7348 | avg karma 2.3 2020-02-04 22:51:20 | [–] similar comments

As the grant parent commentor i cannot down vote, so that wasn't me.

Google explicitly shown no intent to make Go a fit beyond network apps. You can hack something into doing more than originally intended, but then you are usually operating "outside of warranty".

> On the other hand, you could argue that Rust is a wrong fit for anything _except_ performance-critical applications

Well, Rust does more than C-level high perf. It also allows for very safe/maintainable code that's high perf. Both of these are not something like a special feature, nope, ANY software needs to be high perf, bug free and maintainable to some degree. And as the size of the codebase grows, lack of these properties in a languages rears is ugly head.

The added complexity cost, as you mentioned, is IMHO not a real cost. It's more like an investment. You go with Rust, you have to pay up front: learning new concepts, slower dev't, more verbosity/syntax/gibberish-y code. But once the codebase grows, you(r team) have grown accustomed to this and you start to reap the benefits of Rust's safety, freedom to choose your concurrency patterns, maintainability and verbosity.

Now I do want to point out a REAL cost that was not mentioned yet, that Rust brings with it much more than Go: compile time. This sucks for Rust. Given the complexity of Rust, I dont expect it to ever come close to Go's lightning compiles. It will improve/ it constantly improving. And IDE features that prevent compiles (e.g.: error highlights) are maturing and will help too. But this is a big reason for picking Go.

Your Jedi mind trick about Go's "simplicity" does not work on me :) ... It's fast compiles (a result of simplicity) are the bonus. Not being able to use the language beyond network apps or go-routine-concurrency are simply a minus for every learner (not for Google as a creator), as you limit the use of your new skill. The reason they kept the 1B$ mistake (null) in there is simply unforgivable.

And if Go will never add features we have to see. Java also intended to stay lean, well...

cdoxsey | karma 3151 | avg karma 7.15 2020-02-05 05:27:49+00:00 | [–] similar comments

Go is a general purpose programming language. It is suitable for a large variety of programming tasks beyond network services (though itself a massive problem domain)

For example Go is great for building cli apps. Simple, easy to install and easy to understand.

For another Go has surprisingly good windows support. Google didn’t do that.

For a third Go has robust cryptography libraries.

There are actually lot’s of other contributions from the community if you’d take the time to look.

cies | karma 7348 | avg karma 2.3 2020-02-05 08:04:51+00:00 | [–] similar comments

> Go is a general purpose programming language.

Seriously? Does Google say that "native GUI devt" is intended use? And how about that it only supports one concurrency method, and does not allow one to implement one yourself.

> For example Go is great for building cli apps.

Ok, your joking right? CLI apps are simply a networked app that does not necessarily use the network. That's not an entirely new domain, like OpenGL, native GUIs, embedded systems, kernel programming, ...

> There are actually lot’s of other contributions from the community if you’d take the time to look.

That's good, but it still is not "open innovation" to the level of say Rust.

chc | karma 20862 | avg karma 3.0 2020-02-04 20:25:21 | [–] similar comments

I actually agree that there are many parts of Rust's ecosystem that are relatively immature — I just don't see how the Actix situation reflects on that. It's not like Actix was a core part of the Rust ecosystem. It was a framework that was most notable for doing very well on the Techempower benchmarks. People get hurt feelings and have flameouts in the C, Java, JavaScript, etc. ecosystems too.

andoriyu | karma 59 | avg karma 0.34 2020-02-04 21:10:55+00:00 | [–] similar comments

I wouldn't call rust ecosystem less mature than go, but it wouldn't call either of them mature.

Both have ups and downs. Rust definitely has immature web service ecosystem and it's a result of immature async i/o ecosystem. At the same time go has those things otb.

Communitivity | karma 2056 | avg karma 2.85 2020-02-04 20:08:58+00:00 | [–] similar comments

Agreed. One could argue that a level of drama in the community is a sign of growing maturity and wider interest in the language, because it is evidence there is no longer a niche monoculture of devs all thinking the same way.

In the words of Steve Klabnik "Rust has been an experiment in community building as much as an experiment in language building. Can we reject the idea of a BDFL? Can we include as many people as possible? Can we be welcoming to folks who historically have not had great representation in open source? Can we reject contempt culture? Can we be inclusive of beginners?" https://words.steveklabnik.com/a-sad-day-for-rust

The Actix issue was resolved, and Actix will continue under new maintainers (https://github.com/actix/actix-web/issues/1289). So I'd argue the answer to those questions is a 'yes'.

reggieband | karma 2802 | avg karma 6.12 2020-02-04 19:46:44+00:00 | [–] similar comments

When I see this kind of GC performance, I wonder why you wouldn't change the implementation to use some sort of pool allocator. I am guessing each Read State object is identical to one another (e.g. some kind of struct) so why not pre-allocate your memory budget of objects and just keep an unused list outside of your HasMap? In a way this is even closer to a ring where upon ejection you could write the object to disk (or Cassandra), re-initialise the memory and then reuse the object for the new entry.

I suppose that won't stop the GC from scanning the memory though ... so maybe they had something akin to that. I assume that a company associated with games and with some former games programmers would have thought to use pool allocators. Honestly, if that strategy didn't work then I would be a bit frustrated with Go.

I have to say, out of all of the non-stop spamming of Rust I see on this site - this is definitely the first time I've thought to myself that this is a very appropriate use of the language. This kind of simple yet high-throughput workhorse of a system is a great match for Rust.

monocasa | karma 27236 | avg karma 2.94 2020-02-04 19:48:26+00:00 | [–] similar comments

Yeah, they already weren't allocating, it was a GC pause that just scanned and would come up with essentially no extra garbage every two minutes.

azakai | karma 7684 | avg karma 3.12 2020-02-04 20:40:30+00:00 | [–] similar comments

A pool allocator could have reduced the number of existing allocations (1 big one instead of many small ones), making those spikes less significant. (But that depends on how Go handles interior pointers and GC, so I'm not sure.)

runevault | karma 2576 | avg karma 2.06 2020-02-04 21:17:04 | [–] similar comments

Allocations weren't the problem. It was the fact that, every 2 minutes, the GC would trigger because of an arbitrary decision by the Go team and scan their entire heap, find little to nothing to deallocate, then go on its merry way.

monocasa | karma 27236 | avg karma 2.94 2020-02-05 15:42:59 | [–] similar comments

It has to track interior pointers. The problem for them seems to be the mark phase, where the GC has to track down all the pointers either way.

_ph_ | karma 9798 | avg karma 2.44 2020-02-04 19:49:06+00:00 | [–] similar comments

Here it wasn't the problem, that the GC was lacking performance when collecting garbage, which a pool allocator would have helped with, but rather, that they didn't produce garbage (good), but the GC ran nevertheless to check whether memory could be returned to the OS. Probably supressing that would have removed the spikes.

masklinn | karma 65147 | avg karma 3.36 2020-02-04 19:58:07+00:00 | [–] similar comments

> When I see this kind of GC performance, I wonder why you wouldn't change the implementation to use some sort of pool allocator.

The allocations were not the issue, the article notes that they did little to no allocations, hence the GC only running on forced triggers (every 2mn)

KMag | karma 6889 | avg karma 2.73 2020-02-04 20:20:21+00:00 | [–] similar comments

In this case, since the lines of code that can touch the manually managed object pool are probably few and easily reviewed and audited, I don't have any problem with your advice.

I realize you're not advocating pervasive use of the technique, but if someone reading this is going to make pervasive use of manually managed object pools in a GC'd language, they should at least consider the possibility of moving to a language with both good language support for manually managed memory and a good ecosystem of tooling around manual memory management.

Manually managed object pools in a language designed around GC don't fully get rid of the costs of GC, and re-expose the program to most of the errors (primarily use-after-free, double-free, and leaks related to poorly reasoned ownership) that motivated so much effort in developing garbage collectors in the first place.

karma_daemon | karma 113 | avg karma 3.53 2020-02-04 19:51:37 | [–] similar comments

I wish the article would show a graph of the golang heap usage. I'm reminded of this cloudflare article [0] from a while back where they created an example that seemed to exhibit similar performance issues when they created many small objects to be garbaged collected. They solved it by using a pooled allocator instead of relying solely on the GC. Wonder if that would have been applicable here to the go version.

[0] https://blog.cloudflare.com/recycling-memory-buffers-in-go/

jaten | karma 8 | avg karma 0.57 2020-02-04 19:52:16+00:00 | [–] similar comments

just use an off heap hash table. simple. https://github.com/glycerine/offheap

Also, as others have said, lots of big GC improvements were ignored by insisting on go1.9.2 and not the latest.

favorited | karma 4158 | avg karma 3.89 2020-02-05 01:35:44 | [–] similar comments

The graphs are from 1.9.2, but the author said they tried 1.8, 1.9, and 1.10 and saw the same thing.

rvcdbn | karma 357 | avg karma 3.5 2020-02-04 19:53:51+00:00 | [–] similar comments

Seems like you were hitting: runtime: Large maps cause significant GC pauses #9477 [0]

Looks like this issue was resolved for maps that don't contain pointers by [1]. From the article, sounds like the map keys were strings (which do contain pointers, so the map would need to be scanned by the GC).

If pointers in the map keys and values could be avoided, it would have (if my understanding is correct) removed the need for the GC to scan the map. You could do this for example by replacing string keys with fixed size byte arrays. Curious if you experimented this approach?

[0] https://github.com/golang/go/issues/9477

[1] https://go-review.googlesource.com/c/go/+/3288

ryuukk_ | karma -17 | avg karma -0.08 2020-02-04 19:58:52+00:00 | [–] similar comments

exactly, they use 3 years old version

with the improvements made to runtime and that issue fixed, then is safe to say that GO is much faster than rust, based on their graphs

jasondclinton | karma 844 | avg karma 5.05 2020-02-04 20:25:00 | [–] similar comments

Finding out if that does resolve the author's issue would be interesting but I'm not sure that that would be particularly supportive data in favor of Go. If anything it would reinforce the downsides of Go's GC implementation: prone sudden pitfalls only avoidable with obtuse, error-prone fiddling that makes the code more complex.

After spending weeks fighting with Java's GC tuning for a similar production service tail latency problem, I wouldn't want to be caught having to do that again.

masklinn | karma 65147 | avg karma 3.36 2020-02-04 20:27:03+00:00 | [–] similar comments

The good news are that Go's GC has basically no tunables, so you wouldn't have spent weeks on that. The bad news is that it has basically no tunables so if it's a tuning issue you're either fucked or have to put "tuning" hacks right into the code if you find any that works (e.g. twitch's "memory ballast" to avoid overly aggressive GC runs: https://blog.twitch.tv/en/2019/04/10/go-memory-ballast-how-i...)

PeterCorless | karma 1395 | avg karma 2.01 2020-02-04 20:46:38+00:00 | [–] similar comments

There are tradeoffs with all languages. C++ avoids the GC, but you then have to make sure you know how to avoid the common pitfalls of that language.

We use C++ at Scylla (saw that we got a shout-out in the blog! Woot!) but it's not like there isn't a whole industry about writing blogs avoiding C++ pitfalls.

C++ pitfalls through the years... • https://www.horstmann.com/cpp/pitfalls.html (1997) • https://stackoverflow.com/questions/30373/what-c-pitfalls-sh... (2008) • http://blog.davidecoppola.com/2013/09/cpp-pitfalls/ (2013) • https://www.typemock.com/pitfalls-c/ (2018)

I am not saying any of these (Go, Rust, C++, or even Java) are "right" or "wrong" per se, because that determination is situational. Are you trying to optimize for performance, for code safety, for taking advantage of specific OS hooks, or oppositely, to be generically deployable across OSes, or for ease of development? For the devs at Scylla, the core DB code is C++. Some of our drivers and utilities are Golang (like our shard aware driver). There's also a Cassandra Rust driver — it'd be sweet if someone wants to make it shard-aware for Scylla!

xb95 | karma 893 | avg karma 11.02 2020-02-04 20:56:14+00:00 | [–] similar comments

(Discord infra person here.)

Actually we didn't update the reference to Cassandra in the article -- the read states workload is now on Scylla too, as of last week. ;)

We'll be writing up a blog post on our migration with Scylla at some point in the next few months, but we've been super happy with it. I replaced our TokuMX cluster with it and it's faster, more reliable, _and_ cheaper (including the support contract). Pretty great for us.

PeterCorless | karma 1395 | avg karma 2.01 2020-02-04 21:17:00 | [–] similar comments

Woot! Go you! (Or Rust you! Whichever you prefer!)

mathw | karma 1497 | avg karma 2.98 2020-02-05 13:54:18+00:00 | [–] similar comments

What a glorious combination of things! What a shame faster, more reliable and cheaper don't usually go together, but that's the challenge all developers face...

hinkley | karma 39933 | avg karma 2.46 2020-02-04 20:35:22+00:00 | [–] similar comments

The common factor in most of my decisions to look for a new job has been realizing that I feel like a very highly compensated janitor instead of a developer.

Once I spend even the plurality of my time cleaning up messes instead of doing something new (and there are ways to do both), then all the life is sucked out of me and I just have to escape.

Telling me that I have to keep using a tool with known issues that we have to process or patches to fix would be super frustrating. And the more times we stumble over that problem the worse my confirmation bias will be.

Even if the new solution has a bunch of other problems, the set that is making someone unhappy is the one that will cause them to switch teams or quit. This is one area where management is in a tough spot with respect to rewrites.

Rewrites don't often fix many things, but if you suspect they're the only thing between you and massive employee turnover, you're between a rock and a hard place. The product is going to change dramatically, regardless of what decision you make.

outworlder | karma 14663 | avg karma 3.03 2020-02-04 20:47:56+00:00 | [–] similar comments

While I completely agree with the "janitor" sentiment... and for Newton's sake I feel like Wall-E daily...

> Telling me that I have to keep using a tool with known issues that we have to process or patches to fix would be super frustrating.

All tools have known issues. It's just that some have way more issues than others. And some may hurt more than others.

Go has reached an interesting compromise. It has some elegant constructs and interesting design choices (like static compilation which also happens to be fast). The language is simple, so much so that you can learn the basics and start writing useful stuff in a weekend. But it is even more limiting than Java. A Lisp, this thing is not. You can't get very creative – which is an outstanding property for 'enterprises'. Boring, verbose code that makes you want to pull your teeth out is the name of the game.

And I'm saying this as someone who dragged a team kicking and screaming from Python to Go. That's on them – no-one has written a single line of unit tests in years, so now they at least get a whiny compiler which will do basic sanity checks before things blow up in prod. Things still 'panic', but less frequently.

pstuart | karma 5489 | avg karma 1.63 2020-02-04 21:29:59 | [–] similar comments

I'll take boring over WTF code any day :-)

nicoburns | karma 22847 | avg karma 3.29 2020-02-04 21:37:00+00:00 | [–] similar comments

It's not necessarily an either-or though. I'll take clear, concise expressive code over either!

tensor | karma 5723 | avg karma 3.71 2020-02-04 22:01:56+00:00 | [–] similar comments

Most development jobs on products that matter involve working on large established code bases. Many people get satisfaction from knowing that their work matters to end users, even if it's not writing new things in the new shiny language or framework. Referring to these people as "janitors" is pretty damn demeaning, and says more about you than the actual job. Rewrites are rarely the right call, and doing simply to entertain developers is definitely not the right call.

heinrich5991 | karma 1879 | avg karma 3.46 2020-02-04 22:56:53 | [–] similar comments

>Referring to these people as "janitors" is pretty damn demeaning,

"Referring to the term of "janitors" as demeaning is pretty demeaning and says more about you than your judgement of the parent."

I don't like this rhetoric device you just used.

Also, I think that janitors do important work as well.

hu3 | karma 5570 | avg karma 2.62 2020-02-04 23:24:07+00:00 | [–] similar comments

Let's not fool ourselves.

The demeaning of janitors was introduced by GP by describing it as something they would rather not do.

No mental gymnastics required.

cma | karma 10501 | avg karma 1.7 2020-02-05 00:30:45 | [–] similar comments

He said he felt like a janitor, next guy said he demeaned others as janitors, and now you are saying he demeaned janitors. There is a level of gymnastics going on.

hu3 | karma 5570 | avg karma 2.62 2020-02-05 00:48:21 | [–] similar comments

First paragraph:

> The common factor in most of my decisions to look for a new job has been realizing that I feel like a very highly compensated janitor instead of a developer.

So for that person, feeling like a janitor is incentive for seeking a new job. It's that simple really.

CDSlice | karma 1006 | avg karma 3.49 2020-02-05 01:00:00+00:00 | [–] similar comments

That doesn't mean he is demeaning janitors, just that he doesn't want to be one. There are loads of reasons to not want to be a "code janitor" besides looking down at janitors.

rswail | karma 3347 | avg karma 2.51 2020-02-05 10:18:20+00:00 | [–] similar comments

I'm not a gymnast, but comparing people comparing their work to janitors and calling it gymnastics demeans gymnasts.

/s

therockhead | karma 547 | avg karma 3.04 2020-02-05 15:37:05 | [–] similar comments

It helps if your think gardener instead of janitor.

rvcdbn | karma 357 | avg karma 3.5 2020-02-04 20:46:52+00:00 | [–] similar comments

For any tracing GC, costs are going to be proportional to the number of pointers that need to be traced. So I would not call reducing the use of pointers to ameliorate a GC issue "obtuse, error-prone fiddling". On the contrary, it seems like one of the first approaches to look at when faced with the problem of too much GC work.

Really all languages with tracing GC are at a disadvantage when you have a huge number of long-lived objects in the heap. The situation is improved with generational GC (which Go doesn't have) but the widespread use of off-heap data structures to solve the problem even in languages like Java with generational GC suggests this alone isn't a good enough solution.

In Go's defense, I don't know another GC'ed language in which this optimization is present in the native map data structure.

pjmlp | karma 109153 | avg karma 1.76 2020-02-05 15:06:50+00:00 | [–] similar comments

Except that plenty of languages with tracing GC have also off GC memory allocation.

Since you mention not knowing such languages, have a look at Eiffel, D, Modula-3, Active Oberon, Nim, C#/F# (specially after the latest improvements).

Also Java will eventually follow the same idea as Eiffel (where inline classes are similar to expanded classes in Eiffel), and ByteBuffers can be off-GC heap.

gwbas1c | karma 12318 | avg karma 2.74 2020-02-04 20:35:08+00:00 | [–] similar comments

Everything I've read indicates that RAM caches work poorly in a GC environment.

The problem is that garbage collectors are optimized for applications that mostly have short-lived objects, and a small amount of long-lived objects.

Things like large in-RAM LRU are basically the slowest thing for a garbage collector to do, because the mark-and-sweep phase always has to go through the entire cache, and because you're constantly generating garbage that needs to be cleaned.

pkolaczk | karma 2452 | avg karma 1.99 2020-02-04 22:17:12 | [–] similar comments

A high number of short lived allocations is also a bad thing in a compacting GC environment, because every allocation gets you a reference to a memory region touched very long time ago and it is likely a cache miss. You would like to do an object pool to avoid this but then you run into a pitfall with long living objects, so there is really no good way out.

mcguire | karma 18217 | avg karma 1.81 2020-02-04 23:31:33+00:00 | [–] similar comments

???

The allocation is going to be close to the last allocation, which was touched recently, no? The first allocation after a compaction wii be far from recent allocations, but close to the compacted objects?

pkolaczk | karma 2452 | avg karma 1.99 2020-02-05 18:46:42 | [–] similar comments

Close to the last allocation doesn't matter. What matters is the memory returned to the application - and this is memory that has been touched long ago and unlikely in cache. If your new generation size is larger than L3 cache it will have to be fetched from main memory for sure every time you start the next 64 bytes. I believe a smart cpu will notice the pattern and will prefetch to reduce cache miss latency. But a high allocation rate will use a lot of memory bandwidth and would thrash the caches.

An extreme case of that problem happens when using GC in an app that gets swapped out. Performance drops to virtually zero then.

sorokod | karma 4501 | avg karma 1.96 2020-02-04 22:27:21+00:00 | [–] similar comments

In this[1] video at about 32 min, mark there is a discussion on GC and apps that do caching.

[1] https://www.youtube.com/watch?v=VCeHkcwfF9Q

Nitramp | karma 1920 | avg karma 4.27 2020-02-05 13:06:50+00:00 | [–] similar comments

> The problem is that garbage collectors are optimized for applications that mostly have short-lived objects, and a small amount of long-lived objects.

I think it's not quite that.

Applications typically have a much larger old generation than young generation, i.e. many more long lived objects than short lived objects. So GCs do get optimized to process large heaps of old objects quickly and efficiently, e.g. with concurrent mark/sweep.

However as an additional optimization, there is the observation that once an application has reached steady state, most newly allocated objects die young (think: the data associated with processing a single HTTP request or user interaction in a UI).

So as an additional optimization, GCs often split their heap into a young and an old generation, where garbage collecting the young generation earlier/more frequently overall reduces the mount of garbage collection done (and offsets the effort required to move objects around).

In the case of Go though, the programming language allows "internal pointers", i.e. pointers to members of objects. This makes it much harder (or much more costly) to implement a generational, moving garbage collector, so Go does not actually have a young/old generation split nor the additional optimization for young objects.

pjmlp | karma 109153 | avg karma 1.76 2020-02-05 15:08:15 | [–] similar comments

Which is why on GC languages that also support value types and off GC-heap allocations, one makes use of them, instead of throwing out the baby with the water.

asimpletune | karma 4828 | avg karma 3.43 2020-02-04 20:39:53+00:00 | [–] similar comments

Ok but in rust those pointers can just be borrowed obviating the need for gc at all.

masklinn | karma 65147 | avg karma 3.36 2020-02-04 20:42:27 | [–] similar comments

Given it's a cache the entries would not have an existing natural owner… except for the cache itself.

There would be no need for a GC to traverse the entire map, but that's because rust doesn't use a GC.

falcolas | karma 33589 | avg karma 3.15 2020-02-04 21:11:50+00:00 | [–] similar comments

While Rust does not have a discrete runtime GC process, it does utilize reference counting for dynamic memory cleanup.

So you could argue that they are still going to suffer some of the downsides of a GC'ed memory allocation. Some potential issues include non-deterministic object lifespan, and ensuring that any unsafe code they write which interacts with the cache does the "right thing" with the reference counts (potentially including de-allocation; I'm not sure what unsafe code needs to do when referencing reference counted boxes).

masklinn | karma 65147 | avg karma 3.36 2020-02-04 21:22:46+00:00 | [–] similar comments

> While Rust does not have a discrete runtime GC process, it does utilize reference counting for dynamic memory cleanup.

That's so misleading as to essentially be a lie.

Rust uses reference counting if and only if you opt into it via reference-counted pointers. Using Rc or Arc is not the normal or default course of action, and I'm not aware of any situation where it is ubiquitous.

> So you could argue [nonsense]

No, you really could not.

mcguire | karma 18217 | avg karma 1.81 2020-02-04 23:37:49+00:00 | [–] similar comments

On the other hand, Rust's RAII management model behaves similarly to a reference counting system where the counts are limited to 0 and 1 (well, for a loose approximation of the 0 state), right?

steveklabnik | karma 91260 | avg karma 5.08 2020-02-04 23:44:52 | [–] similar comments

Some people say this, but I think it's misleading; refcounting can make things live longer, but the borrow checker cannot.

Jweb_Guru | karma 2749 | avg karma 2.24 2020-02-05 07:48:23+00:00 | [–] similar comments

Well, with ownership, a move can make things "live" longer, I guess.

steveklabnik | karma 91260 | avg karma 5.08 2020-02-05 14:00:48+00:00 | [–] similar comments

You're not wrong. I just think there's enough difference that the analogy doesn't really work.

ta20200202 | karma 37 | avg karma 6.17 2020-02-05 00:06:35+00:00 | [–] similar comments

RAII for references (pointers) is a no-op. If the cache returns references to the data in its own array there is no overhead.

falcolas | karma 33589 | avg karma 3.15 2020-02-05 00:36:52+00:00 | [–] similar comments

I was making an assumpotion that using a vector of ARC<T> would be the best way to handle a global LRU cache. Perhaps I should have specified it, but it seemed pretty obvious. Sorry if it wasn’t.

If there’s a better way to handle a global LRU cache, I’m all ears.

jhgg | karma 3220 | avg karma 6.78 2020-02-05 00:54:09+00:00 | [–] similar comments

Assuming only one thread at a time needs to access the LRU cache (not hard with the shared-nothing message passing architecture which we employ here), the lifetime of the object being checked out from the cache is able to be understood at compile time, and we can just use the borrow checker to ensure that it remains that way (we've got a mutable reference to the LRU, and we can use that to get a mutable reference to an object within the LRU. By the time the function that is mutating the data in the LRU finishes, the references to the objects must be dead (the borrow checker will enforce that.) Since all this information is available during compile time, runtime ref-counting (via rc/arc) is not necessary.

This is made possible by rust's memory model, where it understands ownership of data, and the lifetime of each reference that's being taken from that owned data. This means that the compiler can statically determine how long an object needs to live, and that references to the object don't outlive the owned data. For use-cases where the lifetime of references are able to be statically understood, an arc/rc is not required. This blog-post goes into it in much better detail than I can: https://words.steveklabnik.com/borrow-checking-escape-analys...

falcolas | karma 33589 | avg karma 3.15 2020-02-05 01:40:06 | [–] similar comments

Yes, I'm quite familiar with rust's borrow checking model. I've programmed some in rust, and the rest has been beaten into my head quite thoroughly by Rustaceans. I don't care for Rust, but I understand it.

Locking on one thread at a time seems like a pretty obvious performance flaw. It just doesn't seem like an appropriate design for the given workload (lots of requests, lots of stored items, largely write-only (except for its position in the queue)). It would make a lot more sense to grant multiple threads access the LRU at any given time.

And early optimization and all that aside, creating the LRU in such a way that it can be easily restricted to one thread or opened up makes the most sense to me. Otherwise, you get to re-write the LRU (and all the code which accesses it) if it should be identified as a bottleneck.

Of course, I'm not responsible for the code or truly involved in the design process, so my perspective may be limited.

jhgg | karma 3220 | avg karma 6.78 2020-02-05 04:14:28+00:00 | [–] similar comments

In practice, for our service, most of our CPU time is not spent in data mutation, but rather networking and serialization (this is btw, the same conclusion Redis came to when they added "multi-threading".)

You can scale-out by running multiple instances of the service (shared-nothing, N many depending on how cores you want to run on.) Or, you can do message-passing between cores.

In this case, we have 2 modes of scale-up/out (add more nodes to the cluster, or add more shared-nothing LRU caches that are partitioned internally that the process runs, allowing for more concurrency).

We however only run one LRU per node, as it turns out that the expensive part is not the bottleneck here, nor will it probably ever be.

jstrong | karma 353 | avg karma 2.04 2020-02-05 04:48:58 | [–] similar comments

what kind of design do you have in mind? I assume you don't mean simultaneous reads/writes from multiple threads without synchronization - yolo! there's a lot of possible designs, mutex, read/write lock, concurrent hashmap. I've never worked on an LRU cache, asking because interested in what plays well in that use case, and how you would approach it in another language.

iknowstuff | karma 1789 | avg karma 2.62 2020-02-04 21:48:00+00:00 | [–] similar comments

I think you're confusing Rust's ownership model with Swift's ARC. Rust doesn't do reference counting unless you use Rc<T> or Arc<T>.

falcolas | karma 33589 | avg karma 3.15 2020-02-05 00:34:29+00:00 | [–] similar comments

Given the model of memory we are discussing (a global per-process LRU cache), that’s exactly what I was discussing using. Unless there’s another way to handle such global caches.

bearcherian | karma 52 | avg karma 4.33 2020-02-04 20:49:28+00:00 | [–] similar comments

The article also mentions the service was on go 1.9.2, which was released 10/2017. I'd be curious to see if the same issues exist on a build based on a more recent version of Go.

typical182 | karma 628 | avg karma 7.95 2020-02-04 21:05:15 | [–] similar comments

Maybe that is what they hit... but it seems there is a pretty healthy chance they could have resolved this by upgrading to a more modern runtime.

Go 1.9 is fairly old (1.14 is about to pop out), and there have been large improvements on tail latency for the Go GC over that period.

One of the Go 1. 12 improvements in particular seems to at least symptomatically line up with what they described, at least at the level of detail covered in the blog post:

https://golang.org/doc/go1.12#runtime

“Go 1.12 significantly improves the performance of sweeping when a large fraction of the heap remains live.“

kristianp | karma 11941 | avg karma 3.44 2020-02-05 04:31:38+00:00 | [–] similar comments

I was thinking that if their cache is just one large hash table, essentially an array of structs, the GC wouldn't need to scan it. What you say about strings contained in the map would explain their problems, however I don't see the reason for it. Wouldn't you make sure every identifier uses a fixed-length GUID or similar, which would be contained in such a struct used in the array-of-structs?

Thaxll | karma 4756 | avg karma 1.87 2020-02-04 20:10:52 | [–] similar comments

Really interesting post, however they're using a 2+years old runtime, Go 1.9.2 was released 2017/10/25 why did they not even try Go 1.13?

For me the interesting part is that their new implementation in Rust with a new data structure is less than 2x faster than an implementation in Go using a 2+years old runtime.

It shows how fast Go is vs an very optimized language + new data structure with no GC.

Overall I'm pretty sure there was a way to make the spikes go away.

Still great post.

yazaddaruvala | karma 3171 | avg karma 2.36 2020-02-05 05:31:15+00:00 | [–] similar comments

The graphs were in different units. The final Rust version was over 100x faster.

Thaxll | karma 4756 | avg karma 1.87 2020-02-05 14:31:30+00:00 | [–] similar comments

Which doesn't make any sense. Rust is not x100 faster than Go.

yazaddaruvala | karma 3171 | avg karma 2.36 2020-02-06 08:33:36+00:00 | [–] similar comments

Rust and Go likely translate into similar enough assembly for similar code to make the performance close enough.

However, bigger caches will always have more cache hits than smaller caches. Therefore could easily be 100x faster.

The blog does a better job explaining everything than I can but simply put the “granular” memory management Rust allows gave them an improved ability to create a bigger cache. Go (at the time) while great did not work well for that particular usecase and required smaller cache sizes.

blackrock | karma 537 | avg karma 0.39 2020-02-04 20:17:01 | [–] similar comments

Would it have been better if they went with Elixir?

Write their code in a functional style. Get the benefits of the Erlang BEAM platform.

Their system runs over the web, so time sensitivity isn’t as important, in comparison to video games, VR, or AR.

Anyone ever done a performance comparison breakdown between something like Elixir vs. Rust?

steveklabnik | karma 91260 | avg karma 5.08 2020-02-04 20:18:59 | [–] similar comments

Discord is a heavy Elixir user, and even uses it with Rust via NIF: https://blog.discordapp.com/using-rust-to-scale-elixir-for-1...

jerf | karma 85298 | avg karma 5.28 2020-02-04 21:08:21+00:00 | [–] similar comments

"Would it have been better if they went with Elixir?"

No. It would have been unshippably bad. BEAM is generally fairly slow. It was fast at multitasking for a while, but that advantage has been claimed by several other runtimes in 2020. As a language, it is much slower than Rust. Plus, if you tried to implement a gigantic shared cache map in Erlang/Elixir, you'd have two major problems: One is that you'd need huge chunks of the map in single (BEAM) processes, and you'd get hit by the fact BEAM is not set up to GC well in that case. It wants lots of little processes, not a small number of processes holding tons of data. Second is that you'd be trading what in Rust is "accept some bytes, do some hashing, look some stuff up in memory" with generally efficient, low-copy operations, with "copy the network traffic into an Erlang binary, do some hashing, compute the PID that actually has the data, send a message to that PID with the request, wait for the reply message, and then send out the answer", with a whole lot of layers that expect to have time to make copies of lots of things. Adding this sort of coordination into these nominally fast lookups is going to slow this to a crawl. It's like when people try to benchmark Erlang/Elixir/Go's threading by creating processes/goroutines to receive two numbers and add them together "in parallel"; the IPC completely overshadows the tiny amount of work being done. (They mention tokio, but that's still going to add a lot less coordination overhead than Erlang messages.)

Go is a significantly better language for this use case than Elixir/Erlang/BEAM is, let alone Rust.

(This is not a "criticism" of Erlang/Elixir/BEAM. It's an engineering analysis. Erlang/Elixir/BEAM are still suitable for many tasks, just as people still use Python for many things despite the fact it would be a catastrophically bad choice for this particular task. This just isn't one of the tasks it would be suitable for.)

sergiotapia | karma 13626 | avg karma 2.88 2020-02-04 21:24:55 | [–] similar comments

>It was fast at multitasking for a while, but that advantage has been claimed by several other runtimes in 2020.

Such as?

jerf | karma 85298 | avg karma 5.28 2020-02-05 02:42:58+00:00 | [–] similar comments

Go, Rust is apparently getting there since it just smoked Go on a pretty core multitasking-heavy task here, and a lot of the old "stodgy" languages like Java while none of the cool kids were looking have gotten pretty good at large numbers of threads too.

I'm also expecting some sort of sensible solution to this sort of concurrency challenge to simply be baseline expected functionality for the next generation of languages. Anyone sitting down in 2020 to write The Next Big Language who ignores the fact that by the time they're done, their CPUs are going to be 64-core and the GPUs will be several-hundred-thousand core is really going to be missing the boat.

I actually encourage Erlang partisans to consider this a win in the general sense. Quite serious. If you consider languages as a multi-decade conversation, almost everything Erlang "said" in the late 1990s and early 2000s is in fact proving out to be true. However, while Erlang was a trailblazer, it isn't going to be that Next Big Language, nor anything like it. It got a lot of things right, but it's got too much wrong to be the NBL, and even if you did the minimal fixes, the result wouldn't be backwards compatible with Erlang anymore so it'd be a new language.

A lot of Erlang partisans are making the error of thinking the next language needs to be just like Erlang, except perhaps more so. But that's not how progress gets made. The good ideas are ripped out at a much more granular level and recombined with all sorts of other ideas and the end result may be quite different than what you expect. Go, for instance, can very much be seen as a direct sequel to Erlang among other things, even though it may not seem to have "OTP" or "built-in multinode concurrency" or whatever other apparent bullet-list features Erlang has, because a lot of those bullet-list features are really just epiphenomena of the real underlying features. Rust has its own nods to Erlang too; the whole "ownership" thing comes from the experiences with both mutation and immutability in a threading environment. But rather than making "Erlang, but a bit moreso", you get something that takes the lessons, ponders on them for another 15 years, and produces something different. It isn't Erlang, because at the time that Erlang was being written (Joe Armstrong RIP), nobody had the experience to think of Rust and imagine it could make a practical language. (Nor am I entirely sure the computers of the time could have compiled it; even if we sat through the clock time I'm not sure Rust could be stuffed into a 1997 desktop computer's RAM.)

blackrock | karma 537 | avg karma 0.39 2020-02-05 02:01:15 | [–] similar comments

Good points.

> It wants lots of little processes, not a small number of processes holding tons of data

Elixir/Erlang is good for handling a lot of little processes with a small amount of data. And not for a small number of processes, handling a large amount of data.

The little processes holds smaller data, and it just gets dropped, after the function is done, instead of getting reclaimed by a garbage collector.

This is probably what makes Elixir/Erlang good for telecom equipment, like packet switching hardware, but not good for more complex software applications that may need to fetch and manipulate a lot of structured data in multiple stages.

In this case, does anyone know of Elixir’s maximum throughput?

hopia | karma 254 | avg karma 1.12 2020-02-05 02:27:45+00:00 | [–] similar comments

Not to disagree with your analysis of the performance implications, but I don't think having all that data under a single or a few processes would be the right architectural pattern to handle this in Elixir.

The article says that the data is basically "per-user", indicating that the active client connection process could be used to store the data. It already hosts other data related to the client (connection) anyway. I think updating and querying it globally would be the trouble in that case.

Another could be storing the data in mnesia, BEAM's internal mutable in-memory DB. Probably better, but still not ideal to solve this.

Anyway, you're right in that no matter how you'd try to solve this problem on pure Elixir you'd still be seeing some bottlenecks because BEAM just isn't very well suitable for this kind of problems, hence Rust.

But can you elaborate on what you mean by other platforms catching up with Elixir's inherent concurrency advantages? Which modern platforms give similar features?

jerf | karma 85298 | avg karma 5.28 2020-02-05 02:59:12 | [–] similar comments

"The article says that the data is basically "per-user","

Given that this is a table of who is "online", I don't think that's per-user in the sense that you are inferring. I infer that it's not a whole bunch of little local data that doesn't interact, it's a big global table of who is online and not online, constantly being heavily read from and written to in real time. Consider from the perspective of Bob's Erlang process that he wants to go offline and notify all of his currently-online friends that is is going offline. Bob's Erlang process doesn't have that data. Bob's Erlang process is going to get it from the Big Table of Who's Online. That table is the problem; it can't be stored in Bob's Erlang process.

I was at least imagining that the table could be partitioned into pieces pretty trivially (first X bits of the hash), but with Erlang's design, that implies an IPC just to ask some server process to give me the PID of the chunk I need to talk to, which itself is going to bottleneck. (In practice we'd probably cheat and use a NIF to do that, but that amounts to an admission that Erlang can't do this, so....)

At smaller scales you could try to live update Bob's local information as it changes, but this breaks down in all sorts of ways at scales far smaller than Discord, scales much closer to "a single mid-sized company".

"Another could be storing the data in mnesia, BEAM's internal mutable in-memory DB."

I have used mnesia for loads literally a ten-thousandth as small as this, if that (I could probably tack two more zeros on there), and it breaks down. It is an absolutely ludicrous idea that mnesia could handle what Discord is doing here. Last I knew the official Erlang community consensus was basically that mnesia really shouldn't be used for anything serious; my experience backed that up.

I think a non-trivial part of the reason why Erlang hasn't taken off is that its community still seems to exist in 2003, where it's a really incredible unique language that solves huge problems that nobody else does. In 2003, it rather has a point. But a lot of things have learned from Erlang, and incorporated its lessons into newer designs, and moved on.

See my other comment for what other runtimes have Erlang's advantages, but I'd invite you just to consider what we seem to basically agree on here; Erlang would be wildly slower and require a lot more hardware than Rust, the Rust code probably wasn't that hard to write, ... and the Rust code is way more likely to be correct than the Erlang code, too. I mean, what more "catching up to Elixir's inherent concurrency advantages" in this context than "did a job Elixir couldn't possibly do" do you want?

hopia | karma 254 | avg karma 1.12 2020-02-05 03:45:45+00:00 | [–] similar comments

Yeah the scale is what makes this problem a problem here. I've done exactly that "online" stuff per user process and it works fine on a small scale, even when it needs to be globally inferred. But I suspect it'd quickly become the bottleneck when scaling.

I had no idea mnesia was that fragile though, what gives? What kind of issues did you encounter with it? What do you use now to solve those issues with Erlang/Elixir?

Sure, we all know Erlang doesn't shine in computationally intensive workloads. Obviously, Rust was the right call here. But stateful distributed soft real-time concurrency, can you really say with a straight face that Rust comes with all the same features as BEAM out-of-the-box? Or any other modern platform for that matter. I've yet to see Erlang/Elixir beaten in that particular niche.

jerf | karma 85298 | avg karma 5.28 2020-02-05 15:15:42+00:00 | [–] similar comments

"I had no idea mnesia was that fragile though, what gives? What kind of issues did you encounter with it? What do you use now to solve those issues with Erlang/Elixir?"

I had ~10,000 devices in the field with unique identifiers creating long-term, persistent connections to a central cluster. An mnesia table stored basically $PERSISTENT_ID -> PID they are connected to. It needed to be updated when they connected and disconnected, which let me emphasize was a relatively rare occurrence; the ideal system would be connected for days at a time, not connecting & disconnecting dozens of times a minute. At most, reconnection flurries might occasionally occur where they'd all try to connect over the course of a few minutes (they had backoff code built in) if the cluster was down for some reason.

Mnesia fell over. A lot. All I could find online as an explanation was basically "yeah, don't do that with mnesia". Bizarrely, it wasn't the connection flurries that did it, either... it was the normal "maybe a few dozen events per second" that tended to do it. Erlang itself was usually fine. (Although for machines right next to each other in a rack, I did lose the clustering more often than I'd like, and have to hit the REPL to re-associated nodes together. Much less often than mnesia corrupted itself, though.)

"can you really say with a straight face that Rust comes with all the same features as BEAM out-of-the-box?"

Well, that's another way of looking at what I was trying to say. That's the wrong question. Rust doesn't need "all the same features as BEAM". Rust needs "the features necessary to do the work". While the Erlang community is looking for a language that has "all the same features as BEAM" and smugly congratulating themselves that no other language seems to have cracked that yet, a number of languages are passing them by by implementing different features. Many of those languages, as I said, are informed by Erlang. Many of these new languages are choosing their "not exactly like Erlang" features in knowledge, not ignorance, as I think the Erlang community thinks.

Besides, Erlang builds in a lot of things that can be libraries in other languages. I built the replacement in Go. Mostly because it was hard to get people who wanted to work in Erlang but despite the rage on HN anytime Go comes up, getting people who are willing to work in Go was trivial even 5 years ago. (Hiring someone who knows Go already is still a bit of a challenge, but crosstraining someone into it is easy.) For the port, I wrote https://github.com/thejerf/reign . You will look at it and go "But Erlang has this and that and the other thing with its clustering, and your thing doesn't have those things!" And my response is twofold: First, that some of those things are supported in Go code in other ways than what you are expecting, and that was not intended to be "Erlang in Go" but "a library for helping port Erlang programs into Go without rearchitecting", and second... the resulting cluster has been more reliable and more performant (we actually cut the cluster from 4 to 2, because now even a single machine can handle the entire load), and all the "features" reign is missing, well, maybe they aren't so important out of the context of Erlang. I suppose in my own way this is another sort of story like Discord's; on the metrics I care about, my home-grown clustering library worked better for me than Erlang's clustering code.

(In fact, Go's even got the edge on Erlang for GC for my use case, which is one of the ways in which the new system is more performant. Now, it happens that my system is architected on sending around messages that may frequently be several megabytes in size, and Erlang was really designed for sending around lots of messages in the kilobyte range. Even as I was using it, Erlang got a lot better with handling that, but it still was never as good or fast as Go, and Go's only gotten better since then, too. I was able to do things in Go for performance to re-use my buffers that are impossible in Erlang.)

So, I mean, while I do deeply respect Erlang for its pioneering position, and I am particularly grateful for the many years I spent with it back when it was the only option of its sort (if I had to write the project in question in C++ or something, I just wouldn't have; do not think I "hate" Erlang or something, I am very grateful for it), if I am a bit less starry-eyed about it than some it's because I see it as... just code. It's just code. Erlang gets no special access to CPU instructions or special Erlang-only hardware that allows it to do things no other language can. It's just code. Code that can be and has been written in other languages, in other environments.

I like Erlang in a lot of ways, and respect its place in history. But it's community is insular, maybe even a bit sick, and I don't really expect that to change, because once an individual realizes it, they tend to just leave, leaving behind only the True Believers, who still believe that Erlang is the unique and special snowflake... that it was... 15 years ago.

hopia | karma 254 | avg karma 1.12 2020-02-05 22:35:57+00:00 | [–] similar comments

Thanks for the comprehensive reply!

I guess I better experiment more with mnesia before really using it for anything serious. Or find alternatives. We had Redis before but that experience turned out just awful so we got rid of it.

As for the community, I think Elixir is where it's at nowadays. There is, unsurprisingly, a very strong focus on webby stuff with Elixir, and a lot of the things you would build with it are just easy. Like a multi-machine chat server.

If I started to build a new distributed chat server today, Elixir would still be the easiest way to go, despite eventually likely not being the most performant solution out there. Discord likewise seems happy with their choice for this particular use case, only supplementing it with the likes of Rust for specific problems in their domain.

I mean you yourself built a lot of the Erlang's/BEAM's logic from the scratch on Go just to be able to use it there. I'm expecting I'd end up in a similar alley with Rust/Haskell/take your pick if I was attacking the problems where Elixir has all the facilities already set up and battle tested.

dennisgorelik | karma 2312 | avg karma 1.24 2020-02-04 20:41:46 | [–] similar comments

> Changing to a BTreeMap instead of a HashMap in the LRU cache to optimize memory usage.

Why would BTreeMap be faster than HashMap? HashMap performance is O(1), while BTreeMap performance is O(log N).

nemothekid | karma 13576 | avg karma 4.48 2020-02-04 20:46:26 | [–] similar comments

1. They never said it was faster, only that memory usage was better. Regardless, it could be the case that log N < C, if C is sufficiently large.

2. Memory usage on a hash map would be worse especially if the fill ratio is relatively low.

scott_s | karma 34069 | avg karma 3.96 2020-02-04 21:20:43 | [–] similar comments

This subthread explains why it's more memory efficient to use a tree-based structure: https://news.ycombinator.com/item?id=22239393. Short version is that in order to get good performance out of a hashtable based structure, you want to have more than n slots in order to achieve good performance.

Which brings me to my second point: hashtable based data structures are not worst-case O(1). They are worst-case O(n), because in the worst case, you will either have to scan every entry in your table (open addressing) or walk a list of size n (separate chaining). Of course, good hashtable implementations will not allow a situation with so many collisions, but in order to avoid that, they will need to allocate a new table and copy over the contents of the old, which is also a O(n) operation.

Given two kinds of data structures, one which is average-case O(1), but worst-case O(n) versus best- and worst-case O(log n), which one you choose depends on what kinds of performance you're optimizing for, and how bad the constants are that we've been ignoring. If you care more about throughput, then you usually want average-case O(1), as the occasional latency spikes aren't important to you. But if you care more about latency, then you'll probably want to choose worst-case O(log n), assuming that its implementations constants aren't too bad.

Jweb_Guru | karma 2749 | avg karma 2.24 2020-02-05 07:56:22+00:00 | [–] similar comments

Cuckoo hashmaps are worst case O(1) when implemented correctly, up to resizing (however, they do need more space and perform worse in virtually all real benchmarks).

harikb | karma 2924 | avg karma 4.18 2020-02-04 21:07:26+00:00 | [–] similar comments

> Discord has never been afraid of embracing new technologies that look promising.

> Embracing the new async features in Rust nightly is another example of our willingness to embrace new, promising technology. As an engineering team, we decided it was worth using nightly Rust and we committed to running on nightly until async was fully supported on stable.

> Changing to a BTreeMap instead of a HashMap in the LRU cache to optimize memory usage.

It is always an algorithm change

dancemethis1 | karma -7 | avg karma -0.5 2020-02-04 21:19:32+00:00 | [–] similar comments

Well, none of it matters since Discord is hostile software. No language will solve their privacy-trampling deeds.

fmakunbound | karma 1631 | avg karma 3.18 2020-02-04 21:37:17+00:00 | [–] similar comments

Why /does/ it run a GC every 2 minutes? I went looking and didnd't find a reason in the code...

https://github.com/golang/go/search?q=forcegcperiod&unscoped...

Go's GC seems kind of primitive.

truthwhisperer | karma 26 | avg karma 0.12 2020-02-04 21:37:31+00:00 | [–] similar comments

other example of bad IT management. Spend those millions on improving Go instead of refactoring code and moving to rust. And why the hell did you choose for Go anyway. Because some hancy fancy developper try to copy Google?

Bad.

dfee | karma 2710 | avg karma 2.97 2020-02-04 21:47:27+00:00 | [–] similar comments

The one problem I’m curious as to how channel-based chat applications solve, to which my google-fu has never lead me in the right direction: how do you handle subscriptions?

I imagine a bunch of front end servers managing open web sockets connections, and also proving filtering/routing of newly published messages. Alas, it’s probably best categorized as a multicast-to-server, multicast-to-user problem.

Anyways, if there’s an elegant solution to this problem, would love to learn more.

Pfhreak | karma 9045 | avg karma 4.25 2020-02-04 21:50:29+00:00 | [–] similar comments

Not sure if this is exactly what you are looking for, but I'd do some digging into consistent hash rings.

dfee | karma 2710 | avg karma 2.97 2020-02-04 22:03:13+00:00 | [–] similar comments

Oh, interesting: https://en.m.wikipedia.org/wiki/Consistent_hashing

> Consistent hashing maps objects to the same cache machine, as far as possible. It means when a cache machine is added, it takes its share of objects from all the other cache machines and when it is removed, its objects are shared among the remaining machines.

I guess the challenge here is that subscriptions are sparse: I.e. one ws connection can carry multiple channel subscriptions, thus undermining the consistent hash.

Pfhreak | karma 9045 | avg karma 4.25 2020-02-04 22:13:42+00:00 | [–] similar comments

There's a number of ways to tweak the algorithm, e.g. by generating multiple hashes per endpoint and then distributing them around a unit circle.

I've seen this used to consistently allocate customers to a particular set of servers, not just ensure you are hitting the right cache. It doesn't fully solve the subscription issue where multiple people are in multiple channels, but it could probably be used as a building block there.

yippir | karma 10 | avg karma 10.0 2020-02-04 21:56:27 | [–] similar comments

I chose Rust over Go after weighing the pros and cons. It was an easy decision. I wouldn't consider using a high level language that lacks generics. The entire point of using a high level language is writing less code.

shdh | karma 190 | avg karma 1.0 2020-02-04 23:21:03+00:00 | [–] similar comments

The syntax looks pedantic to me. Going to require some adjusting.

nottorp | karma 9517 | avg karma 1.87 2020-02-04 21:57:51+00:00 | [–] similar comments

Can someone wake me up when they switch from javascript to something native in the client?

I just checked and as usually, I have an entry labeled "Discord Helper (Not Responding)" in my process list. I don't think i've ever seen it in a normal state.

zlynx | karma 795 | avg karma 2.26 2020-02-04 22:56:00+00:00 | [–] similar comments

That is kind of bad Windows programming but easy to do when writing an app that doesn't need to handle Windows event messages. It probably sits in a loop waiting on socket events and doesn't care if you sent it a WM_QUIT or not. It would be easy to pump the message loop and ignore all, but why bother?

nottorp | karma 9517 | avg karma 1.87 2020-02-04 23:37:53+00:00 | [–] similar comments

Lol it's a javascript thing that instantiates a copy of Chrome, not a Windows program. I doubt they know what a WM_QUIT is...

viraptor | karma 41139 | avg karma 2.79 2020-02-04 22:00:53+00:00 | [–] similar comments

The next step I expected after LRU tunning was to do simple sharding per user, so that there are more services with smaller caches, (cancelling out the impact) with smaller GC spikes, offset in time from each other. I'm curious if that was considered and not done for some reason.

pkolaczk | karma 2452 | avg karma 1.99 2020-02-04 22:07:45+00:00 | [–] similar comments

This is consistent with my observations of porting Java code to Rust. Much simpler and nicer to read safe Rust code (no unsafe tricks) compiles to programs that outperform carefully tuned Java code.

marta_morena | karma 35 | avg karma 1.67 2020-02-04 22:18:22+00:00 | [–] similar comments

Sorry, but `Much simpler and nicer` is something that I highly doubt when you talk about Java to Rust. Unless the people writing the Java code were C programmers, lol, in which case I feel for you.

efaref | karma 1197 | avg karma 3.74 2020-02-04 22:25:33+00:00 | [–] similar comments

Rust's type system is more expressive than Java's so you can end up with much nicer to read code with stricter and more obvious invariants. There also tends to be way less of the `EnterpriseJavaBeanFactory`-style code in idiomatic Rust.

Polyisoprene | karma 101 | avg karma 1.44 2020-02-04 22:39:10 | [–] similar comments

Trolling is fun and all, but I wouldn’t say Rust’s type system that much more advanced than Java’s. The borrow checker definitely helps to catch errors, but I would rate them at basically the same level.

gameswithgo | karma 10455 | avg karma 3.71 2020-02-04 22:49:31 | [–] similar comments

So, being able to have a value or a reference vs everything always being a reference is a pretty massive difference. Option types instead of nulls is also a pretty large difference. Generics being better from a performance perspective is a large difference too.

As well, traits are quite a bit different than classes, but not always in a good way!

Polyisoprene | karma 101 | avg karma 1.44 2020-02-04 23:47:16 | [–] similar comments

Hopefully Java will support proper values regarding inlining and on-stack allocation for non-primitive types soon, but yes option types are a benefit in Rust as long as you’re not using any more advanced reference libraries.

tene | karma 899 | avg karma 2.78 2020-02-04 23:52:57 | [–] similar comments

Can you explain more about "advanced reference libraries", and how they interfere with the benefits of option types?

rattray | karma 6323 | avg karma 3.04 2020-02-04 23:02:14 | [–] similar comments

> Trolling is fun and all

This isn't in keeping with HN's guidelines[0], eg:

> Be kind. Don't be snarky.

Of course, you may be expressing an unpopular opinion, so your comment may receive downvotes regardless. You may wish to delete and repost without the first clause to more accurately gauge community response.

https://news.ycombinator.com/newsguidelines.html

Polyisoprene | karma 101 | avg karma 1.44 2020-02-04 23:35:49+00:00 | [–] similar comments

Sure, just reflecting on the “There also tends to be way less of the `EnterpriseJavaBeanFactory`-style code in idiomatic Rust.”

roca | karma 5364 | avg karma 3.68 2020-02-04 23:16:05+00:00 | [–] similar comments

Rust's type system is definitely much more powerful, even ignoring borrowing.

Rust's affine (ownership) types add a lot of power. They make it possible for APIs to take ownership of a passed-in object and guarantee no other references to it exist. For example this lets you manually deallocate resources (e.g. close a File), while preserving the invariant that if you have a reference to a File, then it is open.

Also, Rust traits and generics are a lot more powerful than anything Java has. For example Rust generics support associated types. E.g. the Iterator trait has an associated type Item:

trait Iterator { type Item; ... }

You can now write code that's generic over Iterator, and refers to its Item type:

fn first<I: Iterator>(iter: I) -> I::Item { iter.next().unwrap() }

Toy example, but associated types are really important.

Rust traits and generics are more powerful in other ways too. E.g. in Rust you can do anything with a generic type, unlike Java where type erasure means you can't write 'new T' etc.

roca | karma 5364 | avg karma 3.68 2020-02-04 23:28:48+00:00 | [–] similar comments

BTW I'm not knocking Java here. Java's simpler type system is actually great for the applications I think Java is good for --- business logic and simple mobile apps, the COBOL of the 21st century.

Well, except that Java went ahead and gratuitously complicated their type system with wildcards. That was crazy.

Polyisoprene | karma 101 | avg karma 1.44 2020-02-04 23:38:18 | [–] similar comments

Sure, but generics are available in Java as well and compared to Haskell and similar language I rate Rust as closer to Java.

jeremysalwen | karma 894 | avg karma 3.52 2020-02-05 00:16:48+00:00 | [–] similar comments

Rust's generics are much more powerful. You can do type level programming, e.g. : https://docs.rs/peano/1.0.2/peano/

There's an interesting comparison of Rust and Haskell's type system here: https://www.reddit.com/r/rust/comments/4jh8hv/question_about... I guess it's just a matter of opinion about what sorts of differences you think are significant vs not.

roca | karma 5364 | avg karma 3.68 2020-02-05 00:22:29+00:00 | [–] similar comments

I think most people would rate Rust as being closer to Haskell than Java at the type level.

Rust traits are very much like Haskell type classes. Java gives you classic OOP, but neither Rust nor Haskell do.

Rust generics are much more like Haskell generics than Java generics. Rust and Haskell both have associated types, Java doesn't. Java generics are crippled due to type erasure; Rust and Haskell don't have those limitations.

Rust and Haskell generics support a lot of type-level computation; Java doesn't.

Rust and Haskell don't have ubiquitous nullable-by-default values; Java does.

Rust and Haskell have discriminated sum types; Java doesn't.

The only way I think Rust is more like Java than Haskell at the type level is that Haskell has higher-kinded types and Rust/Java don't. There are plans to fix this in Rust though.

logicchains | karma 9077 | avg karma 2.62 2020-02-05 03:19:14+00:00 | [–] similar comments

>Java generics are crippled due to type erasure; Rust and Haskell don't have those limitations.

Both Rusk and Haskell erase types at runtime. The difference is they don't rely on runtime reflection for anything, so it doesn't hurt them.

roca | karma 5364 | avg karma 3.68 2020-02-05 04:19:39 | [–] similar comments

In this context "type erasure" means that Java compiles all instances of a generic method to a single implementation that is oblivious to the type parameters. Thus in Java you can't write "T t = new T()" where T is a generic parameter, because the type-erased code doesn't know what T to create.

In Rust, each instance of a generic function is compiled separately and customized as necessary to the specific type parameters. You can write "let t = T::new();" because the compiler will generate a call to the correct constructor for each instance of the generic code. In this sense, types are NOT erased.

gameswithgo | karma 10455 | avg karma 3.71 2020-02-04 22:31:16 | [–] similar comments

Remember that the comparison is a hand performance tuned Java program, vs a naive Rust implementation. When you are trying to extract performance out of Java, things can get pretty messy.

echelon | karma 19387 | avg karma 2.75 2020-02-05 00:45:35+00:00 | [–] similar comments

I write Java during the day but wish I could write Rust instead.

Rust has a much nicer standard library, and the semantics and abstractions are so incredibly beautiful. It's a language that has learned from the millions of human-years that went into other languages and ecosystems. Every corner and seam in the language design speaks to this.

Traits, union type enums, pattern matching, option and result types, derive(), error handling semantics - it's all incredibly intuitive and expressive.

If I had a choice to write Rust everywhere, I would.

millstone | karma 6895 | avg karma 3.52 2020-02-05 01:45:58+00:00 | [–] similar comments

Rust has its share of random-feeling nonsense, like requiring PhantomData to "work around" unused type params, or that you can't compare arrays longer than 32 elements.

gridspy | karma 2743 | avg karma 2.41 2020-02-05 02:07:05 | [–] similar comments

1. I've never had to use PhantomData since I started coding in rust pretty much fulltime 2 years ago. 2. You can compare arrays longer than 32, but the compiler will no longer create an automatic comparison for you. So it's not that you "can't" do it. -- That said, using == to compare >32 elements sounds inefficient, perhaps check your use-case?

xmprt | karma 3764 | avg karma 3.07 2020-02-05 02:39:25+00:00 | [–] similar comments

Not comparing arrays longer than 32 elements by == sounds like a feature to me.

millstone | karma 6895 | avg karma 3.52 2020-02-05 03:02:18+00:00 | [–] similar comments

The 32-count cliff is not just for Eq - it also defeats niceties like Debug, Hash, Default. This isn't a deliberate design decision, it's due to a (current) limitation of Rust's value-type generics.

The point is to illustrate that when you WTF using Rust, sometimes it's not you, it really is Rust.

lumost | karma 9692 | avg karma 3.34 2020-02-05 03:46:35+00:00 | [–] similar comments

A great example of this can be seen writing any recursive data-structure such as a doubly linked list.

ElderKorihor | karma 85 | avg karma 4.47 2020-02-05 04:33:22+00:00 | [–] similar comments

I think there are a few places were Rust didn't get it quite right, but this seems like a weird example. The 32 length limit is a temporary limitation, not a corner that Rust is painted into. Indeed, my understanding is that the limitation at this point is being maintained artificially due to an overabundance of caution (via https://doc.rust-lang.org/std/array/trait.LengthAtMost32.htm... ). That's not WTF, that's just TODO.

Jweb_Guru | karma 2749 | avg karma 2.24 2020-02-05 07:57:42 | [–] similar comments

It's not really a workaround (it didn't used to be required), it asks for PhantomData so it can figure out variance and infer auto traits.

adamnemecek | karma 57571 | avg karma 6.54 2020-02-05 15:42:35 | [–] similar comments

Which language doesn't.

dang | karma 18142 | avg karma 0.25 2020-02-05 04:23:33 | [–] similar comments

We detached this subthread from https://news.ycombinator.com/item?id=22240978.

romaniitedomum | karma 84 | avg karma 2.1 2020-02-04 22:31:17+00:00 | [–] similar comments

You're switching to Rust because Go is too slow? Colour me sceptical, but this seems more like an excuse to adopt a trendy language than a considered technical decision. Rust is designed first and foremost for memory safety, and it sacrifices a lot of developer time to achieve this, so if memory safety isn't high in your list of concerns Rust is probably not going to bring many benefits.

hajile | karma 6470 | avg karma 2.37 2020-02-04 22:44:34+00:00 | [–] similar comments

Did you read the article? The naive Rust version was better than the tuned golang version in every metric. The most important one (latency) simply wasn't fixable due to golang's GC (something that is a bit of a general GC issue I might add).

romaniitedomum | karma 84 | avg karma 2.1 2020-02-05 03:08:28+00:00 | [–] similar comments

Did you read my comment? I don't dispute that the Rust version is faster in every way. I am disputing that rewriting in Rust was a sensible technical decision, and in support of this I point you to where the author describes having to use a nightly build of the compiler to get async support. Given that they had to jump through a lot of hoops to make this work, I am saying they could have achieved the same speed increase with less effort using a stable C or C++ compiler. Hell, had they invested a fraction of the time spent rewriting in Rust in the Go version, I'll bet they could have improved it to the point where there was no need to rewrite it at all.

It's clear that Discord use Rust a lot, and that they are looking for any excuse to replace existing code with Rust code.

smabie | karma 4907 | avg karma 1.76 2020-02-04 22:47:58+00:00 | [–] similar comments

What would you recommend that doesn’t have a GC? Zig? C? Rust is a fine choice. Besides if you really don’t care, just make the entire program unsafe and you’ll still reap benefits over C or C++.

romaniitedomum | karma 84 | avg karma 2.1 2020-02-05 03:23:15 | [–] similar comments

Whatever language and toolset gets the job done with the least amount of effort. Given the hoops that Discourse had to jump through to get Rust working, that wasn't a good technical decision. They'd have got the same result with less pain with C++.

damnyou | karma -11 | avg karma -0.01 2020-02-05 13:34:14+00:00 | [–] similar comments

The costs of any long-lived service are dominated by ongoing maintenance, not the initial work.

iruoy | karma 532 | avg karma 2.84 2020-02-04 22:50:20 | [–] similar comments

So decreasing workload on the servers and avoiding spikes in the read states queue is bad business?

The article also states that is was quite easy to port over and didn't need any quirky tuning.

lllr_finger | karma 442 | avg karma 2.53 2020-02-04 22:58:18+00:00 | [–] similar comments

The goals of Rust are stated boldly right on the official website - "Performance" is one of them. In Discord's case, the hit in productivity was worth avoiding the GC issues in Go. I read the article and didn't come to the same conclusion, so I'm curious which passages led you to believe this was done to "adopt a trendy language"?

romaniitedomum | karma 84 | avg karma 2.1 2020-02-04 23:22:16 | [–] similar comments

Performance is not the core raison d'etre of Rust, and there are no shortage of testimonials to the difficulty new developers have with it, not to mention the slowness of its compiler. Given that, it's too much of a leap for me to get from "GC is too slow in Go" to "rewrite in Rust", at least when considered as a purely technical decision. There is no mention, for instance, of what other languages were considered. My guess is none were considered. Finally, the author states that Discord pride themselves in embracing new things, and cites having to work with the nightly build of the compiler to get async. All of this tells me that they chose Rust for non-technical reasons and were prepared to jump through all kinds of hoops to make it work. Which is fine, it's their business to run however they want, but I find the premise that Rust is an obvious choice for speed entirely unpersuasive. In most businesses, introducing unstable nightly builds of compilers to build production services would be a major red flag.

anon4242 | karma 366 | avg karma 2.49 2020-02-04 23:13:01+00:00 | [–] similar comments

Turns out that that boring stuff - type theory - they throw at you in uni, can be quite useful. Not only can it help with things like memory safety but also with speed. This is why for instance C++ std::sort() is faster than C qsort(), better type information available to the compiler allows it to make better optimizations. In rust the type system is king.

romaniitedomum | karma 84 | avg karma 2.1 2020-02-05 03:20:51+00:00 | [–] similar comments

No, in Ada the type system is king. If type theory is the solution, then Ada and SPARK, Ada's restricted subset for extra safety, leave Rust in the dust.

shanev | karma 859 | avg karma 3.9 2020-02-04 22:44:26+00:00 | [–] similar comments

When a company switches languages like this, it's usually because the engineers want to learn something new on the VC's dime. They'll make any excuse to do it. As many comments here show, there are other ways to solve this problem.

deepsun | karma 3058 | avg karma 1.77 2020-02-04 22:45:54 | [–] similar comments

Wait, isn't Go devs said they solved GC latency problems [1]?

(from 2015): "Go is building a garbage collector (GC) not only for 2015 but for 2025 and beyond: A GC that supports today’s software development and scales along with new software and hardware throughout the next decade. Such a future has no place for stop-the-world GC pauses, which have been an impediment to broader uses of safe and secure languages such as Go." [2]

[1] https://www.youtube.com/watch?v=aiv1JOfMjm0

[2] https://blog.golang.org/go15gc

terminaljunkid | karma 29 | avg karma 0.28 2020-02-05 09:32:00 | [–] similar comments

That seems to be written by some Manager with slight clue of tech, tbh.

arjunbajaj | karma 69 | avg karma 1.44 2020-02-04 23:03:58+00:00 | [–] similar comments

Question for the Discord team: Was implementing the same service in Elixir an option? Did you try it/why not?

robocat | karma 11778 | avg karma 2.08 2020-02-04 23:47:27 | [–] similar comments

Discord also use Elixir - there are comments elsewhere above for why Elixir might be a bad choice in this case.

arjunbajaj | karma 69 | avg karma 1.44 2020-02-05 01:04:40+00:00 | [–] similar comments

Thanks!

brylie | karma 1319 | avg karma 2.66 2020-02-04 23:11:37 | [–] similar comments

What are some recommended resources for a gentle introduction to Rust?

fatbird | karma 2749 | avg karma 2.62 2020-02-04 23:13:52 | [–] similar comments

I read the Rust Programming Language book over Christmas and it's a very good introduction to it, probably one of the best I've seen for any language. It's got a good voice, and it's very good about putting enough context around Rust design decisions to understand the why as well as the how. But's it's not so long that it feels like a slog.

steveklabnik | karma 91260 | avg karma 5.08 2020-02-04 23:46:43 | [–] similar comments

Thank you!

brylie | karma 1319 | avg karma 2.66 2020-02-05 01:15:56+00:00 | [–] similar comments

Link, for convenience:

https://doc.rust-lang.org/book/

woah | karma 10252 | avg karma 3.02 2020-02-04 23:15:01 | [–] similar comments

Switching to Rust is a good idea, but I was wondering- would it be possible to run two identical instances in parallel and return results from the fastest one? This would almost completely eliminate GC pauses from the final output.

archi42 | karma 1581 | avg karma 1.71 2020-02-04 23:34:18+00:00 | [–] similar comments

Uhm, I'd suppose the service runs on one or more dedicated nodes - so there should be no competition for RAM (or if a node runs multiple services, the I'd expect a fixed memory amount to be available). In such an environment, each fixed size LRU cache could just allocate a huge chunk of RAM for data + indices (index size is bound by data size). That's nothing to do with the ownership model, it's just manually managed memory.

Yes, reality is more complex since they probably have multi socket servers/NUMA, which might add memory access latencies and atomic updates to the LRU might require a locking scheme, which also isn't trivial (and where async Rust might be useful).

mc3 | karma 823 | avg karma 2.19 2020-02-05 00:23:12+00:00 | [–] similar comments

More accurately "Why Discord is switching a service from Go to Rust"

esjeon | karma 1171 | avg karma 2.79 2020-02-05 00:30:46 | [–] similar comments

I wonder if they actually did their homework. Doesn't matter if they like it, but they could have avoided rewriting, if they wanted.

The thing is, you can allocate memory outside of Go, and GC will simply ignore such regions, since GC only scan regions known to it. (Mmap should work like a charm here.) A drawback is that pointers in such regions will not be counted, but it's easy to workaround by copying whole data, which is encouraged by the language itself.

TBH, Go sucks for storing a large amount of data. As you can see here, even the simplest cache can be problematic. The language is biased towards large datacenters, where the amount of available resources are less of a concern. Say, this problem can be solved by having external cache servers and extra nodes around them. Latency will not be idealistic, but the service will survive with minimal changes.

pmarreck | karma 4903 | avg karma 1.11 2020-02-05 00:34:48+00:00 | [–] similar comments

Is the Discord server-side still coded in Elixir?

LaserToy | karma 437 | avg karma 2.04 2020-02-05 00:36:08+00:00 | [–] similar comments

I’m sorry, but isn’t it cashing 101 ? Do not keep long living objects in GC managed memory. And there are ways to do it in both go and even java.

meirelles | karma 175 | avg karma 2.73 2020-02-05 00:38:25+00:00 | [–] similar comments

The Twitch folks were facing a related situation with the GC. They developed a workaround that they called Ballast, reducing the overall latency and making it more predictable. Quite impressive results [0].

The Go's GC is groundbreaking in several aspects, but probably needs to provide ways to fine-tune it. Posts like this make me believe that one-size-fits-all settings are yet to be seen.

[0]: https://blog.twitch.tv/en/2019/04/10/go-memory-ballast-how-i...

blazespin | karma 3733 | avg karma 2.51 2020-02-05 00:51:13 | [–] similar comments

Confused, aren't they losing memory safety?

I get for certain core code situations, you want to manage all memory safety yourself (or use built in static GC), but beyond that it seems to me at a higher level you'd rather have the automatic GC. Why burden all of your developers rather than just a core few?

I don't think GC issues is a compelling argument to move everything to Rust. I'm not saying there aren't compelling arguments, but that just seems a bit odd that that's their main argument.

echeese | karma 357 | avg karma 2.06 2020-02-05 01:28:06 | [–] similar comments

Nah, guaranteed memory safety is actually one of Rust's main selling points

buzzerbetrayed | karma 1468 | avg karma 2.84 2020-02-05 01:41:01+00:00 | [–] similar comments

I’ve never heard the argument that moving to rust reduces memory safety. Isn’t memory safety what rust is known for?

Matthias247 | karma 3753 | avg karma 2.13 2020-02-05 05:15:29+00:00 | [–] similar comments

It is! But in Rust you still have an escape hatch in the form of the `unsafe` annotation which allows for mistakes which break memory safety. I don't think Go has something like that, unless you use the FFI. So saying that Go is at least as memory safe as Rust might not be too wrong of a statement.

However I think in total Rust is safer. E.g. Rust prevents a ton of race conditions in multithreaded code, which Go can not do.

Jweb_Guru | karma 2749 | avg karma 2.24 2020-02-05 07:54:53+00:00 | [–] similar comments

Go has data races on multiple cores in safe code, without using any unsafe intrinsics or C FFI.

musicale | karma 7954 | avg karma 1.63 2020-02-05 01:11:21+00:00 | [–] similar comments

1st Law of Garbage Collection: Consistent speed and efficiency usually requires circumventing the garbage collector.

yobert | karma 540 | avg karma 4.12 2020-02-05 02:23:40 | [–] similar comments

Not to participate in the flaming-- but I'd love to hear some stats about compile times for the two versions of the service. (Excellent write-up by the way! Thanks!)

sayusasugi | karma 17 | avg karma 0.89 2020-02-05 04:27:57+00:00 | [–] similar comments

Great, can the client be ported to Rust while you're at it? Electron is such a joke.

eric-hu | karma 1639 | avg karma 1.92 2020-02-05 04:37:03+00:00 | [–] similar comments

I'm curious what the product-engineering landscape in the company looks like to allow for a language rewrite to happen. I feel like this would be a hard sell in all companies I've worked at. Was this framed as a big bug fix? Or was faster performance framed as a feature?

modo_mario | karma 888 | avg karma 1.59 2020-02-05 05:50:23+00:00 | [–] similar comments

I think they're at a scale now where the cost of running it starts to become important as well. At least when we're talking about big performance increases like this.

moneywoes | karma 1126 | avg karma 0.62 2020-02-05 04:50:16 | [–] similar comments

Another blow for Google.

stiray | karma 2663 | avg karma 4.44 2020-02-05 05:47:27+00:00 | [–] similar comments

And brings me back to my years old nag. "Ok, you got GC, fine. But DO give me option to hand free specific memory when I want to. I don't consider hand allocation and deallocation such a pain than GC going wild."

This doesn't only go for Go.

dis-sys | karma 726 | avg karma 0.85 2020-02-05 05:52:01+00:00 | [–] similar comments

I believe the problem described in this blog has been at least partially addressed in the Go 1.12 release.

tonyferguson | karma -2 | avg karma -2.0 2020-02-05 07:00:16+00:00 | [–] similar comments

Wow, Rust is amazing, so fast! It is like these people never learnt c? Why did they spend all this time trying to optimise such a high level language? Surely they can afford a more experienced engineer who will tell them that is a path that isn't worth it? I jump straight to c when there is anything like this, although I guess Rust is an option these days.

raverbashing | karma 22421 | avg karma 1.64 2020-02-05 07:32:27+00:00 | [–] similar comments

> but because the garbage collector needed to scan the entire LRU cache in order to determine if the memory was truly free from references

Yeah please tell me again how GC is a superior solution to reference counting in cases when you know exactly when you don't need the object anymore.

(Hint: RC is not GC if the object is dealocating itself)

fxtentacle | karma 18712 | avg karma 5.34 2020-02-05 07:57:06+00:00 | [–] similar comments

Sounds like badly reinventing the wheel. If you need a large in-memory LRU cache, use memcached. Problem solved, because then Go doesn't need to allocate much memory anymore. And I'd wager that JSON serialization for sending a reply back to the client will dominate CPU load anyway, so that the overhead for Go to talk to Memcached will be barely noticeable.

FisherGuy44 | karma 54 | avg karma 3.86 2020-02-05 09:51:33+00:00 | [–] similar comments

This is not a fair comparison. Go 1.9.2 was released over 2 years ago. In that time they have fixed a lot of the GC stutter issues. Comparing rust nightly to a 2 year old compiler is unfair.

willvarfar | karma 17791 | avg karma 5.17 2020-02-05 12:52:28+00:00 | [–] similar comments

This is a bit late to add, but from the description of the problem in the article, the way to make the program faster, irregardless of language, is to use a big array rather than lists and trees. Carve the array up as necessary, so the array of users to offsets in the array where the data is. Basically, be your own memory allocator, with all the loss of safety but the order of magnitude improvement in efficiency that that brings.

juskrey | karma 1515 | avg karma 2.16 2020-02-05 14:06:36 | [–] similar comments

Golang: a post-academic delusion built around single petty feature. Rust: everything you needed and hated about C++, now in edible packaging with flavor.

crazypython | karma 4542 | avg karma 3.85 2020-02-05 14:17:51+00:00 | [–] similar comments

In D, you may explicitly delete memory while having a GC.

luord | karma 2640 | avg karma 2.84 2020-02-05 15:26:11 | [–] similar comments

Usually this kind of article is about "migrating from massively popular language to more niche language that we like better".

This is more from niche to niche. Thought that was interesting, but yet the discussion here wasn't all that different to the usual. Guess it's flamewars always, regardless of popularity.

Fire-Dragon-DoL | karma 1412 | avg karma 0.74 2020-02-06 03:57:23+00:00 | [–] similar comments

I run the same question here: can't a memory pool be used in this case?

In gaming industry there are similar problems with GC and they were solved with memory pools

Legal | privacy