Buffered Channels in Go: Tips and Tricks (www.rapidloop.com)
67 points by i_have_to_speak | 2018-01-15 05:19:18 | 16 comments




All of these tricks work with unbuffered channels as well. So it leaves the far more interesting question unanswered: "When do you use buffered vs. unbuffered channels?" Naturally, it depends a lot on the kind of system you want to build. Making a channel buffered exposes a performance pitfall: a slow consumer is masked until its supplier fills the buffer, and only then is the real throughput exposed.

Edit: An interesting application of a buffered channel, for example, is when creating an object pool of finite size. Or, in conjunction with a timer, rate limiting a piece of code. I don't think I have ever considered a buffered channel in the typical producer/consumer setup.


Buffered channels are good for any use case where the different concurrent components operate on different cadences.

In degenerate cases that means one of the cadences is inappropriately slow, but in most cases it rounds out latency spikes and increases throughput.

For that matter, I very rarely use zero-size channels, because if two things are on the same cadence, why should I make them concurrent?


You have to be careful with that, though. Channels of size greater than one are not async channels, which you don't say but are sort of implying. They will still block like size-0 channels if they fill up, for instance. The set of workloads where they will meaningfully help your performance, and nothing else will, turns out to be smaller than most people's intuition will lead them to believe. It's a very precise combination of "usually I do nothing, but sometimes I get 10 requests in under 10 microseconds from one source, and I've got enough CPU to handle them all simultaneously". Otherwise you're generally just as well off using normal 0-sized channels and letting the backpressure flow back to the sender; in many "normal" workflows it won't even cost you latency, because if you're CPU-bound anyhow you'll still be waiting on CPU, not channels.

Most people's intuitions about work distribution aren't very good either; if you break a process up into 5 pieces, your human brain will tend to estimate the five pieces as roughly the same size ("within an order of magnitude of each other"), when in the real computer world they will almost always be separated by multiple orders of magnitude in size. Evenly breaking up a task is usually quite hard and not something that happens accidentally. It doesn't help you all that much to use a 10-element channel if the producer is producing an element every 100 nanoseconds while the consumer is consuming one element per 10 milliseconds, both reasonable numbers, but separated by 5 orders of magnitude. The advantage of using a buffered channel is lost in the noise of the consumer, and it's not hard for it to be a net loss given the complexities buffered channels can involve if you accidentally code as if they are asynchronous rather than buffered-but-still-ultimately-synchronous. Zero-sized channels are more likely to exhibit any errors you made during development or QA rather than in production.

I don't think I've ever shipped a non-zero-sized channel in Go. Occasionally designs go through phases where they may have them, but they always come back out before I'm done. Even the "only allow X of a certain resource to be used" always seems to turn into "only spawn X goroutines that use that resource and feed them work off a channel that has no reason to be buffered".


Let me state explicitly: in either case, you should be handling the full-channel case.

I'd probably make a lot more use of unbuffered channels if Go had more sophisticated primitives around that. If I could say "publish to this channel until a timeout" without all the overhead that is currently required.

As it stands (on my workloads), with unbuffered channels it's hard to figure out whether you've hit a localized, short-term full channel or some more degenerate case where you should be taking drastic load-shedding action. Buffered channels make a rough approximation of this very easy: under normal operating modes, the buffer shouldn't be full. If it is, sound the klaxon and begin load shedding.

And for most things that aren't pipelines, I simply don't think channels are very appropriate.


What?? Go channels don't have timeouts?

I had to read about this and I'm a little dumbfounded. Why wouldn't Go support timeouts for channels? The extra complexity seems huge, especially if you want a timeout that differs from the library-based hack -- for example, a timeout since the last write.

I don't understand what the Go devs are protecting people from.


It’s trivial to implement a channel read with timeout using select and time.After(). The first N google results for “golang channel timeout” show exactly that.

It is still overly complex and only handles one specific timeout case. A timeout per connection? That can be a lot of extra goroutines for such a simple thing.

Or if you want a 30 second timeout since the last read? Way more complex.

Go's channels seem more than a little under developed.


It’s worth noting that this eliminates the range option, which means you need to handle all the edge cases around closed/nil channels by hand.

On the publish side, in most production cases you don’t want to use time.After, as it leaks the underlying timer until it fires. So you’ll use either a context or a timer.

All told, it’s a fair bit of boilerplate to get right every time (and you can’t genericize it), which is why I tend to use buffered channels sized such that being full is a degenerate case.


Shouldn't that only matter if you can plan or dynamically scale your goroutines so that cadence × volume produced equals cadence × volume consumed? Otherwise, due to the differing frequencies, one side will be blocked on sending (because the buffer is full) or on consuming (because the buffer never reasonably fills up). I'm thinking of 1 producer to 1 consumer, though; 1:N could see some use of buffered channels. But I'm not sold that having a bunch of memory tied up in a buffer waiting for use is better than having the quicker goroutines paused or not-yet-scheduled on an unbuffered channel.

I agree that a small buffer can smooth out latency issues on the consumer side only, and that requires both sides to be reasonably similar in frequency if the latencies are disproportionately large compared to the unit of work. But that requires a lot of live metrics to capture.

I still believe that unbuffered is the way to go.


Batching is the important distinction here. The downstream should read as many events as are available and handle them together.

If you have a 1:1 producer/consumer doing 1:1 events to output, I think people have almost always gone concurrent when they shouldn't have.


A buffered channel can be implemented with two unbuffered channels + a goroutine + an array.

Channels can be really useful but are massively oversold. Fortunately it has cooled off a bit in recent years but I still see heaps of go code that bends over backwards to use channels because "share memory by communicating blah blah blah", even in cases where using another synchronisation mechanism instead would radically simplify things.

> If you do have pointers, or if the item iself [sic] is a pointer, it is up to you to ensure that the pointed-to objects remain valid while it is in the queue and being consumed.

Does this mean the producer has to keep a pointer to the queued item around, or it could get garbage collected? That seems... unintuitive. Or is it just that working with the object could invalidate the old pointer?


The latter. The GC won’t collect it while the pointer references it.

Interesting read. There are instances when you need unlimited capacity, and there is a proposal for it: https://github.com/golang/go/issues/20352 It doesn't look like it will make it, as the maintainers are opposed to it.
