
All of these tricks work with unbuffered channels as well. So it leaves the far more interesting question "When do you use buffered or unbuffered channels?" unanswered. Naturally, it depends a lot on the kind of system you want to build. Making a channel buffered exposes the performance pitfall where a slow consumer is masked until its supplier fills the buffer, at which point the real throughput is exposed.

Edit: An interesting application of a buffered channel, for example, is when creating an object pool of finite size. Or, in conjunction with a timer, rate limiting a piece of code. I don't think I have ever considered a buffered channel in the typical producer/consumer setup.
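A minimal sketch of the object-pool idea (the `conn` type, pool size, and function names are invented for illustration): the buffer holds the idle objects, and an empty buffer naturally blocks borrowers until something is returned.

```go
package main

import "fmt"

// conn stands in for any expensive resource worth pooling.
type conn struct{ id int }

// newPool builds a fixed-size pool from a buffered channel,
// pre-filled to capacity with idle objects.
func newPool(size int) chan *conn {
	p := make(chan *conn, size)
	for i := 0; i < size; i++ {
		p <- &conn{id: i}
	}
	return p
}

func main() {
	pool := newPool(2)
	c := <-pool // borrow: blocks if every object is in use
	fmt.Println(c.id)
	pool <- c              // return the object to the pool
	fmt.Println(len(pool)) // 2: pool is full again
}
```

The channel's buffer size is the pool size, so the "only N objects exist" invariant needs no extra synchronization.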




Fair enough. But it is important for people to realize that buffered channels aren't "async" channels; they're async until they fill up, at which point they resume blocking until there is a slot available. If you have some sort of channel network setup that is deadlocking with unbuffered channels and you "fix" it by adding some buffering in, unless you've got a very solid analysis as to why that buffering is correct, you haven't fixed it, you've just delayed the problem until load is higher.
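A minimal sketch of that behavior: the sends below succeed without any receiver only until the buffer is full, after which a plain send would block exactly like an unbuffered one. The `select`/`default` here is just a non-blocking probe so the program can observe that instead of deadlocking.

```go
package main

import "fmt"

func main() {
	ch := make(chan int, 2) // "async" only for the first two sends

	ch <- 1
	ch <- 2 // buffer is now full

	// A third plain send (ch <- 3) would block here with no receiver,
	// and with nothing else running it would deadlock.
	select {
	case ch <- 3:
		fmt.Println("sent")
	default:
		fmt.Println("buffer full, send would block")
	}
}
```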

It's in many ways academic anyhow. Most of the tricks I've seen for buffered channels are better done some other way, and in practice I almost always use unbuffered channels. Most of the time that someone thinks they want a buffered channel because of performance issues, they're actually exactly wrong and backwards... if there is some sort of mismatch between the speed of the consumers and producers you generally want the coupling introduced by unbuffered channels, even if it is counterintuitive.


Yes, buffered channels should almost never be used for performance reasons.

Buffered channels should be used for their semantics, but in general unbuffered channels are preferred to buffered channels if possible, because buffered channels cause a combinatorial explosion of state and are hard to reason about accurately.


That's a good point; at steady state, I'd imagine unbuffered channels would have the same throughput as deeply buffered channels. The main advantage is being able to spool up faster and smooth out throughput, but it could be that for most workloads that is not valuable enough. Perhaps I placed too much value on that.

Your comment (and others) have convinced me to do some more empirical testing and see how necessary buffered channels are for my goal.


Backpressure to writers is the main reason to use buffered channels IMHO. I have never seen any performance difference between buffered and unbuffered for anything I have used.

The other reason to use them is controlling CPU. Spin up a pool of consumers and control how many cores they use with the buffer.


Buffered channels are good for any use case where the different concurrent components operate on different cadences.

In degenerate cases that means that one of the cadences is inappropriately slow, but in most cases it rounds out latency spikes and increases throughput.

For that matter, I very rarely use zero-size channels, because if two things are on the same cadence, why should I make them concurrent?


Shouldn't that only matter if you are able to plan or dynamically scale your goroutines so that the cadence x volume produced is equal to the cadence x volume consumed? Otherwise, due to the differing frequencies, there will be one side that is blocked on sending (because the buffer is full) or consuming (because the buffer never reasonably fills up). I'm thinking of 1 producer to 1 consumer, though, so 1:N could see some use of buffered channels. But I am not sold that having a bunch of memory tied up in a buffer waiting for use is better than having the quicker goroutines paused or not-yet-scheduled for execution on an unbuffered channel.

I agree that a small buffer can smooth out latency issues on the consumer side only, and that requires both sides to be reasonably similar in frequency if the latencies are disproportionately large compared to the unit of work. But that requires a lot of live metrics to capture.

I still believe that unbuffered is the way to go.


Let me state it explicitly: in either case you should be handling the case where the channel is full.

I'd probably make a lot more use of unbuffered channels if Go had more sophisticated primitives around that: if I could say, for example, "publish to this channel until a timeout" without all the overhead that is currently required.

As it stands (in my workloads), with unbuffered channels it's hard to figure out whether you have encountered a localized, short-term full channel or some more degenerate case where you should be taking drastic load-shedding action. Buffered channels make rough approximations of this very easy: under normal operating modes the buffer shouldn't be full. If it is, sound the klaxon and begin load shedding.
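A sketch of that approximation (names and sizes invented): a non-blocking send doubles as the "is the buffer full?" probe, so a failed send is the load-shedding signal.

```go
package main

import "fmt"

// submit enqueues a job, but sheds load instead of blocking when the
// buffer is full. Under normal operation the buffer should rarely fill,
// so a false return is the "sound the klaxon" signal.
func submit(queue chan<- int, job int) bool {
	select {
	case queue <- job:
		return true
	default:
		return false // queue full: drop the job, alert, shed load
	}
}

func main() {
	queue := make(chan int, 2)
	fmt.Println(submit(queue, 1)) // true
	fmt.Println(submit(queue, 2)) // true
	fmt.Println(submit(queue, 3)) // false: buffer full, load shed
}
```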

And for most things that aren't pipelines, I simply don't think channels are very appropriate.


Put in English, the type of an unbuffered channel is "a channel that always blocks on writing until the value has been read". The type of a buffered channel is "a channel that doesn't block when written to until it is full, at which point it blocks on writing until some other value has been read by some other, unrelated goroutine".

The former is a reasonable concurrency primitive. The latter superficially seems similar, but is actually a much more complicated primitive, with the corresponding code understanding problems and the increased likelihood of more concurrency problems being hidden at low scale but coming out at scale (mostly deadlocks), plus the fact it seems similar is also problematic. The fact that the Go type system does not allow distinguishing between the two is also problematic.

It has some special-case purposes, but I also always scrutinize any channel created with buffering to make sure it is one of those special cases.

Buffered channels have not impressed me with their ability to do any sort of performance improvements. In almost all cases, if you've filled up your readers, you really want your writer to block. It provides cheap, effective backpressure; arguably for software-at-scale it's the most useful aspect of the channel primitive.


Why is a channel buffered thousands deep useful in this case? If it's unbuffered, you still get backpressure blocking on writers as a feature, since you have a worker pool of thousands of goroutines to read and handle the tasks.

Not true. Buffered channels are able to express semantics impossible to express with unbuffered channels. E.g. a counting semaphore. The most common buffered channel buffer size is one, and the code would not be correct with either zero, or more than one. Using buffered channels in Go for their semantics instead of their performance is a very common idiom in Go. Much more common than adding buffering to channels as an optimization.

Are you talking about doing this instead of buffering the channel? The reason I buffer it is that I don't want whatever is providing the worker with a message to block while waiting to send to the workers.

Several things about channels.

The first is a random correction to a common meme. Channels can be created buffered or unbuffered. If they are unbuffered, the sender blocks until a receiver is ready. If buffered, the sender blocks only when the buffer is full. Buffering can improve performance significantly.

Secondly when you set up complicated messes of goroutines talking over channels it is easy to get deadlocks. There is a deadlock detection mechanism, but I don't have any details beyond knowing that some people have run into it.

Thirdly am I the only person in the world who looks at the channel mechanism and thinks how naturally it maps onto a capability style security system? I've pointed that out a few times and nobody seems to bite on it. Odd.


The preference of channel size being unbuffered or just 1 is interesting. That seems like something specific to a problem domain; for instance, in projects I am working on now, having a large buffered channel (1000s deep) is useful for worker queues of thousands of goroutines, that all read from a task feeder channel. This type of queuing seems go-idiomatic, and negates the need for additional synchronization. In this case, the backpressure blocking on writers is a feature.

Buffering channels doesn't avoid deadlocks. As I understand it: buffering is a performance feature; if your code isn't correct without buffered channels, it isn't correct.

You have to be careful with that, though. Channels with a buffer size greater than zero are not async channels, which you do not say but are sort of implying. They will still block like size-0 channels if they fill up, for instance. The set of workloads where they will meaningfully help your performance, and nothing else will, turns out to be smaller than most people's intuition will lead them to believe. It's a very precise combination of "usually I do nothing, but sometimes I get 10 requests in under 10 microseconds from one source and I've got enough CPU to handle them all simultaneously". Otherwise you're generally just as well off using normal 0-sized channels and letting the backpressure flow back to the sender; in many "normal" workflows it won't even cost you latency, because if you're CPU-blocked anyhow you'll still be waiting on CPU, not channels.

Most people's intuitions about work distribution aren't very good either; if you break a process up into 5 pieces, your human brain will tend to estimate the five pieces as roughly the same size ("within an order of magnitude of each other"), when in the real computer world they will almost always be separated by multiple orders of magnitude in size. Evenly breaking up a task is usually quite hard and not something that happens accidentally. It doesn't help you all that much to use a 10-element channel if the producer is producing an element every 100 nanoseconds while the consumer is consuming one element per 10 milliseconds, both reasonable numbers, but separated by 5 orders of magnitude. The advantage of using a buffered channel is lost in the noise of the consumer, and it's not hard for it to be a net loss given the complexities buffered channels can involve if you accidentally code as if they are asynchronous rather than buffered-but-still-ultimately-synchronous. Zero-sized channels are more likely to exhibit any errors you made during development or QA rather than in production.

I don't think I've ever shipped a non-zero-sized channel in Go. Occasionally designs go through phases where they may have them, but they always come back out before I'm done. Even the "only allow X of a certain resource to be used" always seems to turn into "only spawn X goroutines that use that resource and feed them work off a channel that has no reason to be buffered".


Channels can be made with no buffering at all. In this case trying to send without a receiver is blocking, so the channel is not really acting as a queue anymore.

Buffering is only an option to improve performance. Programs should (if they are correct in a sense) be able to work with channels without buffering.


A buffered channel can be implemented with two unbuffered channels + a goroutine + an array.
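A sketch of that construction (function name invented; no close/shutdown handling, so the goroutine runs for the life of the program): the slice is the buffer, and nil channels disable the full/empty select cases.

```go
package main

import "fmt"

// buffered builds a capacity-n channel pair out of two unbuffered
// channels, a goroutine, and a slice. Sends on `in` block only once
// n items are queued; `out` delivers them in FIFO order.
func buffered(n int) (in chan<- int, out <-chan int) {
	inCh := make(chan int)  // unbuffered
	outCh := make(chan int) // unbuffered
	go func() {
		var queue []int
		for {
			// A receive/send on a nil channel never proceeds, so
			// leaving these nil disables the corresponding case.
			var recv, send chan int
			var head int
			if len(queue) < n {
				recv = inCh // room left: accept more input
			}
			if len(queue) > 0 {
				send = outCh // something queued: offer the head
				head = queue[0]
			}
			select {
			case v := <-recv:
				queue = append(queue, v)
			case send <- head:
				queue = queue[1:]
			}
		}
	}()
	return inCh, outCh
}

func main() {
	in, out := buffered(2)
	in <- 1
	in <- 2                   // both sends complete with no receiver, like cap 2
	fmt.Println(<-out, <-out) // 1 2
}
```

This is also a decent intuition pump for why buffered channels add state: all of that queue bookkeeping is implicit in every `make(chan T, n)`.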

That's a common mistake: don't look at channels as queues. Being buffered is one thing; acting as a queue is another. Buffered channels should be avoided unless you really need a buffer.

I don't understand. I would just about never want to use an unbounded buffer, for example, for a channel.
