Tree-Structured Concurrency (blog.yoshuawuyts.com)
138 points by abhi9u | 2023-07-02 | 31 comments




As a proponent of structured concurrency [1], I am happy to see this post.

I personally think that Rust would be the best language ever if it did a few things:

* Got rid of async in favor of structured concurrency built into the language.

* Made compile times competitive with C.

I think that much of the complexity of the language would be vastly reduced by adopting structured concurrency and dropping async because of the very compatibility mentioned in the article.

I also think that Rust is unique among languages in that it would benefit most from structured concurrency because its borrow checker would interact with structured concurrency in great ways.

Compile times are still a problem, though.

[1]: https://gavinhoward.com/2019/12/structured-concurrency-defin...


I can't say I ever understood how and why async became such a big feature for Rust. I feel like I might understand someday (when I have more experience working with async code), but for now it feels like adding it was a massive change that shaped the development and future of the language for years (async traits are still a work in progress!), and I just don't understand whether async was truly the right approach here, or the one feature to prioritize above all else.

As far as I know the community is (was?) still a bit divided on async, mainly because it's rough around the edges, and makes for bad compiler errors.

I wouldn't be surprised if, 15+ years from now, people think of async as "probably great for its time, but there are better concurrency paradigms we should shift to," similar to the shift from inheritance to composition.


You put into words a lot of the feeling I have about Rust.

Async might be a wrong abstraction for high-level programming: green threads/goroutines are so much easier to use.

On the other hand, writing a state machine for handling concurrent events/transitions is literally async done manually. libuv, any high-performance network server, UI events handling are all like that.

I agree that doing low-level async right is difficult and might be an impractically long process.


I wonder if it was a mistake: it'll be really hard to undo something like async because it infects the code. Which itself should have been a warning sign.

It would be really nice if the Rust standard library were to get structured concurrency similar to what Ada has:

https://learn.adacore.com/courses/Ada_For_The_CPP_Java_Devel...

https://en.wikibooks.org/wiki/Ada_Style_Guide/Concurrency

It allows multiple concurrent tasks to run within a parent block of code and terminate at the end of the block.

It is extremely useful for embedded and parallel programming, which would help Rust further succeed in those areas.

So much concurrent and parallel Rust code relies on third-party libraries because the standard library offers primitives that work but lack the "creature comforts" that developers prefer.

It is good that futures-concurrency offers better choices for Rust developers, but ultimately it would be great for Rust to adopt better concurrency APIs.


This sounds like what scoped threads at least partially provides. It was in pre-1.0 Rust, got moved to a 3rd party library, but stabilized in 1.63: https://doc.rust-lang.org/stable/std/thread/fn.scope.html

Some of the features are there, but not all:

It looks like each thread in scoped threads must be "manually started" (not launched automatically at the beginning of the scope).

No defined mechanisms for synchronization or data passing between scoped threads. There are probably ways it can be done, but I am not sure what approach would work best without some experimentation.


In Rust, the idiomatic solution for data-passing would be a FIFO channel (either std::sync::mpsc from the standard library, or a popular crate such as crossbeam or flume), and the solution for synchronization would be one-shot channels or zero-capacity channels.

Or you can just manually join the threads you started.
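A minimal std-only sketch of both suggestions combined: scoped threads borrow the data without 'static bounds, and an mpsc channel carries partial results back to the parent (the function name `scoped_sum` and the chunking are illustrative, not any established API):

```rust
use std::sync::mpsc;
use std::thread;

// Sum chunks of a slice on scoped threads, passing partial results
// back to the parent over a FIFO channel.
fn scoped_sum(data: &[i64]) -> i64 {
    let (tx, rx) = mpsc::channel();
    thread::scope(|s| {
        for chunk in data.chunks(2) {
            let tx = tx.clone();
            s.spawn(move || {
                // Scoped threads may borrow `data` directly; the scope
                // guarantees they finish before `data` goes away.
                tx.send(chunk.iter().sum::<i64>()).unwrap();
            });
        }
    });
    drop(tx); // close the last sender so the receiver iterator ends
    rx.iter().sum()
}

fn main() {
    assert_eq!(scoped_sum(&[1, 2, 3, 4, 5]), 15);
    println!("ok");
}
```

The scope itself provides the join: `thread::scope` does not return until every spawned thread has finished, which is the structured-concurrency property being discussed.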


futures::join!() and friends do what you’re after admirably. Is the complaint that something is still missing, or that it’s not in the standard library?

Ada's tasks don't require an explicit join, they're lexically scoped. So if you have a procedure with two subtasks that procedure will not terminate until those two subtasks have terminated (by whatever means they may be terminated, running to completion or an explicit `abort SomeTask;`).

Here are a couple pages on Tasking in Ada:

https://learn.adacore.com/courses/intro-to-ada/chapters/task... - doesn't compare to any other languages, just Ada code.

https://learn.adacore.com/courses/Ada_For_The_CPP_Java_Devel... - Includes comparisons to Java and C++.

http://www.ada-auth.org/standards/22rm/html/RM-9-8.html - more on cancellation via the abort statement. If you abort a task with subtasks running, it will terminate all of them.

Ada has some escape hatches around its task system that "break" from structured concurrency, but pretty much everything people have been asking for around structured concurrency is already in Ada, and mostly as the default behavior.


How possible is this kind of thing in Golang, or in Elixir? I'm trying to choose tech for parsing, processing, calculating, and applying complex business rules in a massive CSV-parsing pipeline. Early stages, but Golang seems like the right candidate.

All this is easily possible. You can leverage fan-out/fan-in/pipeline concurrency patterns in Go.

It's really awful in Go to create concurrency patterns. In particular because the generics are clunky and verbose, and there are no type classes/traits. There's also no good cancellation convention: Context is verbose, dynamically typed, and not used in all the standard interfaces.

This is actually the Erlang/Elixir concurrency model (with a lot of tooling behind it for doing it well).

Isn't Erlang a very different, almost opposite concurrency model? I'm not an Erlang expert, but async message passing is a very different style. The limitation around background tasks in structured concurrency is trivial in Erlang, for example.

Yes, but processes are supervised (in a tree structure), so you have the same guarantees described in structured concurrency.

Yeah this seems compelling, especially the economics of it when paying for machine time.

Yet I yearn for higher-level tasking APIs such as a standard task tree or task graph.

The parrafin library helps a bit, but it's still heavy-handed and feels abandoned (yes, I or whoever I'm working for could take up the maintainer mantle or write language proposals, but I don't actually know which solution would be best...).

The Sparkel/ParaSail language prototypes (or actual implementations) had some interesting ideas, but I'm not sure how they panned out or what the lessons learned were.


Cool to see people exploring this more. It's been around forever in Erlang with supervisor trees.

I was thinking “they reinvented GenServer”, but felt like a snob haha

Isn't that quite different? In structured concurrency, the parent thread blocks, doing nothing, until all the child threads terminate, at which point it carries on, using results produced by the children. Whereas with supervisor trees, the child processes run forever, and the parent process is there to restart them if they terminate.


Concurrency, parallelism, async, and multithreading are my favourite subjects; I'm interested in them but not an expert.

Thank you for this article.

I think the ergonomics of modern programming languages for concurrency and parallelism are not there yet, but I like structured concurrency.

In one of my projects I generate a dependency graph of work and then parallelise over all of a node's egress edges in the graph. Each node's thread just .join()s on all ancestors of that node. This is how https://devops-pipeline.com/ works.

One approach to performance from parallelism: split/shard your data between threads and represent your concurrency AND parallelism as a tree of computation where the leaves never need to communicate until they complete. This leads to high performance because of Amdahl's law and the avoidance of synchronization during processing. Single-core performance is fast, and if you multithread it right you can get a performance boost; naïve multithreading can even be slower than single-threaded code due to synchronization overheads ( http://www.frankmcsherry.org/assets/COST.pdf )
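That tree-of-computation shape can be sketched in a few lines of std Rust (the name `tree_sum` and the leaf cutoff of 4 are illustrative choices): leaves do purely sequential work with no synchronization, and the only sequential portion is the combine at each join.

```rust
use std::thread;

// Divide-and-conquer sum: shard the slice, let leaf subtasks work
// independently, and only combine results when the children join.
fn tree_sum(data: &[u64]) -> u64 {
    if data.len() <= 4 {
        // Leaf: no communication with sibling leaves.
        return data.iter().sum();
    }
    let (left, right) = data.split_at(data.len() / 2);
    thread::scope(|s| {
        let l = s.spawn(|| tree_sum(left));
        let r = s.spawn(|| tree_sum(right));
        // The parent blocks until both children finish; this combine
        // is the only synchronization point.
        l.join().unwrap() + r.join().unwrap()
    })
}

fn main() {
    let data: Vec<u64> = (1..=100).collect();
    assert_eq!(tree_sum(&data), 5050);
    println!("ok");
}
```

(A real implementation would cap thread count at the hardware concurrency rather than spawning per split; this is just the shape of the idea.)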

My other approach is to create a syntax to represent multithreaded, concurrent and parallel state machines.

Kafka makes it fairly easy to parallelise pipelines, and Go pipelines can be created; I just want to use a notation!

Here's my notation which I run on 2 threads:

  thread(s) = state1(yes) | send(message) | receive(message2);
  thread(r) = state1(yes) | receive(message) | send(message2);
This syntax is inspired by Prolog facts. thread(s) and thread(r) are facts that the program waits for; they need to be fired before the rest of the pipeline after the =. When state1(yes) is fired, the state machine moves to the next state after the | pipe symbol. One thread sends a message with send(message) and the other receives it with receive(message). You can also put multiple facts in each state line, which provides asynchrony of multiple events joining into one state, kind of like multiple nodes connecting to a single node in a graph:

  order(orderno) checkout-received(orderno) = take-payment | save-order(orderno) | send-email-confirmation
This waits for the separate events order(orderno) and checkout-received(orderno) and then moves to the take-payment action.

I have a simple Java parser and runtime for this syntax. What I like about it is that it combines state machines, parallelism and asynchrony.
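For comparison, here is what the two-thread send/receive example above looks like hand-rolled in Rust with channels (a sketch: `run_protocol` and the message strings mirror the notation, not any real API). The notation compresses exactly this kind of boilerplate:

```rust
use std::sync::mpsc;
use std::thread;

// thread(s) = send(message) | receive(message2);
// thread(r) = receive(message) | send(message2);
fn run_protocol() -> (String, String) {
    let (tx1, rx1) = mpsc::channel(); // carries `message` from s to r
    let (tx2, rx2) = mpsc::channel(); // carries `message2` from r to s

    let s = thread::spawn(move || {
        tx1.send("message".to_string()).unwrap(); // send(message)
        rx2.recv().unwrap() // receive(message2)
    });
    let r = thread::spawn(move || {
        let got = rx1.recv().unwrap(); // receive(message)
        tx2.send("message2".to_string()).unwrap(); // send(message2)
        got
    });
    (s.join().unwrap(), r.join().unwrap())
}

fn main() {
    let (s_got, r_got) = run_protocol();
    assert_eq!(s_got, "message2");
    assert_eq!(r_got, "message");
    println!("ok");
}
```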


I agree that it makes no sense that there's no syntax in modern languages for concurrency. With structured programming we have single-threaded stuff all figured out; there needs to be an equivalent for concurrency. Maybe it's incorrect to think it can be done with special types like Future/Promise or whatever. Where are the if, while, for, return, and drop of concurrency?


It was included as an incubator feature in Java 19, and will be available as a preview feature in Java 21 later this year

It's actually available as a preview feature in early access builds of Java 21 right now, for those who really live on the edge!

I'd love to see Kotlin migrate their coroutines to a thin layer on top of virtual threads. Would make interop with Java code much nicer and improve performance.

This approach is extremely popular. C++ has it with parallel-for-like libraries.

It has some downsides. When multiple threads must join before execution continues, it leaves some perf on the table. If the number of threads is limited to the hardware concurrency, then any thread that reaches the join point is typically unavailable to deal with remaining work or other algorithms.

In gamedev, with concurrency limited by the hardware, it is more beneficial to let every worker thread participate all the time. Instead of joining back to sequential execution, one uses task dependencies. If the continuation is a separate task scheduled "when all" of its dependencies are done, then no threads are unavailable due to waiting. This way, if you have multiple parallel algorithms in flight, every thread is available for any task. Instead of me handwaving it here, it is probably better to check a presentation from Sean Parent [1]

1. https://sean-parent.stlab.cc/papers-and-presentations/#bette...
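A toy sketch of that "when all" idea in std Rust (the name `run_when_all` and the countdown scheme are illustrative, not Parent's actual library): a shared atomic counter tracks unfinished dependencies, and whichever worker decrements it to zero runs the continuation inline, so no thread parks at a join point.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

// Run `n_deps` independent tasks; the last one to finish executes the
// continuation itself. Returns true if the continuation ran exactly once.
fn run_when_all(n_deps: usize) -> bool {
    let remaining = Arc::new(AtomicUsize::new(n_deps));
    let continued = Arc::new(AtomicUsize::new(0));
    let mut workers = Vec::new();
    for i in 0..n_deps {
        let remaining = Arc::clone(&remaining);
        let continued = Arc::clone(&continued);
        workers.push(thread::spawn(move || {
            let _partial = i * i; // stand-in for the dependency's real work
            // fetch_sub returns the previous value; 1 means we were last.
            if remaining.fetch_sub(1, Ordering::AcqRel) == 1 {
                // Continuation runs on whichever worker finished last.
                continued.fetch_add(1, Ordering::AcqRel);
            }
        }));
    }
    for w in workers { w.join().unwrap(); }
    continued.load(Ordering::Acquire) == 1
}

fn main() {
    assert!(run_when_all(4));
    println!("ok");
}
```

In a real task system the workers would pull further tasks from a queue instead of exiting, which is where the "every thread is available for any task" benefit comes from.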

