
The only guess I’d have is to protect the system against infinite-loop tasks, but I don’t remember any other runtime caring, and a task which never terminates seems easier to diagnose than one which disappears on you.



Because if an application expected that /dev/urandom never blocks and it suddenly starts blocking, the application might no longer behave as expected (performance degradation, race conditions, resource starvation, etc.).
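As an illustration of the kind of assumption that gets baked in (a hedged sketch, not any specific application's code):

```python
# Illustrative sketch only: a helper written on the assumption that
# /dev/urandom never blocks. If the kernel changed that, this would
# silently stall under load instead of staying fast.
def session_token() -> str:
    with open("/dev/urandom", "rb") as f:
        return f.read(32).hex()
```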

>it's to prevent your process sitting idle while it waits for I/O.

...with the goal of making your application faster.


Not sure, but reminds me of how garbage collectors must sometimes pause application threads.

I was thinking more about everything timing out (or otherwise losing relevant state) around the debugged thread / process.

Sounds like the Abro runtime never resumes the tasks, so how could they throw?

:)

It’s not that strange, actually. Compare it to threads, where you shoot off a thread which you never join(): it’s exactly the same scenario.

Any thread which doesn’t have a join needs a watchdog, or needs to register an unhandled exception handler.

Can’t remember if threads do this by default. Asyncio tasks for sure don’t, and they can’t really do it either, since a task in an error state could be saved for later use for legitimate reasons. The closest you get is the warning printed to the log about tasks that got GC'd without ever being awaited, if you enable asyncio debug mode (PYTHONASYNCIODEBUG=1).
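For the plain-threads case, the default behavior (since Python 3.8) is that an uncaught exception in a thread goes to threading.excepthook, which just prints a traceback to stderr and lets the rest of the program carry on; overriding it is one way to get the watchdog-style handler mentioned above. A minimal sketch (log_thread_crash is a made-up name):

```python
# Hedged sketch: surfacing exceptions from threads that are never join()ed
# by overriding threading.excepthook (Python 3.8+).
import threading

def log_thread_crash(args):
    # args carries exc_type, exc_value, exc_traceback and the thread itself.
    thread_name = args.thread.name if args.thread else "<unknown>"
    print(f"watchdog: thread {thread_name} died: "
          f"{args.exc_type.__name__}: {args.exc_value}")

threading.excepthook = log_thread_crash

def worker():
    raise RuntimeError("boom")

threading.Thread(target=worker, name="fire-and-forget").start()
# No join() anywhere; the hook is the only place the failure becomes visible.
```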


Why do you need to join?

Let's say I write a task that updates a progress bar as an infinite loop, and let it be gc'ed on program exit, without ever joining it. What's wrong with that design? I can, of course, modify the task to check a flag that indicates program completion, and exit when it's set. But does this extra complexity help the code quality in any way?

Or suppose I spawn a task to warm up some cache (to reduce latency when it's used). It would be nice if it completes before the cache is hit, but surely not at the cost of blocking the main program. I just fire-and-forget that task. If it only executes after the cache was hit, it will realize that and become a no-op. Why would I want to join it at the end? Joining may not even be free (if the cache was never hit, why would I want to warm it up now that the main program is exiting?).
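For what it's worth, here's a hedged sketch of that fire-and-forget style in asyncio (update_progress and warm_cache are illustrative names). The task set is there because asyncio only keeps weak references to tasks, so an unreferenced task really can disappear mid-flight, and the done callback is where a crash would at least get logged:

```python
# Hedged sketch of the fire-and-forget patterns described above.
# update_progress / warm_cache are made-up illustrative names.
import asyncio

_background = set()   # strong refs: asyncio only keeps weak refs to tasks

def fire_and_forget(coro):
    task = asyncio.create_task(coro)
    _background.add(task)
    task.add_done_callback(_reap)
    return task

def _reap(task):
    _background.discard(task)
    if not task.cancelled() and task.exception() is not None:
        print("background task failed:", task.exception())

async def update_progress():
    while True:                       # cancelled when the loop shuts down
        await asyncio.sleep(0.5)
        print("...still working")

async def warm_cache(cache):
    if not cache:                     # a no-op if the cache got populated first
        cache["warm"] = True

async def main():
    cache = {}
    fire_and_forget(update_progress())
    fire_and_forget(warm_cache(cache))
    await asyncio.sleep(2)            # stand-in for the real work

asyncio.run(main())
```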


This is a very common one with concurrent code. I've seen many cases where something works most of the time until a log statement is removed.

Almost any application with an event loop doesn’t halt.

A determined user will still be able to find something that blocks forever, for instance by running a WASM program that has an infinite loop in it.

Having an event loop that can freeze your whole application is my guess.

It could even just be a timeout as part of retry logic or similar. A lot of people seem to be saying that there is no good reason to have a `sleep` in a production application, but there are many legitimate reasons to need to delay execution of some code for a while.
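One such legitimate use, sketched with made-up names (fetch is a stand-in for any call that can fail transiently):

```python
# Hedged sketch: a sleep used as backoff inside retry logic, one of the
# legitimate delay cases mentioned above.
import asyncio
import random

async def fetch():
    if random.random() < 0.5:
        raise ConnectionError("transient failure")
    return "payload"

async def fetch_with_retry(attempts=5, base_delay=0.2):
    for attempt in range(attempts):
        try:
            return await fetch()
        except ConnectionError:
            if attempt == attempts - 1:
                raise                 # out of retries: let the caller see it
            # Exponential backoff with a little jitter before the next try.
            await asyncio.sleep(base_delay * 2 ** attempt + random.random() * 0.1)

print(asyncio.run(fetch_with_retry()))
```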

Probably because there is no need for it? If WaitGroup just waits for all tasks to finish before continuing, map/reduce should do the trick.

> It can just mean ensuring that if the code does try to run indefinitely, it doesn't have unfortunate effects such as blocking the UI thread without the possibility of being interrupted.

Well, that can be achieved by executing the code in a background worker thread, which doesn't affect the UI thread in browsers... Not sure how it's managed, but I think you could terminate it after a certain amount of time too.


Calling the function again may be entirely logical and may be because of independent state, e.g. locks or security. Think of the famous problem we had for decades with file deletion hanging for 10 seconds or so if you tried to delete a file in use. Somebody traced and disassembled the code; it was literally a hardcoded loop that did 10 iterations of attempting to get a lock on the file, sleeping one second after each failure... The loop-and-sleep combo there might have been a hack, but it was excusable.
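A rough reconstruction of the kind of loop being described, purely for illustration (the exception handling and retry count are guesses at the behavior, not the disassembled original):

```python
# Hedged reconstruction of the loop-and-sleep hack described above: try to
# delete, and if the file is still locked, sleep a second and retry up to
# 10 times. Purely illustrative, not the original code.
import os
import time

def delete_with_retry(path, attempts=10):
    for attempt in range(attempts):
        try:
            os.remove(path)
            return True
        except PermissionError:       # file still held open / locked
            if attempt == attempts - 1:
                return False
            time.sleep(1)             # the hardcoded one-second sleep
```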

Because they have expressed the intent to run it by scheduling it on the event loop?

I don't follow the argument wrt tidying up rogue tasks. What does it mean for the task to be "rogue"? If there was some state change that made the task redundant - because it clearly wasn't when it was submitted! - then the code that makes that change, or some other code that observes it, should cancel it. If it isn't cancelled, the fact that nobody is able to observe the value that the task will yield is not sufficient to auto-cancel, as there may still be a dependency on side effects.

And, speaking of tidying up, what if the scheduled task is the one that performs some kind of cleanup?
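A sketch of the "the code that makes that change should cancel it" approach, with illustrative names (Editor and refresh_preview are made up):

```python
# Hedged sketch: the code that invalidates a pending task is the code that
# cancels it, rather than relying on the runtime to reap "rogue" tasks.
import asyncio

class Editor:
    def __init__(self):
        self._preview_task = None

    async def refresh_preview(self, text):
        await asyncio.sleep(0.5)      # pretend this render is expensive
        print("rendered:", text)

    def on_text_changed(self, text):
        # The state change that makes the old task redundant is also the
        # point where it gets cancelled explicitly.
        if self._preview_task is not None:
            self._preview_task.cancel()
        self._preview_task = asyncio.create_task(self.refresh_preview(text))

async def main():
    ed = Editor()
    ed.on_text_changed("hello")
    await asyncio.sleep(0.1)
    ed.on_text_changed("hello world")   # cancels the first render
    await asyncio.sleep(1)

asyncio.run(main())
```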


> What happens when a tasklet take too long?

The same thing that happens when 1+1 == 3, or when a task tries to write to memory that it doesn't have permissions for. The static analysis that your system relies on for correct behavior is no longer valid, so a hardware belt-and-suspender mechanism (a schedule overrun timer interrupt, a lockstep core check failure, or an MPU fault, respectively) resets or otherwise safe-states the failed ECU and safety is assured higher up in the system analysis.


I wonder about why this is the case.

* Is it because we as users have been conditioned, through years of faulty software, to assume crap crashes/hangs when there are unexpected delays?

* Or is the majority of computing so fast and instantaneous that we can't bring ourselves to wait on something that doesn't have an immediate end in sight?


> The only steps that are persisted are workflow api calls like sleep(), startChildWorkflow(), or calling code that might fail (ie “Activity”, like a network request).

Ok, that's what I was wondering. Makes a lot more sense this way.
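For illustration, a toy model of that "only workflow API calls are persisted" idea, not tied to any particular workflow engine (journal and activity are made-up names):

```python
# Hedged toy model: only designated steps ("activities") are persisted;
# ordinary in-memory code in between is simply re-executed on replay.
# A sketch of the concept, not any real workflow engine's API.
journal = {}          # step name -> recorded result (stand-in for durable storage)

def activity(name, fn):
    if name in journal:              # replay: reuse the persisted result
        return journal[name]
    result = fn()                    # first run: execute and persist
    journal[name] = result
    return result

def workflow():
    user = activity("fetch_user", lambda: {"id": 1, "name": "Ada"})
    greeting = f"hello {user['name']}"   # plain code: recomputed on replay
    return activity("send_email", lambda: f"sent: {greeting}")

print(workflow())      # first execution populates the journal
print(workflow())      # "replay" reuses the persisted activity results
```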

