> I don't think that's a commonly-accepted (or useful) definition of "blocking." By that definition, getpid(2) is blocking.
In the sense that you cannot expect a specific duration, getpid() is blocking. If you call getpid() in a tight loop and then have performance issues, you can’t reasonably blame the system.
> This isn't a portable program; it's a Linux program
But the interface is a portable interface
> POSIX does not mandate that close blocks on anything other than removing the index from the fd table
And what if the fd-table is a very large hash table with high collision rate? How do you then specify how quickly close() should complete? 1ms/open fd? 10ms/open fd? Etc.
It should be clear that the problem here is that the author of the code had a faulty understanding of the system in which their code runs. Today the issue was that close() just happened to be too “slow.” If the number of input devices were higher, let’s say 2x more, then the same issue would have manifested even if close() were 2x “faster.” No matter how fast you make close(), there is a situation in which this issue would manifest itself. I.e. the application has a design flaw.
u/CyberRabbi is absolutely correct. It's true that for _some_ kinds of devices you could expect fast close(2) IF the device documents that. But as you can see, implementing this can be hard even for devices where you'd think close(2) has to be fast. Even a tmpfs might have trouble making close(2) fast due to concurrency issues.
The correct thing to do when you don't care about the result of close(2) is to call it in a worker thread. Ideally there would be async system calls for everything, including closing open resources. Better yet, there would be only async system calls, plus a single "wait for the next event on these handles" system call.
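For example, a minimal sketch (POSIX C, error handling mostly omitted, and the helper name close_async is made up) of handing close(2) to a detached worker thread when the result does not matter:

    #include <pthread.h>
    #include <stdint.h>
    #include <unistd.h>

    /* Worker that performs the potentially slow close() off the calling thread. */
    static void *close_worker(void *arg)
    {
        close((int)(intptr_t)arg);   /* result is deliberately ignored */
        return NULL;
    }

    /* Fire-and-forget close: hand the fd to a detached thread. */
    static void close_async(int fd)
    {
        pthread_t tid;
        pthread_attr_t attr;
        pthread_attr_init(&attr);
        pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
        if (pthread_create(&tid, &attr, close_worker, (void *)(intptr_t)fd) != 0)
            close(fd);               /* fall back to a synchronous close */
        pthread_attr_destroy(&attr);
    }

In a real program you would probably hand the fd to a long-lived worker (or use io_uring's async close) rather than spawn a thread per descriptor, since thread creation has its own cost.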
What the proposed patch does is defer a specific high-latency operation (freeing some memory) to an asynchronous context so that close() doesn’t block on it.
The proposed patch isn’t a comprehensive fix; it admits there are still other sources of relatively high close() latency.
So that got me thinking: there is no way to fix this “bug” because there is no specification on how long close() should take to complete. As far as we are promised in user-land, close() is not an instantaneous operation. close() is a blocking operation! Even worse, it’s an IO operation.
So now I think the bug is in the application. If you want to avoid the latency of close() you should do it asynchronously in another thread. This is similar to the rule that you should not do blocking IO on the main thread in an event-loop based application.
It is appropriate to use your main thread for your OS interaction - polling fds, talking to the display server, whatever I/O you need, etc. An open/close call should never take this long, and you should never need to make a large number of them in sequence after startup.
What should not be on your main thread is any long blocking compute, which is why rendering/game logic often goes to another thread - although simple games could easily be single-threaded.
> The call is only supposed to return when all the data has been sent, so both.
Good luck. Splicing from a file to a socket doesn’t send any data; it just references it. So if you mmap a file writably and then splice from it to a pipe, splice either needs to copy the data, or write-protect the mapping to enable copy-on-write, or take a reference that will potentially propagate future changes from the mapping to the pipe. Write-protecting the mapping is slow, so pretty much none of the options work. Blocking until the reference is gone makes no sense either, because the reference won’t go away until the caller splices the pipe to the network, which will never happen while the first splice is blocked. Deadlock.
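A minimal Linux sketch of the scenario (file name hypothetical, error handling omitted), just to show where the reference-not-copy semantics bite:

    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void)
    {
        int pipefd[2];
        int fd = open("data.bin", O_RDWR);   /* hypothetical file */
        char *map = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

        pipe(pipefd);

        /* "Send" the first page into the pipe. No bytes are copied here:
         * the pipe buffer just takes a reference to the page-cache page. */
        loff_t off = 0;
        splice(fd, &off, pipefd[1], NULL, 4096, SPLICE_F_MOVE);

        /* Scribbling on the mapping after the splice may still be visible
         * to whoever later reads the pipe or splices it onward, which is
         * exactly the ambiguity described above. */
        memset(map, 'X', 64);
        return 0;
    }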
Your fast path is still crippled by your slow paths' mess. It doesn't matter if you've isolated and optimized those paths in isolation, to the GC there's just Your Process and it's going to suspend Your Process whenever it wants for however long it needs regardless of what's currently happening.
So if you're latency sensitive then all of your code needs to be aggressive at avoiding object creation. All of your code becomes part of "the fast path", even if it's in a different thread.
Or you isolate your fast path in a different process or a non-GC'd runtime, the latter being the approach taken here by Instagram.
> It should block if and only if there is actual io in flight which could produce a failure return that an application needs.
Blocking simply means that the specification does not guarantee an upper bound on the completion time. There is no other meaningful definition. POSIX is not an RTOS therefore nearly all system calls block. The alternative is that the specification guarantees an upper bound on completion time. In that case what is an acceptable upper bound for close() to complete in? 1ms? 10ms? 100ms? Any answer diminishes the versatility of the POSIX VFS.
> Syscalls should be fast unless there is a very good reason not to be.
I think this is an instance of confusing what should be with what is. We’ve been through this before with O_PONIES. The reality is that system calls aren’t “fast” and they can’t portably or dynamically be guaranteed to be fast. So far the only exception to this is gettimeofday() and friends.
Robust systems aren’t built on undocumented assumptions. Again, POSIX is not an RTOS. Anything you build that assumes a deterministic upper bound to a blocking system call execution time will inevitably break, evidenced by OP.
> SO_LINGER. Lingers on close if data is present. If this option is enabled and there is unsent data present when close() is called, the calling application program is blocked during the close() call, until the data is transmitted or the connection has timed out.
I had to look this one up for a refresher, but 100% violently agree; such behavior certainly warrants a bug submission.
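For reference, this is roughly how you opt into that lingering behavior (a sketch with no error handling; enable_lingering_close is a made-up helper name). With l_onoff set, close() on the socket may block for up to l_linger seconds:

    #include <string.h>
    #include <sys/socket.h>

    /* Make close() on this TCP socket block until unsent data is
     * transmitted (or the timeout expires) instead of returning at once. */
    static int enable_lingering_close(int sock, int timeout_seconds)
    {
        struct linger lg;
        memset(&lg, 0, sizeof(lg));
        lg.l_onoff  = 1;                /* linger on close */
        lg.l_linger = timeout_seconds;  /* how long close() may block */
        return setsockopt(sock, SOL_SOCKET, SO_LINGER, &lg, sizeof(lg));
    }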
I'd say that the commonly accepted definition for a blocking call is one that may depend on I/O to complete, releasing control of the CPU core while waiting.
By that definition, getpid() is definitely nonblocking, though it doesn't have an upper bound on execution time. POSIX does not offer hard realtime guarantees.
close() in general would probably be blocking (as a filesystem may need to do I/O), but I'd expect it to behave nonblocking in most cases, especially when operating on virtual files opened read-only. Unfortunately, I don't think those kinds of behavioral details are documented.
> As soon as the second process writes a result file to a filesystem, it notifies the first process, and the first process exits. The second process can take time to exit, because it is not an interactive process.
Looks like a trade-off which may favor speed over reliability. If a syscall (like msync) in the second process returns an error, there will be no way to abort the build if the first (main) process has already finished.
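To make the trade-off concrete, here is a rough sketch of the pattern as I understand it (file names made up, most error handling omitted); any failure after the notification has nobody left to report to:

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        int notify[2];
        pipe(notify);

        if (fork() == 0) {
            /* Second process: write the result, then notify the first. */
            close(notify[0]);
            int fd = open("result.out", O_WRONLY | O_CREAT | O_TRUNC, 0644);
            write(fd, "done\n", 5);
            write(notify[1], "x", 1);   /* first process may exit from here on */

            /* Any failure past this point (msync/fsync/close/...) happens after
             * the first process is gone, so it can no longer fail the build. */
            if (fsync(fd) != 0 || close(fd) != 0)
                perror("late error, nobody left to report it to");
            return 0;
        }

        /* First (interactive) process: wait for the notification, then exit. */
        char c;
        close(notify[1]);
        read(notify[0], &c, 1);
        return 0;
    }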
To me, the trick with closing the resource sounds like a hack.
Imagine you wanted to generalise it: We have a goroutine that may use some resources: sockets, files, keyboard, screen etc. To shut it down you cut it off from all those resources and hope that it will die.
Is this really an issue? So far I haven't cared much about close or its synchronization. Isn't just calling CLOSED good enough? Do we really have to know exactly when CLOSE was called? If it were an issue I would rather do without CLOSE.
EDIT: or maybe just allow CLOSED guards in WITH, similar to SEND or RECEIVE, like
WITH
SEND(TxChan, text) DO
| RECEIVE(FailChan, fail) DO
| CLOSED(FailChan) DO
END
But I think CLOSE makes everything more complicated without a clear use; I mostly added it because Go has it.
EDIT2: yet another idea:
WITH
SEND(TxChan, text) DO
| ok := RECEIVE(FailChan, fail) DO
END
where ok is a BOOLEAN variable and the assignment is optional; it looks a bit strange but allows the same closed-channel handling as in Go.
In such a system each component still needs to be interruptible. If you're waiting on a blocking operation like disk or network I/O, the caller might give up, but the request will still hang around until completion, at which point the result gets thrown out.
In the case of a blocked process, you may be able to force-kill it at the system level, but then you risk uncleaned-up state (leaked connections/resources).
For instance, this is the default behavior with Postgres (queries complete even if the connection is closed).
Why would they call shutdown and then immediately close?
Just call close immediately.
In fact, there is almost no need for anyone ever to call shutdown.
It may make sense to call a unidirectional shutdown on a socket before passing its descriptor to another process, if you only want them to read from the socket and not write to it. But shutdown RDWR and then close is just silly.
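Something like this, as a sketch (the fd passing itself, e.g. via SCM_RIGHTS, is not shown, and make_read_only is a made-up name):

    #include <sys/socket.h>

    /* Shut down the write half before handing the descriptor to another
     * process: further writes on this socket fail, reads keep working, and
     * the remote peer eventually sees EOF for our direction. */
    static int make_read_only(int sock)
    {
        return shutdown(sock, SHUT_WR);
    }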
The proper fix is not to add more exception handling but to remove the shutdown line altogether.
Don't ever call close on the hot path.