fix: reduce deadlock potential in shell->client #180

ethanpailes · 2024-12-03T18:43:29Z

This patch attempts to reduce the potential to deadlock in the shell->client thread by removing the mutex lock around the shell->client unix socket stream. While recent changes to reduce deadlocks have successfully prevented a single locked session from gumming up all the other sessions, the core issue seems to be persisting. I don't see a smoking gun in the logs that a reporter sent me, but reducing the number of locks in the shell->client thread, which seems to be the one that gets locked up, might help. This should also be slight better for performance (not that it really matters here).

In addition to that main change, this patch contains a few little bugfixes:

Previously, every time the client probed for daemon presence, the daemon would put a distracting stack trace in its log. This wasn't a real issue, but it could be confusing for people not familiar with how shpool works. I've had a few people point to it in bug reports. I suppressed the stack trace and replaced it with a more civilized log line. Stack traces in the logs should be indicative of an issue, not business as usual.
I fixed the child_watcher thread to directly call waitpid rather than using the shpool_pty crate to do the work. This isn't ideal, but is required to avoid a use-after-close issue that was causing trouble. I think this bug was already present, but it just wasn't presenting a problem during tests for whatever reason (likely just timing). I had to fix this to get the tests passing.

This patch attempts to reduce the potential to deadlock in the shell->client thread by removing the mutex lock around the shell->client unix socket stream. While recent changes to reduce deadlocks have successfully prevented a single locked session from gumming up all the other sessions, the core issue seems to be persisting. I don't see a smoking gun in the logs that a reporter sent me, but reducing the number of locks in the shell->client thread, which seems to be the one that gets locked up, might help. This should also be slight better for performance (not that it really matters here). In addition to that main change, this patch contains a few little bugfixes: - Previously, every time the client probed for daemon presence, the daemon would put a distracting stack trace in its log. This wasn't a real issue, but it could be confusing for people not familiar with how shpool works. I've had a few people point to it in bug reports. I suppressed the stack trace and replaced it with a more civilized log line. Stack traces in the logs should be indicative of an issue, not business as usual. - I fixed the child_watcher thread to directly call waitpid rather than using the shpool_pty crate to do the work. This isn't ideal, but is required to avoid a use-after-close issue that was causing trouble. I think this bug was already present, but it just wasn't presenting a problem during tests for whatever reason (likely just timing). I had to fix this to get the tests passing.

ethanpailes requested a review from Aetf December 3, 2024 18:43

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: reduce deadlock potential in shell->client #180

fix: reduce deadlock potential in shell->client #180

ethanpailes commented Dec 3, 2024

fix: reduce deadlock potential in shell->client #180

Are you sure you want to change the base?

fix: reduce deadlock potential in shell->client #180

Conversation

ethanpailes commented Dec 3, 2024