forked from llvm/llvm-project
-
Notifications
You must be signed in to change notification settings - Fork 333
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Call site bkpts #9747
Open
jimingham
wants to merge
4
commits into
swiftlang:swift/release/6.1
Choose a base branch
from
jimingham:call-site-bkpts
base: swift/release/6.1
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Call site bkpts #9747
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
I have some reports of A/B inversion deadlocks between the ThreadPlanStack and the StackFrameList accesses. There's a fair bit of reasonable code in lldb that does "While accessing the ThreadPlanStack, look at that threads's StackFrameList", and also plenty of "While accessing the ThreadPlanStack, look at the StackFrameList." In all the cases I've seen so far, there was at most one of the locks taken that were trying to mutate the list, the other three were just reading. So we could solve the deadlock by converting the two mutexes over to shared mutexes. This patch is the easy part, the ThreadPlanStack mutex. The tricky part was because these were originally recursive mutexes, and recursive access to shared mutexes is undefined behavior according to the C++ standard, I had to add a couple NoLock variants to make sure it didn't get used recursively. Then since the only remaining calls are out to ThreadPlans and ThreadPlans don't have access to their containing ThreadPlanStack, converting this to a non-recursive lock should be safe. (cherry picked from commit b42a816)
…t-call-site-breakpoints"" This reverts commit c93fb6d.
I have some reports of A/B inversion deadlocks between the ThreadPlanStack and the StackFrameList accesses. There's a fair bit of reasonable code in lldb that does "While accessing the ThreadPlanStack, look at that threads's StackFrameList", and also plenty of "While accessing the ThreadPlanStack, look at the StackFrameList." In all the cases I've seen so far, there was at most one of the locks taken that were trying to mutate the list, the other three were just reading. So we could solve the deadlock by converting the two mutexes over to shared mutexes. This patch is the easy part, the ThreadPlanStack mutex. The tricky part was because these were originally recursive mutexes, and recursive access to shared mutexes is undefined behavior according to the C++ standard, I had to add a couple NoLock variants to make sure it didn't get used recursively. Then since the only remaining calls are out to ThreadPlans and ThreadPlans don't have access to their containing ThreadPlanStack, converting this to a non-recursive lock should be safe. (cherry picked from commit b42a816)
In fact, there's only one public API in StackFrameList that changes the list explicitly. The rest only change the list if you happen to ask for more frames than lldb has currently fetched and that always adds frames "behind the user's back". So we were much more prone to deadlocking than we needed to be. This patch uses a shared_mutex instead, and when we have to add more frames (in GetFramesUpTo) we switches to exclusive long enough to add the frames, then goes back to shared. Most of the work here was actually getting the stack frame list locking to not require a recursive mutex (shared mutexes aren't recursive). I also added a test that has 5 threads progressively asking for more frames simultaneously to make sure we get back valid frames and don't deadlock. (cherry picked from commit 186fac3)
adrian-prantl
approved these changes
Dec 14, 2024
@swift-ci please test |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This is a reworking of the PR I originally submitted as:
#9493
Turns out that patch could cause a A/B lock inversion - the SwiftREPL/BreakpointSimple.test was the only instance of that.
Fixing that required reworking the locking strategy for ThreadPlanStack and StackFrameList. The deadlock was artificial since in fact everybody acquiring these lock where it deadlocked were readers, not writers, so there wasn't in fact any contention.
This PR is a combination of the three patches on the llvm.org side, two that converted the two exclusive mutexes to shared mutexes, and the reapplication of the call site patch.
I'm doing this as a combo patch because the original patch caused no problems on llvm.org and so was not reverted, making putting this in as three separate patches troublesome.