-
Notifications
You must be signed in to change notification settings - Fork 352
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Print spans where tags are created and invalidated #2030
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Afaict this set of information will just keep growing indefinitely. At this point I think we could try a refactoring where SbTag actually is an Rc and we store the new tracking data in the heap allocation.
There were also some smaller cleanups you did (around mutable references and refcell borrowing) that I think can be pulled out into separate PRs or commits?
That is correct, but in this implementation we could store a finite amount of data per allocation. Anecdotally, the problematic tags often seem to be created recently.
Can you elaborate on what you mean by this? Tags have a sort of obvious reference-counted-like behavior, but the problem we have at the moment is that we cannot clean up tags until the whole allocation goes away. Unless we can remove tags from a borrow stack because they are deemed inaccessible, I don't think there's any room for improvement.
This PR contains bits of two other PRs that are required to make it work. I think talking about cleanups will be a lot easier once those land and I can cleanly rebase onto the exact revisions that land. |
My hope was that we could make the ones in the borrow stack be a |
☔ The latest upstream changes (presumably #1971) made this pull request unmergeable. Please resolve the merge conflicts. |
In this comment, I concluded we need something like what you're proposing but I didn't think to use So |
we could keep them around in a list just to be able to tell people at the end about all the leaked tags |
My concern is figuring out that a tag has been leaked. I'm just not sure how to implement that. |
b1f1b74
to
50e7b90
Compare
src/eval.rs
Outdated
.map(|frame| frame.current_span()) | ||
.unwrap_or(rustc_span::DUMMY_SP); | ||
if let Some(sb) = ecx.memory.extra.stacked_borrows.as_mut() { | ||
sb.get_mut().current_span = current_span; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if there is some reasonable way we could just have the span available where it is currently missing? I assume this is for memory_read
and memory_written
and it seems not unreasonable to give them a TyCtxtAt
which contains a span.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not entirely sure what you're getting at, but I have so many feelings about this code.
I've generally tried to avoid modifying rustc to provide the functionality I want here. But we definitely need some more state somewhere, I feel like I'm starting to shove state where it doesn't really belong.
This code produces DUMMY_SP
when there isn't a local span available. I'm not sure what span to select when there is nothing local. The topmost span? And this code is a bit constrained because it needs the array of local crates available to deduce which span it should be considering.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is ecx.cur_span()
to get the best approximation of the "current" span.
However, I did not realize that you are deliberately computing another span than that here. The problem is that inside the read/write handlers we cannot have access to the entire machine state since there are outstanding borrows on particular allocations. We could have access to the stack though... that might not even require changes to rustc, if we can move the threads
field from Evaluator
to MemoryExtra
. (There is not really a conceptual distinction between those two, we just put things into MemoryExtra
whenever that is required.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is how I think you can entirely avoid this "set_cur_span" thing: RalfJung@a20fb71
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need to thread a bit more logic through to have access to the local crate list, but the logic is factored better!
☔ The latest upstream changes (presumably #2024) made this pull request unmergeable. Please resolve the merge conflicts. |
2ad160b
to
2ad417e
Compare
After much searching, I have found an example where these diagnostics do very poorly.
This is some classic invalidation. People do this sort of thing all the time, but my attempt to diagnose this has gone wrong in many ways.
The span for pub struct StackVec<T, const N: usize> {
vec: [MaybeUninit<T>; N],
len: Size,
} So the retag of For what it's worth, I had to search through 61 crates that I know contain SB issues before I found one where this PR's diagnostics emitted something confusing. |
Is it possible that the invalidation happens at a |
I'm pretty sure it does. I minimized the code into this fn main() {
let mut x = 0i32;
let z = &mut x as *mut i32;
x.do_bad();
unsafe {
let _oof = *z;
}
}
trait Bad {
fn do_bad(&mut self) {
// who knows
}
}
impl Bad for i32 {} Which emits this
So I think the problem here, at least in my opinion, is that the interpreter stack has a frame which says its span is the entire called function during a |
We certainly cannot point at the call site, the retag is part of the callee operations (that is crucial for its role in optimizations, and also pragmatically speaking when that span is set we have no idea who will call this function). The span for the retag is determined here. |
Ah, yes. I think that we should probably surface the kind of retag that's happening here in a diagnostic, which may help avoid confusion. If nothing else it helps people get to the part of the linked SB reference that explains what's going on. I agree with the FIXME, maybe I'll try fixing it. But more broadly, I'm thinking about helping people fix their code. Pointing at the argument is technically correct, but the change that needs to be made to fix the aliasing in this situation is at the call site. So for diagnostic purposes, we could back up a stack frame when picking the current local span if we are inside a FnEntry retag. But of course really the problem here is crossing a function boundary, so I wonder if there's similar lack of useful information with the span selection during retagging that happens upon function return. |
How universally true is that for |
I'm at 1/1 so far. Currently trying to get this branch into my bot setup so that I can do a proper search. It might be a while, but I'm determined to know. |
7f37fa6
to
50f19e3
Compare
I wrote a little script that looks for invalidation errors where the first line of the span contains Best fixed at the call site: Best fixed in the function definition: Structurally problematic: One thing that sticks out to me is that a lot of these are this pattern: ptr::copy(arr.get_unchecked(src), arr.get_unchecked_mut(dst)); where the author wants to pass two pointers into their container to a function, so they call two functions to get them where the receiver of the second one is Mostly posting this to share my progress, still thinking on what to do with this information to be more helpful in diagnostics. |
50f19e3
to
0224071
Compare
☔ The latest upstream changes (presumably #2047) made this pull request unmergeable. Please resolve the merge conflicts. |
I merged In particular, |
0224071
to
fa31195
Compare
* Pass a ThreadInfo down to grant/access to get the current span lazily * Rename add_* to log_* for clarity * Hoist borrow_mut calls out of loops by tweaking the for_each signature * Explain the parameters of check_protector a bit more
dcf53c8
to
972b3b3
Compare
src/stacked_borrows.rs
Outdated
@@ -111,7 +118,12 @@ pub struct GlobalStateInner { | |||
tracked_call_ids: HashSet<CallId>, | |||
/// Whether to track raw pointers. | |||
tag_raw: bool, | |||
/// Extra per-allocation information | |||
extras: HashMap<AllocId, AllocHistory>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should at least try to move extras
into Stacks
. It looks to me like the log_*
functions could actually be implemented on Stacks
.
src/stacked_borrows/diagnostics.rs
Outdated
extras.current_time += 1; | ||
} | ||
|
||
fn get_stack_history( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the use of iterator combinators vs for loops and and_then
/map
vs if let
feels very random in this function and I keep expecting a specific choice to have a reason.
Maybe create one function per TagHistory
field and use ?
more within these functions?
* Store the local crates in an Rc<[CrateNum]> * Move all the allocation history into Stacks * Clean up the implementation of get_logs_relevant_to a bit
r=me with those updated unless they are up to date |
@@ -764,8 +797,16 @@ trait EvalContextPrivExt<'mir, 'tcx: 'mir>: crate::MiriEvalContextExt<'mir, 'tcx | |||
Permission::SharedReadWrite | |||
}; | |||
let item = Item { perm, tag: new_tag, protector }; | |||
stacked_borrows.for_each(range, |offset, stack| { | |||
stack.grant(orig_tag, item, (alloc_id, range, offset), &*global) | |||
let mut global = this.machine.stacked_borrows.as_ref().unwrap().borrow_mut(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This moves a borrow
into the closure, which is probably bad for performance.
Is it possible to keep it at the old place, or does that lead to multiple borrows?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah it does lead to multiple borrows. I actually attempted this transformation before in a previous commit then I amended it out when CI failed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, then please add a comment explaining that.
Same here. Would be good to also double-check that the benchmarks did not get any worse with these refactorings. |
Without raw pointer tagging, I cannot measure a regression. With it, I measure a regression of ~5% in |
@bors r+ |
📌 Commit 8ff0aac has been approved by |
Thanks for checking! Very excited for this PR being in miri proper |
☀️ Test successful - checks-actions |
Thanks a ton for pushing this through, @saethlin. :-) FWIW I hope to make raw ptr tagging the default once permissive provenance works, so if there is something we can do about those 5% that would be great... |
My instinct is that the 5% could be recovered if instead of a |
The lazy current span calculation computes the current span once per invalidated tag, potentially computing it many times in one interpreter step 🤦 |
Oh, good point. |
Factor current-span logic into a caching handle After #2030 and while working on #1935 it became quite clear that we need to do some caching here, because some retag operations generate many calls to `log_invalidation`, and would thus search the current thread's stack _many_ times for a local crate. This caching fixes that. This handle type also has the nice benefit of tucking away all the `ThreadManager` + `CrateNum` logic.
This was very helpful in fixing fitzgen/bacon-rajan-cc#37. Thanks for doing it! |
5225225 called this "automatic tag tracking" and I think that may be a reasonable description, but I would like to kill tag tracking as a primary use of Miri if possible. Tag tracking isn't always possible; for example if the UB is only detected with isolation off and the failing tag is made unstable by removing isolation. (also it's bad UX to run the tool twice)
This is just one of the things we can do with #2024
The memory usage of this is shockingly low, I think because the memory usage of Miri is driven by allocations where each byte ends up with its own very large stack. The memory usage in this change is linear with the number of tags, not tags * bytes. If memory usage gets out of control we can cap the number of events we save per allocation, from experience we tend to only use the most recent few in diagnostics but of course there's no guarantee of that so if we can manage to keep everything that would be best.
In many cases now I can tell exactly what these codebases are doing wrong just from the new outputs here, which I think is extremely cool.
New helps generated with plain old
cargo miri test
onrust-argon2
v1.0.0:And with
-Zmiri-tag-raw-pointers
onslab
v0.4.5And without raw pointer tagging,
cargo miri test
onhalf
v1.8.2The second suggestion is close to guesswork, but from experience it tends to be correct (as in, it tends to locate the pointer the user wanted) more often that it doesn't.