Implement a generic Destination Propagation optimization on MIR #72632
r? @eddyb (rust_highfive has picked a reviewer for you, use r? to override)
☔ The latest upstream changes (presumably #72935) made this pull request unmergeable. Please resolve the merge conflicts.
I haven't looked into these in detail, but:
Seems like we're very close to handling them though.
Exciting!
This modified example from #56172 regresses with this pass:

    #[inline(never)]
    pub fn g(clip: Option<&bool>) {
        clip.unwrap();
        let item = SpecificDisplayItem::PopStackingContext;
        do_item(&DI {
            item,
        });
        do_item(&DI {
            item,
        });
    }

    // `Copy` is required because `item` is used twice in `g`.
    #[derive(Clone, Copy)]
    pub enum SpecificDisplayItem {
        PopStackingContext,
        Other([f64; 22]),
    }

    struct DI {
        item: SpecificDisplayItem,
    }

    fn do_item(di: &DI) { unsafe { ext(di) } }

    extern "C" {
        fn ext(di: &DI);
    }

Nightly optimizes this properly, but with this pass, one of the
I always expected this to work the other way around, replacing dest with source rather than source with dest. What's the reasoning behind doing it this way? Feel free to just point me somewhere I can read up on it myself; I just didn't find anything right away.
The return variable is always called
That's what RVO does, assigning the return slot to a value that would otherwise use a separate stack slot, but that's just one way. Doing copy propagation works just as well as long as there is only one possible value that is being assigned.
is transformed to the following by this pass:
but copy propagation would lead to:
Something like dead store elimination would then have to clean up the dead assignment to
The example from above is akin to
and gets converted to
while copy propagation would give
The difference is that copy propagation doesn't force the two destinations to be alive at the same time, which apparently allows more optimizations to happen, e.g. in the case of aggregates. The case where the approach taken here works better is like this:
I'm wondering whether it's more common to use a single value in multiple places, or to assign to a single destination from a value that has been set in multiple places. I suppose the truth is somewhere in between... Maybe limiting this pass to cases where only a single replacement is possible would be useful?
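To make the two rewrites compared in this comment concrete, here is a minimal sketch on a toy straight-line program. It is not rustc's MIR and not the snippets from the comment itself: the `Stmt` type and the `rename`, `dest_prop` and `copy_prop` helpers are hypothetical names invented for illustration, and neither function performs the legality checks a real pass would need.

```rust
/// One statement of a toy straight-line program over numbered locals.
/// (Hypothetical model for illustration only; this is not rustc's MIR.)
#[derive(Clone, Debug, PartialEq)]
enum Stmt {
    /// `_local = constant`
    Init(usize, i64),
    /// `_dest = _src`
    Copy(usize, usize),
    /// `use(_local)`
    Use(usize),
}

/// Rename every occurrence of local `from` to `to` in one statement.
fn rename(stmt: &Stmt, from: usize, to: usize) -> Stmt {
    let r = |l: usize| if l == from { to } else { l };
    match *stmt {
        Stmt::Init(l, c) => Stmt::Init(r(l), c),
        Stmt::Copy(d, s) => Stmt::Copy(r(d), r(s)),
        Stmt::Use(l) => Stmt::Use(r(l)),
    }
}

/// Destination propagation for the copy `dest = src`: the source is renamed
/// to the destination everywhere, and the copy (now `dest = dest`) is dropped.
fn dest_prop(body: &[Stmt], dest: usize, src: usize) -> Vec<Stmt> {
    body.iter()
        .map(|s| rename(s, src, dest))
        .filter(|s| *s != Stmt::Copy(dest, dest))
        .collect()
}

/// Copy propagation for the same copy: reads of `dest` after the copy become
/// reads of `src`. The copy itself stays behind as a dead store.
fn copy_prop(body: &[Stmt], dest: usize, src: usize) -> Vec<Stmt> {
    let copy = Stmt::Copy(dest, src);
    let mut seen_copy = false;
    body.iter()
        .map(|s| {
            if *s == copy {
                seen_copy = true;
                return s.clone();
            }
            match *s {
                Stmt::Use(l) if seen_copy && l == dest => Stmt::Use(src),
                _ => s.clone(),
            }
        })
        .collect()
}
```

For the program `_1 = 5; _0 = _1; use(_0)`, `dest_prop` produces `_0 = 5; use(_0)`, while `copy_prop` produces `_1 = 5; _0 = _1; use(_1)` and relies on a later dead store elimination to remove the leftover copy.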
⌛ Testing commit 2f9271b with merge 5b31eada25b8031d6ddf1ef9c45878b0751431e7...
💔 Test failed - checks-actions
Scary but spurious? @bors retry
Implement a generic Destination Propagation optimization on MIR

This takes the work that was originally started by @eddyb in rust-lang#47954, and then explored by me in rust-lang#71003, and implements it in a general (i.e. not limited to acyclic CFGs) and dataflow-driven way (so that no additional infrastructure in rustc is needed).

The pass is configured to run at `mir-opt-level=2` and higher only. To enable it by default, some followup work on it is still needed:

* Performance needs to be evaluated. I did some light optimization work and tested against `tuple-stress`, which caused trouble in my last attempt, but didn't go much in depth here.
* We can also enable the pass only at `opt-level=2` and higher, if it is too slow to run in debug mode, but fine when optimizations run anyways.
* Debuginfo needs to be fixed after locals are merged. I did not look into what is required for this.
* Live ranges of locals (aka `StorageLive` and `StorageDead`) are currently deleted. We either need to decide that this is fine, or if not, merge the variable's live ranges (or remove these statements entirely – rust-lang#68622).

Some benchmarks of the pass were done in rust-lang#72635.
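As a rough illustration of why a dataflow-driven analysis matters once the CFG may contain loops, the sketch below runs a backward liveness fixpoint over a toy CFG and treats two locals as mergeable only if they are never live at the same point. This criterion is an assumption made for the example; the thread does not spell out how the pass computes conflicts, and `Node`, `liveness` and `can_merge` are hypothetical names, not rustc APIs.

```rust
use std::collections::BTreeSet;

/// One node of a toy control-flow graph: a single statement plus successors.
/// (Hypothetical model; locals are plain indices.)
struct Node {
    /// Locals read by the statement.
    uses: Vec<usize>,
    /// Local written by the statement, if any.
    def: Option<usize>,
    /// Indices of successor nodes.
    succs: Vec<usize>,
}

/// Backward liveness, iterated to a fixpoint so that loops are handled.
/// Returns `(live_in, live_out)` for every node.
fn liveness(cfg: &[Node]) -> (Vec<BTreeSet<usize>>, Vec<BTreeSet<usize>>) {
    let mut live_in: Vec<BTreeSet<usize>> = vec![BTreeSet::new(); cfg.len()];
    let mut live_out: Vec<BTreeSet<usize>> = vec![BTreeSet::new(); cfg.len()];
    let mut changed = true;
    while changed {
        changed = false;
        for i in (0..cfg.len()).rev() {
            // live_out = union of the successors' live_in sets
            let mut out = BTreeSet::new();
            for &s in &cfg[i].succs {
                out.extend(live_in[s].iter().copied());
            }
            // live_in = uses ∪ (live_out \ def)
            let mut inn = out.clone();
            if let Some(d) = cfg[i].def {
                inn.remove(&d);
            }
            inn.extend(cfg[i].uses.iter().copied());
            if inn != live_in[i] || out != live_out[i] {
                live_in[i] = inn;
                live_out[i] = out;
                changed = true;
            }
        }
    }
    (live_in, live_out)
}

/// Conservative check (an assumption of this sketch): locals `a` and `b` may
/// share storage only if no program point has both of them live or just written.
fn can_merge(cfg: &[Node], a: usize, b: usize) -> bool {
    let (live_in, live_out) = liveness(cfg);
    cfg.iter().enumerate().all(|(i, node)| {
        // Right after the statement, the written local counts as occupied too.
        let mut after = live_out[i].clone();
        after.extend(node.def);
        let both = |s: &BTreeSet<usize>| s.contains(&a) && s.contains(&b);
        !both(&live_in[i]) && !both(&after)
    })
}
```

On the straight-line program `_1 = ...; _0 = _1; use(_0)` this check allows merging `_0` and `_1`. If the last statement jumps back to the copy, `_1` stays live across the loop body and the check conservatively reports a conflict, which an analysis restricted to acyclic CFGs could not express.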
☀️ Test successful - checks-actions, checks-azure
@jonas-schievink is there an issue open for that assertion?
No, I haven't seen that happen anywhere else so far.
// The intended semantics here aren't documented, we just assume that nothing that
// could be written to by the assembly may overlap with any other operands.
The semantics of inline assembly are documented in the unstable book:
Operand expressions are evaluated from left to right, just like function call arguments. After the `asm!` has executed, outputs are written to in left to right order. This is significant if two outputs point to the same place: that place will contain the value of the rightmost output.