-
Notifications
You must be signed in to change notification settings - Fork 244
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add --spirt-passes
and a reduce
pass for replacing ops with their inputs/constants.
#988
Conversation
235474b
to
29626d7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's missing a changelog entry :)
I'm going back and forth on actually landing this right now. It's opt-in on top of SPIR-T itself being opt-in (you have to set something like One argument I could see, though, is it can get more testing this way. I'll add a changelog entry for |
29626d7
to
5a8bd42
Compare
5a8bd42
to
dc2581d
Compare
… anchors. (#18) This extra step, just before turning multi-version pretty-printed fragments into a HTML table row, allows for a clearer visual match between the columns, turning the multi-version mode into something closer to a pass "differ", but the use of *anchors* (instead of line contents) makes it more conservative, reliable and somewhat simpler. In short, if two adjacent versions contain a *definition* like `v123 = ...`, it should now end up on the same horizontal line in both versions (assuming it's not e.g. out of order wrt most of the other definitions). Before, adding or removing instructions (or even their pretty-printing needing different line amounts) would result in one side being behind (and the other, ahead), which would accumulate (even if most of their contents might be similar or even identical). With this PR, a lot of that should be alleviated, as padding (of empty lines) is added to keep both versions aligned as much as possible. --- *However*, while it works great for correcting *small* misalignments (the original usecase), it's quite zealous (and lacks any heuristics or further constraint-solving etc.), leading to results like this: |[**Before**](https://htmlpreview.github.io/?https://gist.github.com/eddyb/387fb2b7e128e8ca4e4c00ff82baf459/raw/0-before-spirt%252318-simplest_shader.spirt.html&dark#func1)<br><sub><sup>(click for complete<br>pretty HTML example)</sup></sub>|![image](https://user-images.githubusercontent.com/77424/224555823-318a9019-3463-4f9a-ac61-430d48197b72.png)| |:-:|-| |[**After**](https://htmlpreview.github.io/?https://gist.github.com/eddyb/387fb2b7e128e8ca4e4c00ff82baf459/raw/1-after-spirt%252318-simplest_shader.spirt.html&dark#func1)<br><sub><sup>(click for complete<br>pretty HTML example)</sup></sub>|![image](https://user-images.githubusercontent.com/77424/229123954-fd79df73-0cd4-41c6-b515-59646c0bdc5e.png)| This is for the same SPIR-T as the last screenshot from [the Rust-GPU `reduce`+`fuse_selects` PR](EmbarkStudios/rust-gpu#988 (comment)). The `OpNop`s don't help (and those should be gone once SPIR-T APIs are adjusted), but it's generally much worse looking than I was hoping. Maybe it's worth it for the ability, but some things are just annoying. Based on this screenshot alone, several possible further improvements come to mind: <sub>(_**EDIT**: several of these have been implemented in this PR since it was opened_)</sub> * [x] ~~hidden anchors could be added for e.g. `ControlNode`s and `ControlRegion`s, even if there's not necessarily any syntax to attach them to (other than the keyword for non-`Block` `ControlNode`s and the `{...}` brackets for `ControlRegion`?)~~ * [x] ~~the contents of the aligned lines could be compared and some kind of graying/translucency used to de-emphasize the lines that haven't changed~~ * ~~there's a risk here of making them confused with *removed* lines~~ * [x] ~~this is where merely *partial* grayscaling, or some kind of sepia effect, might come in handy~~ * [x] ~~such diff-like mode could supplant the use of `colspan` *entirely*, and allow horizontal scrolling of the whole table, with the max line length being used to size the cells (e.g. `td { max-width: 100ch; }`), instead of divvying up the whole viewport width~~ * <sub>may need some level of interactivity (based on which column is being hovered on?)</sub> * Rust-GPU's `--spirt-dump-passes` could skip the unstructured control-flow column, which adds a bunch of gaps for no good reason (or it could be kept but with the alignment turned off?) * <sub>euristics (or just more user control) to avoid large gaps</sub> * <sub>some additional "layout" pass could rearrange all the lines that are "floating" in between anchors</sub> * <sub>crucially, the current state is that the new vertical gaps are _always_ inserted just before the anchor, _and never elsewhere_, but the lines before the anchor might be more "attached" to it than the ones after (sadly, we can't just switch where the vertical gap goes unconditionally: anchors can easily precede multiple-line contents that _should not_ be broken up)</sub> * <sub>e.g. ideally the vertical gap would be at the outermost indentation level, which would allow e.g. structured control-flow to "coalesce together" (may require either using `LineOp` more directly, or just making indentation special in `TextOp`)</sub> * <sub>it almost feels like this requires a redesign of a bunch of the `print::pretty` layout, to allow "flexible vertical spacing", which frankly seems like overkill</sub> * <sub>dynamic control over the HTML table via JS, to allow user adjustments on the fly</sub> * <sub>some part of the anchor alignment algorithm would likely need to be implemented in JS, to avoid the combinatorial explosion in the data that would need to be statically generated otherwise</sub>
Right now the supported "reduction rules" are very limited (using Rust-y pseudosyntax for examples):
OpCompositeExtract
on the result ofOpCompositeInsert
- e.g.(a, b).0
➡️a
OpBitcast
on constants - e.g.5i32 as u32
➡️5u32
OpSwitch
on abool
-derived int - e.g.match (cond as u32) {...}
➡️if cond {...} else {...}
DataInst
reduction, but aControlNode
one, and it feels a bit hacky, but if we had an easier way to confirm the range of the integer beingOpSwitch
ed on, it would be equivalent to transformingswitch x { case 0: A; case 1: B; }
toif x != 0 {B} else {A}
(and then the job ofreduce
is to simplify thex != 0
to an existingbool
)(these were chosen by the motivation to make
for i in 0..n
optimize as well as the equivalentwhile
)However, the real power of this
reduce
pass is in the ways in which it plumbs dataflow through conditional control-flow and loops - using the pair example:if ... { (a, b) } else { (c, d) }.0
➡️if ... { a } else { c }
a
andc
are the same, or one of them isOpUndef
x = (a, b); loop { use(x.0); ... x = (c, d); }
➡️x_0 = a; loop { use(x_0); ... x_0 = c; }
(the mutation above is purely notational, that's "loop state", aka φ across a backedge in SSA)
Because of the restriction on "transforming away computation" without generating new SPIR-T
DataInst
s, at the SPIR-V level onlyOpPhi
s may be newly generated, everything else is reused or even removed.(and
OpPhi
s, aka "φ nodes", have the opportunity to go unmodified all the way down to the driver generating machine code and allocating registers based on the SSA dataflow, so it's hard to estimate their "true cost")(EDIT: a much simpler
fuse_selects
pass was also added after the PR was opened, to "seal the deal" onfor
loops, by combining twoif
-else
s with identical conditions - also see example lower below)Either way, this PR is powerful enough to deal with
for i in 0..n
:(click to see the shader code used to get these outputs)
Note: this SPIR-T is not from
--dump-spirt-passes
but rendered from the output ofspirv-opt
(sadly that process itself, which involves SPIR-T restructurization, results in anif _ { (true, ...) } else { (false, ...) }
that's not in the SPIR-V itself)while
for
for
--spirt-passes=reduce,fuse_selects
And this is what
--dump-spirt-passes
itself looks like, for the example above:(there's still some stray
OpNop
s, a funny limitation of SPIR-T's linked lists accidentally not supporting removal yet, but in most cases wholeBlock
s of unused instructions can be removed, so it's not that bad)