chore: manage call stacks using a tree #6791
base: master
Conversation
Peak Memory Sample | Compilation Sample
Looks like this is an improvement over #6747 and gets us closer to the optimal #6753 (comment)
We are eating a performance penalty however.
Some good improvements here - I wonder why regression_4709 increased so much in compilation time though. From the description it sounds like inlining could take longer but regression_4709 seems to be dominated by a large loop instead.
My guess is that it is due to getting a big list of flattened call stacks; not too deep but very wide, which means some elements have a lot of children and searching among them is significant. I will try to use a sorted container for the children, like a btree.
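A minimal sketch of that idea, with hypothetical `CallStackNode`/`CallStackTree` types and a simplified `Location` stand-in (the compiler's real types differ): keeping each node's children in a `BTreeMap` keyed by location turns the linear scan over a very wide child list into an O(log n) lookup.

```rust
use std::collections::BTreeMap;

// Hypothetical simplified types; the real compiler has its own
// `Location` and `CallStackId` definitions.
type Location = (u32, u32); // (file id, span start) stand-in
type CallStackId = usize;

// A node in the call-stack tree. A `BTreeMap` keeps children sorted so
// "find or add child" is O(log n) instead of a linear scan.
#[derive(Default)]
struct CallStackNode {
    parent: Option<CallStackId>,
    children: BTreeMap<Location, CallStackId>,
}

#[derive(Default)]
struct CallStackTree {
    nodes: Vec<CallStackNode>,
}

impl CallStackTree {
    fn new() -> Self {
        // Node 0 is the root shared by every call stack.
        Self { nodes: vec![CallStackNode::default()] }
    }

    /// Extend the stack `parent` with `location`, reusing an existing
    /// child when the same extension was seen before.
    fn extend(&mut self, parent: CallStackId, location: Location) -> CallStackId {
        if let Some(&child) = self.nodes[parent].children.get(&location) {
            return child;
        }
        let child = self.nodes.len();
        self.nodes.push(CallStackNode { parent: Some(parent), children: BTreeMap::new() });
        self.nodes[parent].children.insert(location, child);
        child
    }
}

fn main() {
    let mut tree = CallStackTree::new();
    let a = tree.extend(0, (1, 10));
    let b = tree.extend(a, (1, 20));
    // Extending with an already-seen location reuses the existing node.
    assert_eq!(tree.extend(0, (1, 10)), a);
    assert_eq!(tree.nodes.len(), 3); // root + two unique extensions
    println!("a={a} b={b} nodes={}", tree.nodes.len());
}
```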
```diff
@@ -85,6 +86,8 @@ impl Function {
     /// Note that any parameters or attributes of the function must be manually added later.
     pub(crate) fn new(name: String, id: FunctionId) -> Self {
         let mut dfg = DataFlowGraph::default();
+        // Adds root node for the location tree
+        dfg.add_location_to_root(Location::dummy());
```
Is this something that could be done in `DataFlowGraph::default()` itself? Or maybe in the new `CallStackHelper`?
LGTM - we may still want to change call stacks back to Vecs from lists now that we no longer need the sharing there
Changes to Brillig bytecode sizes
🧾 Summary (10% most significant diffs)
Full diff report 👇
```rust
.iter()
.rev()
.take(1000)
```
I assume we're searching in reverse so that we find recent additions faster; is there any significance to giving up after 1000 children, after which a duplicate entry is acceptable? Is that value something that was actually observed?
I wonder whether you considered a reverse index from e.g. the hash of the location to the child `CallStackId`, which could make this faster and easier to reason about? Like `children: HashMap<u64, CallStackId>` where the key is `fxhash::hash64(location)`.
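A minimal sketch of the suggested reverse index, using the std `DefaultHasher` as a dependency-free stand-in for `fxhash::hash64`, with hypothetical `Node`/`Location` types. One caveat worth noting: keying on the 64-bit hash means a hash collision would silently merge two distinct locations, which real code would need to tolerate or guard against.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

// Stand-in for `fxhash::hash64`; the review suggests fxhash, but the
// std hasher keeps this sketch self-contained.
fn hash64<T: Hash>(value: &T) -> u64 {
    let mut h = DefaultHasher::new();
    value.hash(&mut h);
    h.finish()
}

type Location = (u32, u32); // hypothetical stand-in for the real Location
type CallStackId = usize;

#[derive(Default)]
struct Node {
    // Reverse index: location hash -> child id. Average O(1) lookup,
    // at the cost of merging locations whose hashes collide.
    children: HashMap<u64, CallStackId>,
}

// Find the child of `parent` for `location`, allocating a new node if
// this extension has not been seen before.
fn get_or_add(nodes: &mut Vec<Node>, parent: CallStackId, location: Location) -> CallStackId {
    let key = hash64(&location);
    let next_id = nodes.len();
    let child = *nodes[parent].children.entry(key).or_insert(next_id);
    if child == next_id {
        nodes.push(Node::default());
    }
    child
}

fn main() {
    let mut nodes: Vec<Node> = vec![Node::default()]; // node 0 is the root
    let a = get_or_add(&mut nodes, 0, (1, 42));
    let b = get_or_add(&mut nodes, 0, (1, 42));
    assert_eq!(a, b); // duplicate extensions are deduplicated in O(1)
    assert_eq!(nodes.len(), 2); // root + one unique child
    println!("a={a} b={b} nodes={}", nodes.len());
}
```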
Description
Problem*
Resolves #6603
Summary*
Call stacks are stored inside a big tree, which allows identical prefixes to be shared between call stacks
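To illustrate the prefix sharing (a hypothetical sketch, not the compiler's actual types): two call stacks that begin with the same calls share tree nodes, and a full stack is recovered by walking parent links from a leaf back to the root.

```rust
type Location = &'static str; // stand-in for the compiler's Location
type CallStackId = usize;

struct Node {
    parent: Option<CallStackId>,
    location: Option<Location>, // `None` only for the root
}

/// Rebuild a full call stack by walking parent links from a leaf to the root.
fn unwind(nodes: &[Node], mut id: CallStackId) -> Vec<Location> {
    let mut stack = Vec::new();
    while let Some(loc) = nodes[id].location {
        stack.push(loc);
        id = nodes[id].parent.expect("non-root nodes have a parent");
    }
    stack.reverse();
    stack
}

fn main() {
    // `main -> foo -> bar` and `main -> foo -> baz` share the nodes
    // for the common prefix `main -> foo`.
    let nodes = vec![
        Node { parent: None, location: None },            // 0: root
        Node { parent: Some(0), location: Some("main") }, // 1
        Node { parent: Some(1), location: Some("foo") },  // 2
        Node { parent: Some(2), location: Some("bar") },  // 3
        Node { parent: Some(2), location: Some("baz") },  // 4
    ];
    assert_eq!(unwind(&nodes, 3), vec!["main", "foo", "bar"]);
    assert_eq!(unwind(&nodes, 4), vec!["main", "foo", "baz"]);
    println!("{:?}", unwind(&nodes, 4));
}
```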
Additional Context
The only drawback is that we need to re-create the trees across function contexts during inlining
Documentation*
Check one:
PR Checklist*
I have run `cargo fmt` on default settings.