Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Example of an unexpectedly-huge contract artifact. #6786

Open
iAmMichaelConnor opened this issue Dec 12, 2024 · 6 comments
Open

Example of an unexpectedly-huge contract artifact. #6786

iAmMichaelConnor opened this issue Dec 12, 2024 · 6 comments
Assignees

Comments

@iAmMichaelConnor
Copy link
Collaborator

AztecProtocol/aztec-packages#10546

In particular, create_note and create_note_no_init_check

https://github.com/AztecProtocol/aztec-packages/blob/6f209fb69fcce868c6e0fe9b79b5ac3f3a1e5c48/noir-projects/noir-contracts/contracts/stateful_test_contract/src/main.nr#L45

@github-project-automation github-project-automation bot moved this to 📋 Backlog in Noir Dec 12, 2024
@iAmMichaelConnor
Copy link
Collaborator Author

@TomAFrench
Copy link
Member

TomAFrench commented Dec 12, 2024

@aakoshh Do you have bandwidth to take a look at this?

@aakoshh
Copy link
Contributor

aakoshh commented Dec 13, 2024

The first jump I found in the size was AztecProtocol/aztec-packages@3304046 where the create_note_no_init_check goes from 164,516 bytes in the previous commit up all the way to 13,815,792, or if we consider the entire stateful_test_contract-StatefulTest.json then it goes from 3.3M to 42M. This is not the final size, but it's something worth intestigating.

Commands to check:

PROGRAM_DIR=$pwd/noir-projects/noir-contracts 
git checkout 3304046704e257902e32b86baf1aafc8b23bcaf6
cd noir/noir-repo
cargo run --release -p nargo_cli -- --program-dir $PROGRAM_DIR compile --package stateful_test_contract
jq '.functions | sort_by(.name) | .[] | {name: .name, size: .bytecode | length }' $PROGRAM_DIR/target/stateful_test_contract-StatefulTest.json

@aakoshh
Copy link
Contributor

aakoshh commented Dec 13, 2024

Within the syncing PR AztecProtocol/aztec-packages#10307 that backs the above squashed commit, the two commits where there are significant bytecode size increases are:

@aakoshh
Copy link
Contributor

aakoshh commented Dec 14, 2024

The following is the printout of the number of lines in the SSA in AztecProtocol/aztec-packages@369ea54

COMPILING CONTRACT FUNCTION create_note_no_init_check
Initial SSA:: 7661
After Defunctionalization:: 7693
After Removing Paired rc_inc & rc_decs:: 7693
After Runtime Separation:: 8136
After Resolving IsUnconstrained:: 8136
After Inlining (1st):: 12098
After Mem2Reg (1st):: 8727
After Simplifying (1st):: 7953
After `as_slice` optimization: 7953
After `static_assert` and `assert_constant`:: 7953
After Loop Invariant Code Motion:: 7953
After Unrolling:: 40728
After Simplifying (2nd):: 38161
After Flattening:: 58927
After Removing Bit Shifts:: 58927
After Mem2Reg (2nd):: 33335
After Inlining (2nd):: 33813
After Remove IfElse:: 1287564
After Constant Folding:: 967623
After EnableSideEffectsIf removal:: 965798
After Constraint Folding:: 963258
After Dead Instruction Elimination:: 958799
After Simplifying:: 958797
After Array Set Optimizations:: 958797
After Constant Folding with Brillig:: 958478
After Dead Instruction Elimination:: 958458

We can see that there is a dramatic increase in the Remove IfElse step, going from 33,813 to 1,287,564

Compare it with the same in its parent commit AztecProtocol/aztec-packages@0577c1a

COMPILING CONTRACT FUNCTION create_note_no_init_check
Initial SSA:: 8009
After Defunctionalization:: 8041
After Removing Paired rc_inc & rc_decs:: 7861
After Runtime Separation:: 8342
After Resolving IsUnconstrained:: 8342
After Inlining (1st):: 13330
After Mem2Reg (1st):: 11720
After Simplifying (1st):: 10834
After `as_slice` optimization: 10834
After `static_assert` and `assert_constant`:: 10834
After Loop Invariant Code Motion:: 10834
After Unrolling:: 44343
After Simplifying (2nd):: 41738
After Flattening:: 71721
After Removing Bit Shifts:: 71721
After Mem2Reg (2nd):: 35739
After Inlining (2nd):: 35277
After Remove IfElse:: 39020
After Constant Folding:: 31940
After EnableSideEffectsIf removal:: 30111
After Constraint Folding:: 27518
After Dead Instruction Elimination:: 19253
After Simplifying:: 19251
After Array Set Optimizations:: 19251
After Constant Folding with Brillig:: 18969
After Dead Instruction Elimination:: 18944

In this version RemoveIfElse adds less than 4,000 lines.

Further digging revealed that remove_if_else.rs calls value_merger.rs on arrays, which merges the then and else values being assigned to arrays by merging their values index by index, and creating new arrays of the merged values, without the if-then-else. This introduces 4096 new opcodes for each merge in this example, doing it hundreds of times. None of these two modules had changed, so some preceding SSA pass must have put a lot more arrays in the code than existed before.

Printing the number of arrays reveals where they get introduced.

This is in 0577c1a

After Simplifying (2nd):: 41738
f0: if-then-else arrays: 0
f44: if-then-else arrays: 0
f103: if-then-else arrays: 0
f110: if-then-else arrays: 0
f117: if-then-else arrays: 0
f159: if-then-else arrays: 0
f170: if-then-else arrays: 0
f175: if-then-else arrays: 0
After Flattening:: 71721
f0: if-then-else arrays: 77
f44: if-then-else arrays: 10
f103: if-then-else arrays: 78
f110: if-then-else arrays: 38
f117: if-then-else arrays: 1
f159: if-then-else arrays: 22
f170: if-then-else arrays: 102
f175: if-then-else arrays: 60
...
After Mem2Reg (2nd):: 35739
f0: if-then-else arrays: 17
f44: if-then-else arrays: 0
f103: if-then-else arrays: 30
f110: if-then-else arrays: 0
f117: if-then-else arrays: 0
f159: if-then-else arrays: 0
f170: if-then-else arrays: 0
f175: if-then-else arrays: 22
After Inlining (2nd):: 35277
f0: if-then-else arrays: 17
After Remove IfElse:: 39020
f0: if-then-else arrays: 0

And this is in 369ea54:

After Flattening:: 58927
f0 if-then-else arrays: 352
f44 if-then-else arrays: 10
f103 if-then-else arrays: 46
f110 if-then-else arrays: 46
f117 if-then-else arrays: 1
f159 if-then-else arrays: 26
f170 if-then-else arrays: 126
f175 if-then-else arrays: 36
...
After Mem2Reg (2nd):: 33335
f0 if-then-else arrays: 283
f44 if-then-else arrays: 0
f103 if-then-else arrays: 22
f110 if-then-else arrays: 0
f117 if-then-else arrays: 0
f159 if-then-else arrays: 0
f170 if-then-else arrays: 0
f175 if-then-else arrays: 16
After Inlining (2nd):: 33813
f0 if-then-else arrays: 283
After Remove IfElse:: 1287564
f0 if-then-else arrays: 0

The numbers show the count of of arrays that would be merged in the entry block of the functions, the only block remove_if_else looks at, I assume because Inlining should have eliminated all other blocks by then (at least all other functions disappeared). 283*4096 is explains most of the growth in the instruction count.

    let mut count = 0;
    let block = function.entry_block();
    for instruction in function.dfg[block].instructions() {
        if let Instruction::IfElse { then_condition: _, then_value, else_value: _ } =
            function.dfg[*instruction]
        {
            let then_value = function.dfg.resolve(then_value);
            if let Type::Array(..) = function.dfg.type_of_value(then_value) {
                count += 1;
            }
        }
    }

So the question is why has flatten_cfg introduced so many more if-then-else arrays then before; and this is a module that has changed in this commit, with the crux being these lines: where previously it was a Store, now it can be an IfElse, which in case of an array gets removed later by merging values.

One further thing to look at would be whether there were more stores to arrays in this commit then before.
Adding this extra count shows that before the change there were more stores to arrays, but they got eliminated by mem2reg:

After Flattening:: 71721
f0 if-then-else arrays: 77, store arrays: 516
After Mem2Reg (2nd):: 35739
f0 if-then-else arrays: 17, store arrays: 0
After Remove IfElse:: 39020
f0 if-then-else arrays: 0, store arrays: 0

whereas after there are more if-then-else and less stores, but the if-then-else's don't get eliminated, which then plays a magnified role during remove_if_then:

After Flattening:: 58927
f0 if-then-else arrays: 352, store arrays: 374
After Mem2Reg (2nd):: 33335
f0 if-then-else arrays: 283, store arrays: 0
After Remove IfElse:: 1287564
f0 if-then-else arrays: 0, store arrays: 0

@aakoshh
Copy link
Contributor

aakoshh commented Dec 15, 2024

Looking at what is causing the following:

AztecProtocol/aztec-packages@70cf870f the size goes from 4,331,388 to 15,129,080 compared to feef655

Printing SSA sizes does not reveal any smoking gun, the numbers are similar to above.
Actually in this case, commit feef655 looks bigger on paper than what follows:

Dead Instruction Elimination (2nd): 874,549 lines, 36,977,266 chars
f0: 874,081 instructions
ACIR OPCODES BEFORE TRANS: 298,782
ACIR OPCODES AFTER TRANS: 299,471

Whereas in 70cf870f we have a smaller SSA, but the final ACIR opcode count is 5x larger:

Dead Instruction Elimination (2nd): 641,474 lines, 28,936,683 chars
f0: 640,942 instructions
OPCODES BEFORE TRANS: 1,580,140
OPCODES AFTER TRANS: 1,580,497

The following are the statistics of how many ACIR opcodes were generated from how many encountered Instructions:

In feef655:

f0: 874081 instructions
IncrementRc: instructions=73, opcodes=0, avg=0
Call: instructions=143, opcodes=1562, avg=10
Not: instructions=756, opcodes=0, avg=0
Binary: instructions=437052, opcodes=7918, avg=0
ArrayGet: instructions=145101, opcodes=288850, avg=1
Truncate: instructions=8, opcodes=86, avg=10
EnableSideEffectsIf: instructions=1598, opcodes=0, avg=0
RangeCheck: instructions=168, opcodes=304, avg=1
MakeArray: instructions=302, opcodes=0, avg=0
Cast: instructions=287870, opcodes=0, avg=0
Constrain: instructions=1010, opcodes=522, avg=0
ACIR OPCODES=298782

In 70cf870f:

f0: 640942 instructions
IncrementRc: instructions=30, opcodes=0, avg=0
Call: instructions=172, opcodes=1822, avg=10
Not: instructions=1167, opcodes=0, avg=0
Binary: instructions=476607, opcodes=1262996, avg=2
ArrayGet: instructions=158692, opcodes=314999, avg=1
Truncate: instructions=9, opcodes=93, avg=10
EnableSideEffectsIf: instructions=1686, opcodes=0, avg=0
RangeCheck: instructions=168, opcodes=304, avg=1
MakeArray: instructions=303, opcodes=0, avg=0
Cast: instructions=906, opcodes=0, avg=0
Constrain: instructions=1202, opcodes=682, avg=0
ACIR OPCODES=1581401

The biggest visible change is in the ratio of opcodes generated from Binary instructions: we have roughly the same number of instructions, but instead of 7,918 opcodes end u with 1,262,996 them.

Just looking at binary instructions:

Binary add: instructions=144620, opcodes=732, avg=0
Binary eq: instructions=424, opcodes=1669, avg=3
Binary lt: instructions=828, opcodes=2436, avg=2
Binary mul: instructions=147400, opcodes=3078, avg=0
Binary or: instructions=18, opcodes=3, avg=0
Binary sub: instructions=143762, opcodes=0, avg=0

and then:

Binary add: instructions=157852, opcodes=315542, avg=1
Binary eq: instructions=424, opcodes=1669, avg=3
Binary lt: instructions=860, opcodes=2436, avg=2
Binary mul: instructions=317308, opcodes=943346, avg=2
Binary or: instructions=18, opcodes=3, avg=0
Binary sub: instructions=145, opcodes=0, avg=0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: 📋 Backlog
Development

No branches or pull requests

3 participants