Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

~5x compilation slowdown compared to 2.13 on ~400 files #20217

Closed
keynmol opened this issue Apr 18, 2024 · 12 comments
Closed

~5x compilation slowdown compared to 2.13 on ~400 files #20217

keynmol opened this issue Apr 18, 2024 · 12 comments
Assignees
Labels

Comments

@keynmol
Copy link
Contributor

keynmol commented Apr 18, 2024

I have created this repository by removing most of the dependencies and plugins: https://github.com/keynmol/smithy4s-aws-only

It contains ~428 Scala files.

You can switch Scala version using environment variable:

$ SCALA_VERSION="2.13.13" sbt -Dsbt.task.timings=true 'clean;compile' 2>&1 | grep "root / Compile / compileIncremental"
[info]   root / Compile / compileIncremental                           : 28178 ms

$ SCALA_VERSION="3.3.3" sbt -Dsbt.task.timings=true 'clean;compile' 2>&1 | grep "root / Compile / compileIncremental"
 [info]   root / Compile / compileIncremental                           : 146483 ms

$ SCALA_VERSION="3.4.1" sbt -Dsbt.task.timings=true 'clean;compile' 2>&1 | grep "root / Compile / compileIncremental"
[info]   root / Compile / compileIncremental                           : 158857 ms

I'm observing consistent 5x slowdown compared to 2.13 - the code doesn't change (no codegenerators).

Minimising it further would be difficult, as this difference is most pronounced on this relatively large amount of files.

@keynmol keynmol added the stat:needs triage Every issue needs to have an "area" and "itype" label label Apr 18, 2024
@WojciechMazur
Copy link
Contributor

The slowdown seems to be similar to the one observed in #20120, maybe these 2 are related

@lrytz
Copy link
Member

lrytz commented Apr 18, 2024

Here are -Yprofile-trace files from 2 and 3 compilers (I did publishLocal on #19897).

traces.zip

It seems about typing the implicit val schema (there are 500 of them), each takes around 240 ms on Scala 3 but only a 2 ms on Scala 2.

@Gedochao Gedochao added itype:performance area:implicits related to implicits and removed stat:needs triage Every issue needs to have an "area" and "itype" label labels Apr 18, 2024
@Gedochao
Copy link
Contributor

Gedochao commented Jul 4, 2024

@noti0na1 let's check how bad it is at this point, since we did a number of improvements since this was raised (#20399 and #20523, and some others).

@keynmol
Copy link
Contributor Author

keynmol commented Jul 4, 2024

I don't think it made a difference unfortunately:

~/p/k/smithy4s-aws-only (main) > SCALA_VERSION="2.13.13" sbt -Dsbt.task.timings=true 'clean;compile' 2>&1 | grep "root / Compile / compileIncremental"
[info]   root / Compile / compileIncremental                           : 14466 ms

~/p/k/smithy4s-aws-only (main) > SCALA_VERSION="3.3.3" sbt -Dsbt.task.timings=true 'clean;compile' 2>&1 | grep "root / Compile / compileIncremental"
[info]   root / Compile / compileIncremental                           : 65765 ms

~/p/k/smithy4s-aws-only (main) > SCALA_VERSION="3.5.0-RC2" sbt -Dsbt.task.timings=true 'clean;compile' 2>&1 | grep "root / Compile / compileIncremental"
[info]   root / Compile / compileIncremental                           : 63716 ms

~/p/k/smithy4s-aws-only (main) > SCALA_VERSION="3.6.0-RC1-bin-20240703-75a15c2-NIGHTLY" sbt -Dsbt.task.timings=true 'clean;compile' 2>&1 | grep "root / Compile / compileIncremental"
[info]   root / Compile / compileIncremental                           : 64124 ms

@noti0na1
Copy link
Member

Due to the differences in the implicit search approach, the compile time in Scala 3 is expected to differ from Scala 2. If we could create a minimal example with one implicit definition that clearly takes longer to compile, we can then identify potential improvements.

CC @Gedochao

@Baccata
Copy link

Baccata commented Jul 16, 2024

The fact that the repro is large is deceptive : the code shared by @keynmol is a repetition of the exact same implicit pattern across a great many files/classes. Running a profiler (like yourkit) on the compilation of that project should allow for precisely pinpointing possible areas of improvement, and profiling the compiler would likely be more difficult on a minified repro that would contain a single occurrence of the implicit.

@noti0na1 noti0na1 added the stat:needs minimization Needs a self contained minimization label Jul 18, 2024
@noti0na1
Copy link
Member

noti0na1 commented Jul 25, 2024

The core issue is: during resolveOverloaded, the alts are widened in many places and multiple times. However, the alts are provisional TermRefs in this case, so the result of denot is not cached. When an alt has denotation union with a large number of alternatives, then computation is expensive and producing the same result over and over again.

I tried to cache some result of alt.widen inside resolveOverloaded (which is probably not the correct way), and the compiling time of the example project can be reduced to Scala 2 level.

cc @odersky

@odersky
Copy link
Contributor

odersky commented Jul 25, 2024

If it helps, maybe we should do what you tried: cache in resolveOverloaded.

@Gedochao Gedochao removed the stat:needs minimization Needs a self contained minimization label Aug 5, 2024
noti0na1 added a commit that referenced this issue Sep 18, 2024
A simple performance improvement extracted from #21278. This PR reduces `denot` calls in `widen`s.

This is a simpler approach which can reduce the compilation time of
#20217 from 60s to about 40s.
@keynmol
Copy link
Contributor Author

keynmol commented Sep 23, 2024

Should I see an improvement in 3.6.0-RC1-bin-20240922-22ed2fb-NIGHTLY (from yesterday?) given PR was merged 5 days ago.

~/p/i/smithy4s-aws-only (main) > SCALA_VERSION="3.6.0-RC1-bin-20240922-22ed2fb-NIGHTLY" sbt -Dsbt.task.timings=true 'clean;compile' 2>&1 | grep "root / Compile / compileIncremental"
[info]   root / Compile / compileIncremental                           : 66566 ms

~/p/i/smithy4s-aws-only (main) > SCALA_VERSION="3.6.0-RC1-bin-20240922-22ed2fb-NIGHTLY" sbt -Dsbt.task.timings=true 'clean;compile' 2>&1 | grep "root / Compile / compileIncremental"
[info]   root / Compile / compileIncremental                           : 62118 ms

@noti0na1
Copy link
Member

noti0na1 commented Sep 23, 2024

Hi @keynmol , can you test the nightly versions before and after #21584 again? I do observe some improvement on my laptop.

#21278 would improve more than #21584, but it is too complicated and may introduce some overhead in other cases.

We will think about other way to improve in the future.

@keynmol
Copy link
Contributor Author

keynmol commented Sep 23, 2024

I don't see much difference on my laptop:

~/p/i/smithy4s-aws-only (main) > SCALA_VERSION="3.6.0-RC1-bin-20240922-22ed2fb-NIGHTLY" sbt -Dsbt.task.timings=true 'clean;compile' 2>&1 | grep "root / Compile / compileIncremental"
[info]   root / Compile / compileIncremental                           : 61573 ms
~/p/i/smithy4s-aws-only (main) > SCALA_VERSION="3.6.0-RC1-bin-20240919-ba9e59a-NIGHTLY" sbt -Dsbt.task.timings=true 'clean;compile' 2>&1 | grep "root / Compile / compileIncremental"
[info]   root / Compile / compileIncremental                           : 62749 ms

But here's something fascinating - on Github's ubuntu runners the difference between 2.13 and Scala 3 is "only" 2x!

https://github.com/keynmol/smithy4s-aws-only/actions/runs/10997520261

CleanShot 2024-09-23 at 16 33 27@2x

I have no idea why my Apple M2 laptop is affected by this so much...

@soronpo
Copy link
Contributor

soronpo commented Sep 23, 2024

I have no idea why my Apple M2 laptop is affected by this so much...

Maybe the CI matrix should add at least M1 explicitly?
image

I don't know if the self-hosted runners include M1 or M2.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
8 participants