Introduce Best Effort compilation options #17582

jchyb · 2023-05-25T11:43:56Z

The aim of this ~~draft~~ PR is to introduce a compiler improvement for the use in IDEs. The idea was to allow for IDE's to work on erroring projects more easily, without any manual touchups (see scalameta/metals-feature-requests#327). To achieve this, a tasty-like format that can include errored/incomplete trees can be produced by the compiler, and consumed by the presentation compiler. Though mostly finished, this is a draft PR, meant for the maintainers and contributors of dotty to review the approach presented here. I am not yet sure would would it take to actually merge a feature like this one, although it is worth noting that all of the changes presented here are hidden under experimental options.

Introduction

This PR introduces 2 new experimental options for the compiler: -Ybest-effort and -Ywith-best-effort-tasty. It also introduces the Best Effort TASTy format, a TASTy aligned file format able to hold some errored trees. Behaviour of the options and the format is documented as part of this PR in the best-effort-compilation.md docs file.

Considerations

My overall aim was to be able to handle as many cases, with little as little maintainance necessary as possible. This is for example why pretty much the only required phases are Parser and Typer - they are enough for, as far as I know, all necessary metals completions and I did not find any issues with setting their products (best effort tasty files) as dependencies. Requiring, for example, PostTyper, would require for the errored trees to be able to somehow pass through that phase, meaning a large investment from me into working the existing known error cases through there (muddling the codebase in the process) and possibly from the maintainers working on both Typer (to be able to produce „correct” error trees) and PostTyper (to be able to consume „correct” errored trees), which would obviously make the entire initiative dreadful.

This is also why any tests are able to be put into a blacklist file, in case something changes and a neg test will not pass, or a new test will be added as part of issue fix that does not play well with the best-effort features.

Controversial aspects of the implementation

I wanted to relay some implementation aspects that may be seen as controversial, to more easily obtain feedback related to them (all welcome).

Allowed crashing - I imagine this may confuse users of metals, so those crashes should likely not be shown there in the logs. Handling every possible incorrect tree in the pickling phase would be quite a maintenance toll, so we allow the compiler to crash in some instances. If this proves to be a significant problem, one easy solution would be to catch any error in Best Effort TASTy pickling and unpicking and print some vague message. I also want to reiterate that this can only happen when using the newly added options. EDIT: I added a separate error message wrapper for crashes obtained during bets-effort compilation.
Undocumented Best Effort TASTy grammar - or rather documented only by the implementation. Serializing incorrect trees (without defining any set of what these incorrectness constitute as, possibly excluding some possible errored trees in the process) is a difficult problem. This aspect leads to my implementation, where I’ve made it so the pickler is allowed to panic and not serialize certain subtrees, or serialize error types in their place (same with unpicking), so some subtrees may be incomplete. Also, if this was documented we would likely want to version it, which leads to… EDIT: I added format grammar documentation in BestEffortTastyFormat.scala.
Best Effort Tasty versioning - or lack thereof. Unpickling will still fail when when compiling with an earlier minor version, but above that I added no other requirements. I think the implementation adjustments should be allowed in the patch releases, so we can course correct in case of mistakes. However, I think as the feature stabilizes, the changes will stop happening. If we already allow some error trees to not work (and we have to), some cases not working across some patch releases does not seem like a big issue (whether anyone even uses multiple patch versions across one multimode build is another thing). Additionally would personally like to see this backported onto LTS one day, so that’s another reason. EDIT: I still do not check separately best effort tasty versioning, but I reserved a space in the file to be able to do that if that becomes necessary.

The flipside of all of these quirks is that the feature in general works really well. Out of the ~~1312~~ 1559 failing „neg” test files checked, ~~1286~~ 1534 manage to pass the pickling and unpicking tests (most of the fails and crashes are caused by cyclic reference errors).

sjrd · 2023-05-26T12:18:10Z

I'm not opposed to having a slightly different, best-effort format for non-compiling code. However, I am worried about the "intentionally unspecified" aspect of it. Citing from a comment in the PR:

The Best Effort TASTy format also informally extends the regular TASTy grammar to allow the handling of as
large amount of incorrect trees produced by the compiler as possible. Because of this informality and unlike the regular
TASTy format, the BETASTy format should stay internal to the compiler.

(emphasis mine). Yet the whole point is for that format to be read from IDEs, which are definitely not internal to the compiler. That is already a direct contradiction, and is a symptom of what I'm worried about.

If we go this route, we should in fact specify the best-effort tasty format. Ideally, it would be possible for something like tasty-query to consume it and still provide some amount of guarantees.

If we don't specify it, all we have is some random data produced by some compiler, which some IDEs may be able to process to some extent. I am perhaps exaggerating those "some"s for effect, but even so, that's not a good situation to be in.

What fundamentally prevents specifying this format? Are there known blockers? Can we solve them? Even if we cannot attach run-time (evaluation) semantics to the resulting symbol table and trees, we should be able to attach specified, compile-time (structural) semantics.

jchyb · 2023-07-03T12:52:15Z

Sorry for the belated reply, I wanted to take care of some other issues before I came back to this.

By writing that it should stay internal to the compiler I meant that it should only be able to be read and written by the compiler. Since Metals uses the compiler directly (via the so-called presentation compiler) and never reads from the tasty itself, it is not an issue there.

The informality is there mostly because of the implementation. Accounting for unforeseen states of the compiler can be difficult. After all, we do aim to pass incorrect trees into the pickling phase, so since they are incorrect, reasoning about what will be found there is tough. So the solution I came up was to try to force pickling with a series of fallbacks, based on the already found neg-test of the compiler. If something cannot be pickled, in its place we pickle an ErrorType, Empty block, or nothing (depending on the situation). The main problem lies with unpicking. Since we pickle very naively like this, there are some situations where an assumption used in unpickler may not be correct, so we end up fallbacking in unpickler as well (made up example, since it is tough for me to remember specifics right now, we might fail to pickle a method, then while unpicking a class the method is expected there, so we stop unpickling that).

This may sound somewhat dire, but the two alternatives I know of are much worse to me:

rewrite completely pickler and unpickler for best effort - extremely high maintenance cost (not even mentioning implementation)
limit to easily identifiable cases, like simple Incorrect Type Errors, Missing Type, etc., and incorporate that into the current pickler and unpickler implementation - much less useful, in my opinion

So the be able to be formalized, it would need to be rewritten into one of the above.

sjrd · 2023-07-03T17:36:00Z

Since we pickle very naively like this, there are some situations where an assumption used in unpickler may not be correct, so we end up fallbacking in unpickler as well (made up example, since it is tough for me to remember specifics right now, we might fail to pickle a method, then while unpicking a class the method is expected there, so we stop unpickling that).

IMO this is the core of the problem. Your proposal is to say "anything goes", and who knows whether unpickling will be able to recover from it. Instead, I propose to specify what the unpickler needs to be able to recover from. Pointing to a non-existent method? That's fine in a sense, with the caveat that we cannot compute a tpe for such a tree, so it becomes ErrorType as well. If those things are specified, then external tools can recover from the same situations.

tgodzik · 2023-07-03T17:40:15Z

If those things are specified, then external tools can recover from the same situations.

I think the idea is that this shouldn't be used by external tools currently or ever (?)

sjrd · 2023-07-04T07:32:00Z

I don't think that point of view makes sense. If it stayed within one project (in the sbt sense of the term), that would make sense. But as soon as you cross the project boundary, even "within the compiler" can mean "across different versions of the compiler". Or different options. Also, it means that build tools need to communicate these files around; likely take them into consideration for incremental compilation (you can't afford to drop down to full recompilation just because an upstream dependency has a best-effort output), etc. IMO, that means these things will de facto be read by other tools.

jchyb · 2023-07-04T12:56:59Z

About the use in different compiler versions. My first idea was to limit the use to only the same patch compiler versions. This way we would be able to freely enact changes to the implementation, especially important for the LTS. This made sense to me as it seemed unlikely that many people would use different versions of scala 3 across different projects.

However, it's not like the current best-effort implementation has any guarantees to uphold anyway. It can still fail to produce/read betasty for rare cases, like for programs with certain circular type dependencies. So if we were to allow using betasty across different versions, and it would only work sometimes, this still might be better than artificially limiting it to a subset of project structures. As the implementation matures and is kept unchanged in subsequent compiler versions, it is going to fail for less and less cases. And that approach is always going to work for more cases than when limiting betasty use to the same compiler patch version. This is supported (somewhat by accident) by the fallback-based implementation where we may naively fallback both while writing and reading the betasty file. So ultimately, what I decided there was to keep the strong requirements about the minor versions from regular tasty file format, and still allow it to try reading (and perhaps failing to read) betasty from different patch versions.

You are right about the incremental compilation - we cannot really use it with the way best-effort compilation is now. But I don't think it can be used for failing projects anyway. Previously, modules depending on failing modules would just not be able to be compiled at all.

tgodzik · 2023-07-05T13:13:00Z

About the use in different compiler versions. My first idea was to limit the use to only the same patch compiler versions. This way we would be able to freely enact changes to the implementation, especially important for the LTS. This made sense to me as it seemed unlikely that many people would use different versions of scala 3 across different projects.

I think that scenario is highly unlikely and since we have only best-effort compilation, it's fine if that fails at times.

You are right about the incremental compilation - we cannot really use it with the way best-effort compilation is now. But I don't think it can be used for failing projects anyway. Previously, modules depending on failing modules would just not be able to be compiled at all.

It's highly unlikely that we could provide sensible incremental compilation here actually, only using the correctly compiled artifacts before. I don't think we should try to support it at this time for artifacts with potential errors.

@sjrd what kind of specification would you have in mind actually? How would that help the tooling? We would want to keep the changes to the compiler as maintainable as possible, so that we don't need to rewrite too many things.

@jchyb would it be possible to at least specify the cases you know of? So that we have a list for potencial tool formalization later?

Kordyjan · 2023-07-05T13:35:55Z

The best way forward is to decide that BETASTy files can only be read and written by the exact version of the compiler. This way, we can really treat them as an implementation detail. This will be beneficial as I believe there will be many reiterations and improvements along the way after shipping the first version. Maintaining any compatibility guarantees about BETASTy files is not our goal right now.

Later, when we are sure about the design and implementation, we can "open" the format by writing the specification and maintaining the same compatibility guarantees as TASTy does. Doing it prematurely may be counterproductive, as the design may change based on feedback.

jchyb · 2023-08-30T12:03:49Z

Hello again, apologies for a lack of (visible) movement in this PR.
In the meantime, I managed to add many improvements to the implementation here, to the point where I was also now able to specify the grammar used in the Best Effort TASTy files in a concise manner (found in BestEffortTastyFormat). With that in mind, I still agree with Paweł’s remarks above, and with the fact that for now the format should only be consumed by the compiler. However, to future proof the (now unchecked) compatibility concerns, I added and additional field (now unused, but reserved) for betasty versioning - though if deemed unnecessary for now, I can of course remove it.
All in all, I consider the work here complete, and ready for review. A more concise explanation of the PR can be found in the commit message, which I think could be helpful for review.

compiler/src/dotty/tools/dotc/Run.scala

compiler/src/dotty/tools/dotc/ast/tpd.scala

compiler/src/dotty/tools/dotc/core/Contexts.scala

tasty/src/dotty/tools/tasty/besteffort/BestEffortTastyHeaderUnpickler.scala

tasty/src/dotty/tools/tasty/TastyHeaderUnpickler.scala

tasty/src/dotty/tools/tasty/TastyFormat.scala

sjrd · 2023-11-23T15:46:08Z

compiler/src/dotty/tools/dotc/core/TypeErasure.scala

+      if ctx.isBestEffort then report.error(msg)
+      else throw FatalError(msg)


This one is a bit worrying. The doc comment explicitly says that it won't let !tp1.exists pass through. Presumably, callers rely on that not to crash just after.

I have added a comment to the doc of the method to reflect this. In the situations I tested, this was (luckily) not a problem, and the only solution here would be to keep throwing FatalError, which would stop the best-effort compilation (with a crash, no less). What I'm trying to say here is, by not throwing this FatalError we minimize crashes, and maximize successful best effort compilations

sjrd · 2023-11-23T15:47:59Z

compiler/src/dotty/tools/dotc/core/SymDenotations.scala

     *  TODO: Find a more robust way to characterize self symbols, maybe by
     *       spending a Flag on them?


I believe a flag has been spent on that since this was written, namely SelfName. Should this whole method become

final def isSelfSym(using Context): Boolean = is(SelfName)

?

sjrd

Overall looks good now. I only minor, localized comments.

compiler/src/dotty/tools/dotc/Driver.scala

compiler/src/dotty/tools/dotc/ast/TreeInfo.scala

compiler/src/dotty/tools/dotc/core/SymbolLoaders.scala

sjrd · 2024-04-08T11:27:37Z

compiler/src/dotty/tools/dotc/core/SymbolLoaders.scala

-        moduleRoot.classSymbol.rootTreeOrProvider = unpickler
+      if (!isBestEffortTasty || ctx.withBestEffortTasty) then
+        val tastyBytes = tastyFile.toByteArray
+        val unpickler = new tasty.DottyUnpickler(tastyFile, tastyBytes, isBestEffortTasty = isBestEffortTasty)


This line is suspicious. It locally recreates an unpickler that seems to be equivalent to the private val unpickler at the class level. Moreover, the old code did not create such a local duplicate. Should this line be removed?

RIght, thank you for catching that. You are absolutely right, I must have added it by mistake on one of the rebases

sjrd · 2024-04-08T11:36:47Z

compiler/src/dotty/tools/dotc/core/tasty/TreePickler.scala

+  private inline def passesConditionForErroringBestEffortCode(condition: Boolean)(using Context): Boolean =
+    ((!ctx.isBestEffort && ctx.reporter.errorsReported) || ctx.usedBestEffortTasty) || condition


Consider making the condition a by-name argument so that it does not get evaluated at all in normal runs:

Suggested change

private inline def passesConditionForErroringBestEffortCode(condition: Boolean)(using Context): Boolean =

((!ctx.isBestEffort && ctx.reporter.errorsReported) || ctx.usedBestEffortTasty) || condition

private inline def passesConditionForErroringBestEffortCode(condition: => Boolean)(using Context): Boolean =

((!ctx.isBestEffort && ctx.reporter.errorsReported) || ctx.usedBestEffortTasty) || condition

Good point. Also while rechecking things I've noticed that I've put ! here incorrectly. I remembered you might have signaled about that before, but I ended up just changing the name at that time. Now it's all correct

compiler/src/dotty/tools/dotc/core/tasty/TreeUnpickler.scala

2 new experimental options are introduces for the compiler: `-Ybest-effort` and `-Ywith-best-effort-tasty`. A related Best Effort TASTy (.betasty) format, a TASTy aligned file format able to hold some errored trees was also added. Behaviour of the options and the format is documented as part of this commit in the `best-effort-compilation.md` docs file. `-Ybest-effort` is used to produce `.betasty` files in the `<output>/META-INF/best-effort`. `-Ywith-best-effort-tasty` allows to use them during compilation, limiting it to the frontend phases if such file is used. If any .betasty is used, transparent inline macros also cease to be expanded by the compiler. Since best-effort compilation can fail (e.g. due to cyclic reference errors which sometimes are not able to be pickled or unpickled), the crashes caused by it are wrapped into an additional descriptive error message in the aim to fail more gracefully (and not pollute our issue tracker with known problems). The feature is tested in two ways: * with a set of pairs of dependent projects, one of which is meant to produce .betasty by using `-Ybest-effort`, and the other tries to consume it using `-Ywith-best-effort-tasty`. * by reusing the compiler nonbootstrapped neg tests, first by running them with `-Ybest-effort` option, and then by running read-tasty tests on the produced betasty files to thest best-effort tastt unpickling Additionally, `-Ywith-best-effort-tasty` allows to print `.betasty` via `-print-tasty`.

jchyb · 2024-04-17T11:17:37Z

@sjrd Thank you for the your last review. I have applied all of the suggestions and rebased on top of the latest PR's (including pipelining). I've also made sure to carefully recheck everything this time, however I only ended up adjusting the passesConditionForErroringBestEffortCode method, everything else is as it was before

sjrd · 2024-04-18T07:53:53Z

@jchyb Consider squashing everything before merging. Also make sure the commit message is adequate (if we merge as is we take the PR description, which is outdated).

jchyb · 2024-04-18T09:24:55Z

Thank you so much for all of the reviews and overall help with this @sjrd!

tgodzik · 2024-04-18T09:26:27Z

Congrats @jchyb ! Great work!

jchyb mentioned this pull request May 25, 2023

feature: Add support for Scala 3's Best Effort compilation scalameta/metals#5219

Merged

bishabosha requested a review from sjrd May 25, 2023 12:43

nicolasstucki assigned jchyb Jul 13, 2023

jchyb mentioned this pull request Jul 19, 2023

Support Scala 3's Best Effort compilation scalacenter/bloop#2049

Merged

jchyb force-pushed the feature/best-effort branch 2 times, most recently from ed42813 to 8cca998 Compare August 7, 2023 16:38

jchyb force-pushed the feature/best-effort branch 2 times, most recently from 731bb17 to 4c31f5c Compare August 23, 2023 11:54

jchyb force-pushed the feature/best-effort branch 7 times, most recently from 365f6eb to 6dda6c2 Compare August 30, 2023 08:18

jchyb marked this pull request as ready for review August 30, 2023 12:01

jchyb force-pushed the feature/best-effort branch from 6dda6c2 to 188d9a4 Compare October 30, 2023 09:13

jchyb assigned sjrd and unassigned jchyb Oct 30, 2023

jchyb force-pushed the feature/best-effort branch from 188d9a4 to 5475c2b Compare November 7, 2023 14:11

sjrd requested changes Nov 23, 2023

View reviewed changes

jchyb removed their assignment Mar 26, 2024

sjrd requested changes Apr 8, 2024

View reviewed changes

jchyb force-pushed the feature/best-effort branch 6 times, most recently from b61f501 to 61c5107 Compare April 16, 2024 09:29

jchyb and others added 10 commits April 17, 2024 11:24

Apply review suggestions

0cd8452

fixes after rebase

df0feaa

fix TastyPrinter for betasty

603a23e

Solve mima issue (by doubling private method definition)

bdd6d72

fix windows tests

c08db68

apply review suggestions

14601e8

post-rebase fixes

3c19ebc

fix passesConditionForErroringBestEffortCode and improve doc comments

cdf4f98

remove unsafe-nulls from pickler

2eb73c5

jchyb force-pushed the feature/best-effort branch from 61c5107 to 2eb73c5 Compare April 17, 2024 09:29

jchyb requested a review from sjrd April 17, 2024 11:13

sjrd approved these changes Apr 18, 2024

View reviewed changes

sjrd assigned jchyb and unassigned sjrd Apr 18, 2024

jchyb merged commit 95a8a9c into scala:main Apr 18, 2024
19 checks passed

jchyb deleted the feature/best-effort branch April 18, 2024 09:18

Kordyjan added this to the 3.5.0 milestone May 10, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Introduce Best Effort compilation options #17582

Introduce Best Effort compilation options #17582

jchyb commented May 25, 2023 •

edited

Loading

sjrd commented May 26, 2023

jchyb commented Jul 3, 2023

sjrd commented Jul 3, 2023

tgodzik commented Jul 3, 2023

sjrd commented Jul 4, 2023

jchyb commented Jul 4, 2023

tgodzik commented Jul 5, 2023

Kordyjan commented Jul 5, 2023

jchyb commented Aug 30, 2023

sjrd Nov 23, 2023

jchyb Mar 26, 2024

sjrd Nov 23, 2023

sjrd Nov 24, 2023

sjrd left a comment

sjrd Apr 8, 2024

jchyb Apr 17, 2024

sjrd Apr 8, 2024

jchyb Apr 17, 2024

jchyb commented Apr 17, 2024

sjrd commented Apr 18, 2024

jchyb commented Apr 18, 2024

tgodzik commented Apr 18, 2024

		if ctx.isBestEffort then report.error(msg)
		else throw FatalError(msg)

		* TODO: Find a more robust way to characterize self symbols, maybe by
		* spending a Flag on them?

		private inline def passesConditionForErroringBestEffortCode(condition: Boolean)(using Context): Boolean =
		((!ctx.isBestEffort && ctx.reporter.errorsReported) \|\| ctx.usedBestEffortTasty) \|\| condition

Introduce Best Effort compilation options #17582

Introduce Best Effort compilation options #17582

Conversation

jchyb commented May 25, 2023 • edited Loading

Introduction

Considerations

Controversial aspects of the implementation

sjrd commented May 26, 2023

jchyb commented Jul 3, 2023

sjrd commented Jul 3, 2023

tgodzik commented Jul 3, 2023

sjrd commented Jul 4, 2023

jchyb commented Jul 4, 2023

tgodzik commented Jul 5, 2023

Kordyjan commented Jul 5, 2023

jchyb commented Aug 30, 2023

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

sjrd left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jchyb commented Apr 17, 2024

sjrd commented Apr 18, 2024

jchyb commented Apr 18, 2024

tgodzik commented Apr 18, 2024

jchyb commented May 25, 2023 •

edited

Loading