Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce Best Effort compilation options #17582

Merged
merged 10 commits into from
Apr 18, 2024

Conversation

jchyb
Copy link
Contributor

@jchyb jchyb commented May 25, 2023

The aim of this draft PR is to introduce a compiler improvement for the use in IDEs. The idea was to allow for IDE's to work on erroring projects more easily, without any manual touchups (see scalameta/metals-feature-requests#327). To achieve this, a tasty-like format that can include errored/incomplete trees can be produced by the compiler, and consumed by the presentation compiler. Though mostly finished, this is a draft PR, meant for the maintainers and contributors of dotty to review the approach presented here. I am not yet sure would would it take to actually merge a feature like this one, although it is worth noting that all of the changes presented here are hidden under experimental options.

Introduction

This PR introduces 2 new experimental options for the compiler: -Ybest-effort and -Ywith-best-effort-tasty. It also introduces the Best Effort TASTy format, a TASTy aligned file format able to hold some errored trees. Behaviour of the options and the format is documented as part of this PR in the best-effort-compilation.md docs file.

Considerations

My overall aim was to be able to handle as many cases, with little as little maintainance necessary as possible. This is for example why pretty much the only required phases are Parser and Typer - they are enough for, as far as I know, all necessary metals completions and I did not find any issues with setting their products (best effort tasty files) as dependencies. Requiring, for example, PostTyper, would require for the errored trees to be able to somehow pass through that phase, meaning a large investment from me into working the existing known error cases through there (muddling the codebase in the process) and possibly from the maintainers working on both Typer (to be able to produce „correct” error trees) and PostTyper (to be able to consume „correct” errored trees), which would obviously make the entire initiative dreadful.

This is also why any tests are able to be put into a blacklist file, in case something changes and a neg test will not pass, or a new test will be added as part of issue fix that does not play well with the best-effort features.

Controversial aspects of the implementation

I wanted to relay some implementation aspects that may be seen as controversial, to more easily obtain feedback related to them (all welcome).

  1. Allowed crashing - I imagine this may confuse users of metals, so those crashes should likely not be shown there in the logs. Handling every possible incorrect tree in the pickling phase would be quite a maintenance toll, so we allow the compiler to crash in some instances. If this proves to be a significant problem, one easy solution would be to catch any error in Best Effort TASTy pickling and unpicking and print some vague message. I also want to reiterate that this can only happen when using the newly added options. EDIT: I added a separate error message wrapper for crashes obtained during bets-effort compilation.
  2. Undocumented Best Effort TASTy grammar - or rather documented only by the implementation. Serializing incorrect trees (without defining any set of what these incorrectness constitute as, possibly excluding some possible errored trees in the process) is a difficult problem. This aspect leads to my implementation, where I’ve made it so the pickler is allowed to panic and not serialize certain subtrees, or serialize error types in their place (same with unpicking), so some subtrees may be incomplete. Also, if this was documented we would likely want to version it, which leads to… EDIT: I added format grammar documentation in BestEffortTastyFormat.scala.
  3. Best Effort Tasty versioning - or lack thereof. Unpickling will still fail when when compiling with an earlier minor version, but above that I added no other requirements. I think the implementation adjustments should be allowed in the patch releases, so we can course correct in case of mistakes. However, I think as the feature stabilizes, the changes will stop happening. If we already allow some error trees to not work (and we have to), some cases not working across some patch releases does not seem like a big issue (whether anyone even uses multiple patch versions across one multimode build is another thing). Additionally would personally like to see this backported onto LTS one day, so that’s another reason. EDIT: I still do not check separately best effort tasty versioning, but I reserved a space in the file to be able to do that if that becomes necessary.

The flipside of all of these quirks is that the feature in general works really well. Out of the 1312 1559 failing „neg” test files checked, 1286 1534 manage to pass the pickling and unpicking tests (most of the fails and crashes are caused by cyclic reference errors).

@sjrd
Copy link
Member

sjrd commented May 26, 2023

I'm not opposed to having a slightly different, best-effort format for non-compiling code. However, I am worried about the "intentionally unspecified" aspect of it. Citing from a comment in the PR:

The Best Effort TASTy format also informally extends the regular TASTy grammar to allow the handling of as
large amount of incorrect trees produced by the compiler as possible. Because of this informality and unlike the regular
TASTy format, the BETASTy format should stay internal to the compiler.

(emphasis mine). Yet the whole point is for that format to be read from IDEs, which are definitely not internal to the compiler. That is already a direct contradiction, and is a symptom of what I'm worried about.

If we go this route, we should in fact specify the best-effort tasty format. Ideally, it would be possible for something like tasty-query to consume it and still provide some amount of guarantees.

If we don't specify it, all we have is some random data produced by some compiler, which some IDEs may be able to process to some extent. I am perhaps exaggerating those "some"s for effect, but even so, that's not a good situation to be in.

What fundamentally prevents specifying this format? Are there known blockers? Can we solve them? Even if we cannot attach run-time (evaluation) semantics to the resulting symbol table and trees, we should be able to attach specified, compile-time (structural) semantics.

@jchyb
Copy link
Contributor Author

jchyb commented Jul 3, 2023

Sorry for the belated reply, I wanted to take care of some other issues before I came back to this.

By writing that it should stay internal to the compiler I meant that it should only be able to be read and written by the compiler. Since Metals uses the compiler directly (via the so-called presentation compiler) and never reads from the tasty itself, it is not an issue there.

The informality is there mostly because of the implementation. Accounting for unforeseen states of the compiler can be difficult. After all, we do aim to pass incorrect trees into the pickling phase, so since they are incorrect, reasoning about what will be found there is tough. So the solution I came up was to try to force pickling with a series of fallbacks, based on the already found neg-test of the compiler. If something cannot be pickled, in its place we pickle an ErrorType, Empty block, or nothing (depending on the situation). The main problem lies with unpicking. Since we pickle very naively like this, there are some situations where an assumption used in unpickler may not be correct, so we end up fallbacking in unpickler as well (made up example, since it is tough for me to remember specifics right now, we might fail to pickle a method, then while unpicking a class the method is expected there, so we stop unpickling that).

This may sound somewhat dire, but the two alternatives I know of are much worse to me:

  • rewrite completely pickler and unpickler for best effort - extremely high maintenance cost (not even mentioning implementation)
  • limit to easily identifiable cases, like simple Incorrect Type Errors, Missing Type, etc., and incorporate that into the current pickler and unpickler implementation - much less useful, in my opinion

So the be able to be formalized, it would need to be rewritten into one of the above.

@sjrd
Copy link
Member

sjrd commented Jul 3, 2023

Since we pickle very naively like this, there are some situations where an assumption used in unpickler may not be correct, so we end up fallbacking in unpickler as well (made up example, since it is tough for me to remember specifics right now, we might fail to pickle a method, then while unpicking a class the method is expected there, so we stop unpickling that).

IMO this is the core of the problem. Your proposal is to say "anything goes", and who knows whether unpickling will be able to recover from it. Instead, I propose to specify what the unpickler needs to be able to recover from. Pointing to a non-existent method? That's fine in a sense, with the caveat that we cannot compute a tpe for such a tree, so it becomes ErrorType as well. If those things are specified, then external tools can recover from the same situations.

@tgodzik
Copy link
Contributor

tgodzik commented Jul 3, 2023

If those things are specified, then external tools can recover from the same situations.

I think the idea is that this shouldn't be used by external tools currently or ever (?)

@sjrd
Copy link
Member

sjrd commented Jul 4, 2023

I don't think that point of view makes sense. If it stayed within one project (in the sbt sense of the term), that would make sense. But as soon as you cross the project boundary, even "within the compiler" can mean "across different versions of the compiler". Or different options. Also, it means that build tools need to communicate these files around; likely take them into consideration for incremental compilation (you can't afford to drop down to full recompilation just because an upstream dependency has a best-effort output), etc. IMO, that means these things will de facto be read by other tools.

@jchyb
Copy link
Contributor Author

jchyb commented Jul 4, 2023

About the use in different compiler versions. My first idea was to limit the use to only the same patch compiler versions. This way we would be able to freely enact changes to the implementation, especially important for the LTS. This made sense to me as it seemed unlikely that many people would use different versions of scala 3 across different projects.

However, it's not like the current best-effort implementation has any guarantees to uphold anyway. It can still fail to produce/read betasty for rare cases, like for programs with certain circular type dependencies. So if we were to allow using betasty across different versions, and it would only work sometimes, this still might be better than artificially limiting it to a subset of project structures. As the implementation matures and is kept unchanged in subsequent compiler versions, it is going to fail for less and less cases. And that approach is always going to work for more cases than when limiting betasty use to the same compiler patch version. This is supported (somewhat by accident) by the fallback-based implementation where we may naively fallback both while writing and reading the betasty file. So ultimately, what I decided there was to keep the strong requirements about the minor versions from regular tasty file format, and still allow it to try reading (and perhaps failing to read) betasty from different patch versions.

You are right about the incremental compilation - we cannot really use it with the way best-effort compilation is now. But I don't think it can be used for failing projects anyway. Previously, modules depending on failing modules would just not be able to be compiled at all.

@tgodzik
Copy link
Contributor

tgodzik commented Jul 5, 2023

About the use in different compiler versions. My first idea was to limit the use to only the same patch compiler versions. This way we would be able to freely enact changes to the implementation, especially important for the LTS. This made sense to me as it seemed unlikely that many people would use different versions of scala 3 across different projects.

I think that scenario is highly unlikely and since we have only best-effort compilation, it's fine if that fails at times.

You are right about the incremental compilation - we cannot really use it with the way best-effort compilation is now. But I don't think it can be used for failing projects anyway. Previously, modules depending on failing modules would just not be able to be compiled at all.

It's highly unlikely that we could provide sensible incremental compilation here actually, only using the correctly compiled artifacts before. I don't think we should try to support it at this time for artifacts with potential errors.

@sjrd what kind of specification would you have in mind actually? How would that help the tooling? We would want to keep the changes to the compiler as maintainable as possible, so that we don't need to rewrite too many things.

@jchyb would it be possible to at least specify the cases you know of? So that we have a list for potencial tool formalization later?

@Kordyjan
Copy link
Contributor

Kordyjan commented Jul 5, 2023

The best way forward is to decide that BETASTy files can only be read and written by the exact version of the compiler. This way, we can really treat them as an implementation detail. This will be beneficial as I believe there will be many reiterations and improvements along the way after shipping the first version. Maintaining any compatibility guarantees about BETASTy files is not our goal right now.

Later, when we are sure about the design and implementation, we can "open" the format by writing the specification and maintaining the same compatibility guarantees as TASTy does. Doing it prematurely may be counterproductive, as the design may change based on feedback.

@jchyb jchyb force-pushed the feature/best-effort branch 2 times, most recently from ed42813 to 8cca998 Compare August 7, 2023 16:38
@jchyb jchyb force-pushed the feature/best-effort branch 2 times, most recently from 731bb17 to 4c31f5c Compare August 23, 2023 11:54
@jchyb jchyb force-pushed the feature/best-effort branch 7 times, most recently from 365f6eb to 6dda6c2 Compare August 30, 2023 08:18
@jchyb jchyb marked this pull request as ready for review August 30, 2023 12:01
@jchyb
Copy link
Contributor Author

jchyb commented Aug 30, 2023

Hello again, apologies for a lack of (visible) movement in this PR.
In the meantime, I managed to add many improvements to the implementation here, to the point where I was also now able to specify the grammar used in the Best Effort TASTy files in a concise manner (found in BestEffortTastyFormat). With that in mind, I still agree with Paweł’s remarks above, and with the fact that for now the format should only be consumed by the compiler. However, to future proof the (now unchecked) compatibility concerns, I added and additional field (now unused, but reserved) for betasty versioning - though if deemed unnecessary for now, I can of course remove it.
All in all, I consider the work here complete, and ready for review. A more concise explanation of the PR can be found in the commit message, which I think could be helpful for review.

@jchyb jchyb force-pushed the feature/best-effort branch from 6dda6c2 to 188d9a4 Compare October 30, 2023 09:13
@jchyb jchyb assigned sjrd and unassigned jchyb Oct 30, 2023
@jchyb jchyb force-pushed the feature/best-effort branch from 188d9a4 to 5475c2b Compare November 7, 2023 14:11
compiler/src/dotty/tools/dotc/Run.scala Outdated Show resolved Hide resolved
compiler/src/dotty/tools/dotc/Run.scala Outdated Show resolved Hide resolved
compiler/src/dotty/tools/dotc/ast/tpd.scala Outdated Show resolved Hide resolved
compiler/src/dotty/tools/dotc/core/Contexts.scala Outdated Show resolved Hide resolved
compiler/src/dotty/tools/dotc/core/Contexts.scala Outdated Show resolved Hide resolved
tasty/src/dotty/tools/tasty/TastyHeaderUnpickler.scala Outdated Show resolved Hide resolved
tasty/src/dotty/tools/tasty/TastyFormat.scala Outdated Show resolved Hide resolved
Comment on lines 761 to 762
if ctx.isBestEffort then report.error(msg)
else throw FatalError(msg)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is a bit worrying. The doc comment explicitly says that it won't let !tp1.exists pass through. Presumably, callers rely on that not to crash just after.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added a comment to the doc of the method to reflect this. In the situations I tested, this was (luckily) not a problem, and the only solution here would be to keep throwing FatalError, which would stop the best-effort compilation (with a crash, no less). What I'm trying to say here is, by not throwing this FatalError we minimize crashes, and maximize successful best effort compilations

Comment on lines 716 to 721
* TODO: Find a more robust way to characterize self symbols, maybe by
* spending a Flag on them?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe a flag has been spent on that since this was written, namely SelfName. Should this whole method become

    final def isSelfSym(using Context): Boolean = is(SelfName)

?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jchyb jchyb removed their assignment Mar 26, 2024
Copy link
Member

@sjrd sjrd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good now. I only minor, localized comments.

compiler/src/dotty/tools/dotc/Driver.scala Show resolved Hide resolved
compiler/src/dotty/tools/dotc/ast/TreeInfo.scala Outdated Show resolved Hide resolved
compiler/src/dotty/tools/dotc/core/SymbolLoaders.scala Outdated Show resolved Hide resolved
moduleRoot.classSymbol.rootTreeOrProvider = unpickler
if (!isBestEffortTasty || ctx.withBestEffortTasty) then
val tastyBytes = tastyFile.toByteArray
val unpickler = new tasty.DottyUnpickler(tastyFile, tastyBytes, isBestEffortTasty = isBestEffortTasty)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line is suspicious. It locally recreates an unpickler that seems to be equivalent to the private val unpickler at the class level. Moreover, the old code did not create such a local duplicate. Should this line be removed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RIght, thank you for catching that. You are absolutely right, I must have added it by mistake on one of the rebases

Comment on lines 76 to 77
private inline def passesConditionForErroringBestEffortCode(condition: Boolean)(using Context): Boolean =
((!ctx.isBestEffort && ctx.reporter.errorsReported) || ctx.usedBestEffortTasty) || condition
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider making the condition a by-name argument so that it does not get evaluated at all in normal runs:

Suggested change
private inline def passesConditionForErroringBestEffortCode(condition: Boolean)(using Context): Boolean =
((!ctx.isBestEffort && ctx.reporter.errorsReported) || ctx.usedBestEffortTasty) || condition
private inline def passesConditionForErroringBestEffortCode(condition: => Boolean)(using Context): Boolean =
((!ctx.isBestEffort && ctx.reporter.errorsReported) || ctx.usedBestEffortTasty) || condition

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Also while rechecking things I've noticed that I've put ! here incorrectly. I remembered you might have signaled about that before, but I ended up just changing the name at that time. Now it's all correct

@jchyb jchyb force-pushed the feature/best-effort branch 6 times, most recently from b61f501 to 61c5107 Compare April 16, 2024 09:29
jchyb and others added 10 commits April 17, 2024 11:24
2 new experimental options are introduces for the compiler:
`-Ybest-effort` and `-Ywith-best-effort-tasty`. A related
Best Effort TASTy (.betasty) format, a TASTy aligned file format able
to hold some errored trees was also added. Behaviour of the options and
the format is documented as part of this commit in the
`best-effort-compilation.md` docs file.

`-Ybest-effort` is used to produce `.betasty` files in the
`<output>/META-INF/best-effort`. `-Ywith-best-effort-tasty` allows to
use them during compilation, limiting it to the frontend phases if such
file is used. If any .betasty is used, transparent inline macros also
cease to be expanded by the compiler.

Since best-effort compilation can fail (e.g. due to cyclic reference
errors which sometimes are not able to be pickled or unpickled),
the crashes caused by it are wrapped into an additional descriptive
error message in the aim to fail more gracefully (and not pollute our
issue tracker with known problems).

The feature is tested in two ways:
 * with a set of pairs of dependent projects, one of which is
 meant to produce .betasty by using `-Ybest-effort`, and the other
 tries to consume it using `-Ywith-best-effort-tasty`.
 * by reusing the compiler nonbootstrapped neg tests, first by running
 them with `-Ybest-effort` option, and then by running read-tasty tests
 on the produced betasty files to thest best-effort tastt unpickling

Additionally, `-Ywith-best-effort-tasty` allows to print `.betasty` via
`-print-tasty`.
@jchyb jchyb force-pushed the feature/best-effort branch from 61c5107 to 2eb73c5 Compare April 17, 2024 09:29
@jchyb jchyb requested a review from sjrd April 17, 2024 11:13
@jchyb
Copy link
Contributor Author

jchyb commented Apr 17, 2024

@sjrd Thank you for the your last review. I have applied all of the suggestions and rebased on top of the latest PR's (including pipelining). I've also made sure to carefully recheck everything this time, however I only ended up adjusting the passesConditionForErroringBestEffortCode method, everything else is as it was before

@sjrd
Copy link
Member

sjrd commented Apr 18, 2024

@jchyb Consider squashing everything before merging. Also make sure the commit message is adequate (if we merge as is we take the PR description, which is outdated).

@sjrd sjrd assigned jchyb and unassigned sjrd Apr 18, 2024
@jchyb jchyb merged commit 95a8a9c into scala:main Apr 18, 2024
19 checks passed
@jchyb jchyb deleted the feature/best-effort branch April 18, 2024 09:18
@jchyb
Copy link
Contributor Author

jchyb commented Apr 18, 2024

Thank you so much for all of the reviews and overall help with this @sjrd!

@tgodzik
Copy link
Contributor

tgodzik commented Apr 18, 2024

Congrats @jchyb ! Great work!

@Kordyjan Kordyjan added this to the 3.5.0 milestone May 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants