Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

safety checks #77

Merged
merged 14 commits into from
Jan 18, 2023
Merged

safety checks #77

merged 14 commits into from
Jan 18, 2023

Conversation

shwestrick
Copy link
Owner

@shwestrick shwestrick commented Jan 14, 2023

Progress on #43. This patch implements a flag --check which, before writing to the output, checks the following:

  • lex(output) = lex(input): no tokens added, removed, or reordered (including comments).
  • parse(output) = parse(input): make sure nothing terrible has gone wrong in the parser. (If the result of the lexer is good, then this check should be superfluous. But, uhhh, for my sanity... I'm just gonna check it explicitly.)
  • format(output) = format(input): make sure the formatter is idempotent.

Together, these checks should ensure that smlfmt is always safe to use. The first two checks are crucial, whereas the idempotence check is moreso an aesthetic concern. And actually, I've found a few edge cases where smlfmt is not idempotent, due to how comments are handled. (We should fix that later.)

So, for now, with --check enabled, smlfmt will exit with an error if safety is violated (i.e., due to tokens mangled), and will only output a warning if idempotence is violated.

  • CheckOutput.check interface
  • --check flag, and plumbing for calling CheckOutput.check in the right place
  • implement CheckOutput.checkTokenSeqs.checkTokens. Implement Token.sameExceptForMultilineIndentation, which compares two tokens, handling multiline tokens where it's okay for indentation to differ.
  • implement CompareAst.
  • check for bugs from existing tests

@azdavis
Copy link
Contributor

azdavis commented Jan 16, 2023

this is nifty, once impl'd i'll probably call smlfmt from millet with this flag and move the disclaimer to only apply to my naive formatter.

can there be specific exit codes so a caller of smlfmt (like millet) can tell the difference between user error (e.g. parse error) and smlfmt internal error (e.g. lex(output) != lex(input) etc)? that way millet could ignore smlfmt errors about bad user input but show a warning/request the user fill out a bug report when smlfmt fails the safety checks.

@shwestrick
Copy link
Owner Author

can there be specific exit codes so a caller of smlfmt (like millet) can tell the difference between user error (e.g. parse error) and smlfmt internal error (e.g. lex(output) != lex(input) etc)? that way millet could ignore smlfmt errors about bad user input but show a warning/request the user fill out a bug report when smlfmt fails the safety checks.

Oh, nice idea. Exit codes seem like they would work. Although, I'm wondering, looking ahead -- would it be useful to have an interface for interesting queries and responses? E.g., smlfmt could provide a server interface with JSON messages? We could use it for different error messages, but also could use it for more control over formatting, e.g. smlfmt could return only a file diff, or we could request that smlfmt only reformats a specific region within a file.

@azdavis
Copy link
Contributor

azdavis commented Jan 16, 2023

that sounds nice but also a lot more work

@shwestrick
Copy link
Owner Author

shwestrick commented Jan 17, 2023

On the smlfmt side, it would be really easy to do an initial protocol. I'm picturing just three possible messages initially:

  • success with formatted output, e.g. { "tag": "success", "data": ... }
  • error: invalid SML input { "tag": "parse-error", "data": ... }
  • error: smlfmt bug { "tag": "bug", "data": ... }

This wouldn't be much harder to implement than different exit codes.

Would this be difficult to support on Millet's side?

(I think it's worth the investment, because then the protocol could be incrementally extended with more interesting functionality down the line.)

@azdavis
Copy link
Contributor

azdavis commented Jan 17, 2023

oh no it wouldn't really be much harder to consume that on the millet side. the 'more work' i was thinking of would be smlfmt being able to input/output incremental diffs.

i'd suggest a flag like --json-version=<N> where <N> is some integer starting with 1 to output json data. then if you need to change the output in a backwards incompatible way you can request new callers use e.g. json version 2 while keeping version 1 around.

@shwestrick
Copy link
Owner Author

Excellent! The --json-version thing is a great idea, thanks. I'll try throwing this together soon and let you know how it goes.

@shwestrick shwestrick merged commit 8090550 into main Jan 18, 2023
@shwestrick shwestrick deleted the check-output branch January 18, 2023 05:13
@shwestrick shwestrick changed the title (WIP) safety checks safety checks Jan 18, 2023
@shwestrick shwestrick mentioned this pull request Jan 18, 2023
@shwestrick shwestrick mentioned this pull request Nov 1, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants