safety checks #77

shwestrick · 2023-01-14T19:37:04Z

Progress on #43. This patch implements a flag --check which, before writing to the output, checks the following:

lex(output) = lex(input): no tokens added, removed, or reordered (including comments).
parse(output) = parse(input): make sure nothing terrible has gone wrong in the parser. (If the result of the lexer is good, then this check should be superfluous. But, uhhh, for my sanity... I'm just gonna check it explicitly.)
format(output) = format(input): make sure the formatter is idempotent.

Together, these checks should ensure that smlfmt is always safe to use. The first two checks are crucial, whereas the idempotence check is more of an aesthetic concern. In fact, I've found a few edge cases where smlfmt is not idempotent, due to how comments are handled. (We should fix that later.)

So, for now, with --check enabled, smlfmt will exit with an error if safety is violated (i.e., tokens were mangled), and will only output a warning if idempotence is violated.
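For readers skimming the thread, the three checks and their severities can be sketched roughly as follows. This is a Python illustration, not smlfmt's actual code (smlfmt is written in Standard ML). The names `check_output` and `is_safe` and the whitespace-based toy `lex`, `parse`, and `fmt` functions are all hypothetical stand-ins.

```python
from typing import Callable, List, Tuple

Finding = Tuple[str, str]  # (severity, message)


def check_output(
    src: str,
    out: str,
    lex: Callable[[str], object],
    parse: Callable[[str], object],
    fmt: Callable[[str], str],
) -> List[Finding]:
    """Compare the formatter's output against its input (hypothetical sketch)."""
    findings: List[Finding] = []
    # Safety: no tokens added, removed, or reordered.
    if lex(out) != lex(src):
        findings.append(("error", "token sequence changed"))
    # Safety: the syntax tree is unchanged.
    if parse(out) != parse(src):
        findings.append(("error", "syntax tree changed"))
    # Aesthetics: formatting the output again gives the same result.
    if fmt(out) != fmt(src):
        findings.append(("warning", "formatter is not idempotent"))
    return findings


def is_safe(findings: List[Finding]) -> bool:
    """Only errors are fatal; warnings are reported but do not fail the run."""
    return all(severity != "error" for severity, _ in findings)


# Toy stand-ins so the sketch runs: whitespace-separated "tokens".
toy_lex = str.split
toy_parse = lambda s: tuple(s.split())
toy_fmt = lambda s: " ".join(s.split())

src = "val  x =   1"
good = check_output(src, toy_fmt(src), toy_lex, toy_parse, toy_fmt)
bad = check_output(src, "val x = 2", toy_lex, toy_parse, toy_fmt)
```

With the toy functions, a well-behaved formatter produces no findings, while an output that changes a token produces at least one error, so `is_safe` fails.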

CheckOutput.check interface
--check flag, and plumbing for calling CheckOutput.check in the right place
~~implement CheckOutput.checkTokenSeqs.checkTokens~~. Implement Token.sameExceptForMultilineIndentation, which compares two tokens, handling multiline tokens where it's okay for indentation to differ.
implement CompareAst.
check for bugs from existing tests
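As an illustration of the multiline-token comparison mentioned in the task list, here is one plausible reading in Python. It assumes that "indentation may differ" means leading whitespace on every line after the first is ignored; the real Token.sameExceptForMultilineIndentation may use a different rule, and the function below is a hypothetical sketch only.

```python
def same_except_for_multiline_indentation(a: str, b: str) -> bool:
    """Compare two token texts, ignoring indentation of continuation lines.

    Assumption: only leading spaces/tabs on lines after the first may differ.
    """
    a_lines = a.split("\n")
    b_lines = b.split("\n")
    # A different number of lines means the tokens really differ.
    if len(a_lines) != len(b_lines):
        return False
    # The first line starts at the token itself, so it must match exactly.
    if a_lines[0] != b_lines[0]:
        return False
    # Continuation lines are compared with leading indentation stripped.
    return all(
        x.lstrip(" \t") == y.lstrip(" \t")
        for x, y in zip(a_lines[1:], b_lines[1:])
    )
```

Under this reading, a comment that was re-indented by the formatter still compares equal, while any change to its text does not.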

azdavis · 2023-01-16T09:12:48Z

this is nifty, once impl'd i'll probably call smlfmt from millet with this flag and move the disclaimer to only apply to my naive formatter.

can there be specific exit codes so a caller of smlfmt (like millet) can tell the difference between user error (e.g. parse error) and smlfmt internal error (e.g. lex(output) != lex(input) etc)? that way millet could ignore smlfmt errors about bad user input but show a warning/request the user fill out a bug report when smlfmt fails the safety checks.

shwestrick · 2023-01-16T17:03:49Z

> can there be specific exit codes so a caller of smlfmt (like millet) can tell the difference between user error (e.g. parse error) and smlfmt internal error (e.g. lex(output) != lex(input) etc)? that way millet could ignore smlfmt errors about bad user input but show a warning/request the user fill out a bug report when smlfmt fails the safety checks.

Oh, nice idea. Exit codes seem like they would work. Although, I'm wondering, looking ahead -- would it be useful to have an interface for interesting queries and responses? E.g., smlfmt could provide a server interface with JSON messages? We could use it for different error messages, but also could use it for more control over formatting, e.g. smlfmt could return only a file diff, or we could request that smlfmt only reformats a specific region within a file.

azdavis · 2023-01-16T19:24:47Z

that sounds nice but also a lot more work

shwestrick · 2023-01-17T17:14:14Z

On the smlfmt side, it would be really easy to do an initial protocol. I'm picturing just three possible messages initially:

success with formatted output, e.g. { "tag": "success", "data": ... }
error: invalid SML input { "tag": "parse-error", "data": ... }
error: smlfmt bug { "tag": "bug", "data": ... }

This wouldn't be much harder to implement than different exit codes.

Would this be difficult to support on Millet's side?

(I think it's worth the investment, because then the protocol could be incrementally extended with more interesting functionality down the line.)
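To make the consuming side concrete, here is a rough sketch of how a caller might dispatch on the three proposed tags. It is written in Python for brevity (Millet itself is not Python). The shape of "data" is left open in the proposal, so it is passed through untouched, and the function name and returned action labels are hypothetical.

```python
import json
from typing import Tuple


def handle_smlfmt_message(raw: str) -> Tuple[str, object]:
    """Decide what a caller should do with one JSON message (hypothetical)."""
    msg = json.loads(raw)
    tag = msg["tag"]
    data = msg.get("data")  # payload shape is unspecified in the proposal
    if tag == "success":
        # Formatted output is available; apply it.
        return ("apply", data)
    if tag == "parse-error":
        # Invalid SML input is the user's problem; the caller can stay quiet.
        return ("ignore", data)
    if tag == "bug":
        # A safety check failed; ask the user to file a bug report.
        return ("report-bug", data)
    raise ValueError(f"unknown tag: {tag!r}")
```

Rejecting unknown tags outright is one possible design choice; a caller could instead ignore them so that later protocol extensions do not break it.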

azdavis · 2023-01-17T22:57:12Z

oh no it wouldn't really be much harder to consume that on the millet side. the 'more work' i was thinking of would be smlfmt being able to input/output incremental diffs.

i'd suggest a flag like --json-version=<N> where <N> is some integer starting with 1 to output json data. then if you need to change the output in a backwards incompatible way you can request new callers use e.g. json version 2 while keeping version 1 around.
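One way the versioning idea could look on the producing side is sketched below in Python. This is not smlfmt's CLI or implementation: the program name, the `ENCODERS` table, and `encode_v1` are hypothetical, and only version 1 is registered. The point is that a backwards-incompatible change would add a new entry while leaving version 1 in place.

```python
import argparse
import json


def encode_v1(tag: str, data: object) -> str:
    """Version 1 message shape, following the tag/data proposal in the thread."""
    return json.dumps({"tag": tag, "data": data})


# Registry of supported versions; a future version 2 would be added here
# without touching encode_v1, so existing callers keep working.
ENCODERS = {1: encode_v1}


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="fmt-sketch")  # hypothetical program
    parser.add_argument(
        "--json-version",
        type=int,
        choices=sorted(ENCODERS),
        default=None,
        help="emit JSON messages in the given protocol version",
    )
    return parser


def render(args: argparse.Namespace, tag: str, data: object) -> str:
    """Emit plain output by default, or JSON when a version was requested."""
    if args.json_version is None:
        return str(data)
    return ENCODERS[args.json_version](tag, data)


json_args = build_parser().parse_args(["--json-version=1"])
plain_args = build_parser().parse_args([])
```

Because the flag is opt-in, callers that do not pass it keep getting the plain output they already expect.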

shwestrick · 2023-01-17T23:14:36Z

Excellent! The --json-version thing is a great idea, thanks. I'll try throwing this together soon and let you know how it goes.


shwestrick added 4 commits January 14, 2023 14:12

set up infrastructure to perform check on output

d05a8b6

work on CompareAst

2a02579

proper checks for multiline tokens

4871f04

CompareAst equal_spec

6b132ea

CompareAst equal_sigexp, equal_sigdec

25cfabe

shwestrick added 7 commits January 17, 2023 12:21

CompareAst equal_syntaxseq

57b5350

CompareAst equal_ty

8fe6a81

CompareAst equal_pat

9fe7d07

CompareAst equal_exp

d95e227

CompareAst equal_dec

1a00e0b

finish up CompareAst

064891b

better error message for check fail

46eba50

shwestrick added 2 commits January 17, 2023 20:24

only warning for non-idempotent formatting. fail on other check violations. request bug reports

90cb89c

handle errors better and add test

6e908b0

shwestrick merged commit 8090550 into main Jan 18, 2023

shwestrick deleted the check-output branch January 18, 2023 05:13

shwestrick changed the title ~~(WIP) safety checks~~ safety checks Jan 18, 2023

shwestrick mentioned this pull request Jan 18, 2023

Autoformatter tests #43

Closed

shwestrick mentioned this pull request Nov 1, 2023

TODO: new release #94

Closed
