Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

boolean operator short-circuiting #713

Merged
merged 1 commit into from
Dec 16, 2024
Merged

Conversation

jbardin
Copy link
Member

@jbardin jbardin commented Nov 30, 2024

Taking inspiration from #538, this continues the effort to allow short-circuiting of boolean operators. I don't think we need to switch the operators to right-associative for short-circuiting to work like the previous PR, as I think the problem with the prior PR was that the decision to short-circuit requires both lhs and rhs values because short-circuiting happens in both directions for either operator. My guess was that the explicit switch to right-associative made some rhs short-circuits work, but it still doesn't work both ways.

Given that HCL already evaluates boolean operations like so, it appears the associative-ness is OK:

> true || false && false
true
>  true || (false && false)
true
> (true || false) && false
false

and same with distributive identities with implied order:

> (true && true || false) == ((true && true) || (true && false))
true
> (true && false || true) == ((true && false) || (true && true))
true

(maybe that's just restating the same thing, but I figure I'd type it out ;)).

Since single boolean operators are fully associative, foo && (bar && baz) is equivalent to (foo && bar) && baz, when both && and || are present operator precedence takes care of the rest. An example being:

> false && (true || true)
false
> false && true || true
true

which is correct because && has precedence over ||.

The full ability to short-circuit also allows us to handle all the unknown cases (as previously described in hashicorp/terraform#31078), filling out the truth tables in their entirety:

&& true false unknown
true true false unknown
false false false false
unknown unknown false unknown
|| true false unknown
true true true true
false true false unknown
unknown true unknown unknown

@apparentlymart
Copy link
Contributor

When I tried this before in #538 I apparently ended up needing to teach the parser to parse the && and || operators as right-associative, so that foo && bar && baz would parse as foo && (bar && baz) instead of (foo && bar) && baz.

Off the top of my head I don't remember why that was needed, and maybe something else you did here made that not necessary for this version, but just wanted to point it out in case it matters because it seems like all of the test cases here only test one operation at a time. (My PR seems to only test the single-operation cases also, so I guess I must've run out of time to work on it just after dealing with whatever problem that parser change was supposed to be solving. 🤔 )

@jbardin
Copy link
Member Author

jbardin commented Dec 1, 2024

This actually came out of some work to sort out and audit unknown and marks handling, which I combined with some ideas from #538 to get the short-circuiting fully working too. (my comment was mostly what I wanted to write for the PR body, so I moved it up there ;))

@jbardin jbardin force-pushed the jbardin/bool-short-circuit branch 2 times, most recently from dc88dcc to 144fdb5 Compare December 2, 2024 16:47
@jbardin
Copy link
Member Author

jbardin commented Dec 2, 2024

Add a complete test of all 2 and 3 part boolean expressions including unknowns.

@apparentlymart
Copy link
Contributor

Ah yes, I think I remember what I was dithering about in that other PR now:

The intuition from some other languages is that the terms evaluate from left to right and evaluation stops as soon as they reach a term that is sufficient to decide the answer.

HCL has two extra oddities that I was pondering different ways to represent:

  • If any term is unknown then we can't know whether or not short-circuiting would occur, so need to make some call about how to treat the remaining terms in that case.
  • We use expression evaluation both for value resolution and for type checking (using unknown values for the latter) and so we do still want to do some evaluating of remaining terms after a short-circuit.

I had been debating with myself whether to try to find some way to preserve the left-to-right behavior that folks would probably be expecting or whether to make it fully symmetrical so that e.g. anything && false would short-circuit in the same way as false && anything, always preventing the full evaluation of the other operand if either operand is sufficient to decide the answer.

The associativity fiddling was part of my experiments with different approaches to that question, but I agree that some answers to that question don't require anything special to happen in the parser. The approach you've proposed here seems like it's one of the shapes that doesn't require a specific parse tree shape.

@jbardin jbardin marked this pull request as ready for review December 2, 2024 19:30
@jbardin jbardin requested a review from a team as a code owner December 2, 2024 19:30
@jbardin jbardin requested a review from a team December 2, 2024 19:30
@jbardin
Copy link
Member Author

jbardin commented Dec 2, 2024

For completeness I took the full truth-table test and ran it against main, which did not result in any discrepancies in known results. All failed tests were from an unknown result which now becomes known.

@jbardin jbardin force-pushed the jbardin/bool-short-circuit branch from 7bda92a to 355e88a Compare December 3, 2024 16:33
@jbardin
Copy link
Member Author

jbardin commented Dec 3, 2024

Something which my initial attempt missed was the complete handling of diagnostics. There's two real reasons to need short-circuiting in HCL, to account for unknown values, and to skip diagnostics which may be dependent on the short-circuiting side of the expression.

We had all the unknown cases accounted for so far in the PR, but diagnostics handling was still sort of an afterthought, and handled on a best effort basis. The problem is that diagnostics must be a primary concern and handled in conjunction with the short-circuit logic, so my first version of ShortCircuit could not deal with combinations of unknowns and diagnostics since they were handled separately.

The updated signature is not as tidy (and far removed from the initial prototype in 538), but allows us to keep the diagnostics paired with the path taken through the expression:

ShortCircuit func(lhs, rhs cty.Value, lhsDiags, rhsDiags hcl.Diagnostics) (cty.Value, hcl.Diagnostics)

Another option would be to return an indication of which side was taken and allow the caller to pick the diagnostics, but while it would have fewer arguments to handle, it seems less flexible and not really any more clear.

@jbardin jbardin requested a review from a team December 3, 2024 17:13
@jbardin
Copy link
Member Author

jbardin commented Dec 3, 2024

I would note that this is attempting to converge with the same behavior implemented by conditional expressions, but ends up being somewhat convoluted because && and || are normal binary operators whereas ConditionalExpr is implemented in isolation.
Perhaps another valid approach would be to parse && and || separately as a new BooleanOpExpr so that all the logic could reside in the operator implementation rather than a new ShortCircuit callback. One of my goals here was to not mess with parsing at all, but that may be preferable if it turns out there are more edge cases that haven't yet been covered.

@apparentlymart
Copy link
Contributor

apparentlymart commented Dec 3, 2024

FWIW I had considered that too, and had decided against it because the AST nodes in hclsyntax are all exported, along with helper functions for performing tree walks, and so it's ambiguous whether the AST shape is a compatibility constraint.

Messing with the associativity of the operators was a concession I allowed myself because it seemed like an existing caller trying to do a tree walk and visit every node would still get a reasonable answer -- no new AST node types, just a different tree shape -- but adding entirely new AST nodes is potentially more risky because any third-party type switch over all of them would not cover it.

Perhaps it's worth doing anyway since I do agree that it's a cleaner design looking forward. I dunno. 🤷‍♂️

@jbardin
Copy link
Member Author

jbardin commented Dec 3, 2024

Ah yes, exhaustive type switching is always a hazard in these things 🤦‍♂️ Well, I'd prefer to not think about that unless I have to, and even if the ShortCircuit addition isn't pretty, it's not like we're planning on implementing loads of new operators anytime in the near future 😉

dsa0x
dsa0x previously approved these changes Dec 5, 2024
Implement short-circuiting logic for boolean binary operators
Copy link
Member

@liamcervante liamcervante left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jbardin
Copy link
Member Author

jbardin commented Dec 10, 2024

@hashicorp/team-ip-compliance, can you give this a review so we can proceed?

Copy link
Contributor

@mukeshjc mukeshjc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for the changes.

@jbardin jbardin merged commit 03baea4 into main Dec 16, 2024
13 checks passed
@jbardin jbardin deleted the jbardin/bool-short-circuit branch December 16, 2024 21:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants