Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CS2] Fix handling of parameters that are complex #4430

Conversation

GeoffreyBooth
Copy link
Collaborator

@GeoffreyBooth GeoffreyBooth added this to the 2.0.0 milestone Jan 24, 2017
@GeoffreyBooth GeoffreyBooth requested a review from lydell January 24, 2017 04:53
@GeoffreyBooth GeoffreyBooth changed the base branch from master to 2 January 24, 2017 04:53
…menting variable (or more generally any complicated parameter that isComplex)
@GeoffreyBooth GeoffreyBooth force-pushed the destructured-parameter-evaluation-order branch from 49ccded to c4b6d27 Compare January 24, 2017 05:08
@GeoffreyBooth
Copy link
Collaborator Author

The problem with just testing that @value.isComplex() is that parameter default values are complex. So something as simple as a = 1 is complex, and now we no longer support parameter default values, period. I can’t do @value.isComplex() and not @value instanceof Assign because then I don’t catch a = next() and the tests fail.

@GeoffreyBooth GeoffreyBooth changed the title [CS2] Fix handling of parameters that are complex [wip][CS2] Fix handling of parameters that are complex Jan 24, 2017
…er of execution matters; but don’t pull _all_ complex parameters out of the parameter list, so that we don’t lose parameter default values
@GeoffreyBooth
Copy link
Collaborator Author

GeoffreyBooth commented Jan 24, 2017

Okay, so I think I’ve found a way to detect when parameter execution order might matter, and in those cases I pull the parameter out to define within the function body; but I feel like this is a losing battle. I feel like we’ll just be chasing more and more edge cases as people think of more creative ways to squeeze expressions into parameters. I’m not sure that the execution order of parameters should be something we’re guaranteeing.

@GeoffreyBooth GeoffreyBooth changed the title [wip][CS2] Fix handling of parameters that are complex [CS2] Fix handling of parameters that are complex Jan 24, 2017
@connec
Copy link
Collaborator

connec commented Jan 24, 2017

@GeoffreyBooth I think the approach we should take is an all or nothing one - either all parameter defaults are compiled in the signature, or they're all compiled in the body. They are functionally equivalent, after all, so people shouldn't be too hung up about where they appear.

In terms of maximising the 'happy path', once we're compiling to ES2015 destructured parameters (with defaults), this issue will go away on its own.

I'm not sure I understand your point about

So something as simple as a = 1 is complex

In that case, @value should be a NumberLiteral, which is not complex. Am I misunderstanding?

@lydell
Copy link
Collaborator

lydell commented Jan 24, 2017

@connec So the “real” fix is not this PR, but to output destructuring as ES2015? Sounds good to me.

@connec
Copy link
Collaborator

connec commented Jan 24, 2017

@lydell yes, pretty much. The test case signature (({ a = ++i }, b = ++i)) is valid ES2015.

let i = 0
const test = ({ a = ++i }, b = ++i) => [a, b]
console.log(test({})) // [ 1, 2 ]

The remaining issue would then be #4413.

@jashkenas
Copy link
Owner

@connec So the “real” fix is not this PR, but to output destructuring as ES2015? Sounds good to me.

Amen to that.

@GeoffreyBooth
Copy link
Collaborator Author

We support things in parameters that ES doesn't, like expansions or splats in non-final positions, and this. We already discussed dropping support for those and decided not to. We don't have the option of just limiting parameters to only what ES supports.

At the time I think I mentioned how it would be a pain to deal with this inside destructuring. I think we would need to check every destructured parameter to see if it contains this, and if so declare it in the function body. And I guess because of this execution order concern, the moment we find a first "declare in the body" parameter, all subsequent parameters also need to go in the body, like what we do now with an expansion or non-final splat.

@connec
Copy link
Collaborator

connec commented Jan 24, 2017

The solution to handling this in parameters was partially implemented as part of the classes work:

  1. Go through each name in the params (involves traversing destructured parameters) to extract a list of this assignments, remove this from the param and rename the params to temporary names if necessary.
  2. Compile params as normal (all this references were removed).
  3. Compile a list of this assignments in the function body (or after super in constructors).

Expansions and splats do make things interesting. I guess the approach we currently take there would need to continue (i.e. all parameters after a splat go in the function body).

@jashkenas
Copy link
Owner

We support things in parameters that ES doesn't, like expansions or splats in non-final positions, and this. We already discussed dropping support for those and decided not to. We don't have the option of just limiting parameters to only what ES supports.

I think maybe you should consider it again. If the mission statement for CS2 is "CoffeeScript, with all of the ES6/7 goodies", then folks are going to expect things like classes and destructuring and arrow functions to work in the same way that they do in JavaScript.

If it gives you more expected behavior, is easier to implement, produces cleaner output, and will lead to less future questions of "Why is my CoffeeScript destructuring doing this strange thing?", then perhaps it's the way to go.

@GeoffreyBooth
Copy link
Collaborator Author

GeoffreyBooth commented Jan 24, 2017

@jashkenas when there’s no choice, yes. That’s what led to the breaking change for parameter default values now applying the default only for undefined rather than undefined or null—we had to choose whether we were going to output ES’ default values, with their limitation of undefined only, or keep always assigning default values in the function body. There was no middle ground.

For something like a splat parameter, there is a middle ground: when it’s the last parameter, it gets output as ES; when it’s not, we do contortions and declare/assign variables in the function body. I think that’s the model we want to follow: when people stick to what’s possible in ES, we should output idiomatic ES; when they type something that isn’t possible in ES, but we can achieve it without too much drama, we should give them what they want. There’s no reason to ban non-final splats just because they’re not possible in ES, aside from sparing us the hassle of implementing them. Likewise with this in parameters, especially since there’s a lot of CoffeeScript code out there that would break if we drop support for that.

@jashkenas
Copy link
Owner

when they type something that isn’t possible in ES, but we can achieve it without too much drama, we should give them what they want

Sure, in principle. Completely sound reasoning.

but I feel like this is a losing battle. I feel like we’ll just be chasing more and more edge cases as people think of more creative ways to squeeze expressions into parameters.

I think this might be a "too much drama" case.

@GeoffreyBooth
Copy link
Collaborator Author

I think this might be a “too much drama” case.

It might be, but in this case we could opt to keep the 1.x approach. There’s no compatibility reason to do ES’ output, unlike classes; it just pretties up the output. That improved output might not be worth the breaking changes.

@GeoffreyBooth
Copy link
Collaborator Author

GeoffreyBooth commented Jan 25, 2017

Getting back to this PR, I did some investigating into what was wrong with Param::isComplex being defined as @name.isComplex() or (@value?.isComplex? and @value.isComplex()). That was @connec’s suggestion that started this PR. The reason I pulled away from isComplex is because it started compiling

f = (opts = {}) ->

into

f = function(opts) {
  opts = opts != null ? opts : {};
};

which is not what we want. (For one thing, for ES compatibility it needs to be !== undefined.) So why is {} apparently complex?

Because all objects are complex. Base sets isComplex to true, and most classes that extend Base don’t redefine it. Obj and Arr are two such classes. If you think about it, it makes sense: an object can contain other objects, arrays, functions, function calls, you name it. {} may not be complex, but objects in general certainly could be.

So where does this leave us? Well, the safe solution would be to put back the check for @value.isComplex() as the way to decide whether a parameter default value can stay in the parameter list or move into the function body. I could go further and add an exception for objects and arrays, to traverse them looking for complex children; if I find none, then those objects and arrays (like {} or []) could stay as parameter default values too. That would be the general way forward: default to defining any complex default value in the function body, and add more and more exceptions as we identify cases that are safe to put in parameter lists. What do you think?

Update: see latest commits for an implementation of this approach.

…s only when undefined, not when null or undefined
… allowable in a function parameter list rather than the function body (there are lots more detections we could add to find additional “safe” parameters)
…not complex be allowed in the function parameter list (like `obj.prop`)
@connec
Copy link
Collaborator

connec commented Jan 25, 2017

@GeoffreyBooth I maintain that the best approach to this would be to compile all parameter defaults to ES2015, including 'complex' parameter names (i.e. destructured ones). This will resolve the ordering problem.

For the case of splats, expansions, or anything else that trips that up, we should move all default assignments after the interruption into the method body.

Contrived example time:

i = 0
foo = ({ a = ++i }, b = ++i, [ c = ++i ], rest..., { d = ++i }, e = ++i, [ f = ++i ]) ->
  console.log a, b, c, d, e, f, rest
console.log foo {}, undefined, [], 'woah', {}, undefined, []
var foo
foo = function ({ a = ++i }, b = ++i, [ c = ++i ], ...arg) {
  var rest, j, d, e, f;
  rest = 4 <= arg.length ? arg.slice(0, j = arg.length - 3) : (j = 0, []);
  ({ d = ++i } = arg[j++]);
  e = arg[j++];
  if (e === undefined) {
    e = ++i;
  }
  ([ f = ++i ] = arg[j++]);
  return console.log(a, b, c, d, e, f, rest);
}
console.log(foo({}, undefined, [], 'woah', {}, undefined, [])) // 1 2 3 4 5 6 [ 'woah' ]

@connec
Copy link
Collaborator

connec commented Jan 25, 2017

Note: The only departure of the above from the current 2 compilation is that a and c are compiled directly as destructured parameters, and d and f are compiled as destructured assignments.

@GeoffreyBooth
Copy link
Collaborator Author

@connec Yes, that is the plan. Compiling destructuring into ES syntax just hasn’t been implemented yet, and I didn’t want to tackle that in this PR. We can ship 2.0.0-alpha1 without it. It’s under discussion in coffeescript6/discuss#69, and it’s a big enough effort that it will require its own PR.

So does this PR address everyone’s concerns with parameter ordering? I think I want to make one more change, which is to refactor Param::isComplex into Param::canBeInParameterList to make clear what this method is trying to achieve; and while I’m at it, I think a refactor of all the other isComplex methods is overdue. Presumably most of them could simply be renamed mustCache . . . @jashkenas, some insight into isComplex would be greatly appreciated!

But aside from unnecessary cleanup, is this PR acceptable to everyone?

Copy link
Collaborator

@lydell lydell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like your idea to refactor Param::isComplex into Param::canBeInParameterList, and isComplex methods into mustCache.

val = new Op '?', ref, param.value if param.value
exprs.push new Assign new Value(param.name), val, '=', param: yes
if param.value?
condition = new Literal param.name.value + ' === undefined'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

$ coffee -bpe 'undefined'
void 0;

Should we replace undefined with void 0 for consistency?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be (roughly) condition = new Op '==', param, new UndefinedLiteral, and would then remain consistent.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we using void 0 in the first place? Because it’s a few characters shorter? Because apparently many years ago it was possible to redefine undefined? SO.

I would think that undefined is more readable, and since one of our goals is to produce human-readable code, we should replace all void 0s with undefined.

if @value instanceof Value
unless @value.base.isComplex()
no
else
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd either flip this unless…else around or change it into if not…else.

if node.isComplex()
# This can be further refined. An empty object will evaluate as
# not having any complex children, but as long as it has at least
# one property it will be considered complex.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you give examples on ...

  • things that this code does not consider complex
  • things that this code does consider complex, but could be improved no to in the future?

@connec
Copy link
Collaborator

connec commented Jan 25, 2017

In terms of naming, perhaps flipping it around to Node::isAtomic (or Node::isIdempotent, but that's a bit long) would be clearer? An atomic/idempotent operation can be arbitrarily complex, but must not have side effects (i.e. {} creates a new reference, so must be cached, but name can appear as many time as we want, because it is an atomic/idempotent expression).

@jashkenas
Copy link
Owner

It's not even really about "must cache" — it's about "complicated enough that we should cache". A long repeated expression, even if idempotent, shouldn't be output multiple times by the CoffeeScript compiler. Cleaner to stuff the result in a temporary variable instead.

@GeoffreyBooth
Copy link
Collaborator Author

@connec I think of atomic as meaning small and unified. At work I encourage atomic commits in Git, meaning each commit should do just one thing (as explained in the commit message) and nothing more. I would see {} and probably think it was atomic.

Idempotent is much better, but it has the same problem: why is {} idempotent? Thinking of idempotent as meaning “no side effects”, I would think that an empty object should qualify. It's arguably no clearer than isComplex.

I do like starting with is though. Is there a word or phrase that clearly explains what we're looking for, which is whether this node can be output repeatedly or whether we need to create a reference to it and only output the reference?

@connec
Copy link
Collaborator

connec commented Jan 25, 2017

I guess from @jashkenas' comment shouldCache captures the compiler's perspective on it pretty well. That does make it conceptually distinct from the case here though, which is definitely param specific, and has nothing to do with caching.

In reference to {} - this is equivalent to new Object (as [] is to new Array), and so should be seen as having side-effects. 'Side-effects' in this context means within the runtime, which would perform allocations etc. Thinking about it, I probably would call {} atomic, though.

@connec
Copy link
Collaborator

connec commented Jan 25, 2017

As an example of "complicated enough that we should cache", a.b ? 'default' will cache the intermediate a.b, due to Value::isComplex being based on the presence of accesses, even though (excepting accessor methods) a.b is idempotent.

@jashkenas
Copy link
Owner

jashkenas commented Jan 25, 2017

If you look back at old tickets, a lot of the existing isComplex rules in 1.X have been thought through way back in the dark ages of 2010. For example:

a.b

Is not idempotent. JavaScript has getters, and a.b can have a side effect.

@vendethiel
Copy link
Collaborator

@jashkenas
Copy link
Owner

@jashkenas I thought your own stance was that getters/setters were a bad part anyway?

It certainly is. But that doesn't make it idempotent.

@GeoffreyBooth
Copy link
Collaborator Author

@jashkenas you beat me to it—yes, I need to revise this to not treat a.b as safe for a parameter list. Or more specifically, not safe for a parameter list if any previous parameter has been pulled out to declare in the function body. We could allow something like ++i or generator() or a.b until we hit a parameter that needs to be put in the function body, after which all remaining parameters would need to be declared in the function body.

Currently something like destructuring (in our 1.x mode) gets declared in the function body, without triggering the remaining parameters to also go in the function body. This would need to change. Even after we implement CS2 destructuring, there will still be the case of this within destructuring in a parameter, which would need to go in the function body; so this needs to be fixed regardless.

@GeoffreyBooth
Copy link
Collaborator Author

And @vendethiel even if we think getters and setters are bad, we still need to support them. An object or class can be imported from a library and have a getter or setter. Using that object in a function parameter in CoffeeScript should still work.

@vendethiel
Copy link
Collaborator

vendethiel commented Jan 25, 2017

And @vendethiel even if we think getters and setters are bad, we still need to support them. An object or class can be imported from a library and have a getter or setter. Using that object in a function parameter in CoffeeScript should still work.

We've been treating them as side-effects-free ... sometimes, for some time.

$ coffee -bce "a.b? c"
if (typeof a.b === "function") {
  a.b(c);
}

…g complex parameters go into the function body; no need to create lots of exceptions of when to choose whether to put a complex param in the body
@GeoffreyBooth
Copy link
Collaborator Author

Okay, I focused on resolving the immediate issue that triggered this PR. I think I’ve got it now, but I’m tired. Can someone with fresh eyes please review this? I feel like nodes.coffee:1970-2015 could be simplified/collapsed somehow, but my attempts kept breaking the tests. At least right now all the tests pass.

Parameter default values are preserved until the first isComplex() param, after which all following parameters are just their reference and the default value is assigned in the function body. This preserves the correct order, while also preserving default values in the parameter list in most cases. Once we implement ES destructuring, there will be a reduction in parameters that need to be declared in the function body, and even more default values will be preserved in parameter lists. (For this, refactoring isComplex will go from nice-to-have to necessary, as destructuring will certainly be complex/”should cache” but not something we want to always bump into a function body rather than a parameter list.)

@GeoffreyBooth
Copy link
Collaborator Author

Okay, I’ve refactored isComplex to shouldCache, including updating the relevant comments. I read through all uses of isComplex and this is always the intent. This new name actually fits in much better, as it’s used by other methods named cache and cacheReference. It’s all starting to make sense now.

@lydell the sections you made comments on are no longer around. After I refactored Code to put all declarations in the body after the first shouldCache parameter, it’s no longer necessary to try to determine if a parameter is idempotent or not.

I think this PR is ready for merge. Any other notes?

@GeoffreyBooth
Copy link
Collaborator Author

@lydell did I satisfy your review? Do you have any further notes?

@GeoffreyBooth GeoffreyBooth merged commit cbea7b5 into jashkenas:2 Feb 1, 2017
@lydell lydell deleted the destructured-parameter-evaluation-order branch February 1, 2017 15:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants