Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inconsistent behaviour of for loop/comprehension variable #728

Closed
akva opened this issue Sep 27, 2010 · 33 comments
Closed

Inconsistent behaviour of for loop/comprehension variable #728

akva opened this issue Sep 27, 2010 · 33 comments

Comments

@akva
Copy link

akva commented Sep 27, 2010

for i in [1..2]
    'simple'
alert i # 3

for i in [1..2]
    (-> i)
alert i # undefined

i = 0
for i in [1..2]
    (-> i)
alert i # 0

So the semantics of the code following a comprehension depends on the comprehension's body. I understand why it's happening and I realize it's not a good practice to use loop variable after the loop anyway. But semantics should be clearly defined imho.
This behaviour seems to be the result of #423

@akva
Copy link
Author

akva commented Sep 27, 2010

Maybe a bit more realistic example:

for i in [1..2]
    foo = 42
alert foo # 42

for i in [1..2]
    foo = 42
    (-> i)
alert foo # foo is not defined

Again, I understand why it works this way but isn't it confusing.

@jashkenas
Copy link
Owner

akva: did you run into this in production code? You're absolutely right that this is the artifact from issue #423, but personally, I think that the benefits outweigh the downsides ... what do other folks think?

@akva
Copy link
Author

akva commented Sep 28, 2010

One more artifact, #423 behaves differently in the presence of a return statement.

func = ->
    res = []
    for item in ['one', 'two', 'three']
        if item == 'two'
            return res
        else
            res.push(-> item)

alert func()[0]() # 'two', instead of expected 'one'

Here is my thoughts on this. JavaScript does not have a block scope, only function scope. It doesn't seem feasible to me to simulate block scope in a non leaky way. Doing so will break something, somewhere and sometimes in very subtle ways. It's true, the way functions close over loop variables always comes as a surprise when one encounter it for the first time. But once you've learnt this behaviour, it's actually very easy to understand and remember, since the semantics are very consistent. And it works this way in all major dynamic languages.

Besides, as you pointed out in #423, there are easy workarounds in the form of forEach or map.

did you run into this in production code?

Not yet (because I don't use loop variables after the loop ;)), but let me ask you a counter question. How often do you really need to create closures inside a loop in production code?

@jashkenas
Copy link
Owner

You need to create closures inside of a loop all the time ... when reading an array of files in Node.js, when firing ajax callbacks for a list of modified items in the browser, when mixing in method implementations from a third-party library...

That was the original impetus for adding this feature -- it bites folks in JavaScript constantly. Here are two recent messages on the Node.js list from folks who are surprised by it:

http://groups.google.com/group/nodejs/browse_thread/thread/cbd95f659e566b9c
http://groups.google.com/group/nodejs/browse_thread/thread/0c9e4eba5c02d108

Our solution, despite being inconsistent, is the lesser of two evils, I think, as it allows for fast normal loops without block scope -- safe loops with block scope in any place where it could possibly matter, and there's an easy work around if you need to export a variable from within the inner block scope: set it to null before you start the loop.

Agree?

@akva
Copy link
Author

akva commented Sep 28, 2010

Well. This could be quite a typical client side event handling code
(just pseudo code)

for li in lis
    li.click -> do_something li

However, this code is quite inefficient. For production code one would rather write something like.

ul.click (e) -> do_something e.target # ul is a parent of all those li's.

This code is more efficient, doesn't give any surprises, and arguably more modular - you can factor out the event handler function and use it elsewhere.

http://groups.google.com/group/nodejs/browse_thread/thread/0c9e4eba5c02d108

The best solution in this thread is actually the last post by Matt Ranney - use batch API. Batch style especially works well for ajax; you don't wanna fire ajax requests in a loop.

Now, of course there are valid use cases when closing over a loop variable is necessary. I just guess they are not (or shouldn't be) as common as it seems.

One more objection against #423 would be this. It teaches beginners the wrong thing and gives them false expectations.

v = 1
func = ->
    console.log v
v = 2
func() # what should this produce? 

People who get used to the #423 behaviour, without knowing the implementation details, may expect the above code to produce 1.

@hen-x
Copy link

hen-x commented Oct 1, 2010

We could have it both ways, and make loop bodies both capture and leak their variables when they contain an explicit closure.

Consider:
memos = (-> i) for i in [1..100]
alert i
...
var _i, _result, i, memos;
memos = (function() {
_result = [];
for (_i = 1; _i <= 100; _i++) {
(function() {
var i = _i;
return _result.push(function() {
return i;
});
})();
i = _i; // <-- This line is not currently generated
}
return _result;
})();
alert(i);

@satyr
Copy link
Collaborator

satyr commented Oct 1, 2010

+1 for removing this kind of magic.
Creating closures within loop is itself a bad pattern.

for li in lis then li.click -> do_something li

should be written as:

bind = (li) -> li.click -> do_something li
for li in lis then bind li

@jashkenas
Copy link
Owner

I know that we've all learned to avoid this bugaboo in JavaScript -- but it's an extremely common one. Beginners try to generate functions in for loops all the time, and are quickly surprised by JS behavior. let is a proposal to replace var for ECMA Harmony, just to address this very issue.

What do folks think of sethaurus's proposal to purposefully leak assigned variables to outer scope, after the closure wrapper?

@akva
Copy link
Author

akva commented Oct 12, 2010

leak assigned variables to outer scope

Does it make sense to do the opposite, make ALL loop variables not visible outside the loop?

@TrevorBurnham
Copy link
Collaborator

I think sethaurus's proposal is the sanest solution. New scopes should only be created by the -> operator. Any variable assigned in a for loop, including the loop variable itself, should belong to the surrounding scope.

The one thing that gives me pause is that i = 3 after for i in [0..2]. That's only intuitive if you're thinking in terms of the C-style for loop that the syntax replaces. I'd like to see this behavior changed. Maybe that should be raised as a separate issue.

@zmthy
Copy link
Collaborator

zmthy commented Oct 19, 2010

Original issue was #118. That takes me back.

That issue never answered this problem - there's magic that isn't explicit, and therefore is often unexpected. Note that it's impossible to affect the variables defined in the loop from inside the scope, in case you want to control which indices to pull out beyond the ability of the by keyword. Exposing the inner variable as an outer one suffers from this problem as well. At the very least to be the most accurate the code would need to read:
var _i, _result, i, memos;
memos = (function() {
_result = [];
for (_i = 1; _i <= 100; _i++) {
(function() {
var i = _i;
return _result.push(function() {
return i;
});
_i = i; // <-- Ensures both of the outer variables are also affected
})();
i = _i;
}
return _result;
})();
alert(i);
The same attention would need to be paid to not only indices and values, but every variable defined in the loop, in order to preserve correct scoping. I've implemented a way around this that precompiles the body and then wraps it. Note that this magic magically disappears when you include any pure statements in the loop, though. Is this the sort of thing we really want in the language?

@zmthy
Copy link
Collaborator

zmthy commented Oct 19, 2010

The hairiest method in Coffeescript just got hairier. All variables assigned in a for loop now make their way outside of it, but each run-through gets its own index and value variables when enclosed functions are present. Closing the ticket.

@jashkenas
Copy link
Owner

[Applause]...

@satyr
Copy link
Collaborator

satyr commented Oct 21, 2010

Broken with destructuring:

$ bin/coffee -bpe '(->) for [a, b] in c'
var [a, b], _i, _j, _len, _ref, _ref2, a, b;
for (_i = 0, _len = (_ref = c).length; _i < _len; _i++) {
  (function() {
    var _ref2 = _ref[_i], a = _ref2[0], b = _ref2[1];
    _j = [a, b];
    return function() {};
  })();
  [a, b] = _j;
}

@zmthy
Copy link
Collaborator

zmthy commented Oct 21, 2010

Given the concerns in #781 and how the implementation is still somewhat buggy, can we get rid of the implicit magic, and make it explicit instead with the scoped keyword (or similar)? Like the all keyword, it only needs to be a keyword in the specific context it can appear:
for scoped value, index in array
Would that be better?

@jashkenas
Copy link
Owner

No -- a special keyword for this is the worst of both worlds... We either should make it work transparently, or we should axe it and maintain JS semantics...

@zmthy
Copy link
Collaborator

zmthy commented Oct 21, 2010

Alright then. I'm torn on the issue, but magic it is. Closing this up, destructor bug fixed: 880c5c8

@satyr
Copy link
Collaborator

satyr commented Oct 21, 2010

How about wrapping only relevant closures rather than the whole body?

Before

var _i, _j, _len, _ref, li;
for (_i = 0, _len = (_ref = lis).length; _i < _len; _i++) {
  (function() {
    var li = _ref[_i];
    _j = li;
    return li.click(function() {
      return do_something(li);
    });
  })();
  li = _j;
}

After

var _i, _len, _ref, li;
for (_i = 0, _len = (_ref = lis).length; _i < _len; _i++) {
  li = _ref[_i];
  li.click((function(li) {
    return function() {
      return do_something(li);
    }
  })(li));
}

@zmthy
Copy link
Collaborator

zmthy commented Oct 21, 2010

I considered that, but decided it was too verbose. But passing them in as arguments was something I hadn't thought of. This is probably a better idea.

@jashkenas
Copy link
Owner

Nice. It reminds me that we should move the _ref = lis assignment out of the for declaration as well, and put it on the line above...

@zmthy
Copy link
Collaborator

zmthy commented Oct 21, 2010

Would a function that is instantly called be expected to modify the above values, though?
(-> i = 0)() for i in [0..5]
This looks like it should be 0 afterwards. Perhaps we should leave it as it is, but move the vars that the new scope creates into the scope's arguments.

@satyr
Copy link
Collaborator

satyr commented Oct 21, 2010

(-> i = 0)() for i in [0..5]

Why would one want to do that? I'd rather forbid modification of loop variables than allow it.
Either way though, it's a magic that breaks normal behavior.

@zmthy
Copy link
Collaborator

zmthy commented Oct 21, 2010

The point is that if someone chose to do this, the magic shouldn't be applied. And we can't tell if it's going to happen.

@akva
Copy link
Author

akva commented Oct 21, 2010

What about the while loop? Even if an acceptable solution for the for loop is found, while will not work the same way. Isn't it a bit inconsistent?

@satyr
Copy link
Collaborator

satyr commented Oct 21, 2010

Or how about an independent mechanism that makes capturing easy. Like so:

let i
  blah()

(function(i) {
  blah();
})(i);

Then we could write:

while li = lis.pop()
  let li
    setTimeout -> do_something li

@jashkenas
Copy link
Owner

Nah -- this case doesn't deserve new special syntax. It's either a feature that CoffeeScript provides -- because you never want to generate a function in a loop, without closing over the loop variables. Or, it's a place where we should give up the attempt, and just keep JS semantics...

We can't correctly emulate the proposed behavior for let, so we shouldn't try to replace it...

@satyr
Copy link
Collaborator

satyr commented Oct 21, 2010

you never want to generate a function in a loop

There are cases where you want to use simple predicates in a loop:

for node in nodes
  @add node if node.contains ((n) -> n instanceof Code)

which currently compiles to:

var _i, _j, _len, _ref, node;
for (_i = 0, _len = (_ref = nodes).length; _i < _len; _i++) {
  (function() {
    var node = _ref[_i];
    _j = node;
    return node.contains(function(n) {
      return n instanceof Code;
    }) ? this.add(node) : undefined;
  }).call(this);
  node = _j;
}

@jashkenas
Copy link
Owner

Excellent point. Let's kill the magic scoping.

@akva
Copy link
Author

akva commented Oct 22, 2010

Does it mean #423 is being undone?

@foot
Copy link

foot commented Oct 27, 2010

@jashkenas thanks for the ping, sorry for the big delay.

The original thought behind #423 was as a special case only for list comprehensions. E.g. a very narrow scope while you are generating a new list.

However the coffeescript syntax for this is very similar to the general for loop case. (as opposed to something like python/javascript which wrap the comprehension in square brackets).

You could still differentiate scoping based on whether the block comes before or after the for statement... but agreed, it could still be a source of confusion.

@satyr
Copy link
Collaborator

satyr commented Dec 21, 2010

As of 47fe5c2, arguments doesn't work under the transformation.

$ coffee -e 'for i in [0] then (->); arguments[i]'
TypeError: Function.prototype.apply: Arguments list has wrong type
...

@TrevorBurnham
Copy link
Collaborator

Good point about arguments, satyr. I've proposed removing the transformation (and revisiting do) at issue 959.

@jashkenas
Copy link
Owner

Closing this ticket, as #959 has now been fixed.

This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants