Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Python-like slicing #272

Closed
StanAngeloff opened this issue Mar 20, 2010 · 37 comments
Closed

Python-like slicing #272

StanAngeloff opened this issue Mar 20, 2010 · 37 comments

Comments

@StanAngeloff
Copy link
Contributor

I have just pushed a commit that implements Python-like slice literals. The syntax is similar to Python but rather than using colons (which is an assignment in CoffeeScript) it uses commas:

list: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
last: list[-1, ]    # last is now [9]

The comma at the end is mandatory to indicate a slice, rather than perform an index lookup. Slicing works on both strings and arrays and it should also work on numbers and objects implementing toString as well:

string: 'Here comes the man!'
every_other_letter: string[,,2]
puts every_other_letter    # prints 'Hr oe h a!'

I am interested to see how many of you find this useful?

There is a downside, however. The current implementation requires about 13 lines of additional JavaScript. I have tried to keep it short and this seems to be the shortest it can get. I have looked at other places in the core where we are using pre-defined functions and there are a few (__hasProp and __extends, 7 lines). In my view, it is a good trade-off between gained functionality and generated JavaScript.

@weepy
Copy link

weepy commented Mar 20, 2010

It's certainly an often used thing. Maybe there's another syntax to avoid colloision with index lookup? E.g I don't think </code> is used much

last: list \-1\
every_other_letter: string \,,2\

@matehat
Copy link
Contributor

matehat commented Mar 20, 2010

I find this incredibly useful. I've always found list[1..list.length] to be awkward, where list[1,] is a lot more natural. If the additional function only occurs when the index brackets contains a slicing construct, then I think this is a great trade-off.

One thing: this, if accepted, should replace the other slicing construct to avoid confusion. Then, maybe, the added utility function could be just as large as it needs to be, somehow.

I think we should stick to the traditional [] operator. It's used in every language I can think of.

@weepy
Copy link

weepy commented Mar 21, 2010

I agree it's very useful. Perhaps a better syntax would be :

string[1..2]
string[..-1]
string[-1..]

Also: is it worth the extra complexity to allow a stepwise parameter ? - it's strikes me that it's not often used.

@StanAngeloff
Copy link
Contributor Author

weepy, I was going for this syntax when I started the patch, but since it's a complete rewrite on Array#slice, I am kind of worried about performance. I haven't tested it myself, but I imagine the native slicing would work faster. The step argument wouldn't be something you would use everyday, I agree, but it's very useful for reversing arrays:

list: ['apples', 'bananas', 'pears']
reversed: list[,,-1]

At one point I had some very basic optimisations, when step === 1 and when step === -1. In these cases it's safe to call Array#slice and Array#reverse. I may look into bringing those back and adopting the dotted syntax.

@weepy
Copy link

weepy commented Mar 21, 2010

how about list.reverse() :D:D

@matehat
Copy link
Contributor

matehat commented Mar 26, 2010

I was just keen to know how that issue was progressing. I'd really like to see it into trunk one day.

Also, I'm wondering if the negative step could be replaced by just comparing the first and second slice argument (if the first is greater, then assumed negative walk through the list).

@StanAngeloff
Copy link
Contributor Author

I have been using this feature on my slice2 brach for about a week. It's pretty darn nice, but I am still not sure it's core material. It would require three additional functions __slice, __splice and __range:

a: b[1..10]
  .. produces ..
a: __slice(b, 1, 10);

a[1..10]: b[1..10]
  .. produces ..
__splice(a, 1, 10, __slice(b, 1, 10));

Where __range would be called within __slice and __splice to determine the correct offsets.

So to recap, I am not sure whether it is worth patching up the slice2 branch since it may never land in the core. I am pretty happy with what I've got there at the moment.

@weepy
Copy link

weepy commented Mar 27, 2010

good for an extension ?

@matehat
Copy link
Contributor

matehat commented Mar 27, 2010

If we get the wrapping of utility functions accepted, adding those 3 would not be a problem, I think, as simple clean functions, such as slice, splice and range. They would never appear more that once and would not be repulsive in the output.

weepy: An extension could never distinguish between its use as a variable to which something is assigned (when splicing) and when it needs its values (slicing).

@jashkenas
Copy link
Owner

Stan and I talked about this a week or so ago. It would be great to get into core, but we need a slightly different implementation that helps clean up our internals.

The fundamental issue is passing a (object, start, end, exclusive) call to determine the start and end indexes of an array slice or range. If we had this function, we could use it for slices and splices with negative indices, as well as for our range-to-array conversion, and for range comprehensions. It could shorten the code that CoffeeScript currently produces for all of those cases.

So a __range(object, start, end, exclusive) special function that returns the correct start and end indices would be a great way to get this landed, if anyone wants to take a stab.

@weepy
Copy link

weepy commented Mar 27, 2010

How about this:

__range: (object, start, end, xclusive) ->
  L: object.length  
  start: or 0
  end: or L - 1
  start: + L if start < 0 
  end: + L if end < 0
  e: - 1 if xclusive
  [start, end]

a :[1,2,3,4,5]

__range a, 0, 3      # [0,3]
__range a, 0, -1     # [0,4]
__range a, -1, -2    # [4,3]
__range a, undefined, -2     # [0,3]
__range a, undefined, undefined     #  [0,4]
__range a, 0, 2, true     # [0,1]

@StanAngeloff
Copy link
Contributor Author

The __range function can be extracted from the existing __slice implementation. Jeremy and I exchanged a few gists a while back and got this at the end:

function __range(array, from, to, exclusive) {
  return [
    (from < 0 ? from + array.length : from || 0),
    (to < 0 ? to + array.length : to || array.length) + (exclusive ? 0 : 1)
  ];
}

@matehat
Copy link
Contributor

matehat commented Mar 30, 2010

Here, I have a branch that contains your recent work, with the factoring out of __range from __slice and __splice, all factored out into properties of Coffeescript object.

@jashkenas
Copy link
Owner

Stan and Matehat's branches are now merged to master, but a certain amount of the work has been reverted. The Coffeescript object was removed for reasons of speed and clarity...

__slice, __splice, and __range are gone as well. As wonderful as they were for slice and splice compilation, it doesn't make any sense to have a language where you can do array[0...-1], but not array[-1]. It's just too inconsistent.

Check out the compare view to see the changes from the branches that are now on master:

http://github.com/jashkenas/coffee-script/compare/a934cf4...master

Closing the ticket.

@weepy
Copy link

weepy commented Mar 31, 2010

Is array[4..] gone as well ?

@StanAngeloff
Copy link
Contributor Author

So... consistency? I have just pushed two commits to my slice3 branch that deal with slicing, splicing and indexing using negative offsets.

  • Partial slices

    a: [0..9]
    b: a[5..]
    puts b  # 5,6,7,8,9
    
  • Cloning an array

    a: [0..9]
    b: a[...]
    a[0]: 1
    puts b[0]  # 0
    
  • Negative offsets in slicing/splicing

    a: [0..9]
    a[..-5]: a[5..]
    puts a  # 5,6,7,8,9,6,7,8,9
    
  • Negative offsets when accessing ValueNodes by index

    a: [0..9]
    puts a[-1]  # 9
    
  • Any expression in negative indices

    a: [0..9]
    puts a[- ( -> 1)()]  # 9
    puts a[- (1 + 1)]  # 8
    
  • Soaking negative indices

    a: [[], [0..9]]
    puts a[-1]?[0]  # 0
    

You can see the generated JavaScript in my terminal gist. The soak operations are not very smart and may leak temporary variables, but at this time of night I am too tired to fix it. Help would be appreciated.

So, what do you think?

@matehat
Copy link
Contributor

matehat commented Apr 1, 2010

My concern about this is to see how it would handle negative variables. For consistency, we would need to allow them, but then, we'd need to check for variable sign every time. Here's a quick benchmark comparing a loop with automatic sign checking (named 'smart') and one without (named 'dumb').

@StanAngeloff
Copy link
Contributor Author

@matehat, there is no easy and quick way to allow negative variables. array[i] could really be object[property] for all we know. Therefore we have the explicit grammar where the - is mandatory to invoke the new code.

Here is a quick benchmark using Chromium nightly:
Chromium benchmark

and on Firefox nightly:
Firefox benchmark

@matehat
Copy link
Contributor

matehat commented Apr 1, 2010

But then, we'd have to tell then that there's only one way to use negative index and that's when typing - explicitly.

If Jeremy is willing to push this to trunk, I'd be jumping with joy :)

@StanAngeloff
Copy link
Contributor Author

We had a short discussing on IRC last night (for me) -- either way I am happy. I have it in my branch so trunk or no trunk, it's still going to be used extensively by me.

@matehat
Copy link
Contributor

matehat commented Apr 1, 2010

what came out of that discussion?

@weepy
Copy link

weepy commented Apr 1, 2010

Is it possible to turn this into an extension ? So you can opt-in easily on a file by file basis ?

@StanAngeloff
Copy link
Contributor Author

what came out of that discussion?

I can't say anything came out of it really, it's just we had a brief chat and decided to re-open the ticket to see what the overall feeling about this feature is.

Is it possible to turn this into an extension ? So you can opt-in easily on a file by file basis ?

I am afraid the patches are too complicated to work as an extension.

I have some basic ideas for future optimisations which I might try and get to the branch at some point, but even in it's current state and with the explicit syntax, it should be fast enough in most cases. I am using it in my code and I am happy with the results.

@jashkenas
Copy link
Owner

As much as I love negative indices, and the current compiler would benefit from them in many cases, I still don't see how we could have a language where...

last: array[-1]

Works, but...

i: -1
last: array[i]

... does not work. (And injecting extra code into every indexing lookup isn't really acceptable.) This is just one of those things that needs to be supported at the JavaScript VM level, and that's all there is to it.

@matehat
Copy link
Contributor

matehat commented Apr 1, 2010

Fair enough. Could we still implement l[2..], l[..l.length] and l[...]? That would not affect language consistency and the compiler would still benefit a great deal.

@jashkenas
Copy link
Owner

We certainly could. Do you think it's problematic that it's ambiguous with splats?

method(args...)

... would expand the args, whereas ...

array[args...]

... would not. Is that acceptable? If it is, I'd be glad to take the patch.

@weepy
Copy link

weepy commented Apr 1, 2010

+1 for l[2..], l[..2]

Is l[...] useful.

@StanAngeloff
Copy link
Contributor Author

I'm open to suggestions about the syntax -- is there anything we can do that would look good and still allow us to have negative indices? I am thinking:

a: [0..9]
puts a[, 1]  # puts a[a.length - 1]
i: 1
puts a[, i]  # puts a[a.length - i]

If we all like the feature so much, let's all think of a way we can get around JS limitations. If this is clearly not possible, well it's clearly not happening :D

@StanAngeloff
Copy link
Contributor Author

We certainly could. Do you think it's problematic that it's ambiguous with splats?

This is why the grammar does not support array[args...] and there is no need to. We only support the inclusive operator array[args..].

array[...] should probably be array[..] to be consistent.

Is l[...] useful.

It certainly is in cases where you would like to duplicate an array. It might work on arguments[..] too.

@matehat
Copy link
Contributor

matehat commented Apr 1, 2010

jashkenas: semantically, I think the difference between lookup and argument passing is strong enough to break the ambiguity. Also, only one of .. or ... should be allowed and .. seems more natural to me. What do you think?

@weepy
Copy link

weepy commented Apr 1, 2010

Agree matehat; to my eyes, in line with the rest of the slice params, I read the following:
a[1...] reads as the 1st element up to, but not including the last
a[1..] reads as the 1st element up to and including the last

In this context I'd say that .. was useful, but ... is confusing.

@jashkenas
Copy link
Owner

Given the ambiguity of with splats, and given that it breaks when used as range literals, I think this ticket should be closed again.

If you can do [a..b] and [a...b], then [a..] implies [a...].

Also, if I can use a range to slice an array: array[0...] and I can use a range as the source of a comprehension, then what is this?

for num in [0...]

So, going back to wontfix. We can live with simple ranges that always have a start and end value.

@matehat
Copy link
Contributor

matehat commented Apr 2, 2010

Mmmm, I think it's a bit early to give up on something that could be really great and that is just a few steps from being consistent.

I agree there's a consistency problem with ranges and I say: let's just split them apart. Range literals don't have to look the same as index slicing, do they? What if our slicing was more python-style then ruby-style : [start:end:step] (given none of them could be negative). It could be commas instead of colons or anything.

I find indefinite end far more useful that having the possibility to include/exclude the last index. What do you think?

@matehat
Copy link
Contributor

matehat commented Apr 2, 2010

I don't know if I've been clear on that but my point is :

range: [0..5] # no indefinite ending
list[0:] # indefinite ending or start

@jashkenas
Copy link
Owner

Sure, if there's a really wonderful proposal or patch, then that's great. It's just a really tiny feature, and we shouldn't sacrifice consistency, clarity, or simplicity for it ... the negative index problem being a good example of a solution being more work to keep straight than the problem it intends to solve. Part of being "a little language" is choosing your battles.

@matehat
Copy link
Contributor

matehat commented Apr 2, 2010

I don't think that last proposal is anywhere close to hurt consistency, clarity or simplicity, does it? It's been at work and makes sense for thousands of people developing with python. IMO, the fact that the above implied an ambiguity between range literals and array slicing showed that the parallelism between the two construct was anything but adequate.

@jashkenas
Copy link
Owner

matehat: agreed, but it's quite a different idea. Feel free to open a new ticket for it if you like.

This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants