Python-like slicing #272

StanAngeloff · 2010-03-20T13:43:33Z

I have just pushed a commit that implements Python-like slice literals. The syntax is similar to Python but rather than using colons (which is an assignment in CoffeeScript) it uses commas:

list: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
last: list[-1, ]    # last is now [9]

The comma at the end is mandatory to indicate a slice, rather than perform an index lookup. Slicing works on both strings and arrays and it should also work on numbers and objects implementing toString as well:

string: 'Here comes the man!'
every_other_letter: string[,,2]
puts every_other_letter    # prints 'Hr oe h a!'

I am interested to see how many of you find this useful?

There is a downside, however. The current implementation requires about 13 lines of additional JavaScript. I have tried to keep it short and this seems to be the shortest it can get. I have looked at other places in the core where we are using pre-defined functions and there are a few (__hasProp and __extends, 7 lines). In my view, it is a good trade-off between gained functionality and generated JavaScript.

The text was updated successfully, but these errors were encountered:

weepy · 2010-03-20T14:02:23Z

It's certainly an often used thing. Maybe there's another syntax to avoid colloision with index lookup? E.g I don't think </code> is used much


last: list \-1\
every_other_letter: string \,,2\

matehat · 2010-03-20T15:25:37Z

I find this incredibly useful. I've always found list[1..list.length] to be awkward, where list[1,] is a lot more natural. If the additional function only occurs when the index brackets contains a slicing construct, then I think this is a great trade-off.

One thing: this, if accepted, should replace the other slicing construct to avoid confusion. Then, maybe, the added utility function could be just as large as it needs to be, somehow.

I think we should stick to the traditional [] operator. It's used in every language I can think of.

weepy · 2010-03-21T01:05:41Z

I agree it's very useful. Perhaps a better syntax would be :

string[1..2]
string[..-1]
string[-1..]

Also: is it worth the extra complexity to allow a stepwise parameter ? - it's strikes me that it's not often used.

StanAngeloff · 2010-03-21T08:44:33Z

weepy, I was going for this syntax when I started the patch, but since it's a complete rewrite on Array#slice, I am kind of worried about performance. I haven't tested it myself, but I imagine the native slicing would work faster. The step argument wouldn't be something you would use everyday, I agree, but it's very useful for reversing arrays:

list: ['apples', 'bananas', 'pears']
reversed: list[,,-1]

At one point I had some very basic optimisations, when step === 1 and when step === -1. In these cases it's safe to call Array#slice and Array#reverse. I may look into bringing those back and adopting the dotted syntax.

weepy · 2010-03-21T08:47:15Z

how about list.reverse() :D:D

matehat · 2010-03-26T19:25:54Z

I was just keen to know how that issue was progressing. I'd really like to see it into trunk one day.

Also, I'm wondering if the negative step could be replaced by just comparing the first and second slice argument (if the first is greater, then assumed negative walk through the list).

StanAngeloff · 2010-03-26T19:43:41Z

I have been using this feature on my slice2 brach for about a week. It's pretty darn nice, but I am still not sure it's core material. It would require three additional functions __slice, __splice and __range:

a: b[1..10]
  .. produces ..
a: __slice(b, 1, 10);

a[1..10]: b[1..10]
  .. produces ..
__splice(a, 1, 10, __slice(b, 1, 10));

Where __range would be called within __slice and __splice to determine the correct offsets.

So to recap, I am not sure whether it is worth patching up the slice2 branch since it may never land in the core. I am pretty happy with what I've got there at the moment.

weepy · 2010-03-27T11:02:10Z

good for an extension ?

matehat · 2010-03-27T15:48:32Z

If we get the wrapping of utility functions accepted, adding those 3 would not be a problem, I think, as simple clean functions, such as slice, splice and range. They would never appear more that once and would not be repulsive in the output.

weepy: An extension could never distinguish between its use as a variable to which something is assigned (when splicing) and when it needs its values (slicing).

jashkenas · 2010-03-27T19:54:49Z

Stan and I talked about this a week or so ago. It would be great to get into core, but we need a slightly different implementation that helps clean up our internals.

The fundamental issue is passing a (object, start, end, exclusive) call to determine the start and end indexes of an array slice or range. If we had this function, we could use it for slices and splices with negative indices, as well as for our range-to-array conversion, and for range comprehensions. It could shorten the code that CoffeeScript currently produces for all of those cases.

So a __range(object, start, end, exclusive) special function that returns the correct start and end indices would be a great way to get this landed, if anyone wants to take a stab.

weepy · 2010-03-27T20:41:37Z

How about this:

__range: (object, start, end, xclusive) ->
  L: object.length  
  start: or 0
  end: or L - 1
  start: + L if start < 0 
  end: + L if end < 0
  e: - 1 if xclusive
  [start, end]

a :[1,2,3,4,5]

__range a, 0, 3      # [0,3]
__range a, 0, -1     # [0,4]
__range a, -1, -2    # [4,3]
__range a, undefined, -2     # [0,3]
__range a, undefined, undefined     #  [0,4]
__range a, 0, 2, true     # [0,1]

StanAngeloff · 2010-03-27T20:48:00Z

The __range function can be extracted from the existing __slice implementation. Jeremy and I exchanged a few gists a while back and got this at the end:

function __range(array, from, to, exclusive) {
  return [
    (from < 0 ? from + array.length : from || 0),
    (to < 0 ? to + array.length : to || array.length) + (exclusive ? 0 : 1)
  ];
}

matehat · 2010-03-30T19:48:07Z

Here, I have a branch that contains your recent work, with the factoring out of __range from __slice and __splice, all factored out into properties of Coffeescript object.

jashkenas · 2010-03-31T00:23:16Z

Stan and Matehat's branches are now merged to master, but a certain amount of the work has been reverted. The Coffeescript object was removed for reasons of speed and clarity...

__slice, __splice, and __range are gone as well. As wonderful as they were for slice and splice compilation, it doesn't make any sense to have a language where you can do array[0...-1], but not array[-1]. It's just too inconsistent.

Check out the compare view to see the changes from the branches that are now on master:

http://github.com/jashkenas/coffee-script/compare/a934cf4...master

Closing the ticket.

weepy · 2010-03-31T07:25:27Z

Is array[4..] gone as well ?

StanAngeloff · 2010-03-31T23:27:53Z

So... consistency? I have just pushed two commits to my slice3 branch that deal with slicing, splicing and indexing using negative offsets.

Partial slices

a: [0..9]
b: a[5..]
puts b  # 5,6,7,8,9

Cloning an array

a: [0..9]
b: a[...]
a[0]: 1
puts b[0]  # 0

Negative offsets in slicing/splicing

a: [0..9]
a[..-5]: a[5..]
puts a  # 5,6,7,8,9,6,7,8,9

Negative offsets when accessing ValueNodes by index
```
a: [0..9]
puts a[-1]  # 9
```

Any expression in negative indices

a: [0..9]
puts a[- ( -> 1)()]  # 9
puts a[- (1 + 1)]  # 8

Soaking negative indices
```
a: [[], [0..9]]
puts a[-1]?[0]  # 0
```

You can see the generated JavaScript in my terminal gist. The soak operations are not very smart and may leak temporary variables, but at this time of night I am too tired to fix it. Help would be appreciated.

So, what do you think?

matehat · 2010-04-01T02:49:37Z

My concern about this is to see how it would handle negative variables. For consistency, we would need to allow them, but then, we'd need to check for variable sign every time. Here's a quick benchmark comparing a loop with automatic sign checking (named 'smart') and one without (named 'dumb').

StanAngeloff · 2010-04-01T08:28:53Z

@matehat, there is no easy and quick way to allow negative variables. array[i] could really be object[property] for all we know. Therefore we have the explicit grammar where the - is mandatory to invoke the new code.

Here is a quick benchmark using Chromium nightly:

and on Firefox nightly:

matehat · 2010-04-01T08:36:16Z

But then, we'd have to tell then that there's only one way to use negative index and that's when typing - explicitly.

If Jeremy is willing to push this to trunk, I'd be jumping with joy :)

StanAngeloff · 2010-04-01T08:44:11Z

We had a short discussing on IRC last night (for me) -- either way I am happy. I have it in my branch so trunk or no trunk, it's still going to be used extensively by me.

matehat · 2010-04-01T08:51:01Z

what came out of that discussion?

weepy · 2010-04-01T09:11:27Z

Is it possible to turn this into an extension ? So you can opt-in easily on a file by file basis ?

StanAngeloff · 2010-04-01T11:57:55Z

what came out of that discussion?

I can't say anything came out of it really, it's just we had a brief chat and decided to re-open the ticket to see what the overall feeling about this feature is.

Is it possible to turn this into an extension ? So you can opt-in easily on a file by file basis ?

I am afraid the patches are too complicated to work as an extension.

I have some basic ideas for future optimisations which I might try and get to the branch at some point, but even in it's current state and with the explicit syntax, it should be fast enough in most cases. I am using it in my code and I am happy with the results.

jashkenas · 2010-04-01T12:44:03Z

As much as I love negative indices, and the current compiler would benefit from them in many cases, I still don't see how we could have a language where...

last: array[-1]

Works, but...

i: -1
last: array[i]

... does not work. (And injecting extra code into every indexing lookup isn't really acceptable.) This is just one of those things that needs to be supported at the JavaScript VM level, and that's all there is to it.

matehat · 2010-04-01T12:52:13Z

Fair enough. Could we still implement l[2..], l[..l.length] and l[...]? That would not affect language consistency and the compiler would still benefit a great deal.

jashkenas · 2010-04-01T12:55:29Z

We certainly could. Do you think it's problematic that it's ambiguous with splats?

method(args...)

... would expand the args, whereas ...

array[args...]

... would not. Is that acceptable? If it is, I'd be glad to take the patch.

weepy · 2010-04-01T12:55:38Z

+1 for l[2..], l[..2]

Is l[...] useful.

StanAngeloff · 2010-04-01T12:57:25Z

I'm open to suggestions about the syntax -- is there anything we can do that would look good and still allow us to have negative indices? I am thinking:

a: [0..9]
puts a[, 1]  # puts a[a.length - 1]
i: 1
puts a[, i]  # puts a[a.length - i]

If we all like the feature so much, let's all think of a way we can get around JS limitations. If this is clearly not possible, well it's clearly not happening :D

StanAngeloff · 2010-04-01T12:59:29Z

We certainly could. Do you think it's problematic that it's ambiguous with splats?

This is why the grammar does not support array[args...] and there is no need to. We only support the inclusive operator array[args..].

array[...] should probably be array[..] to be consistent.

Is l[...] useful.

It certainly is in cases where you would like to duplicate an array. It might work on arguments[..] too.

matehat · 2010-04-01T13:03:06Z

jashkenas: semantically, I think the difference between lookup and argument passing is strong enough to break the ambiguity. Also, only one of .. or ... should be allowed and .. seems more natural to me. What do you think?

weepy · 2010-04-01T13:07:00Z

Agree matehat; to my eyes, in line with the rest of the slice params, I read the following:
a[1...] reads as the 1st element up to, but not including the last
a[1..] reads as the 1st element up to and including the last

In this context I'd say that .. was useful, but ... is confusing.

jashkenas · 2010-04-02T02:47:31Z

Given the ambiguity of with splats, and given that it breaks when used as range literals, I think this ticket should be closed again.

If you can do [a..b] and [a...b], then [a..] implies [a...].

Also, if I can use a range to slice an array: array[0...] and I can use a range as the source of a comprehension, then what is this?

for num in [0...]

So, going back to wontfix. We can live with simple ranges that always have a start and end value.

matehat · 2010-04-02T02:58:37Z

Mmmm, I think it's a bit early to give up on something that could be really great and that is just a few steps from being consistent.

I agree there's a consistency problem with ranges and I say: let's just split them apart. Range literals don't have to look the same as index slicing, do they? What if our slicing was more python-style then ruby-style : [start:end:step] (given none of them could be negative). It could be commas instead of colons or anything.

I find indefinite end far more useful that having the possibility to include/exclude the last index. What do you think?

matehat · 2010-04-02T03:00:54Z

I don't know if I've been clear on that but my point is :

range: [0..5] # no indefinite ending
list[0:] # indefinite ending or start

jashkenas · 2010-04-02T03:07:46Z

Sure, if there's a really wonderful proposal or patch, then that's great. It's just a really tiny feature, and we shouldn't sacrifice consistency, clarity, or simplicity for it ... the negative index problem being a good example of a solution being more work to keep straight than the problem it intends to solve. Part of being "a little language" is choosing your battles.

matehat · 2010-04-02T03:17:12Z

I don't think that last proposal is anywhere close to hurt consistency, clarity or simplicity, does it? It's been at work and makes sense for thousands of people developing with python. IMO, the fact that the above implied an ambiguity between range literals and array slicing showed that the parallelism between the two construct was anything but adequate.

jashkenas · 2010-04-02T03:19:34Z

matehat: agreed, but it's quite a different idea. Feel free to open a new ticket for it if you like.

This issue was closed.

Python-like slicing #272

Python-like slicing #272

Comments

StanAngeloff commented Mar 20, 2010

weepy commented Mar 20, 2010

matehat commented Mar 20, 2010

weepy commented Mar 21, 2010

StanAngeloff commented Mar 21, 2010

weepy commented Mar 21, 2010

matehat commented Mar 26, 2010

StanAngeloff commented Mar 26, 2010

weepy commented Mar 27, 2010

matehat commented Mar 27, 2010

jashkenas commented Mar 27, 2010

weepy commented Mar 27, 2010

StanAngeloff commented Mar 27, 2010

matehat commented Mar 30, 2010

jashkenas commented Mar 31, 2010

weepy commented Mar 31, 2010

StanAngeloff commented Mar 31, 2010

matehat commented Apr 1, 2010

StanAngeloff commented Apr 1, 2010

matehat commented Apr 1, 2010

StanAngeloff commented Apr 1, 2010

matehat commented Apr 1, 2010

weepy commented Apr 1, 2010

StanAngeloff commented Apr 1, 2010

jashkenas commented Apr 1, 2010

matehat commented Apr 1, 2010

jashkenas commented Apr 1, 2010

weepy commented Apr 1, 2010

StanAngeloff commented Apr 1, 2010

StanAngeloff commented Apr 1, 2010

matehat commented Apr 1, 2010

weepy commented Apr 1, 2010

jashkenas commented Apr 2, 2010

matehat commented Apr 2, 2010

matehat commented Apr 2, 2010

jashkenas commented Apr 2, 2010

matehat commented Apr 2, 2010

jashkenas commented Apr 2, 2010