Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deprecated syntax using the value of .= in the REPL. #26516

Closed
fredrikekre opened this issue Mar 19, 2018 · 41 comments
Closed

Deprecated syntax using the value of .= in the REPL. #26516

fredrikekre opened this issue Mar 19, 2018 · 41 comments
Labels
broadcast Applying a function over a collection REPL Julia's REPL (Read Eval Print Loop)
Milestone

Comments

@fredrikekre
Copy link
Member

Example from the manual

julia> x = 1.0:3.0; y = similar(x);

julia> @. y = x + 3 * sin(x)
┌ Warning: Deprecated syntax `using the value of `.=``.
└ @ nothing none:0
3-element Array{Float64,1}:
 3.5244129544236893
 4.727892280477045 
 3.4233600241796016

I suspect this warning comes from setting of the ans variable in the REPL.

@fredrikekre fredrikekre added the REPL Julia's REPL (Read Eval Print Loop) label Mar 19, 2018
@martinholters
Copy link
Member

Hm, it's not the assignment to ans (nor showing the result):

julia> a = zeros(4);

julia> "foo"
"foo"

julia> a .= ones(4); ans
┌ Warning: Deprecated syntax `using the value of `.=``.
└ @ nothing none:0
"foo"

julia> ans
"foo"

Also, the work-around is rather ugly: (a .= ones(4); nothing)

Does the REPL split the input at the ; if not in parentheses? Because:

julia> Meta.lower(Main, :(a .= ones(4)))
┌ Warning: Deprecated syntax `using the value of `.=``.
└ @ nothing none:0
:($(Expr(:thunk, CodeInfo(:(begin
      Core.SSAValue(0) = ones(4)
      Core.SSAValue(1) = (Base.broadcast!)(Base.identity, a, Core.SSAValue(0))
      return Core.SSAValue(1)
  end)))))

julia> Meta.lower(Main, :(a .= ones(4); nothing))
:($(Expr(:thunk, CodeInfo(:(begin
      Core.SSAValue(0) = ones(4)
      (Base.broadcast!)(Base.identity, a, Core.SSAValue(0))
      #= REPL[24]:1 =#
      return nothing
  end)))))

Anyway, what would we want to show and assign to ans here?

@stevengj
Copy link
Member

I think we should just revert #26088 and return the LHS (see also #25954) as we used to.

@JeffBezanson
Copy link
Member

I would also much rather just make a final decision here. This deprecation is a hassle and I'd rather not sink more time into it.

@Keno
Copy link
Member

Keno commented Mar 19, 2018

Frankly, I always find the RHS returning behavior at the REPL confusing in any case where the RHS and LHS are not ==, e.g.:

julia> (a,b,c) = Iterators.filter(x->x>3, 1:1000)
Base.Iterators.Filter{getfield(Main, Symbol("##9#10")),UnitRange{Int64}}(getfield(Main, Symbol("##9#10"))(), 1:1000)

@JeffBezanson
Copy link
Member

...so we should change the result value of a = b too?

@JeffBezanson JeffBezanson added the triage This should be discussed on a triage call label Mar 20, 2018
@Keno
Copy link
Member

Keno commented Mar 20, 2018

I don't know, but I wanted to mention it, since it seems related. For completeness of examples and to play devil's advocate, do we have a good example (ignoring the mutating assignment variants for now) where returning the LHS would look super strange?

@JeffBezanson
Copy link
Member

Are you suggesting a = b showing a different value in the REPL than what the expression actually returns?

@StefanKarpinski
Copy link
Member

One option would be to define a = b = c to do a = c; b = c; a and similarly for .=.

@Keno
Copy link
Member

Keno commented Mar 20, 2018

No, I'm not suggesting anything. I think my main point is that assignment sometimes does non-trivial transformations (e.g. perform the iteration protocol), esp in the broadcasting cast, and esp in the REPL, you're often more interested in what the result of those transformation is than the value that it was transformed from.

@StefanKarpinski
Copy link
Member

@KristofferC has pointed out that mutating operations generally return the thing that's mutating and .= is a mutation operation so the same principle can be argued to apply.

@KristofferC
Copy link
Member

(after discussion with @fredrikekre earlier today)

@JeffBezanson
Copy link
Member

I'm frankly fine with either choice. However, @mbauman pointed out that the left side is more likely to be a "weird" value, like a view, or a special kind of view we introduce for performance (in the near future, for BitArray), which might make it less useful to return.

This is a special case of a general property of assignment: in a = b, the right side is a normal expression that's evaluated like anything else, but the left side is something weird that's not evaluated, and instead refers to a location in some way. So it's not clear what "the value of the left side" even means, whereas the value of the right side is clear. This is one of the reasons a = b returns b.

@mbauman
Copy link
Member

mbauman commented Mar 22, 2018

General consensus from triage after an extended discussion:

.= (and .op=) becomes special-cased to return their left hand-side. When indexing is involved, like A[I] .= B, we return A[I] when it is requested. Note that this isn't the same as the "output" array that gets passed to the implementation (broadcast!) nor is it what broadcast returns. It is an additional getindex call that is inserted when necessary.

@mbauman
Copy link
Member

mbauman commented Mar 22, 2018

As I was writing up the summary from triage, I realized it'd be nice if that getindex call would be able to be transformed into a view with the @views macro.

@mbauman mbauman removed the triage This should be discussed on a triage call label Mar 22, 2018
@mbauman mbauman added this to the 1.0 milestone Mar 22, 2018
@stevengj
Copy link
Member

stevengj commented Mar 22, 2018

As I was writing up the summary from triage, I realized it'd be nice if that getindex call would be able to be transformed into a view with the @views macro.

Which getindex calls aren't transformed into views already? Oh, I see, you mean the "implicit" getindex of the returned object.

@stevengj
Copy link
Member

stevengj commented Mar 22, 2018

So you are suggesting that, unless you do @views, then A[I] .= B will return a copy of A[I]? Worse, it will call getindex (and hence make the copy) whether you use the return value or not? That all seems pretty terrible to me.

Why not just define it to return whatever broadcast! returns, and then document that broadcast! should return the mutated object? It seems like you are introducing additional complexity for no benefit.

@mbauman
Copy link
Member

mbauman commented Mar 22, 2018

document that broadcast! should return the mutated object?

What is the mutated object, though? In A[I] .= B, is it A? Or is it view(A, I)? Or is it something else entirely?

In A[I] .= B, A[I] isn't really a value at all — it's just describing locations in A, as selected by I. Everywhere else in this language, when you write A[I], it means getindex.

Doing it this way allows us to lower A[I] to highly specialized views where advantageous — one such example is:

B = falses(10)::BitArray
B[bitrand(end)] .= true

In this case, going through a SubArray is terribly slow. I'm implementing precisely this view type for #26347, allowing us to operate on the parent bitarray in whole chunks at a time. But I don't even support indexing into it — as indexing would be O(n). I never expect this type to escape the broadcasting machinery… but it would were we to return the "output" argument to broadcast! by convention or lowering.

@stevengj
Copy link
Member

stevengj commented Mar 22, 2018

In A[I] .= B, is it A? Or is it view(A, I)

It's whatever gets passed to broadcast!, which is dotview(A, I), which defaults to view(A, I).

Everywhere else in this language, when you write A[I], it means getindex

Not on the left-hand-side of an assignment.

Doing it this way allows us to lower A[I] to highly specialized views

That is what dotview already allows.

@stevengj
Copy link
Member

I guess you're worried about dotview being returned and exposing broadcast!-specific view objects?

In that case, fine, make both A .= B and A[I] .= B return A.

Returning A[I], i.e. returning getindex(A,I) and forcing a copy to be made for all A[I] .= B assignments, seems strictly worse than either of the two options (returning dotview or returning A) above.

@mbauman
Copy link
Member

mbauman commented Mar 22, 2018

Yes, that's part of it. What about A[1] .= B? Does that return A or the object that got assigned into at A[1]?

Do you really use the return value of A[I] .= B so much? We'd only insert the extra getindex if necessary — like at the REPL. We're not forcing a copy to be made for all A[I] .= assignments by any stretch of the imagination. Beyond the interactive REPL case, it's very rare.

I'm still not terribly thrilled with how the semantics of A[I] = B subtly change with an @. macro, but returning getindex(A, I) if necessary seems to do what I'd expect in most cases.

@stevengj
Copy link
Member

It makes me uneasy to think that whether a statement A[I] .= B makes a copy of A[I] should depend on the context in which it is used, and in particular on whether the compiler detects that the return value is unused. It seems preferable to me — much easier to reason about and explain — to just always return A (even for A[1] .= ..., for consistency).

With your suggestion, A[I] .= B becomes a trap for unwary users, and its allocation behavior becomes hard to explain. e.g. if you put it as the last line of a function (or the last line of a try block or something at the end of a function), then it suddenly becomes allocating even if the caller of the function does not use the return value.

@mbauman
Copy link
Member

mbauman commented Mar 22, 2018

That's fair. Even worse: an attempted @inbounds A[I] .= X wouldn't only fail to remove bounds checks, it'd also result in a spurious allocation.

@JeffBezanson
Copy link
Member

I've seen a suggestion to return nothing, which seems like a reasonable option at this point.

@stevengj
Copy link
Member

stevengj commented Mar 23, 2018

Returning nothing means that you can't chain in-place expressions, e.g you can't do sort!(A .= f.(A)). Returning A for both A .= B and A[I] .= B is more useful than nothing, no?

@StefanKarpinski
Copy link
Member

StefanKarpinski commented Mar 23, 2018

I don't understand what's so bad about just returning the LHS. Sure, it will sometimes be a slightly strange dotview type but that should be fine if we make those print decently. Making them more first class in general seems like a good thing to me.

@StefanKarpinski
Copy link
Member

StefanKarpinski commented Mar 23, 2018

Returning A for both A .= B and A[I] .= B is more useful than nothing, no?

This strikes me as violating the principle of least surprise quite badly. An assignment expression that evaluates to something that is neither its left hand side nor its right hand side?! At least returning B or Base.dotview(A, I) or A[I] is returning something related to one side of the expression or the other. Returning A for the sake of returning something is just weird.

@mbauman
Copy link
Member

mbauman commented Mar 23, 2018

The other argument against the LHS was that Keno's immutable-field-mutation might also lower to something funny, but that doesn't apply to .= since it's inherently a mutating operation. My arguments against the LHS have been:

  • it's different from = everywhere else
  • @. introduces a subtle change in semantics
  • Base.dotview is more free to return something optimized for a specific broadcast implementation (and pessimized for general purpose use)

None of them are all that terrible. I'm convinced that .= cannot return its RHS — it's simply too fiddly — so the first point is moot. The second point is roughly the same, but the difference between binding and mutable assignment is so huge you can't just willy-nilly throw @. on a scalar assignment and expect it to work. The third point isn't just about printing. It could add a surprising performance difference between A[I] .= f.(x); sort!(view(A, I)) and sort!(A[I] .= f.(x)).

On the other hand, I don't find the golfing sort!(A .= f.(A)) argument compelling in the first place — it's jarring against kwargs to my eyes. I'm also very staunchly against returning what broadcast! returns — this is a syntax, let's make it have a well-defined result that's not defined by the implementation.

@StefanKarpinski
Copy link
Member

Data point: Python uses the meaning of = that I proposed above, i.e. a = b = f() means this:

tmp = f(); a = tmp; b = tmp; a

Of course, in Python the question of what this assignment evaluates to is moot since assignment is a statement, not an expression. I know that @JeffBezanson really dislikes this meaning, but I do think it has the advantage of least surprise for the user – programmers seem to use a = b = f() to assign the result of f() to a and then to b. Our current interpretation is effectively this:

tmp = f(); b = tmp; a = tmp; tmp

There are a couple of things that we've seen that people find surprising about this:

  1. That b is assigned before a;
  2. That the whole expression evaluates to tmp rather than a.

The Python-like interpretation above addresses both of these and doesn't introduce any spooky action at a distance in the presence of typed variables:

  1. a is assigned first;
  2. The whole expression evaluates to a;
  3. The type of b does not affect what is assigned to a.

I know that @JeffBezanson really hates this interpretation, but I'd like to at least discuss it and understand what the concrete drawbacks are.

@StefanKarpinski
Copy link
Member

Another possible approach that @Keno suggested on Slack is to make a .= b .= c an error if the results aren't isequal but I was worried that this would be too expensive to check. Another rule that would be easier to enforce for a = b = c would be that a and b have to have the same declared type (where no declared type is equivalent to a declared type of Any. The same rule could be applied to a .= b .= c but it's harder to see how it would apply to a[i] = b[i] = c or a[i] .= b[i] .= c for completely general indexing expressions or worse still a[i] = b = c, a[i] .= b .= c, a = b[i] = c and a .= b[i] .= c. Presumably if we adopted that rule we would also want to change assignment-like expressions to evaluate to their LHS instead of their RHS?

@stevengj
Copy link
Member

I'm also very staunchly against returning what broadcast! returns — this is a syntax, let's make it have a well-defined result that's not defined by the implementation.

This doesn't make sense to me. The .= syntax is defined as sugar for broadcast!. Isn't that a well-defined result?

Would you also object to having a[i] return what getindex returns? Since you can redefine getindex to mean anything at all, the result of a[i] is "defined by the implementation" too.

@KristofferC
Copy link
Member

Would you also object to having a[i] return what getindex returns?

Isn't this the same thing:

julia> setindex!(a, 3, 2)
3-element Array{Float64,1}:
 0.0
 3.0
 0.0

julia> a[2] = 3
3

@mbauman
Copy link
Member

mbauman commented Mar 26, 2018

The difference between getindex and setindex!/broadcast! is that the primary purpose of getindex is to return a value. The primary purpose of the mutating ones is to mutate. That's what folks will implement and test. There are at least five possible return values for A[I] .= x that have all been proposed at some point in this thread (and its progenitors) that are all at least semi-defensible. Sure, you can pick something and punt it to documentation and call everything else a bug, but when we already do special lowering for indexed and dot-assignments, let's just pick something that works reasonably well and do it for them.

@stevengj
Copy link
Member

stevengj commented Mar 26, 2018

@mbauman, we document the return value for functions likepush! and setindex! etcetera, whose "primary purpose is to mutate," precisely so that mutations can be chained with other functions. Why is broadcast! so different?

@KristofferC, my point is that .= is defined to be sugar for broadcast!, so exposing the broadcast! return value isn't necessarily exposing a user-defined implementation detail any more than any other syntactic sugar for an overloaded function.

@mbauman
Copy link
Member

mbauman commented Mar 26, 2018

setindex! is actually a great example:

julia> A = rand(2,2);
       A[:] = 1:4
1:4

julia> A[:] .= 1:4
┌ Warning: Deprecated syntax `using the value of `.=``.
└ @ nothing none:0
4-element view(::Array{Float64,1}, :) with eltype Float64:
 1.0
 2.0
 3.0
 4.0

The fact that we achieve the mutation through a SubArray feels like an implementation detail.

I think this may be a difference in perspective. When I look at .= I don't immediately see "broadcast! into the view or value". I've typically been finding myself thinking "assign into." Or I might think "broadcast the values into."

@stevengj
Copy link
Member

When I look at .= I don't immediately see "broadcast! into the view or value". I've typically been finding myself thinking "assign into." Or I might think "broadcast the values into."

Isn't learning that it is actually "just" a broadcast! call a good thing?

I just don't see how returning something that is always useless (nothing) is an improvement on returning something that is sometimes useful (the mutated container).

@JeffBezanson
Copy link
Member

@StefanKarpinski If a = b is a statement (returning nothing), then I think Python's behavior for a = b = c becomes more reasonable: assignment normally isn't allowed in a value context, so it's not so bad to give it a special behavior in the context of a = b = c.

But in julia assignment is a right-associative operator that returns its right argument. a = b = c parses as a = (b = c), and b = c returns c, so that's what gets assigned to a. So this is a simple rule that explains how assignment parses and evaluates, and how a = b = c works. It also fits well with the asymmetry I mentioned above: the right side is a normal expression that gets evaluated, but the left side is not evaluated at all (e.g. in a[i] = b, a[i] is not evaluated but rather causes the expression to be transformed into a setindex! call). If you're looking for a value to return, it makes sense to use the only argument that actually computed a value.

So two drawbacks to returning the left side are: (1) We have to evaluate a whole extra expression that we did not have to evaluate before, which could only be less efficient, (2) it could prevent us from having fancy assignment left sides that don't correspond to values in a clear way (for example @Keno 's proposed feature for "mutating immutables"). I also think the alternatives are just more complicated than "= is a right-associative operator that returns its right argument".

@mbauman
Copy link
Member

mbauman commented Mar 26, 2018

If we had a broadcast!(f, A, args, indices) API, we would be using that instead of @views A[indices].

Another argument against the LHS is that several of us would like to work on making indexing fuse and participate in broadcasting. We could potentially add a syntax A.[I] .= B .+ C. In such a case, neither the LHS nor the RHS will ever exist — that'd "just" be a sugar for a broadcasted setindex! into A.

@woodwm
Copy link

woodwm commented Apr 5, 2018

Sorry to disturb. I think = in Python is a little different from what @StefanKarpinski said.
For the expression a = b = c = 1, it means a = 1 b = a c = a

In[1]: import dis
In[2]: dis.dis("a=b=c=1") 
  1           0 LOAD_CONST               0 (1)
              2 DUP_TOP
              4 STORE_NAME               0 (a)
              6 DUP_TOP
              8 STORE_NAME               1 (b)
             10 STORE_NAME               2 (c)
             12 LOAD_CONST               1 (None)
             14 RETURN_VALUE

@StefanKarpinski
Copy link
Member

That's to avoid evaluating the RHS expression multiple times. Since = does not ever convert in Python, it's equivalent to assigning to a temporary and then assigning each of the LHS variables from that temporary except the bytecode is simpler and more efficient.

@KristofferC
Copy link
Member

Could someone state what the final decision on this ended up to be? What does .= return?

@mbauman
Copy link
Member

mbauman commented Apr 12, 2018

It's back to the previous behavior of just lowering directly to broadcast!, so it returns what broadcast! returns, and that's documented to be the mutated array.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
broadcast Applying a function over a collection REPL Julia's REPL (Read Eval Print Loop)
Projects
None yet
Development

No branches or pull requests

9 participants