Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

== for immutables should recursively call == on its fields #4648

Open
StefanKarpinski opened this issue Oct 26, 2013 · 60 comments
Open

== for immutables should recursively call == on its fields #4648

StefanKarpinski opened this issue Oct 26, 2013 · 60 comments
Assignees
Labels
breaking This change will break code equality Issues relating to equality relations: ==, ===, isequal help wanted Indicates that a maintainer wants help on an issue or pull request needs decision A decision on this change is needed
Milestone

Comments

@StefanKarpinski
Copy link
Member

This doesn't make much sense:

julia> immutable Foo{T}
         bar::T
       end

julia> Foo("baz") == Foo("baz")
false

julia> Foo("baz").bar == Foo("baz").bar
true

julia> Foo(1) == Foo(1)
true

If the fields are == to each other then the objects should be == to each other.

@StefanKarpinski
Copy link
Member Author

I would also argue that for mutable types maybe we should throw an error in the absence of specialized ==.

@StefanKarpinski
Copy link
Member Author

Tuples already do the right thing:

julia> ("baz",) == ("baz",)
true

@ivarne
Copy link
Member

ivarne commented Oct 26, 2013

Arrays also behave this way

julia> {"abc",5} == {"abc",5}
true

julia> {"abc",5} == {"abc",4}
false

I am not sure if I think it is a good thing to introduce more differences between mutable and immutable types, but a default implementation of error for == on user defined types is something I like. How would you define == for different user defined types?

@JeffBezanson
Copy link
Member

I can agree that the current default == is not clearly better than the one proposed here. Tuples are an easy case since we know they just "pass through" to their values. Throwing an error is certainly safe; in my view == depends on the abstract meaning of its arguments, so without a definition we really don't know what it means.

@ivarne
Copy link
Member

ivarne commented Oct 27, 2013

If we throw an error if the comparison function is undefined, the code for comparing arbitrary objects will be longer.

try
    return a == b
catch
    return false
end

or

applicable(==, a, b) && a == b

StefanKarpinski referenced this issue Oct 28, 2013
Perhaps I'm missing something but why not immediately return true from these checks if the two objects are identical?  If this is valid there are likely other places this could happen too...but I thought I would first check.

Thanks,

Glen
@quinnj
Copy link
Member

quinnj commented Aug 4, 2014

@StefanKarpinski, did this get implemented as part of the hashing changes? I seem to recall not needing == definitions for some immutable types after the changes.

@JeffBezanson
Copy link
Member

Immutables already call === recursively on fields, so that might have been it. This hasn't been implemented.

@quinnj
Copy link
Member

quinnj commented Aug 4, 2014

But doesn't this definition mean it is?

==(x,y) = x === y

@JeffBezanson
Copy link
Member

That calls ===, which then calls === recursively. The issue asks for == to call == recursively.

@quinnj
Copy link
Member

quinnj commented Aug 4, 2014

Ah, got it!

@StefanKarpinski
Copy link
Member Author

https://stackoverflow.com/questions/44324800/how-to-use-haskey-and-in-functions-with-complex-dictionnary-keys-in-julia/44333559#44333559

I still think we should do this. All three equality predicates should probably call themselves recursively.

@StefanKarpinski StefanKarpinski added help wanted Indicates that a maintainer wants help on an issue or pull request good first issue Indicates a good issue for first-time contributors to Julia and removed good first issue Indicates a good issue for first-time contributors to Julia labels Jun 2, 2017
@tpapp
Copy link
Contributor

tpapp commented Jul 17, 2017

Should == require equality of the type parameters? Eg is Foo(1) == Foo(1.0) in the example above?

@cossio
Copy link
Contributor

cossio commented Jul 17, 2017

Maybe a simple solution to this issue is to make "hello" === "hello" return true. This would be a patch on the function ===, making it treat string specially.

@StefanKarpinski
Copy link
Member Author

StefanKarpinski commented Jul 18, 2017

Making "hello" === "hello" is certainly a life goal, but it's not so simple, unfortunately (#22193).

@JeffBezanson
Copy link
Member

I'm not a big fan of this change since I think we shouldn't try to guess what == means for an arbitrary type. For example it's not clear what to do with type parameters, and it's not clear whether == should also recur into mutable values. (It's also not easy to implement efficiently; currently we'd need a generated function.)

@vtjnash
Copy link
Member

vtjnash commented Jul 27, 2017

This seems to be close to a duplicate of #12198 (comment)?

@vtjnash vtjnash marked this as a duplicate of #12198 Jul 27, 2017
@Keno
Copy link
Member

Keno commented Jul 27, 2017

That's already fixed on master, but it's certainly a larger issue than string (e.g. BigInts).

@JeffBezanson
Copy link
Member

That's done --- we now have "abc" === "abc". But surely this is a more general issue, not specific to strings.

@yuyichao
Copy link
Contributor

Fixing only the examples in this thread is not at all what this issue is about.

@JeffBezanson
Copy link
Member

The trait idea is orthogonal; the question would remain as to what the default behavior should be in the absence of trait declarations for a new type.

@vtjnash vtjnash added the triage This should be discussed on a triage call label Jan 15, 2018
@mauro3
Copy link
Contributor

mauro3 commented Jan 16, 2018

Sorry for not having followed up on my promise, life caught up... I don't think I will have time to put work into this just now. @tpapp, you're welcome to work on the traits-stuff if you fancy.

Anyway, now that I've spent some time thinking about this, my 2-cents:

  • a rule like proposed above is too complicated. I still don't fully understand it.
  • equality should not be recursive. Each field-type needs to decide itself how to be compared. Otherwise, it could happen that a == b even though a.f != b.f or vice-versa.
  • I think it would be confusing to have default-equality behave differently on whether a type is mutable or immutable. I might be wrong, but my impression is that the motivation to have immutables and mutables comparing differently by default, is hash-table use. However, that issue is addressed adequately by removing the fall-back hash method.
  • I agree with above discussion that type parameters should not participate in equality (by default), as that is how (almost?) all == methods work.

Thus, this leaves the two solutions:

  1. default comparison field-by-field for all types, or
  2. no default method defined for comparison

@JeffBezanson
Copy link
Member

equality should not be recursive. Each field-type needs to decide itself how to be compared. Otherwise, it could happen that a == b even though a.f != b.f or vice-versa.

Do you mean equality should be recursive?

my impression is that the motivation to have immutables and mutables comparing differently by default, is hash-table use

That's not the reason. The reason is that immutable values with the same contents cannot be distinguished, while mutable values with the same contents can be distinguished by mutating one and seeing if the "other" one also changes.

@mauro3
Copy link
Contributor

mauro3 commented Jan 16, 2018

Do you mean equality should be recursive?

This is probably a miss-understanding. I mean that the following is not desirable:

struct A
  a
  b
end
==(a::A,aa::A) = a.a==a.a # disregard b

struct B
  a::A
end

# with recursion:
B(A(1,2)) != B(A(1,1))
# even though
A(1,2) == A(1,1)

Thus I think "field-by-field comparison" describes a system where B(A(1,2))==B(A(1,1)) better than "recursion".

That's not the reason. The reason is that immutable values with the same contents cannot be distinguished, while mutable values with the same contents can be distinguished by mutating one and seeing if the "other" one also changes.

But if I care about this I should use ===. When == comparing two arrays, I don't expect that they stay == forever after.

@JeffBezanson
Copy link
Member

Thus I think "field-by-field comparison" describes a system where B(A(1,2))==B(A(1,1)) better than "recursion".

Ah, I see now. I don't think anybody has proposed this. We have been talking about making == on B recursively call == on B's a field, which would call the == method defined for A. It's recursive because == would internally call == on a smaller object.

@tpapp
Copy link
Contributor

tpapp commented Jan 17, 2018

A minimally invasive yet reasonably convenient extension of the status quo could be the following:

  1. 3 possible traits for ==:
    a. requiring strict ===, this is the default
    b. requiring types to match exactly, all fields compared by == then, according to their own methods
    c. only Base.typename is required to match, all fields are compared by ==.
  2. hashing is always compatible with the above definition.

This would preserve existing behavior, avoid spurious equality when not intended (eg making a Dict-like type compare every field), yet allow very convenient one-line extensions for the most frequently requested form of equality.

This would also not be a breaking change, and could be done later. Still, if this is acceptable, I would make a try at this, I would love to see this in v1.0. It is asked for very frequently.

@Keno
Copy link
Member

Keno commented Jan 25, 2018

Triage prefers the following solution, which was spelt out above in parts, but not in one place

  • Have the default ==, recursively call isequal
  • Have it ignore any type parameters that appear in the field definitions
  • Have it compare type parameters that do not

This is basically what AutoHashEqual does, except that it seems slightly better
in the (somewhat unusual) case that you have type parameters just used for tagging,
but not in the fields.

@mbauman mbauman removed the triage This should be discussed on a triage call label Jan 25, 2018
@JeffBezanson
Copy link
Member

JeffBezanson commented Jan 25, 2018

I just remembered the original issue calls for doing this for immutable types, but it seems we're leaning towards using it for everything? That brings up (1) object cycles, (2) strange default behavior for things like Pipes.

The way I see it, there are thousands of julia types out there, call the number N. Some number E need to have boilerplate == methods defined. There's a cognitive bias: as soon as you've written that boilerplate 3 or 4 times, it seems annoying, but in fact E ≪ N. If we change the default, then N - E types need a == method that calls === to be safe. Otherwise you can get stack overflows or problematic spurious equalities.

So it was not discussed (much?) on the call, but do we want to limit this to immutables?

@vtjnash
Copy link
Member

vtjnash commented Jan 25, 2018

Pondering this some more after the call, I'm thinking that the current default of not-equal has been a much better default than field-equality (if we're going to define equals at all). For example, it's seems odd that we'll now need to define ==(a::IO, b::IO) = (a === b) to avoid spurious equivalences, such that (after this change) x in [y] might return true for x !== y.

It seems like there's probably many other cases like that too, such as Condition (will be now considered equal if Tasks with the same starting function + state are waiting on them) and Channel (will be equal if they contain the same data, for example, if they're both empty).

@Keno
Copy link
Member

Keno commented Jan 25, 2018

I would continue to favor doing this for mutable types, though I feel less strongly about that. I'm not really concerned about getting StackOverflowErrors for circular references by default, since e.g. the same happens here:

julia> a = Any[]
0-element Array{Any,1}

julia> push!(a, a)
1-element Array{Any,1}:
 Any[Any[#= circular reference @-1 =#]]

julia> a == a
ERROR: StackOverflowError:

@JeffBezanson
Copy link
Member

But it can cause stack overflows in cases where people were relying on the === fallback.

@JeffBezanson
Copy link
Member

I think this is still controversial and doesn't need to be done right now. Moving to 2.0.

@julbinb
Copy link

julbinb commented Jan 15, 2021

It might be handy to at least have structural equality as a standard library function that can be used for user-defined types. For example, I've been using this generated function for most of my types to avoid manual equality definition:

# ASSUMTION: `e1` and `e2` have the same run-time type
@generated structEqual(e1, e2) = begin
    if fieldcount(e1) == 0
        return :(true)
    end
    mkEq    = fldName -> :(e1.$fldName == e2.$fldName)
    # generate individual equality checks
    eqExprs = map(mkEq, fieldnames(e1))
    # construct &&-expression for chaining all checks
    mkAnd   = (expr, acc) -> Expr(:&&, expr, acc)
    # no need in initial accumulator because eqExprs is not empty
    foldr(mkAnd, eqExprs)
end

And then use it to define equality:

Base.:(==)(x :: MyType, y :: MyType) = structEqual(x, y)

@schlichtanders
Copy link

It might be handy to at least have structural equality as a standard library function that can be used for user-defined types. For example, I've been using this generated function for most of my types to avoid manual equality definition:

# ASSUMTION: `e1` and `e2` have the same run-time type
@generated structEqual(e1, e2) = begin
    if fieldcount(e1) == 0
        return :(true)
    end
    mkEq    = fldName -> :(e1.$fldName == e2.$fldName)
    # generate individual equality checks
    eqExprs = map(mkEq, fieldnames(e1))
    # construct &&-expression for chaining all checks
    mkAnd   = (expr, acc) -> Expr(:&&, expr, acc)
    # no need in initial accumulator because eqExprs is not empty
    foldr(mkAnd, eqExprs)
end

And then use it to define equality:

Base.:(==)(x :: MyType, y :: MyType) = structEqual(x, y)

awesome idea!

I included it into a related package I myself build, and extended it to similar cases.
Enjoy https://github.com/jolin-io/StructEquality.jl

@KristofferC
Copy link
Member

@tpapp
Copy link
Contributor

tpapp commented Apr 9, 2022

@schlichtanders: I like that you exposed the function versions as part of the API. My problem with macros like @auto_hash_equals is that they are difficult to compose in case I want multiple extensions to the basic struct functionality.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking This change will break code equality Issues relating to equality relations: ==, ===, isequal help wanted Indicates that a maintainer wants help on an issue or pull request needs decision A decision on this change is needed
Projects
None yet
Development

No branches or pull requests