Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A confusing bug #151

Closed
putianyi889 opened this issue Feb 20, 2024 · 9 comments
Closed

A confusing bug #151

putianyi889 opened this issue Feb 20, 2024 · 9 comments

Comments

@putianyi889
Copy link

julia> testdomain=UnionDomain(UnionDomain(EmptySpace(),EmptySpace()), Point{Int}(100))
D  Point(100.0)

D = {} (empty domain)  {} (empty domain)

julia> setdiff(testdomain, OpenInterval(68..38))
D  Point(100.0)

D = {} (empty domain)  {} (empty domain)

julia> setdiff(testdomain, OpenInterval(68..40))
D  Point(100.0)

D = {} (empty domain)  {} (empty domain)

julia> setdiff(testdomain, OpenInterval(68..39))
ERROR: MethodError: reducing over an empty collection is not allowed; consider supplying `init` to the reducer
Stacktrace:
  [1] reduce_empty(op::Base.MappingRF{typeof(domaineltype), Base.BottomRF{typeof(promote_type)}}, ::Type{EmptySpace{Float64}})
    @ Base .\reduce.jl:361
  [2] reduce_empty_iter
    @ .\reduce.jl:384 [inlined]
  [3] reduce_empty_iter
    @ .\reduce.jl:383 [inlined]
  [4] foldl_impl
    @ .\reduce.jl:49 [inlined]
  [5] mapfoldl_impl
    @ .\reduce.jl:44 [inlined]
  [6] mapfoldl
    @ .\reduce.jl:175 [inlined]
  [7] mapreduce
    @ .\reduce.jl:307 [inlined]
  [8] promote_domains(domains::Set{EmptySpace{Float64}})
    @ DomainSets C:\Users\pty\.julia\packages\DomainSets\cPaBB\src\generic\domain.jl:27
  [9] UnionDomain(domains::Any)
    @ DomainSets C:\Users\pty\.julia\packages\DomainSets\cPaBB\src\generic\setoperations.jl:40 [inlined]
 [10] setdiffdomain1(d1::UnionDomain{Float64, Tuple{EmptySpace{Float64}, EmptySpace{Float64}}}, d2::OpenInterval{Float64})
    @ DomainSets C:\Users\pty\.julia\packages\DomainSets\cPaBB\src\generic\setoperations.jl:154
 [11] setdiffdomain
    @ DomainSets C:\Users\pty\.julia\packages\DomainSets\cPaBB\src\generic\setoperations.jl:317 [inlined]
 [12] _broadcast_getindex_evalf(::typeof(setdiffdomain), ::UnionDomain{Float64, Tuple{…}}, ::OpenInterval{Float64})
    @ Base.Broadcast .\broadcast.jl:709
 [13] _broadcast_getindex
    @ Base.Broadcast .\broadcast.jl:682 [inlined]
 [14] #31
    @ Base.Broadcast .\broadcast.jl:1118 [inlined]
 [15] ntuple(f::Base.Broadcast.var"#31#32"{Base.Broadcast.Broadcasted{}}, ::Val{2})
    @ Base .\ntuple.jl:49
 [16] copy
    @ .\broadcast.jl:1118 [inlined]
 [17] materialize
    @ .\broadcast.jl:903 [inlined]
 [18] setdiffdomain1(d1::UnionDomain{Float64, Tuple{UnionDomain{…}, Point{…}}}, d2::OpenInterval{Float64})
    @ DomainSets C:\Users\pty\.julia\packages\DomainSets\cPaBB\src\generic\setoperations.jl:155
 [19] setdiffdomain
    @ DomainSets C:\Users\pty\.julia\packages\DomainSets\cPaBB\src\generic\setoperations.jl:317 [inlined]
 [20] setdiff(d1::UnionDomain{Float64, Tuple{UnionDomain{Float64, Tuple{}}, Point{Float64}}}, d2::OpenInterval{Int64})
    @ DomainSets C:\Users\pty\.julia\packages\DomainSets\cPaBB\src\generic\setoperations.jl:315
 [21] top-level scope
    @ REPL[96]:1
Some type information was truncated. Use `show(err)` to see complete types.

julia> setdiff(testdomain, OpenInterval(67..39))
D  Point(100.0)

D = {} (empty domain)  {} (empty domain)

julia> setdiff(testdomain, OpenInterval(69..39))
D  Point(100.0)

D = {} (empty domain)  {} (empty domain)
@daanhb
Copy link
Member

daanhb commented Feb 20, 2024

That's... weird. I can reproduce it locally.

@daanhb
Copy link
Member

daanhb commented Feb 20, 2024

I'm not sure what causes the error: there mat be some random element to working with a Set. However, the underlying problem is that setdiffdomain1 is trying to be smart (in setoperations.jl), which it shouldn't. I'm pretty sure that fixing this routine will make the error go away.

Still, I'd like to chase the original error, because different types arise in intermediate steps, whereas the input types are all the same. It's likely that hashes of domains which are equal are different, but a real clash is very unlikely. More likely this is a bug somewhere.

@putianyi889
Copy link
Author

another case: setdiff(UnionDomain(EmptySpace(),EmptySpace()),OpenInterval(97..84))

@daanhb
Copy link
Member

daanhb commented Feb 20, 2024

So the culprit of the error is this function. The error indeed goes away with a different implementation of the same logic, like this one:

function setdiffdomain1(d1::UnionDomain, d2)
	if d2  components(d1)
		uniondomain(filter(x->x!=d2, components(d1))...)
	else
		UnionDomain(setdiffdomain.(components(d1), Ref(d2)))
	end
end

This might not be the best implementation if the components of the union object are a vector (because of the splatting).

Looking at this code now, I would not write these routines again (the specialization of setdiffdomain for arguments of type UnionDomain I mean). They are optimizations. They may be nice in some cases, but they make the result of setdiffdomain less predictable. Users should not be relying on such optimizations.

I say that because you're working on unions of intervals. Please just ignore whatever DomainSets does and make your own implementation, it does not matter that it's different from what happens here. It is probably easier to reason about (predictable) optimizations when working solely with intervals.

(For the record, I know your example is cooked up, but the rule in the code is that UnionDomain always returns a UnionDomain object, but uniondomain is free to simplify. The same with SetdiffDomain and setdiffdomain, etcetera. The uniondomain function would never return a union of empty sets.)

@daanhb
Copy link
Member

daanhb commented Feb 20, 2024

Thanks for reporting, this is fun. I went digging into the internals of dictionaries. The difference between the (empty) open intervals happens at this line in dict.jl in base. It appears normal for there to be a difference here and it is not usually a problem. However, a few lines down the test for equality succeeds, in spite of the hashes of the EmptySpace and the open interval being different, because the open interval is equal to an empty interval. Thus, the outcome is different for different open intervals, even though all the types are the same.

Conclusion: the problem is that hash of two domains may be different even when they are equal according to isequal. The difference between hashes and equality makes it harder for dictionaries to decide whether or not an element is a key in the dictionary. The implementation of dict has some heuristics to allow for this, but they only succeed with a certain probability. That probability is not very high, so that varying the open interval a little bit made you find it.

This is a good catch! I've been wondering about the consequences of not respecting equavalence of equality and hashes. Unfortunately, this problem is hard to fix. It may be doable for intervals, but it will be a lot harder for domains.

I am not 100% sure of my explanation because I do not understand the internals of dictionaries.

@daanhb
Copy link
Member

daanhb commented Feb 20, 2024

Here is an example of the problem with isequal and hash acting differently. I'm sure this is known.

I am creating a dictionary with a single object. I create a long vector of elements equal to that one object, but with a different hash. (The hash is different because a has a different value for each element.) I then ask the dictionary whether it has that element as a key for each of those elements. In my test it responded positively 5 times and negatively 9995 times:

julia> struct MyType
              a
              end

julia> Base.isequal(d1::MyType, d2::MyType) = true

julia> objects = [MyType(rand()) for i in 1:10000];

julia> d = Dict(MyType(2.0) => nothing)
Dict{MyType, Nothing} with 1 entry:
  MyType(2.0) => nothing

julia> sum(haskey(d, objects[i]) for i in 1:length(objects))
5

The correct answer here would be yes 10000 times.

@putianyi889
Copy link
Author

putianyi889 commented Feb 20, 2024

Thanks for the investigation!

The issue is not cooked up. It happens when I test the union of intervals. The way I check the correctness is to setdiff every component from the union and check if the final result is empty.

A case that looks more common: iv"(4,10)" \ iv"(7,9]" \ iv"(9,10)" \ iv"(4,7]"

@daanhb
Copy link
Member

daanhb commented Feb 21, 2024

Fixed by #152, could you confirm that?

@putianyi889
Copy link
Author

Fixed by #152, could you confirm that?

My tests are passing.

@daanhb daanhb closed this as completed Feb 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants