
Performance and typing of 6+ dimensional generators #21058

Closed
mbauman opened this issue Mar 16, 2017 · 8 comments
Labels
collections Data structures holding multiple items, e.g. sets compiler:inference Type inference performance Must go faster

Comments

@mbauman
Member

mbauman commented Mar 16, 2017

Type inference seems to lose the trail with a 6-dimensional generator:

julia> f5() = sum(+(a,b,c,d,e) for a in 1:1, b in 1:1, c in 1:1, d in 1:1, e in 1:1);

julia> @code_warntype f5()
Variables:
  #self#::#f5
  #21::##21#22
  ⋮
  end::Int64

julia> f6() = sum(+(a,b,c,d,e,f) for a in 1:1, b in 1:1, c in 1:1, d in 1:1, e in 1:1, f in 1:1);

julia> @code_warntype f6()
Variables:
  #self#::#f6
  #23::##23#24
  ⋮
  end::Any

This causes a fairly substantial regression from 0.4.7 when using comprehensions:

g6() = sum([+(a,b,c,d,e,f) for a in 1:4, b in 1:4, c in 1:4, d in 1:4, e in 1:4, f in 1:4])

julia> g6(); @time g6() # 0.4.7
  0.000029 seconds (158 allocations: 74.261 KB)

julia> g6(); @time g6() # 0.6.0-pre.alpha.155
  0.117300 seconds (69.49 k allocations: 7.380 MiB)

(ref http://stackoverflow.com/questions/42826871/slowdown-in-0-5-and-0-6/42827923#comment72790727_42827923)
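For context, on a current Julia this failure mode can be guarded against directly with Test.@inferred (a sketch, assuming Julia ≥ 1.0; the macro throws when the inferred return type is abstract, which is exactly the end::Any case shown above):

```julia
# Sketch of a regression guard, assuming Julia ≥ 1.0: `Test.@inferred`
# errors if the call's inferred return type is abstract (e.g. `Any`).
using Test

g6() = sum([+(a,b,c,d,e,f) for a in 1:4, b in 1:4, c in 1:4, d in 1:4, e in 1:4, f in 1:4])

r = @inferred g6()   # throws unless g6() infers to a concrete type
println(r)           # 6 * sum(1:4) * 4^5 == 61440
```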

@ararslan ararslan added collections Data structures holding multiple items, e.g. sets compiler:inference Type inference performance Must go faster labels Mar 16, 2017
@martinholters
Member

julia> g6=(+(a,b,c,d,e,f) for a in 1:1, b in 1:1, c in 1:1, d in 1:1, e in 1:1, f in 1:1);

julia> Core.Inference.return_type(next, Tuple{typeof(g6), typeof(start(g6))})
Tuple{Int64,Tuple{Int64,Tuple{Int64,Tuple{Int64,Tuple{Int64,#= Note this one! =#Tuple,Nullable{Tuple{Int64,Int64}},Bool},Nullable{Tuple{Int64,Int64,Int64}},Bool},Nullable{NTuple{4,Int64}},Bool},Nullable{NTuple{5,Int64}},Bool}}

@code_warntype next(g6, start(g6)) seems to indicate that the problem originates here:

#temp#@_50::Tuple{Tuple{Int64,NTuple{5,Int64}},Tuple{Int64,Tuple{Int64,Tuple{Int64,Tuple{Int64,Tuple,Nullable{Tuple{Int64,Int64}},Bool},Nullable{Tuple{Int64,Int64,Int64}},Bool},Nullable{NTuple{4,Int64}},Bool},Nullable{NTuple{5,Int64}},Bool}} = (Core.tuple)((Core.tuple)(SSAValue(80), v2@_14::NTuple{5,Int64})::Tuple{Int64,NTuple{5,Int64}}, (Core.tuple)((Core.getfield)((Core.getfield)(SSAValue(3), :a)::UnitRange{Int64}, :start)::Int64, s2@_12::Tuple{Int64,Tuple{Int64,Tuple{Int64,Tuple{Int64,Int64,Nullable{Int64},Bool},Nullable{Tuple{Int64,Int64}},Bool},Nullable{Tuple{Int64,Int64,Int64}},Bool},Nullable{NTuple{4,Int64}},Bool}, $(Expr(:new, :($(QuoteNode(Nullable{NTuple{5,Int64}}))), false)), (Base.getfield)(s2@_12::Tuple{Int64,Tuple{Int64,Tuple{Int64,Tuple{Int64,Int64,Nullable{Int64},Bool},Nullable{Tuple{Int64,Int64}},Bool},Nullable{Tuple{Int64,Int64,Int64}},Bool},Nullable{NTuple{4,Int64}},Bool}, 4)::Bool)::Tuple{Int64,Tuple{Int64,Tuple{Int64,Tuple{Int64,Tuple{Int64,Int64,Nullable{Int64},Bool},Nullable{Tuple{Int64,Int64}},Bool},Nullable{Tuple{Int64,Int64,Int64}},Bool},Nullable{NTuple{4,Int64}},Bool},Nullable{NTuple{5,Int64}},Bool})::Tuple{Tuple{Int64,NTuple{5,Int64}},Tuple{Int64,Tuple{Int64,Tuple{Int64,Tuple{Int64,Tuple,Nullable{Tuple{Int64,Int64}},Bool},Nullable{Tuple{Int64,Int64,Int64}},Bool},Nullable{NTuple{4,Int64}},Bool},Nullable{NTuple{5,Int64}},Bool}}

which comes from this line. Even though the type of s2@_12 has been inferred correctly, we lose type information after wrapping it in two more layers of Tuple. This is a type-complexity-limiting heuristic in inference kicking in. With

--- a/base/inference.jl
+++ b/base/inference.jl
@@ -23,7 +23,7 @@ struct InferenceParams
     function InferenceParams(world::UInt;
                     inlining::Bool = inlining_enabled(),
                     tupletype_len::Int = 15,
-                    tuple_depth::Int = 4,
+                    tuple_depth::Int = 5,
                     tuple_splat::Int = 16,
                     union_splitting::Int = 4,
                     apply_union_enum::Int = 8)

inference of f6() succeeds. So we could try to rewrite the generators to use less complicated types or try to increase tuple_depth.
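To see how quickly that limit is reached, a small throwaway helper (hypothetical; tuple_nesting is not part of Base or inference) can count Tuple nesting in a state type:

```julia
# Hypothetical helper (not an inference API): measure how deeply a type
# nests Tuples; non-tuple types count as depth 0.
function tuple_nesting(T)
    T isa DataType && T <: Tuple || return 0
    isempty(T.parameters) && return 1
    return 1 + maximum(tuple_nesting(P) for P in T.parameters)
end

# The 0.6-era 6-d generator state added one Tuple layer per dimension:
deep = Tuple{Int, Tuple{Int, Tuple{Int, Tuple{Int, Int}}}}
flat = Tuple{Int, Int, Int, Int, Int, Int}

println(tuple_nesting(deep))  # 4 — already at the tuple_depth limit above
println(tuple_nesting(flat))  # 1 — a flat state stays shallow
```

With one extra layer per dimension, any fixed tuple_depth is exceeded at some dimensionality, so raising the limit only moves the cliff.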

@martinholters
Member

FWIW, I've tried setting tuple_depth to 8 and the time needed for make test showed nothing suspicious. Should we do that?

@vtjnash
Member

vtjnash commented Mar 17, 2017

No, we should probably make it smaller. But we don't currently really have any tests for it.

@martinholters
Member

No, we should probably make it smaller.

Oh. Would you then suggest rewriting the generator/iterator code to improve its inferability (or at least not make it worse), or do you have additional changes to type inference in mind that would help here?

@Jutho
Contributor

Jutho commented Aug 12, 2017

This seems to be fixed by the new product iterator (#22989).
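As a sketch of the fix (assuming a Julia that includes #22989, i.e. 0.7 or later, where the product-iterator state no longer nests one layer per dimension), the 6-dimensional generator's inferred return type can be checked directly:

```julia
# Sketch, assuming a Julia ≥ 0.7 with the reworked product iterator:
# the 6-d generator sum now infers to a concrete return type.
f6() = sum(+(a,b,c,d,e,f) for a in 1:1, b in 1:1, c in 1:1, d in 1:1, e in 1:1, f in 1:1)

println(Base.return_types(f6, Tuple{}))  # expected: [Int64], not [Any]
println(f6())
```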

@tkelman
Contributor

tkelman commented Aug 12, 2017

Is it sufficiently tracked by existing benchmarks that we'd notice it regressing, or should we add the needs-benchmark label?

@Jutho
Contributor

Jutho commented Aug 12, 2017

Good question; this might need either a benchmark or an @inferred test (or both), especially since inference of certain other tuple manipulations seems to have regressed on master (#22513).

@mbauman mbauman added the potential benchmark Could make a good benchmark in BaseBenchmarks label Aug 24, 2017
@KristofferC
Member

Fixed but would still be good to add a benchmark (which is why the label is there).
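In the meantime, a minimal stand-in check using only Base (a sketch; a proper BaseBenchmarks entry would use BenchmarkTools and @benchmarkable instead of @allocated):

```julia
# Sketch of a cheap regression check using only Base: once inference
# succeeds, g6() should allocate little beyond the 4^6-element result
# array (~32 KiB), nothing like the 7+ MiB seen in the regression.
g6() = sum([+(a,b,c,d,e,f) for a in 1:4, b in 1:4, c in 1:4, d in 1:4, e in 1:4, f in 1:4])

g6()                       # warm up so compilation is not measured
bytes = @allocated g6()
println(bytes)             # should stay in the tens of KiB
```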
