-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fast versions of isequal() and isless() for Nullables #18304
Conversation
4f63108
to
b2acbeb
Compare
isequal(x::Nullable, y::Nullable) | ||
|
||
If neither `x` nor `y` is null, compares them according to their values | ||
(i.e. `isless(get(x), get(y))`). Else, returns `true` if both arguments are null, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isequal?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Funny, I remember checking exactly this because I knew I was going to make this mistake. Will fix in next round of updates.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
## Operators | ||
|
||
""" | ||
null_safe_op(f::Any, ::Type)::Bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
uninit_safe_op
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, there may be better names. uninit
sounds too much like an action, though. Maybe undef_safe_op
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
undef_op_is_safe
?
Don't spend too much time on this; if this feature will be useful in other places as well (map
on an array of nullables?), then we can change the name later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
...and this variant sounds like op
is undef to me. Finding names is hard. :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this is a combination of three things:
- the operation is expected to be not much more expensive than a branch, so the optimization is reasonably profitable
- the type is pointer free and so cannot be #undef
- the operation is supported and will not fail on all patterns of bits possible
so perhaps isbitstotal
? Like isbits
, but also total under a certain operation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Total operation" does not evoke much to me, but maybe it's just my lacking mathematical culture. Another possibility is to call this eager_op
or null_eager_op
.
Opinions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Honestly, I think null_safe_op
is fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
null_safe_op
is fine by me as well. In math terms, this is basically an assertion that an operation is uniformly safe over the space of all bit patterns, but I think that term is too mathematic to make sense in our codebase.
a375963
to
3b74f93
Compare
More comments? |
Technically breaking (but minorly so); needs review and acceptance. |
See JuliaCI/BaseBenchmarks.jl#24 for benchmarks (I'll add |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is good by me, but I think we might be able to simplify and generalize the implementation by using a tuple of types.
always computing the result even for null values, a branch is avoided, which helps | ||
vectorization. | ||
""" | ||
null_safe_op(f::Any, ::Type) = false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we replace this definition and the one below with something more like code_llvm
, where the second argument is always a tuple of types? That way we'd hand arbitrary arity functions in one function definition, while still getting appropriate specialization.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have a case in mind where varargs wouldn't work the same as a tuple? I followed the style of promote_op
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure. Constructing varargs seems superfluous given that this doesn't happen in user-facing code, but I don't know how different the performance would be, so I'm not confident it's worth debating much.
returning `true` means that the operation may be called on any bit pattern without | ||
throwing an error (though returning invalid or nonsensical results is not a problem). | ||
In particular, this means that the operation can be applied on the whole domain of the | ||
type *and on uninitialized objects*. As a general rule, these proporties are only true for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
properties
isequal(x::Nullable, y::Nullable{Union{}}) = x.isnull | ||
|
||
""" | ||
isless(x::Nullable, y::Nullable) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
was this elsewhere in helpdb or the rst? should it be added to the rst if not?
edit: likewise with isequal
always computing the result even for null values, a branch is avoided, which helps | ||
vectorization. | ||
""" | ||
null_safe_op(f::Any, ::Type) = false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure. Constructing varargs seems superfluous given that this doesn't happen in user-facing code, but I don't know how different the performance would be, so I'm not confident it's worth debating much.
vectorization. | ||
""" | ||
null_safe_op(f::Any, ::Type) = false | ||
null_safe_op(f::Any, ::Type, ::Type) = false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My concern is really to remove this line and the precedent it seems to set for an ever-increasing number of customized-by-arity functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could write these as null_safe_op(f::Any, ::Type...) = false
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's ok by me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, null_safe_op(f::Any, ::Type, ::Type...) = false
.
Introduce the null_safe_op() function to allow declaring which combinations of operators and types are safe (i.e. can be computed without branching even when null). This function will be used for other operators in the future.
3b74f93
to
689e139
Compare
The new version should address all comments. |
I don't see any methods right now that would make |
Wait, I'm confused: isn't |
This method follows the semantics of the already implemented isequal(::Nullable, ::Nullable). It is needed to sort arrays of Nullable.
689e139
to
2660cb0
Compare
Good catch, looks like I forgot to add these (or more precisely lost them when rebasing my old branch). Should be good now. Unfortunately, tests cannot check whether the fast path was used for these operations; for those which return a |
This enables SIMD for basic types when null_safe_op() is true. Inlining of the isequal() call on values still depends on the element type.
Let's see whether the new benchmarks are ready: @nanosoldier |
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels |
This is great! 👍 |
OK, as expected the tests for @jrevels Maybe the |
This is a minimal version of #16988, including only its less controversial parts. From the numerous discussions we've had on the subject, it seems nobody disagrees about the semantics of
isequal(::Nullable, ::Nullable)::Bool
(which is already implemented, but slow) and ofisless(::Nullable, ::Nullable)::Bool
. These basically mirror those ofNaN