Broadcast and linear indexing #32051
Comments
Broadcast supports non-AbstractArrays, so we can't ask all arguments for their At the same time, we should also add support for more array-like
We could specialize on I am half of a mind to propose
Ok, maybe I'm just blind, but how?
Oh, never mind, I'm wrong. This can't be done type-stably because the decision to extrude is made not in the type domain but on runtime values (sizes). Thus we cannot have a type stable
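To make the point about runtime sizes concrete, here is a minimal sketch (the variable names are illustrative only). Two matrices can share exactly the same type while only one of them needs extrusion, so the type alone cannot tell the compiler which indexing scheme is valid:

```julia
a = ones(3, 4)
b = ones(3, 1)   # same type as `a`, but its second dimension has size 1

# Both are Matrix{Float64}; the difference lives only in the runtime sizes.
@assert typeof(a) === typeof(b)

# Broadcasting extrudes `b` along its singleton dimension:
# the single column of `b` is reused for every column of `a`.
c = a .+ b
@assert size(c) == (3, 4)
```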
Just to be clear (because obviously I forgot what restrictions we had here myself):
There is still a case where this would be type-stable: it's where there is only one non-zero-dimensional argument and it's
Merde, this is bad. After reading up on that PR of yours, I don't know what to do about this. Thanks for the explanation! In 2.0, we should consider making broadcast extrusion explicit (i.e. missing dims get silently extruded such that
I mean, it's a tradeoff. It would suck to lose the ability to do things like It is something I've considered, but without the ability to encode singleton dimensions in the type system I think it's a non-starter. Doing this generally was something we talked about in JuliaLang/LinearAlgebra.jl#42 but dismissed as being far too complicated. The introduction of an orthogonal syntax like
Too bad, I had hoped for something similar for GPU arrays since the run-time index calculations are pretty costly there too.
You're right, some of the use cases for the current extrusion behavior are compelling. So I guess we could prepare a PoC branch that emits both linear and cartesian indexing, with a single hoisted runtime check. That would blow up code size and compiler latency, but improve runtime perf in most cases. Having such a branch ready would allow us to make more informed decisions on the tradeoffs (code complexity, readability of compilation output, compiler latency, runtime performance), and may be useful for people who prioritize runtime over compile-time perf. Also, the tradeoff may change in the future (compiler latency could get better). It may be useful to also offer a keyword arg to
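A rough sketch of what "both code paths behind a single hoisted runtime check" could look like for one fixed operation. The function name `add_hoisted!` is hypothetical and chosen for illustration; this is not how Base implements broadcasting, only the shape of the idea:

```julia
# Hypothetical helper, for illustration only.
function add_hoisted!(dest::Array, a::Array, b::Array)
    if axes(a) == axes(dest) && axes(b) == axes(dest)
        # Fast path: nothing is extruded, so a single linear index is valid
        # for every argument and the loop is easy to vectorize.
        @inbounds @simd for i in eachindex(dest)
            dest[i] = a[i] + b[i]
        end
    else
        # Slow path: fall back to the ordinary broadcast machinery,
        # which handles extruded (size-1) dimensions.
        dest .= a .+ b
    end
    return dest
end
```

The check runs once per call rather than once per element, but both loops have to be compiled, which is the code size and latency cost mentioned above.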
It would be nice if we could propagate `IndexStyle` for broadcasts. The issue is the following:

Currently, broadcasts always use cartesian indexing. This is slow and prevents a lot of SIMD.

In the relatively common case that `dest` and all `args` support linear indexing, and the only cases of dropped dimensions are zero-dimensional (as above), we should use linear indexing for a significant speedup (up to 20x), especially if the first dimension is small (which is a very common occurrence).
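For illustration, the two indexing styles can be written out by hand for the case of one array plus a scalar. The helper names `add_cartesian!` and `add_linear!` are assumptions made for this sketch and do not exist in Base; it only shows the two loop shapes and makes no claim about the actual speedup:

```julia
# Hypothetical helpers, for illustration only.

# Cartesian indexing: one multi-dimensional index per element.
function add_cartesian!(dest::Array, a::Array, s::Number)
    @inbounds for I in CartesianIndices(dest)
        dest[I] = a[I] + s
    end
    return dest
end

# Linear indexing: a single flat loop, valid because `dest` and `a`
# share the same shape and the scalar needs no index at all.
function add_linear!(dest::Array, a::Array, s::Number)
    @inbounds @simd for i in eachindex(dest, a)
        dest[i] = a[i] + s
    end
    return dest
end
```

With a small first dimension, for example a `2×1000` matrix, the cartesian version has an inner loop of only two iterations, which is the situation the issue singles out as particularly hard to vectorize.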