Explicit array conversion (e.g., array(), asarray()) #122
We kind of left that out on purpose, because there's so much variation in how libraries do that. We've got
Those also have significant variation in what they accept (e.g. do they deal with generators, objects which implement the buffer protocol, etc.). The idea was:
Perhaps we should reconsider, being able to do
I agree. If we have other array creation functions, I expect this will feel like an obvious missing gap. Otherwise I expect we would see users writing code like For what it's worth, in
So the existence of
Makes sense. We need only one function I think -
Some thoughts:
On the other hand, a
We also need builtin Python scalars: I would skip generators -- they have unknown size, which means the resulting arrays can't be allocated at once. It's easy enough to require users to cast with
I'm not sure it's worth dropping the buffer protocol. It's used all over the place, including in Python's standard library, and it works just fine (especially for numeric types). Consider a library like Pillow -- do they really gain anything from implementing
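A minimal sketch of what buffer-protocol support looks like in practice, assuming NumPy's `asarray` as a stand-in for the function under discussion. The standard-library `array.array` is used here only as an example of a non-NumPy object that exports a buffer; the variable names are illustrative.

```python
import array

import numpy as np

# A standard-library container that exposes its memory via the buffer protocol.
buf = array.array("d", [1.0, 2.0, 3.0])

# memoryview() works on any buffer exporter and reports the element format.
mv = memoryview(buf)
element_format = mv.format      # struct-style type code of each element
element_size = mv.itemsize      # size of one element in bytes

# NumPy can consume the buffer directly and infers the dtype from its format.
arr = np.asarray(buf)
```

An array library that accepted such objects would need to interpret the buffer's format, shape, and strides itself, which is the implementation cost raised in the replies.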
I like the ideas of
@shoyer One thing you said is in conflict: If we'd like to support the buffer protocol, it seems best to keep
It doesn't hurt though, and it can help. Even contiguous-only arrays can have C and Fortran order for
I was thinking that is because numpy has
Agree, the unknown size is a good argument to drop them.
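A small sketch of the generator point, again assuming NumPy's `asarray` as a stand-in for the proposed function. `squares` is a hypothetical generator written for this example; the conversion through `list()` is one way a user could materialize the values first.

```python
import numpy as np


def squares(n):
    """Yield n squares lazily; a consumer cannot know the length in advance."""
    for i in range(n):
        yield float(i * i)


# Materialize the generator first. A list has a known length, so the array
# can be allocated in a single step.
values = np.asarray(list(squares(4)))
```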
The trouble is, if we include it then we are mandating that everyone implement support for it, which is a pain. Mpi4py and Pillow could easily document that users should convert to a numpy or cupy array as an intermediate. Also, considering that in downstream library functions we only want to accept conforming array objects anyway and not mpi4py/Pillow objects, that's a very minor thing to ask. On the other hand, making array libraries implement the buffer protocol just for
That's right, it's not part of the user facing API. (More specifically, I was thinking TF/JAX only support C order arrays, but that may actually be an implementation detail that is not necessarily true on all platforms...)
TensorFlow seems to do just fine without either. Another reason for why
Pretty sure they're passing Fortran-ordered arrays to LAPACK implementations, better for performance.
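To make the memory-order point concrete, here is a minimal sketch using NumPy, which is one of the libraries that exposes an `order` keyword. It only shows how C and Fortran layouts differ for the same values; it says nothing about how any LAPACK wrapper is implemented.

```python
import numpy as np

# Default layout is C order (row-major): the last axis varies fastest in memory.
a = np.arange(6.0).reshape(2, 3)

# Same values, but stored column-major (Fortran order).
f = np.asarray(a, order="F")

c_strides = a.strides   # bytes to step along each axis in the C-ordered array
f_strides = f.strides   # bytes to step along each axis in the Fortran-ordered array
```

The two arrays compare equal element by element; only the strides, and therefore the memory traversal order, differ.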
That's good to know. It's easy enough to do some other way, e.g. with
Okay, so I think we're arriving at:
Actually, I take that back. They're documented as part of the C API, however (from here):
I'm not sure either way now, the docs are very bad. It claims they're simple C structures, but then all signatures contain
Got a good answer thanks to Pearu: using
So we drop order?
I don't have a strong opinion either way, but find @shoyer's argument mildly convincing - if it's a feature that JAX and TensorFlow do not expose on purpose, then they'd have a keyword that they will just ignore. On the other hand, NumPy/CuPy/PyTorch/Dask/MXNet all support it just fine, and there's no user-noticeable effort if JAX/TF ignored it.
Re buffer protocol: if we support it, then we should probably also support
The trouble with If a library only supports C-contiguous arrays, there can only be one value (default), and the keyword won't get used. For libraries with strided support
Not superseded, the buffer protocol is C-only and
It is possible for JAX/TF to accept and just ignore
Also, if a library only has C ordered arrays, then this all still makes sense:
Opened gh-130 to add
Reading through the standard, it appears that we may have missed an important feature: the ability to explicitly coerce objects into a desired array type, either from builtin Python types like float/list or other array libraries. In other words, we need something like NumPy's array() and/or asarray() functions.
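For readers less familiar with the two NumPy functions named above, a minimal sketch of how they behave today. This shows NumPy's existing behaviour only, not what the standard would specify; the variable names are illustrative.

```python
import numpy as np

data = [1.0, 2.0, 3.0]

# Coercing a builtin list always allocates a new array.
x = np.asarray(data)

# asarray() returns an existing ndarray unchanged when no conversion is needed.
same = np.asarray(x)

# array() copies its input by default.
copied = np.array(x)

# Builtin Python scalars become zero-dimensional arrays.
scalar = np.asarray(2.5)
```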