ENH: implement at
#53
base: main
376ad49 to 4a0943a
Rebased on top of #58
Any idea how to fix this? Looks like the ubuntu_latest VM has an obsolete driver (or, more likely, no driver).
src/array_api_extra/_funcs.py
Outdated
if copy is False:
    if is_array_api_obj(self.idx):
        # Boolean index. Note that the array API spec
        # https://data-apis.org/array-api/latest/API_specification/indexing.html
        # does not allow for list, tuple, and tuples of slices plus one or more
        # one-dimensional array indices, although many backends support them.
        # So this check will encounter a lot of false negatives in real life,
        # which can be caught by testing the user code vs. array-api-strict.
        msg = "get() with an array index always returns a copy"
        raise ValueError(msg)

    # Prevent scalar indices together with copy=False.
    # Even if some backends may return a scalar view of the original, we chose to be
    # strict here because some other backends, such as numpy, definitely don't.
    tup_idx = self.idx if isinstance(self.idx, tuple) else (self.idx,)
    if any(
        i is not None and i is not Ellipsis and not isinstance(i, slice)
        for i in tup_idx
    ):
        msg = "get() with a scalar index typically returns a copy"
        raise ValueError(msg)

    # Note: this is not the same list of backends as is_writeable_array()
    if is_dask_array(x) or is_jax_array(x) or is_pydata_sparse_array(x):
        msg = f"get() on {array_namespace(x)} arrays always returns a copy"
        raise ValueError(msg)
These checks are very brittle and I'm not comfortable going on with get() as it is.
I can see two ways forward:
- Write a new function in array-api-compat, getitem_returns_view(x: Array, idx: Index) -> bool, that explores all the nooks and crannies of all the possible intersections of arrays and indices, then use that function here
- Remove get() altogether. I have a strong suspicion it may not be needed in scipy to begin with. get(copy=False) is not portable anyway, and get(copy=True) can be trivially rewritten as asarray(x[idx], copy=True)
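For illustration, the index-only part of the checks in the diff above can be sketched in plain Python. The name index_may_return_view is hypothetical and is not the proposed getitem_returns_view; it looks only at the index and ignores the backend, so a True result means "a view is possible", never "a view is guaranteed".

```python
def index_may_return_view(idx):
    """Hypothetical helper, not part of array-api-compat or array-api-extra.

    Returns True only if every component of the index is a slice, None
    or Ellipsis, i.e. the only cases the diff above lets through.
    The backend is deliberately ignored here.
    """
    tup_idx = idx if isinstance(idx, tuple) else (idx,)
    return all(
        i is None or i is Ellipsis or isinstance(i, slice)
        for i in tup_idx
    )


print(index_may_return_view(slice(None, 2)))         # plain slice
print(index_may_return_view((Ellipsis, None)))       # Ellipsis plus new axis
print(index_may_return_view(0))                      # scalar index
print(index_may_return_view((slice(1, 3), [0, 1])))  # slice plus list index
```

The first two calls print True and the last two print False. A real implementation would additionally need the per-backend checks (Dask, JAX and pydata/sparse are treated as always copying in the diff above).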
The only use I can think of for get(copy=True) would be to avoid an unnecessary double copy when idx is a boolean mask, and that's assuming that the user either
a. doesn't know ahead of time that they're using a boolean mask (possible, but not terribly likely), or
b. doesn't trust that x[bool_mask_idx] will always return a deep copy no matter what, with all backends, e.g. no backend will try to coerce the mask into a slice
I've removed get(), at least for the time being.
I don't think we can get GPU CI without paying someone for it, cc @rgommers.
Can cupy run on a CPU-only host?
Yep, that's on my radar to push forward this month, on multiple projects. Please feel free to open a new issue and assign it to me. I think we can hook up a shared GPU runner between this project and
I don't think so.
I gave a final round of polish. This is ready for final review and approval.
I resolved the merge conflicts after adding
@lucascolley if you don't mind building against array-api-compat git tip until their next release, this PR is ready for final review and merge
@lucascolley if you don't mind building against array-api-compat git tip until their next release, this PR is ready for final review and merge
That's fine with me, but might be cleaner to get an array-api-compat release out first unless there are any big blockers.
Let's give @rgommers and @jakevdp time to take a look if they would like to.
Once (or before) this is merged, do you think you could make a PR to my branch at scikit-learn/scikit-learn#30340? I think we will have to transition scikit-learn from using array-api-compat as an optional dependency to vendoring it, but that sounds feasible based on scikit-learn/scikit-learn#30367 (comment).
    msg = "Index has already been set"
    raise ValueError(msg)
self._idx = idx
return self
Maybe it would make sense to return a shallow copy rather than mutating self. Otherwise, someone might write something like this and be surprised by the behavior:
getter = at(x)
y = getter[0].add(1)
z = getter[1].add(2)
if res is not None:
    return res
assert x is not None
x[self._idx] = elwise_op(x[self._idx], y)
As currently implemented, the _iop path has different semantics for repeated indices between JAX and NumPy:
>>> import numpy as np
>>> x = np.zeros(4)
>>> idx = np.array([1, 2, 2, 3, 3, 3])
>>> x[idx] = x[idx] + 1
>>> x
array([0., 1., 1., 1.])
>>> import jax.numpy as jnp
>>> x = jnp.zeros(4)
>>> idx = jnp.array([1, 2, 2, 3, 3, 3])
>>> x.at[idx].add(1)
Array([0., 1., 2., 3.], dtype=float32)
At the very least, the difference should be documented.
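To make the two behaviors concrete, here is a pure-Python model over lists. The function names buffered_add and accumulating_add are hypothetical and only model the semantics shown in the two sessions above; they are not code from either library.

```python
def buffered_add(x, idx, y):
    """Models x[idx] = x[idx] + y: gather first, then scatter.

    Every gathered value is computed from the original x, so a repeated
    index is written several times with the same value.
    """
    gathered = [x[i] + y for i in idx]
    out = list(x)
    for i, v in zip(idx, gathered):
        out[i] = v
    return out


def accumulating_add(x, idx, y):
    """Models x.at[idx].add(y) in JAX: each occurrence of an index adds y."""
    out = list(x)
    for i in idx:
        out[i] += y
    return out


x = [0.0, 0.0, 0.0, 0.0]
idx = [1, 2, 2, 3, 3, 3]
print(buffered_add(x, idx, 1))      # [0.0, 1.0, 1.0, 1.0]
print(accumulating_add(x, idx, 1))  # [0.0, 1.0, 2.0, 3.0]
```

In NumPy itself, the accumulating behavior is available through np.add.at(x, idx, 1), which performs an unbuffered in-place addition; whether at() should use it on the NumPy path is a design question left open here.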
Implement a new at(x, idx) or at(x)[idx] function, mimicking the syntax of JAX's method of the same name. This is a prerequisite for JAX support in libraries that support the Array API, e.g. scipy.
Moved from data-apis/array-api-compat#205
Blockers