GH-40384: [Python] Expand the C Device Interface bindings to support import on CUDA device #40385

Merged

Conversation

jorisvandenbossche
Member

@jorisvandenbossche jorisvandenbossche commented Mar 6, 2024

Rationale for this change

Follow-up on #39979, which added the _export_to_c_device/_import_from_c_device methods, but for now only for CPU devices.

What changes are included in this PR?

  • Import pyarrow.cuda before importing data through the C Interface, so that the CUDA device is registered
  • Add tests for exporting/importing with the device interface on CUDA
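
As an illustration of the first bullet, a minimal sketch of importing a module purely for its import-time side effects (here, device registration) could look like the following. The helper name `ensure_device_module_loaded` is hypothetical and not part of pyarrow; `pyarrow.cuda` is only importable when pyarrow was built with CUDA support.

```python
import importlib


def ensure_device_module_loaded(module_name="pyarrow.cuda"):
    """Import a module only for its import-time side effects.

    Returns True if the import succeeded and False if the module
    is not available (for example, a build without CUDA support).
    """
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True
```

On a build without CUDA, `ensure_device_module_loaded()` returns `False` and CPU-only imports are unaffected.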

Are these changes tested?

Yes, added tests for CUDA.


github-actions bot commented Mar 6, 2024

⚠️ GitHub issue #40384 has been automatically assigned in GitHub to PR creator.

@github-actions github-actions bot added awaiting changes and removed awaiting committer review labels Mar 19, 2024
@github-actions github-actions bot added awaiting change review and removed awaiting changes labels Mar 19, 2024
@jorisvandenbossche
Member Author

@github-actions crossbow submit test-cuda-python


Revision: 05dd7a5

Submitted crossbow builds: ursacomputing/crossbow @ actions-b585ced356

Task Status
test-cuda-python GitHub Actions

@jorisvandenbossche
Member Author

@github-actions crossbow submit test-cuda-python


Revision: 83b0d58

Submitted crossbow builds: ursacomputing/crossbow @ actions-17a7524191

Task Status
test-cuda-python GitHub Actions

@jorisvandenbossche jorisvandenbossche marked this pull request as ready for review March 19, 2024 15:32
@jorisvandenbossche jorisvandenbossche changed the title from "GH-40384: [Python] Expand the the C Device Interface bindings to support CUDA device" to "GH-40384: [Python] Expand the the C Device Interface bindings to support import on CUDA device" Mar 19, 2024
@@ -965,6 +965,56 @@ def read_record_batch(object buffer, object schema, *,
    return pyarrow_wrap_batch(batch)


def _import_device_array_cuda(in_ptr, type):
Member

Didn't we decide at some point that we would have a registration system on the C++ side, so that per-device type memory mappers could be added at library initialization time?

Member Author

Ah, good point, I forgot about that, given that this conditional import works in Python. Will look into the registry!

Member Author

Registry was added in #40699
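
The registration idea discussed in this thread can be sketched in plain Python as a mapping from device type to a factory function. This only illustrates the pattern; the actual registry lives in Arrow C++ (see #40699), and every name below is hypothetical.

```python
# Hypothetical registry: device type (int) -> callable(device_id) -> memory manager
_DEVICE_MAPPERS = {}


def register_device_mapper(device_type, mapper):
    """Register a mapper for a device type; refuse duplicate registrations."""
    if device_type in _DEVICE_MAPPERS:
        raise KeyError(f"mapper already registered for device type {device_type}")
    _DEVICE_MAPPERS[device_type] = mapper


def get_device_mapper(device_type):
    """Look up the mapper, failing clearly when the device was never registered."""
    try:
        return _DEVICE_MAPPERS[device_type]
    except KeyError:
        raise NotImplementedError(
            f"no mapper registered for device type {device_type}"
        ) from None
```

With such a registry, a device-specific library registers itself once at initialization time, and the generic import path only needs the lookup.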

@github-actions github-actions bot added awaiting change review and removed awaiting changes labels Mar 27, 2024

Revision: 73bfca1

Submitted crossbow builds: ursacomputing/crossbow @ actions-f86b23b51b

Task Status
test-cuda-python GitHub Actions

@jorisvandenbossche jorisvandenbossche changed the title from "GH-40384: [Python] Expand the the C Device Interface bindings to support import on CUDA device" to "GH-40384: [Python] Expand the C Device Interface bindings to support import on CUDA device" Apr 1, 2024
@jorisvandenbossche
Member Author

@github-actions crossbow submit test-cuda-python


github-actions bot commented Apr 8, 2024

Revision: 78a7fb5

Submitted crossbow builds: ursacomputing/crossbow @ actions-bba6bd0230

Task Status
test-cuda-python GitHub Actions

@github-actions github-actions bot added awaiting changes and removed awaiting change review labels Apr 8, 2024
try:
    import pyarrow.cuda  # no-cython-lint
except ImportError:
    pass
Member

Do we really want to silence the error? The import error message could be more informative than the error the user would get later, when importing a device array fails.

Member Author

Good point. Pushed a change to capture the error message, and to actually raise a more informative error message here about pyarrow not being built with CUDA support (embedding the original import error message), instead of using the message from the C++ registry.
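
The approach described in this reply (capture the original import error and embed it in a more informative one) can be sketched as follows. This is not the code from the PR: the helper name `require_module`, the exception type, and the message wording are assumptions for illustration.

```python
import importlib


def require_module(module_name, feature):
    """Import a module, or raise an error that explains what is missing.

    The original ImportError text is embedded in the new message and
    also kept as the exception's __cause__.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{feature} requires {module_name!r}, which could not be imported "
            f"(original error: {exc})"
        ) from exc
```

A call such as `require_module("pyarrow.cuda", "Importing a CUDA device array")` would then surface both the high-level reason and the underlying import failure.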

void* c_type_ptr
shared_ptr[CArray] c_array

if c_device_array.device_type == 2:
Member

We could perhaps instead add kDLCUDA to

ctypedef enum DLDeviceType:
    kDLCPU = 1

Member Author

That's for DLPack, and although those definitions are the same, I would prefer not to mix them in our code. But I added a similar line to libarrow.pxd to expose ARROW_DEVICE_CUDA, which also makes the above easier to read.
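
To illustrate why a named constant reads better than the literal `2`, here is a small Python sketch. The values 1 and 2 match ARROW_DEVICE_CPU and ARROW_DEVICE_CUDA in the Arrow C Device Data Interface; the enum class and the helper function are hypothetical and only mirror the idea of the Cython declaration.

```python
from enum import IntEnum


class ArrowDeviceType(IntEnum):
    """Subset of the device types from the Arrow C Device Data Interface."""
    CPU = 1   # ARROW_DEVICE_CPU
    CUDA = 2  # ARROW_DEVICE_CUDA


def is_cuda_device(device_type: int) -> bool:
    """Return True when the raw device_type field denotes a CUDA device."""
    return device_type == ArrowDeviceType.CUDA
```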

@pitrou
Member

pitrou commented Apr 8, 2024

Are the AppVeyor failures unrelated?

@github-actions github-actions bot added awaiting change review and awaiting changes and removed awaiting changes and awaiting change review labels Apr 9, 2024
@jorisvandenbossche
Member Author

@github-actions crossbow submit test-cuda-python


github-actions bot commented Apr 9, 2024

Revision: 3e9f913

Submitted crossbow builds: ursacomputing/crossbow @ actions-80c1cb7576

Task Status
test-cuda-python GitHub Actions

@github-actions github-actions bot added awaiting change review and removed awaiting changes labels Jun 18, 2024
@jorisvandenbossche
Member Author

@github-actions crossbow submit test-cuda-python


Revision: eca6869

Submitted crossbow builds: ursacomputing/crossbow @ actions-246d24263e

Task Status
test-cuda-python GitHub Actions

@jorisvandenbossche jorisvandenbossche merged commit 89d6354 into apache:main Jun 19, 2024
11 of 12 checks passed
@jorisvandenbossche jorisvandenbossche removed the awaiting change review label Jun 19, 2024
@jorisvandenbossche jorisvandenbossche deleted the device-interface-cuda branch June 19, 2024 09:46

After merging your PR, Conbench analyzed the 6 benchmarking runs that have been run so far on merge-commit 89d6354.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details. It also includes information about 1 possible false positive for unstable benchmarks that are known to sometimes produce them.

jorisvandenbossche added a commit that referenced this pull request Jun 26, 2024
…yArrow (#40717)

### Rationale for this change

PyArrow implementation for the specification additions being proposed in
#40708

### What changes are included in this PR?

New `__arrow_c_device_array__` method to `pyarrow.Array` and
`pyarrow.RecordBatch`, and support in the `pyarrow.array(..)`,
`pyarrow.record_batch(..)` and `pyarrow.table(..)` functions to consume
objects that have those methods.

### Are these changes tested?

Yes (for CPU only for now, #40385 is
a prerequisite to test this for CUDA)


* GitHub Issue: #38325
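
As a rough sketch of how a consumer might detect objects that implement the `__arrow_c_device_array__` method mentioned above, duck typing is enough. The helper `supports_device_export` and the `FakeDeviceArray` class are hypothetical; a real producer would return PyCapsules rather than raise.

```python
def supports_device_export(obj) -> bool:
    """Return True if obj exposes a callable __arrow_c_device_array__."""
    return callable(getattr(obj, "__arrow_c_device_array__", None))


class FakeDeviceArray:
    """Hypothetical stand-in used only to exercise the check."""

    def __arrow_c_device_array__(self, requested_schema=None, **kwargs):
        # A real implementation would return capsules for the schema
        # and the device array; this stub only marks the protocol.
        raise NotImplementedError("illustration only")
```

A consumer function can call such a check first and fall back to its other conversion paths when it returns `False`.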
zanmato1984 pushed a commit to zanmato1984/arrow that referenced this pull request Jul 9, 2024
…a in PyArrow (apache#40717)
