GH-39979: [Python] Low-level bindings for exporting/importing the C Device Interface #39980
Conversation
cpp/src/arrow/c/bridge.cc
Outdated
if (device_type != ARROW_DEVICE_CPU) {
  return Status::NotImplemented("Only importing data on CPU is supported");
}
This could later be expanded to also allow CUDA devices for CUDA-enabled builds.
Yes, there probably should be some kind of registry so that "default" device mappers can be added separately.
@pitrou you mention a "registry", but AFAIK that's what we ideally would have (so external device implementations could register themselves) and that doesn't exist yet, right?
In that case, is the function above an OK short-term default?
Yes and yes!
cpp/src/arrow/c/bridge.cc
Outdated
Result<std::shared_ptr<Array>> ImportDeviceArray(struct ArrowDeviceArray* array,
                                                 std::shared_ptr<DataType> type) {
  return ImportDeviceArray(array, type, DefaultDeviceMapper);
}
Do we want to provide such an API that uses a default DeviceMapper?

With the current APIs here, I assume the idea is that it's the responsibility of the user (i.e. the library or application using Arrow C++ to consume data through the C Device interface) to provide the device mapping as they see fit.

In the case of exposing this in pyarrow, it's pyarrow that is the user of those APIs and I think pyarrow certainly wants to have a default mapping provided (not to be specified by the user of pyarrow). In theory I could write this `DefaultDeviceMapper` function in Cython to keep this on the pyarrow side, but this might also be useful for other users of the C++ APIs?

(I suppose that when we add a default in C++, I could also give the existing signatures a default parameter value for `mapper`, instead of adding those two additional signatures.)
Do we want to provide such an API that uses a default DeviceMapper?
Yes, that sounds reasonable to me. I think that in many (most?) cases, users will want to use whatever device mapper is registered for the given device type.
Also:

I could also give the existing signatures a default parameter value for `mapper`, instead of adding those two additional signatures

Yes, that would reduce the proliferation of different functions. You could simply have something like `const DeviceMemoryMapper& mapper = {}`.
The difficulty with providing a default device mapper here is that it created a circular dependency, due to `ArrowDeviceType` being defined in `abi.h`, and required linking against `libarrow_cuda.so`.
That might be a reason to keep this default on the pyarrow side? (we can implement the mapper function in C++, but only provide it as the default argument on the Python side)
In Python, we can more easily dynamically check whether the pyarrow.cuda module is available, and if so provide a different default mapper (one that includes GPU devices).
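As a rough illustration of that idea (a sketch only, not this PR's code; the helper name `_supported_import_devices` is made up):

```python
# Hedged sketch: choose the default device support based on whether the optional
# pyarrow.cuda module can be imported (it is only available in CUDA-enabled builds).
try:
    import pyarrow.cuda  # noqa: F401
    _HAVE_CUDA = True
except ImportError:
    _HAVE_CUDA = False


def _supported_import_devices():
    # Hypothetical helper: which device types a Python-side default mapper
    # could accept when importing through the C Device Interface.
    return ("CPU", "CUDA") if _HAVE_CUDA else ("CPU",)
```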
python/pyarrow/tests/test_cffi.py
Outdated
# verify exported struct
assert c_array.device_type == 1   # ARROW_DEVICE_CPU
assert c_array.device_id == -1
assert c_array.array.length == 2
could we add a test that uses the arrow cuda lib and verify the device etc.?
I was planning to add actual CUDA tests later in a separate PR (with proper roundtrip tests, not just export; roundtrip doesn't work for non-CPU right now).
@@ -64,6 +64,16 @@
  // Opaque producer-specific data
  void* private_data;
};

typedef int32_t ArrowDeviceType;
should we expose the constants in pyarrow somehow?
If we don't use them ourselves, I don't know if that is needed (although it might still be useful for other users of `pyarrow.cffi`?)
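For what it's worth, exposing them could be as simple as a couple of module-level constants. A hedged sketch (only the CPU value is asserted in this PR's tests; the CUDA value is taken from the C Device Interface spec, and the helper is hypothetical):

```python
# Illustrative constants mirroring ArrowDeviceType values from the C Device Interface.
ARROW_DEVICE_CPU = 1    # value checked in test_cffi.py above
ARROW_DEVICE_CUDA = 2   # per the C Device Interface specification (assumption)


def device_type_name(device_type):
    # Hypothetical convenience helper for users of pyarrow.cffi.
    return {ARROW_DEVICE_CPU: "CPU", ARROW_DEVICE_CUDA: "CUDA"}.get(
        device_type, "unknown ({})".format(device_type))
```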
c_array = GetResultValue(
    ImportDeviceArray(<ArrowDeviceArray*> c_ptr,
                      <ArrowSchema*> c_type_ptr)
)
The default mapper only allows CPU arrays, but pyarrow does have a CUDA lib; shouldn't we allow importing at least CUDA arrays too?
Certainly, but as mentioned earlier (#39980 (comment)), I was planning to tackle CUDA in a follow-up, and this PR indeed only properly supports and tests CPU.
Be careful: if you don't pass the ArrowDeviceArray struct to a consumer,
array memory will leak. This is a low-level function intended for
expert users.
Should this explicitly mention the release callback on the struct?
I copied this from the existing docstrings. We could mention the release callback explicitly, but essentially then you are a "consumer". This function returns an integer; you can't call the release callback on the return value as such. Only when you actually interpret it as an ArrowArray struct can you do that (and at that point, you are a consumer who should be aware of those details?).
I could also point to the general page about the C Data Interface.
I agree with @jorisvandenbossche that the release callback need not be mentioned here. This is all in the spec.
c_batch = GetResultValue(ImportDeviceRecordBatch(
    <ArrowDeviceArray*> c_ptr, <ArrowSchema*> c_schema_ptr))
Same comment as before: don't we want to allow using the pyarrow.cuda lib to provide a device mapper and allow handling CUDA-based GPU memory arrays?
@zeroshade @pitrou are you OK with merging this PR with only CPU support for now, and leaving CUDA integration for a separate follow-up PR?
That sounds ok to me.
Be careful: if you don't pass the ArrowDeviceArray struct to a consumer,
array memory will leak. This is a low-level function intended for
expert users.
I agree with @jorisvandenbossche that the release callback need not be mentioned here. This is all in the spec.
@@ -1778,6 +1778,70 @@ cdef class Array(_PandasConvertible):

        return pyarrow_wrap_array(array)

    def _export_to_c_device(self, out_ptr, out_schema_ptr=0):
`out_schema_ptr=None` would feel slightly more Pythonic IMHO, though that's debatable.
I would propose to leave this as is, to keep it consistent with the other `_export_to_c` definitions (and the `_as_c_pointer` helper also requires an integer at the moment).
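For context, this is roughly how such an integer pointer is obtained with cffi in the tests (a minimal sketch; the `ArrowDeviceArray` declaration is assumed to be part of `pyarrow.cffi` as added in this PR):

```python
from pyarrow.cffi import ffi

# Allocate an empty ArrowDeviceArray struct and pass its raw address as a
# Python int, which is what _export_to_c_device / _as_c_pointer expect.
c_device_array = ffi.new("struct ArrowDeviceArray*")
ptr = int(ffi.cast("uintptr_t", c_device_array))
```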
python/pyarrow/tests/test_cffi.py
Outdated
@needs_cffi
def test_export_import_device_array():
We're copy-pasting a lot of code in those tests, can we try to reduce duplication by factoring common functionality out?
I made an attempt to refactor this. In any case it's adding less code now ;)
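One possible shape for such a refactor (illustrative only, not necessarily what was pushed; the parametrization and the `_import_from_c_device` name/signature are assumptions based on the existing `_export_to_c`/`_import_from_c` pattern):

```python
import pyarrow as pa
import pytest
from pyarrow.cffi import ffi


@pytest.mark.parametrize("struct_name, export_name, import_name", [
    ("struct ArrowArray*", "_export_to_c", "_import_from_c"),
    ("struct ArrowDeviceArray*", "_export_to_c_device", "_import_from_c_device"),
])
def test_export_import_array(struct_name, export_name, import_name):
    # Shared body instead of one near-identical test per interface.
    c_schema = ffi.new("struct ArrowSchema*")
    schema_ptr = int(ffi.cast("uintptr_t", c_schema))
    c_array = ffi.new(struct_name)
    array_ptr = int(ffi.cast("uintptr_t", c_array))

    arr = pa.array([1, 2, 3], type=pa.int64())
    getattr(arr, export_name)(array_ptr, schema_ptr)
    result = getattr(pa.Array, import_name)(array_ptr, schema_ptr)
    assert result.equals(arr)
```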
The Appveyor failure seems to be actually related now.
So it complains about not knowing …
Thanks for the reviews!
…he C Device Interface (apache#39980)

### Rationale for this change

We have low-level methods `_import_from_c`/`_export_to_c` for the C Data Interface, we can add similar methods for the C Device data interface. Expanding the Arrow PyCapsule protocol (i.e. a better public API for other libraries) is covered by apache#38325. Because of that, we might not want to keep those low-level methods long term (or at least we need to have the equivalents using capsules), but for testing it's useful to already add those.

### What changes are included in this PR?

Added methods to Array and RecordBatch classes. Currently import only works for CPU devices.

* GitHub Issue: apache#39979

Authored-by: Joris Van den Bossche <[email protected]>
Signed-off-by: Antoine Pitrou <[email protected]>
After merging your PR, Conbench analyzed the 7 benchmarking runs that have been run so far on merge-commit 99c5412. There were 2 benchmark results indicating a performance regression:
The full Conbench report has more details. It also includes information about 7 possible false positives for unstable benchmarks that are known to sometimes produce them.
…ryManager in C Device Data import (#40699)

### Rationale for this change

Follow-up on #39980 (comment)

Right now, the user of `ImportDeviceArray` or `ImportDeviceRecordBatch` needs to provide a `DeviceMemoryMapper` mapping the device type and id to a MemoryManager. We provide a default implementation of that mapper that just knows about the default CPU memory manager (and there is another implementation in `arrow::cuda`, but you need to explicitly pass that to the import function). To make this easier, this PR adds a registry such that default device mappers can be added separately.

### What changes are included in this PR?

This PR adds two new public functions to register device types (`RegisterDeviceMemoryManager`) and retrieve the mapper from the registry (`GetDeviceMemoryManager`). Further, it provides a `RegisterCUDADevice` to optionally register the CUDA devices (by default only the CPU device is registered).

### Are these changes tested?

### Are there any user-facing changes?

* GitHub Issue: #40698

Lead-authored-by: Joris Van den Bossche <[email protected]>
Co-authored-by: Antoine Pitrou <[email protected]>
Signed-off-by: Joris Van den Bossche <[email protected]>
Rationale for this change

We have low-level methods `_import_from_c`/`_export_to_c` for the C Data Interface, we can add similar methods for the C Device data interface. Expanding the Arrow PyCapsule protocol (i.e. a better public API for other libraries) is covered by #38325. Because of that, we might not want to keep those low-level methods long term (or at least we need to have the equivalents using capsules), but for testing it's useful to already add those.

What changes are included in this PR?

Added methods to Array and RecordBatch classes. Currently import only works for CPU devices.
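As a usage illustration of those methods (a sketch under the assumption that the import counterpart is named `_import_from_c_device` and mirrors the existing `_import_from_c` signature; CPU data only, as noted above):

```python
import pyarrow as pa
from pyarrow.cffi import ffi

# Allocate the C Device Interface structs and get their addresses as Python ints.
c_schema = ffi.new("struct ArrowSchema*")
schema_ptr = int(ffi.cast("uintptr_t", c_schema))
c_batch = ffi.new("struct ArrowDeviceArray*")
batch_ptr = int(ffi.cast("uintptr_t", c_batch))

batch = pa.record_batch([pa.array([1, 2, 3])], names=["a"])
batch._export_to_c_device(batch_ptr, schema_ptr)                       # producer side
result = pa.RecordBatch._import_from_c_device(batch_ptr, schema_ptr)   # consumer side
assert result.equals(batch)
```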