GH-40698: [C++] Create registry for Devices to map DeviceType to MemoryManager in C Device Data import #40699
@pitrou I would appreciate a preliminary review to check if this is going in the right direction (of course I still need to add tests and docs, clean up naming, etc., and I am testing it now with CUDA). For now I didn't go for a dual public / And I added it to device.h/cc right now, but actually, if this will only be used for the C Device interface, I could also move it to bridge.h/cc.
@github-actions crossbow submit test-cuda-python
Revision: e16e24d Submitted crossbow builds: ursacomputing/crossbow @ actions-3c9b11581b
Yes, this is looking ok, though the implementation can be simplified a bit.
Agreed. No need to expose the registry class itself for now.
Since the API is minimal and doesn't require any additional includes, we can keep it in
You mean to not even use the internal class to store the mapping, but just have the register/get functions and store the unordered_map in a global variable?
No, the class is ok, but the
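For readers following the design discussion, here is a minimal sketch of the shape being described: an internal class owns the device-type-to-mapper map, and only free register/get functions are exposed. Everything in it is an assumption for illustration: the `DeviceType` and `MemoryManager` stand-ins, the `DeviceMapper` alias, and the names `RegisterDeviceMapper`/`GetDeviceMapper` are hypothetical and are not the code in this PR (the real functions would presumably return `Status`/`Result`).

```cpp
#include <cstdint>
#include <functional>
#include <memory>
#include <mutex>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-ins so the sketch compiles without Arrow.
enum class DeviceType : int32_t { kCpu, kCuda };
struct MemoryManager {
  std::string name;
};

// A mapper tied to one device type: it only needs the device id.
using DeviceMapper =
    std::function<std::shared_ptr<MemoryManager>(int64_t device_id)>;

namespace internal {

// Kept internal; only the free functions below are meant to be public.
class DeviceMapperRegistry {
 public:
  // Returns false if a mapper is already registered for this device type.
  bool Register(DeviceType type, DeviceMapper mapper) {
    std::lock_guard<std::mutex> lock(mutex_);
    return mappers_.emplace(type, std::move(mapper)).second;
  }

  // Returns std::nullopt if nothing is registered for this device type.
  std::optional<DeviceMapper> Find(DeviceType type) const {
    std::lock_guard<std::mutex> lock(mutex_);
    auto it = mappers_.find(type);
    if (it == mappers_.end()) return std::nullopt;
    return it->second;
  }

 private:
  mutable std::mutex mutex_;
  std::unordered_map<DeviceType, DeviceMapper> mappers_;
};

// Single process-wide instance, created on first use.
DeviceMapperRegistry& GlobalRegistry() {
  static DeviceMapperRegistry registry;
  return registry;
}

}  // namespace internal

bool RegisterDeviceMapper(DeviceType type, DeviceMapper mapper) {
  return internal::GlobalRegistry().Register(type, std::move(mapper));
}

std::optional<DeviceMapper> GetDeviceMapper(DeviceType type) {
  return internal::GlobalRegistry().Find(type);
}
```

Keeping the class internal means the map type and locking strategy can change later without touching the public surface.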
@github-actions crossbow submit test-cuda-python
Revision: f33872d Submitted crossbow builds: ursacomputing/crossbow @ actions-2f2ec1b2f6
cpp/src/arrow/gpu/cuda_memory.cc
Outdated
Result<std::shared_ptr<MemoryManager>> DefaultGPUMemoryMapper(int64_t device_id) {
  ARROW_ASSIGN_OR_RAISE(auto device, arrow::cuda::CudaDevice::Make(device_id));
  return device->default_memory_manager();
}
We should probably include some sort of customizations on the cuda device to ensure it uses the appropriate allocation type (HOST/MANAGED/etc).
I somewhat blindly copied this from an existing function:
arrow/cpp/src/arrow/gpu/cuda_memory.cc
Lines 488 to 497 in 1781b32
Result<std::shared_ptr<MemoryManager>> DefaultMemoryMapper(ArrowDeviceType device_type,
                                                           int64_t device_id) {
  switch (device_type) {
    case ARROW_DEVICE_CPU:
      return default_cpu_memory_manager();
    case ARROW_DEVICE_CUDA:
    case ARROW_DEVICE_CUDA_HOST:
    case ARROW_DEVICE_CUDA_MANAGED: {
      ARROW_ASSIGN_OR_RAISE(auto device, arrow::cuda::CudaDevice::Make(device_id));
      return device->default_memory_manager();
But yes, that just ignores the device type at the moment. Looking at the code, it seems that CudaDevice currently only supports CUDA allocation type?
Then for this PR I can just remove the other two?
That seems reasonable for now. We can update and handle the other types in a future update
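To make the outcome of this thread concrete, here is a rough sketch of a mapper that handles only the plain CUDA allocation type and reports the host/managed types as unsupported instead of silently treating them like CUDA. It is not the code from this PR: `DeviceType`, `MemoryManager`, `MapperResult`, and `MapDevice` are hypothetical stand-ins so the example compiles without Arrow or CUDA.

```cpp
#include <cstdint>
#include <memory>
#include <string>

// Hypothetical stand-ins for the Arrow device type enum and MemoryManager.
enum class DeviceType { kCpu, kCuda, kCudaHost, kCudaManaged };
struct MemoryManager {
  std::string name;
};

// Simplified stand-in for Result<std::shared_ptr<MemoryManager>>.
struct MapperResult {
  std::shared_ptr<MemoryManager> manager;
  std::string error;
  bool ok() const { return manager != nullptr; }
};

MapperResult MapDevice(DeviceType type, int64_t device_id) {
  switch (type) {
    case DeviceType::kCpu:
      return {std::make_shared<MemoryManager>(MemoryManager{"cpu"}), ""};
    case DeviceType::kCuda:
      return {std::make_shared<MemoryManager>(
                  MemoryManager{"cuda:" + std::to_string(device_id)}),
              ""};
    case DeviceType::kCudaHost:
    case DeviceType::kCudaManaged:
      // Rejected explicitly until these allocation types are handled.
      return {nullptr, "allocation type not supported yet"};
  }
  return {nullptr, "unknown device type"};
}
```

Failing loudly for the two unhandled types leaves room to add them in a follow-up without changing behaviour that callers might already rely on.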
cpp/src/arrow/gpu/cuda_memory.cc
Outdated
std::once_flag cuda_registered;

Status RegisterCUDADevice() {
Is there a reason for not doing this automatically at initialization rather than have the user call it explicitly?
To be honest, I am not fully sure what the possible use cases would be. So from my point of view of enabling importing CUDA data in pyarrow, registering the CUDA device automatically is perfectly fine.
I assume it's quite unlikely that someone might want to register a different CUDA device from C++?
Well, if they have a need for a dedicated CUDA mapper, they can just pass their own `DeviceMemoryMapper` when importing, AFAICT. What do you think @zeroshade ?
That's true. I was going to suggest that with this registration mechanism, we don't necessarily need to keep the device mapper keyword, but that's actually a reason to keep it
I pushed a commit that registers the CUDA device by default and therefore removes the public `RegisterCudaDevice` function.
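One way to picture "registered by default" is a lookup path that lazily installs the built-in mappers exactly once. The sketch below is an assumption, not the commit's implementation: it swaps the `std::once_flag` seen earlier in the diff for a function-local static (which C++11 also initializes once, thread-safely), and all names (`EnsureBuiltinsRegistered`, `GetDeviceMapper`, the stand-in types, the run counter) are hypothetical.

```cpp
#include <cstdint>
#include <functional>
#include <map>
#include <memory>
#include <mutex>
#include <string>

// Hypothetical stand-ins so the sketch compiles without Arrow or CUDA.
enum class DeviceType { kCpu, kCuda };
struct MemoryManager {
  std::string name;
};
using DeviceMapper = std::function<std::shared_ptr<MemoryManager>(int64_t)>;

std::mutex registry_mutex;
std::map<DeviceType, DeviceMapper> registry;
int builtin_registration_runs = 0;  // only here so the demo can observe "once"

// Installs the built-in mappers the first time it is called; later calls are
// no-ops because the function-local static is initialized exactly once.
void EnsureBuiltinsRegistered() {
  static const bool done = [] {
    std::lock_guard<std::mutex> lock(registry_mutex);
    registry[DeviceType::kCpu] = [](int64_t) {
      return std::make_shared<MemoryManager>(MemoryManager{"cpu"});
    };
    registry[DeviceType::kCuda] = [](int64_t device_id) {
      return std::make_shared<MemoryManager>(
          MemoryManager{"cuda:" + std::to_string(device_id)});
    };
    ++builtin_registration_runs;
    return true;
  }();
  (void)done;
}

// Returns an empty std::function if no mapper is registered for the type.
DeviceMapper GetDeviceMapper(DeviceType type) {
  EnsureBuiltinsRegistered();
  std::lock_guard<std::mutex> lock(registry_mutex);
  auto it = registry.find(type);
  return it == registry.end() ? DeviceMapper{} : it->second;
}
```

With this shape the user never calls a registration function; a custom mapper can still be passed explicitly at import time, as discussed above.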
Co-authored-by: Antoine Pitrou <[email protected]>
@github-actions crossbow submit test-cuda-python
Revision: 92ece26 Submitted crossbow builds: ursacomputing/crossbow @ actions-a8e191b52d
This should be ready for another (final?) review
LGTM, but one minor suggestion still. Feel free to merge when done!
@@ -363,4 +363,33 @@ class ARROW_EXPORT CPUMemoryManager : public MemoryManager {
ARROW_EXPORT
std::shared_ptr<MemoryManager> default_cpu_memory_manager();

using MemoryMapper =
    std::function<Result<std::shared_ptr<MemoryManager>>(int64_t device_id)>;
Sorry, but a couple more suggestions to unify naming:
- rename `MemoryMapper` to `DeviceMemoryMapper`?
- rename `RegisterDeviceMemoryManager` to `RegisterDeviceMemoryMapper`
- rename `GetDeviceMemoryManager` to `GetDeviceMemoryMapper`
Good points, that naming is definitely more consistent.
There is however one problem: we already define a `DeviceMemoryMapper` for the keyword type in the actual bridge.h Import methods:
arrow/cpp/src/arrow/c/bridge.h
Lines 218 to 219 in 434f872
using DeviceMemoryMapper =
    std::function<Result<std::shared_ptr<MemoryManager>>(ArrowDeviceType, int64_t)>;
and we should probably find a distinct name, given that both are slightly different (the one takes device_type+device_id and returns a MemoryManager, while the other is a function already tied to a specific device_type and thus only takes a device_id, returning again a MemoryManager)
It's of course a subtle difference that might be difficult to embody in a name. But at least using distinct names seems best.
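The difference between the two shapes may be easier to see side by side. In this sketch (an illustration only), the two-argument alias mirrors the shape of the existing `DeviceMemoryMapper` in bridge.h, the one-argument alias is the per-device-type function under discussion, and a hypothetical helper `MakeDispatchingMapper` builds the former from a table of the latter. `DeviceType`, `MemoryManager`, and the alias name `DeviceIdMapper` are stand-ins, and `nullptr` replaces the `Result` error path.

```cpp
#include <cstdint>
#include <functional>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-ins so the sketch compiles without Arrow.
enum class DeviceType : int32_t { kCpu, kCuda };
struct MemoryManager {
  std::string name;
};

// Shape used by the import functions: device type and device id.
using DeviceMemoryMapper =
    std::function<std::shared_ptr<MemoryManager>(DeviceType, int64_t)>;

// Shape stored per device type: the type is already fixed, only the id varies.
using DeviceIdMapper = std::function<std::shared_ptr<MemoryManager>(int64_t)>;

// Builds a two-argument mapper that dispatches on device type to the
// matching one-argument mapper; returns nullptr for unregistered types.
DeviceMemoryMapper MakeDispatchingMapper(
    std::unordered_map<DeviceType, DeviceIdMapper> per_type) {
  return [per_type = std::move(per_type)](
             DeviceType type, int64_t device_id) -> std::shared_ptr<MemoryManager> {
    auto it = per_type.find(type);
    if (it == per_type.end()) return nullptr;
    return it->second(device_id);
  };
}
```

Seen this way, the one-argument function is the value stored in the registry and the two-argument function is the lookup over the whole registry, which is why sharing one name for both would be confusing.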
Perhaps `DeviceIdMapper` then? Not terribly pretty I admit...
I think that sounds good for the function type alias, but then I would personally leave the register/get functions as is? I would find `RegisterDeviceIdMapper` a bit strange with the focus on the `id`, because you are also registering a device type, it's just that the value you store for the registered type is the `DeviceIdMapper`...
Anyway, in the end it doesn't matter that much, happy to go with whatever we come up with.
Or `DeviceMapper` / `RegisterDeviceMapper` / `GetDeviceMapper`? (that's a bit more generic, but keeps the three consistent with each other)
@github-actions crossbow submit test-cuda-cpp
Revision: 7a9e30d Submitted crossbow builds: ursacomputing/crossbow @ actions-b7970e2559
@github-actions crossbow submit test-cuda-cpp
Revision: a5a6f6c Submitted crossbow builds: ursacomputing/crossbow @ actions-f7506cdb39
Thanks for the reviews!
After merging your PR, Conbench analyzed the 7 benchmarking runs that have been run so far on merge-commit a407a6b. There were 10 benchmark results indicating a performance regression:
The full Conbench report has more details. It also includes information about 3 possible false positives for unstable benchmarks that are known to sometimes produce them.
…o MemoryManager in C Device Data import (apache#40699)

### Rationale for this change

Follow-up on apache#39980 (comment)

Right now, the user of `ImportDeviceArray` or `ImportDeviceRecordBatch` needs to provide a `DeviceMemoryMapper` mapping the device type and id to a MemoryManager. We provide a default implementation of that mapper that just knows about the default CPU memory manager (and there is another implementation in `arrow::cuda`, but you need to explicitly pass that to the import function).

To make this easier, this PR adds a registry such that default device mappers can be added separately.

### What changes are included in this PR?

This PR adds two new public functions to register device types (`RegisterDeviceMemoryManager`) and retrieve the mapper from the registry (`GetDeviceMemoryManager`).

Further, it provides a `RegisterCUDADevice` to optionally register the CUDA devices (by default only CPU device is registered).

### Are these changes tested?

### Are there any user-facing changes?

* GitHub Issue: apache#40698

Lead-authored-by: Joris Van den Bossche <[email protected]>
Co-authored-by: Antoine Pitrou <[email protected]>
Signed-off-by: Joris Van den Bossche <[email protected]>