Package request: pytorch on darwin with GPU (MPS) support #243868
Comments
Very cool! cc maintainers @tscholak @teh @thoughtpolice. Also cc @samuela.
curious if python3Packages.torch-bin works?
@samuela torch-bin gives me unsupported system (…)
hmmm that's odd. aarch64-darwin is a supported platform and there are srcs available here: nixpkgs/pkgs/development/python-modules/torch/binary-hashes.nix, lines 40 to 54 at 04f7903.
what version of python are you using?
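For anyone reproducing this check, a minimal `shell.nix` along the following lines should do it. This is a hedged sketch: `python310` is just one example interpreter, and depending on the nixpkgs revision the attribute may be `torch-bin` or its `pytorch-bin` alias.

```nix
# Hedged sketch: reproduce the torch-bin check with a specific Python.
# Swap python310 for python39 or python311 to test other interpreters.
{ pkgs ? import <nixpkgs> { } }:
pkgs.mkShell {
  packages = [ (pkgs.python310.withPackages (ps: [ ps.torch-bin ])) ];
}
```

Then `nix-shell --run "python -c 'import torch; print(torch.backends.mps.is_available())'"` should print `True` on a working setup.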
I made a nix-gpu branch. However, I'm not seeing any speed difference with a small test file (full of particularly witty content):

$ nix shell github:n8henrie/whisper/nix -c time whisper 20220922\ 084923.m4a
/nix/store/0fyl0qi21ljiswg05qz5p2bpnl016k0l-python3.10-whisper/lib/python3.10/site-packages/whisper/transcribe.py:114: UserWarning: FP16 is not supported on CPU; using FP32 instead
warnings.warn("FP16 is not supported on CPU; using FP32 instead")
Detecting language using up to the first 30 seconds. Use `--language` to specify the language
Detected language: English
[00:00.000 --> 00:04.640] This is an example of voice recording. Something might actually say if I
[00:04.640 --> 00:09.440] recording something on my watch. I think this is how it would turn out.
19.93user 11.08system 0:14.23elapsed 217%CPU (0avgtext+0avgdata 2062800maxresident)k
0inputs+0outputs (3246major+212332minor)pagefaults 0swaps
$
$ nix shell github:n8henrie/whisper/nix-gpu -c time whisper 20220922\ 084923.m4a
/nix/store/fvpkcahvngmsssm3yfchvgw11hdic6sx-python3.10-whisper/lib/python3.10/site-packages/whisper/transcribe.py:114: UserWarning: FP16 is not supported on CPU; using FP32 instead
warnings.warn("FP16 is not supported on CPU; using FP32 instead")
Detecting language using up to the first 30 seconds. Use `--language` to specify the language
Detected language: English
[00:00.000 --> 00:04.640] This is an example of voice recording. Something might actually say if I
[00:04.640 --> 00:09.440] recording something on my watch. I think this is how it would turn out.
19.81user 11.16system 0:14.02elapsed 220%CPU (0avgtext+0avgdata 2177600maxresident)k
0inputs+0outputs (3955major+212007minor)pagefaults 0swaps

Not sure if that's going to be a training vs inference issue (or, more likely, whisper isn't configured to use the GPU; it has a --device flag).
Huh, looks like I just need to pass --device mps:

$ nix shell github:n8henrie/whisper/nix-gpu -c time whisper --device mps 20220922\ 084923.m4a
Traceback (most recent call last):
File "/nix/store/fvpkcahvngmsssm3yfchvgw11hdic6sx-python3.10-whisper/bin/.whisper-wrapped", line 9, in <module>
sys.exit(cli())
File "/nix/store/fvpkcahvngmsssm3yfchvgw11hdic6sx-python3.10-whisper/lib/python3.10/site-packages/whisper/transcribe.py", line 444, in cli
model = load_model(model_name, device=device, download_root=model_dir)
File "/nix/store/fvpkcahvngmsssm3yfchvgw11hdic6sx-python3.10-whisper/lib/python3.10/site-packages/whisper/__init__.py", line 154, in load_model
return model.to(device)
File "/nix/store/0y2cidaighjvzwlw31k6fkcragba3jki-python3.10-torch-2.0.1/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1145, in to
return self._apply(convert)
File "/nix/store/0y2cidaighjvzwlw31k6fkcragba3jki-python3.10-torch-2.0.1/lib/python3.10/site-packages/torch/nn/modules/module.py", line 844, in _apply
self._buffers[key] = fn(buf)
File "/nix/store/0y2cidaighjvzwlw31k6fkcragba3jki-python3.10-torch-2.0.1/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1143, in convert
return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
NotImplementedError: Could not run 'aten::empty.memory_format' with arguments from the 'SparseMPS' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. 'aten::empty.memory_format' is only available for these backends: [CPU, MPS, Meta, QuantizedCPU, QuantizedMeta, MkldnnCPU, SparseCPU, SparseMeta, SparseCsrCPU, BackendSelect, Python, FuncTorchDynamicLayerBackMode, Functionalize, Named, Conjugate, Negative, ZeroTensor, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradHIP, AutogradXLA, AutogradMPS, AutogradIPU, AutogradXPU, AutogradHPU, AutogradVE, AutogradLazy, AutogradMeta, AutogradMTIA, AutogradPrivateUse1, AutogradPrivateUse2, AutogradPrivateUse3, AutogradNestedTensor, Tracer, AutocastCPU, AutocastCUDA, FuncTorchBatched, FuncTorchVmapMode, Batched, VmapMode, FuncTorchGradWrapper, PythonTLSSnapshot, FuncTorchDynamicLayerFrontMode, PythonDispatcher].
CPU: registered at /Users/runner/work/pytorch/pytorch/pytorch/build/aten/src/ATen/RegisterCPU.cpp:31034 [kernel]
MPS: registered at /Users/runner/work/pytorch/pytorch/pytorch/build/aten/src/ATen/RegisterMPS.cpp:22748 [kernel]
Meta: registered at /dev/null:241 [kernel]
QuantizedCPU: registered at /Users/runner/work/pytorch/pytorch/pytorch/build/aten/src/ATen/RegisterQuantizedCPU.cpp:929 [kernel]
QuantizedMeta: registered at /Users/runner/work/pytorch/pytorch/pytorch/build/aten/src/ATen/RegisterQuantizedMeta.cpp:105 [kernel]
MkldnnCPU: registered at /Users/runner/work/pytorch/pytorch/pytorch/build/aten/src/ATen/RegisterMkldnnCPU.cpp:507 [kernel]
SparseCPU: registered at /Users/runner/work/pytorch/pytorch/pytorch/build/aten/src/ATen/RegisterSparseCPU.cpp:1379 [kernel]
SparseMeta: registered at /Users/runner/work/pytorch/pytorch/pytorch/build/aten/src/ATen/RegisterSparseMeta.cpp:249 [kernel]
SparseCsrCPU: registered at /Users/runner/work/pytorch/pytorch/pytorch/build/aten/src/ATen/RegisterSparseCsrCPU.cpp:1128 [kernel]
BackendSelect: registered at /Users/runner/work/pytorch/pytorch/pytorch/build/aten/src/ATen/RegisterBackendSelect.cpp:726 [kernel]
Python: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/core/PythonFallbackKernel.cpp:144 [backend fallback]
FuncTorchDynamicLayerBackMode: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/functorch/DynamicLayer.cpp:491 [backend fallback]
Functionalize: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/FunctionalizeFallbackKernel.cpp:280 [backend fallback]
Named: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/core/NamedRegistrations.cpp:7 [backend fallback]
Conjugate: fallthrough registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/ConjugateFallback.cpp:21 [kernel]
Negative: fallthrough registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/NegateFallback.cpp:23 [kernel]
ZeroTensor: fallthrough registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/ZeroTensorFallback.cpp:90 [kernel]
ADInplaceOrView: fallthrough registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/core/VariableFallbackKernel.cpp:63 [backend fallback]
AutogradOther: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:17484 [autograd kernel]
AutogradCPU: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:17484 [autograd kernel]
AutogradCUDA: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:17484 [autograd kernel]
AutogradHIP: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:17484 [autograd kernel]
AutogradXLA: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:17484 [autograd kernel]
AutogradMPS: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:17484 [autograd kernel]
AutogradIPU: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:17484 [autograd kernel]
AutogradXPU: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:17484 [autograd kernel]
AutogradHPU: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:17484 [autograd kernel]
AutogradVE: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:17484 [autograd kernel]
AutogradLazy: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:17484 [autograd kernel]
AutogradMeta: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:17484 [autograd kernel]
AutogradMTIA: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:17484 [autograd kernel]
AutogradPrivateUse1: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:17484 [autograd kernel]
AutogradPrivateUse2: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:17484 [autograd kernel]
AutogradPrivateUse3: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:17484 [autograd kernel]
AutogradNestedTensor: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:17484 [autograd kernel]
Tracer: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/TraceType_2.cpp:16726 [kernel]
AutocastCPU: fallthrough registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/autocast_mode.cpp:487 [backend fallback]
AutocastCUDA: fallthrough registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/autocast_mode.cpp:354 [backend fallback]
FuncTorchBatched: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/functorch/LegacyBatchingRegistrations.cpp:815 [backend fallback]
FuncTorchVmapMode: fallthrough registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/functorch/VmapModeRegistrations.cpp:28 [backend fallback]
Batched: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/LegacyBatchingRegistrations.cpp:1073 [backend fallback]
VmapMode: fallthrough registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/VmapModeRegistrations.cpp:33 [backend fallback]
FuncTorchGradWrapper: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/functorch/TensorWrapper.cpp:210 [backend fallback]
PythonTLSSnapshot: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/core/PythonFallbackKernel.cpp:152 [backend fallback]
FuncTorchDynamicLayerFrontMode: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/functorch/DynamicLayer.cpp:487 [backend fallback]
PythonDispatcher: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/core/PythonFallbackKernel.cpp:148 [backend fallback]
Command exited with non-zero status 1
3.87user 1.23system 0:03.39elapsed 150%CPU (0avgtext+0avgdata 1210224maxresident)k
0inputs+0outputs (19major+112713minor)pagefaults 0swaps
timing based benchmarks are tricky to debug... does torch.backends.mps.is_available() return True?
Yeah, this error looks like it's an issue with whisper as opposed to nix's packaging of torch/torch-bin. We can leave this open as a tracking issue for getting MPS support in our source build of torch, however.
Tried Python 3.9 through 3.11. I'm pinned to 23.05; I wonder if it's different in unstable? Yup, that was it!

$ nix-shell -I nixpkgs=channel:nixpkgs-unstable -p 'python39.withPackages (ps: with ps; [ pytorch-bin ])' --command "python -c 'import torch; print(torch.backends.mps.is_available())'"
True
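The same check can be pinned in a file instead of via the `-I` flag. A sketch, assuming the standard channel tarball URL:

```nix
# Hedged sketch: the one-liner above as a shell.nix pinned to unstable.
{ pkgs ? import (fetchTarball "https://nixos.org/channels/nixpkgs-unstable/nixexprs.tar.xz") { } }:
pkgs.mkShell {
  packages = [ (pkgs.python39.withPackages (ps: [ ps.pytorch-bin ])) ];
}
```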
@samuela yes, I just patched this into the whisper code (in the nix-gpu branch).
yup, that makes for two TODOs (MPS in the source build of torch, and whisper's device handling).
I found that torchvision-bin support on aarch64-darwin effectively prevented MPS from being usable. Submitted #244716 to fix.
I can confirm that MPS support works with torch-bin and torchvision-bin now.
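A sketch for verifying that combination, following the pattern of the one-liners above (nothing here beyond the two packages named in the comment):

```nix
# Hedged sketch: torch-bin and torchvision-bin in one environment, to
# confirm torchvision-bin no longer masks MPS (see #244716).
{ pkgs ? import <nixpkgs> { } }:
pkgs.mkShell {
  packages = [
    (pkgs.python3.withPackages (ps: [ ps.torch-bin ps.torchvision-bin ]))
  ];
}
```

Running `python -c 'import torch; print(torch.backends.mps.is_available())'` inside that shell should print `True`.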
I'm still trying to get it working. Tried adding the …
Can any of the aarch64-darwin folks help me build this locally via …? Currently builds fine with …. Steps I'm taking / have tried:

Attempt 1

It looks like …

Attempt 2

After reading this issue, it looks like some of these are functions and some are variables, so I might need to e.g. …

Attempt 3

Here it fails with the same errors about protobuf. I think these are relevant convos (for my future reference): …

Can anyone point out what I'm getting wrong here?
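One plausible reading of the functions-vs-variables confusion above, as a hedged sketch: pytorch's setup.py picks up `USE_*` environment variables, while CMake-level options need `-D` flags, so an override may have to set both. The attribute names below are assumptions, not the actual nixpkgs torch expression.

```nix
# Hedged sketch, not the real nixpkgs expression: set USE_MPS both as an
# environment variable (read by pytorch's setup.py) and as a CMake flag.
python3Packages.torch.overridePythonAttrs (old: {
  USE_MPS = "1"; # derivation attrs become env vars in the build environment
  cmakeFlags = (old.cmakeFlags or [ ]) ++ [ "-DUSE_MPS=ON" ];
})
```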
This issue has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/nix-develop-fails-with-command-bash-norc/31896/1
According to the pytorch CMakeLists, it requires SDK >= 12.3 for MPS support; currently we only have:

$ nix eval --json --apply 'builtins.attrNames' github:nixos/nixpkgs/master#darwin | jq -r '.[] | select(contains("sdk"))'
apple_sdk
apple_sdk_10_12
apple_sdk_11_0

I've tried anyway, adding the MetalPerformanceShaders and MetalPerformanceShadersGraph frameworks to the build inputs and substituting …
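For concreteness, a hedged sketch of that kind of override, using the 11.0 SDK since it is the newest listed above; per the SDK requirement this is expected to fail, and the framework attribute names are assumptions about the pre-refactor SDK layout:

```nix
# Hedged sketch of the substitution described above; expected to fail
# because MPS needs SDK >= 12.3 and only 11.0 is available here.
python3Packages.torch.overridePythonAttrs (old: {
  buildInputs = (old.buildInputs or [ ])
    ++ (with darwin.apple_sdk_11_0.frameworks; [
      MetalPerformanceShaders
      MetalPerformanceShadersGraph
    ]);
})
```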
hmmm interesting... what's the process for adding a new apple_sdk version?
People looking for a workaround might find this flake helpful. It's based on @n8henrie's wheel approach:

nix run "github:david-r-cox/pytorch-darwin-env#verificationScript"
After #346043 lands, this should be doable. Just add the new SDK (apple-sdk_13) to the build inputs.
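Under the post-#346043 model, SDKs become ordinary packages rather than per-framework attributes, so picking a newer toolchain is just another input. A hedged sketch of a dev shell on that model (assuming the `apple-sdk_13` attribute the refactor introduces):

```nix
# Hedged sketch of the post-#346043 pattern: the 13.3 SDK is an ordinary
# package, added like any other input.
{ pkgs ? import <nixpkgs> { } }:
pkgs.mkShell {
  packages = [ pkgs.apple-sdk_13 ];
}
```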
I tried to enable this with: …

I'm still getting the xcrun error: …
I was waiting on @reckenrode's recent massive SDK refactor to land (at least in master; still in staging as of this AM) and will retry then.
My mistake, I didn't realize #346043 was not yet in master.
A couple of sites are helpful for tracking; the best known is https://nixpk.gs/pr-tracker.html?pr=346043. (I really wish we could get an RSS feed for when user-specified PRs land in a specific branch.)
It’s in staging-next now.
I rebased my hack on top of staging-next; now I get different errors: …
It now says the SDK version is 11.3. Maybe I need to change something else in that package as well, since I was trying to specify version 12.3.
This was one of my test cases. Add apple-sdk_13 to the build inputs. It has to be the 13.3 SDK because MPS isn’t available in older SDKs, and the version of pytorch in nixpkgs does not build with the 14.4 (or presumably 15.0) SDK.

Edit: Apple’s article on MPS in PyTorch suggests the 12.3 SDK should work, so I could be wrong about the minimum supported SDK, but the requirements may have changed since it was posted. You can try the 12.3 SDK by adding apple-sdk_12 instead.
After the update, the old SDK frameworks are stubs that do nothing on their own. If you used them, you can simply drop them from your inputs.
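Pulling the advice in this thread together, a hedged sketch of an overlay one might test. This is not the change from the PR mentioned below, and the `USE_MPS` handling is an assumption carried over from the earlier attempts:

```nix
# Hedged sketch: 13.3 SDK in buildInputs, USE_MPS on, and the old
# framework inputs simply omitted (they are now no-op stubs).
final: prev: {
  pythonPackagesExtensions = prev.pythonPackagesExtensions ++ [
    (pyFinal: pyPrev: {
      torch = pyPrev.torch.overridePythonAttrs (old: {
        USE_MPS = "1";
        buildInputs = (old.buildInputs or [ ]) ++ [ prev.apple-sdk_13 ];
      });
    })
  ];
}
```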
Looks like success!
Just for testing I also tried apple-sdk_12, but I got these errors: …
So I guess 13 is the way to go.
I made a PR against staging-next: #351778.
Project description
Not sure if this is a "packaging request", but it didn't seem like a bug report; I just wanted a tracking issue. I might try to hack on this eventually.
PyTorch on aarch64-darwin now supports the GPU (via MPS):
https://pytorch.org/docs/stable/notes/mps.html
The currently packaged pytorch runs but doesn't use my Mac's GPU: …
In contrast, using the pytorch-provided wheel: …
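For reference, a hedged sketch of how an upstream wheel can be consumed from Nix, which is essentially what torch-bin does and what the flake workaround above builds on. The URL follows upstream's published naming for the 2.0.1 macOS arm64 wheel; the hash is a placeholder:

```nix
# Hedged sketch: wrap the official macOS arm64 wheel with
# buildPythonPackage. URL per upstream naming; hash is a placeholder.
{ python3Packages, fetchurl, lib }:
python3Packages.buildPythonPackage {
  pname = "torch";
  version = "2.0.1";
  format = "wheel";
  src = fetchurl {
    url = "https://download.pytorch.org/whl/cpu/torch-2.0.1-cp310-none-macosx_11_0_arm64.whl";
    hash = lib.fakeHash; # substitute the real hash
  };
}
```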
Metadata
cc @NixOS/darwin-maintainers