Can't install apex #1603

Closed
zp2459 opened this issue Mar 3, 2023 · 3 comments
Labels
bug Something isn't working

Comments

@zp2459

zp2459 commented Mar 3, 2023

Describe the Bug
I followed the tutorial, but when I ran pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./ I got the error below:
error log

Using pip 22.3.1 from /home/panz/anaconda3/envs/gpt/lib/python3.8/site-packages/pip (python 3.8)
WARNING: Implying --no-binary=:all: due to the presence of --build-option / --global-option / --install-option. Consider using --config-settings for more flexibility.
DEPRECATION: --no-binary currently disables reading from the cache of locally built wheels. In the future --no-binary will not influence the wheel cache. pip 23.1 will enforce this behaviour change. A possible replacement is to use the --no-cache-dir option. You can use the flag --use-feature=no-binary-enable-wheel-cache to test the upcoming behaviour. Discussion can be found at pypa/pip#11453
Processing /home/panz/project/ColossalAI/examples/language/gpt/gemini/apex
Running command python setup.py egg_info

torch.__version__ = 1.12.0+cu113

running egg_info
creating /tmp/pip-pip-egg-info-7w8pc_qc/apex.egg-info
writing /tmp/pip-pip-egg-info-7w8pc_qc/apex.egg-info/PKG-INFO
writing dependency_links to /tmp/pip-pip-egg-info-7w8pc_qc/apex.egg-info/dependency_links.txt
writing requirements to /tmp/pip-pip-egg-info-7w8pc_qc/apex.egg-info/requires.txt
writing top-level names to /tmp/pip-pip-egg-info-7w8pc_qc/apex.egg-info/top_level.txt
writing manifest file '/tmp/pip-pip-egg-info-7w8pc_qc/apex.egg-info/SOURCES.txt'
reading manifest file '/tmp/pip-pip-egg-info-7w8pc_qc/apex.egg-info/SOURCES.txt'
adding license file 'LICENSE'
writing manifest file '/tmp/pip-pip-egg-info-7w8pc_qc/apex.egg-info/SOURCES.txt'
Preparing metadata (setup.py) ... done
Requirement already satisfied: packaging>20.6 in /home/panz/anaconda3/envs/gpt/lib/python3.8/site-packages (from apex==0.1) (23.0)
Installing collected packages: apex
DEPRECATION: apex is being installed using the legacy 'setup.py install' method, because the '--no-binary' option was enabled for it and this currently disables local wheel building for projects that don't have a 'pyproject.toml' file. pip 23.1 will enforce this behaviour change. A possible replacement is to enable the '--use-pep517' option. Discussion can be found at pypa/pip#11451
Running command Running setup.py install for apex

torch.__version__ = 1.12.0+cu113

Compiling cuda extensions with
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Tue_Feb__7_19:32:13_PST_2023
Cuda compilation tools, release 12.1, V12.1.66
Build cuda_12.1.r12.1/compiler.32415258_0
from /home/panz/anaconda3/envs/gpt/bin

Traceback (most recent call last):
  File "<string>", line 2, in <module>
  File "<pip-setuptools-caller>", line 34, in <module>
  File "/home/panz/project/ColossalAI/examples/language/gpt/gemini/apex/setup.py", line 171, in <module>
    check_cuda_torch_binary_vs_bare_metal(CUDA_HOME)
  File "/home/panz/project/ColossalAI/examples/language/gpt/gemini/apex/setup.py", line 33, in check_cuda_torch_binary_vs_bare_metal
    raise RuntimeError(
RuntimeError: Cuda extensions are being compiled with a version of Cuda that does not match the version used to compile Pytorch binaries. Pytorch binaries were compiled with Cuda 11.3.
In some cases, a minor-version mismatch will not cause later errors: #323 (comment). You can try commenting out this check (at your own risk).
error: subprocess-exited-with-error

× Running setup.py install for apex did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.
full command: /home/panz/anaconda3/envs/gpt/bin/python -u -c '
exec(compile('"'"''"'"''"'"'
# This is <pip-setuptools-caller> -- a caller that pip uses to run setup.py
#
# - It imports setuptools before invoking setup.py, to enable projects that directly
#   import from `distutils.core` to work with newer packaging standards.
# - It provides a clear error message when setuptools is not installed.
# - It sets `sys.argv[0]` to the underlying `setup.py`, when invoking `setup.py` so
#   setuptools doesn'"'"'t think the script is `-c`. This avoids the following warning:
#     manifest_maker: standard file '"'"'-c'"'"' not found".
# - It generates a shim setup.py, for handling setup.cfg-only projects.
import os, sys, tokenize

try:
    import setuptools
except ImportError as error:
    print(
        "ERROR: Can not execute setup.py since setuptools is not available in "
        "the build environment.",
        file=sys.stderr,
    )
    sys.exit(1)

__file__ = %r
sys.argv[0] = __file__

if os.path.exists(__file__):
    filename = __file__
    with tokenize.open(__file__) as f:
        setup_py_code = f.read()
else:
    filename = "<auto-generated setuptools caller>"
    setup_py_code = "from setuptools import setup; setup()"

exec(compile(setup_py_code, filename, "exec"))
'"'"''"'"''"'"' % ('"'"'/home/panz/project/ColossalAI/examples/language/gpt/gemini/apex/setup.py'"'"',), "<pip-setuptools-caller>", "exec"))' --cpp_ext --cuda_ext install --record /tmp/pip-record-ss3s_tnl/install-record.txt --single-version-externally-managed --compile --install-headers /home/panz/anaconda3/envs/gpt/include/python3.8/apex
cwd: /home/panz/project/ColossalAI/examples/language/gpt/gemini/apex/
Running setup.py install for apex ... error
error: legacy-install-failure

× Encountered error while trying to install package.
╰─> apex

note: This is an issue with the package mentioned above, not pip.
hint: See above for output from the failure.

A helpful guide on how to craft a minimal bug report: http://matthewrocklin.com/blog/work/2018/02/28/minimal-bug-reports.

Environment

CUDA: 11.7
torch: 1.12.0+cu113

@zp2459 zp2459 added the bug Something isn't working label Mar 3, 2023
@scutfrank

You can use this version of apex: https://github.com/ptrblck/apex

@DaveBGld

Also note you are reporting CUDA 11.7 but Pytorch was installed for CUDA 11.3...
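
To make the mismatch concrete, here is a minimal Python sketch that compares the CUDA version PyTorch was built against with the release reported by nvcc. It is not apex's actual check_cuda_torch_binary_vs_bare_metal code; the function names (parse_nvcc_release, parse_torch_cuda, compare_cuda_versions, check_local_install) are hypothetical, and it assumes nvcc is on PATH and that torch.version.cuda holds a string such as "11.3".

```python
import re
import subprocess
from typing import Optional, Tuple


def parse_nvcc_release(nvcc_output: str) -> Optional[Tuple[int, int]]:
    """Extract (major, minor) from `nvcc --version` text, or None if not found."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def parse_torch_cuda(torch_cuda: str) -> Tuple[int, int]:
    """Turn a string such as "11.3" (the value of torch.version.cuda) into (11, 3)."""
    major, minor = torch_cuda.split(".")[:2]
    return int(major), int(minor)


def compare_cuda_versions(torch_cuda: str, nvcc_output: str) -> str:
    """Classify the relation between the two CUDA versions.

    Returns "match", "minor mismatch", "major mismatch", or "unknown"
    when the nvcc output contains no release string.
    """
    nvcc_release = parse_nvcc_release(nvcc_output)
    if nvcc_release is None:
        return "unknown"
    torch_release = parse_torch_cuda(torch_cuda)
    if torch_release == nvcc_release:
        return "match"
    if torch_release[0] == nvcc_release[0]:
        return "minor mismatch"
    return "major mismatch"


def check_local_install() -> str:
    """Run the comparison on this machine (assumes torch is installed and nvcc is on PATH)."""
    import torch  # imported lazily so the helpers above work without torch

    if torch.version.cuda is None:
        return "unknown"  # CPU-only PyTorch build
    nvcc_output = subprocess.run(
        ["nvcc", "--version"], capture_output=True, text=True, check=True
    ).stdout
    return compare_cuda_versions(torch.version.cuda, nvcc_output)
```

With the values from the log above (PyTorch built for CUDA 11.3, nvcc at release 12.1), compare_cuda_versions reports a major mismatch. The reported CUDA 11.7 would only be a minor mismatch against 11.3, which suggests the nvcc found in the conda environment is not the toolkit the reporter expected.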

@bit-scientist

https://github.com/ptrblck/apex

I was able to install it with the command pip install -v --no-cache-dir . after switching to the https://github.com/ptrblck/apex version of apex.

@zp2459 zp2459 closed this as completed Aug 16, 2023