This repository has been archived by the owner on Aug 15, 2020. It is now read-only.

Build issues #221

Open
miketempleman opened this issue Jan 20, 2019 · 1 comment

Comments

@miketempleman

I have encountered two separate issues when trying to build the Docker version of DSSTNE. I am using the DSSTNE CUDA 9.1 AMI (ami-fe173884) on a g2.8xlarge instance in us-east-1.

First, I cannot run the driver information app. Whenever I try to run:

nvidia-docker run --rm nvidia/cuda nvidia-smi

The response is:

docker: Error response from daemon: OCI runtime create failed: container_linux.go:296: starting container process caused "exec: \"nvidia-smi\": executable file not found in $PATH": unknown.

I know the nvidia-smi app is there:

whereis nvidia-smi
nvidia-smi: /usr/bin/nvidia-smi /usr/share/man/man1/nvidia-smi.1.gz

And if I simply run the nvidia-smi app from bash I see that the driver is installed:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 387.26                 Driver Version: 387.26                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GRID K520           Off  | 00000000:00:03.0 Off |                  N/A |
| N/A   28C    P8    17W / 125W |     11MiB /  4036MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

But somehow the $PATH inside the nvidia-docker container does not include it. Do I need to build nvidia-docker to resolve this $PATH problem? Or use another AMI? I see that there are other DSSTNE AMIs in us-east-1.
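
One way to narrow this down is to look at the container's own environment instead of the host's: `whereis` on the host only shows that the driver tools exist outside the container. The sketch below is a minimal diagnostic, not a fix. The function name `container_lookup` and the `DOCKER_CMD` and `IMAGE` variables are hypothetical names introduced here, and it assumes the image ships a POSIX `sh`. As far as I understand, the `nvidia/cuda` image does not contain `nvidia-smi` itself and relies on the NVIDIA runtime to make the host's driver utilities available, so a "not found" result would point at the nvidia-docker setup rather than at the image.

```shell
# Assumption: nvidia-docker is the wrapper in use; override DOCKER_CMD to compare with plain docker.
DOCKER_CMD="${DOCKER_CMD:-nvidia-docker}"
IMAGE="${IMAGE:-nvidia/cuda}"

# Hypothetical helper: print the PATH seen inside the container, then
# where the given tool resolves there (or a "not found" message).
container_lookup() {
    tool="$1"
    "$DOCKER_CMD" run --rm "$IMAGE" sh -c \
        'echo "PATH=$PATH"; command -v "$1" || echo "$1: not found in container"' \
        sh "$tool"
}

# Usage (needs a working Docker installation):
#   container_lookup nvidia-smi
```

If the tool resolves when the container is started this way but the original command still fails, the problem is more likely in how the container is launched than in the image's PATH.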

Second, if I go ahead and try to build dsstne from the repository using latest, I see a warning from the makefile:

Step 13/15 : RUN cd /opt/amazon/dsstne/src/amazon/dsstne && make install
 ---> Running in 8545a8860f50
Makefile:6: **************************************************************************************
Makefile:7: ****************** USE OF DEPRECATED MAKEFILE ******************
Makefile:8: ****************** PLEASE USE THE ONE AT THE ROOT OF THE REPOSITORY ******************
Makefile:9: **************************************************************************************

And the make fails with the error:

mkdir -p /opt/amazon/dsstne/src/amazon/../../amazon-dsstne
cp -rfp /opt/amazon/dsstne/src/amazon/../../../build/lib /opt/amazon/dsstne/src/amazon/../../amazon-dsstne/lib
cp: cannot stat '/opt/amazon/dsstne/src/amazon/../../../build/lib': No such file or directory
make: *** [install] Error 1
Makefile:26: recipe for target 'install' failed
The command '/bin/sh -c cd /opt/amazon/dsstne/src/amazon/dsstne && make install' returned a non-zero code: 2
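
The paths in the `cp` error are easier to read once the `..` components are collapsed. The sketch below does that lexically in plain shell; `normalize_path` is a hypothetical helper written for this illustration, and it assumes absolute paths with no symlinks involved. Taken literally, the source path printed in the log ends up outside the `/opt/amazon/dsstne` checkout, which would be consistent with the deprecated Makefile looking for build output in a location the Docker build never creates.

```shell
# Hypothetical helper: lexically collapse "." and ".." in an absolute path.
# It does not touch the filesystem, so symlinks are not resolved.
normalize_path() {
    old_ifs=$IFS
    IFS=/
    set -f                      # no globbing while splitting on "/"
    result=""
    for part in $1; do
        case "$part" in
            ''|.) ;;                              # skip empty and "." components
            ..)   result="${result%/*}" ;;        # drop the last component
            *)    result="$result/$part" ;;
        esac
    done
    set +f
    IFS=$old_ifs
    echo "${result:-/}"
}

normalize_path /opt/amazon/dsstne/src/amazon/../../../build/lib
# prints /opt/amazon/build/lib
normalize_path /opt/amazon/dsstne/src/amazon/../../amazon-dsstne/lib
# prints /opt/amazon/dsstne/amazon-dsstne/lib
```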

When I change the Dockerfile to use the Makefile at the root of the repo, the build fails with:

In file included from src/main/native/com_amazon_dsstne_Dsstne.cpp:20:0:
src/main/native/jni_util.h:21:17: fatal error: jni.h: No such file or directory
compilation terminated.
make[1]: *** [target/native/build/com_amazon_dsstne_Dsstne.o] Error 1
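
A quick way to confirm whether the JNI headers are present at all is to search for `jni.h` where JDK packages usually install it. The sketch below is only a check, assuming a Linux JDK laid out as `<jdk>/include/jni.h` with the platform header `jni_md.h` in `<jdk>/include/linux`. `find_jni_includes` is a hypothetical helper, and `/usr/lib/jvm` is the Debian/Ubuntu convention rather than something stated in this thread. How the root Makefile locates these headers is not shown here, so the printed flags are illustrative.

```shell
# Hypothetical helper: locate jni.h under a search root and print the
# include flags a C++ compiler would need for jni.h and jni_md.h.
find_jni_includes() {
    root="${1:-/usr/lib/jvm}"   # assumption: Debian/Ubuntu JDK location
    jni_h=$(find "$root" -type f -name jni.h 2>/dev/null | head -n 1)
    if [ -z "$jni_h" ]; then
        echo "jni.h not found under $root (is a JDK installed?)" >&2
        return 1
    fi
    inc_dir=$(dirname "$jni_h")
    echo "-I$inc_dir -I$inc_dir/linux"
}

# Usage, e.g. inside the build container:
#   find_jni_includes /usr/lib/jvm
```

An empty result inside the image would suggest that only a JRE, or no Java at all, is installed, since the headers ship with the JDK.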

At this point I am at an impasse. I did try following the setup instructions using the community AMI "Amazon DSSTNE (nvidia-docker)" (ami-25c0eb32) but encountered the same error.

My next step is to try to rebuild nvidia-docker and then continue to grind through the dsstne docker build. But I hope that someone can let me know what I am doing wrong before I spend another day on this task instead of working with dsstne.

Mike Templeman

@mmwillet

mmwillet commented Jan 28, 2019

@miketempleman I encountered the same issues. We addressed the deprecated-Makefile failure by using the Makefile at the root of the repository (as you did). Specifically, I changed the RUN command in the Dockerfile to:

RUN cd /opt/amazon/dsstne && \
    make install

The second failure occurs because jni.h and jni_md.h are missing from the image. We addressed this (probably improperly) by adding the following line before the make command in the Dockerfile:

RUN apt-get --yes install openjdk-8-jdk

Finally, the predict script that the documentation suggests using is not on the container's PATH, so it has to be called by its absolute path:

nvidia-docker run --rm -it amazon-dsstne /opt/amazon/dsstne/build/bin/predict

The documentation in this repository should probably be updated, and the Dockerfile should probably be fixed.
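
Until the image puts the DSSTNE binaries on its PATH, the absolute-path workaround can be wrapped in a small shell function on the host. This is a convenience sketch only. `dsstne_run`, `DOCKER_CMD`, `DSSTNE_IMAGE`, and `DSSTNE_BIN` are hypothetical names, and the image tag and binary directory are taken from the command quoted in this thread.

```shell
# Assumptions: image tag and binary directory as quoted in this thread.
DOCKER_CMD="${DOCKER_CMD:-nvidia-docker}"
DSSTNE_IMAGE="${DSSTNE_IMAGE:-amazon-dsstne}"
DSSTNE_BIN="${DSSTNE_BIN:-/opt/amazon/dsstne/build/bin}"

# Hypothetical wrapper: run a DSSTNE tool by absolute path inside the
# container, forwarding any remaining arguments unchanged.
dsstne_run() {
    if [ "$#" -lt 1 ]; then
        echo "usage: dsstne_run TOOL [ARGS...]" >&2
        return 2
    fi
    tool="$1"
    shift
    "$DOCKER_CMD" run --rm -it "$DSSTNE_IMAGE" "$DSSTNE_BIN/$tool" "$@"
}

# Usage (needs the built image):
#   dsstne_run predict
```

A longer-term alternative would be to extend PATH in the Dockerfile itself, which is the kind of Dockerfile fix suggested above.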
