How use the nccl? #71

Closed
getengqing opened this issue Mar 2, 2017 · 8 comments

Comments

@getengqing

When I use NCCL with the CIFAR-10 example, the training time does not change. How should I use it?
Using $TOOLS/caffe train --solver=examples/cifar10/cifar10_full_solver.prototxt -gpu 0:
I0302 03:19:23.473100 8238 caffe.cpp:197] Using GPUs 0
I0302 03:19:23.473930 8238 caffe.cpp:202] GPU 0: Tesla P100-SXM2-16GB
I0302 03:19:23.825523 8238 solver.cpp:48] Initializing solver from parameters:
I0302 03:28:01.841064 8238 solver.cpp:362] Iteration 55000, Testing net (#0)
I0302 03:28:02.094408 8238 solver.cpp:429] Test net output #0: accuracy = 0.7896
I0302 03:28:02.094431 8238 solver.cpp:429] Test net output #1: loss = 0.620957 (* 1 = 0.620957 loss)
I0302 03:28:02.104485 8238 solver.cpp:242] Iteration 55000 (102.234 iter/s, 1.9563s/200 iter), loss = 0.370225
I0302 03:28:02.104549 8238 solver.cpp:261] Train net output #0: loss = 0.370225 (* 1 = 0.370225 loss)

Using $TOOLS/caffe train --solver=examples/cifar10/cifar10_full_solver.prototxt -gpu all, the time shows almost no change:
I0302 03:31:23.499303 8361 solver.cpp:362] Iteration 5000, Testing net (#0)
I0302 03:31:23.723893 8361 solver.cpp:429] Test net output #0: accuracy = 0.6884
I0302 03:31:23.723942 8361 solver.cpp:429] Test net output #1: loss = 0.890609 (* 1 = 0.890609 loss)
I0302 03:31:23.733755 8361 solver.cpp:242] Iteration 5000 (99.2158 iter/s, 2.01581s/200 iter), loss = 0.571789
I0302 03:31:23.733794 8361 solver.cpp:261] Train net output #0: loss = 0.571788 (* 1 = 0.571788 loss)
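
The two runs can be compared directly from the solver progress lines quoted above. The following is a minimal sketch, assuming the `Iteration N (X iter/s, Ys/200 iter)` format seen in these logs; the regular expression and the `parse_progress` helper are hypothetical names introduced for illustration and are not part of Caffe.

```python
import re

# Matches progress lines such as:
# "Iteration 55000 (102.234 iter/s, 1.9563s/200 iter), loss = 0.370225"
PROGRESS = re.compile(r"Iteration (\d+) \(([\d.]+) iter/s, ([\d.]+)s/(\d+) iter\)")


def parse_progress(line):
    """Return the timing fields of one solver progress line, or None if it does not match."""
    match = PROGRESS.search(line)
    if match is None:
        return None
    return {
        "iteration": int(match.group(1)),
        "iter_per_s": float(match.group(2)),
        "seconds": float(match.group(3)),
        "window": int(match.group(4)),
    }


# Lines copied from the two logs in this issue.
single = parse_progress(
    "I0302 03:28:02.104485 8238 solver.cpp:242] "
    "Iteration 55000 (102.234 iter/s, 1.9563s/200 iter), loss = 0.370225"
)
multi = parse_progress(
    "I0302 03:31:23.733755 8361 solver.cpp:242] "
    "Iteration 5000 (99.2158 iter/s, 2.01581s/200 iter), loss = 0.571789"
)

# Ratio of solver iterations per second: "-gpu all" run versus "-gpu 0" run.
ratio = multi["iter_per_s"] / single["iter_per_s"]
print(f"iterations/s ratio (all GPUs vs. GPU 0): {ratio:.3f}")
```

The ratio comes out at roughly 0.97, which matches the observation that the iteration rate barely moves between the two runs. It compares solver iterations per second only and says nothing about how many images each iteration processes.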

@sjeaugey
Member

sjeaugey commented Mar 2, 2017

Hi,

NCCL is an inter-GPU communication library; it therefore accelerates computation that runs on multiple GPUs.
That means it won't make a difference with only one GPU.
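
To illustrate why a communication library has nothing to do with a single GPU, here is a minimal sketch of a sum all-reduce, one of the collective operations NCCL provides. It is a pure-Python simulation with plain lists standing in for per-GPU buffers; it does not call NCCL, and `all_reduce_sum` is a hypothetical name.

```python
def all_reduce_sum(per_gpu_buffers):
    """Simulate a sum all-reduce: every participant receives the element-wise sum."""
    if not per_gpu_buffers:
        return []
    length = len(per_gpu_buffers[0])
    if any(len(buf) != length for buf in per_gpu_buffers):
        raise ValueError("all buffers must have the same length")
    # Element-wise sum across all participants.
    total = [sum(values) for values in zip(*per_gpu_buffers)]
    # Each participant gets its own copy of the reduced result.
    return [list(total) for _ in per_gpu_buffers]


# With one participant the operation is a no-op: there is nothing to exchange.
one_gpu = all_reduce_sum([[1.0, 2.0]])

# With four participants every buffer ends up holding the combined values.
four_gpus = all_reduce_sum([[1.0, 2.0], [1.0, 2.0], [1.0, 2.0], [1.0, 2.0]])

print(one_gpu)
print(four_gpus)
```

With a single buffer the output equals the input, which is the sense in which the library "won't make a difference" on one GPU.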

@getengqing
Author

Hello, I started with one GPU, but then used four GPUs! The result is no different.

@nluehr
Contributor

nluehr commented Mar 3, 2017

The CIFAR-10 example is not very computationally intensive. The GPU(s) are probably limited by kernel launch latencies. Perhaps try something like AlexNet, or better ResNet-50, on ImageNet 1000.

@getengqing
Author

OK! Thanks, I will try AlexNet!

@getengqing
Author

@nluehr @sjeaugey When I run caffe-0.15.9/build/tools/caffe time --model=./imagenet_winners/alexnet.prototxt --iterations=1000 --gpu all, the log says that multiple GPUs are not used. Why?
I0303 04:52:48.248244 21608 caffe.cpp:334] Not using GPU #3 for single-GPU function
I0303 04:52:48.248602 21608 caffe.cpp:334] Not using GPU #2 for single-GPU function
I0303 04:52:48.248613 21608 caffe.cpp:334] Not using GPU #1 for single-GPU function
I0303 04:52:48.359235 21608 caffe.cpp:341] Use GPU with device ID 0
I0303 04:52:48.359807 21608 caffe.cpp:345] GPU 0: Tesla P100-SXM2-16GB

@nluehr
Contributor

nluehr commented Mar 3, 2017

@getengqing, I'm not that familiar with Caffe. You probably want to direct this question to the caffe project.

@sergey-serebryakov

@getengqing Please see the int time() function in tools/caffe.cpp. Multi-GPU performance measurement is not supported.

@getengqing
Author

@sergey-serebryakov OK! Thanks!
