improve readme, explain what various flags add
inoryy authored Nov 26, 2018
1 parent 807e8a8 commit 1a80861
Showing 1 changed file with 12 additions and 1 deletion.
13 changes: 12 additions & 1 deletion README.md
@@ -8,7 +8,14 @@ The TensorFlow library wasn't compiled to use AVX2 instructions, but these are a
The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
```
## Available wheels

## Introduction

The builds enable CPU optimizations such as `SSE4`, `AVX2`, and `FMA`. If your CPU was released after ~2013, you'll benefit from them. Note that you will benefit even if you do all your training on a GPU, thanks to I/O pipeline optimizations. I think I've gained about a 10-15% performance boost even on the most straightforward supervised learning tasks. And of course, in a CPU-only setting they give a significant improvement, sometimes matching GPU speeds on smaller neural networks (especially on laptops, where GPUs tend to lag behind even in higher-end models).
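
To check whether your CPU actually has these instruction sets before installing a wheel, something like the following sketch may help. It assumes a Linux x86 machine where `/proc/cpuinfo` lists features under a `flags` line, using the kernel's names `sse4_1`, `sse4_2`, `avx2`, and `fma`. The helper `missing_cpu_flags` is a hypothetical name for illustration and is not part of this repository or of TensorFlow.

```python
# Hypothetical helper (not part of this repo): report which of the CPU
# features named above are missing, based on Linux's /proc/cpuinfo format.
REQUIRED_FLAGS = ("sse4_1", "sse4_2", "avx2", "fma")


def missing_cpu_flags(cpuinfo_text, required=REQUIRED_FLAGS):
    """Return the required flags absent from the first 'flags' line."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            available = set(value.split())
            return [flag for flag in required if flag not in available]
    # No 'flags' line found (e.g. a non-x86 CPU): treat every flag as missing.
    return list(required)


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            missing = missing_cpu_flags(f.read())
    except OSError:
        print("/proc/cpuinfo is not readable; this check only works on Linux.")
    else:
        print("Missing flags:", ", ".join(missing) if missing else "none")
```

If any flag is reported missing, a wheel built with that optimization will most likely not run on that machine.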

Additionally, the builds enable [XLA](https://www.tensorflow.org/xla/), the Accelerated Linear Algebra domain-specific just-in-time compiler, and [MPI](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/mpi), a faster way to run distributed TensorFlow than the built-in option.

## Available Wheels
|TensorFlow|CUDA|CuDNN|Python|NCCL|Compute Capability|OS|Link|
|---:|---:|---:|---:|---:|---:|---:|:---:|
|1.12.0|10.0|7.3|3.6|2.3|5.0, 6.1, 7.0|Linux|[tensorflow-1.12.0-cp36-cp36m-linux_x86_64.whl](https://github.com/inoryy/tensorflow-optimized-wheels/releases/download/v1.12.0/tensorflow-1.12.0-cp36-cp36m-linux_x86_64.whl)|
@@ -34,3 +41,7 @@ Type "help", "copyright", "credits" or "license" for more information.
>>> tf.__version__
'1.12.0'
```

## Requests

If you need a different TensorFlow / CUDA / CuDNN / Python combination, feel free to open a GitHub issue.
