Merge pull request #8704 from ArchRobison/adr/simddoc
Revise `@simd` documentation.
nolta committed Oct 23, 2014
2 parents f104352 + a3a0d6c commit 3183601
Showing 1 changed file with 14 additions and 4 deletions.
18 changes: 14 additions & 4 deletions doc/manual/performance-tips.rst
@@ -603,14 +603,24 @@ properties of the loop:
 possibly causing different results than without ``@simd``.
 - No iteration ever waits on another iteration to make forward progress.

+A loop containing ``break``, ``continue``, or ``goto`` will cause a
+compile-time error.
+
 Using ``@simd`` merely gives the compiler license to vectorize. Whether
 it actually does so depends on the compiler. To actually benefit from the
 current implementation, your loop should have the following additional
 properties:

 - The loop must be an innermost loop.
-- The loop body must be straight-line code. This is why ``@inbounds`` is currently needed for all array accesses.
-- Accesses must have a stride pattern and cannot be "gathers" (random-index reads) or "scatters" (random-index writes).
-- The stride should be unit stride.
-- In some simple cases, for example with 2-3 arrays accessed in a loop, the LLVM auto-vectorization may kick in automatically, leading to no further speedup with ``@simd``.
+- The loop body must be straight-line code. This is why ``@inbounds`` is
+currently needed for all array accesses. The compiler can sometimes turn
+short ``&&``, ``||``, and ``?:`` expressions into straight-line code,
+if it is safe to evaluate all operands unconditionally. Consider using
+``ifelse`` instead of ``?:`` in the loop if it is safe to do so.
+- Accesses must have a stride pattern and cannot be "gathers" (random-index reads)
+or "scatters" (random-index writes).
+- The stride should be unit stride.
+- In some simple cases, for example with 2-3 arrays accessed in a loop, the
+LLVM auto-vectorization may kick in automatically, leading to no further
+speedup with ``@simd``.
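
A minimal sketch of the ``ifelse`` advice in the revised text, not taken from the commit itself. The function name ``sum_positive`` is hypothetical and chosen only for illustration. The loop is innermost, reads the array at unit stride, uses ``@inbounds`` to avoid bounds-check branches, and uses ``ifelse`` so that both operands are evaluated unconditionally and the body stays straight-line. Whether the compiler actually vectorizes it depends on the Julia and LLVM versions in use.

```julia
# Hypothetical example: sum only the positive entries of a vector.
function sum_positive(x::Vector{Float64})
    s = 0.0
    # @inbounds removes the bounds-check branch; @simd permits reordering
    # of the floating-point reduction on `s`.
    @inbounds @simd for i in 1:length(x)
        # ifelse evaluates both value arguments unconditionally, so no
        # branch appears in the loop body (unlike `x[i] > 0.0 ? x[i] : 0.0`,
        # which the compiler may or may not flatten on its own).
        s += ifelse(x[i] > 0.0, x[i], 0.0)
    end
    return s
end
```

Calling ``sum_positive([1.0, -2.0, 3.0])`` adds only the positive entries. To check whether vector instructions were generated on a given setup, one option is to inspect the output of ``@code_llvm sum_positive(rand(16))`` for vector types.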
