A deep learning framework built on an autograd engine, with high-level abstractions and low-level control.
(Demo video: zigrad-demo.mp4)
Fast
2.5x+ speedup over a compiled PyTorch model on Apple Silicon, 1.5x on x86. Expect similar performance gains across more architectures and platforms as MKL/CUDA support improves and Zigrad's ML graph compiler becomes operational.*
*TensorFlow excluded for scaling purposes (too slow). A hermetic, reproducible benchmarking pipeline built on Bazel will allow testing across more platforms (in progress, testers needed).
Built for specialized optimization
Zigrad's design enables deep control and customization:
- Fine-grained control over memory management (see the sketch after this list)
- Flexible tradeoffs between performance characteristics like latency vs throughput
- Optimize for your specific hardware, use case, and system requirements
- No abstraction layers or build systems that make aggressive optimizations challenging or complex
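As a small illustration of what that control can look like, here is a sketch of scoping a training step's allocations inside an arena so the whole step can be freed at once. The `zg` import and the `NDTensor` call are assumptions for illustration, not the exact Zigrad API:

```zig
const std = @import("std");
const zg = @import("zigrad"); // hypothetical import name

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa.deinit();

    // Put every allocation for one training step in an arena so the step's
    // intermediates can be released in a single call -- the kind of tradeoff
    // an allocator-aware API leaves in your hands.
    var arena = std.heap.ArenaAllocator.init(gpa.allocator());
    defer arena.deinit();
    const step_alloc = arena.allocator();

    // Illustrative tensor construction; the real signature may differ.
    const x = try zg.NDTensor(f32).init(&.{ 1.0, 2.0, 3.0 }, &.{3}, true, step_alloc);
    _ = x;
}
```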
But wait, there's more...
- Tiny binaries: binaries for the MNIST tests shown are under 400kb in `ReleaseFast` mode and under 200kb in `ReleaseSmall`
- Graph tracing
- TensorBoard integration*
- Cross platform
- Statically linked executables
- Minimal and transparent heap allocations
*Not yet merged
An example of tracing the computation graph generated by a fully connected neural network for MNIST.
- Input: batch of 28x28 pixel image samples
- Flatten: 28x28 -> 784
- FC1: Linear layer, 784 -> 128
- ReLU
- FC2: Linear layer, 128 -> 64
- ReLU
- FC3: Linear layer, 64 -> 10
- Output: Value for each of the 10 classes
We did not have to use Zigrad's modules to write this network at all, as Zigrad is backed by a capable autograd engine. Even when using the autograd backend to dynamically construct the same neural network, Zigrad can still trace the graph and render it.
Note: since the graph is generated from the autograd information, the node labels are set by naming the tensors for the sake of the diagram.
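For illustration, a sketch of what building one forward pass of this network directly on the autograd engine and labelling the tensors might look like; every `zg` call and method name below is an assumption rather than the exact Zigrad API, and only the shapes come from the list above:

```zig
const std = @import("std");
const zg = @import("zigrad"); // hypothetical import name

const Tensor = zg.NDTensor(f32); // assumed tensor type

// params.w1/w2/w3 are assumed weight tensors of shape (784,128), (128,64), (64,10);
// biases are omitted for brevity.
fn forward(alloc: std.mem.Allocator, x: *Tensor, params: anytype) !*Tensor {
    // x: (batch, 784) after flattening the 28x28 inputs
    var h = try x.matmul(params.w1, alloc); // (batch, 128)
    h.setLabel("fc1"); // naming tensors is what makes the traced graph readable
    h = try h.relu(alloc);
    h = try h.matmul(params.w2, alloc); // (batch, 64)
    h.setLabel("fc2");
    h = try h.relu(alloc);
    h = try h.matmul(params.w3, alloc); // (batch, 10): one score per class
    h.setLabel("fc3");
    return h;
}
```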
The only dependency is a BLAS library.
On Linux (or an Intel Mac) you have some options:
- MKL (recommended for best performance)
- See https://www.intel.com/content/www/us/en/developer/tools/oneapi/onemkl-download.html
- A system installation is recommended for simplicity, although this can also work with `conda`, for example; just make sure you adjust the library paths as necessary.
- OpenBLAS
- See https://github.com/OpenMathLib/OpenBLAS/wiki/Precompiled-installation-packages
- Likely available through your package manager as `libopenblas-dev` or `openblas-devel`
- Nothing :)
The `examples/` directory has some standalone templates you can take and modify; the zon files are pinned to commit hashes.
The hello world example shows how to run a backward pass using the `GraphManager`. Note that in this very simple example we do not need the `GraphManager` and the script could be simplified, but it is designed to get you familiar with the workflow; a rough sketch of what it does follows the commands below.
git clone https://github.com/Marco-Christiani/zigrad/
cd zigrad/examples/hello-world
zig build run
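Conceptually, the example boils down to something like the following sketch; the `NDTensor` and `GraphManager` signatures here are assumptions for illustration, so refer to `examples/hello-world` for the real code:

```zig
const std = @import("std");
const zg = @import("zigrad"); // hypothetical import name

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa.deinit();
    const alloc = gpa.allocator();

    const Tensor = zg.NDTensor(f32);

    // y = w * x, with gradients tracked for w only
    const x = try Tensor.init(&.{2.0}, &.{1}, false, alloc);
    const w = try Tensor.init(&.{3.0}, &.{1}, true, alloc);
    const y = try w.mul(x, alloc);

    // The GraphManager walks the recorded graph in reverse and accumulates grads.
    var gm = zg.GraphManager(Tensor).init(alloc, .{});
    defer gm.deinit();
    try gm.backward(y, alloc);

    // dy/dw should come out as the value of x (2.0) in this tiny expression.
    std.debug.print("dy/dw = {d}\n", .{w.grad.?.data[0]});
}
```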
Run the MNIST demo:
cd zigrad/examples/mnist
make help
make
A lot is planned, and we are hoping for support from the Zig community so that we can accomplish some of the more ambitious goals.
- More comprehensive MKL support
- More parallelization (e.g. activation functions)
- CUDA support
- Lazy tensors
- Static graph optimization
- Dynamic graph compiler
- MLIR
- Support for popular formats like ONNX and ggml.
- ZML translation for inference
- Lack of GPU support for now
- Effort has been directed towards performant primitives, so not many layer types have been implemented
- e.g. conv, pooling, etc. are test implementations for verification; they are slow and unoptimized, and I would not use them
- In addition to the above list, anything in `docs/roadmap.norg` is planned
- Any open issue is available for development; just leave a comment mentioning your interest and I can provide support to help get you started if necessary
- Otherwise, please open an issue first before working on a PR
- If you are interested in contributing but do not know where to start, open an issue or leave a comment