This is the repository for the paper:
Michael A. Alcorn. AQuaMaM: An Autoregressive, Quaternion Manifold Model for Rapidly Estimating Complex SO(3) Distributions. arXiv. 2023.
AQuaMaM effectively models the uncertainty of different viewpoints of objects, as demonstrated here for a die and a cylinder (note, separate models were trained for each object).
In the plots, each point corresponds to a rotation vector.
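The rotation-vector representation used in these plots can be obtained from a quaternion with SciPy (a minimal sketch with an illustrative quaternion, not the repository's plotting code):

```python
import numpy as np
from scipy.spatial.transform import Rotation

# A unit quaternion in (x, y, z, w) order, as SciPy expects;
# this one encodes a rotation of pi/4 radians about the z-axis.
quat = np.array([0.0, 0.0, np.sin(np.pi / 8), np.cos(np.pi / 8)])

# The rotation vector points along the rotation axis, and its norm
# is the rotation angle in radians.
rotvec = Rotation.from_quat(quat).as_rotvec()
print(rotvec)  # ~ [0, 0, 0.7854]
```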
An overview of the AQuaMaM architecture.
Given an image/rotation matrix pair, the image and a [START] embedding are fed to the model. Because there is a bijective mapping between the unit disk and the upper half of the unit sphere, a unit quaternion with $q_w \geq 0$ is fully determined by its first three components.
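Concretely, because $q$ and $-q$ represent the same rotation, a unit quaternion can be restricted to $q_w \geq 0$, so the fourth component is recoverable from the first three (a minimal sketch, not the repository's code):

```python
import numpy as np

def complete_quaternion(qx, qy, qz):
    """Recover q_w >= 0 from the first three components of a unit
    quaternion; (qx, qy, qz) must lie inside the unit ball."""
    norm_sq = qx**2 + qy**2 + qz**2
    assert norm_sq <= 1.0, "point lies outside the unit ball"
    qw = np.sqrt(1.0 - norm_sq)
    return np.array([qx, qy, qz, qw])

q = complete_quaternion(0.1, -0.2, 0.3)
print(np.linalg.norm(q))  # 1.0, i.e., a valid unit quaternion
```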
(a) | (b)
AQuaMaM can model distributions on other kinds of manifolds too. The left plots in (a) and (b) show the true density on a sphere for a mixture of two von Mises–Fisher distributions with the viewpoints being centered on the two modes. The right plots show the density learned by AQuaMaM Jr. (a slight modification of the AQuaMaM architecture; see the Colab notebook here).
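For reference, the true density in such plots can be written down directly: on the 2-sphere, the von Mises–Fisher density is $\frac{\kappa}{4\pi \sinh \kappa} e^{\kappa \mu^{\top} x}$. A minimal sketch in plain NumPy (the means, concentrations, and weights below are illustrative, not the ones used in the notebook):

```python
import numpy as np

def vmf_pdf(x, mu, kappa):
    """von Mises-Fisher density on the 2-sphere (x and mu are unit 3-vectors)."""
    return kappa * np.exp(kappa * (mu @ x)) / (4 * np.pi * np.sinh(kappa))

def mixture_pdf(x, mus, kappas, weights):
    """Density of a weighted mixture of vMF distributions at the unit vector x."""
    return sum(w * vmf_pdf(x, mu, k) for mu, k, w in zip(mus, kappas, weights))

mus = [np.array([0.0, 0.0, 1.0]), np.array([1.0, 0.0, 0.0])]  # two modes
kappas = [10.0, 10.0]
weights = [0.5, 0.5]

# The density is largest near the modes and falls off away from them.
p_mode = mixture_pdf(np.array([0.0, 0.0, 1.0]), mus, kappas, weights)
p_far = mixture_pdf(np.array([0.0, -1.0, 0.0]), mus, kappas, weights)
print(p_mode, p_far)
```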
If you use this code for your own research, please cite:
@article{alcorn2023aquamam,
  title={AQuaMaM: An Autoregressive, Quaternion Manifold Model for Rapidly Estimating Complex SO(3) Distributions},
  author={Alcorn, Michael A.},
  journal={arXiv preprint arXiv:2301.08838},
  year={2023}
}
pip3 install --upgrade -r requirements.txt
python3 generate_datasets.py {cube|cylinder}
This script renders 520,000 images of the selected object (`cube` or `cylinder`), so it takes a little while to run.
The script creates a directory named `cube` or `cylinder` that contains three CSVs (`metadata_train.csv`, `metadata_valid.csv`, and `metadata_test.csv`) and a folder `images` with three subfolders: `train` (which contains 500,000 images), `valid` (which contains 10,000 images), and `test` (which also contains 10,000 images).
Run the following script, editing the variables as appropriate. To change the model/training hyperparameters for a run, edit `configs.py`.
MODEL=aquamam
DATASET=cube
nohup python3 train.py ${MODEL} ${DATASET} > ${MODEL}_${DATASET}.log &
Run the following script, editing the variables as appropriate.
MODEL=aquamam
DATASET=cube
python3 evaluate.py ${MODEL} ${DATASET}
Following "Delving into Discrete Normalizing Flows on SO(3) Manifold for Probabilistic Rotation Modeling", I trained AQuaMaM on a "peak" distribution (Colab notebook here). On this distribution, the discrete normalizing flow model reached a log-likelihood of 13.93 with the next closest baseline model reaching 13.47 (see Table 1 in their paper). In comparison, AQuaMaM reached a log-likelihood of 29.51. This performance is a direct consequence of AQuaMaM's formulation, and is discussed in Section 2.4 of the manuscript:
As a final point, it is worth noting that the last term in Equation 4 is bounded below such that $\frac{N q_{w}}{2 \omega_{q_{y}} \omega_{q_{z}}} \geq \frac{N^{3} q_{w}}{8}$, i.e., for a given $\pi_{q_{x}} \pi_{q_{y}} \pi_{q_{z}}$, the likelihood increases at least cubically with the number of bins $N$. For 50,257 bins (i.e., the size of GPT-2/3's vocabulary), $N^{3} = 1.26 \times 10^{14}$.
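The arithmetic behind that bound is easy to check: since the bins partition $[-1, 1]$, each bin width $\omega$ is at most $2/N$, and the bound is attained exactly when the bins are uniform (a quick numerical check, with an illustrative value for $q_w$):

```python
N = 50_257  # the GPT-2/3 vocabulary size cited in the manuscript
q_w = 0.5   # any fixed q_w > 0; hypothetical value for illustration

# Uniform bins over [-1, 1] give the widest possible bins, omega = 2 / N,
# so the density term hits its lower bound exactly in that case.
omega = 2 / N
term = N * q_w / (2 * omega * omega)
lower_bound = N**3 * q_w / 8
assert term >= lower_bound

print(f"N^3 = {N**3:.3e}")  # ~1.269e+14, matching the manuscript's 1.26 x 10^14
```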