Developer Notes
Mo Tiwari edited this page Feb 19, 2023 · 17 revisions
Welcome to the BanditPAM wiki!
This is a space for code contributors to keep track of notes and learnings that don't belong in GitHub issues.
- An R implementation of BanditPAM
- A MATLAB implementation of BanditPAM
- An integration with PySpark
- `setuptools` will always, at least partly, use the compiler that Python was compiled with. This causes a problem, e.g., when trying to install `clang`-compiled BanditPAM on `gcc`-compiled Python, which resulted in errors. This CANNOT be fixed by modifying the `CC` environment variable. See https://github.com/pypa/setuptools/issues/1732
- You may occasionally get a bug like `(Producer: 'LLVM13.0.0' Reader: 'LLVM 12.0.0')`; somehow this was the case in `base` after uninstalling and reinstalling some `brew` packages. Weirdly, it was resolved by creating a new Python 3.8 `conda` environment, in which BanditPAM could be installed successfully, after which the problem was somehow (?!) also fixed in `base`
- Building the PyPy wheels on macOS via `cibuildwheel` does not work properly; see install_mac.md. We get an error in the GitHub Actions like the one below. I separately tried adding this gist to the `.yml`, as well as this suggestion, but neither worked. Future possibilities are to a) upgrade the Accelerate framework on the runner, b) avoid using the Accelerate framework for the PyPy builds, c) try a version of `macos` on the runner that's later than `macos 10.15` (but this might hurt backwards compatibility), d) try suggestions from here like `python -mpip install numpy`, or e) try to modify the PyPy build's `numpy` installation once it has been instantiated
RuntimeError: Polyfit sanity test emitted a warning, most likely due to using a buggy Accelerate backend. If you compiled yourself, more information is available at https://numpy.org/doc/stable/user/building.html#accelerated-blas-lapack-libraries Otherwise report this to the vendor that provided NumPy.
RankWarning: Polyfit may be poorly conditioned
- It appears that CPython >= 3.10 is compiled with `clang` in `cibuildwheel`, whereas CPython <= 3.9 is compiled with `gcc`. This affects how libraries like `omp` vs. `gomp` should be linked.
- Potentially transpose the cache to avoid false sharing
- Move to a multi-producer, single-consumer queue for the cache so that the cache can be dynamically resized
- Give each thread a local copy of the cache
- Helpful resource: Lecture 9 of series in OMP
- Good practice to put `default(none)` on all `omp parallel` constructs, so that every variable's data-sharing attribute must be stated explicitly
- Prevent false sharing among threads for better speedups (this depends on the local cache line size and datatype sizes)
- Consider using loop reductions via OpenMP
- Right now, we compile with system Python on the macOS GitHub runners. It appears to work, though I'm not sure if the runners are using `gcc` or `clang`, or if it matters, since the `setup.py` should detect it properly.
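
One way to settle which compiler the runners use would be to print the relevant settings in a CI step. This is a minimal sketch using only the standard library; `describe_build_compilers` is a hypothetical helper name, and its output only reports what Python knows about its own build, not what any particular `setup.py` ends up invoking.

```python
import os
import platform
import sysconfig


def describe_build_compilers(environ=None) -> dict:
    """Collect compiler-related settings that can disagree during a build."""
    environ = os.environ if environ is None else environ
    return {
        # Compiler that built the running interpreter, e.g. "GCC 9.4.0".
        "python_built_with": platform.python_compiler(),
        # Compiler command recorded in Python's build configuration
        # (None on platforms without a Makefile-based build, e.g. Windows).
        "sysconfig_CC": sysconfig.get_config_var("CC"),
        # User-supplied override from the environment, if any.
        "env_CC": environ.get("CC"),
    }


if __name__ == "__main__":
    for name, value in describe_build_compilers().items():
        print(f"{name}: {value}")
```

If `python_built_with` and `env_CC` name different compilers, that is the mismatch scenario described in the `setuptools` note.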
- pbr
- Cython: Cython will likely be MUCH faster. It's how the `sklearn` implementation of `KMeans` is written
- Numba
- Eigen (`pybind11` supports it out of the box, and we will likely no longer need `carma` or `armadillo`)
- Boost
- Folly