SIMBA is an R package for the Simulation of Microbiome data with Biological
Accuracy.
Based on real data, the package simulates new metagenomic data by re-sampling
real samples and then implants differentially abundant features. Additionally,
the package can simulate confounding factors based on metadata variables in
the real data. The simulations are stored in an .h5 file, which is then
the basis for downstream benchmarking, involving i) reality assessment of the
simulations, ii) testing for differential abundance, and iii) evaluation of
the output from differential abundance testing methods.
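
To illustrate the idea of re-sampling real samples and implanting differentially abundant features, here is a minimal sketch in base R. It does not use the SIMBA API: the toy count matrix, the variable names, and the fold-change style `effect_size` are all assumptions made for illustration, and writing the result to an .h5 file is omitted.

```r
# Conceptual sketch in base R; this is NOT the SIMBA API.
set.seed(42)

# Toy stand-in for a real dataset: 20 features (rows) x 30 samples (columns).
real <- matrix(rnbinom(20 * 30, mu = 50, size = 0.5), nrow = 20,
               dimnames = list(paste0("feat_", 1:20), paste0("sample_", 1:30)))

# 1) Re-sample real samples with replacement and assign group labels.
#    Because the re-sampling is random, the labels carry no signal yet.
n_sim <- 40
idx <- sample(ncol(real), n_sim, replace = TRUE)
sim <- real[, idx]
colnames(sim) <- paste0("sim_", seq_len(n_sim))
group <- rep(c("control", "case"), each = n_sim / 2)

# 2) Implant a signal: scale the counts of a few features in the case group.
implanted <- sample(rownames(sim), 3)
effect_size <- 2  # hypothetical fold change, chosen for illustration
case <- group == "case"
sim[implanted, case] <- round(sim[implanted, case] * effect_size)
```

Because `implanted` is known, it can later serve as the ground truth against which differential abundance testing methods are evaluated.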
SIMBA was built using R version 4.0 and should run on any operating system
that supports R. It is available via GitHub and can be installed via devtools:
require("devtools")
devtools::install_github(repo = 'zellerlab/SIMBA')
Estimated installation time on a typical desktop computer: 12 seconds.
Additionally, the package has been submitted to CRAN under the name simbaR.
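
After installation, a quick check along the following lines may help confirm that the package can be loaded. This is a sketch under the assumption that the installed package is named `SIMBA`, like the repository; if it is installed under a different name, adjust `pkg` accordingly.

```r
# Assumption: the installed package is named "SIMBA", like the repository.
pkg <- "SIMBA"

# requireNamespace() returns TRUE if the package can be loaded, FALSE otherwise.
simba_installed <- requireNamespace(pkg, quietly = TRUE)

if (simba_installed) {
  # List the vignettes that were installed with the package (may be empty,
  # since install_github() does not build vignettes unless asked to).
  print(utils::vignette(package = pkg))
} else {
  message("Package '", pkg, "' was not found; see the installation command above.")
}
```

If no vignette is listed, re-installing with `build_vignettes = TRUE` passed to `devtools::install_github()` should build it locally.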
A typical SIMBA workflow consists of four steps, which are explained in more
detail in the vignette, using a toy example:
- Using a real dataset, SIMBA simulates data for benchmarking
- SIMBA performs a reality assessment of the simulated data
- Various differential abundance testing methods are applied to the simulations
- The output of the differential abundance testing methods is evaluated
Please see the vignette for more detail.
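
The testing and evaluation steps of such a workflow can be sketched in base R as follows. This is not the SIMBA API: the toy data, the choice of the Wilcoxon test with Benjamini-Hochberg correction, the 0.05 cutoff, and all variable names are assumptions made for illustration only.

```r
# Conceptual sketch in base R; this is NOT the SIMBA API.
set.seed(1)

# Toy simulation with a known ground truth: 50 features, 25 samples per group.
n_feat <- 50
n_per_group <- 25
group <- rep(c("control", "case"), each = n_per_group)
counts <- matrix(rnbinom(n_feat * 2 * n_per_group, mu = 100, size = 2),
                 nrow = n_feat)
truth <- seq_len(n_feat) <= 5  # the first five features carry a signal
counts[truth, group == "case"] <- counts[truth, group == "case"] * 4

# Differential abundance testing: one Wilcoxon test per feature,
# followed by Benjamini-Hochberg correction for multiple testing.
p_val <- apply(counts, 1, function(x) {
  wilcox.test(x[group == "case"], x[group == "control"], exact = FALSE)$p.value
})
p_adj <- p.adjust(p_val, method = "BH")
called <- !is.na(p_adj) & p_adj < 0.05

# Evaluation: compare the called features against the ground truth.
tp <- sum(called & truth)
fp <- sum(called & !truth)
recall <- tp / sum(truth)
fdr <- if (sum(called) > 0) fp / sum(called) else 0
```

Repeating this over many simulated datasets and effect sizes gives a picture of how well a method controls false discoveries while still recovering the implanted features.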
Additionally, check out the BAMBI repository on GitHub, which contains scripts for the large benchmarking effort reported in our preprint.
If you have any questions about SIMBA, run into any issues, or would like to
make a feature request, please:
- create an issue in this repository or
- email Morgan Essex or Jakob Wirbel.
SIMBA is distributed under the GPL-3 license.
If you use SIMBA, please cite:
Wirbel J, Essex M, Forslund SK, Zeller G. Evaluation of microbiome association models under realistic and confounded conditions. bioRxiv (2022). https://doi.org/10.1101/2022.05.09.491139