PEPPER is a genome inference module based on recurrent neural networks that enables long-read variant calling and nanopore assembly polishing in the PEPPER-Margin-DeepVariant pipeline. This pipeline enables nanopore-based variant calling with DeepVariant.
PEPPER-Margin-DeepVariant v0.6 supports:
- Oxford Nanopore variant calling for the Guppy 5.0.7 "Sup" basecaller.
- Oxford Nanopore variant calling for R10.4 Q20.
- PacBio-HiFi variant calling.
- Assembly-based structural variant calling method HapDup.
Please cite the following manuscript if you are using PEPPER-Margin-DeepVariant:
Nature Methods: Haplotype-aware variant calling enables high accuracy in nanopore long-reads using deep neural networks.
Authors: Kishwar Shafin, Trevor Pesout, Pi-Chuan Chang, Maria Nattestad, Alexey Kolesnikov, Sidharth Goel, Gunjan Baid, Mikhail Kolmogorov, Jordan M. Eizenga, Karen H. Miga, Paolo Carnevali, Miten Jain, Andrew Carroll & Benedict Paten.
PEPPER-Margin-DeepVariant can be run using Docker or Singularity. A simple docker command looks like:
sudo docker run \
-v "${INPUT_DIR}":"${INPUT_DIR}" \
-v "${OUTPUT_DIR}":"${OUTPUT_DIR}" \
kishwars/pepper_deepvariant:r0.6 \
run_pepper_margin_deepvariant call_variant \
-b "${INPUT_DIR}/${BAM}" \
-f "${INPUT_DIR}/${REF}" \
-o "${OUTPUT_DIR}" \
-t "${THREADS}" \
--ont_r9_guppy5_sup
# --ont_r9_guppy5_sup is preset for ONT R9.4.1 Guppy 5 "Sup" basecaller
# for ONT R10.4 Q20 reads: --ont_r10_q20
# for PacBio-HiFi reads: --hifi
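The command above expects `INPUT_DIR`, `OUTPUT_DIR`, `BAM`, `REF` and `THREADS` to be set beforehand. A minimal sketch of that setup follows, assuming placeholder paths and file names. The `preset_flag` helper and its labels (`ont_r9`, `ont_r10`, `hifi`) are hypothetical conveniences, not part of the pipeline; only the three preset flags come from the comments above. The script prints the assembled command instead of running it, so it can be inspected first.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a short read-type label to one of the preset
# flags listed above. The labels are illustrative, not pipeline options.
preset_flag() {
  case "$1" in
    ont_r9)  echo "--ont_r9_guppy5_sup" ;;
    ont_r10) echo "--ont_r10_q20" ;;
    hifi)    echo "--hifi" ;;
    *)       echo "unknown read type: $1" >&2; return 1 ;;
  esac
}

# Placeholder values (assumptions); replace with your own paths and files.
INPUT_DIR="${PWD}/input"
OUTPUT_DIR="${PWD}/output"
BAM="sample.bam"
REF="reference.fasta"
THREADS=4
PRESET="$(preset_flag ont_r9)"

# Dry run: print the docker command rather than executing it.
echo sudo docker run \
  -v "${INPUT_DIR}:${INPUT_DIR}" \
  -v "${OUTPUT_DIR}:${OUTPUT_DIR}" \
  kishwars/pepper_deepvariant:r0.6 \
  run_pepper_margin_deepvariant call_variant \
  -b "${INPUT_DIR}/${BAM}" \
  -f "${INPUT_DIR}/${REF}" \
  -o "${OUTPUT_DIR}" \
  -t "${THREADS}" \
  "${PRESET}"
```

Removing the leading `echo` would run the command for real. The BAM and reference files must then exist under `INPUT_DIR`, since only the two mounted directories are visible inside the container.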
The variant calling pipeline can be run on Docker or Singularity. The case studies are designed on chr20 of the HG002 sample for ONT and the HG003 sample for PacBio-HiFi.
The case studies include input data and benchmarking of each run:
- Nanopore variant calling using Docker: Link
- Nanopore variant calling using Singularity: Link
- Nanopore R10.4 Q20 variant calling: Link
The PEPPER license, Margin license and DeepVariant license extend to the trained models (PEPPER, Margin and DeepVariant) and the container environments (Docker and Singularity).
We are thankful to the developers of these packages:
The PEPPER-Margin-DeepVariant pipeline is developed in collaboration between the UC Santa Cruz Genomics Institute and the Genomics team at Google Health.
The name "P.E.P.P.E.R." is inspired by an A.I. created by Tony Stark in the Marvel Comics (Earth-616).
PEPPER is named after Tony Stark's then friend and the CEO of Resilient, Pepper Potts.