Merge pull request #4 from calico/revision-upd-2
Revision update
Showing 64 changed files with 4,228 additions and 97 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,14 @@ | ||
# Borzoi Model Evaluation & Analyses | ||
This repository contains shell scripts, notebooks, commands, etc. related to the analyses performed in the [Borzoi manuscript](https://www.biorxiv.org/content/10.1101/2023.08.30.555582v1). These analyses invoke functionality from both the [borzoi repository](https://github.com/calico/borzoi.git) and the [baskerville repository](https://github.com/calico/baskerville.git). Visit those links for general install instructions. | ||
# Borzoi Model Training & Evaluation | ||
|
||
This repository contains shell scripts, notebooks, commands, etc. related to the analyses performed in the [Borzoi paper](https://www.biorxiv.org/content/10.1101/2023.08.30.555582v1), including data processing, model training, and evaluation. These analyses invoke functionality from the [borzoi](https://github.com/calico/borzoi.git), [baskerville](https://github.com/calico/baskerville.git), and [westminster](https://github.com/calico/westminster.git) repositories. Visit those links for general install instructions. | ||
|
||
*Tip*: When executing .sh scripts found in this directory structure, we recommend first navigating in the terminal to the 'borzoi/examples' directory from the [borzoi repository](https://github.com/calico/borzoi), since all file paths are relative to this root directory. | ||
|
||
For example, assuming *borzoi-paper* and *borzoi* are cloned to your home folder, issue commands of the form: | ||
```sh | ||
conda activate <my_conda_env> | ||
cd ~/borzoi/examples | ||
. ~/borzoi-paper/analysis/<some_folder>/<some_script>.sh | ||
``` | ||
|
||
Contact *drk (at) calicolabs.com* or *jlinder (at) calicolabs.com* for questions. |
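
The README tip above says that all file paths in the scripts are relative to the 'borzoi/examples' directory. A minimal sketch of a guard that enforces this before sourcing a script is shown here. The function name `run_paper_script` is hypothetical and is not part of any of the repositories; it only illustrates the idea.

```sh
#!/bin/sh
# Hypothetical helper (not part of the borzoi or borzoi-paper repositories):
# source an analysis script only when the current directory is the
# 'borzoi/examples' folder, since the scripts use paths relative to it.
run_paper_script() {
    script="$1"

    # Refuse to run from any directory that does not end in borzoi/examples.
    case "$PWD" in
        */borzoi/examples) ;;
        *)
            echo "error: run from borzoi/examples (current: $PWD)" >&2
            return 1
            ;;
    esac

    # Check that the script exists before sourcing it.
    if [ ! -f "$script" ]; then
        echo "error: script not found: $script" >&2
        return 1
    fi

    # Source the script in the current shell, as the README examples do.
    . "$script"
}
```

Under these assumptions, a call such as `run_paper_script ~/borzoi-paper/analysis/<some_folder>/<some_script>.sh` would fail early with a message rather than produce missing-file errors partway through a script.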
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,30 @@ | ||
## Analyses | ||
|
||
This directory contains model evaluation scripts and other downstream analyses. | ||
|
||
*Notes*: | ||
- Run the script 'setup_data.sh' to organize the multi-fold hg38 and mm10 data folders, which are required in order to run some evaluations. The hg38 and mm10 data must first be downloaded from the Borzoi training data bucket [here](https://storage.googleapis.com/borzoi-paper/data/) (GCP). | ||
- Some scripts require the QTL data, which can be downloaded [here](https://storage.googleapis.com/borzoi-paper/qtl/) (GCP). | ||
<br/> | ||
|
||
As an example, to evaluate the model on gene-level test set predictions, issue the following commands: | ||
```sh | ||
conda activate borzoi_py310 | ||
cd ~/borzoi/examples | ||
. ~/borzoi-paper/analysis/setup_data.sh | ||
. ~/borzoi-paper/analysis/test_expression/testg.sh | ||
``` | ||
|
||
As another example, to evaluate the model on sQTL variant effect predictions, issue these commands: | ||
```sh | ||
conda activate borzoi_py310 | ||
cd ~/borzoi/examples | ||
. ~/borzoi-paper/analysis/sqtl/bench_sqtl.sh | ||
``` | ||
|
||
The examples assume that you have | ||
- installed a conda environment named 'borzoi_py310', | ||
- cloned the 'borzoi' and 'borzoi-paper' repositories to your home folder, | ||
- downloaded the borzoi training data to '~/borzoi/examples/data', | ||
- downloaded the QTL data to '~/borzoi/examples/data/qtl_cat', | ||
- and configured the borzoi repository ([instructions](https://github.com/calico/borzoi?tab=readme-ov-file#installation)). |
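
The directory assumptions listed above can be checked before running any evaluation. The following is a minimal sketch, not something shipped with the repository: the function name `check_borzoi_setup` and its optional root argument are hypothetical. It checks only the directories named in the list; it does not verify the conda environment or the borzoi configuration.

```sh
#!/bin/sh
# Hypothetical pre-flight check (not part of the repository).
# Verifies the directory layout that the README examples assume.
# Optional first argument: the folder holding both clones (defaults to $HOME).
check_borzoi_setup() {
    root="${1:-$HOME}"
    missing=0

    for d in \
        "$root/borzoi/examples" \
        "$root/borzoi-paper/analysis" \
        "$root/borzoi/examples/data" \
        "$root/borzoi/examples/data/qtl_cat"
    do
        if [ ! -d "$d" ]; then
            echo "missing: $d" >&2
            missing=1
        fi
    done

    # Return 0 when every directory exists, 1 otherwise.
    return "$missing"
}
```

Calling `check_borzoi_setup` with no argument checks the home-folder layout used in the examples; it reports each missing directory on standard error.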
2 changes: 1 addition & 1 deletion
analysis/crispr/flowfish/run_gradients_flowfish.sh
100644 → 100755
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,3 @@ | ||
#!/bin/sh | ||
|
||
python /home/jlinder/basenji/bin/borzoi_satg_gene_gpu.py -o flowfish_k562_undo_clip -f 0,1,2,3 --rc 1 --shifts 0 --span 0 --smoothgrad 0 --clip_soft 384.0 -t /home/jlinder/borzoi_v2/targets_k562.txt /home/jlinder/borzoi_v2/params_pred.json /home/jlinder/borzoi_v2 /home/jlinder/flowfish/crispr_genes.gtf | ||
borzoi_satg_gene.py -o saved_models/flowfish_k562 -f 3 -c 0,1,2,3 --rc --untransform_old --track_scale 0.3 --track_transform 0.75 --clip_soft 384.0 -t targets_k562.txt params_pred.json saved_models flowfish/crispr_genes.gtf |
18 changes: 9 additions & 9 deletions
analysis/crispr/flowfish/run_gradients_flowfish_miborzoi_ablations.sh
100644 → 100755
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,19 +1,19 @@ | ||
#!/bin/sh | ||
|
||
python /home/jlinder/basenji/bin/borzoi_satg_gene_gpu.py -o flowfish_miborzoi_k562_all_undo_clip -f 0,1 --rc 1 --shifts 0 --span 0 --smoothgrad 0 --clip_soft 384.0 -t /home/jlinder/mini_borzois_v2/k562_all/targets_k562_subset.txt /home/jlinder/mini_borzois_v2/k562_all/params_pred.json /home/jlinder/mini_borzois_v2/k562_all /home/jlinder/flowfish/crispr_genes.gtf | ||
borzoi_satg_gene.py -o mini_borzois_v2/flowfish_miborzoi_k562_all -f 0,1 -c 0 --rc --untransform_old --track_scale 0.3 --track_transform 0.75 --clip_soft 384.0 -t mini_borzois_v2/k562_all/targets_k562_subset.txt mini_borzois_v2/k562_all/params_pred.json mini_borzois_v2/k562_all flowfish/crispr_genes.gtf | ||
|
||
python /home/jlinder/basenji/bin/borzoi_satg_gene_gpu.py -o flowfish_miborzoi_k562_dnase_atac_rna_undo_clip -f 0,1 --rc 1 --shifts 0 --span 0 --smoothgrad 0 --clip_soft 384.0 -t /home/jlinder/mini_borzois_v2/k562_dnase_atac_rna/targets_k562_dnase_atac_rna_subset.txt /home/jlinder/mini_borzois_v2/k562_dnase_atac_rna/params_pred.json /home/jlinder/mini_borzois_v2/k562_dnase_atac_rna /home/jlinder/flowfish/crispr_genes.gtf | ||
borzoi_satg_gene.py -o mini_borzois_v2/flowfish_miborzoi_k562_dnase_atac_rna -f 0,1 -c 0 --rc --untransform_old --track_scale 0.3 --track_transform 0.75 --clip_soft 384.0 -t mini_borzois_v2/k562_dnase_atac_rna/targets_k562_dnase_atac_rna_subset.txt mini_borzois_v2/k562_dnase_atac_rna/params_pred.json mini_borzois_v2/k562_dnase_atac_rna flowfish/crispr_genes.gtf | ||
|
||
python /home/jlinder/basenji/bin/borzoi_satg_gene_gpu.py -o flowfish_miborzoi_k562_rna_undo_clip -f 0,1 --rc 1 --shifts 0 --span 0 --smoothgrad 0 --clip_soft 384.0 -t /home/jlinder/mini_borzois_v2/k562_rna/targets_k562_rna_subset.txt /home/jlinder/mini_borzois_v2/k562_rna/params_pred.json /home/jlinder/mini_borzois_v2/k562_rna /home/jlinder/flowfish/crispr_genes.gtf | ||
borzoi_satg_gene.py -o mini_borzois_v2/flowfish_miborzoi_k562_rna -f 0,1 -c 0 --rc --untransform_old --track_scale 0.3 --track_transform 0.75 --clip_soft 384.0 -t mini_borzois_v2/k562_rna/targets_k562_rna_subset.txt mini_borzois_v2/k562_rna/params_pred.json mini_borzois_v2/k562_rna flowfish/crispr_genes.gtf | ||
|
||
python /home/jlinder/basenji/bin/borzoi_satg_gene_gpu.py -o flowfish_miborzoi_baseline_undo_clip -f 0,1 --rc 1 --shifts 0 --span 0 --smoothgrad 0 --clip_soft 384.0 -t /home/jlinder/mini_borzois_v2/baseline/targets_subset.txt /home/jlinder/mini_borzois_v2/baseline/params_pred.json /home/jlinder/mini_borzois_v2/baseline /home/jlinder/flowfish/crispr_genes.gtf | ||
borzoi_satg_gene.py -o mini_borzois_v2/flowfish_miborzoi_baseline -f 0,1 -c 0 --rc --untransform_old --track_scale 0.3 --track_transform 0.75 --clip_soft 384.0 -t mini_borzois_v2/baseline/targets_subset.txt mini_borzois_v2/baseline/params_pred.json mini_borzois_v2/baseline flowfish/crispr_genes.gtf | ||
|
||
python /home/jlinder/basenji/bin/borzoi_satg_gene_gpu.py -o flowfish_miborzoi_human_all_undo_clip -f 0,1 --rc 1 --shifts 0 --span 0 --smoothgrad 0 --clip_soft 384.0 -t /home/jlinder/mini_borzois_v2/human_all/targets_subset.txt /home/jlinder/mini_borzois_v2/human_all/params_pred.json /home/jlinder/mini_borzois_v2/human_all /home/jlinder/flowfish/crispr_genes.gtf | ||
borzoi_satg_gene.py -o mini_borzois_v2/flowfish_miborzoi_human_all -f 0,1 -c 0 --rc --untransform_old --track_scale 0.3 --track_transform 0.75 --clip_soft 384.0 -t mini_borzois_v2/human_all/targets_subset.txt mini_borzois_v2/human_all/params_pred.json mini_borzois_v2/human_all flowfish/crispr_genes.gtf | ||
|
||
python /home/jlinder/basenji/bin/borzoi_satg_gene_gpu.py -o flowfish_miborzoi_human_dnase_atac_rna_undo_clip -f 0,1 --rc 1 --shifts 0 --span 0 --smoothgrad 0 --clip_soft 384.0 -t /home/jlinder/mini_borzois_v2/human_dnase_atac_rna/targets_human_dnase_atac_rna_subset.txt /home/jlinder/mini_borzois_v2/human_dnase_atac_rna/params_pred.json /home/jlinder/mini_borzois_v2/human_dnase_atac_rna /home/jlinder/flowfish/crispr_genes.gtf | ||
borzoi_satg_gene.py -o mini_borzois_v2/flowfish_miborzoi_human_dnase_atac_rna -f 0,1 -c 0 --rc --untransform_old --track_scale 0.3 --track_transform 0.75 --clip_soft 384.0 -t mini_borzois_v2/human_dnase_atac_rna/targets_human_dnase_atac_rna_subset.txt mini_borzois_v2/human_dnase_atac_rna/params_pred.json mini_borzois_v2/human_dnase_atac_rna flowfish/crispr_genes.gtf | ||
|
||
python /home/jlinder/basenji/bin/borzoi_satg_gene_gpu.py -o flowfish_miborzoi_multisp_dnase_atac_rna_undo_clip -f 0,1 --rc 1 --shifts 0 --span 0 --smoothgrad 0 --clip_soft 384.0 -t /home/jlinder/mini_borzois_v2/multispecies_dnase_atac_rna/targets_human_dnase_atac_rna_subset.txt /home/jlinder/mini_borzois_v2/multispecies_dnase_atac_rna/params_pred.json /home/jlinder/mini_borzois_v2/multispecies_dnase_atac_rna /home/jlinder/flowfish/crispr_genes.gtf | ||
borzoi_satg_gene.py -o mini_borzois_v2/flowfish_miborzoi_multisp_dnase_atac_rna -f 0,1 -c 0 --rc --untransform_old --track_scale 0.3 --track_transform 0.75 --clip_soft 384.0 -t mini_borzois_v2/multispecies_dnase_atac_rna/targets_human_dnase_atac_rna_subset.txt mini_borzois_v2/multispecies_dnase_atac_rna/params_pred.json mini_borzois_v2/multispecies_dnase_atac_rna flowfish/crispr_genes.gtf | ||
|
||
python /home/jlinder/basenji/bin/borzoi_satg_gene_gpu.py -o flowfish_miborzoi_multisp_rna_undo_clip -f 0,1 --rc 1 --shifts 0 --span 0 --smoothgrad 0 --clip_soft 384.0 -t /home/jlinder/mini_borzois_v2/multispecies_rna/targets_human_rna_subset.txt /home/jlinder/mini_borzois_v2/multispecies_rna/params_pred.json /home/jlinder/mini_borzois_v2/multispecies_rna /home/jlinder/flowfish/crispr_genes.gtf | ||
borzoi_satg_gene.py -o mini_borzois_v2/flowfish_miborzoi_multisp_rna -f 0,1 -c 0 --rc --untransform_old --track_scale 0.3 --track_transform 0.75 --clip_soft 384.0 -t mini_borzois_v2/multispecies_rna/targets_human_rna_subset.txt mini_borzois_v2/multispecies_rna/params_pred.json mini_borzois_v2/multispecies_rna flowfish/crispr_genes.gtf | ||
|
||
python /home/jlinder/basenji/bin/borzoi_satg_gene_gpu.py -o flowfish_miborzoi_multisp_no_unet_undo_clip -f 0,1 --rc 1 --shifts 0 --span 0 --smoothgrad 0 --clip_soft 384.0 -t /home/jlinder/mini_borzois_v2/multispecies_no_unet/targets_subset.txt /home/jlinder/mini_borzois_v2/multispecies_no_unet/params_pred.json /home/jlinder/mini_borzois_v2/multispecies_no_unet /home/jlinder/flowfish/crispr_genes.gtf | ||
borzoi_satg_gene.py -o mini_borzois_v2/flowfish_miborzoi_multisp_no_unet -f 0,1 -c 0 --rc --untransform_old --track_scale 0.3 --track_transform 0.75 --clip_soft 384.0 -t mini_borzois_v2/multispecies_no_unet/targets_subset.txt mini_borzois_v2/multispecies_no_unet/params_pred.json mini_borzois_v2/multispecies_no_unet flowfish/crispr_genes.gtf |
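
The nine updated commands in this file differ only in the model directory, the output suffix, and the targets file name. As an illustration only (the commit keeps the commands written out in full), the sketch below prints the same nine command lines from a small table. The function name `print_ablation_cmds` is hypothetical, and the function echoes the commands as a dry run rather than executing them.

```sh
#!/bin/sh
# Hypothetical dry-run sketch (not part of the commit): print the nine updated
# ablation commands from one table instead of writing each out by hand.
# Table fields: model directory | output suffix | targets file
print_ablation_cmds() {
    while IFS='|' read -r model out targets; do
        # Skip empty lines in the table.
        [ -n "$model" ] || continue
        echo "borzoi_satg_gene.py -o mini_borzois_v2/flowfish_miborzoi_${out} -f 0,1 -c 0 --rc --untransform_old --track_scale 0.3 --track_transform 0.75 --clip_soft 384.0 -t mini_borzois_v2/${model}/${targets} mini_borzois_v2/${model}/params_pred.json mini_borzois_v2/${model} flowfish/crispr_genes.gtf"
    done <<'EOF'
k562_all|k562_all|targets_k562_subset.txt
k562_dnase_atac_rna|k562_dnase_atac_rna|targets_k562_dnase_atac_rna_subset.txt
k562_rna|k562_rna|targets_k562_rna_subset.txt
baseline|baseline|targets_subset.txt
human_all|human_all|targets_subset.txt
human_dnase_atac_rna|human_dnase_atac_rna|targets_human_dnase_atac_rna_subset.txt
multispecies_dnase_atac_rna|multisp_dnase_atac_rna|targets_human_dnase_atac_rna_subset.txt
multispecies_rna|multisp_rna|targets_human_rna_subset.txt
multispecies_no_unet|multisp_no_unet|targets_subset.txt
EOF
}

print_ablation_cmds
```

The output suffix is kept as a separate field because the multispecies models use the shortened 'multisp' prefix in their output names, so it cannot be derived from the model directory alone.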
2 changes: 1 addition & 1 deletion
analysis/crispr/flowfish/run_ism_shuffle_flowfish.sh
100644 → 100755
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,3 @@ | ||
#!/bin/sh | ||
|
||
python /home/jlinder/basenji/bin/borzoi_satg_gene_gpu_crispr_ism_shuffle.py -o flowfish_k562_ism_shuffle_undo_clip -f 0,1,2,3 --rc 1 --shifts 0 --span 0 --clip_soft 384.0 --aggregate_tracks 10 --ism_size 1 --window_size 2048 --n_samples 16 --mononuc_shuffle 0 --dinuc_shuffle 1 --crispr_file /home/jlinder/flowfish/crispr_table.tsv -t /home/jlinder/borzoi_v2/targets_k562.txt /home/jlinder/borzoi_v2/params_pred.json /home/jlinder/borzoi_v2 /home/jlinder/flowfish/crispr_genes.gtf | ||
borzoi_satg_gene_crispr_ism_shuffle.py -o saved_models/flowfish_k562_ism_shuffle -f 3 -c 0,1,2,3 --rc --untransform_old --track_scale 0.3 --track_transform 0.75 --clip_soft 384.0 --aggregate_tracks 10 --ism_size 1 --window_size 2048 --n_samples 16 --dinuc_shuffle --crispr_file flowfish/crispr_table.tsv -t targets_k562.txt params_pred.json saved_models flowfish/crispr_genes.gtf |