
Update alra.py #304

Merged

Conversation


@wes-lewis wes-lewis commented Mar 31, 2021

Fix pre-processing and transformation back into the original space

Submission type

  • This submission adds a new dataset
  • This submission adds a new method
  • This submission adds a new metric
  • This submission adds a new task
  • This submission adds a new Docker image
  • This submission fixes a bug (link to related issue: )
  • This submission adds a new feature not listed above

Testing

  • This submission was written on a forked copy of SingleCellOpenProblems
  • GitHub Actions "Run Benchmark" tests are passing on the base branch of this pull request (include link to passed test: )
  • If this pull request is not ready for review (including passing the "Run Benchmark" tests), I will open this PR as a draft (click on the down arrow next to the "Create Pull Request" button)

Submission guidelines

  • This submission follows the guidelines in our Contributing document
  • I have checked to ensure there aren't other open Pull Requests for the same update/change

wes-lewis and others added 2 commits March 31, 2021 11:03
Fix pre-processing and transformation back into the original space
@wes-lewis

This fixes the pre-processing (normalization and sqrt transform) and the inversion of that pre-processing back into the original space.
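The round trip described here (normalize each cell, sqrt-transform, then undo both steps to return to the original space) can be sketched in plain NumPy. This is an illustrative sketch, not the actual alra.py code; the CPM scale factor of 1e6 and the per-cell library sizes are assumptions:

```python
import numpy as np

def preprocess(counts):
    """Library-size normalize each cell (counts per million), then sqrt-transform."""
    lib_size = counts.sum(axis=1, keepdims=True)
    cpm = counts / lib_size * 1e6
    return np.sqrt(cpm), lib_size

def invert(transformed, lib_size):
    """Undo the sqrt transform and the normalization, returning to count space."""
    cpm = transformed ** 2
    return cpm / 1e6 * lib_size

counts = np.array([[1.0, 3.0], [2.0, 2.0]])
sqrt_cpm, lib_size = preprocess(counts)
recovered = invert(sqrt_cpm, lib_size)
assert np.allclose(recovered, counts)  # the round trip is lossless
```

In the real method an imputation step sits between the two halves, so the inversion is applied to imputed values rather than reproducing the input exactly.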

@dburkhardt dburkhardt self-requested a review April 5, 2021 18:50

@dburkhardt dburkhardt left a comment


So how did we decide to handle code versioning? Do you want to pin the code in ALRA.R to a release on GitHub? Can you copy the code over here?


LuckyMD commented Apr 6, 2021

> So how did we decide to handle code versioning? Do you want to pin the code in ALRA.R to a release on GitHub? Can you copy the code over here?

Generally, we could always use Docker build dates as versioning, no? It's the least clean way... but it should work (as long as the GitHub repos we depend on aren't changed retrospectively).
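The alternative raised above, pinning to a release rather than relying on build dates, amounts to fetching the vendored script from an immutable commit SHA. A minimal sketch; the repo path, file name, and SHA below are hypothetical placeholders, not the actual values used in openproblems:

```python
def pinned_raw_url(repo, commit_sha, path):
    """Build a raw.githubusercontent.com URL pinned to a specific commit,
    so later pushes to the repo cannot change what the build downloads."""
    return f"https://raw.githubusercontent.com/{repo}/{commit_sha}/{path}"

# hypothetical SHA, for illustration only
url = pinned_raw_url("KlugerLab/ALRA", "0123abc", "alra.R")
```

Pinning a SHA (or a tagged release) makes builds reproducible even if the upstream default branch changes.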

@dburkhardt dburkhardt merged commit a20bbc6 into openproblems-bio:master Apr 22, 2021
lazappi added a commit to lazappi/openproblems that referenced this pull request May 12, 2021
* master:
  Update alra.py (openproblems-bio#304)
  updated template for PR with PR evaluation checks (openproblems-bio#314)
lazappi pushed a commit to lazappi/openproblems that referenced this pull request May 12, 2021
* Update alra.py

Fix pre-processing and transformation back into the original space

* pre-commit

* Update alra.py


* make sure necessary methods are imported


Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Daniel Burkhardt <[email protected]>
lazappi added a commit to lazappi/openproblems that referenced this pull request May 12, 2021
* master:
  Update alra.py (openproblems-bio#304)
  updated template for PR with PR evaluation checks (openproblems-bio#314)
lazappi added a commit to lazappi/openproblems that referenced this pull request May 12, 2021
…ction

* dimred-methods: (34 commits)
  Update alra.py (openproblems-bio#304)
  updated template for PR with PR evaluation checks (openproblems-bio#314)
  pre-commit
  Fix preprocessing
  Fix preprocessing
  Add new preprocessing
  Add new preprocessing
  fix region
  fix nf wkdir
  use env.BRANCH
  rm echo
  BRANCH -> WKDIR for s3
  set branch variable in S3 setup job
  Fix s3 bucket clash
  Dimred methods preprocessing (openproblems-bio#301)
  Add trustworthiness score for dimred task (openproblems-bio#258)
  Add Method SCOT to Multi-modal data Integration Task (openproblems-bio#298)
  Change batch size to 1k cells for aff. matrix
  Add preprocessing
  Fix some small bugs
  ...
scottgigante-immunai added a commit that referenced this pull request Apr 21, 2022
* add densMAP package to python-extras

* pre-commit

* Add Ivis method

* Explicitly mention it's CPU implementation

* Add forgotten import in __init__

* Remove redundant filtering

* Move ivis inside the function

* Make var names unique, add ivis[cpu] to README

* Pin tensorflow version

* Add NeuralEE skeleton

* Implement method

* added densmap and densne

* Fix typo pytoch -> torch

* pre-commit

* remove densne

* Add forgotten detach/cpu/numpy

* formatting

* pre-commit

* formatting

* formatting

* pre-commit

* formatting

* formatting

* formatting

* pre-commit

* formatting

* umap-learn implementation

* pre-commit

* Add docker image

* Add skeleton method

* formatting

* Implement method

* Fix some small bugs

* Add preprocessing

* Change batch size to 1k cells for aff. matrix

* Add new preprocessing

* Add new preprocessing

* Fix preprocessing

* Fix preprocessing

* pre-commit

* updated template for PR with PR evaluation checks (#314)

* Update alra.py (#304)

* Update alra.py

Fix pre-processing and transformation back into the original space

* pre-commit

* Update alra.py


* make sure necessary methods are imported


Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Daniel Burkhardt <[email protected]>

* Add scanpy preprocessing to densmap dimred method

* Rename preprocess_scanpy() to preprocess_logCPM_1kHVG()

* Add preprocessing suffix to dimred methods

* Subset object in preprocess_logCPM_1kHVG()

* Use standard names for input

* Add neuralee_logCPM_1kHVG method

* Add densmap_pca method

* Fix preprocess_logCPM_1kHVG()

Now returns an AnnData rather than acting in place
- Subsetting wasn't working in place

Also set HVG flavor to "cell_ranger"
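As a rough illustration of the logCPM half of this preprocessing, here is a sketch in plain NumPy. It assumes a scale factor of 1e6; the real function additionally selects 1,000 highly variable genes via scanpy with the "cell_ranger" flavor, which is omitted here:

```python
import numpy as np

def log_cpm(counts):
    """Counts-per-million normalization followed by log1p."""
    cpm = counts / counts.sum(axis=1, keepdims=True) * 1e6
    return np.log1p(cpm)

X = np.array([[5.0, 5.0], [1.0, 9.0]])
logged = log_cpm(X)
# inverting with expm1 recovers rows that each sum to one million
assert np.allclose(np.expm1(logged).sum(axis=1), 1e6)
```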

* Add test argument to dimred methods

* Move preprocess_logCPM_1kHVG() to tools.normalize

* Change name in python-method-scvis Docker README

* Rename openproblems-python-method-scvis container

Now called open-problems-python36

* Fix AnnData ref in merge

* Copy object when subsetting in preprocess_logCPM_1kHVG()

* Move PCA to dimred methods

* Use preprocess_logCPM_1kHVG() in nn_ranking metrics

* Fix path in python36 dockerfile

* Add test kwarg to neuralee_default method

* Add check for n_var to preprocess_logCPM_1kHVG()

Should fix tests that were failing due to scverse/scanpy#2230

* Store raw counts in NeuralEE default method

* Update dimred README

* Replace X_input with PCA in ivis dimred method

* Refactor preprocess_logCPM_1kHVG() to log_cpm_hvg()

* Remove ivis

* pre-commit

Co-authored-by: Ben DeMeo <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Michal Klein <[email protected]>
Co-authored-by: michalk8 <[email protected]>
Co-authored-by: bendemeo <[email protected]>
Co-authored-by: MalteDLuecken <[email protected]>
Co-authored-by: Wesley Lewis <[email protected]>
Co-authored-by: Daniel Burkhardt <[email protected]>
Co-authored-by: Scott Gigante <[email protected]>
scottgigante-immunai added a commit that referenced this pull request Apr 26, 2022
[duplicate commit series omitted; identical to the Apr 21, 2022 commit above except for the final entry, "* Re-add ivis" in place of "* Remove ivis"]
scottgigante-immunai added a commit that referenced this pull request May 2, 2022
* Label docker images based on build location (#351)

* label docker images

* fix syntax

* Run benchmark only after unittests (#349)

* run benchmark after unittests

* always run cleanup

* cleanup

* If using GH actions image, test for git diff on dockerfile (#350)

* if using gh actions image, test for git diff on dockerfile

* allow empty tag for now

* decode

* if image doesn't exist, automatically github actions

* fix quotes

* fix parsing and committing of results on tag (#356)

* Import SCOT (#333)

* import SCOT

* pre-commit

* scran requires R

* check that aligned spaces are finite

* exclude unbalanced SCOT for now

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Scott Gigante <[email protected]>

* fix coverage badge # ci skip (#358)

* fix gh actions badge link # ci skip (#359)

* store results in /tmp (#361)

* Remove scot unbalanced (#360)

* Fix benchmark commit (#362)

* store results in /tmp

* add skip_on_empty

* class doesn't have skip on empty

* remove scot altogether (#363)

* Allow codecov to fail on forks

* docker images separate PR (#354)

* docker images separate PR

* all R requirements in r_requirements.txt

* move github r packages to requirements file

* pre-commit

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Ignore AWS warning and clean up s3 properly (#366)

* ci cleanup

* ignore aws batch warning

* remove citeseq cbmc from DR (#367)

Co-authored-by: Scott Gigante <[email protected]>

* Update benchmark results # ci skip (#368)

Co-authored-by: SingleCellOpenProblems <[email protected]>

* Jamboree dimensionality reduction methods (#318)

[duplicate commit series omitted; identical to the Apr 21, 2022 commit above, including its co-author list]

* Only cleanup AWS on success (#371)

* only cleanup on success

* pre-commit

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Jamboree label_projection task (#313)

* Add scvi-tools docker image

* add scanvi

* hvg command use 2000

* update scvi-tools version; use image

* train size

* scanvi mask test labels

* move import

* hvg on train only, fix hvg command

* add scarches scanvi

* use string labels in testing

* enforce batch metadata in dataset

* add batch metadata in pancreas random

* use train adata for scarches

* Add majority vote simple baseline

* test_mode

* use test instead of test mode, update contributing

* update contributing guide

* Added helper function to introduce label noise

* Actually return data with label noise

* Only introduce label noise on training data

* Made a pancreas dataset with label noise

* Reformat docstring

* Added reference to example label noise dataset in datasets __init__.py

* Add cengen C elegans data loader (#2)

* add CeNGEN C elegans neuron dataset

* add CeNGEN C elegans dataset for global tasks and for label_projection task

* fix lines being too long

* Reformat cengen data loader

* Create tabula_muris_senis.py

Needs a dataframe containing sample information in './tabula_muris_senis_data_objects/tabula_muris_senis_data_objects.csv'.

load_tabula_muris_senis(method_list, organ_list) takes in the methods and organs to extract data from and combines them into one AnnData object.
If method_list or organ_list is None, no filtering is done on that input.
E.g. load_tabula_muris_senis(method_list=['facs'], organ_list=None) returns all FACS experiments for all organs in one AnnData object.
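The None-means-no-filter behaviour described in that commit message can be sketched as follows. This is illustrative only; the field names "method" and "organ" are assumptions about the CSV columns, not the loader's actual schema:

```python
def filter_samples(samples, method_list=None, organ_list=None):
    """Keep samples matching both filters; a filter set to None is disabled."""
    return [
        s for s in samples
        if (method_list is None or s["method"] in method_list)
        and (organ_list is None or s["organ"] in organ_list)
    ]

samples = [
    {"method": "facs", "organ": "lung"},
    {"method": "droplet", "organ": "lung"},
]
assert filter_samples(samples, method_list=["facs"]) == [samples[0]]
assert filter_samples(samples) == samples  # no filters: everything passes
```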

* pre-commit

* Modify anndata in place in add_label_noise rather than copy

* Added CSV file with tabula muris senis data links

* Update tabula_muris_senis.py

* Add random_labels baseline to label_projection task

* Update tabula_muris_senis.py

* Update tabula_muris_senis.py

* pre-commit

* Update tabula_muris_senis.py

* pre-commit

* fix missing labels at prediction time

* Handle test flag through tests and docker, pass to methods

* If test method run, use 1 max_epoch for scvi-tools

* Use only 2 batches for sample dataset for label_projection

* Remove zebrafish random dataset

* Fix decorator dependency to <5.0.0

* Remove functools.wraps from docker decorator for test parameterization

* Fix cengen missing batch info

* Use functools.update_wrapper for docker test

* Add batch to pancreas_random_label_noise

* Make cengen test dataset have more cells per batch

* Set span=0.8 for hvg call for scanvi_hvg methods

* Set span=0.8 for HVG selection only in test mode for scvi

* Revert "Handle test flag through tests and docker, pass to methods"

This reverts commit 3b940c0.

* Add test parameter to label proj baselines

* Fix flake remove unused import

* Revert "Remove zebrafish random dataset"

This reverts commit 3915798.

* Update scVI setup_anndata to new version

* pre-commit

* Reformat and rerun tests

* Add code_url and code_version for baseline label proj methods

* Fallback HVG flavor for label projection task

* pre-commit

* Fix unused import

* Fix using highly_variable_genes

* Pin scvi-tools to 0.15.5

* Unpin scvi-tools, pin jax==0.3.6, see optuna/optuna-examples#99

* Add scikit-misc as requirement for scvi docker

* Pin jaxlib as well

* pin jaxlib along with jax

* Set paper_year to year of implementation

* Set random zebrafish split to 0.8+0.2

* Add tabula_muris_senis_lung_random dataset to label_projection

* pre-commit

* Add tabula muris senis datasets csv

* Fix loading tabula muris csv

* pre-commit

* Test loader for tabula muris senis

Co-authored-by: adamgayoso <[email protected]>
Co-authored-by: Valentine Svensson <[email protected]>
Co-authored-by: Eduardo Beltrame <[email protected]>
Co-authored-by: atchen <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Scott Gigante <[email protected]>

* Run `test_benchmark` on a self-hosted runner (#373)

* set up cirun

* use ubuntu standard AMI

* run nextflow on the self-hosted machine

* add to CONTRIBUTING

* update ami

* install unzip

* set up docker

* install docker from curl

* use t2.micro not nano

* use custom AMI

* pythonLocation

* add scripts to path

* larger disk size

* new image again

* chown for now

* chmod 755

* fixed permissions

* use tower workspace

* test nextflow

* try again

* nextflow -q

* redirect stderr

* increase memory

* cleanup

* sudo install

* name

* try setting pythonpath

* fix branch env

* another fix

* fix run name

* typo

* fix pythonpath:

* don't use pushd

* pass pythonpath

* set nousersite

* empty

* sudo install

* run attempt

* revert temporary changes

* cleanup

* fix contributing

* add instructions for tower

* fix repo name

* move ami setup into script

* Import Olsson 2016 dataset for dimred task (#352)

* Import Olsson 2016 dataset for dimred task

* Fix path to Olsson dataset loader

* Filter genes cells before subsetting Olsson data in test

* Use highly expressed genes for test Olsson dataset

Test dataset is now 700 genes by 300 cells (was 500 x 500)

* Add ivis dimred method (#369)

[duplicate commit series omitted; identical to the Apr 26, 2022 commit above, including its co-author list]

* hotfix timeout-minutes (#374)

* use branch of scprep to provide R traceback (#376)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Scott Gigante <[email protected]>
Co-authored-by: Daniel Strobl <[email protected]>
Co-authored-by: SingleCellOpenProblems <[email protected]>
Co-authored-by: Luke Zappia <[email protected]>
Co-authored-by: Ben DeMeo <[email protected]>
Co-authored-by: Michal Klein <[email protected]>
Co-authored-by: michalk8 <[email protected]>
Co-authored-by: bendemeo <[email protected]>
Co-authored-by: MalteDLuecken <[email protected]>
Co-authored-by: Wesley Lewis <[email protected]>
Co-authored-by: Daniel Burkhardt <[email protected]>
Co-authored-by: Nikolay Markov <[email protected]>
Co-authored-by: adamgayoso <[email protected]>
Co-authored-by: Valentine Svensson <[email protected]>
Co-authored-by: Eduardo Beltrame <[email protected]>
Co-authored-by: atchen <[email protected]>
scottgigante-immunai added a commit that referenced this pull request May 2, 2022
* Label docker images based on build location (#351)

* label docker images

* fix syntax

* Run benchmark only after unittests (#349)

* run benchmark after unittests

* always run cleanup

* cleanup

* If using GH actions image, test for git diff on dockerfile (#350)

* if using gh actions image, test for git diff on dockerfile

* allow empty tag for now

* decode

* if image doesn't exist, automatically github actions

* fix quotes

* fix parsing and committing of results on tag (#356)

* Import SCOT (#333)

* import SCOT

* pre-commit

* scran requires R

* check that aligned spaces are finite

* exclude unbalanced SCOT for now

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Scott Gigante <[email protected]>

* fix coverage badge # ci skip (#358)

* fix gh actions badge link # ci skip (#359)

* store results in /tmp (#361)

* Remove scot unbalanced (#360)

* Fix benchmark commit (#362)

* store results in /tmp

* add skip_on_empty

* class doesn't have skip on empty

* remove scot altogether (#363)

* Allow codecov to fail on forks

* docker images separate PR (#354)

* docker images separate PR

* all R requirements in r_requirements.txt

* move github r packages to requirements file

* pre-commit

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Ignore AWS warning and clean up s3 properly (#366)

* ci cleanup

* ignore aws batch warning

* remove citeseq cbmc from DR (#367)

Co-authored-by: Scott Gigante <[email protected]>

* Update benchmark results # ci skip (#368)

Co-authored-by: SingleCellOpenProblems <[email protected]>

* Jamboree dimensionality reduction methods (#318)

* add densMAP package to python-extras

* pre-commit

* Add Ivis method

* Explicitly mention it's CPU implementation

* Add forgotten import in __init__

* Remove redundant filtering

* Move ivis inside the function

* Make var names unique, add ivis[cpu] to README

* Pin tensorflow version

* Add NeuralEE skeleton

* Implement method

* added densmap and densne

* Fix typo pytoch -> torch

* pre-commit

* remove densne

* Add forgotten detach/cpu/numpy

* formatting

* pre-commit

* formatting

* formatting

* pre-commit

* formatting

* formatting

* formatting

* pre-commit

* formatting

* umap-learn implementation

* pre-commit

* Add docker image

* Add skeleton method

* formatting

* Implement method

* Fix some small bugs

* Add preprocessing

* Change batch size to 1k cells for aff. matrix

* Add new preprocessing

* Add new preprocessing

* Fix preprocessing

* Fix preprocessing

* pre-commit

* updated template for PR with PR evaluation checks (#314)

* Update alra.py (#304)

* Update alra.py

Fix pre-processing and transformation back into the original space

* pre-commit

* Update alra.py


* make sure necessary methods are imported


Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Daniel Burkhardt <[email protected]>

* Add scanpy preprocessing to densmap dimred method

* Rename preprocess_scanpy() to preprocess_logCPM_1kHVG()

* Add preprocessing suffix to dimred methods

* Subset object in preprocess_logCPM_1kHVG()

* Use standard names for input

* Add neuralee_logCPM_1kHVG method

* Add densmap_pca method

* Fix preprocess_logCPM_1kHVG()

Now returns an AnnData rather than acting in place
- Subsetting wasn't working in place

Also set HVG flavor to "cell_ranger"

* Add test argument to dimred methods

* Move preprocess_logCPM_1kHVG() to tools.normalize

* Change name in python-method-scvis Docker README

* Rename openproblems-python-method-scvis container

Now called open-problems-python36

* Fix AnnData ref in merge

* Copy object when subsetting in preprocess_logCPM_1kHVG()

* Move PCA to dimred methods

* Use preprocess_logCPM_1kHVG() in nn_ranking metrics

* Fix path in python36 dockerfile

* Add test kwarg to neuralee_default method

* Add check for n_var to preprocess_logCPM_1kHVG()

Should fix tests that were failing due to scverse/scanpy#2230

* Store raw counts in NeuralEE default method

* Update dimred README

* Replace X_input with PCA in ivis dimred method

* Refactor preprocess_logCPM_1kHVG() to log_cpm_hvg()

* Remove ivis

* pre-commit

Co-authored-by: Ben DeMeo <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Michal Klein <[email protected]>
Co-authored-by: michalk8 <[email protected]>
Co-authored-by: bendemeo <[email protected]>
Co-authored-by: MalteDLuecken <[email protected]>
Co-authored-by: Wesley Lewis <[email protected]>
Co-authored-by: Daniel Burkhardt <[email protected]>
Co-authored-by: Scott Gigante <[email protected]>

* Only cleanup AWS on success (#371)

* only cleanup on success

* pre-commit

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Jamboree label_projection task (#313)

* Add scvi-tools docker image

* add scanvi

* hvg command use 2000

* update scvi-tools version; use image

* train size

* scanvi mask test labels

* move import

* hvg on train only, fix hvg command

* add scarches scanvi

* use string labels in testing

* enforce batch metadata in dataset

* add batch metadata in pancreas random

* use train adata for scarches

* Add majority vote simple baseline

* test_mode

* use test instead of test mode, update contributing

* update contributing guide

* Added helper function to introduce label noise

* Actually return data with label noise

* Only introduce label noise on training data

* Made a pancreas dataset with label nosie

* Reformat docstring

* Added reference to example label noise dataset in datasets __init__.py

* Add cengen C elegans data loader (#2)

* add CeNGEN C elegans neuron dataset

* add CeNGEN C elegans dataset for global tasks and for label_projection task

* fix lines being too long

* Reformat cengen data loader

* Create tabula_muris_senis.py

Need dataframe containing sample information in './tabula_muris_senis_data_objects/tabula_muris_senis_data_objects.csv' 

load_tabula_muris_senis(method_list, organ_list) takes in methods and organs to extract data from and combines into one anndata object.
If method_list or organ_list = None, do not filter based on that input.
EX: load_tabula_muris_senis(method_list=['facs'], organ_list = None) returns all facs experiments for all organs in one anndata object.

* pre-commit

* Modify anndata in place in add_label_noise rather than copy

* Added CSV file with tabula muris senis data links

* Update tabula_muris_senis.py

* Add random_labels baseline to label_projection task

* Update tabula_muris_senis.py

* Update tabula_muris_senis.py

* pre-commit

* Update tabula_muris_senis.py

* pre-commit

* fix missing labels at prediction time

* Handle test flag through tests and docker, pass to methods

* If test method run, use 1 max_epoch for scvi-tools

* Use only 2 batches for sample dataset for label_projection

* Remove zebrafish random dataset

* Fix decorator dependency to <5.0.0

* Remove functools.wraps from docker decorator for test parameterization

* Fix cengen missing batch info

* Use functools.update_wrapper for docker test

* Add batch to pancreas_random_label_noise

* Make cengen test dataset have more cells per batch

* Set span=0.8 for hvg call for scanvi_hvg methods

* Set span=0.8 for HVG selection only in test mode for scvi

* Revert "Handle test flag through tests and docker, pass to methods"

This reverts commit 3b940c0.

* Add test parameter to label proj baselines

* Fix flake remove unused import

* Revert "Remove zebrafish random dataset"

This reverts commit 3915798.

* Update scVI setup_anndata to new version

* pre-commit

* Reformat and rerun tests

* Add code_url and code_version for baseline label proj methods

* Fallback HVG flavor for label projection task

* pre-commit

* Fix unused import

* Fix using highly_variable_genes

* Pin scvi-tools to 0.15.5

* Unpin scvi-tools, pin jax==0.3.6, see optuna/optuna-examples#99

* Add scikit-misc as requirement for scvi docker

* Pin jaxlib as well

* pin jaxlib along with jax

* Set paper_year to year of implementation

* Set random zebrafish split to 0.8+0.2

* Add tabula_muris_senis_lung_random dataset to label_projection

* pre-commit

* Add tabula muris senis datasets csv

* Fix loading tabula muris csv

* pre-commit

* Test loader for tabula muris senis

Co-authored-by: adamgayoso <[email protected]>
Co-authored-by: Valentine Svensson <[email protected]>
Co-authored-by: Eduardo Beltrame <[email protected]>
Co-authored-by: atchen <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Scott Gigante <[email protected]>

* Run `test_benchmark` on a self-hosted runner (#373)

* set up cirun

* use ubuntu standard AMI

* run nextflow on the self-hosted machine

* add to CONTRIBUTING

* update ami

* install unzip

* set up docker

* install docker from curl

* use t2.micro not nano

* use custom AMI

* pythonLocation

* add scripts to path

* larger disk size

* new image again

* chown for now

* chmod 755

* fixed permissions

* use tower workspace

* test nextflow

* try again

* nextflow -q

* redirect stderr

* increase memory

* cleanup

* sudo install

* name

* try setting pythonpath

* fix branch env

* another fix

* fix run name

* typo

* fix pythonpath:

* don't use pushd

* pass pythonpath

* set nousersite

* empty

* sudo install

* run attempt

* revert temporary changes

* cleanup

* fix contributing

* add instructions for tower

* fix repo name

* move ami setup into script

* Import Olsson 2016 dataset for dimred task (#352)

* Import Olsson 2016 dataset for dimred task

* Fix path to Olsson dataset loader

* Filter genes cells before subsetting Olsson data in test

* Use highly expressed genes for test Olsson dataset

Test dataset is now 700 genes by 300 cells (was 500 x 500)

* Add ivis dimred method (#369)

* add densMAP package to python-extras

* pre-commit

* Add Ivis method

* Explicitly mention it's CPU implementation

* Add forgotten import in __init__

* Remove redundant filtering

* Move ivis inside the function

* Make var names unique, add ivis[cpu] to README

* Pin tensorflow version

* Add NeuralEE skeleton

* Implement method

* added densmap and densne

* Fix typo pytoch -> torch

* pre-commit

* remove densne

* Add forgotten detach/cpu/numpy

* formatting

* pre-commit

* formatting

* formatting

* pre-commit

* formatting

* formatting

* formatting

* pre-commit

* formatting

* umap-learn implementation

* pre-commit

* Add docker image

* Add skeleton method

* formatting

* Implement method

* Fix some small bugs

* Add preprocessing

* Change batch size to 1k cells for aff. matrix

* Add new preprocessing

* Add new preprocessing

* Fix preprocessing

* Fix preprocessing

* pre-commit

* updated template for PR with PR evaluation checks (#314)

* Update alra.py (#304)

* Update alra.py

Fix pre-processing and transformation back into the original space
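
The fix concerns the sqrt-transform round-trip: ALRA's input is library-size normalized and square-root transformed, and the imputed output has to be mapped back into the original count space. A numpy-only sketch of that round-trip (function names and target_sum are illustrative, not the PR's code):

```python
import numpy as np

def preprocess(X, target_sum=1e4):
    # Library-size normalize each cell to target_sum counts, then sqrt.
    lib = X.sum(axis=1, keepdims=True)
    return np.sqrt(X / lib * target_sum), lib

def to_original_space(Y, lib, target_sum=1e4):
    # Invert the sqrt and the normalization, back to raw-count scale.
    return (Y ** 2) / target_sum * lib
```

Keeping the per-cell library sizes around is what makes the inversion exact.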

* pre-commit

* Update alra.py


* make sure necessary methods are imported


Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Daniel Burkhardt <[email protected]>

* Add scanpy preprocessing to densmap dimred method

* Rename preprocess_scanpy() to preprocess_logCPM_1kHVG()

* Add preprocessing suffix to dimred methods

* Subset object in preprocess_logCPM_1kHVG()

* Use standard names for input

* Add neuralee_logCPM_1kHVG method

* Add densmap_pca method

* Fix preprocess_logCPM_1kHVG()

Now returns an AnnData rather than acting in place
- Subsetting wasn't working in place

Also set HVG flavor to "cell_ranger"
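
The commit body above changes preprocess_logCPM_1kHVG() to return a new object instead of subsetting in place. The real function selects HVGs with scanpy's "cell_ranger" flavor; the sketch below substitutes a plain variance ranking so it stays dependency-light, and everything else about it is an assumption:

```python
import numpy as np

def log_cpm_hvg(X, n_hvg=1000):
    # Log-CPM normalize, keep the n_hvg most variable genes, and
    # return a new matrix: the input X is never modified in place.
    logged = np.log1p(X / X.sum(axis=1, keepdims=True) * 1e6)
    order = np.argsort(logged.var(axis=0))[::-1]
    keep = np.sort(order[: min(n_hvg, X.shape[1])])
    return logged[:, keep]
```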

* Add test argument to dimred methods

* Move preprocess_logCPM_1kHVG() to tools.normalize

* Change name in python-method-scvis Docker README

* Rename openproblems-python-method-scvis container

Now called open-problems-python36

* Fix AnnData ref in merge

* Copy object when subsetting in preprocess_logCPM_1kHVG()

* Move PCA to dimred methods

* Use preprocess_logCPM_1kHVG() in nn_ranking metrics

* Fix path in python36 dockerfile

* Add test kwarg to neuralee_default method

* Add check for n_var to preprocess_logCPM_1kHVG()

Should fix tests that were failing due to scverse/scanpy#2230

* Store raw counts in NeuralEE default method

* Update dimred README

* Replace X_input with PCA in ivis dimred method

* Refactor preprocess_logCPM_1kHVG() to log_cpm_hvg()

* Re-add ivis

Co-authored-by: Ben DeMeo <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Michal Klein <[email protected]>
Co-authored-by: michalk8 <[email protected]>
Co-authored-by: bendemeo <[email protected]>
Co-authored-by: MalteDLuecken <[email protected]>
Co-authored-by: Wesley Lewis <[email protected]>
Co-authored-by: Daniel Burkhardt <[email protected]>
Co-authored-by: Scott Gigante <[email protected]>

* hotfix timeout-minutes (#374)

* use branch of scprep to provide R traceback (#376)

* Install libgeos-dev

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Scott Gigante <[email protected]>
Co-authored-by: Daniel Strobl <[email protected]>
Co-authored-by: SingleCellOpenProblems <[email protected]>
Co-authored-by: Luke Zappia <[email protected]>
Co-authored-by: Ben DeMeo <[email protected]>
Co-authored-by: Michal Klein <[email protected]>
Co-authored-by: michalk8 <[email protected]>
Co-authored-by: bendemeo <[email protected]>
Co-authored-by: MalteDLuecken <[email protected]>
Co-authored-by: Wesley Lewis <[email protected]>
Co-authored-by: Daniel Burkhardt <[email protected]>
Co-authored-by: Nikolay Markov <[email protected]>
Co-authored-by: adamgayoso <[email protected]>
Co-authored-by: Valentine Svensson <[email protected]>
Co-authored-by: Eduardo Beltrame <[email protected]>
Co-authored-by: atchen <[email protected]>
scottgigante-immunai added a commit that referenced this pull request May 2, 2022
* Label docker images based on build location (#351)

* label docker images

* fix syntax

* Run benchmark only after unittests (#349)

* run benchmark after unittests

* always run cleanup

* cleanup

* If using GH actions image, test for git diff on dockerfile (#350)

* if using gh actions image, test for git diff on dockerfile

* allow empty tag for now

* decode

* if image doesn't exist, automatically github actions

* fix quotes

* fix parsing and committing of results on tag (#356)

* Import SCOT (#333)

* import SCOT

* pre-commit

* scran requires R

* check that aligned spaces are finite

* exclude unbalanced SCOT for now

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Scott Gigante <[email protected]>

* fix coverage badge # ci skip (#358)

* fix gh actions badge link # ci skip (#359)

* store results in /tmp (#361)

* Remove scot unbalanced (#360)

* Fix benchmark commit (#362)

* store results in /tmp

* add skip_on_empty

* class doesn't have skip on empty

* remove scot altogether (#363)

* Allow codecov to fail on forks

* docker images separate PR (#354)

* docker images separate PR

* all R requirements in r_requirements.txt

* move github r packages to requirements file

* pre-commit

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Ignore AWS warning and clean up s3 properly (#366)

* ci cleanup

* ignore aws batch warning

* remove citeseq cbmc from DR (#367)

Co-authored-by: Scott Gigante <[email protected]>

* Update benchmark results # ci skip (#368)

Co-authored-by: SingleCellOpenProblems <[email protected]>

scottgigante-immunai added a commit that referenced this pull request May 2, 2022
* label docker images

* fix syntax

* Delete run_benchmark.yml

* Update from main (#378)

Co-authored-by: MalteDLuecken <[email protected]>
Co-authored-by: Wesley Lewis <[email protected]>
Co-authored-by: Daniel Burkhardt <[email protected]>
Co-authored-by: Scott Gigante <[email protected]>

* hotfix timeout-minutes (#374)

* use branch of scprep to provide R traceback (#376)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Scott Gigante <[email protected]>
Co-authored-by: Daniel Strobl <[email protected]>
Co-authored-by: SingleCellOpenProblems <[email protected]>
Co-authored-by: Luke Zappia <[email protected]>
Co-authored-by: Ben DeMeo <[email protected]>
Co-authored-by: Michal Klein <[email protected]>
Co-authored-by: michalk8 <[email protected]>
Co-authored-by: bendemeo <[email protected]>
Co-authored-by: MalteDLuecken <[email protected]>
Co-authored-by: Wesley Lewis <[email protected]>
Co-authored-by: Daniel Burkhardt <[email protected]>
Co-authored-by: Nikolay Markov <[email protected]>
Co-authored-by: adamgayoso <[email protected]>
Co-authored-by: Valentine Svensson <[email protected]>
Co-authored-by: Eduardo Beltrame <[email protected]>
Co-authored-by: atchen <[email protected]>

* Install libgeos-dev (#377)

* Label docker images based on build location (#351)

* label docker images

* fix syntax

* Run benchmark only after unittests (#349)

* run benchmark after unittests

* always run cleanup

* cleanup

* If using GH actions image, test for git diff on dockerfile (#350)

* if using gh actions image, test for git diff on dockerfile

* allow empty tag for now

* decode

* if image doesn't exist, automatically github actions

* fix quotes

* fix parsing and committing of results on tag (#356)

* Import SCOT (#333)

* import SCOT

* pre-commit

* scran requires R

* check that aligned spaces are finite

* exclude unbalanced SCOT for now

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Scott Gigante <[email protected]>

* fix coverage badge # ci skip (#358)

* fix gh actions badge link # ci skip (#359)

* store results in /tmp (#361)

* Remove scot unbalanced (#360)

* Fix benchmark commit (#362)

* store results in /tmp

* add skip_on_empty

* class doesn't have skip on empty

* remove scot altogether (#363)

* Allow codecov to fail on forks

* docker images separate PR (#354)

* docker images separate PR

* all R requirements in r_requirements.txt

* move github r packages to requirements file

* pre-commit

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Ignore AWS warning and clean up s3 properly (#366)

* ci cleanup

* ignore aws batch warning

* remove citeseq cbmc from DR (#367)

Co-authored-by: Scott Gigante <[email protected]>

* Update benchmark results # ci skip (#368)

Co-authored-by: SingleCellOpenProblems <[email protected]>

* Jamboree dimensionality reduction methods (#318)

* add densMAP package to python-extras

* pre-commit

* Add Ivis method

* Explicitly mention it's a CPU implementation

* Add forgotten import in __init__

* Remove redundant filtering

* Move ivis inside the function

* Make var names unique, add ivis[cpu] to README

* Pin tensorflow version

* Add NeuralEE skeleton

* Implement method

* added densmap and densne

* Fix typo pytoch -> torch

* pre-commit

* remove densne

* Add forgotten detach/cpu/numpy

* formatting

* pre-commit

* formatting

* formatting

* pre-commit

* formatting

* formatting

* formatting

* pre-commit

* formatting

* umap-learn implementation

* pre-commit

* Add docker image

* Add skeleton method

* formatting

* Implement method

* Fix some small bugs

* Add preprocessing

* Change batch size to 1k cells for aff. matrix

* Add new preprocessing

* Add new preprocessing

* Fix preprocessing

* Fix preprocessing

* pre-commit

* updated template for PR with PR evaluation checks (#314)

* Update alra.py (#304)

* Update alra.py

Fix pre-processing and transformation back into the original space

* pre-commit

* Update alra.py


* make sure necessary methods are imported


Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Daniel Burkhardt <[email protected]>

* Add scanpy preprocessing to densmap dimred method

* Rename preprocess_scanpy() to preprocess_logCPM_1kHVG()

* Add preprocessing suffix to dimred methods

* Subset object in preprocess_logCPM_1kHVG()

* Use standard names for input

* Add neuralee_logCPM_1kHVG method

* Add densmap_pca method

* Fix preprocess_logCPM_1kHVG()

Now returns an AnnData rather than acting in place
- Subsetting wasn't working in place

Also set HVG flavor to "cell_ranger"

* Add test argument to dimred methods

* Move preprocess_logCPM_1kHVG() to tools.normalize

* Change name in python-method-scvis Docker README

* Rename openproblems-python-method-scvis container

Now called open-problems-python36

* Fix AnnData ref in merge

* Copy object when subsetting in preprocess_logCPM_1kHVG()

* Move PCA to dimred methods

* Use preprocess_logCPM_1kHVG() in nn_ranking metrics

* Fix path in python36 dockerfile

* Add test kwarg to neuralee_default method

* Add check for n_var to preprocess_logCPM_1kHVG()

Should fix tests that were failing due to scverse/scanpy#2230

* Store raw counts in NeuralEE default method

* Update dimred README

* Replace X_input with PCA in ivis dimred method

* Refactor preprocess_logCPM_1kHVG() to log_cpm_hvg()

* Remove ivis

* pre-commit

Co-authored-by: Ben DeMeo <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Michal Klein <[email protected]>
Co-authored-by: michalk8 <[email protected]>
Co-authored-by: bendemeo <[email protected]>
Co-authored-by: MalteDLuecken <[email protected]>
Co-authored-by: Wesley Lewis <[email protected]>
Co-authored-by: Daniel Burkhardt <[email protected]>
Co-authored-by: Scott Gigante <[email protected]>

* Only cleanup AWS on success (#371)

* only cleanup on success

* pre-commit

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Jamboree label_projection task (#313)

* Add scvi-tools docker image

* add scanvi

* hvg command use 2000

* update scvi-tools version; use image

* train size

* scanvi mask test labels

* move import

* hvg on train only, fix hvg command

* add scarches scanvi

* use string labels in testing

* enforce batch metadata in dataset

* add batch metadata in pancreas random

* use train adata for scarches

* Add majority vote simple baseline

* test_mode

* use test instead of test mode, update contributing

* update contributing guide

* Added helper function to introduce label noise

* Actually return data with label noise

* Only introduce label noise on training data

* Made a pancreas dataset with label noise

* Reformat docstring

* Added reference to example label noise dataset in datasets __init__.py

* Add cengen C elegans data loader (#2)

* add CeNGEN C elegans neuron dataset

* add CeNGEN C elegans dataset for global tasks and for label_projection task

* fix lines being too long

* Reformat cengen data loader

* Create tabula_muris_senis.py

Needs a dataframe containing sample information in './tabula_muris_senis_data_objects/tabula_muris_senis_data_objects.csv'

load_tabula_muris_senis(method_list, organ_list) takes the methods and organs to extract data for and combines them into one AnnData object.
If method_list or organ_list is None, no filtering is done on that input.
EX: load_tabula_muris_senis(method_list=['facs'], organ_list=None) returns all FACS experiments for all organs in one AnnData object.
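A minimal sketch of the filtering behaviour described above. This is not the real loader: `load_tabula_muris_senis()` reads the CSV of data links and returns an AnnData object, while here a hypothetical `filter_tabula_muris_index()` helper and an in-memory pandas DataFrame stand in for that index, purely to illustrate the "None means no filter" semantics.

```python
import pandas as pd


def filter_tabula_muris_index(index_df, method_list=None, organ_list=None):
    """Keep index rows matching the requested methods/organs.

    A None argument means "do not filter on that column", mirroring the
    behaviour described in the commit message above.
    """
    mask = pd.Series(True, index=index_df.index)
    if method_list is not None:
        mask &= index_df["method"].isin(method_list)
    if organ_list is not None:
        mask &= index_df["organ"].isin(organ_list)
    return index_df[mask]


# Toy stand-in for tabula_muris_senis_data_objects.csv
index_df = pd.DataFrame(
    {
        "method": ["facs", "droplet", "facs"],
        "organ": ["lung", "lung", "liver"],
        "url": ["u1", "u2", "u3"],
    }
)

# All FACS experiments, all organs (organ_list=None -> no organ filter)
facs_only = filter_tabula_muris_index(index_df, method_list=["facs"])
print(sorted(facs_only["organ"]))  # ['liver', 'lung']
```

In the real loader each selected row would then be downloaded and the resulting objects concatenated into a single AnnData.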

* pre-commit

* Modify anndata in place in add_label_noise rather than copy

* Added CSV file with tabula muris senis data links

* Update tabula_muris_senis.py

* Add random_labels baseline to label_projection task

* Update tabula_muris_senis.py

* Update tabula_muris_senis.py

* pre-commit

* Update tabula_muris_senis.py

* pre-commit

* fix missing labels at prediction time

* Handle test flag through tests and docker, pass to methods

* If test method run, use 1 max_epoch for scvi-tools

* Use only 2 batches for sample dataset for label_projection

* Remove zebrafish random dataset

* Fix decorator dependency to <5.0.0

* Remove functools.wraps from docker decorator for test parameterization

* Fix cengen missing batch info

* Use functools.update_wrapper for docker test

* Add batch to pancreas_random_label_noise

* Make cengen test dataset have more cells per batch

* Set span=0.8 for hvg call for scanvi_hvg methods

* Set span=0.8 for HVG selection only in test mode for scvi

* Revert "Handle test flag through tests and docker, pass to methods"

This reverts commit 3b940c0.

* Add test parameter to label proj baselines

* Fix flake remove unused import

* Revert "Remove zebrafish random dataset"

This reverts commit 3915798.

* Update scVI setup_anndata to new version

* pre-commit

* Reformat and rerun tests

* Add code_url and code_version for baseline label proj methods

* Fallback HVG flavor for label projection task

* pre-commit

* Fix unused import

* Fix using highly_variable_genes

* Pin scvi-tools to 0.15.5

* Unpin scvi-tools, pin jax==0.3.6, see optuna/optuna-examples#99

* Add scikit-misc as requirement for scvi docker

* Pin jaxlib as well

* pin jaxlib along with jax

* Set paper_year to year of implementation

* Set random zebrafish split to 0.8+0.2

* Add tabula_muris_senis_lung_random dataset to label_projection

* pre-commit

* Add tabula muris senis datasets csv

* Fix loading tabula muris csv

* pre-commit

* Test loader for tabula muris senis

Co-authored-by: adamgayoso <[email protected]>
Co-authored-by: Valentine Svensson <[email protected]>
Co-authored-by: Eduardo Beltrame <[email protected]>
Co-authored-by: atchen <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Scott Gigante <[email protected]>

* Run `test_benchmark` on a self-hosted runner (#373)

* set up cirun

* use ubuntu standard AMI

* run nextflow on the self-hosted machine

* add to CONTRIBUTING

* update ami

* install unzip

* set up docker

* install docker from curl

* use t2.micro not nano

* use custom AMI

* pythonLocation

* add scripts to path

* larger disk size

* new image again

* chown for now

* chmod 755

* fixed permissions

* use tower workspace

* test nextflow

* try again

* nextflow -q

* redirect stderr

* increase memory

* cleanup

* sudo install

* name

* try setting pythonpath

* fix branch env

* another fix

* fix run name

* typo

* fix pythonpath:

* don't use pushd

* pass pythonpath

* set nousersite

* empty

* sudo install

* run attempt

* revert temporary changes

* cleanup

* fix contributing

* add instructions for tower

* fix repo name

* move ami setup into script

* Import Olsson 2016 dataset for dimred task (#352)

* Import Olsson 2016 dataset for dimred task

* Fix path to Olsson dataset loader

* Filter genes and cells before subsetting Olsson data in test

* Use highly expressed genes for test Olsson dataset

Test dataset is now 700 genes by 300 cells (was 500 x 500)

* Add ivis dimred method (#369)

* add densMAP package to python-extras

* pre-commit

* Add Ivis method

* Explicitly mention it's a CPU implementation

* Add forgotten import in __init__

* Remove redundant filtering

* Move ivis inside the function

* Make var names unique, add ivis[cpu] to README

* Pin tensorflow version

* Add NeuralEE skeleton

* Implement method

* added densmap and densne

* Fix typo pytoch -> torch

* pre-commit

* remove densne

* Add forgotten detach/cpu/numpy

* formatting

* pre-commit

* formatting

* formatting

* pre-commit

* formatting

* formatting

* formatting

* pre-commit

* formatting

* umap-learn implementation

* pre-commit

* Add docker image

* Add skeleton method

* formatting

* Implement method

* Fix some small bugs

* Add preprocessing

* Change batch size to 1k cells for aff. matrix

* Add new preprocessing

* Add new preprocessing

* Fix preprocessing

* Fix preprocessing

* pre-commit

* updated template for PR with PR evaluation checks (#314)

* Update alra.py (#304)

* Update alra.py

Fix pre-processing and transformation back into the original space

* pre-commit

* Update alra.py


* make sure necessary methods are imported


Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Daniel Burkhardt <[email protected]>

* Add scanpy preprocessing to densmap dimred method

* Rename preprocess_scanpy() to preprocess_logCPM_1kHVG()

* Add preprocessing suffix to dimred methods

* Subset object in preprocess_logCPM_1kHVG()

* Use standard names for input

* Add neuralee_logCPM_1kHVG method

* Add densmap_pca method

* Fix preprocess_logCPM_1kHVG()

Now returns an AnnData rather than acting in place
- Subsetting wasn't working in place

Also set HVG flavor to "cell_ranger"

* Add test argument to dimred methods

* Move preprocess_logCPM_1kHVG() to tools.normalize

* Change name in python-method-scvis Docker README

* Rename openproblems-python-method-scvis container

Now called open-problems-python36

* Fix AnnData ref in merge

* Copy object when subsetting in preprocess_logCPM_1kHVG()

* Move PCA to dimred methods

* Use preprocess_logCPM_1kHVG() in nn_ranking metrics

* Fix path in python36 dockerfile

* Add test kwarg to neuralee_default method

* Add check for n_var to preprocess_logCPM_1kHVG()

Should fix tests that were failing due to scverse/scanpy#2230

* Store raw counts in NeuralEE default method

* Update dimred README

* Replace X_input with PCA in ivis dimred method

* Refactor preprocess_logCPM_1kHVG() to log_cpm_hvg()

* Re-add ivis

Co-authored-by: Ben DeMeo <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Michal Klein <[email protected]>
Co-authored-by: michalk8 <[email protected]>
Co-authored-by: bendemeo <[email protected]>
Co-authored-by: MalteDLuecken <[email protected]>
Co-authored-by: Wesley Lewis <[email protected]>
Co-authored-by: Daniel Burkhardt <[email protected]>
Co-authored-by: Scott Gigante <[email protected]>

* hotfix timeout-minutes (#374)

* use branch of scprep to provide R traceback (#376)

* Install libgeos-dev

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Scott Gigante <[email protected]>
Co-authored-by: Daniel Strobl <[email protected]>
Co-authored-by: SingleCellOpenProblems <[email protected]>
Co-authored-by: Luke Zappia <[email protected]>
Co-authored-by: Ben DeMeo <[email protected]>
Co-authored-by: Michal Klein <[email protected]>
Co-authored-by: michalk8 <[email protected]>
Co-authored-by: bendemeo <[email protected]>
Co-authored-by: MalteDLuecken <[email protected]>
Co-authored-by: Wesley Lewis <[email protected]>
Co-authored-by: Daniel Burkhardt <[email protected]>
Co-authored-by: Nikolay Markov <[email protected]>
Co-authored-by: adamgayoso <[email protected]>
Co-authored-by: Valentine Svensson <[email protected]>
Co-authored-by: Eduardo Beltrame <[email protected]>
Co-authored-by: atchen <[email protected]>

* Install libgeos-dev

* Update test_docker (#379)

* Label docker images based on build location (#351)

* label docker images

* fix syntax

* Run benchmark only after unittests (#349)

* run benchmark after unittests

* always run cleanup

* cleanup

* If using GH actions image, test for git diff on dockerfile (#350)

* if using gh actions image, test for git diff on dockerfile

* allow empty tag for now

* decode

* if image doesn't exist, automatically github actions

* fix quotes

* fix parsing and committing of results on tag (#356)

* Import SCOT (#333)

* import SCOT

* pre-commit

* scran requires R

* check that aligned spaces are finite

* exclude unbalanced SCOT for now

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Scott Gigante <[email protected]>

* fix coverage badge # ci skip (#358)

* fix gh actions badge link # ci skip (#359)

* store results in /tmp (#361)

* Remove scot unbalanced (#360)

* Fix benchmark commit (#362)

* store results in /tmp

* add skip_on_empty

* class doesn't have skip on empty

* remove scot altogether (#363)

* Allow codecov to fail on forks

* docker images separate PR (#354)

* docker images separate PR

* all R requirements in r_requirements.txt

* move github r packages to requirements file

* pre-commit

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Ignore AWS warning and clean up s3 properly (#366)

* ci cleanup

* ignore aws batch warning

* remove citeseq cbmc from DR (#367)

Co-authored-by: Scott Gigante <[email protected]>

* Update benchmark results # ci skip (#368)

Co-authored-by: SingleCellOpenProblems <[email protected]>

* Jamboree dimensionality reduction methods (#318)

* add densMAP package to python-extras

* pre-commit

* Add Ivis method

* Explicitly mention it's a CPU implementation

* Add forgotten import in __init__

* Remove redundant filtering

* Move ivis inside the function

* Make var names unique, add ivis[cpu] to README

* Pin tensorflow version

* Add NeuralEE skeleton

* Implement method

* added densmap and densne

* Fix typo pytoch -> torch

* pre-commit

* remove densne

* Add forgotten detach/cpu/numpy

* formatting

* pre-commit

* formatting

* formatting

* pre-commit

* formatting

* formatting

* formatting

* pre-commit

* formatting

* umap-learn implementation

* pre-commit

* Add docker image

* Add skeleton method

* formatting

* Implement method

* Fix some small bugs

* Add preprocessing

* Change batch size to 1k cells for aff. matrix

* Add new preprocessing

* Add new preprocessing

* Fix preprocessing

* Fix preprocessing

* pre-commit

* updated template for PR with PR evaluation checks (#314)

* Update alra.py (#304)

* Update alra.py

Fix pre-processing and transformation back into the original space

* pre-commit

* Update alra.py


* make sure necessary methods are imported


Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Daniel Burkhardt <[email protected]>

* Add scanpy preprocessing to densmap dimred method

* Rename preprocess_scanpy() to preprocess_logCPM_1kHVG()

* Add preprocessing suffix to dimred methods

* Subset object in preprocess_logCPM_1kHVG()

* Use standard names for input

* Add neuralee_logCPM_1kHVG method

* Add densmap_pca method

* Fix preprocess_logCPM_1kHVG()

Now returns an AnnData rather than acting in place
- Subsetting wasn't working in place

Also set HVG flavor to "cell_ranger"

* Add test argument to dimred methods

* Move preprocess_logCPM_1kHVG() to tools.normalize

* Change name in python-method-scvis Docker README

* Rename openproblems-python-method-scvis container

Now called open-problems-python36

* Fix AnnData ref in merge

* Copy object when subsetting in preprocess_logCPM_1kHVG()

* Move PCA to dimred methods

* Use preprocess_logCPM_1kHVG() in nn_ranking metrics

* Fix path in python36 dockerfile

* Add test kwarg to neuralee_default method

* Add check for n_var to preprocess_logCPM_1kHVG()

Should fix tests that were failing due to scverse/scanpy#2230

* Store raw counts in NeuralEE default method

* Update dimred README

* Replace X_input with PCA in ivis dimred method

* Refactor preprocess_logCPM_1kHVG() to log_cpm_hvg()

* Remove ivis

* pre-commit

Co-authored-by: Ben DeMeo <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Michal Klein <[email protected]>
Co-authored-by: michalk8 <[email protected]>
Co-authored-by: bendemeo <[email protected]>
Co-authored-by: MalteDLuecken <[email protected]>
Co-authored-by: Wesley Lewis <[email protected]>
Co-authored-by: Daniel Burkhardt <[email protected]>
Co-authored-by: Scott Gigante <[email protected]>

* Only cleanup AWS on success (#371)

* only cleanup on success

* pre-commit

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Jamboree label_projection task (#313)

* Add scvi-tools docker image

* add scanvi

* hvg command use 2000

* update scvi-tools version; use image

* train size

* scanvi mask test labels

* move import

* hvg on train only, fix hvg command

* add scarches scanvi

* use string labels in testing

* enforce batch metadata in dataset

* add batch metadata in pancreas random

* use train adata for scarches

* Add majority vote simple baseline

* test_mode

* use test instead of test mode, update contributing

* update contributing guide

* Added helper function to introduce label noise

* Actually return data with label noise

* Only introduce label noise on training data

* Made a pancreas dataset with label noise

* Reformat docstring

* Added reference to example label noise dataset in datasets __init__.py

* Add cengen C elegans data loader (#2)

* add CeNGEN C elegans neuron dataset

* add CeNGEN C elegans dataset for global tasks and for label_projection task

* fix lines being too long

* Reformat cengen data loader

* Create tabula_muris_senis.py

Needs a dataframe containing sample information in './tabula_muris_senis_data_objects/tabula_muris_senis_data_objects.csv'

load_tabula_muris_senis(method_list, organ_list) takes the methods and organs to extract data for and combines them into one AnnData object.
If method_list or organ_list is None, no filtering is done on that input.
EX: load_tabula_muris_senis(method_list=['facs'], organ_list=None) returns all FACS experiments for all organs in one AnnData object.

* pre-commit

* Modify anndata in place in add_label_noise rather than copy

* Added CSV file with tabula muris senis data links

* Update tabula_muris_senis.py

* Add random_labels baseline to label_projection task

* Update tabula_muris_senis.py

* Update tabula_muris_senis.py

* pre-commit

* Update tabula_muris_senis.py

* pre-commit

* fix missing labels at prediction time

* Handle test flag through tests and docker, pass to methods

* If test method run, use 1 max_epoch for scvi-tools

* Use only 2 batches for sample dataset for label_projection

* Remove zebrafish random dataset

* Fix decorator dependency to <5.0.0

* Remove functools.wraps from docker decorator for test parameterization

* Fix cengen missing batch info

* Use functools.update_wrapper for docker test

* Add batch to pancreas_random_label_noise

* Make cengen test dataset have more cells per batch

* Set span=0.8 for hvg call for scanvi_hvg methods

* Set span=0.8 for HVG selection only in test mode for scvi

* Revert "Handle test flag through tests and docker, pass to methods"

This reverts commit 3b940c0.

* Add test parameter to label proj baselines

* Fix flake remove unused import

* Revert "Remove zebrafish random dataset"

This reverts commit 3915798.

* Update scVI setup_anndata to new version

* pre-commit

* Reformat and rerun tests

* Add code_url and code_version for baseline label proj methods

* Fallback HVG flavor for label projection task

* pre-commit

* Fix unused import

* Fix using highly_variable_genes

* Pin scvi-tools to 0.15.5

* Unpin scvi-tools, pin jax==0.3.6, see optuna/optuna-examples#99

* Add scikit-misc as requirement for scvi docker

* Pin jaxlib as well

* pin jaxlib along with jax

* Set paper_year to year of implementation

* Set random zebrafish split to 0.8+0.2

* Add tabula_muris_senis_lung_random dataset to label_projection

* pre-commit

* Add tabula muris senis datasets csv

* Fix loading tabula muris csv

* pre-commit

* Test loader for tabula muris senis

Co-authored-by: adamgayoso <[email protected]>
Co-authored-by: Valentine Svensson <[email protected]>
Co-authored-by: Eduardo Beltrame <[email protected]>
Co-authored-by: atchen <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Scott Gigante <[email protected]>

* Run `test_benchmark` on a self-hosted runner (#373)

* set up cirun

* use ubuntu standard AMI

* run nextflow on the self-hosted machine

* add to CONTRIBUTING

* update ami

* install unzip

* set up docker

* install docker from curl

* use t2.micro not nano

* use custom AMI

* pythonLocation

* add scripts to path

* larger disk size

* new image again

* chown for now

* chmod 755

* fixed permissions

* use tower workspace

* test nextflow

* try again

* nextflow -q

* redirect stderr

* increase memory

* cleanup

* sudo install

* name

* try setting pythonpath

* fix branch env

* another fix

* fix run name

* typo

* fix pythonpath:

* don't use pushd

* pass pythonpath

* set nousersite

* empty

* sudo install

* run attempt

* revert temporary changes

* cleanup

* fix contributing

* add instructions for tower

* fix repo name

* move ami setup into script

* Import Olsson 2016 dataset for dimred task (#352)

* Import Olsson 2016 dataset for dimred task

* Fix path to Olsson dataset loader

* Filter genes and cells before subsetting Olsson data in test

* Use highly expressed genes for test Olsson dataset

Test dataset is now 700 genes by 300 cells (was 500 x 500)

* Add ivis dimred method (#369)

* add densMAP package to python-extras

* pre-commit

* Add Ivis method

* Explicitly mention it's a CPU implementation

* Add forgotten import in __init__

* Remove redundant filtering

* Move ivis inside the function

* Make var names unique, add ivis[cpu] to README

* Pin tensorflow version

* Add NeuralEE skeleton

* Implement method

* added densmap and densne

* Fix typo pytoch -> torch

* pre-commit

* remove densne

* Add forgotten detach/cpu/numpy

* formatting

* pre-commit

* formatting

* formatting

* pre-commit

* formatting

* formatting

* formatting

* pre-commit

* formatting

* umap-learn implementation

* pre-commit

* Add docker image

* Add skeleton method

* formatting

* Implement method

* Fix some small bugs

* Add preprocessing

* Change batch size to 1k cells for aff. matrix

* Add new preprocessing

* Add new preprocessing

* Fix preprocessing

* Fix preprocessing

* pre-commit

* updated template for PR with PR evaluation checks (#314)

* Update alra.py (#304)

* Update alra.py

Fix pre-processing and transformation back into the original space

* pre-commit

* Update alra.py


* make sure necessary methods are imported


Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Daniel Burkhardt <[email protected]>

* Add scanpy preprocessing to densmap dimred method

* Rename preprocess_scanpy() to preprocess_logCPM_1kHVG()

* Add preprocessing suffix to dimred methods

* Subset object in preprocess_logCPM_1kHVG()

* Use standard names for input

* Add neuralee_logCPM_1kHVG method

* Add densmap_pca method

* Fix preprocess_logCPM_1kHVG()

Now returns an AnnData rather than acting in place
- Subsetting wasn't working in place

Also set HVG flavor to "cell_ranger"

* Add test argument to dimred methods

* Move preprocess_logCPM_1kHVG() to tools.normalize

* Change name in python-method-scvis Docker README

* Rename openproblems-python-method-scvis container

Now called open-problems-python36

* Fix AnnData ref in merge

* Copy object when subsetting in preprocess_logCPM_1kHVG()

* Move PCA to dimred methods

* Use preprocess_logCPM_1kHVG() in nn_ranking metrics

* Fix path in python36 dockerfile

* Add test kwarg to neuralee_default method

* Add check for n_var to preprocess_logCPM_1kHVG()

Should fix tests that were failing due to scverse/scanpy#2230

* Store raw counts in NeuralEE default method

* Update dimred README

* Replace X_input with PCA in ivis dimred method

* Refactor preprocess_logCPM_1kHVG() to log_cpm_hvg()

* Re-add ivis

Co-authored-by: Ben DeMeo <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Michal Klein <[email protected]>
Co-authored-by: michalk8 <[email protected]>
Co-authored-by: bendemeo <[email protected]>
Co-authored-by: MalteDLuecken <[email protected]>
Co-authored-by: Wesley Lewis <[email protected]>
Co-authored-by: Daniel Burkhardt <[email protected]>
Co-authored-by: Scott Gigante <[email protected]>

* hotfix timeout-minutes (#374)

* use branch of scprep to provide R traceback (#376)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Scott Gigante <[email protected]>
Co-authored-by: Daniel Strobl <[email protected]>
Co-authored-by: SingleCellOpenProblems <[email protected]>
Co-authored-by: Luke Zappia <[email protected]>
Co-authored-by: Ben DeMeo <[email protected]>
Co-authored-by: Michal Klein <[email protected]>
Co-authored-by: michalk8 <[email protected]>
Co-authored-by: bendemeo <[email protected]>
Co-authored-by: MalteDLuecken <[email protected]>
Co-authored-by: Wesley Lewis <[email protected]>
Co-authored-by: Daniel Burkhardt <[email protected]>
Co-authored-by: Nikolay Markov <[email protected]>
Co-authored-by: adamgayoso <[email protected]>
Co-authored-by: Valentine Svensson <[email protected]>
Co-authored-by: Eduardo Beltrame <[email protected]>
Co-authored-by: atchen <[email protected]>

* clean up dockerfile

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Scott Gigante <[email protected]>
Co-authored-by: Daniel Strobl <[email protected]>
Co-authored-by: SingleCellOpenProblems <[email protected]>
Co-authored-by: Luke Zappia <[email protected]>
Co-authored-by: Ben DeMeo <[email protected]>
Co-authored-by: Michal Klein <[email protected]>
Co-authored-by: michalk8 <[email protected]>
Co-authored-by: bendemeo <[email protected]>
Co-authored-by: MalteDLuecken <[email protected]>
Co-authored-by: Wesley Lewis <[email protected]>
Co-authored-by: Daniel Burkhardt <[email protected]>
Co-authored-by: Nikolay Markov <[email protected]>
Co-authored-by: adamgayoso <[email protected]>
Co-authored-by: Valentine Svensson <[email protected]>
Co-authored-by: Eduardo Beltrame <[email protected]>
Co-authored-by: atchen <[email protected]>
scottgigante-immunai added a commit that referenced this pull request May 10, 2022
* Fix rgeos install (#380)

* label docker images

* fix syntax

* Delete run_benchmark.yml

* Update from main (#378)

* Label docker images based on build location (#351)

* label docker images

* fix syntax

* Run benchmark only after unittests (#349)

* run benchmark after unittests

* always run cleanup

* cleanup

* If using GH actions image, test for git diff on dockerfile (#350)

* if using gh actions image, test for git diff on dockerfile

* allow empty tag for now

* decode

* if image doesn't exist, automatically github actions

* fix quotes

* fix parsing and committing of results on tag (#356)

* Import SCOT (#333)

* import SCOT

* pre-commit

* scran requires R

* check that aligned spaces are finite

* exclude unbalanced SCOT for now

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Scott Gigante <[email protected]>

* fix coverage badge # ci skip (#358)

* fix gh actions badge link # ci skip (#359)

* store results in /tmp (#361)

* Remove scot unbalanced (#360)

* Fix benchmark commit (#362)

* store results in /tmp

* add skip_on_empty

* class doesn't have skip on empty

* remove scot altogether (#363)

* Allow codecov to fail on forks

* docker images separate PR (#354)

* docker images separate PR

* all R requirements in r_requirements.txt

* move github r packages to requirements file

* pre-commit

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Ignore AWS warning and clean up s3 properly (#366)

* ci cleanup

* ignore aws batch warning

* remove citeseq cbmc from DR (#367)

Co-authored-by: Scott Gigante <[email protected]>

* Update benchmark results # ci skip (#368)

Co-authored-by: SingleCellOpenProblems <[email protected]>

* Jamboree dimensionality reduction methods (#318)

* add densMAP package to python-extras

* pre-commit

* Add Ivis method

* Explicitly mention it's CPU implementation

* Add forgotten import in __init__

* Remove redundant filtering

* Move ivis inside the function

* Make var names unique, add ivis[cpu] to README

* Pin tensorflow version

* Add NeuralEE skeleton

* Implement method

* added densmap and densne

* Fix typo pytoch -> torch

* pre-commit

* remove densne

* Add forgotten detach/cpu/numpy

* formatting

* pre-commit

* formatting

* formatting

* pre-commit

* formatting

* formatting

* formatting

* pre-commit

* formatting

* umap-learn implementation

* pre-commit

* Add docker image

* Add skeleton method

* formatting

* Implement method

* Fix some small bugs

* Add preprocessing

* Change batch size to 1k cells for aff. matrix

* Add new preprocessing

* Add new preprocessing

* Fix preprocessing

* Fix preprocessing

* pre-commit

* updated template for PR with PR evaluation checks (#314)

* Update alra.py (#304)

* Update alra.py

Fix pre-processing and transformation back into the original space

* pre-commit

* Update alra.py


* make sure necessary methods are imported


Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Daniel Burkhardt <[email protected]>

* Add scanpy preprocessing to densmap dimred method

* Rename preprocess_scanpy() to preprocess_logCPM_1kHVG()

* Add preprocessing suffix to dimred methods

* Subset object in preprocess_logCPM_1kHVG()

* Use standard names for input

* Add neuralee_logCPM_1kHVG method

* Add densmap_pca method

* Fix preprocess_logCPM_1kHVG()

Now returns an AnnData rather than acting in place
- Subsetting wasn't working in place

Also set HVG flavor to "cell_ranger"

* Add test argument to dimred methods

* Move preprocess_logCPM_1kHVG() to tools.normalize

* Change name in python-method-scvis Docker README

* Rename openproblems-python-method-scvis container

Now called open-problems-python36

* Fix AnnData ref in merge

* Copy object when subsetting in preprocess_logCPM_1kHVG()

* Move PCA to dimred methods

* Use preprocess_logCPM_1kHVG() in nn_ranking metrics

* Fix path in python36 dockerfile

* Add test kwarg to neuralee_default method

* Add check for n_var to preprocess_logCPM_1kHVG()

Should fix tests that were failing due to scverse/scanpy#2230

* Store raw counts in NeuralEE default method

* Update dimred README

* Replace X_input with PCA in ivis dimred method

* Refactor preprocess_logCPM_1kHVG() to log_cpm_hvg()

* Remove ivis

* pre-commit

Co-authored-by: Ben DeMeo <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Michal Klein <[email protected]>
Co-authored-by: michalk8 <[email protected]>
Co-authored-by: bendemeo <[email protected]>
Co-authored-by: MalteDLuecken <[email protected]>
Co-authored-by: Wesley Lewis <[email protected]>
Co-authored-by: Daniel Burkhardt <[email protected]>
Co-authored-by: Scott Gigante <[email protected]>

* Only cleanup AWS on success (#371)

* only cleanup on success

* pre-commit

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Jamboree label_projection task (#313)

* Add scvi-tools docker image

* add scanvi

* hvg command use 2000

* update scvi-tools version; use image

* train size

* scanvi mask test labels

* move import

* hvg on train only, fix hvg command

* add scarches scanvi

* use string labels in testing

* enforce batch metadata in dataset

* add batch metadata in pancreas random

* use train adata for scarches

* Add majority vote simple baseline

* test_mode

* use test instead of test mode, update contributing

* update contributing guide

* Added helper function to introduce label noise

* Actually return data with label noise

* Only introduce label noise on training data

* Made a pancreas dataset with label noise

* Reformat docstring

* Added reference to example label noise dataset in datasets __init__.py

* Add cengen C elegans data loader (#2)

* add CeNGEN C elegans neuron dataset

* add CeNGEN C elegans dataset for global tasks and for label_projection task

* fix lines being too long

* Reformat cengen data loader

* Create tabula_muris_senis.py

Needs a dataframe containing sample information in './tabula_muris_senis_data_objects/tabula_muris_senis_data_objects.csv'.

load_tabula_muris_senis(method_list, organ_list) takes in methods and organs to extract data from and combines them into one AnnData object.
If method_list or organ_list is None, no filtering is applied on that input.
EX: load_tabula_muris_senis(method_list=['facs'], organ_list=None) returns all FACS experiments for all organs in one AnnData object.
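The None-means-no-filter selection described above can be sketched as follows. This is a standalone illustration, not the actual loader: the real function reads a CSV of download links and returns an AnnData, whereas here `filter_objects` and the example `records` are hypothetical stand-ins that only demonstrate the filtering logic.

```python
def filter_objects(records, method_list=None, organ_list=None):
    """Filter sample-metadata records; None means 'do not filter on that field'."""
    if method_list is not None:
        records = [r for r in records if r["method"] in method_list]
    if organ_list is not None:
        records = [r for r in records if r["organ"] in organ_list]
    return records

# Hypothetical rows standing in for the tabula_muris_senis_data_objects.csv metadata.
records = [
    {"method": "facs", "organ": "lung"},
    {"method": "droplet", "organ": "lung"},
    {"method": "facs", "organ": "liver"},
]

print(len(filter_objects(records, method_list=["facs"])))  # prints 2: all FACS rows, any organ
```

In the real loader, each surviving record's download link would then be fetched and the resulting objects concatenated into a single AnnData.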

* pre-commit

* Modify anndata in place in add_label_noise rather than copy

* Added CSV file with tabula muris senis data links

* Update tabula_muris_senis.py

* Add random_labels baseline to label_projection task

* Update tabula_muris_senis.py

* Update tabula_muris_senis.py

* pre-commit

* Update tabula_muris_senis.py

* pre-commit

* fix missing labels at prediction time

* Handle test flag through tests and docker, pass to methods

* If test method run, use 1 max_epoch for scvi-tools

* Use only 2 batches for sample dataset for label_projection

* Remove zebrafish random dataset

* Fix decorator dependency to <5.0.0

* Remove functools.wraps from docker decorator for test parameterization

* Fix cengen missing batch info

* Use functools.update_wrapper for docker test

* Add batch to pancreas_random_label_noise

* Make cengen test dataset have more cells per batch

* Set span=0.8 for hvg call for scanvi_hvg methods

* Set span=0.8 for HVG selection only in test mode for scvi

* Revert "Handle test flag through tests and docker, pass to methods"

This reverts commit 3b940c0.

* Add test parameter to label proj baselines

* Fix flake remove unused import

* Revert "Remove zebrafish random dataset"

This reverts commit 3915798.

* Update scVI setup_anndata to new version

* pre-commit

* Reformat and rerun tests

* Add code_url and code_version for baseline label proj methods

* Fallback HVG flavor for label projection task

* pre-commit

* Fix unused import

* Fix using highly_variable_genes

* Pin scvi-tools to 0.15.5

* Unpin scvi-tools, pin jax==0.3.6, see optuna/optuna-examples#99

* Add scikit-misc as requirement for scvi docker

* Pin jaxlib as well

* pin jaxlib along with jax

* Set paper_year to year of implementation

* Set random zebrafish split to 0.8+0.2

* Add tabula_muris_senis_lung_random dataset to label_projection

* pre-commit

* Add tabula muris senis datasets csv

* Fix loading tabula muris csv

* pre-commit

* Test loader for tabula muris senis

Co-authored-by: adamgayoso <[email protected]>
Co-authored-by: Valentine Svensson <[email protected]>
Co-authored-by: Eduardo Beltrame <[email protected]>
Co-authored-by: atchen <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Scott Gigante <[email protected]>

* Run `test_benchmark` on a self-hosted runner (#373)

* set up cirun

* use ubuntu standard AMI

* run nextflow on the self-hosted machine

* add to CONTRIBUTING

* update ami

* install unzip

* set up docker

* install docker from curl

* use t2.micro not nano

* use custom AMI

* pythonLocation

* add scripts to path

* larger disk size

* new image again

* chown for now

* chmod 755

* fixed permissions

* use tower workspace

* test nextflow

* try again

* nextflow -q

* redirect stderr

* increase memory

* cleanup

* sudo install

* name

* try setting pythonpath

* fix branch env

* another fix

* fix run name

* typo

* fix pythonpath:

* don't use pushd

* pass pythonpath

* set nousersite

* empty

* sudo install

* run attempt

* revert temporary changes

* cleanup

* fix contributing

* add instructions for tower

* fix repo name

* move ami setup into script

* Import Olsson 2016 dataset for dimred task (#352)

* Import Olsson 2016 dataset for dimred task

* Fix path to Olsson dataset loader

* Filter genes cells before subsetting Olsson data in test

* Use highly expressed genes for test Olsson dataset

Test dataset is now 700 genes by 300 cells (was 500 x 500)

* Add ivis dimred method (#369)

* Re-add ivis

Co-authored-by: Ben DeMeo <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Michal Klein <[email protected]>
Co-authored-by: michalk8 <[email protected]>
Co-authored-by: bendemeo <[email protected]>
Co-authored-by: MalteDLuecken <[email protected]>
Co-authored-by: Wesley Lewis <[email protected]>
Co-authored-by: Daniel Burkhardt <[email protected]>
Co-authored-by: Scott Gigante <[email protected]>

* hotfix timeout-minutes (#374)

* use branch of scprep to provide R traceback (#376)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Scott Gigante <[email protected]>
Co-authored-by: Daniel Strobl <[email protected]>
Co-authored-by: SingleCellOpenProblems <[email protected]>
Co-authored-by: Luke Zappia <[email protected]>
Co-authored-by: Ben DeMeo <[email protected]>
Co-authored-by: Michal Klein <[email protected]>
Co-authored-by: michalk8 <[email protected]>
Co-authored-by: bendemeo <[email protected]>
Co-authored-by: MalteDLuecken <[email protected]>
Co-authored-by: Wesley Lewis <[email protected]>
Co-authored-by: Daniel Burkhardt <[email protected]>
Co-authored-by: Nikolay Markov <[email protected]>
Co-authored-by: adamgayoso <[email protected]>
Co-authored-by: Valentine Svensson <[email protected]>
Co-authored-by: Eduardo Beltrame <[email protected]>
Co-authored-by: atchen <[email protected]>

* Install libgeos-dev (#377)

* Install libgeos-dev

* Update test_docker (#379)


* add scarches scanvi

* use string labels in testing

* enforce batch metadata in dataset

* add batch metadata in pancreas random

* use train adata for scarches

* Add majority vote simple baseline

* test_mode

* use test instead of test mode, update contributing

* update contributing guide

* Added helper function to introduce label noise

* Actually return data with label noise

* Only introduce label noise on training data
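The label-noise commits above describe a helper that corrupts a fraction of the training labels. A minimal sketch of what such a helper might look like (the function name, signature, and seeding behavior are assumptions, not the PR's actual implementation):

```python
import numpy as np

def add_label_noise(labels, noise_rate, seed=0):
    """Randomly reassign a fraction `noise_rate` of labels to a different class.

    Hypothetical helper mirroring the commit descriptions: noise is applied
    uniformly at random, and a flipped label is always changed to some
    *other* class so the noise rate is the true corruption rate.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels).copy()
    classes = np.unique(labels)
    # Decide independently for each cell whether its label is corrupted
    flip = rng.random(labels.size) < noise_rate
    for i in np.flatnonzero(flip):
        choices = classes[classes != labels[i]]
        labels[i] = rng.choice(choices)
    return labels
```

Per the "Only introduce label noise on training data" commit, such a helper would be applied to the training split only, leaving test labels untouched for evaluation.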

* Made a pancreas dataset with label noise

* Reformat docstring

* Added reference to example label noise dataset in datasets __init__.py

* Add cengen C elegans data loader (#2)

* add CeNGEN C elegans neuron dataset

* add CeNGEN C elegans dataset for global tasks and for label_projection task

* fix lines being too long

* Reformat cengen data loader

* Create tabula_muris_senis.py

Needs a dataframe containing sample information in './tabula_muris_senis_data_objects/tabula_muris_senis_data_objects.csv'.

load_tabula_muris_senis(method_list, organ_list) takes the methods and organs to extract data for and combines them into one AnnData object.
If method_list or organ_list is None, no filtering is applied for that input.
EX: load_tabula_muris_senis(method_list=['facs'], organ_list=None) returns all FACS experiments for all organs in one AnnData object.
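The filtering semantics described in this commit message (None meaning "keep everything" for that field) can be sketched with a small standalone helper; the record keys below (`method`, `organ`) are hypothetical illustrations of the CSV's columns, not confirmed by the PR:

```python
def filter_objects(records, method_list=None, organ_list=None):
    """Select data-object records by method and organ.

    Mirrors the loader's described behavior: a None filter means
    no filtering is applied for that field.
    """
    def keep(rec):
        return (
            (method_list is None or rec["method"] in method_list)
            and (organ_list is None or rec["organ"] in organ_list)
        )
    return [rec for rec in records if keep(rec)]
```

A loader like load_tabula_muris_senis would then, presumably, download and concatenate the AnnData objects for the surviving records; for example, `filter_objects(records, method_list=['facs'])` keeps only the FACS entries across all organs.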

* pre-commit

* Modify anndata in place in add_label_noise rather than copy

* Added CSV file with tabula muris senis data links

* Update tabula_muris_senis.py

* Add random_labels baseline to label_projection task

* Update tabula_muris_senis.py

* Update tabula_muris_senis.py

* pre-commit

* Update tabula_muris_senis.py

* pre-commit

* fix missing labels at prediction time

* Handle test flag through tests and docker, pass to methods

* If test method run, use 1 max_epoch for scvi-tools

* Use only 2 batches for sample dataset for label_projection

* Remove zebrafish random dataset

* Fix decorator dependency to <5.0.0

* Remove functools.wraps from docker decorator for test parameterization

* Fix cengen missing batch info

* Use functools.update_wrapper for docker test

* Add batch to pancreas_random_label_noise

* Make cengen test dataset have more cells per batch

* Set span=0.8 for hvg call for scanvi_hvg methods

* Set span=0.8 for HVG selection only in test mode for scvi

* Revert "Handle test flag through tests and docker, pass to methods"

This reverts commit 3b940c0.

* Add test parameter to label proj baselines

* Fix flake remove unused import

* Revert "Remove zebrafish random dataset"

This reverts commit 3915798.

* Update scVI setup_anndata to new version

* pre-commit

* Reformat and rerun tests

* Add code_url and code_version for baseline label proj methods

* Fallback HVG flavor for label projection task

* pre-commit

* Fix unused import

* Fix using highly_variable_genes

* Pin scvi-tools to 0.15.5

* Unpin scvi-tools, pin jax==0.3.6, see optuna/optuna-examples#99

* Add scikit-misc as requirement for scvi docker

* Pin jaxlib as well

* pin jaxlib along with jax

* Set paper_year to year of implementation

* Set random zebrafish split to 0.8+0.2

* Add tabula_muris_senis_lung_random dataset to label_projection

* pre-commit

* Add tabula muris senis datasets csv

* Fix loading tabula muris csv

* pre-commit

* Test loader for tabula muris senis

Co-authored-by: adamgayoso <[email protected]>
Co-authored-by: Valentine Svensson <[email protected]>
Co-authored-by: Eduardo Beltrame <[email protected]>
Co-authored-by: atchen <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Scott Gigante <[email protected]>

* Run `test_benchmark` on a self-hosted runner (#373)

* set up cirun

* use ubuntu standard AMI

* run nextflow on the self-hosted machine

* add to CONTRIBUTING

* update ami

* install unzip

* set up docker

* install docker from curl

* use t2.micro not nano

* use custom AMI

* pythonLocation

* add scripts to path

* larger disk size

* new image again

* chown for now

* chmod 755

* fixed permissions

* use tower workspace

* test nextflow

* try again

* nextflow -q

* redirect stderr

* increase memory

* cleanup

* sudo install

* name

* try setting pythonpath

* fix branch env

* another fix

* fix run name

* typo

* fix pythonpath:

* don't use pushd

* pass pythonpath

* set nousersite

* empty

* sudo install

* run attempt

* revert temporary changes

* cleanup

* fix contributing

* add instructions for tower

* fix repo name

* move ami setup into script

* Import Olsson 2016 dataset for dimred task (#352)

* Import Olsson 2016 dataset for dimred task

* Fix path to Olsson dataset loader

* Filter genes cells before subsetting Olsson data in test

* Use highly expressed genes for test Olsson dataset

Test dataset is now 700 genes by 300 cells (was 500 x 500)

* Add ivis dimred method (#369)

* add densMAP package to python-extras

* pre-commit

* Add Ivis method

* Explicitly mention it's CPU implementation

* Add forgotten import in __init__

* Remove redundant filtering

* Move ivis inside the function

* Make var names unique, add ivis[cpu] to README

* Pin tensorflow version

* Add NeuralEE skeleton

* Implement method

* added densmap and densne

* Fix typo pytoch -> torch

* pre-commit

* remove densne

* Add forgotten detach/cpu/numpy

* formatting

* pre-commit

* formatting

* formatting

* pre-commit

* formatting

* formatting

* formatting

* pre-commit

* formatting

* umap-learn implementation

* pre-commit

* Add docker image

* Add skeleton method

* formatting

* Implement method

* Fix some small bugs

* Add preprocessing

* Change batch size to 1k cells for aff. matrix

* Add new preprocessing

* Add new preprocessing

* Fix preprocessing

* Fix preprocessing

* pre-commit

* updated template for PR with PR evaluation checks (#314)

* Update alra.py (#304)

* Update alra.py

Fix pre-processing and transformation back into the original space

* pre-commit

* Update alra.py


* make sure necessary methods are imported


Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Daniel Burkhardt <[email protected]>

* Add scanpy preprocessing to densmap dimred method

* Rename preprocess_scanpy() to preprocess_logCPM_1kHVG()

* Add preprocessing suffix to dimred methods

* Subset object in preprocess_logCPM_1kHVG()

* Use standard names for input

* Add neuralee_logCPM_1kHVG method

* Add densmap_pca method

* Fix preprocess_logCPM_1kHVG()

Now returns an AnnData rather than acting in place
- Subsetting wasn't working in place

Also set HVG flavor to "cell_ranger"

* Add test argument to dimred methods

* Move preprocess_logCPM_1kHVG() to tools.normalize

* Change name in python-method-scvis Docker README

* Rename openproblems-python-method-scvis container

Now called open-problems-python36

* Fix AnnData ref in merge

* Copy object when subsetting in preprocess_logCPM_1kHVG()

* Move PCA to dimred methods

* Use preprocess_logCPM_1kHVG() in nn_ranking metrics

* Fix path in python36 dockerfile

* Add test kwarg to neuralee_default method

* Add check for n_var to preprocess_logCPM_1kHVG()

Should fix tests that were failing due to scverse/scanpy#2230

* Store raw counts in NeuralEE default method

* Update dimred README

* Replace X_input with PCA in ivis dimred method

* Refactor preprocess_logCPM_1kHVG() to log_cpm_hvg()

* Re-add ivis

Co-authored-by: Ben DeMeo <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Michal Klein <[email protected]>
Co-authored-by: michalk8 <[email protected]>
Co-authored-by: bendemeo <[email protected]>
Co-authored-by: MalteDLuecken <[email protected]>
Co-authored-by: Wesley Lewis <[email protected]>
Co-authored-by: Daniel Burkhardt <[email protected]>
Co-authored-by: Scott Gigante <[email protected]>

* hotfix timeout-minutes (#374)

* use branch of scprep to provide R traceback (#376)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Scott Gigante <[email protected]>
Co-authored-by: Daniel Strobl <[email protected]>
Co-authored-by: SingleCellOpenProblems <[email protected]>
Co-authored-by: Luke Zappia <[email protected]>
Co-authored-by: Ben DeMeo <[email protected]>
Co-authored-by: Michal Klein <[email protected]>
Co-authored-by: michalk8 <[email protected]>
Co-authored-by: bendemeo <[email protected]>
Co-authored-by: MalteDLuecken <[email protected]>
Co-authored-by: Wesley Lewis <[email protected]>
Co-authored-by: Daniel Burkhardt <[email protected]>
Co-authored-by: Nikolay Markov <[email protected]>
Co-authored-by: adamgayoso <[email protected]>
Co-authored-by: Valentine Svensson <[email protected]>
Co-authored-by: Eduardo Beltrame <[email protected]>
Co-authored-by: atchen <[email protected]>

* clean up dockerfile

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Scott Gigante <[email protected]>
Co-authored-by: Daniel Strobl <[email protected]>
Co-authored-by: SingleCellOpenProblems <[email protected]>
Co-authored-by: Luke Zappia <[email protected]>
Co-authored-by: Ben DeMeo <[email protected]>
Co-authored-by: Michal Klein <[email protected]>
Co-authored-by: michalk8 <[email protected]>
Co-authored-by: bendemeo <[email protected]>
Co-authored-by: MalteDLuecken <[email protected]>
Co-authored-by: Wesley Lewis <[email protected]>
Co-authored-by: Daniel Burkhardt <[email protected]>
Co-authored-by: Nikolay Markov <[email protected]>
Co-authored-by: adamgayoso <[email protected]>
Co-authored-by: Valentine Svensson <[email protected]>
Co-authored-by: Eduardo Beltrame <[email protected]>
Co-authored-by: atchen <[email protected]>

* only skip CI if command is in commit headline (#381)

* only skip if ci skip is in commit headline

* try using endsWith instead # ci skip

* Fix CI skip (#382)

* only skip if ci skip is in commit headline

* try using endsWith instead # ci skip

* make actions run

* upgrade AMI (#384)

* upgrade AMI

* uncomment docker

* uncomment tests

* Revert "Run test_benchmark on a self-hosted runner (#373)" (#386)

* revert 2d57868

* bash -x

* /bin/bash

* Bugfix CI (#387)

* upgrade AMI

* uncomment docker

* uncomment tests

* clean up testing

* tighter diff for testing

* more memory

* Revert "Bugfix CI (#387)" (#388)

This reverts commit b50a909.

* pass test arg to methods through CLI (#390)

* make scvi run faster on test mode (#385)

* make scvi run faster on test mode

* pass test argument through cli

* dirty hack to fix docker_build (#391)

* remove ivis temporarily (#392)

* neuralee fix (#383)

* build images before testing

* try something different

* needs

* fewer linebreaks

* try as string

* move the if

* remove one condition

* fix

* cancel more quickly

* run benchmark

* don't build on main in run_benchmark

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Scott Gigante <[email protected]>
Co-authored-by: Daniel Strobl <[email protected]>
Co-authored-by: SingleCellOpenProblems <[email protected]>
Co-authored-by: Luke Zappia <[email protected]>
Co-authored-by: Ben DeMeo <[email protected]>
Co-authored-by: Michal Klein <[email protected]>
Co-authored-by: michalk8 <[email protected]>
Co-authored-by: bendemeo <[email protected]>
Co-authored-by: MalteDLuecken <[email protected]>
Co-authored-by: Wesley Lewis <[email protected]>
Co-authored-by: Daniel Burkhardt <[email protected]>
Co-authored-by: Nikolay Markov <[email protected]>
Co-authored-by: adamgayoso <[email protected]>
Co-authored-by: Valentine Svensson <[email protected]>
Co-authored-by: Eduardo Beltrame <[email protected]>
Co-authored-by: atchen <[email protected]>
rcannood added a commit that referenced this pull request Sep 4, 2024
* fix opv1multimodal loader

* fix processor