
Update alra.py #304

Merged

Conversation


@wes-lewis wes-lewis commented Mar 31, 2021

Fix pre-processing and transformation back into the original space

Submission type

  • This submission adds a new dataset
  • This submission adds a new method
  • This submission adds a new metric
  • This submission adds a new task
  • This submission adds a new Docker image
  • This submission fixes a bug (link to related issue: )
  • This submission adds a new feature not listed above

Testing

  • This submission was written on a forked copy of SingleCellOpenProblems
  • GitHub Actions "Run Benchmark" tests are passing on the base branch of this pull request (include link to passed test: )
  • If this pull request is not ready for review (including passing the "Run Benchmark" tests), I will open this PR as a draft (click on the down arrow next to the "Create Pull Request" button)

Submission guidelines

  • This submission follows the guidelines in our Contributing document
  • I have checked to ensure there aren't other open Pull Requests for the same update/change

wes-lewis and others added 2 commits March 31, 2021 11:03
Fix pre-processing and transformation back into the original space
@wes-lewis

This fixes the pre-processing (normalization and sqrt transform) and the inversion of that pre-processing back into the original space.
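The round trip described here (normalize each cell, sqrt-transform, then undo both steps to return to the original space) can be sketched in plain NumPy. This is an illustrative sketch, not the actual alra.py code; the CPM scale factor of 1e6 and the per-cell library sizes are assumptions:

```python
import numpy as np

def preprocess(counts):
    """Library-size normalize each cell (counts per million), then sqrt-transform."""
    lib_size = counts.sum(axis=1, keepdims=True)
    cpm = counts / lib_size * 1e6
    return np.sqrt(cpm), lib_size

def invert(transformed, lib_size):
    """Undo the sqrt transform and the normalization, returning to count space."""
    cpm = transformed ** 2
    return cpm / 1e6 * lib_size

counts = np.array([[1.0, 3.0], [2.0, 2.0]])
sqrt_cpm, lib_size = preprocess(counts)
recovered = invert(sqrt_cpm, lib_size)
assert np.allclose(recovered, counts)  # the round trip is lossless
```

In the real method an imputation step sits between the two halves, so the inversion is applied to imputed values rather than reproducing the input exactly.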

@dburkhardt dburkhardt self-requested a review April 5, 2021 18:50

@dburkhardt dburkhardt left a comment


So how did we decide to handle code versioning? Do you want to pin the code in ALRA.R to a release on GitHub? Can you copy the code over here?


LuckyMD commented Apr 6, 2021

> So how did we decide to handle code versioning? Do you want to pin the code in ALRA.R to a release on GitHub? Can you copy the code over here?

Generally, we could always use Docker build dates as versioning, no? It's the least clean way... but it should work (as long as the GitHub repos we depend on aren't changed retrospectively).
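The alternative raised above, pinning to a release rather than relying on build dates, amounts to fetching the vendored script from an immutable commit SHA. A minimal sketch; the repo path, file name, and SHA below are hypothetical placeholders, not the actual values used in openproblems:

```python
def pinned_raw_url(repo, commit_sha, path):
    """Build a raw.githubusercontent.com URL pinned to a specific commit,
    so later pushes to the repo cannot change what the build downloads."""
    return f"https://raw.githubusercontent.com/{repo}/{commit_sha}/{path}"

# hypothetical SHA, for illustration only
url = pinned_raw_url("KlugerLab/ALRA", "0123abc", "alra.R")
```

Pinning a SHA (or a tagged release) makes builds reproducible even if the upstream default branch changes.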

@dburkhardt dburkhardt merged commit a20bbc6 into openproblems-bio:master Apr 22, 2021
lazappi added a commit to lazappi/openproblems that referenced this pull request May 12, 2021
* master:
  Update alra.py (openproblems-bio#304)
  updated template for PR with PR evaluation checks (openproblems-bio#314)
lazappi pushed a commit to lazappi/openproblems that referenced this pull request May 12, 2021
* Update alra.py

Fix pre-processing and transformation back into the original space

* pre-commit

* Update alra.py


* make sure necessary methods are imported


Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Daniel Burkhardt <[email protected]>
lazappi added a commit to lazappi/openproblems that referenced this pull request May 12, 2021
* master:
  Update alra.py (openproblems-bio#304)
  updated template for PR with PR evaluation checks (openproblems-bio#314)
lazappi added a commit to lazappi/openproblems that referenced this pull request May 12, 2021
…ction

* dimred-methods: (34 commits)
  Update alra.py (openproblems-bio#304)
  updated template for PR with PR evaluation checks (openproblems-bio#314)
  pre-commit
  Fix preprocessing
  Fix preprocessing
  Add new preprocessing
  Add new preprocessing
  fix region
  fix nf wkdir
  use env.BRANCH
  rm echo
  BRANCH -> WKDIR for s3
  set branch variable in S3 setup job
  Fix s3 bucket clash
  Dimred methods preprocessing (openproblems-bio#301)
  Add trustworthiness score for dimred task (openproblems-bio#258)
  Add Method SCOT to Multi-modal data Integration Task (openproblems-bio#298)
  Change batch size to 1k cells for aff. matrix
  Add preprocessing
  Fix some small bugs
  ...
scottgigante-immunai added a commit that referenced this pull request Apr 21, 2022
* add densMAP package to python-extras

* pre-commit

* Add Ivis method

* Explicitly mention it's CPU implementation

* Add forgotten import in __init__

* Remove redundant filtering

* Move ivis inside the function

* Make var names unique, add ivis[cpu] to README

* Pin tensorflow version

* Add NeuralEE skeleton

* Implement method

* added densmap and densne

* Fix typo pytoch -> torch

* pre-commit

* remove densne

* Add forgotten detach/cpu/numpy

* formatting

* pre-commit

* formatting

* formatting

* pre-commit

* formatting

* formatting

* formatting

* pre-commit

* formatting

* umap-learn implementation

* pre-commit

* Add docker image

* Add skeleton method

* formatting

* Implement method

* Fix some small bugs

* Add preprocessing

* Change batch size to 1k cells for aff. matrix

* Add new preprocessing

* Add new preprocessing

* Fix preprocessing

* Fix preprocessing

* pre-commit

* updated template for PR with PR evaluation checks (#314)

* Update alra.py (#304)

* Update alra.py

Fix pre-processing and transformation back into the original space

* pre-commit

* Update alra.py


* make sure necessary methods are imported


Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Daniel Burkhardt <[email protected]>

* Add scanpy preprocessing to densmap dimred method

* Rename preprocess_scanpy() to preprocess_logCPM_1kHVG()

* Add preprocessing suffix to dimred methods

* Subset object in preprocess_logCPM_1kHVG()

* Use standard names for input

* Add neuralee_logCPM_1kHVG method

* Add densmap_pca method

* Fix preprocess_logCPM_1kHVG()

Now returns an AnnData rather than acting in place
- Subsetting wasn't working in place

Also set HVG flavor to "cell_ranger"
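As a rough illustration of the logCPM half of this preprocessing, here is a sketch in plain NumPy. It assumes a scale factor of 1e6; the real function additionally selects 1,000 highly variable genes via scanpy with the "cell_ranger" flavor, which is omitted here:

```python
import numpy as np

def log_cpm(counts):
    """Counts-per-million normalization followed by log1p."""
    cpm = counts / counts.sum(axis=1, keepdims=True) * 1e6
    return np.log1p(cpm)

X = np.array([[5.0, 5.0], [1.0, 9.0]])
logged = log_cpm(X)
# inverting with expm1 recovers rows that each sum to one million
assert np.allclose(np.expm1(logged).sum(axis=1), 1e6)
```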

* Add test argument to dimred methods

* Move preprocess_logCPM_1kHVG() to tools.normalize

* Change name in python-method-scvis Docker README

* Rename openproblems-python-method-scvis container

Now called open-problems-python36

* Fix AnnData ref in merge

* Copy object when subsetting in preprocess_logCPM_1kHVG()

* Move PCA to dimred methods

* Use preprocess_logCPM_1kHVG() in nn_ranking metrics

* Fix path in python36 dockerfile

* Add test kwarg to neuralee_default method

* Add check for n_var to preprocess_logCPM_1kHVG()

Should fix tests that were failing due to scverse/scanpy#2230

* Store raw counts in NeuralEE default method

* Update dimred README

* Replace X_input with PCA in ivis dimred method

* Refactor preprocess_logCPM_1kHVG() to log_cpm_hvg()

* Remove ivis

* pre-commit

Co-authored-by: Ben DeMeo <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Michal Klein <[email protected]>
Co-authored-by: michalk8 <[email protected]>
Co-authored-by: bendemeo <[email protected]>
Co-authored-by: MalteDLuecken <[email protected]>
Co-authored-by: Wesley Lewis <[email protected]>
Co-authored-by: Daniel Burkhardt <[email protected]>
Co-authored-by: Scott Gigante <[email protected]>
scottgigante-immunai added a commit that referenced this pull request Apr 26, 2022
[duplicate commit series omitted; identical to the Apr 21, 2022 commit above except for the final entry, "* Re-add ivis" in place of "* Remove ivis"]
scottgigante-immunai added a commit that referenced this pull request May 2, 2022
* Label docker images based on build location (#351)

* label docker images

* fix syntax

* Run benchmark only after unittests (#349)

* run benchmark after unittests

* always run cleanup

* cleanup

* If using GH actions image, test for git diff on dockerfile (#350)

* if using gh actions image, test for git diff on dockerfile

* allow empty tag for now

* decode

* if image doesn't exist, automatically github actions

* fix quotes

* fix parsing and committing of results on tag (#356)

* Import SCOT (#333)

* import SCOT

* pre-commit

* scran requires R

* check that aligned spaces are finite

* exclude unbalanced SCOT for now

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Scott Gigante <[email protected]>

* fix coverage badge # ci skip (#358)

* fix gh actions badge link # ci skip (#359)

* store results in /tmp (#361)

* Remove scot unbalanced (#360)

* Fix benchmark commit (#362)

* store results in /tmp

* add skip_on_empty

* class doesn't have skip on empty

* remove scot altogether (#363)

* Allow codecov to fail on forks

* docker images separate PR (#354)

* docker images separate PR

* all R requirements in r_requirements.txt

* move github r packages to requirements file

* pre-commit

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Ignore AWS warning and clean up s3 properly (#366)

* ci cleanup

* ignore aws batch warning

* remove citeseq cbmc from DR (#367)

Co-authored-by: Scott Gigante <[email protected]>

* Update benchmark results # ci skip (#368)

Co-authored-by: SingleCellOpenProblems <[email protected]>

* Jamboree dimensionality reduction methods (#318)

[duplicate commit series omitted; identical to the Apr 21, 2022 commit above, including its co-author list]

* Only cleanup AWS on success (#371)

* only cleanup on success

* pre-commit

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Jamboree label_projection task (#313)

* Add scvi-tools docker image

* add scanvi

* hvg command use 2000

* update scvi-tools version; use image

* train size

* scanvi mask test labels

* move import

* hvg on train only, fix hvg command

* add scarches scanvi

* use string labels in testing

* enforce batch metadata in dataset

* add batch metadata in pancreas random

* use train adata for scarches

* Add majority vote simple baseline

* test_mode

* use test instead of test mode, update contributing

* update contributing guide

* Added helper function to introduce label noise

* Actually return data with label noise

* Only introduce label noise on training data

* Made a pancreas dataset with label noise

* Reformat docstring

* Added reference to example label noise dataset in datasets __init__.py

* Add cengen C elegans data loader (#2)

* add CeNGEN C elegans neuron dataset

* add CeNGEN C elegans dataset for global tasks and for label_projection task

* fix lines being too long

* Reformat cengen data loader

* Create tabula_muris_senis.py

Needs a dataframe containing sample information in './tabula_muris_senis_data_objects/tabula_muris_senis_data_objects.csv'.

load_tabula_muris_senis(method_list, organ_list) takes in the methods and organs to extract data from and combines them into one AnnData object.
If method_list or organ_list is None, no filtering is done on that input.
E.g. load_tabula_muris_senis(method_list=['facs'], organ_list=None) returns all FACS experiments for all organs in one AnnData object.
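The None-means-no-filter behaviour described in that commit message can be sketched as follows. This is illustrative only; the field names "method" and "organ" are assumptions about the CSV columns, not the loader's actual schema:

```python
def filter_samples(samples, method_list=None, organ_list=None):
    """Keep samples matching both filters; a filter set to None is disabled."""
    return [
        s for s in samples
        if (method_list is None or s["method"] in method_list)
        and (organ_list is None or s["organ"] in organ_list)
    ]

samples = [
    {"method": "facs", "organ": "lung"},
    {"method": "droplet", "organ": "lung"},
]
assert filter_samples(samples, method_list=["facs"]) == [samples[0]]
assert filter_samples(samples) == samples  # no filters: everything passes
```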

* pre-commit

* Modify anndata in place in add_label_noise rather than copy

* Added CSV file with tabula muris senis data links

* Update tabula_muris_senis.py

* Add random_labels baseline to label_projection task

* Update tabula_muris_senis.py

* Update tabula_muris_senis.py

* pre-commit

* Update tabula_muris_senis.py

* pre-commit

* fix missing labels at prediction time

* Handle test flag through tests and docker, pass to methods

* If test method run, use 1 max_epoch for scvi-tools

* Use only 2 batches for sample dataset for label_projection

* Remove zebrafish random dataset

* Fix decorator dependency to <5.0.0

* Remove functools.wraps from docker decorator for test parameterization

* Fix cengen missing batch info

* Use functools.update_wrapper for docker test

* Add batch to pancreas_random_label_noise

* Make cengen test dataset have more cells per batch

* Set span=0.8 for hvg call for scanvi_hvg methods

* Set span=0.8 for HVG selection only in test mode for scvi

* Revert "Handle test flag through tests and docker, pass to methods"

This reverts commit 3b940c0.

* Add test parameter to label proj baselines

* Fix flake remove unused import

* Revert "Remove zebrafish random dataset"

This reverts commit 3915798.

* Update scVI setup_anndata to new version

* pre-commit

* Reformat and rerun tests

* Add code_url and code_version for baseline label proj methods

* Fallback HVG flavor for label projection task

* pre-commit

* Fix unused import

* Fix using highly_variable_genes

* Pin scvi-tools to 0.15.5

* Unpin scvi-tools, pin jax==0.3.6, see optuna/optuna-examples#99

* Add scikit-misc as requirement for scvi docker

* Pin jaxlib as well

* pin jaxlib along with jax

* Set paper_year to year of implementation

* Set random zebrafish split to 0.8+0.2

* Add tabula_muris_senis_lung_random dataset to label_projection

* pre-commit

* Add tabula muris senis datasets csv

* Fix loading tabula muris csv

* pre-commit

* Test loader for tabula muris senis

Co-authored-by: adamgayoso <[email protected]>
Co-authored-by: Valentine Svensson <[email protected]>
Co-authored-by: Eduardo Beltrame <[email protected]>
Co-authored-by: atchen <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Scott Gigante <[email protected]>

* Run `test_benchmark` on a self-hosted runner (#373)

* set up cirun

* use ubuntu standard AMI

* run nextflow on the self-hosted machine

* add to CONTRIBUTING

* update ami

* install unzip

* set up docker

* install docker from curl

* use t2.micro not nano

* use custom AMI

* pythonLocation

* add scripts to path

* larger disk size

* new image again

* chown for now

* chmod 755

* fixed permissions

* use tower workspace

* test nextflow

* try again

* nextflow -q

* redirect stderr

* increase memory

* cleanup

* sudo install

* name

* try setting pythonpath

* fix branch env

* another fix

* fix run name

* typo

* fix pythonpath:

* don't use pushd

* pass pythonpath

* set nousersite

* empty

* sudo install

* run attempt

* revert temporary changes

* cleanup

* fix contributing

* add instructions for tower

* fix repo name

* move ami setup into script

* Import Olsson 2016 dataset for dimred task (#352)

* Import Olsson 2016 dataset for dimred task

* Fix path to Olsson dataset loader

* Filter genes cells before subsetting Olsson data in test

* Use highly expressed genes for test Olsson dataset

Test dataset is now 700 genes by 300 cells (was 500 x 500)

* Add ivis dimred method (#369)

[duplicate commit series omitted; identical to the Apr 26, 2022 commit above, including its co-author list]

* hotfix timeout-minutes (#374)

* use branch of scprep to provide R traceback (#376)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Scott Gigante <[email protected]>
Co-authored-by: Daniel Strobl <[email protected]>
Co-authored-by: SingleCellOpenProblems <[email protected]>
Co-authored-by: Luke Zappia <[email protected]>
Co-authored-by: Ben DeMeo <[email protected]>
Co-authored-by: Michal Klein <[email protected]>
Co-authored-by: michalk8 <[email protected]>
Co-authored-by: bendemeo <[email protected]>
Co-authored-by: MalteDLuecken <[email protected]>
Co-authored-by: Wesley Lewis <[email protected]>
Co-authored-by: Daniel Burkhardt <[email protected]>
Co-authored-by: Nikolay Markov <[email protected]>
Co-authored-by: adamgayoso <[email protected]>
Co-authored-by: Valentine Svensson <[email protected]>
Co-authored-by: Eduardo Beltrame <[email protected]>
Co-authored-by: atchen <[email protected]>
scottgigante-immunai added a commit that referenced this pull request May 2, 2022
* Label docker images based on build location (#351)

* label docker images

* fix syntax

* Run benchmark only after unittests (#349)

* run benchmark after unittests

* always run cleanup

* cleanup

* If using GH actions image, test for git diff on dockerfile (#350)

* if using gh actions image, test for git diff on dockerfile

* allow empty tag for now

* decode

* if image doesn't exist, automatically github actions

* fix quotes

* fix parsing and committing of results on tag (#356)

* Import SCOT (#333)

* import SCOT

* pre-commit

* scran requires R

* check that aligned spaces are finite

* exclude unbalanced SCOT for now

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Scott Gigante <[email protected]>

* fix coverage badge # ci skip (#358)

* fix gh actions badge link # ci skip (#359)

* store results in /tmp (#361)

* Remove scot unbalanced (#360)

* Fix benchmark commit (#362)

* store results in /tmp

* add skip_on_empty

* class doesn't have skip on empty

* remove scot altogether (#363)

* Allow codecov to fail on forks

* docker images separate PR (#354)

* docker images separate PR

* all R requirements in r_requirements.txt

* move github r packages to requirements file

* pre-commit

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Ignore AWS warning and clean up s3 properly (#366)

* ci cleanup

* ignore aws batch warning

* remove citeseq cbmc from DR (#367)

Co-authored-by: Scott Gigante <[email protected]>

* Update benchmark results # ci skip (#368)

Co-authored-by: SingleCellOpenProblems <[email protected]>

* Jamboree dimensionality reduction methods (#318)

* add densMAP package to python-extras

* pre-commit

* Add Ivis method

* Explicitly mention it's CPU implementation

* Add forgotten import in __init__

* Remove redundant filtering

* Move ivis inside the function

* Make var names unique, add ivis[cpu] to README

* Pin tensorflow version

* Add NeuralEE skeleton

* Implement method

* added densmap and densne

* Fix typo pytoch -> torch

* pre-commit

* remove densne

* Add forgotten detach/cpu/numpy

* formatting

* pre-commit

* formatting

* formatting

* pre-commit

* formatting

* formatting

* formatting

* pre-commit

* formatting

* umap-learn implementation

* pre-commit

* Add docker image

* Add skeleton method

* formatting

* Implement method

* Fix some small bugs

* Add preprocessing

* Change batch size to 1k cells for aff. matrix

* Add new preprocessing

* Add new preprocessing

* Fix preprocessing

* Fix preprocessing

* pre-commit

* updated template for PR with PR evaluation checks (#314)

* Update alra.py (#304)

* Update alra.py

Fix pre-processing and transformation back into the original space

* pre-commit

* Update alra.py


* make sure necessary methods are imported


Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Daniel Burkhardt <[email protected]>

* Add scanpy preprocessing to densmap dimred method

* Rename preprocess_scanpy() to preprocess_logCPM_1kHVG()

* Add preprocessing suffix to dimred methods

* Subset object in preprocess_logCPM_1kHVG()

* Use standard names for input

* Add neuralee_logCPM_1kHVG method

* Add densmap_pca method

* Fix preprocess_logCPM_1kHVG()

Now returns an AnnData rather than acting in place
- Subsetting wasn't working in place

Also set HVG flavor to "cell_ranger"

* Add test argument to dimred methods

* Move preprocess_logCPM_1kHVG() to tools.normalize

* Change name in python-method-scvis Docker README

* Rename openproblems-python-method-scvis container

Now called open-problems-python36

* Fix AnnData ref in merge

* Copy object when subsetting in preprocess_logCPM_1kHVG()

* Move PCA to dimred methods

* Use preprocess_logCPM_1kHVG() in nn_ranking metrics

* Fix path in python36 dockerfile

* Add test kwarg to neuralee_default method

* Add check for n_var to preprocess_logCPM_1kHVG()

Should fix tests that were failing due to scverse/scanpy#2230

* Store raw counts in NeuralEE default method

* Update dimred README

* Replace X_input with PCA in ivis dimred method

* Refactor preprocess_logCPM_1kHVG() to log_cpm_hvg()

* Remove ivis

* pre-commit

Co-authored-by: Ben DeMeo <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Michal Klein <[email protected]>
Co-authored-by: michalk8 <[email protected]>
Co-authored-by: bendemeo <[email protected]>
Co-authored-by: MalteDLuecken <[email protected]>
Co-authored-by: Wesley Lewis <[email protected]>
Co-authored-by: Daniel Burkhardt <[email protected]>
Co-authored-by: Scott Gigante <[email protected]>

* Only cleanup AWS on success (#371)

* only cleanup on success

* pre-commit

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Jamboree label_projection task (#313)

* Add scvi-tools docker image

* add scanvi

* hvg command use 2000

* update scvi-tools version; use image

* train size

* scanvi mask test labels

* move import

* hvg on train only, fix hvg command

* add scarches scanvi

* use string labels in testing

* enforce batch metadata in dataset

* add batch metadata in pancreas random

* use train adata for scarches

* Add majority vote simple baseline

* test_mode

* use test instead of test mode, update contributing

* update contributing guide

* Added helper function to introduce label noise

* Actually return data with label noise

* Only introduce label noise on training data

* Made a pancreas dataset with label nosie

* Reformat docstring

* Added reference to example label noise dataset in datasets __init__.py

* Add cengen C elegans data loader (#2)

* add CeNGEN C elegans neuron dataset

* add CeNGEN C elegans dataset for global tasks and for label_projection task

* fix lines being too long

* Reformat cengen data loader

* Create tabula_muris_senis.py

Need dataframe containing sample information in './tabula_muris_senis_data_objects/tabula_muris_senis_data_objects.csv' 

load_tabula_muris_senis(method_list, organ_list) takes in methods and organs to extract data from and combines into one anndata object.
If method_list or organ_list = None, do not filter based on that input.
EX: load_tabula_muris_senis(method_list=['facs'], organ_list = None) returns all facs experiments for all organs in one anndata object.

* pre-commit

* Modify anndata in place in add_label_noise rather than copy

* Added CSV file with tabula muris senis data links

* Update tabula_muris_senis.py

* Add random_labels baseline to label_projection task

* Update tabula_muris_senis.py

* Update tabula_muris_senis.py

* pre-commit

* Update tabula_muris_senis.py

* pre-commit

* fix missing labels at prediction time

* Handle test flag through tests and docker, pass to methods

* If test method run, use 1 max_epoch for scvi-tools

* Use only 2 batches for sample dataset for label_projection

* Remove zebrafish random dataset

* Fix decorator dependency to <5.0.0

* Remove functools.wraps from docker decorator for test parameterization

* Fix cengen missing batch info

* Use functools.update_wrapper for docker test

* Add batch to pancreas_random_label_noise

* Make cengen test dataset have more cells per batch

* Set span=0.8 for hvg call for scanvi_hvg methods

* Set span=0.8 for HVG selection only in test mode for scvi

* Revert "Handle test flag through tests and docker, pass to methods"

This reverts commit 3b940c0.

* Add test parameter to label proj baselines

* Fix flake remove unused import

* Revert "Remove zebrafish random dataset"

This reverts commit 3915798.

* Update scVI setup_anndata to new version

* pre-commit

* Reformat and rerun tests

* Add code_url and code_version for baseline label proj methods

* Fallback HVG flavor for label projection task

* pre-commit

* Fix unused import

* Fix using highly_variable_genes

* Pin scvi-tools to 0.15.5

* Unpin scvi-tools, pin jax==0.3.6, see optuna/optuna-examples#99

* Add scikit-misc as requirement for scvi docker

* Pin jaxlib as well

* pin jaxlib along with jax

* Set paper_year to year of implementation

* Set random zebrafish split to 0.8+0.2

* Add tabula_muris_senis_lung_random dataset to label_projection

* pre-commit

* Add tabula muris senis datasets csv

* Fix loading tabula muris csv

* pre-commit

* Test loader for tabula muris senis

Co-authored-by: adamgayoso <[email protected]>
Co-authored-by: Valentine Svensson <[email protected]>
Co-authored-by: Eduardo Beltrame <[email protected]>
Co-authored-by: atchen <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Scott Gigante <[email protected]>

* Run `test_benchmark` on a self-hosted runner (#373)

* set up cirun

* use ubuntu standard AMI

* run nextflow on the self-hosted machine

* add to CONTRIBUTING

* update ami

* install unzip

* set up docker

* install docker from curl

* use t2.micro not nano

* use custom AMI

* pythonLocation

* add scripts to path

* larger disk size

* new image again

* chown for now

* chmod 755

* fixed permissions

* use tower workspace

* test nextflow

* try again

* nextflow -q

* redirect stderr

* increase memory

* cleanup

* sudo install

* name

* try setting pythonpath

* fix branch env

* another fix

* fix run name

* typo

* fix pythonpath:

* don't use pushd

* pass pythonpath

* set nousersite

* empty

* sudo install

* run attempt

* revert temporary changes

* cleanup

* fix contributing

* add instructions for tower

* fix repo name

* move ami setup into script

* Import Olsson 2016 dataset for dimred task (#352)

* Import Olsson 2016 dataset for dimred task

* Fix path to Olsson dataset loader

* Filter genes cells before subsetting Olsson data in test

* Use highly expressed genes for test Olsson dataset

Test dataset is now 700 genes by 300 cells (was 500 x 500)

* Add ivis dimred method (#369)

* add densMAP package to python-extras

* pre-commit

* Add Ivis method

* Explicitly mention it's CPU implementation

* Add forgotten import in __init__

* Remove redundant filtering

* Move ivis inside the function

* Make var names unique, add ivis[cpu] to README

* Pin tensorflow version

* Add NeuralEE skeleton

* Implement method

* added densmap and densne

* Fix typo pytoch -> torch

* pre-commit

* remove densne

* Add forgotten detach/cpu/numpy

* formatting

* pre-commit

* formatting

* formatting

* pre-commit

* formatting

* formatting

* formatting

* pre-commit

* formatting

* umap-learn implementation

* pre-commit

* Add docker image

* Add skeleton method

* formatting

* Implement method

* Fix some small bugs

* Add preprocessing

* Change batch size to 1k cells for aff. matrix

* Add new preprocessing

* Add new preprocessing

* Fix preprocessing

* Fix preprocessing

* pre-commit

* updated template for PR with PR evaluation checks (#314)

* Update alra.py (#304)

* Update alra.py

Fix pre-processing and transformation back into the original space
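
The fix concerns the sqrt-transform round-trip: ALRA's input is library-size normalized and square-root transformed, and the imputed output has to be mapped back into the original count space. A numpy-only sketch of that round-trip (function names and target_sum are illustrative, not the PR's code):

```python
import numpy as np

def preprocess(X, target_sum=1e4):
    # Library-size normalize each cell to target_sum counts, then sqrt.
    lib = X.sum(axis=1, keepdims=True)
    return np.sqrt(X / lib * target_sum), lib

def to_original_space(Y, lib, target_sum=1e4):
    # Invert the sqrt and the normalization, back to raw-count scale.
    return (Y ** 2) / target_sum * lib
```

Keeping the per-cell library sizes around is what makes the inversion exact.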

* pre-commit

* Update alra.py


* make sure necessary methods are imported


Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Daniel Burkhardt <[email protected]>

* Add scanpy preprocessing to densmap dimred method

* Rename preprocess_scanpy() to preprocess_logCPM_1kHVG()

* Add preprocessing suffix to dimred methods

* Subset object in preprocess_logCPM_1kHVG()

* Use standard names for input

* Add neuralee_logCPM_1kHVG method

* Add densmap_pca method

* Fix preprocess_logCPM_1kHVG()

Now returns an AnnData rather than acting in place
- Subsetting wasn't working in place

Also set HVG flavor to "cell_ranger"
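
The commit body above changes preprocess_logCPM_1kHVG() to return a new object instead of subsetting in place. The real function selects HVGs with scanpy's "cell_ranger" flavor; the sketch below substitutes a plain variance ranking so it stays dependency-light, and everything else about it is an assumption:

```python
import numpy as np

def log_cpm_hvg(X, n_hvg=1000):
    # Log-CPM normalize, keep the n_hvg most variable genes, and
    # return a new matrix: the input X is never modified in place.
    logged = np.log1p(X / X.sum(axis=1, keepdims=True) * 1e6)
    order = np.argsort(logged.var(axis=0))[::-1]
    keep = np.sort(order[: min(n_hvg, X.shape[1])])
    return logged[:, keep]
```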

* Add test argument to dimred methods

* Move preprocess_logCPM_1kHVG() to tools.normalize

* Change name in python-method-scvis Docker README

* Rename openproblems-python-method-scvis container

Now called open-problems-python36

* Fix AnnData ref in merge

* Copy object when subsetting in preprocess_logCPM_1kHVG()

* Move PCA to dimred methods

* Use preprocess_logCPM_1kHVG() in nn_ranking metrics

* Fix path in python36 dockerfile

* Add test kwarg to neuralee_default method

* Add check for n_var to preprocess_logCPM_1kHVG()

Should fix tests that were failing due to scverse/scanpy#2230

* Store raw counts in NeuralEE default method

* Update dimred README

* Replace X_input with PCA in ivis dimred method

* Refactor preprocess_logCPM_1kHVG() to log_cpm_hvg()

* Re-add ivis

Co-authored-by: Ben DeMeo <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Michal Klein <[email protected]>
Co-authored-by: michalk8 <[email protected]>
Co-authored-by: bendemeo <[email protected]>
Co-authored-by: MalteDLuecken <[email protected]>
Co-authored-by: Wesley Lewis <[email protected]>
Co-authored-by: Daniel Burkhardt <[email protected]>
Co-authored-by: Scott Gigante <[email protected]>

* hotfix timeout-minutes (#374)

* use branch of scprep to provide R traceback (#376)

* Install libgeos-dev

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Scott Gigante <[email protected]>
Co-authored-by: Daniel Strobl <[email protected]>
Co-authored-by: SingleCellOpenProblems <[email protected]>
Co-authored-by: Luke Zappia <[email protected]>
Co-authored-by: Ben DeMeo <[email protected]>
Co-authored-by: Michal Klein <[email protected]>
Co-authored-by: michalk8 <[email protected]>
Co-authored-by: bendemeo <[email protected]>
Co-authored-by: MalteDLuecken <[email protected]>
Co-authored-by: Wesley Lewis <[email protected]>
Co-authored-by: Daniel Burkhardt <[email protected]>
Co-authored-by: Nikolay Markov <[email protected]>
Co-authored-by: adamgayoso <[email protected]>
Co-authored-by: Valentine Svensson <[email protected]>
Co-authored-by: Eduardo Beltrame <[email protected]>
Co-authored-by: atchen <[email protected]>
scottgigante-immunai added a commit that referenced this pull request May 2, 2022
* Label docker images based on build location (#351)

* label docker images

* fix syntax

* Run benchmark only after unittests (#349)

* run benchmark after unittests

* always run cleanup

* cleanup

* If using GH actions image, test for git diff on dockerfile (#350)

* if using gh actions image, test for git diff on dockerfile

* allow empty tag for now

* decode

* if image doesn't exist, automatically github actions

* fix quotes

* fix parsing and committing of results on tag (#356)

* Import SCOT (#333)

* import SCOT

* pre-commit

* scran requires R

* check that aligned spaces are finite

* exclude unbalanced SCOT for now

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Scott Gigante <[email protected]>

* fix coverage badge # ci skip (#358)

* fix gh actions badge link # ci skip (#359)

* store results in /tmp (#361)

* Remove scot unbalanced (#360)

* Fix benchmark commit (#362)

* store results in /tmp

* add skip_on_empty

* class doesn't have skip on empty

* remove scot altogether (#363)

* Allow codecov to fail on forks

* docker images separate PR (#354)

* docker images separate PR

* all R requirements in r_requirements.txt

* move github r packages to requirements file

* pre-commit

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Ignore AWS warning and clean up s3 properly (#366)

* ci cleanup

* ignore aws batch warning

* remove citeseq cbmc from DR (#367)

Co-authored-by: Scott Gigante <[email protected]>

* Update benchmark results # ci skip (#368)

Co-authored-by: SingleCellOpenProblems <[email protected]>

scottgigante-immunai added a commit that referenced this pull request May 2, 2022
* label docker images

* fix syntax

* Delete run_benchmark.yml

* Update from main (#378)

Co-authored-by: MalteDLuecken <[email protected]>
Co-authored-by: Wesley Lewis <[email protected]>
Co-authored-by: Daniel Burkhardt <[email protected]>
Co-authored-by: Scott Gigante <[email protected]>

* hotfix timeout-minutes (#374)

* use branch of scprep to provide R traceback (#376)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Scott Gigante <[email protected]>
Co-authored-by: Daniel Strobl <[email protected]>
Co-authored-by: SingleCellOpenProblems <[email protected]>
Co-authored-by: Luke Zappia <[email protected]>
Co-authored-by: Ben DeMeo <[email protected]>
Co-authored-by: Michal Klein <[email protected]>
Co-authored-by: michalk8 <[email protected]>
Co-authored-by: bendemeo <[email protected]>
Co-authored-by: MalteDLuecken <[email protected]>
Co-authored-by: Wesley Lewis <[email protected]>
Co-authored-by: Daniel Burkhardt <[email protected]>
Co-authored-by: Nikolay Markov <[email protected]>
Co-authored-by: adamgayoso <[email protected]>
Co-authored-by: Valentine Svensson <[email protected]>
Co-authored-by: Eduardo Beltrame <[email protected]>
Co-authored-by: atchen <[email protected]>

* Install libgeos-dev (#377)

* Label docker images based on build location (#351)

* label docker images

* fix syntax

* Run benchmark only after unittests (#349)

* run benchmark after unittests

* always run cleanup

* cleanup

* If using GH actions image, test for git diff on dockerfile (#350)

* if using gh actions image, test for git diff on dockerfile

* allow empty tag for now

* decode

* if image doesn't exist, automatically github actions

* fix quotes

* fix parsing and committing of results on tag (#356)

* Import SCOT (#333)

* import SCOT

* pre-commit

* scran requires R

* check that aligned spaces are finite

* exclude unbalanced SCOT for now

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Scott Gigante <[email protected]>

* fix coverage badge # ci skip (#358)

* fix gh actions badge link # ci skip (#359)

* store results in /tmp (#361)

* Remove scot unbalanced (#360)

* Fix benchmark commit (#362)

* store results in /tmp

* add skip_on_empty

* class doesn't have skip on empty

* remove scot altogether (#363)

* Allow codecov to fail on forks

* docker images separate PR (#354)

* docker images separate PR

* all R requirements in r_requirements.txt

* move github r packages to requirements file

* pre-commit

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Ignore AWS warning and clean up s3 properly (#366)

* ci cleanup

* ignore aws batch warning

* remove citeseq cbmc from DR (#367)

Co-authored-by: Scott Gigante <[email protected]>

* Update benchmark results # ci skip (#368)

Co-authored-by: SingleCellOpenProblems <[email protected]>

* Jamboree dimensionality reduction methods (#318)

* add densMAP package to python-extras

* pre-commit

* Add Ivis method

* Explicitly mention it's a CPU implementation

* Add forgotten import in __init__

* Remove redundant filtering

* Move ivis inside the function

* Make var names unique, add ivis[cpu] to README

* Pin tensorflow version

* Add NeuralEE skeleton

* Implement method

* added densmap and densne

* Fix typo pytoch -> torch

* pre-commit

* remove densne

* Add forgotten detach/cpu/numpy

* formatting

* pre-commit

* formatting

* formatting

* pre-commit

* formatting

* formatting

* formatting

* pre-commit

* formatting

* umap-learn implementation

* pre-commit

* Add docker image

* Add skeleton method

* formatting

* Implement method

* Fix some small bugs

* Add preprocessing

* Change batch size to 1k cells for aff. matrix

* Add new preprocessing

* Add new preprocessing

* Fix preprocessing

* Fix preprocessing

* pre-commit

* updated template for PR with PR evaluation checks (#314)

* Update alra.py (#304)

* Update alra.py

Fix pre-processing and transformation back into the original space

* pre-commit

* Update alra.py


* make sure necessary methods are imported


Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Daniel Burkhardt <[email protected]>

* Add scanpy preprocessing to densmap dimred method

* Rename preprocess_scanpy() to preprocess_logCPM_1kHVG()

* Add preprocessing suffix to dimred methods

* Subset object in preprocess_logCPM_1kHVG()

* Use standard names for input

* Add neuralee_logCPM_1kHVG method

* Add densmap_pca method

* Fix preprocess_logCPM_1kHVG()

Now returns an AnnData rather than acting in place
- Subsetting wasn't working in place

Also set HVG flavor to "cell_ranger"

* Add test argument to dimred methods

* Move preprocess_logCPM_1kHVG() to tools.normalize

* Change name in python-method-scvis Docker README

* Rename openproblems-python-method-scvis container

Now called open-problems-python36

* Fix AnnData ref in merge

* Copy object when subsetting in preprocess_logCPM_1kHVG()

* Move PCA to dimred methods

* Use preprocess_logCPM_1kHVG() in nn_ranking metrics

* Fix path in python36 dockerfile

* Add test kwarg to neuralee_default method

* Add check for n_var to preprocess_logCPM_1kHVG()

Should fix tests that were failing due to scverse/scanpy#2230

* Store raw counts in NeuralEE default method

* Update dimred README

* Replace X_input with PCA in ivis dimred method

* Refactor preprocess_logCPM_1kHVG() to log_cpm_hvg()

* Remove ivis

* pre-commit

Co-authored-by: Ben DeMeo <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Michal Klein <[email protected]>
Co-authored-by: michalk8 <[email protected]>
Co-authored-by: bendemeo <[email protected]>
Co-authored-by: MalteDLuecken <[email protected]>
Co-authored-by: Wesley Lewis <[email protected]>
Co-authored-by: Daniel Burkhardt <[email protected]>
Co-authored-by: Scott Gigante <[email protected]>

* Only cleanup AWS on success (#371)

* only cleanup on success

* pre-commit

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Jamboree label_projection task (#313)

* Add scvi-tools docker image

* add scanvi

* hvg command use 2000

* update scvi-tools version; use image

* train size

* scanvi mask test labels

* move import

* hvg on train only, fix hvg command

* add scarches scanvi

* use string labels in testing

* enforce batch metadata in dataset

* add batch metadata in pancreas random

* use train adata for scarches

* Add majority vote simple baseline

* test_mode

* use test instead of test mode, update contributing

* update contributing guide

* Added helper function to introduce label noise

* Actually return data with label noise

* Only introduce label noise on training data

* Made a pancreas dataset with label noise

* Reformat docstring

* Added reference to example label noise dataset in datasets __init__.py

* Add cengen C elegans data loader (#2)

* add CeNGEN C elegans neuron dataset

* add CeNGEN C elegans dataset for global tasks and for label_projection task

* fix lines being too long

* Reformat cengen data loader

* Create tabula_muris_senis.py

Needs a dataframe containing sample information in './tabula_muris_senis_data_objects/tabula_muris_senis_data_objects.csv'

load_tabula_muris_senis(method_list, organ_list) takes the methods and organs to extract data for and combines them into one AnnData object.
If method_list or organ_list is None, no filtering is done on that input.
EX: load_tabula_muris_senis(method_list=['facs'], organ_list=None) returns all FACS experiments for all organs in one AnnData object.
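A minimal sketch of the filtering behaviour described above. This is not the real loader: `load_tabula_muris_senis()` reads the CSV of data links and returns an AnnData object, while here a hypothetical `filter_tabula_muris_index()` helper and an in-memory pandas DataFrame stand in for that index, purely to illustrate the "None means no filter" semantics.

```python
import pandas as pd


def filter_tabula_muris_index(index_df, method_list=None, organ_list=None):
    """Keep index rows matching the requested methods/organs.

    A None argument means "do not filter on that column", mirroring the
    behaviour described in the commit message above.
    """
    mask = pd.Series(True, index=index_df.index)
    if method_list is not None:
        mask &= index_df["method"].isin(method_list)
    if organ_list is not None:
        mask &= index_df["organ"].isin(organ_list)
    return index_df[mask]


# Toy stand-in for tabula_muris_senis_data_objects.csv
index_df = pd.DataFrame(
    {
        "method": ["facs", "droplet", "facs"],
        "organ": ["lung", "lung", "liver"],
        "url": ["u1", "u2", "u3"],
    }
)

# All FACS experiments, all organs (organ_list=None -> no organ filter)
facs_only = filter_tabula_muris_index(index_df, method_list=["facs"])
print(sorted(facs_only["organ"]))  # ['liver', 'lung']
```

In the real loader each selected row would then be downloaded and the resulting objects concatenated into a single AnnData.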

* pre-commit

* Modify anndata in place in add_label_noise rather than copy

* Added CSV file with tabula muris senis data links

* Update tabula_muris_senis.py

* Add random_labels baseline to label_projection task

* Update tabula_muris_senis.py

* Update tabula_muris_senis.py

* pre-commit

* Update tabula_muris_senis.py

* pre-commit

* fix missing labels at prediction time

* Handle test flag through tests and docker, pass to methods

* If test method run, use 1 max_epoch for scvi-tools

* Use only 2 batches for sample dataset for label_projection

* Remove zebrafish random dataset

* Fix decorator dependency to <5.0.0

* Remove functools.wraps from docker decorator for test parameterization

* Fix cengen missing batch info

* Use functools.update_wrapper for docker test

* Add batch to pancreas_random_label_noise

* Make cengen test dataset have more cells per batch

* Set span=0.8 for hvg call for scanvi_hvg methods

* Set span=0.8 for HVG selection only in test mode for scvi

* Revert "Handle test flag through tests and docker, pass to methods"

This reverts commit 3b940c0.

* Add test parameter to label proj baselines

* Fix flake remove unused import

* Revert "Remove zebrafish random dataset"

This reverts commit 3915798.

* Update scVI setup_anndata to new version

* pre-commit

* Reformat and rerun tests

* Add code_url and code_version for baseline label proj methods

* Fallback HVG flavor for label projection task

* pre-commit

* Fix unused import

* Fix using highly_variable_genes

* Pin scvi-tools to 0.15.5

* Unpin scvi-tools, pin jax==0.3.6, see optuna/optuna-examples#99

* Add scikit-misc as requirement for scvi docker

* Pin jaxlib as well

* pin jaxlib along with jax

* Set paper_year to year of implementation

* Set random zebrafish split to 0.8+0.2

* Add tabula_muris_senis_lung_random dataset to label_projection

* pre-commit

* Add tabula muris senis datasets csv

* Fix loading tabula muris csv

* pre-commit

* Test loader for tabula muris senis

Co-authored-by: adamgayoso <[email protected]>
Co-authored-by: Valentine Svensson <[email protected]>
Co-authored-by: Eduardo Beltrame <[email protected]>
Co-authored-by: atchen <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Scott Gigante <[email protected]>

* Run `test_benchmark` on a self-hosted runner (#373)

* set up cirun

* use ubuntu standard AMI

* run nextflow on the self-hosted machine

* add to CONTRIBUTING

* update ami

* install unzip

* set up docker

* install docker from curl

* use t2.micro not nano

* use custom AMI

* pythonLocation

* add scripts to path

* larger disk size

* new image again

* chown for now

* chmod 755

* fixed permissions

* use tower workspace

* test nextflow

* try again

* nextflow -q

* redirect stderr

* increase memory

* cleanup

* sudo install

* name

* try setting pythonpath

* fix branch env

* another fix

* fix run name

* typo

* fix pythonpath:

* don't use pushd

* pass pythonpath

* set nousersite

* empty

* sudo install

* run attempt

* revert temporary changes

* cleanup

* fix contributing

* add instructions for tower

* fix repo name

* move ami setup into script

* Import Olsson 2016 dataset for dimred task (#352)

* Import Olsson 2016 dataset for dimred task

* Fix path to Olsson dataset loader

* Filter genes and cells before subsetting Olsson data in test

* Use highly expressed genes for test Olsson dataset

Test dataset is now 700 genes by 300 cells (was 500 x 500)

* Add ivis dimred method (#369)

* add densMAP package to python-extras

* pre-commit

* Add Ivis method

* Explicitly mention it's a CPU implementation

* Add forgotten import in __init__

* Remove redundant filtering

* Move ivis inside the function

* Make var names unique, add ivis[cpu] to README

* Pin tensorflow version

* Add NeuralEE skeleton

* Implement method

* added densmap and densne

* Fix typo pytoch -> torch

* pre-commit

* remove densne

* Add forgotten detach/cpu/numpy

* formatting

* pre-commit

* formatting

* formatting

* pre-commit

* formatting

* formatting

* formatting

* pre-commit

* formatting

* umap-learn implementation

* pre-commit

* Add docker image

* Add skeleton method

* formatting

* Implement method

* Fix some small bugs

* Add preprocessing

* Change batch size to 1k cells for aff. matrix

* Add new preprocessing

* Add new preprocessing

* Fix preprocessing

* Fix preprocessing

* pre-commit

* updated template for PR with PR evaluation checks (#314)

* Update alra.py (#304)

* Update alra.py

Fix pre-processing and transformation back into the original space

* pre-commit

* Update alra.py


* make sure necessary methods are imported


Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Daniel Burkhardt <[email protected]>

* Add scanpy preprocessing to densmap dimred method

* Rename preprocess_scanpy() to preprocess_logCPM_1kHVG()

* Add preprocessing suffix to dimred methods

* Subset object in preprocess_logCPM_1kHVG()

* Use standard names for input

* Add neuralee_logCPM_1kHVG method

* Add densmap_pca method

* Fix preprocess_logCPM_1kHVG()

Now returns an AnnData rather than acting in place
- Subsetting wasn't working in place

Also set HVG flavor to "cell_ranger"

* Add test argument to dimred methods

* Move preprocess_logCPM_1kHVG() to tools.normalize

* Change name in python-method-scvis Docker README

* Rename openproblems-python-method-scvis container

Now called open-problems-python36

* Fix AnnData ref in merge

* Copy object when subsetting in preprocess_logCPM_1kHVG()

* Move PCA to dimred methods

* Use preprocess_logCPM_1kHVG() in nn_ranking metrics

* Fix path in python36 dockerfile

* Add test kwarg to neuralee_default method

* Add check for n_var to preprocess_logCPM_1kHVG()

Should fix tests that were failing due to scverse/scanpy#2230

* Store raw counts in NeuralEE default method

* Update dimred README

* Replace X_input with PCA in ivis dimred method

* Refactor preprocess_logCPM_1kHVG() to log_cpm_hvg()

* Re-add ivis

Co-authored-by: Ben DeMeo <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Michal Klein <[email protected]>
Co-authored-by: michalk8 <[email protected]>
Co-authored-by: bendemeo <[email protected]>
Co-authored-by: MalteDLuecken <[email protected]>
Co-authored-by: Wesley Lewis <[email protected]>
Co-authored-by: Daniel Burkhardt <[email protected]>
Co-authored-by: Scott Gigante <[email protected]>

* hotfix timeout-minutes (#374)

* use branch of scprep to provide R traceback (#376)

* Install libgeos-dev

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Scott Gigante <[email protected]>
Co-authored-by: Daniel Strobl <[email protected]>
Co-authored-by: SingleCellOpenProblems <[email protected]>
Co-authored-by: Luke Zappia <[email protected]>
Co-authored-by: Ben DeMeo <[email protected]>
Co-authored-by: Michal Klein <[email protected]>
Co-authored-by: michalk8 <[email protected]>
Co-authored-by: bendemeo <[email protected]>
Co-authored-by: MalteDLuecken <[email protected]>
Co-authored-by: Wesley Lewis <[email protected]>
Co-authored-by: Daniel Burkhardt <[email protected]>
Co-authored-by: Nikolay Markov <[email protected]>
Co-authored-by: adamgayoso <[email protected]>
Co-authored-by: Valentine Svensson <[email protected]>
Co-authored-by: Eduardo Beltrame <[email protected]>
Co-authored-by: atchen <[email protected]>

* Install libgeos-dev

* Update test_docker (#379)

* Label docker images based on build location (#351)

* label docker images

* fix syntax

* Run benchmark only after unittests (#349)

* run benchmark after unittests

* always run cleanup

* cleanup

* If using GH actions image, test for git diff on dockerfile (#350)

* if using gh actions image, test for git diff on dockerfile

* allow empty tag for now

* decode

* if image doesn't exist, automatically github actions

* fix quotes

* fix parsing and committing of results on tag (#356)

* Import SCOT (#333)

* import SCOT

* pre-commit

* scran requires R

* check that aligned spaces are finite

* exclude unbalanced SCOT for now

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Scott Gigante <[email protected]>

* fix coverage badge # ci skip (#358)

* fix gh actions badge link # ci skip (#359)

* store results in /tmp (#361)

* Remove scot unbalanced (#360)

* Fix benchmark commit (#362)

* store results in /tmp

* add skip_on_empty

* class doesn't have skip on empty

* remove scot altogether (#363)

* Allow codecov to fail on forks

* docker images separate PR (#354)

* docker images separate PR

* all R requirements in r_requirements.txt

* move github r packages to requirements file

* pre-commit

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Ignore AWS warning and clean up s3 properly (#366)

* ci cleanup

* ignore aws batch warning

* remove citeseq cbmc from DR (#367)

Co-authored-by: Scott Gigante <[email protected]>

* Update benchmark results # ci skip (#368)

Co-authored-by: SingleCellOpenProblems <[email protected]>

* Jamboree dimensionality reduction methods (#318)

* add densMAP package to python-extras

* pre-commit

* Add Ivis method

* Explicitly mention it's a CPU implementation

* Add forgotten import in __init__

* Remove redundant filtering

* Move ivis inside the function

* Make var names unique, add ivis[cpu] to README

* Pin tensorflow version

* Add NeuralEE skeleton

* Implement method

* added densmap and densne

* Fix typo pytoch -> torch

* pre-commit

* remove densne

* Add forgotten detach/cpu/numpy

* formatting

* pre-commit

* formatting

* formatting

* pre-commit

* formatting

* formatting

* formatting

* pre-commit

* formatting

* umap-learn implementation

* pre-commit

* Add docker image

* Add skeleton method

* formatting

* Implement method

* Fix some small bugs

* Add preprocessing

* Change batch size to 1k cells for aff. matrix

* Add new preprocessing

* Add new preprocessing

* Fix preprocessing

* Fix preprocessing

* pre-commit

* updated template for PR with PR evaluation checks (#314)

* Update alra.py (#304)

* Update alra.py

Fix pre-processing and transformation back into the original space

* pre-commit

* Update alra.py


* make sure necessary methods are imported


Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Daniel Burkhardt <[email protected]>

* Add scanpy preprocessing to densmap dimred method

* Rename preprocess_scanpy() to preprocess_logCPM_1kHVG()

* Add preprocessing suffix to dimred methods

* Subset object in preprocess_logCPM_1kHVG()

* Use standard names for input

* Add neuralee_logCPM_1kHVG method

* Add densmap_pca method

* Fix preprocess_logCPM_1kHVG()

Now returns an AnnData rather than acting in place
- Subsetting wasn't working in place

Also set HVG flavor to "cell_ranger"

* Add test argument to dimred methods

* Move preprocess_logCPM_1kHVG() to tools.normalize

* Change name in python-method-scvis Docker README

* Rename openproblems-python-method-scvis container

Now called open-problems-python36

* Fix AnnData ref in merge

* Copy object when subsetting in preprocess_logCPM_1kHVG()

* Move PCA to dimred methods

* Use preprocess_logCPM_1kHVG() in nn_ranking metrics

* Fix path in python36 dockerfile

* Add test kwarg to neuralee_default method

* Add check for n_var to preprocess_logCPM_1kHVG()

Should fix tests that were failing due to scverse/scanpy#2230

* Store raw counts in NeuralEE default method

* Update dimred README

* Replace X_input with PCA in ivis dimred method

* Refactor preprocess_logCPM_1kHVG() to log_cpm_hvg()

* Remove ivis

* pre-commit

Co-authored-by: Ben DeMeo <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Michal Klein <[email protected]>
Co-authored-by: michalk8 <[email protected]>
Co-authored-by: bendemeo <[email protected]>
Co-authored-by: MalteDLuecken <[email protected]>
Co-authored-by: Wesley Lewis <[email protected]>
Co-authored-by: Daniel Burkhardt <[email protected]>
Co-authored-by: Scott Gigante <[email protected]>

* Only cleanup AWS on success (#371)

* only cleanup on success

* pre-commit

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Jamboree label_projection task (#313)

* Add scvi-tools docker image

* add scanvi

* hvg command use 2000

* update scvi-tools version; use image

* train size

* scanvi mask test labels

* move import

* hvg on train only, fix hvg command

* add scarches scanvi

* use string labels in testing

* enforce batch metadata in dataset

* add batch metadata in pancreas random

* use train adata for scarches

* Add majority vote simple baseline

* test_mode

* use test instead of test mode, update contributing

* update contributing guide

* Added helper function to introduce label noise

* Actually return data with label noise

* Only introduce label noise on training data

* Made a pancreas dataset with label noise

* Reformat docstring

* Added reference to example label noise dataset in datasets __init__.py

* Add cengen C elegans data loader (#2)

* add CeNGEN C elegans neuron dataset

* add CeNGEN C elegans dataset for global tasks and for label_projection task

* fix lines being too long

* Reformat cengen data loader

* Create tabula_muris_senis.py

Needs a dataframe containing sample information in './tabula_muris_senis_data_objects/tabula_muris_senis_data_objects.csv'

load_tabula_muris_senis(method_list, organ_list) takes the methods and organs to extract data for and combines them into one AnnData object.
If method_list or organ_list is None, no filtering is done on that input.
EX: load_tabula_muris_senis(method_list=['facs'], organ_list=None) returns all FACS experiments for all organs in one AnnData object.

* pre-commit

* Modify anndata in place in add_label_noise rather than copy

* Added CSV file with tabula muris senis data links

* Update tabula_muris_senis.py

* Add random_labels baseline to label_projection task

* Update tabula_muris_senis.py

* Update tabula_muris_senis.py

* pre-commit

* Update tabula_muris_senis.py

* pre-commit

* fix missing labels at prediction time

* Handle test flag through tests and docker, pass to methods

* If test method run, use 1 max_epoch for scvi-tools

* Use only 2 batches for sample dataset for label_projection

* Remove zebrafish random dataset

* Fix decorator dependency to <5.0.0

* Remove functools.wraps from docker decorator for test parameterization

* Fix cengen missing batch info

* Use functools.update_wrapper for docker test

* Add batch to pancreas_random_label_noise

* Make cengen test dataset have more cells per batch

* Set span=0.8 for hvg call for scanvi_hvg methods

* Set span=0.8 for HVG selection only in test mode for scvi

* Revert "Handle test flag through tests and docker, pass to methods"

This reverts commit 3b940c0.

* Add test parameter to label proj baselines

* Fix flake remove unused import

* Revert "Remove zebrafish random dataset"

This reverts commit 3915798.

* Update scVI setup_anndata to new version

* pre-commit

* Reformat and rerun tests

* Add code_url and code_version for baseline label proj methods

* Fallback HVG flavor for label projection task

* pre-commit

* Fix unused import

* Fix using highly_variable_genes

* Pin scvi-tools to 0.15.5

* Unpin scvi-tools, pin jax==0.3.6, see optuna/optuna-examples#99

* Add scikit-misc as requirement for scvi docker

* Pin jaxlib as well

* pin jaxlib along with jax

* Set paper_year to year of implementation

* Set random zebrafish split to 0.8+0.2

* Add tabula_muris_senis_lung_random dataset to label_projection

* pre-commit

* Add tabula muris senis datasets csv

* Fix loading tabula muris csv

* pre-commit

* Test loader for tabula muris senis

Co-authored-by: adamgayoso <[email protected]>
Co-authored-by: Valentine Svensson <[email protected]>
Co-authored-by: Eduardo Beltrame <[email protected]>
Co-authored-by: atchen <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Scott Gigante <[email protected]>

* Run `test_benchmark` on a self-hosted runner (#373)

* set up cirun

* use ubuntu standard AMI

* run nextflow on the self-hosted machine

* add to CONTRIBUTING

* update ami

* install unzip

* set up docker

* install docker from curl

* use t2.micro not nano

* use custom AMI

* pythonLocation

* add scripts to path

* larger disk size

* new image again

* chown for now

* chmod 755

* fixed permissions

* use tower workspace

* test nextflow

* try again

* nextflow -q

* redirect stderr

* increase memory

* cleanup

* sudo install

* name

* try setting pythonpath

* fix branch env

* another fix

* fix run name

* typo

* fix pythonpath:

* don't use pushd

* pass pythonpath

* set nousersite

* empty

* sudo install

* run attempt

* revert temporary changes

* cleanup

* fix contributing

* add instructions for tower

* fix repo name

* move ami setup into script

* Import Olsson 2016 dataset for dimred task (#352)

* Import Olsson 2016 dataset for dimred task

* Fix path to Olsson dataset loader

* Filter genes and cells before subsetting Olsson data in test

* Use highly expressed genes for test Olsson dataset

Test dataset is now 700 genes by 300 cells (was 500 x 500)

* Add ivis dimred method (#369)

* add densMAP package to python-extras

* pre-commit

* Add Ivis method

* Explicitly mention it's a CPU implementation

* Add forgotten import in __init__

* Remove redundant filtering

* Move ivis inside the function

* Make var names unique, add ivis[cpu] to README

* Pin tensorflow version

* Add NeuralEE skeleton

* Implement method

* added densmap and densne

* Fix typo pytoch -> torch

* pre-commit

* remove densne

* Add forgotten detach/cpu/numpy

* formatting

* pre-commit

* formatting

* formatting

* pre-commit

* formatting

* formatting

* formatting

* pre-commit

* formatting

* umap-learn implementation

* pre-commit

* Add docker image

* Add skeleton method

* formatting

* Implement method

* Fix some small bugs

* Add preprocessing

* Change batch size to 1k cells for aff. matrix

* Add new preprocessing

* Add new preprocessing

* Fix preprocessing

* Fix preprocessing

* pre-commit

* updated template for PR with PR evaluation checks (#314)

* Update alra.py (#304)

* Update alra.py

Fix pre-processing and transformation back into the original space

* pre-commit

* Update alra.py


* make sure necessary methods are imported


Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Daniel Burkhardt <[email protected]>

* Add scanpy preprocessing to densmap dimred method

* Rename preprocess_scanpy() to preprocess_logCPM_1kHVG()

* Add preprocessing suffix to dimred methods

* Subset object in preprocess_logCPM_1kHVG()

* Use standard names for input

* Add neuralee_logCPM_1kHVG method

* Add densmap_pca method

* Fix preprocess_logCPM_1kHVG()

Now returns an AnnData rather than acting in place
- Subsetting wasn't working in place

Also set HVG flavor to "cell_ranger"

* Add test argument to dimred methods

* Move preprocess_logCPM_1kHVG() to tools.normalize

* Change name in python-method-scvis Docker README

* Rename openproblems-python-method-scvis container

Now called open-problems-python36

* Fix AnnData ref in merge

* Copy object when subsetting in preprocess_logCPM_1kHVG()

* Move PCA to dimred methods

* Use preprocess_logCPM_1kHVG() in nn_ranking metrics

* Fix path in python36 dockerfile

* Add test kwarg to neuralee_default method

* Add check for n_var to preprocess_logCPM_1kHVG()

Should fix tests that were failing due to scverse/scanpy#2230

* Store raw counts in NeuralEE default method

* Update dimred README

* Replace X_input with PCA in ivis dimred method

* Refactor preprocess_logCPM_1kHVG() to log_cpm_hvg()

* Re-add ivis

Co-authored-by: Ben DeMeo <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Michal Klein <[email protected]>
Co-authored-by: michalk8 <[email protected]>
Co-authored-by: bendemeo <[email protected]>
Co-authored-by: MalteDLuecken <[email protected]>
Co-authored-by: Wesley Lewis <[email protected]>
Co-authored-by: Daniel Burkhardt <[email protected]>
Co-authored-by: Scott Gigante <[email protected]>

* hotfix timeout-minutes (#374)

* use branch of scprep to provide R traceback (#376)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Scott Gigante <[email protected]>
Co-authored-by: Daniel Strobl <[email protected]>
Co-authored-by: SingleCellOpenProblems <[email protected]>
Co-authored-by: Luke Zappia <[email protected]>
Co-authored-by: Ben DeMeo <[email protected]>
Co-authored-by: Michal Klein <[email protected]>
Co-authored-by: michalk8 <[email protected]>
Co-authored-by: bendemeo <[email protected]>
Co-authored-by: MalteDLuecken <[email protected]>
Co-authored-by: Wesley Lewis <[email protected]>
Co-authored-by: Daniel Burkhardt <[email protected]>
Co-authored-by: Nikolay Markov <[email protected]>
Co-authored-by: adamgayoso <[email protected]>
Co-authored-by: Valentine Svensson <[email protected]>
Co-authored-by: Eduardo Beltrame <[email protected]>
Co-authored-by: atchen <[email protected]>

* clean up dockerfile

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Scott Gigante <[email protected]>
Co-authored-by: Daniel Strobl <[email protected]>
Co-authored-by: SingleCellOpenProblems <[email protected]>
Co-authored-by: Luke Zappia <[email protected]>
Co-authored-by: Ben DeMeo <[email protected]>
Co-authored-by: Michal Klein <[email protected]>
Co-authored-by: michalk8 <[email protected]>
Co-authored-by: bendemeo <[email protected]>
Co-authored-by: MalteDLuecken <[email protected]>
Co-authored-by: Wesley Lewis <[email protected]>
Co-authored-by: Daniel Burkhardt <[email protected]>
Co-authored-by: Nikolay Markov <[email protected]>
Co-authored-by: adamgayoso <[email protected]>
Co-authored-by: Valentine Svensson <[email protected]>
Co-authored-by: Eduardo Beltrame <[email protected]>
Co-authored-by: atchen <[email protected]>
scottgigante-immunai added a commit that referenced this pull request May 10, 2022
* Fix rgeos install (#380)

* label docker images

* fix syntax

* Delete run_benchmark.yml

* Update from main (#378)

* Label docker images based on build location (#351)

* label docker images

* fix syntax

* Run benchmark only after unittests (#349)

* run benchmark after unittests

* always run cleanup

* cleanup

* If using GH actions image, test for git diff on dockerfile (#350)

* if using gh actions image, test for git diff on dockerfile

* allow empty tag for now

* decode

* if image doesn't exist, automatically github actions

* fix quotes

* fix parsing and committing of results on tag (#356)

* Import SCOT (#333)

* import SCOT

* pre-commit

* scran requires R

* check that aligned spaces are finite

* exclude unbalanced SCOT for now

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Scott Gigante <[email protected]>

* fix coverage badge # ci skip (#358)

* fix gh actions badge link # ci skip (#359)

* store results in /tmp (#361)

* Remove scot unbalanced (#360)

* Fix benchmark commit (#362)

* store results in /tmp

* add skip_on_empty

* class doesn't have skip on empty

* remove scot altogether (#363)

* Allow codecov to fail on forks

* docker images separate PR (#354)

* docker images separate PR

* all R requirements in r_requirements.txt

* move github r packages to requirements file

* pre-commit

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Ignore AWS warning and clean up s3 properly (#366)

* ci cleanup

* ignore aws batch warning

* remove citeseq cbmc from DR (#367)

Co-authored-by: Scott Gigante <[email protected]>

* Update benchmark results # ci skip (#368)

Co-authored-by: SingleCellOpenProblems <[email protected]>

* Jamboree dimensionality reduction methods (#318)

* add densMAP package to python-extras

* pre-commit

* Add Ivis method

* Explicitly mention it's CPU implementation

* Add forgotten import in __init__

* Remove redundant filtering

* Move ivis inside the function

* Make var names unique, add ivis[cpu] to README

* Pin tensorflow version

* Add NeuralEE skeleton

* Implement method

* added densmap and densne

* Fix typo pytoch -> torch

* pre-commit

* remove densne

* Add forgotten detach/cpu/numpy

* formatting

* pre-commit

* formatting

* formatting

* pre-commit

* formatting

* formatting

* formatting

* pre-commit

* formatting

* umap-learn implementation

* pre-commit

* Add docker image

* Add skeleton method

* formatting

* Implement method

* Fix some small bugs

* Add preprocessing

* Change batch size to 1k cells for aff. matrix

* Add new preprocessing

* Add new preprocessing

* Fix preprocessing

* Fix preprocessing

* pre-commit

* updated template for PR with PR evaluation checks (#314)

* Update alra.py (#304)

* Update alra.py

Fix pre-processing and transformation back into the original space

* pre-commit

* Update alra.py


* make sure necessary methods are imported


Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Daniel Burkhardt <[email protected]>

* Add scanpy preprocessing to densmap dimred method

* Rename preprocess_scanpy() to preprocess_logCPM_1kHVG()

* Add preprocessing suffix to dimred methods

* Subset object in preprocess_logCPM_1kHVG()

* Use standard names for input

* Add neuralee_logCPM_1kHVG method

* Add densmap_pca method

* Fix preprocess_logCPM_1kHVG()

Now returns an AnnData rather than acting in place
- Subsetting wasn't working in place

Also set HVG flavor to "cell_ranger"

* Add test argument to dimred methods

* Move preprocess_logCPM_1kHVG() to tools.normalize

* Change name in python-method-scvis Docker README

* Rename openproblems-python-method-scvis container

Now called open-problems-python36

* Fix AnnData ref in merge

* Copy object when subsetting in preprocess_logCPM_1kHVG()

* Move PCA to dimred methods

* Use preprocess_logCPM_1kHVG() in nn_ranking metrics

* Fix path in python36 dockerfile

* Add test kwarg to neuralee_default method

* Add check for n_var to preprocess_logCPM_1kHVG()

Should fix tests that were failing due to scverse/scanpy#2230

* Store raw counts in NeuralEE default method

* Update dimred README

* Replace X_input with PCA in ivis dimred method

* Refactor preprocess_logCPM_1kHVG() to log_cpm_hvg()

* Remove ivis

* pre-commit

Co-authored-by: Ben DeMeo <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Michal Klein <[email protected]>
Co-authored-by: michalk8 <[email protected]>
Co-authored-by: bendemeo <[email protected]>
Co-authored-by: MalteDLuecken <[email protected]>
Co-authored-by: Wesley Lewis <[email protected]>
Co-authored-by: Daniel Burkhardt <[email protected]>
Co-authored-by: Scott Gigante <[email protected]>

* Only cleanup AWS on success (#371)

* only cleanup on success

* pre-commit

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Jamboree label_projection task (#313)

* Add scvi-tools docker image

* add scanvi

* hvg command use 2000

* update scvi-tools version; use image

* train size

* scanvi mask test labels

* move import

* hvg on train only, fix hvg command

* add scarches scanvi

* use string labels in testing

* enforce batch metadata in dataset

* add batch metadata in pancreas random

* use train adata for scarches

* Add majority vote simple baseline

* test_mode

* use test instead of test mode, update contributing

* update contributing guide

* Added helper function to introduce label noise

* Actually return data with label noise

* Only introduce label noise on training data

* Made a pancreas dataset with label noise

* Reformat docstring

* Added reference to example label noise dataset in datasets __init__.py

* Add cengen C elegans data loader (#2)

* add CeNGEN C elegans neuron dataset

* add CeNGEN C elegans dataset for global tasks and for label_projection task

* fix lines being too long

* Reformat cengen data loader

* Create tabula_muris_senis.py

Needs a dataframe containing sample information in './tabula_muris_senis_data_objects/tabula_muris_senis_data_objects.csv'.

load_tabula_muris_senis(method_list, organ_list) takes in methods and organs to extract data from and combines them into one AnnData object.
If method_list or organ_list is None, no filtering is applied on that input.
EX: load_tabula_muris_senis(method_list=['facs'], organ_list=None) returns all FACS experiments for all organs in one AnnData object.
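The None-means-no-filter selection described above can be sketched as follows. This is a standalone illustration, not the actual loader: the real function reads a CSV of download links and returns an AnnData, whereas here `filter_objects` and the example `records` are hypothetical stand-ins that only demonstrate the filtering logic.

```python
def filter_objects(records, method_list=None, organ_list=None):
    """Filter sample-metadata records; None means 'do not filter on that field'."""
    if method_list is not None:
        records = [r for r in records if r["method"] in method_list]
    if organ_list is not None:
        records = [r for r in records if r["organ"] in organ_list]
    return records

# Hypothetical rows standing in for the tabula_muris_senis_data_objects.csv metadata.
records = [
    {"method": "facs", "organ": "lung"},
    {"method": "droplet", "organ": "lung"},
    {"method": "facs", "organ": "liver"},
]

print(len(filter_objects(records, method_list=["facs"])))  # prints 2: all FACS rows, any organ
```

In the real loader, each surviving record's download link would then be fetched and the resulting objects concatenated into a single AnnData.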

* pre-commit

* Modify anndata in place in add_label_noise rather than copy

* Added CSV file with tabula muris senis data links

* Update tabula_muris_senis.py

* Add random_labels baseline to label_projection task

* Update tabula_muris_senis.py

* Update tabula_muris_senis.py

* pre-commit

* Update tabula_muris_senis.py

* pre-commit

* fix missing labels at prediction time

* Handle test flag through tests and docker, pass to methods

* If test method run, use 1 max_epoch for scvi-tools

* Use only 2 batches for sample dataset for label_projection

* Remove zebrafish random dataset

* Fix decorator dependency to <5.0.0

* Remove functools.wraps from docker decorator for test parameterization

* Fix cengen missing batch info

* Use functools.update_wrapper for docker test

* Add batch to pancreas_random_label_noise

* Make cengen test dataset have more cells per batch

* Set span=0.8 for hvg call for scanvi_hvg methods

* Set span=0.8 for HVG selection only in test mode for scvi

* Revert "Handle test flag through tests and docker, pass to methods"

This reverts commit 3b940c0.

* Add test parameter to label proj baselines

* Fix flake remove unused import

* Revert "Remove zebrafish random dataset"

This reverts commit 3915798.

* Update scVI setup_anndata to new version

* pre-commit

* Reformat and rerun tests

* Add code_url and code_version for baseline label proj methods

* Fallback HVG flavor for label projection task

* pre-commit

* Fix unused import

* Fix using highly_variable_genes

* Pin scvi-tools to 0.15.5

* Unpin scvi-tools, pin jax==0.3.6, see optuna/optuna-examples#99

* Add scikit-misc as requirement for scvi docker

* Pin jaxlib as well

* pin jaxlib along with jax

* Set paper_year to year of implementation

* Set random zebrafish split to 0.8+0.2

* Add tabula_muris_senis_lung_random dataset to label_projection

* pre-commit

* Add tabula muris senis datasets csv

* Fix loading tabula muris csv

* pre-commit

* Test loader for tabula muris senis

Co-authored-by: adamgayoso <[email protected]>
Co-authored-by: Valentine Svensson <[email protected]>
Co-authored-by: Eduardo Beltrame <[email protected]>
Co-authored-by: atchen <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Scott Gigante <[email protected]>

* Run `test_benchmark` on a self-hosted runner (#373)

* set up cirun

* use ubuntu standard AMI

* run nextflow on the self-hosted machine

* add to CONTRIBUTING

* update ami

* install unzip

* set up docker

* install docker from curl

* use t2.micro not nano

* use custom AMI

* pythonLocation

* add scripts to path

* larger disk size

* new image again

* chown for now

* chmod 755

* fixed permissions

* use tower workspace

* test nextflow

* try again

* nextflow -q

* redirect stderr

* increase memory

* cleanup

* sudo install

* name

* try setting pythonpath

* fix branch env

* another fix

* fix run name

* typo

* fix pythonpath:

* don't use pushd

* pass pythonpath

* set nousersite

* empty

* sudo install

* run attempt

* revert temporary changes

* cleanup

* fix contributing

* add instructions for tower

* fix repo name

* move ami setup into script

* Import Olsson 2016 dataset for dimred task (#352)

* Import Olsson 2016 dataset for dimred task

* Fix path to Olsson dataset loader

* Filter genes cells before subsetting Olsson data in test

* Use highly expressed genes for test Olsson dataset

Test dataset is now 700 genes by 300 cells (was 500 x 500)

* Add ivis dimred method (#369)

* Re-add ivis

Co-authored-by: Ben DeMeo <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Michal Klein <[email protected]>
Co-authored-by: michalk8 <[email protected]>
Co-authored-by: bendemeo <[email protected]>
Co-authored-by: MalteDLuecken <[email protected]>
Co-authored-by: Wesley Lewis <[email protected]>
Co-authored-by: Daniel Burkhardt <[email protected]>
Co-authored-by: Scott Gigante <[email protected]>

* hotfix timeout-minutes (#374)

* use branch of scprep to provide R traceback (#376)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Scott Gigante <[email protected]>
Co-authored-by: Daniel Strobl <[email protected]>
Co-authored-by: SingleCellOpenProblems <[email protected]>
Co-authored-by: Luke Zappia <[email protected]>
Co-authored-by: Ben DeMeo <[email protected]>
Co-authored-by: Michal Klein <[email protected]>
Co-authored-by: michalk8 <[email protected]>
Co-authored-by: bendemeo <[email protected]>
Co-authored-by: MalteDLuecken <[email protected]>
Co-authored-by: Wesley Lewis <[email protected]>
Co-authored-by: Daniel Burkhardt <[email protected]>
Co-authored-by: Nikolay Markov <[email protected]>
Co-authored-by: adamgayoso <[email protected]>
Co-authored-by: Valentine Svensson <[email protected]>
Co-authored-by: Eduardo Beltrame <[email protected]>
Co-authored-by: atchen <[email protected]>

* Install libgeos-dev (#377)

* Install libgeos-dev

* Update test_docker (#379)


* add scarches scanvi

* use string labels in testing

* enforce batch metadata in dataset

* add batch metadata in pancreas random

* use train adata for scarches

* Add majority vote simple baseline

* test_mode

* use test instead of test mode, update contributing

* update contributing guide

* Added helper function to introduce label noise

* Actually return data with label noise

* Only introduce label noise on training data
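The label-noise commits above describe a helper that corrupts a fraction of the training labels. A minimal sketch of what such a helper might look like (the function name, signature, and seeding behavior are assumptions, not the PR's actual implementation):

```python
import numpy as np

def add_label_noise(labels, noise_rate, seed=0):
    """Randomly reassign a fraction `noise_rate` of labels to a different class.

    Hypothetical helper mirroring the commit descriptions: noise is applied
    uniformly at random, and a flipped label is always changed to some
    *other* class so the noise rate is the true corruption rate.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels).copy()
    classes = np.unique(labels)
    # Decide independently for each cell whether its label is corrupted
    flip = rng.random(labels.size) < noise_rate
    for i in np.flatnonzero(flip):
        choices = classes[classes != labels[i]]
        labels[i] = rng.choice(choices)
    return labels
```

Per the "Only introduce label noise on training data" commit, such a helper would be applied to the training split only, leaving test labels untouched for evaluation.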

* Made a pancreas dataset with label noise

* Reformat docstring

* Added reference to example label noise dataset in datasets __init__.py

* Add cengen C elegans data loader (#2)

* add CeNGEN C elegans neuron dataset

* add CeNGEN C elegans dataset for global tasks and for label_projection task

* fix lines being too long

* Reformat cengen data loader

* Create tabula_muris_senis.py

Needs a dataframe containing sample information in './tabula_muris_senis_data_objects/tabula_muris_senis_data_objects.csv'.

load_tabula_muris_senis(method_list, organ_list) takes the methods and organs to extract data for and combines them into one AnnData object.
If method_list or organ_list is None, no filtering is applied for that input.
EX: load_tabula_muris_senis(method_list=['facs'], organ_list=None) returns all FACS experiments for all organs in one AnnData object.
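The filtering semantics described in this commit message (None meaning "keep everything" for that field) can be sketched with a small standalone helper; the record keys below (`method`, `organ`) are hypothetical illustrations of the CSV's columns, not confirmed by the PR:

```python
def filter_objects(records, method_list=None, organ_list=None):
    """Select data-object records by method and organ.

    Mirrors the loader's described behavior: a None filter means
    no filtering is applied for that field.
    """
    def keep(rec):
        return (
            (method_list is None or rec["method"] in method_list)
            and (organ_list is None or rec["organ"] in organ_list)
        )
    return [rec for rec in records if keep(rec)]
```

A loader like load_tabula_muris_senis would then, presumably, download and concatenate the AnnData objects for the surviving records; for example, `filter_objects(records, method_list=['facs'])` keeps only the FACS entries across all organs.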

* pre-commit

* Modify anndata in place in add_label_noise rather than copy

* Added CSV file with tabula muris senis data links

* Update tabula_muris_senis.py

* Add random_labels baseline to label_projection task

* Update tabula_muris_senis.py

* Update tabula_muris_senis.py

* pre-commit

* Update tabula_muris_senis.py

* pre-commit

* fix missing labels at prediction time

* Handle test flag through tests and docker, pass to methods

* If test method run, use 1 max_epoch for scvi-tools

* Use only 2 batches for sample dataset for label_projection

* Remove zebrafish random dataset

* Fix decorator dependency to <5.0.0

* Remove functools.wraps from docker decorator for test parameterization

* Fix cengen missing batch info

* Use functools.update_wrapper for docker test

* Add batch to pancreas_random_label_noise

* Make cengen test dataset have more cells per batch

* Set span=0.8 for hvg call for scanvi_hvg methods

* Set span=0.8 for HVG selection only in test mode for scvi

* Revert "Handle test flag through tests and docker, pass to methods"

This reverts commit 3b940c0.

* Add test parameter to label proj baselines

* Fix flake remove unused import

* Revert "Remove zebrafish random dataset"

This reverts commit 3915798.

* Update scVI setup_anndata to new version

* pre-commit

* Reformat and rerun tests

* Add code_url and code_version for baseline label proj methods

* Fallback HVG flavor for label projection task

* pre-commit

* Fix unused import

* Fix using highly_variable_genes

* Pin scvi-tools to 0.15.5

* Unpin scvi-tools, pin jax==0.3.6, see optuna/optuna-examples#99

* Add scikit-misc as requirement for scvi docker

* Pin jaxlib as well

* pin jaxlib along with jax

* Set paper_year to year of implementation

* Set random zebrafish split to 0.8+0.2

* Add tabula_muris_senis_lung_random dataset to label_projection

* pre-commit

* Add tabula muris senis datasets csv

* Fix loading tabula muris csv

* pre-commit

* Test loader for tabula muris senis

Co-authored-by: adamgayoso <[email protected]>
Co-authored-by: Valentine Svensson <[email protected]>
Co-authored-by: Eduardo Beltrame <[email protected]>
Co-authored-by: atchen <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Scott Gigante <[email protected]>

* Run `test_benchmark` on a self-hosted runner (#373)

* set up cirun

* use ubuntu standard AMI

* run nextflow on the self-hosted machine

* add to CONTRIBUTING

* update ami

* install unzip

* set up docker

* install docker from curl

* use t2.micro not nano

* use custom AMI

* pythonLocation

* add scripts to path

* larger disk size

* new image again

* chown for now

* chmod 755

* fixed permissions

* use tower workspace

* test nextflow

* try again

* nextflow -q

* redirect stderr

* increase memory

* cleanup

* sudo install

* name

* try setting pythonpath

* fix branch env

* another fix

* fix run name

* typo

* fix pythonpath:

* don't use pushd

* pass pythonpath

* set nousersite

* empty

* sudo install

* run attempt

* revert temporary changes

* cleanup

* fix contributing

* add instructions for tower

* fix repo name

* move ami setup into script

* Import Olsson 2016 dataset for dimred task (#352)

* Import Olsson 2016 dataset for dimred task

* Fix path to Olsson dataset loader

* Filter genes cells before subsetting Olsson data in test

* Use highly expressed genes for test Olsson dataset

Test dataset is now 700 genes by 300 cells (was 500 x 500)

* Add ivis dimred method (#369)

* add densMAP package to python-extras

* pre-commit

* Add Ivis method

* Explicitly mention it's CPU implementation

* Add forgotten import in __init__

* Remove redundant filtering

* Move ivis inside the function

* Make var names unique, add ivis[cpu] to README

* Pin tensorflow version

* Add NeuralEE skeleton

* Implement method

* added densmap and densne

* Fix typo pytoch -> torch

* pre-commit

* remove densne

* Add forgotten detach/cpu/numpy

* formatting

* pre-commit

* formatting

* formatting

* pre-commit

* formatting

* formatting

* formatting

* pre-commit

* formatting

* umap-learn implementation

* pre-commit

* Add docker image

* Add skeleton method

* formatting

* Implement method

* Fix some small bugs

* Add preprocessing

* Change batch size to 1k cells for aff. matrix

* Add new preprocessing

* Add new preprocessing

* Fix preprocessing

* Fix preprocessing

* pre-commit

* updated template for PR with PR evaluation checks (#314)

* Update alra.py (#304)

* Update alra.py

Fix pre-processing and transformation back into the original space

* pre-commit

* Update alra.py


* make sure necessary methods are imported


Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Daniel Burkhardt <[email protected]>

* Add scanpy preprocessing to densmap dimred method

* Rename preprocess_scanpy() to preprocess_logCPM_1kHVG()

* Add preprocessing suffix to dimred methods

* Subset object in preprocess_logCPM_1kHVG()

* Use standard names for input

* Add neuralee_logCPM_1kHVG method

* Add densmap_pca method

* Fix preprocess_logCPM_1kHVG()

Now returns an AnnData rather than acting in place
- Subsetting wasn't working in place

Also set HVG flavor to "cell_ranger"

* Add test argument to dimred methods

* Move preprocess_logCPM_1kHVG() to tools.normalize

* Change name in python-method-scvis Docker README

* Rename openproblems-python-method-scvis container

Now called open-problems-python36

* Fix AnnData ref in merge

* Copy object when subsetting in preprocess_logCPM_1kHVG()

* Move PCA to dimred methods

* Use preprocess_logCPM_1kHVG() in nn_ranking metrics

* Fix path in python36 dockerfile

* Add test kwarg to neuralee_default method

* Add check for n_var to preprocess_logCPM_1kHVG()

Should fix tests that were failing due to scverse/scanpy#2230

* Store raw counts in NeuralEE default method

* Update dimred README

* Replace X_input with PCA in ivis dimred method

* Refactor preprocess_logCPM_1kHVG() to log_cpm_hvg()

* Re-add ivis

Co-authored-by: Ben DeMeo <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Michal Klein <[email protected]>
Co-authored-by: michalk8 <[email protected]>
Co-authored-by: bendemeo <[email protected]>
Co-authored-by: MalteDLuecken <[email protected]>
Co-authored-by: Wesley Lewis <[email protected]>
Co-authored-by: Daniel Burkhardt <[email protected]>
Co-authored-by: Scott Gigante <[email protected]>

* hotfix timeout-minutes (#374)

* use branch of scprep to provide R traceback (#376)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Scott Gigante <[email protected]>
Co-authored-by: Daniel Strobl <[email protected]>
Co-authored-by: SingleCellOpenProblems <[email protected]>
Co-authored-by: Luke Zappia <[email protected]>
Co-authored-by: Ben DeMeo <[email protected]>
Co-authored-by: Michal Klein <[email protected]>
Co-authored-by: michalk8 <[email protected]>
Co-authored-by: bendemeo <[email protected]>
Co-authored-by: MalteDLuecken <[email protected]>
Co-authored-by: Wesley Lewis <[email protected]>
Co-authored-by: Daniel Burkhardt <[email protected]>
Co-authored-by: Nikolay Markov <[email protected]>
Co-authored-by: adamgayoso <[email protected]>
Co-authored-by: Valentine Svensson <[email protected]>
Co-authored-by: Eduardo Beltrame <[email protected]>
Co-authored-by: atchen <[email protected]>

* clean up dockerfile

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Scott Gigante <[email protected]>
Co-authored-by: Daniel Strobl <[email protected]>
Co-authored-by: SingleCellOpenProblems <[email protected]>
Co-authored-by: Luke Zappia <[email protected]>
Co-authored-by: Ben DeMeo <[email protected]>
Co-authored-by: Michal Klein <[email protected]>
Co-authored-by: michalk8 <[email protected]>
Co-authored-by: bendemeo <[email protected]>
Co-authored-by: MalteDLuecken <[email protected]>
Co-authored-by: Wesley Lewis <[email protected]>
Co-authored-by: Daniel Burkhardt <[email protected]>
Co-authored-by: Nikolay Markov <[email protected]>
Co-authored-by: adamgayoso <[email protected]>
Co-authored-by: Valentine Svensson <[email protected]>
Co-authored-by: Eduardo Beltrame <[email protected]>
Co-authored-by: atchen <[email protected]>

* only skip CI if command is in commit headline (#381)

* only skip if ci skip is in commit headline

* try using endsWith instead # ci skip

* Fix CI skip (#382)

* only skip if ci skip is in commit headline

* try using endsWith instead # ci skip

* make actions run

* upgrade AMI (#384)

* upgrade AMI

* uncomment docker

* uncomment tests

* Revert "Run test_benchmark on a self-hosted runner (#373)" (#386)

* revert 2d57868

* bash -x

* /bin/bash

* Bugfix CI (#387)

* upgrade AMI

* uncomment docker

* uncomment tests

* clean up testing

* tighter diff for testing

* more memory

* Revert "Bugfix CI (#387)" (#388)

This reverts commit b50a909.

* pass test arg to methods through CLI (#390)

* make scvi run faster on test mode (#385)

* make scvi run faster on test mode

* pass test argument through cli

* dirty hack to fix docker_build (#391)

* remove ivis temporarily (#392)

* neuralee fix (#383)

* build images before testing

* try something different

* needs

* fewer linebreaks

* try as string

* move the if

* remove one condition

* fix

* cancel more quickly

* run benchmark

* don't build on main in run_benchmark

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Scott Gigante <[email protected]>
Co-authored-by: Daniel Strobl <[email protected]>
Co-authored-by: SingleCellOpenProblems <[email protected]>
Co-authored-by: Luke Zappia <[email protected]>
Co-authored-by: Ben DeMeo <[email protected]>
Co-authored-by: Michal Klein <[email protected]>
Co-authored-by: michalk8 <[email protected]>
Co-authored-by: bendemeo <[email protected]>
Co-authored-by: MalteDLuecken <[email protected]>
Co-authored-by: Wesley Lewis <[email protected]>
Co-authored-by: Daniel Burkhardt <[email protected]>
Co-authored-by: Nikolay Markov <[email protected]>
Co-authored-by: adamgayoso <[email protected]>
Co-authored-by: Valentine Svensson <[email protected]>
Co-authored-by: Eduardo Beltrame <[email protected]>
Co-authored-by: atchen <[email protected]>
rcannood added a commit that referenced this pull request Sep 4, 2024
* fix opv1multimodal loader

* fix processor