[Dimensionality reduction] Add metrics based on NN co-ranking #300

michalk8 · 2021-03-31T12:07:25Z

What is the metric?
Describe it briefly and include a citation if applicable.
There are many metrics described here: https://core.ac.uk/download/pdf/148668147.pdf

How should it be implemented?
Include a link to the publicly available codebase if available.
The paper at https://www.sciencedirect.com/science/article/pii/S2405844021003042
contains the code. Scaling will likely be an issue (above roughly 3k cells, by my estimate).
I'd also double-check that the implementation is correct.
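
For cross-checking on small inputs, a brute-force co-ranking matrix can serve as a reference. The sketch below is not the code from the linked paper; the function names (`rank_matrix`, `coranking_matrix`, `q_nx`) are hypothetical, and it assumes Euclidean distances and the usual definition in which entry (k, l) counts pairs with rank k in the original space and rank l in the embedding. It stores full n × n distance and rank matrices, which illustrates why scaling becomes a concern.

```python
import numpy as np
from sklearn.metrics import pairwise_distances


def rank_matrix(X):
    """Rank of every point j in the neighbor ordering of every point i.

    Rank 0 is the point itself; ranks 1..n-1 are the other points by distance.
    """
    D = pairwise_distances(X)
    np.fill_diagonal(D, -np.inf)  # force each point to be its own rank-0 entry
    order = np.argsort(D, axis=1, kind="stable")
    n = D.shape[0]
    ranks = np.empty_like(order)
    ranks[np.arange(n)[:, None], order] = np.arange(n)
    return ranks


def coranking_matrix(X_high, X_low):
    """Dense (n-1) x (n-1) co-ranking matrix; memory grows as O(n^2)."""
    n = X_high.shape[0]
    r_high = rank_matrix(X_high)
    r_low = rank_matrix(X_low)
    off_diag = ~np.eye(n, dtype=bool)  # drop the self pairs (rank 0)
    Q = np.zeros((n - 1, n - 1), dtype=np.int64)
    np.add.at(Q, (r_high[off_diag] - 1, r_low[off_diag] - 1), 1)
    return Q


def q_nx(Q, k):
    """Fraction of k-neighborhood pairs preserved, read off the top-left k x k block."""
    n = Q.shape[0] + 1
    return Q[:k, :k].sum() / (k * n)
```

An embedding identical to the input should give a purely diagonal matrix and a value of 1 for every k, which makes a convenient sanity check for any faster implementation.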

Which task(s) could it be used for?
If the method is to be used for a task that is not yet included in the code base, use the "Propose a new task" issue template instead.
Dimensionality reduction (2D), possibly

bendemeo · 2021-03-31T16:28:01Z

This is a great reference! I think co-k-nearest-neighbor size (Q_NN(k)) is especially useful, although it has a nuisance parameter k. To make it more robust, we might average Q_NN(k) for k up to some threshold, possibly weighting for lower distances. But maybe Q_NN(k) is a good place to start. Not sure if the original authors compute an entire distance matrix, but I could write code to generate a sparse Q_NN matrix using only nearest neighbor data from sklearn.neighbors.
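
A minimal sketch of that idea, using only nearest-neighbor queries from `sklearn.neighbors` and never building a full distance matrix. It assumes Q_NN(k) means the average fraction of each point's k nearest neighbors in the original space that are also among its k nearest neighbors in the embedding; the function names (`knn_indices`, `q_nn`, `mean_q_nn`) are hypothetical, Euclidean distance is assumed, and the average over k is unweighted.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors


def knn_indices(X, k):
    """(n, k) array of each point's k nearest neighbors, sorted by distance.

    Calling kneighbors() without a query excludes each point from its own
    neighbor list.
    """
    nn = NearestNeighbors(n_neighbors=k).fit(X)
    return nn.kneighbors(return_distance=False)


def q_nn(X_high, X_low, k):
    """Average fraction of k-nearest neighbors shared between the two spaces."""
    high = knn_indices(X_high, k)
    low = knn_indices(X_low, k)
    overlap = [len(np.intersect1d(h, l)) for h, l in zip(high, low)]
    return float(np.mean(overlap)) / k


def mean_q_nn(X_high, X_low, k_max):
    """Unweighted mean of q_nn over k = 1..k_max, reusing one neighbor query."""
    high = knn_indices(X_high, k_max)
    low = knn_indices(X_low, k_max)
    scores = []
    for k in range(1, k_max + 1):
        # The first k columns are the k nearest neighbors because rows are sorted.
        overlap = [len(np.intersect1d(h[:k], l[:k])) for h, l in zip(high, low)]
        scores.append(np.mean(overlap) / k)
    return float(np.mean(scores))
```

For example, `q_nn(X, X_emb, k=15)` would score a 2D embedding `X_emb` against the original data `X`; a weighted variant would replace the final mean with a weighted average over k.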


michalk8 added the enhancement and metric labels Mar 31, 2021

michalk8 mentioned this issue Mar 31, 2021

NN co-ranking metrics for dimred lazappi/openproblems#8

Merged

12 tasks

lazappi assigned michalk8 Mar 31, 2021

lazappi mentioned this issue May 14, 2021

Jamboree dimensionality reduction metrics #317

Merged

19 tasks

scottgigante-immunai closed this as completed Dec 1, 2022
