[Dimensionality reduction] Add metrics based on NN co-ranking #300
Comments
This is a great reference! I think co-k-nearest-neighbor size (Q_NN(k)) is especially useful, although it has a nuisance parameter k. To make it more robust, we might average Q_NN(k) for k up to some threshold, possibly weighting for lower distances. But maybe Q_NN(k) is a good place to start. Not sure if the original authors compute an entire distance matrix, but I could write code to generate a sparse Q_NN matrix using only nearest neighbor data from sklearn.neighbors.
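A minimal sketch of that idea, not the original authors' implementation. It assumes Q_NN(k) is the mean fraction of each point's k nearest neighbors that are shared between the original space and the embedding, and it uses only neighbor indices from `sklearn.neighbors.NearestNeighbors`, so no full distance matrix is stored. The function names `knn_indices`, `q_nn`, and `mean_q_nn`, and the optional `weights` argument, are hypothetical and chosen for illustration.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors


def knn_indices(X, k):
    """Return an (n, k) array of neighbor indices, nearest first, self excluded."""
    nn = NearestNeighbors(n_neighbors=k).fit(X)
    # With no query argument, kneighbors() returns the neighbors of the
    # fitted points and does not count a point as its own neighbor.
    return nn.kneighbors(return_distance=False)


def _overlap_fraction(idx_high, idx_low, k):
    """Mean fraction of shared neighbors using the first k columns of each array."""
    shared = [
        len(np.intersect1d(a[:k], b[:k]))
        for a, b in zip(idx_high, idx_low)
    ]
    return float(np.mean(shared)) / k


def q_nn(X_high, X_low, k):
    """Assumed definition of Q_NN(k): average k-neighborhood overlap in [0, 1]."""
    return _overlap_fraction(knn_indices(X_high, k), knn_indices(X_low, k), k)


def mean_q_nn(X_high, X_low, k_max, weights=None):
    """Average Q_NN(k) over k = 1..k_max (optionally weighted) to reduce
    the dependence on a single choice of k."""
    # Neighbors are sorted by distance, so one query at k_max covers every
    # smaller k through column slicing.
    idx_high = knn_indices(X_high, k_max)
    idx_low = knn_indices(X_low, k_max)
    values = [
        _overlap_fraction(idx_high, idx_low, k)
        for k in range(1, k_max + 1)
    ]
    return float(np.average(values, weights=weights))
```

For example, `q_nn(X, embedding, k=15)` would score a 2D embedding against the original expression matrix `X`, and `mean_q_nn(X, embedding, k_max=50)` averages over neighborhood sizes. An embedding that preserves every neighborhood scores 1.0. Memory stays at O(n * k_max) for the index arrays, although the per-point intersection loop is plain Python and may need vectorizing for large datasets.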
* Add nf-tower cli for dataset loader
* add mising directive labels for dataset loader
* add missing directive labels process datasets
* remove space in file name
* update s3 bucket
* increase yaml limit to 5mb
* Fix dataset schema validation and remove unnecessary code to fix meta file size
* Update dataset schema file path in config.vsh.yaml and main.nf
* Add script for processing datasets on nf-tower in bat_int
* Remove dataset_schema input from config.vsh.yaml
* Add output_task_info to workflow configuration
* Update publish directory in process_datasets.sh for bat_int
* Update denoising process_datasets wf

Former-commit-id: 82a6a4d
What is the metric?
Describe it briefly and include a citation if applicable.
There are many metrics described here: https://core.ac.uk/download/pdf/148668147.pdf
How should it be implemented?
Include a link to the publicly available codebase if available.
https://www.sciencedirect.com/science/article/pii/S2405844021003042
contains the code. Scaling will also be an issue (for more than roughly 3k cells, as an estimate).
I'd also double-check that the implementation is correct.
Which task(s) could it be used for?
If the method is to be used for a task that is not yet included in the code base, use the "propose a new task" issue template instead.
Dimensionality reduction (2D), possibly