Long processing time on 1.5_single_cell_statistics.py #19

Open
asjureka opened this issue Oct 4, 2022 · 1 comment

asjureka commented Oct 4, 2022

Hi,

I was recently running the SONAR workflow on our HPC cluster, and I noticed that 1.5_single_cell_statistics.py was taking quite a long time to finish. Looking through the script, it doesn't appear to be threaded (or not obviously; please correct me if I'm wrong). Would it be possible to add threading to this script so it can use the available processing power more efficiently?

Thank you!

Owner

scharch commented Oct 28, 2022

Yes, this is a known issue. The main cause is large 10x datasets with droplets containing tens of (usually light) chains, which trigger an inordinate number of alignments when the script tries to collapse them. Despite the function name, the function is actually extremely slow: it writes to disk, calls muscle, and then reads the output back in.

I have some ideas about how to fix this, which I hope to include as part of a major overhaul/refactor of module 1 planned for the next major version of SONAR, but I don't have any sort of timeline for release yet.

In the meantime, you are welcome to submit a pull request to add threading, though I'm not sure how much of a speedup you can get that way. A better approach would probably be to filter your rearrangements TSV before calling 1.5 to remove "cells" with more than 5-10 chains present; they are likely to be background noise/unsalvageable anyway.
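
For anyone who wants to try the pre-filtering suggested above, here is a minimal sketch of one way to do it. It is not part of SONAR. It assumes the rearrangements TSV carries an AIRR-style `cell_id` column and that each row is one chain; the function name `filter_crowded_cells`, the default threshold of 10, and the choice to pass rows with an empty cell ID through unchanged are all illustrative assumptions, so adjust them to your data.

```python
import csv
from collections import Counter


def filter_crowded_cells(in_path, out_path, max_chains=10, cell_column="cell_id"):
    """Copy a rearrangements TSV, dropping every row that belongs to a
    cell with more than `max_chains` rows. Returns the number of rows kept.

    Assumptions (not taken from SONAR itself): the cell barcode lives in
    `cell_column`, and each row corresponds to one chain.
    """
    # First pass: count rows (chains) per cell barcode.
    with open(in_path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        counts = Counter(row[cell_column] for row in reader)

    # Second pass: write only the rows whose cell is at or under the threshold.
    kept = 0
    with open(in_path, newline="") as src, open(out_path, "w", newline="") as dst:
        reader = csv.DictReader(src, delimiter="\t")
        writer = csv.DictWriter(
            dst,
            fieldnames=reader.fieldnames,
            delimiter="\t",
            lineterminator="\n",
        )
        writer.writeheader()
        for row in reader:
            cell = row[cell_column]
            # Rows without a cell ID are passed through unchanged (an
            # assumption; drop them here if that suits your data better).
            if not cell or counts[cell] <= max_chains:
                writer.writerow(row)
                kept += 1
    return kept
```

You would call it as something like `filter_crowded_cells("rearrangements.tsv", "rearrangements_filtered.tsv", max_chains=10)` (both file names are placeholders) and then point 1.5 at the filtered file. Reading the input twice keeps memory use low on large 10x datasets, at the cost of a second pass over the file.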
