-
Notifications
You must be signed in to change notification settings - Fork 80
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Attempted sourmash gather optimizations #519
Comments
Third, I tried various approaches to preloading, where you raise the match threshold based on what you've seen before (that's what's in #520). No joy - apparently the search truncation based on requested threshold isn't very effective, maybe because the threshold matches are so low. (I've been using shakya-unaligned-contigs.sig against the old genbank k=31 tree to benchmark). |
Another thought - if we did a search for containment above some fixed threshold, we could limit ourselves to the results from that search until we fell below that threshold. This would avoid going back to the full tree in some situations. Have to think about when and where that would fail, though. |
seems like @luizirber is On It with #615. |
ref #518 -
I've tried & failed at a few different optimizations for sourmash gather that I wanted to describe here.
First, I tried doing adaptive downsampling, in which we downsample the query signature to a high scaled value (5e5) for the first searches. Unfortunately, the devil is in the details: when you downsample the query signature, you end up with different enough results that in each iteration you need to re-run the search to get the exact right results, and that second search discards any speedup you get from doing the first search.
Second, I tried various ways of caching search results. In particular, you would think that we would get quite a speedup from tracking internal nodes that have 0 matches to the entire query signature and never looking at them again. Not so, it seems.
The text was updated successfully, but these errors were encountered: