Seqwish does not transitively align through SNPs #60
Comments
Thanks. That's really weird! Will check it out.
On Mon, Aug 10, 2020, 22:50 Glenn Hickey ***@***.***> wrote:
Here's a simple case inspired from real data from a cactus alignment:
# dummy.fa
>Anc
AAATAAA
>1
AAATAAA
>2
AAATAAA
>3
AAAGAAA
>4
AAAGAAA
# dummy.paf
1 7 0 7 + Anc 7 0 7 7 7 255 cg:Z:7M
2 7 0 7 + Anc 7 0 7 7 7 255 cg:Z:7M
3 7 0 7 + Anc 7 0 7 7 7 255 cg:Z:7M
4 7 0 7 + Anc 7 0 7 7 7 255 cg:Z:7M
seqwish -p dummy.paf -s dummy.fa -g dummy.gfa -P
Those Gs are transitively aligned in the PAF by way of aligning to the T
in Anc, but they don't come out that way in the graph:
[image: dummy]
<https://user-images.githubusercontent.com/901102/89830034-92210100-db29-11ea-820c-3944e1503e7c.png>
This is the exact same issue as ComparativeGenomicsToolkit/hal2vg#26, funnily enough.
OK, these aren't transitively aligned, so the result you're getting is not a bug but an exact representation of the input alignments. To try to write this out:
The upper-case characters match Anc. The lower-case …
An all-to-all alignment would provide this, or you could add the alignments within each SNP or variant (but that seems tedious). Alternatively, you could push the output through smoothxg:
This result matches your expectation:
Cool, thanks. Kind of figured it was by design but thought I'd bring it up anyway, mostly to make me feel slightly less bad about forgetting to handle this in …
I don't think it's trivial to handle! Yeah, seqwish is meant to be a pretty "dumb" algorithm. It just takes the input that's given, with some optional filters (…
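To make the point in this thread concrete, here is a minimal Python sketch (not seqwish's actual code) of a transitive closure that only merges aligned bases when they are identical. It assumes, as the reply above implies, that a mismatching pair under an `M` operation is simply not merged. `closure_classes` and `same_class` are hypothetical helper names, and the sketch only handles the full-length, gap-free alignments of the dummy example.

```python
# Sequences copied from dummy.fa in the issue.
seqs = {
    "Anc": "AAATAAA",
    "1": "AAATAAA",
    "2": "AAATAAA",
    "3": "AAAGAAA",
    "4": "AAAGAAA",
}
# (query, target) pairs from dummy.paf; every record is a full-length 7M.
alignments = [("1", "Anc"), ("2", "Anc"), ("3", "Anc"), ("4", "Anc")]


def closure_classes(seqs, alignments):
    """Union-find over (sequence, offset) nodes.

    Assumption: two aligned bases are merged only if they are identical,
    so the G/T mismatch at offset 3 creates no link.
    """
    parent = {(name, i): (name, i) for name, s in seqs.items() for i in range(len(s))}

    def find(x):
        # Walk to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for query, target in alignments:
        for i in range(len(seqs[query])):
            if seqs[query][i] == seqs[target][i]:
                parent[find((query, i))] = find((target, i))

    classes = {}
    for node in parent:
        classes.setdefault(find(node), set()).add(node)
    return list(classes.values())


def same_class(classes, a, b):
    """True if nodes a and b ended up in the same closure class."""
    return any(a in c and b in c for c in classes)


# Only the alignments to Anc: the two Gs have no matching path between them.
star = closure_classes(seqs, alignments)
print(same_class(star, ("3", 3), ("4", 3)))  # False

# Hypothetical extra record aligning 3 to 4 directly (the "all-to-all" idea).
extra = closure_classes(seqs, alignments + [("3", "4")])
print(same_class(extra, ("3", 3), ("4", 3)))  # True
```

With only the four alignments to Anc, the G in 3 and the G in 4 stay in separate classes because their only shared neighbour is a mismatching T. Adding a direct 3-to-4 alignment, which is what an all-to-all alignment would supply, links them.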