HaplotypeCaller and Mutect2 in 4.1.6.0 failing with Smith-Waterman cigar reference length mismatch #6533
Comments
@davidbenjamin I've spun this off into a separate ticket. I think this might be serious/widespread enough that we'll want a 4.1.6.1 release once we have a working fix. I recommend that we manually do full-scale test runs of both HaplotypeCaller and Mutect2 to confirm that the issue is resolved rather than relying on the integration tests, which are not large-scale enough to catch edge cases like this, unfortunately.
It seems like the patch in 4.1.6 didn't go far enough and that exception needs to be replaced with a For example, suppose we have a ref haplotype ABCDD, where A, B, and C represent sequences of, say, 100 bases and D is a sequence of 50 bases. Suppose further that A and DD are the padding. Then the cigar of an alt haplotype ABCD gets aligned as a 350-base match that doesn't span the full padded reference region, leading to the error. I still need to figure out why this didn't happen in 4.1.4 (my guess is that elsewhere the code effectively skipped these haplotypes before the exception).
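To make the arithmetic in this example concrete, here is a minimal Python sketch of the length check. It is not GATK's code: the function name `cigar_reference_length` and the CIGAR strings are assumptions chosen for illustration, and the only rule it relies on is that the M, D, N, = and X operators consume reference bases.

```python
import re

# CIGAR operators that consume reference bases.
REF_CONSUMING = set("MDN=X")


def cigar_reference_length(cigar: str) -> int:
    """Sum the lengths of all CIGAR elements that consume reference bases."""
    elements = re.findall(r"(\d+)([MIDNSHP=X])", cigar)
    return sum(int(length) for length, op in elements if op in REF_CONSUMING)


# Lengths from the example above: A, B and C are 100 bases each, D is 50.
padded_ref_length = 3 * 100 + 2 * 50  # ref haplotype ABCDD

# Hypothetical cigar for the alt haplotype ABCD aligned as one long match.
alt_cigar = "350M"

if cigar_reference_length(alt_cigar) != padded_ref_length:
    print("mismatch:", cigar_reference_length(alt_cigar), "vs", padded_ref_length)
```

In this sketch the alt haplotype's cigar covers 350 reference bases while the padded reference region is 400 bases long, which is the kind of mismatch described above.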
That would work, but I see where I caused the regression upstream. I chopped leading and trailing deletions from haplotype cigars, same as for read cigars, but for haplotypes we want to keep these deletions because the start and end positions need to remain pegged to the reference start and end. I have a fix + regression test branch, which is running on every M2 validation.
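The effect of chopping terminal deletions can be sketched in a few lines of Python. This is an illustration only, not the actual GATK change: `parse_cigar`, `reference_length`, `trim_terminal_deletions` and the cigar `350M50D` are all hypothetical, picked to match the ABCDD example with its 400-base padded reference.

```python
import re


def parse_cigar(cigar):
    """Split a CIGAR string into (length, operator) pairs."""
    return [(int(n), op) for n, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar)]


def reference_length(elements):
    """Count the reference bases consumed by the M, D, N, = and X operators."""
    return sum(n for n, op in elements if op in "MDN=X")


def trim_terminal_deletions(elements):
    """Drop leading and trailing deletion elements, as one might for a read."""
    start, end = 0, len(elements)
    while start < end and elements[start][1] == "D":
        start += 1
    while end > start and elements[end - 1][1] == "D":
        end -= 1
    return elements[start:end]


# Hypothetical haplotype cigar that spans a 400-base padded reference.
haplotype = parse_cigar("350M50D")
trimmed = trim_terminal_deletions(haplotype)

print(reference_length(haplotype))  # full padded reference span
print(reference_length(trimmed))    # shorter span once the trailing deletion is gone
```

With the trailing deletion kept, the haplotype still ends where the padded reference ends. Once it is trimmed, the cigar no longer accounts for the last 50 reference bases.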
gatk 4.1.4.1 version working fine for
@davidbenjamin How's the patch coming? Did the M2 validation tests pass on your branch? We'll definitely try to expedite the code review, but I think we'll want some additional heavy-duty testing prior to release.
We have asked the green team to run their pipeline tests on this branch to at least limit the risk of more full sample failures. It will probably be a few more days before we have those results. @gbggrant
Fixed by #6544 -- closing.
As reported in this forum thread: https://gatk.broadinstitute.org/hc/en-us/community/posts/360060174372-Haplotype-Caller-4-1-6-0-java-lang-IllegalStateException-Smith-Waterman-alignment-failure- as well as in the closed ticket #6490 (comment), there appears to be a regression in GATK 4.1.6.0 causing HaplotypeCaller/Mutect2 to fail with an error like the following: