Malformed file from to-mr
#50
Comments
Hi, thanks for posting this issue! I'm able to reproduce your results, and it looks like there's an issue with our
Thanks for looking into it @terencewtli ! While we are at it, it would be great to get a rundown of what
Looking at the line count, I see around 3M, which makes ballpark sense as it's paired-end sequencing and the MR file should be a mix of mostly pairs and some singletons:
However, it's unclear which reads are included in the MR file. This sequencing run has 151 bp reads, so I compared the MR fragment lengths against that read length to approximate the number of singletons and pairs:
So |
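The fragment-length comparison described above could be sketched roughly as follows in Python. This is only an illustration, not the script used in this thread: it assumes the MR file is tab-separated and BED-like, with chromosome, start, and end in the first three columns, and it treats any record spanning more than one read length as a likely merged pair. The function name `classify_fragments` and the category labels are hypothetical.

```python
from collections import Counter


def classify_fragments(lines, read_len=151):
    """Count BED-like records by whether they span more than one read length.

    Assumption: columns 2 and 3 hold integer start and end coordinates.
    """
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            counts["unparsed"] += 1
            continue
        try:
            start, end = int(fields[1]), int(fields[2])
        except ValueError:
            counts["unparsed"] += 1
            continue
        span = end - start
        if span > read_len:
            # Longer than a single read: probably two mates merged into one fragment.
            counts["likely_pair"] += 1
        else:
            # No longer than a single read: probably an unpaired read.
            counts["likely_singleton"] += 1
    return counts


# Made-up records for illustration only (not taken from the attached files).
example = [
    "chr1\t100\t251\tread1\t0\t+\n",
    "chr1\t500\t820\tread2\t0\t+\n",
    "chr2\t10\t140\tread3\t0\t-\n",
    "garbled line\n",
]
print(classify_fragments(example))
```

The heuristic is rough: fully overlapping mates would produce a merged fragment no longer than one read and would be counted as singletons here.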
I'm getting a very similar error with to-mr generating invalid mr files.
I deleted that line (the first line) and still got an error:
The mr file has other issues too. Here is one of the following lines:
I'd like to try using bam2mr, but I can't build an old version of preseq. I get a make error: `fatal error: #include ...` Is there a way I can download a binary for bam2mr?
@vmkalbskopf I'll do my best to respond, but would it be possible to separate these issues? I would like to help you get things working asap, but I'd still like a record of the individual errors. Also, I can't promise a binary, but if you email me directly we might find a workaround.
@andrewdavidsmith so kind of you to respond so quickly. I will create a separate issue for the garbled mr file.
Hi, I am trying to troubleshoot an issue with using the `to-mr` executable to convert a BAM file generated with `bwa mem`. I am using preseq in a Docker container built from the following Dockerfile:

I ran this on a file containing about 5M reads, and got the error below. To reproduce the error, I took a subset of the BAM, which is attached.
I ran gc_extrap as follows and got the following error:
Looking through the MR file, it seems that the following reads are causing the problem:
Running with bam2mr warns of several segment length issues, but results in an output file containing the passing reads:
Those files are in the attached zip as well.
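For anyone wanting to locate problem records like these in their own MR file, a minimal Python sketch along the following lines may help. It is not part of preseq. It assumes a tab-separated, BED-like layout whose first six columns are chromosome, start, end, name, score, and strand, and the helper name `find_suspect_records` is hypothetical.

```python
def find_suspect_records(lines, min_fields=6):
    """Return (line_number, reason) pairs for records that look malformed.

    Assumption: BED-like columns chrom, start, end, name, score, strand.
    """
    problems = []
    for lineno, line in enumerate(lines, start=1):
        fields = line.rstrip("\n").split("\t")
        if len(fields) < min_fields:
            problems.append((lineno, "too few fields"))
            continue
        try:
            start, end = int(fields[1]), int(fields[2])
        except ValueError:
            problems.append((lineno, "non-numeric coordinates"))
            continue
        if end <= start:
            problems.append((lineno, "end not after start"))
            continue
        if fields[5] not in ("+", "-"):
            problems.append((lineno, "unexpected strand"))
    return problems


# Made-up records for illustration only (not taken from the attached files).
sample = [
    "chr1\t100\t251\tr1\t0\t+\n",
    "chr1\t300\t200\tr2\t0\t-\n",
    "chr1\tabc\t200\tr3\t0\t+\n",
    "chr1\t10\t20\n",
    "chr1\t10\t20\tr5\t0\t?\n",
]
for lineno, reason in find_suspect_records(sample):
    print(lineno, reason)
```

To scan a real file, pass an open file handle, for example `find_suspect_records(open("reads.mr"))`, where `reads.mr` is a placeholder path.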
Is there something wrong with my BAM file? I have been using preseq 2.0.3 (with bam2mr) for a while, and am trying to update to 3.1.1. Converting the BAM to BED with bedtools's bamToBed and running preseq results in no errors.
Thanks in advance!
tmp.zip