Dear nf-core community, thank you very much for your incredible work!
I'm working with WGS samples at 100X coverage. I have BAM and BAI files.
When I run sarek on these files, I have to wait 4 to 6 hours for them to be converted from BAM to CRAM, and I don't even need those CRAM files.
I was wondering if it would be possible to add an option to work exclusively with BAM instead of CRAM.
Furthermore, looking at the code and modules, I found that BAM_MARKDUPLICATES:GATK4_MARKDUPLICATES is the main point where outputs become CRAM instead of BAM.
With a slight modification to this module, making the BAM-to-CRAM conversion after markduplicates optional, it should also be possible to get BAM outputs without converting CRAM back to BAM when running the full pipeline from FASTQ inputs.
Thank you in advance and best regards!
Youssef
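The request above amounts to gating the BAM-to-CRAM conversion behind a switch. A minimal sketch of that idea in Python follows; it is not sarek's implementation (sarek is written in Nextflow), and both the function name `conversion_command` and the `save_as_cram` switch are hypothetical names chosen for illustration. The only real tool usage assumed is `samtools view -C -T <reference> -o <output> <input>`, which writes CRAM output against a reference FASTA.

```python
from typing import List, Optional


def conversion_command(bam: str, reference: str, save_as_cram: bool) -> Optional[List[str]]:
    """Build a samtools command that converts `bam` to CRAM.

    Returns None when `save_as_cram` is False, meaning the BAM file is
    kept as-is and the conversion step is skipped entirely.
    `save_as_cram` is a hypothetical switch, not an existing sarek parameter.
    """
    if not save_as_cram:
        return None
    # Derive the CRAM file name from the BAM file name.
    cram = bam[:-4] + ".cram" if bam.endswith(".bam") else bam + ".cram"
    # -C: write CRAM, -T: reference FASTA, -o: output file
    return ["samtools", "view", "-C", "-T", reference, "-o", cram, bam]


if __name__ == "__main__":
    # With the switch off, no conversion command is produced.
    print(conversion_command("sample.bam", "genome.fa", save_as_cram=False))
    # With the switch on, the command list could be passed to subprocess.run.
    print(conversion_command("sample.bam", "genome.fa", save_as_cram=True))
```

With the switch off, downstream steps would read the BAM file directly, which is the behaviour asked for here.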
…ps, it's not necessary. (#1728)
Thank you to @SPPearce for chatting with me again to finally remove this
wasteful step.
Closes #1162
---------
Co-authored-by: Maxime U Garcia
Module referenced in the issue:
https://github.com/nf-core/sarek/blob/3.1.2/modules/nf-core/gatk4/markduplicates/main.nf