distiller #168

Open
yardenmatok203 opened this issue Jul 8, 2023 · 6 comments

Comments

@yardenmatok203

Hey,

Could you please elaborate on your preprocessing with distiller and cooler?

Thanks!

@gfudenberg
Contributor

Distiller code for going from reads to binned contact maps can be found here:
https://github.com/open2c/distiller-nf
(it relies on pairtools: https://www.biorxiv.org/content/10.1101/2023.02.13.528389v1.full and cooler)
We used the default filters, a MAPQ ≥ 30 filter for read quality, and set the bin size used for cooler to 2048 bp. This was chosen as a power of 2 close to the maximum resolution we felt comfortable analyzing the contact datasets in the paper at, given their read coverage and experimental protocol details.
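
To illustrate what a MAPQ 30 filter does to a .pairs file, here is a minimal pure-Python sketch. It is not the distiller or pairtools implementation. It assumes that the file has a `#columns:` header listing `mapq1` and `mapq2`, and that a pair is kept only when both sides have MAPQ ≥ 30; the function name and the example records are hypothetical.

```python
def filter_pairs_by_mapq(lines, min_mapq=30):
    """Pass header lines through; keep pairs whose two MAPQs are both >= min_mapq.

    Assumes a '#columns:' header that names 'mapq1' and 'mapq2' (an assumption
    about the input, not something guaranteed for every .pairs file).
    """
    columns = None
    for line in lines:
        if line.startswith("#"):
            if line.startswith("#columns:"):
                columns = line[len("#columns:"):].split()
            yield line
            continue
        if columns is None:
            raise ValueError("no '#columns:' header seen before data lines")
        record = dict(zip(columns, line.rstrip("\n").split("\t")))
        if int(record["mapq1"]) >= min_mapq and int(record["mapq2"]) >= min_mapq:
            yield line


# Hypothetical two-record example: r2 has one low-quality side and is dropped.
example = [
    "## pairs format v1.0\n",
    "#columns: readID chrom1 pos1 chrom2 pos2 strand1 strand2 pair_type mapq1 mapq2\n",
    "r1\tchr1\t100\tchr1\t5000\t+\t-\tUU\t60\t60\n",
    "r2\tchr1\t200\tchr2\t900\t+\t+\tUU\t60\t12\n",
]
kept = list(filter_pairs_by_mapq(example))
```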

Use of cooler is integrated into the distiller pipeline. If you already have mapped reads in a .pairs format, though, you can use cooler cload: https://cooler.readthedocs.io/en/latest/cli.html#cooler-cload.
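
For the `cooler cload` route, a minimal sketch of how the command could be assembled from Python is shown below. The file names are hypothetical placeholders, and the column numbers assume the standard .pairs layout (readID, chrom1, pos1, chrom2, pos2, ...); check both against your own files. The snippet only builds the argument list and does not run anything.

```python
import shlex


def build_cload_pairs_command(chromsizes_path, binsize, pairs_path, out_path):
    """Build an argument list for `cooler cload pairs`.

    Column numbers are 1-based and assume chrom1/pos1/chrom2/pos2 sit in
    columns 2-5 of the .pairs file (an assumption about the input layout).
    """
    bins = f"{chromsizes_path}:{binsize}"  # cooler's CHROMSIZES:BINSIZE bin spec
    return [
        "cooler", "cload", "pairs",
        "-c1", "2", "-p1", "3", "-c2", "4", "-p2", "5",
        bins, pairs_path, out_path,
    ]


# Hypothetical paths; 2048 is the bin size mentioned in this thread.
cmd = build_cload_pairs_command(
    "hg38.chrom.sizes", 2048, "sample.pairs.gz", "sample.2048.cool"
)
print(shlex.join(cmd))
```

With cooler installed, passing `cmd` to `subprocess.run(cmd, check=True)` would execute it. Note that this step only bins the contacts into a .cool file.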

@yardenmatok203
Author

Thank you!
Did you use the raw files from https://data.4dnucleome.org/files-fastq/4DNFITHCIIBX/ ?

@gfudenberg
Contributor

Yes, but for that experiment I'd just start with the .pairs file, as they were processed with distiller anyways.

@yardenmatok203
Author

Will this do the iterative correction (ICE) and binning?
Could you please tell me which parameters I should use so that the result is close to your processing?

Thank you,
Yarden

@gfudenberg
Contributor

Hi Yarden,
cooler handles the iterative correction (ICE) and binning. We used the defaults for human data; please see the cooler docs and manuscript for more info: https://cooler.readthedocs.io/en/latest/datamodel.html, https://academic.oup.com/bioinformatics/article-abstract/36/1/311/5530598
Hope that helps!
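
For readers unfamiliar with iterative correction, the idea can be sketched on a tiny dense matrix in plain Python. This is a toy illustration only, not cooler's implementation: cooler works on sparse data and applies additional bin filtering that is omitted here. The function name, tolerance, and example counts are all made up for the example.

```python
def iterative_correction(matrix, max_iters=200, tol=1e-5):
    """Toy ICE: rescale a symmetric matrix until all row sums are (nearly) equal.

    Returns (balanced, bias) with balanced[i][j] == matrix[i][j] / (bias[i] * bias[j]).
    """
    n = len(matrix)
    m = [[float(v) for v in row] for row in matrix]
    bias = [1.0] * n
    for _ in range(max_iters):
        sums = [sum(row) for row in m]
        nonzero = [s for s in sums if s > 0]
        if not nonzero:
            break
        mean = sum(nonzero) / len(nonzero)
        # Per-bin scale factor: row sum relative to the mean row sum.
        scale = [s / mean if s > 0 else 1.0 for s in sums]
        for i in range(n):
            for j in range(n):
                m[i][j] /= scale[i] * scale[j]
        bias = [b * s for b, s in zip(bias, scale)]
        # Stop once every row sum is within tol (relative) of the mean.
        if max(abs(s - 1.0) for s in scale) < tol:
            break
    return m, bias


# Hypothetical 3x3 symmetric contact counts.
counts = [
    [10.0, 4.0, 2.0],
    [4.0, 8.0, 3.0],
    [2.0, 3.0, 6.0],
]
balanced, bias = iterative_correction(counts)
row_sums = [sum(row) for row in balanced]
```

After the loop, the entries of `row_sums` should be nearly identical, which is the property balancing aims for.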

@yardenmatok203
Author

I want to use .hic data. What should I use?

Thanks a lot!
