What training and test folds were used for models in paper #33

Open
BDEvan5 opened this issue Dec 6, 2024 · 0 comments
BDEvan5 commented Dec 6, 2024

Hello Borzoi Team

Firstly, I am impressed with the quality of the repositories and tutorials. I was able to generate the data (both to replicate the paper and using the new 393kbp sequences) and to train the micro, mini, and full models. It is rare, and thus very impressive, to find open-source repositories that work out of the box. Thank you for your hard work and effort in making replication easy.

I am confused regarding which train/test folds were used to produce the models in the paper. The paper states:

We trained four models, each with distinct held out test and validation folds. (Page 14, Methods/Data)

However, the repo README, supported by the answer to issue #11 (Clarification of model fold data splits), says:

We trained a total of 4 model replicates with identical train, validation and test splits (test = fold3, validation = fold4 from sequences_human.bed.gz).

This appears to be contradictory, unless you trained four models per fold set and only released the models for (test = fold3, validation = fold4). If that is the case, which models did you use for the results reported in the paper?
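For context, the fold labels referenced above can be tallied directly from sequences_human.bed.gz. This is a minimal sketch, assuming the fold label (e.g. 'fold3') is the fourth BED column; the column index may need adjusting:

```python
import gzip
from collections import Counter

# Tally sequences per fold in sequences_human.bed.gz; this assumes the
# fold label (e.g. 'fold3') is the fourth whitespace-separated column.
counts = Counter()
with gzip.open("sequences_human.bed.gz", "rt") as bed:
    for line in bed:
        counts[line.split()[3]] += 1

print(dict(sorted(counts.items())))
```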

I downloaded the four models (links from the README, e.g. https://storage.googleapis.com/seqnn-share/borzoi/f0/model0_best.h5). I tested on the K562 RNA-seq tracks (ENCSR000AEL, plus and minus strands), processed with the Makefile for 524kbp sequences from the borzoi-paper repo.
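For reproducibility, this is roughly how I fetched the four replicates (a minimal sketch; the f0 URL is from the README, and the same path pattern is assumed for f1–f3):

```python
import urllib.request

# Download the four released model replicates. The f0 link is taken from
# the README; the same path pattern is assumed for f1-f3.
for rep in ["f0", "f1", "f2", "f3"]:
    url = f"https://storage.googleapis.com/seqnn-share/borzoi/{rep}/model0_best.h5"
    urllib.request.urlretrieve(url, f"{rep}_model0_best.h5")
```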

The image below shows my results from testing each model on each fold. I measure a Pearson correlation above 0.83 on every fold except folds 3 and 4, where the scores drop to roughly 0.6–0.7. This indicates that the README is correct: the four models were all trained on the same train/validation/test split.

[Image: Pearson correlation for each of the four model replicates, evaluated on each fold]
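For reference, the per-fold score was computed along these lines (a sketch with placeholder arrays; `preds` and `targets` stand in for one model's flattened predictions and the measured coverage for the two ENCSR000AEL tracks on one fold):

```python
import numpy as np
from scipy.stats import pearsonr

def fold_pearson(preds: np.ndarray, targets: np.ndarray) -> float:
    """Pearson r between predicted and measured coverage, flattened
    across sequences, bins, and the plus/minus tracks."""
    r, _ = pearsonr(preds.ravel(), targets.ravel())
    return r

# Placeholder arrays standing in for one model's predictions and the
# measured coverage on one fold (shape: sequences x bins x tracks).
rng = np.random.default_rng(0)
targets = rng.random((128, 6144, 2))
preds = targets + 0.1 * rng.standard_normal(targets.shape)

print(f"Pearson r: {fold_pearson(preds, targets):.3f}")
```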

Could you please help me understand which folds were used?
