Clarification of model fold data splits #11

Closed
adamyhe opened this issue Nov 11, 2023 · 2 comments

Comments

adamyhe commented Nov 11, 2023

For model f0, sequences labeled fold0 form the test set and fold1 form validation.
For model f1, sequences labeled fold1 form the test set and fold2 form validation. Etc

Originally posted by @davek44 in #1 (comment)

Hi, I just wanted to clarify the exact splits that were used for each of the model folds. My reading is that the test/val/train splits for each model are set up as:

f0: test=fold0, val=fold1, train=rest
f1: test=fold1, val=fold2, train=rest
f2: test=fold2, val=fold3, train=rest
f3: test=fold3, val=fold4, train=rest

Thanks!

Contributor

davek44 commented Nov 13, 2023

Yes, this is correct. The code segment that performs this split is at https://github.com/calico/basenji/blob/master/bin/basenji_train_folds.py#L397
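
For anyone who wants the rule spelled out, here is a minimal sketch of the pattern confirmed above: model f*i* tests on fold *i*, validates on the next fold, and trains on the rest. This is not the code from `basenji_train_folds.py`. The function name `fold_splits`, the total fold count of 8, and the wrap-around for the last model are all assumptions for illustration, since the thread only lists f0 through f3.

```python
def fold_splits(model_index, num_folds):
    """Return (test, valid, train) fold indices for model f<model_index>.

    Hypothetical helper illustrating the rule in this thread;
    not taken from the basenji source.
    """
    test = model_index
    # Assumption: the last model wraps around to fold0 for validation.
    # The thread only confirms the pattern for f0 through f3.
    valid = (model_index + 1) % num_folds
    # Every fold that is neither test nor validation is used for training.
    train = [f for f in range(num_folds) if f not in (test, valid)]
    return test, valid, train


NUM_FOLDS = 8  # assumed value; the thread does not state the fold count

for fi in range(4):
    test, valid, train = fold_splits(fi, NUM_FOLDS)
    train_names = ", ".join(f"fold{f}" for f in train)
    print(f"f{fi}: test=fold{test}, val=fold{valid}, train={train_names}")
```

Each fold label ends up in exactly one of the three sets for a given model, so a model's test sequences never appear in its training or validation data.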

davek44 closed this as completed Nov 13, 2023

Author

adamyhe commented Nov 13, 2023

Awesome. Thanks!
