Thanks for releasing your model, I'm sure I can speak for many in the community to say that it's looking hugely impressive!
To use and validate it, I'd like to know which regions of the genome are in each of the test/validation folds that were used to train the four models. For Enformer/Basenji, that was easily reconstructed from the helpfully shared sequences_[human|mouse].bed files in the public Google Storage bucket with 'supplementary' small files here, but I don't believe that's available for Borzoi yet?
Of course, it could be reconstructed from the large training dataset files, but given that I'm only looking for the genomic coordinates rather than the fully processed tracks corresponding to those, I was hoping there is an easier way.
Related to that, though, none of the files in the borzoi-paper bucket currently seem to be available, as requests return the following error:
<Error>
<Code>UserProjectMissing</Code>
<Message>
Bucket is a requester pays bucket but no user project provided.
</Message>
<Details>
Bucket is a requester pays bucket but no user project provided.
</Details>
</Error>
Although I'm hoping to not need those files at the moment, I figured I'd still mention it to let you know.
I'm sure the public release has left everyone swamped with questions coming in and issues popping up, so I appreciate any bit of time you are willing to spend on this!
Thanks for your interest! I added the sequences and targets files to a data/ directory in the GitHub repository, too, so you don't have to figure out GCP for that.
For model f0, sequences labeled fold0 form the test set and sequences labeled fold1 form the validation set.
For model f1, sequences labeled fold1 form the test set and sequences labeled fold2 form the validation set, and so on.
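For anyone wanting to script this, here is a minimal sketch of the fold assignment described above. It assumes the sequences file is a whitespace-separated BED file whose fourth column holds the fold label (fold0, fold1, ...); that layout is an assumption, not something confirmed in this thread. The function names `read_fold_regions` and `split_for_model` are hypothetical, and the "test = foldN, validation = fold(N+1)" rule is extrapolated from the two cases stated for f0 and f1.

```python
from collections import defaultdict


def read_fold_regions(bed_lines):
    """Group (chrom, start, end) tuples by the label in the fourth BED column.

    Assumes a layout of: chrom, start, end, fold label (an assumption).
    """
    regions = defaultdict(list)
    for line in bed_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        chrom, start, end, label = line.split()[:4]
        regions[label].append((chrom, int(start), int(end)))
    return regions


def split_for_model(regions, model_index):
    """Return (test, validation) region lists for model f<model_index>.

    Follows the stated pattern: foldN is the test set and fold(N+1) is the
    validation set. No wrap-around is applied, since the thread does not
    describe one.
    """
    test = regions.get(f"fold{model_index}", [])
    valid = regions.get(f"fold{model_index + 1}", [])
    return test, valid


if __name__ == "__main__":
    # Toy lines with made-up coordinates, only to show the call pattern.
    toy_lines = [
        "chr1\t0\t100\tfold0",
        "chr1\t100\t200\tfold1",
        "chr2\t0\t100\tfold2",
    ]
    test, valid = split_for_model(read_fold_regions(toy_lines), model_index=0)
    print("test regions:", test)
    print("validation regions:", valid)
```

With a real file, the lines could be passed in directly from an open file handle, for example `read_fold_regions(open(path))`, where `path` points to the sequences file in the data/ directory.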