Can I train OpenFlamingo without the LAION dataset? #215

Open
ElegantLin opened this issue Jul 4, 2023 · 3 comments · May be fixed by #261

Comments

@ElegantLin
Contributor

ElegantLin commented Jul 4, 2023

Thanks for your great work. I wonder whether we can train OpenFlamingo with the MMC4 dataset only, and I also wonder why the loss from the MMC4 dataset can be NaN.

Thanks for your explanation.

@i-gao
Collaborator

i-gao commented Jul 6, 2023

Hi, thanks for your question! The code is not currently configured like this, but it wouldn't be hard to implement (similar to #145). If you'd like to contribute a PR, this would make a great first issue!

Regarding NaN losses: great question. This bit of code originally sought to catch cases where the MMC4 sequence looks like "text text <image>". In this case, all labels are masked to -100, since there are no text tokens after the image tokens, which leaves the loss with no tokens to average over. We later updated data.py upstream to prevent these sequences from being sampled, so that issue is resolved. NaN cases may still occur when training at scales larger than 9B; we have not worked at those scales yet, so we have not observed them.
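
To illustrate the failure mode described above, here is a minimal pure-Python sketch, not the project's actual code. It assumes that image tokens and any text before the first image token are masked to -100, and that the loss is a mean over unmasked positions. The token id and the helper names (`mask_labels`, `has_supervised_tokens`, `mean_loss`) are hypothetical and chosen only for this example.

```python
import math

IGNORE_INDEX = -100  # label value skipped by the loss
IMAGE_TOKEN_ID = 7   # hypothetical id standing in for the image token


def mask_labels(input_ids, image_token_id=IMAGE_TOKEN_ID):
    """Copy input_ids into labels, masking image tokens and
    any text that appears before the first image token (assumed rule)."""
    labels = list(input_ids)
    seen_image = False
    for i, tok in enumerate(input_ids):
        if tok == image_token_id:
            seen_image = True
            labels[i] = IGNORE_INDEX
        elif not seen_image:
            labels[i] = IGNORE_INDEX
    return labels


def has_supervised_tokens(labels):
    """True if at least one position still contributes to the loss."""
    return any(y != IGNORE_INDEX for y in labels)


def mean_loss(per_token_losses, labels):
    """Average per-token losses over unmasked positions only."""
    kept = [l for l, y in zip(per_token_losses, labels) if y != IGNORE_INDEX]
    # With nothing kept, the mean is 0 / 0, reported here as NaN.
    return sum(kept) / len(kept) if kept else float("nan")


# Sequence with no text after the image token: every label is masked.
bad = mask_labels([1, 2, IMAGE_TOKEN_ID])
print(bad, has_supervised_tokens(bad), math.isnan(mean_loss([0.5, 0.4, 0.3], bad)))

# Sequence with text after the image token: some labels survive.
good = mask_labels([IMAGE_TOKEN_ID, 1, 2])
print(good, has_supervised_tokens(good), mean_loss([0.5, 0.4, 0.3], good))
```

A check like `has_supervised_tokens` could be used to drop such sequences before they are sampled, which is the spirit of the data.py fix mentioned above, though the upstream implementation may differ.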

@ElegantLin
Contributor Author

ElegantLin commented Jul 8, 2023

Thanks for your kind reply. I am planning to contribute a PR to make this project more complete :). I will close this issue after I finish the PR.

@YerongLi

YerongLi commented Jul 18, 2023

Good point, we need to train on a smaller dataset. I wish we could get an example workflow.

@anas-awadalla anas-awadalla linked a pull request Sep 19, 2023 that will close this issue