Gibberish text generation after converting to Huggingface. #712
Comments
Hey! Looking into this to see if it reproduces on my end!
Oh, @kanwatchara-k, would you be willing to share the exact command you ran for https://github.com/EleutherAI/gpt-neox/pull/701/files#diff-fff7e2d700e82c3e6027c575c1cd96830ba839ff44fa6b82abf2cb21b029d55c? There’s a chance that your discrepancy is due to my current script not accepting multiple config files like the ones you used for training.
@haileyschoelkopf Of course. Though I did make some changes to the code (the modified version is here). Specifically, I hard-coded the vocab file path and the tokenizer type, and I combined the two config files in the code (with their paths also hard-coded). With the modified code, I just ran the command. Thanks!
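For reference, here is a rough sketch of that kind of modification to the PR #701 conversion script. The actual command and paths are not shown in the thread, so the file names and keys below are placeholders, not the author's real changes:

```python
# Rough sketch (not the author's actual script changes): merge two GPT-NeoX
# YAML config files into one dict and hard-code the tokenizer settings,
# roughly as described above. Paths and keys are placeholders.
import yaml

config_paths = ["local_setup.yml", "my_model.yml"]  # hypothetical config pair
neox_config = {}
for path in config_paths:
    with open(path) as f:
        # later files override earlier ones on duplicate keys
        neox_config.update(yaml.safe_load(f))

# Hard-coded values standing in for what the conversion script would
# otherwise read from the command line / config:
neox_config["vocab-file"] = "/path/to/tokenizer.json"
neox_config["tokenizer-type"] = "HFTokenizer"
```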
Thank you!! I’ll try this to convert your checkpoint as soon as I can, hopefully later today or early tomorrow!
Still working on finding the possible issue here--I'll keep you posted!
@kanwatchara-k so sorry for the delay on my end. I had a model with the same problem, and what fixed the issue for me was:
The issue here was that your model and mine, which wouldn't convert properly, use a layer setup different from GPT-NeoX-20B's, which is controlled by
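(The option name is cut off in the comment above, so the following is an assumption rather than the author's confirmed fix: the layer arrangement that distinguishes GPT-NeoX-20B from many other NeoX-trained models is the parallel, GPT-J-style residual, which the NeoX config exposes as a `gpt_j_residual`-style option and the Hugging Face `GPTNeoXConfig` as `use_parallel_residual`. A minimal sketch of carrying that flag over during conversion, under those assumptions:)

```python
# Minimal sketch, NOT the author's confirmed fix: the comment above is cut
# off, so the exact option is an assumption. GPT-NeoX-20B uses a parallel
# (GPT-J-style) residual; a model trained without it needs the Hugging Face
# config to say so, otherwise generation comes out as gibberish.
import yaml
from transformers import GPTNeoXConfig

with open("my_model.yml") as f:  # hypothetical merged NeoX training config
    neox_cfg = yaml.safe_load(f)

hf_config = GPTNeoXConfig(
    hidden_size=neox_cfg["hidden-size"],
    num_hidden_layers=neox_cfg["num-layers"],
    num_attention_heads=neox_cfg["num-attention-heads"],
    # Assumed mapping: NeoX's gpt-j-residual option (key style may be hyphen
    # or underscore depending on the config file) -> HF use_parallel_residual.
    use_parallel_residual=neox_cfg.get("gpt-j-residual", False),
)
```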
@haileyschoelkopf Thank you so much! It works properly now!
Original issue description:
Hi,
I am having trouble converting my checkpoints to Hugging Face format. The model works fine when run with DeepSpeed + Megatron (example).
However, the output becomes gibberish once the model is converted to Hugging Face format (example).
I have tried multiple conversion scripts so far (e.g., this and this) without success.
All the related files (weights, config, and tokenizer) are in my Google Drive.
Any help is greatly appreciated!
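For anyone hitting the same symptom, a quick way to reproduce the check described above (the directory name is a placeholder for wherever the converted weights live):

```python
# Quick sanity check on a converted checkpoint: load it with transformers and
# generate a short greedy continuation. "converted-neox-hf" is a placeholder
# for the conversion script's output directory.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_dir = "converted-neox-hf"
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForCausalLM.from_pretrained(model_dir)

inputs = tokenizer("The capital of France is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
# A correct conversion continues coherently; a broken one prints repetitive
# or nonsensical tokens, as described above.
```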