Adjust default values for batching #934
Conversation
pipeline/translate/decoder.yml (outdated)
maxi-batch-sort: src
precision: float16
One downside of moving it to this config is that some GPUs on the Berlin cluster, which we plan to include soon, don't support half-precision decoding. If I remember correctly, it fails silently and produces bad decoding results, so it's quite dangerous. This setting should probably be propagated through the Marian args, or the script should have a detection mechanism.
In fact, all of these values were tuned for those smaller GPUs, so the new ones will likely OOM there. We should either keep the safe values and override them in the production training config template, or add a TODO and an issue to refactor this so that a Python script adjusts the values based on the GPU model.
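As a rough sketch of that detection idea (not something this PR implements), a helper could gate float16 on the GPU's compute capability. The use of PyTorch and the Volta-or-newer (SM 7.0) threshold are assumptions:

    import torch  # assumption: PyTorch is available; the pipeline may not actually depend on it

    def gpu_supports_float16(threshold=(7, 0)) -> bool:
        """Return True only if every visible GPU meets the compute-capability
        threshold (assumed here to be Volta-class, SM 7.0, or newer)."""
        if not torch.cuda.is_available():
            return False
        capabilities = [
            torch.cuda.get_device_capability(i)
            for i in range(torch.cuda.device_count())
        ]
        return all(cap >= threshold for cap in capabilities)

    # Fall back to the safe precision when any GPU is below the threshold.
    precision = "float16" if gpu_supports_float16() else "float32"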
I added a TODO for checking that the values are safe: #936. The issue is that decoder.yml is the only place to modify these values, since the config values aren't forwarded; see #933. My CTranslate2 patch stack fixes that piece of it, but I don't want to put it up for review until the experiment results come back.
At the very least, in the short term we should have the config working for our current execution environment.
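Since decoder.yml is currently the only place these settings take effect, one interim option (a sketch, not what this PR does) is to patch the file from a small Python step before decoding, so the checked-in defaults stay safe and only known-good environments get the aggressive values. PyYAML and the file path are assumptions:

    import yaml  # PyYAML, assumed to be available in the pipeline environment

    def override_decoder_config(path, overrides):
        """Merge a dict of Marian decoder options into an existing decoder.yml."""
        with open(path) as f:
            config = yaml.safe_load(f) or {}
        config.update(overrides)
        with open(path, "w") as f:
            yaml.safe_dump(config, f)

    # Only the two options visible in the diff above; the batching values
    # themselves would be chosen per GPU model and are omitted here.
    override_decoder_config(
        "pipeline/translate/decoder.yml",
        {"maxi-batch-sort": "src", "precision": "float16"},
    )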
I checked the Marian code: it matters for training, but it behaves the same when only translating. I'm happy to switch to it, though.
This results in a 2.2x speed-up for train-mono tasks according to my testing in #931. There's also a bug with loading Marian args (#933), so I'm putting the values in the decoder.yml file as well.