Does pipeline parallelism support dynamically changing batch_size and num_microbatches? #2325
Unanswered
yhcc
asked this question in
Community | Q&A
Replies: 1 comment 3 replies
-
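This is not really supported at the moment, because PyTorch's dataloader itself does not make it easy to do. If you have a good idea for how to change this, you are welcome to submit the relevant code as a PR to our repository. Thank you very much!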
3 replies
-
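Several recent large-model papers mention the training trick of gradually increasing the batch size. I think the most direct way to do this in CAI would be to have the dataloader dynamically change the size of the batches it outputs as training progresses, and at the same time have the schedule in the engine dynamically update num_microbatches (because the model is fairly large, the forward batch size generally has to stay at 1 for the whole training run). Could this direct modification introduce potential bugs?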
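To make the idea above concrete, here is a minimal, framework-independent sketch of the bookkeeping involved: a ramp schedule for the global batch size, and the matching num_microbatches when the forward (micro) batch size stays fixed at 1. All names here (`ramp_batch_size`, `num_microbatches_for`, `iter_batches`) and all default values are hypothetical and chosen only for illustration; none of them are part of the Colossal-AI or PyTorch APIs, and how the updated value would actually be passed to the engine's schedule is left open.

```python
def ramp_batch_size(step, start=16, end=256, increment=16, ramp_steps=1000):
    """Hypothetical linear ramp of the global batch size.

    Grows from `start` to `end` in multiples of `increment` over
    `ramp_steps` training steps, then stays at `end`.
    """
    if step >= ramp_steps:
        return end
    num_increments = (end - start) // increment
    # Integer arithmetic avoids floating-point rounding at the boundaries.
    return start + (step * num_increments // ramp_steps) * increment


def num_microbatches_for(global_batch_size, micro_batch_size=1):
    """Number of microbatches when each forward pass sees `micro_batch_size` samples."""
    if global_batch_size % micro_batch_size != 0:
        raise ValueError("global batch size must be divisible by micro batch size")
    return global_batch_size // micro_batch_size


def iter_batches(samples, total_steps, **ramp_kwargs):
    """Yield (step, batch, num_microbatches) with a batch size that grows over time.

    `samples` is any sliceable sequence; this stands in for a dataloader
    and is an assumption of this sketch, not a real dataloader API.
    """
    pos = 0
    for step in range(total_steps):
        batch_size = ramp_batch_size(step, **ramp_kwargs)
        batch = samples[pos:pos + batch_size]
        if len(batch) < batch_size:
            break  # not enough data left for a full batch
        pos += batch_size
        yield step, batch, num_microbatches_for(batch_size)


if __name__ == "__main__":
    data = list(range(100))
    for step, batch, n_micro in iter_batches(
        data, total_steps=3, start=2, end=6, increment=2, ramp_steps=2
    ):
        print(step, len(batch), n_micro)
```

One thing a sketch like this leaves out, and which may be where bugs would show up, is anything that depends on the batch size being constant, such as the number of steps per epoch, learning-rate schedules defined per step, or buffers sized for a fixed number of microbatches.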