Is there a problem with the setting in VocabParallelClassifier1D? #2256
Unanswered
yhcc
asked this question in
Community | Q&A
Replies: 1 comment
-
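Thank you very much for the feedback. This does appear to be a real problem. gather_output was originally intended mainly for the second-to-last linear layer in models that use two consecutive linear layers after the transformer to produce the logits (BERT, for example). It ensures that this layer's output tensor is not parallel (in 1D, the column linear at this position would otherwise always output a parallel tensor). We will improve the readability of the related interfaces as soon as possible, and your suggestions are welcome. Thank you.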
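To make the effect of gather_output concrete, here is a minimal pure-Python sketch of a 1D column-parallel linear layer. It only simulates the ranks inside one process, with no real communication. Every function name in it (`matmul`, `split_columns`, `column_parallel_linear`) is hypothetical and is not part of ColossalAI's API. The assumption, based on the explanation above, is that without gathering each rank holds only its slice of the output's last dimension, and with gathering every rank holds the full output.

```python
# Toy simulation of a 1D column-parallel linear layer (no real communication).
# All names here are hypothetical and are not ColossalAI APIs.

def matmul(x, w):
    """Multiply a (batch x in) matrix by an (in x out) matrix, both nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*w)] for row in x]


def split_columns(w, world_size):
    """Split the weight's output columns evenly across the simulated ranks."""
    out_features = len(w[0])
    assert out_features % world_size == 0, "columns must divide evenly"
    chunk = out_features // world_size
    return [[row[r * chunk:(r + 1) * chunk] for row in w] for r in range(world_size)]


def column_parallel_linear(x, w, world_size, gather_output):
    """Return one output per simulated rank.

    gather_output=False: each rank keeps only its slice of the last dimension.
    gather_output=True: the slices are concatenated, so every rank holds the
    full (non-parallel) output.
    """
    shards = [matmul(x, w_r) for w_r in split_columns(w, world_size)]
    if not gather_output:
        return shards
    full = [sum((shard[i] for shard in shards), []) for i in range(len(x))]
    return [full for _ in range(world_size)]


x = [[1.0, 2.0]]
w = [[1.0, 0.0, 2.0, 1.0],
     [0.0, 1.0, 1.0, 3.0]]

sharded = column_parallel_linear(x, w, world_size=2, gather_output=False)
gathered = column_parallel_linear(x, w, world_size=2, gather_output=True)

print(sharded)   # rank 0 and rank 1 each hold half of the output columns
print(gathered)  # both ranks hold the same full output
```

In this sketch the sharded result is what a vocab-parallel loss would consume directly, while the gathered result is what a following non-parallel layer would expect. That is why the two settings should not be mixed up.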
-
I noticed that VocabParallelClassifier1D sets the following:
ColossalAI/colossalai/nn/layer/parallel_1d/layers.py
Line 345 in c8c7910
At the same time, the loss computation appears to use this environment variable:
ColossalAI/colossalai/nn/loss/__init__.py
Line 32 in c8c7910
However, VocabParallelClassifier1D also has this parameter:
ColossalAI/colossalai/nn/layer/parallel_1d/layers.py
Line 316 in c8c7910