Pull requests: mlfoundations/open_lm

Pull requests list

Avoid failing on fname not in keys
#302 opened Aug 22, 2024 by jmercat
Update strip to rstrip
#300 opened Aug 19, 2024 by Yu-Shi
Faster eval2
#299 opened Jul 31, 2024 by jmercat
Faster eval
#298 opened Jul 31, 2024 by jmercat
Adjust Attention Mechanism and Dataset Handling
#295 opened Jul 25, 2024 by OLMResearch
Build wheel
#294 opened Jul 23, 2024 by jmercat
fix: adopt latest state_dict processing
#286 opened Jun 12, 2024 by ruixin31
fix for EOS/PAD tokens when not gpt-neox
#283 opened May 24, 2024 by jeffreywpli
Parameter input rotary-freq
#263 opened Apr 30, 2024 by jmercat
Add loss like Rho-1
#260 opened Apr 27, 2024 by GeorgiosSmyrnis
Add dMoE
#257 opened Apr 25, 2024 by Muennighoff
Checkpoint skipping.
#256 opened Apr 21, 2024 by GeorgiosSmyrnis
Mamba update
#254 opened Apr 18, 2024 by jmercat
HF Integration
#248 opened Apr 12, 2024 by sedrick-keh-tri
Bug fix to import Llama in OpenLM.
#245 opened Apr 11, 2024 by kushal-tri
adding cosine rewarmed scheduler
#243 opened Apr 10, 2024 by Tomerporian
Change GeGLU and add MQA.
#239 opened Mar 31, 2024 by GeorgiosSmyrnis
Allow mixing for pretokenized data.
#230 opened Mar 8, 2024 by GeorgiosSmyrnis
Adding depth scale init support
#225 opened Mar 4, 2024 by kalyani7195
[WIP] Adding support for FP8 training
#218 opened Feb 21, 2024 by shahromil16
[WIP] Attention across documents.
#213 opened Jan 31, 2024 by GeorgiosSmyrnis