[FSDP1] reduce GPU memory usage from 78G to 23G (#843)
weifengpy authored Apr 23, 2024
1 parent 6d42e9a commit bec7bab
Showing 1 changed file with 2 additions and 1 deletion.
3 changes: 2 additions & 1 deletion recipes/lora_finetune_distributed.py
@@ -4,6 +4,7 @@
 # This source code is licensed under the BSD-style license found in the
 # LICENSE file in the root directory of this source tree.
 
+import os
 import sys
 import time
 
@@ -600,7 +601,7 @@ def recipe_main(cfg: DictConfig) -> None:
             "Distributed finetune recipe should be run via a distributed launcher."
             "If using tune CLI, please specify --nnodes 1 and --nproc_per_node [num_gpus]"
         )
-
+    os.environ["TORCH_NCCL_AVOID_RECORD_STREAMS"] = "1"
     init_process_group(backend="gloo" if cfg.device == "cpu" else "nccl")
 
     config.log_config(recipe_name="LoRAFinetuneRecipeDistributed", cfg=cfg)
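For context, the entire change is the two added lines in the diff above: import os and set TORCH_NCCL_AVOID_RECORD_STREAMS=1 before the process group is initialized. Below is a minimal standalone sketch of the same pattern; the setup_distributed helper and the torchrun invocation are illustrative only and are not part of the commit.

import os

import torch
from torch.distributed import init_process_group


def setup_distributed(device: str) -> None:
    # Ask NCCL collectives to avoid Tensor.record_stream() bookkeeping, which can
    # delay freeing of communication buffers and inflate peak GPU memory. The
    # variable appears to be read when the NCCL process group is created, so it
    # has to be set before init_process_group(), as the commit does.
    os.environ["TORCH_NCCL_AVOID_RECORD_STREAMS"] = "1"
    init_process_group(backend="gloo" if device == "cpu" else "nccl")


if __name__ == "__main__":
    # Intended to be launched with a distributed launcher, e.g.
    #   torchrun --nnodes 1 --nproc_per_node <num_gpus> this_script.py
    setup_distributed("cuda" if torch.cuda.is_available() else "cpu")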
