From c4e22e767e9fabf006265ee09a4fd925fc9646e2 Mon Sep 17 00:00:00 2001 From: Ivan-267 <61947090+Ivan-267@users.noreply.github.com> Date: Sun, 23 Jul 2023 16:39:37 +0200 Subject: [PATCH 1/3] Update ADV_STABLE_BASELINES_3.md Updated the docs for the sb3 example. Removed the code snippet since the example file is already linked, it may be easier than keeping the code snippet updated with each change to the example. --- docs/ADV_STABLE_BASELINES_3.md | 76 ++++++++++++++++++++++------------ 1 file changed, 50 insertions(+), 26 deletions(-) diff --git a/docs/ADV_STABLE_BASELINES_3.md b/docs/ADV_STABLE_BASELINES_3.md index 5199f048..f3e94138 100644 --- a/docs/ADV_STABLE_BASELINES_3.md +++ b/docs/ADV_STABLE_BASELINES_3.md @@ -46,42 +46,66 @@ While the default options for sb3 work reasonably well. You may be interested in We recommend taking the [sb3 example](https://github.com/edbeeching/godot_rl_agents/blob/main/examples/stable_baselines3_example.py) and modifying to match your needs. -This example exposes more parameter for the user to configure, such as `--speedup` to run the environment faster than realtime and the `--n_parallel` to launch several instances of the game executable in order to accelerate training (not available for in-editor training). +The example exposes more parameters for the user to configure, such as `--speedup` to run the environment faster than realtime and the `--n_parallel` to launch several instances of the game executable in order to accelerate training (not available for in-editor training). +To use the example script, first move to the location where the downloaded script is in the console/terminal, and then try some of the example use cases below: -```python -import argparse - -from godot_rl.wrappers.stable_baselines_wrapper import StableBaselinesGodotEnv -from stable_baselines3 import PPO +### Train a model in editor: +```bash +python .\stable_baselines3_example.py +``` -# To download the env source and binary: -# 1. 
gdrl.env_from_hub -r edbeeching/godot_rl_BallChase -# 2. chmod +x examples/godot_rl_BallChase/bin/BallChase.x86_64 +### Train an exported environment: +```bash +python .\stable_baselines3_example.py --env_path=path_to_executable +``` +Note that the exported environment will not be rendered in order to accelerate training. +If you want to display it, add the `--viz` argument. +### Train an exported environment using 4 environment processes: +```bash +python .\stable_baselines3_example.py --env_path=path_to_executable --n_parallel=4 +``` -parser = argparse.ArgumentParser(allow_abbrev=False) -parser.add_argument( - "--env_path", - # default="envs/example_envs/builds/JumperHard/jumper_hard.x86_64", - default=None, - type=str, - help="The Godot binary to use, do not include for in editor training", -) +### Train an exported environment using 8 times speedup: +```bash +python .\stable_baselines3_example.py --env_path=path_to_executable --speedup=8 +``` -parser.add_argument("--speedup", default=1, type=int, help="whether to speed up the physics in the env") -parser.add_argument("--n_parallel", default=1, type=int, help="whether to speed up the physics in the env") +### Set an experiment directory and name: +You can optionally set an experiment directory and name to override the default. When saving checkpoints, you need to use a unique directory or name for each run (more about that below). +```bash +python .\stable_baselines3_example.py --experiment_dir="experiments" --experiment_name="experiment1" +``` -args, extras = parser.parse_known_args() +### Train a model for 100_000 steps then save and export the model +The exported .onnx model can be used by the Godot sync node to run inference from Godot directly, while the saved .zip model can be used to resume training later or run inference from the example script by adding `--inference`. 
+```bash +python .\stable_baselines3_example.py --timesteps=100_000 --onnx_export_path=model.onnx --save_model_path=model.zip +``` +### Resume training from a saved .zip model +This will load the previously saved model.zip, and resume training for another 100 000 steps, so the saved model will have been trained for 200 000 steps in total. +Note that the console log will display the `total_timesteps` for the last training session only, so it will show `100000` instead of `200000`. +```bash +python .\stable_baselines3_example.py --timesteps=100_000 --save_model_path=model_200_000_total_steps.zip --resume_model_path=model.zip +``` -env = StableBaselinesGodotEnv(env_path=args.env_path, show_window=True, n_parallel=args.n_parallel, speedup=args.speedup) +### Save periodic checkpoints +You can save periodic checkpoints and later resume training from any checkpoint using the same CL argument as above, or run inference on any checkpoint just like with the saved model. +Note that you need to use a unique `experiment_name` or `experiment_dir` for each run so that checkpoints from one run won't overwrite checkpoints from another run. +Alternatively, you can remove the folder containing checkpoints from a previous run if you don't need them anymore. -model = PPO("MultiInputPolicy", env, ent_coef=0.0001, verbose=2, n_steps=32, tensorboard_log="logs/sb3") -model.learn(200000) +E.g. train for a total of 2 000 000 steps with checkpoints saved at every 50 000 steps: -print("closing env") -env.close() +```bash +python .\stable_baselines3_example.py --experiment_name=experiment1 --timesteps=2_000_000 --save_checkpoint_frequency=50_000 +``` +Checkpoints will be saved to `logs\sb3\experiment1_checkpoints` in the above case, the location is affected by `--experiment_dir` and `--experiment_name`. 
-``` \ No newline at end of file +### Run inference on a saved model for 100_000 steps +You can run inference on a model that was previously saved using either `--save_model_path` or `--save_checkpoint_frequency`. +```bash +python .\stable_baselines3_example.py --timesteps=100_000 --resume_model_path=model.zip --inference +``` From a1ac0ec559bd83e319847512c47238c63a9c3963 Mon Sep 17 00:00:00 2001 From: Ivan-267 <61947090+Ivan-267@users.noreply.github.com> Date: Sun, 23 Jul 2023 21:18:48 +0200 Subject: [PATCH 2/3] Removes powershell specific ./ from example CL arguments --- docs/ADV_STABLE_BASELINES_3.md | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/docs/ADV_STABLE_BASELINES_3.md b/docs/ADV_STABLE_BASELINES_3.md index f3e94138..9e5a95ce 100644 --- a/docs/ADV_STABLE_BASELINES_3.md +++ b/docs/ADV_STABLE_BASELINES_3.md @@ -52,43 +52,43 @@ To use the example script, first move to the location where the downloaded scrip ### Train a model in editor: ```bash -python .\stable_baselines3_example.py +python stable_baselines3_example.py ``` ### Train an exported environment: ```bash -python .\stable_baselines3_example.py --env_path=path_to_executable +python stable_baselines3_example.py --env_path=path_to_executable ``` Note that the exported environment will not be rendered in order to accelerate training. If you want to display it, add the `--viz` argument. 
### Train an exported environment using 4 environment processes: ```bash -python .\stable_baselines3_example.py --env_path=path_to_executable --n_parallel=4 +python stable_baselines3_example.py --env_path=path_to_executable --n_parallel=4 ``` ### Train an exported environment using 8 times speedup: ```bash -python .\stable_baselines3_example.py --env_path=path_to_executable --speedup=8 +python stable_baselines3_example.py --env_path=path_to_executable --speedup=8 ``` ### Set an experiment directory and name: You can optionally set an experiment directory and name to override the default. When saving checkpoints, you need to use a unique directory or name for each run (more about that below). ```bash -python .\stable_baselines3_example.py --experiment_dir="experiments" --experiment_name="experiment1" +python stable_baselines3_example.py --experiment_dir="experiments" --experiment_name="experiment1" ``` ### Train a model for 100_000 steps then save and export the model The exported .onnx model can be used by the Godot sync node to run inference from Godot directly, while the saved .zip model can be used to resume training later or run inference from the example script by adding `--inference`. ```bash -python .\stable_baselines3_example.py --timesteps=100_000 --onnx_export_path=model.onnx --save_model_path=model.zip +python stable_baselines3_example.py --timesteps=100_000 --onnx_export_path=model.onnx --save_model_path=model.zip ``` ### Resume training from a saved .zip model This will load the previously saved model.zip, and resume training for another 100 000 steps, so the saved model will have been trained for 200 000 steps in total. Note that the console log will display the `total_timesteps` for the last training session only, so it will show `100000` instead of `200000`. 
```bash -python .\stable_baselines3_example.py --timesteps=100_000 --save_model_path=model_200_000_total_steps.zip --resume_model_path=model.zip +python stable_baselines3_example.py --timesteps=100_000 --save_model_path=model_200_000_total_steps.zip --resume_model_path=model.zip ``` ### Save periodic checkpoints @@ -99,7 +99,7 @@ Alternatively, you can remove the folder containing checkpoints from a previous E.g. train for a total of 2 000 000 steps with checkpoints saved at every 50 000 steps: ```bash -python .\stable_baselines3_example.py --experiment_name=experiment1 --timesteps=2_000_000 --save_checkpoint_frequency=50_000 +python stable_baselines3_example.py --experiment_name=experiment1 --timesteps=2_000_000 --save_checkpoint_frequency=50_000 ``` Checkpoints will be saved to `logs\sb3\experiment1_checkpoints` in the above case, the location is affected by `--experiment_dir` and `--experiment_name`. @@ -107,5 +107,5 @@ Checkpoints will be saved to `logs\sb3\experiment1_checkpoints` in the above cas ### Run inference on a saved model for 100_000 steps You can run inference on a model that was previously saved using either `--save_model_path` or `--save_checkpoint_frequency`. 
```bash -python .\stable_baselines3_example.py --timesteps=100_000 --resume_model_path=model.zip --inference +python stable_baselines3_example.py --timesteps=100_000 --resume_model_path=model.zip --inference ``` From 1caf770f3b1f9045d105a83ce9206e5bb4f5a719 Mon Sep 17 00:00:00 2001 From: Ivan-267 <61947090+Ivan-267@users.noreply.github.com> Date: Sun, 23 Jul 2023 22:25:29 +0200 Subject: [PATCH 3/3] Update ADV_STABLE_BASELINES_3.md small correction --- docs/ADV_STABLE_BASELINES_3.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/ADV_STABLE_BASELINES_3.md b/docs/ADV_STABLE_BASELINES_3.md index 9e5a95ce..74579d1d 100644 --- a/docs/ADV_STABLE_BASELINES_3.md +++ b/docs/ADV_STABLE_BASELINES_3.md @@ -55,7 +55,7 @@ To use the example script, first move to the location where the downloaded scrip python stable_baselines3_example.py ``` -### Train an exported environment: +### Train a model using an exported environment: ```bash python stable_baselines3_example.py --env_path=path_to_executable ```