Update ADV_STABLE_BASELINES_3.md #136

Merged on Jul 23, 2023 (3 commits). Changes shown from 1 commit.
76 changes: 50 additions & 26 deletions docs/ADV_STABLE_BASELINES_3.md
@@ -46,42 +46,66 @@ While the default options for sb3 work reasonably well. You may be interested in

We recommend taking the [sb3 example](https://github.com/edbeeching/godot_rl_agents/blob/main/examples/stable_baselines3_example.py) and modifying it to match your needs.

This example exposes more parameter for the user to configure, such as `--speedup` to run the environment faster than realtime and the `--n_parallel` to launch several instances of the game executable in order to accelerate training (not available for in-editor training).
The example exposes more parameters for the user to configure, such as `--speedup` to run the environment faster than realtime and `--n_parallel` to launch several instances of the game executable in order to accelerate training (not available for in-editor training).

To use the example script, first navigate in the console/terminal to the directory containing the downloaded script, and then try some of the example use cases below:

```python
import argparse

from godot_rl.wrappers.stable_baselines_wrapper import StableBaselinesGodotEnv
from stable_baselines3 import PPO
### Train a model in editor:
```bash
python .\stable_baselines3_example.py
```

# To download the env source and binary:
# 1. gdrl.env_from_hub -r edbeeching/godot_rl_BallChase
# 2. chmod +x examples/godot_rl_BallChase/bin/BallChase.x86_64
### Train an exported environment:
```bash
python .\stable_baselines3_example.py --env_path=path_to_executable
```
Note that the exported environment will not be rendered in order to accelerate training.
If you want to display it, add the `--viz` argument.

### Train an exported environment using 4 environment processes:
```bash
python .\stable_baselines3_example.py --env_path=path_to_executable --n_parallel=4
```

parser = argparse.ArgumentParser(allow_abbrev=False)
parser.add_argument(
"--env_path",
# default="envs/example_envs/builds/JumperHard/jumper_hard.x86_64",
default=None,
type=str,
help="The Godot binary to use, do not include for in editor training",
)
### Train an exported environment using 8 times speedup:
```bash
python .\stable_baselines3_example.py --env_path=path_to_executable --speedup=8
@edbeeching (Owner) commented on Jul 23, 2023:

Do you need to include the .\ in front of every command?

```
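
The four flags introduced so far (`--env_path`, `--viz`, `--n_parallel`, `--speedup`) could be declared roughly as in the sketch below. This is an illustration, not the contents of the actual example script: the `build_parser` helper is hypothetical, the help strings are paraphrased from the text above, and treating `--viz` as a boolean switch is an assumption.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the flags discussed above (illustrative, not the real script)."""
    parser = argparse.ArgumentParser(allow_abbrev=False)
    # Leaving --env_path unset corresponds to in-editor training.
    parser.add_argument("--env_path", default=None, type=str,
                        help="Path to the exported Godot executable")
    # Assumption: --viz is a plain on/off switch that enables rendering.
    parser.add_argument("--viz", action="store_true",
                        help="Render the exported environment while training")
    parser.add_argument("--speedup", default=1, type=int,
                        help="Physics speed multiplier relative to realtime")
    parser.add_argument("--n_parallel", default=1, type=int,
                        help="Number of game executable instances to launch")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch is easy to try.
args = build_parser().parse_args(
    ["--env_path=path_to_executable", "--n_parallel=4", "--speedup=8"]
)
print(args.env_path, args.n_parallel, args.speedup, args.viz)
```

Giving each flag its own help string keeps `--help` output unambiguous when several options affect training speed.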

parser.add_argument("--speedup", default=1, type=int, help="whether to speed up the physics in the env")
parser.add_argument("--n_parallel", default=1, type=int, help="whether to speed up the physics in the env")
### Set an experiment directory and name:
You can optionally set an experiment directory and name to override the default. When saving checkpoints, you need to use a unique directory or name for each run (more about that below).
```bash
python .\stable_baselines3_example.py --experiment_dir="experiments" --experiment_name="experiment1"
```

args, extras = parser.parse_known_args()
### Train a model for 100_000 steps then save and export the model
The exported .onnx model can be used by the Godot sync node to run inference from Godot directly, while the saved .zip model can be used to resume training later or run inference from the example script by adding `--inference`.
```bash
python .\stable_baselines3_example.py --timesteps=100_000 --onnx_export_path=model.onnx --save_model_path=model.zip
```

### Resume training from a saved .zip model
This will load the previously saved model.zip, and resume training for another 100 000 steps, so the saved model will have been trained for 200 000 steps in total.
Note that the console log will display the `total_timesteps` for the last training session only, so it will show `100000` instead of `200000`.
```bash
python .\stable_baselines3_example.py --timesteps=100_000 --save_model_path=model_200_000_total_steps.zip --resume_model_path=model.zip
```
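
The step accounting described above can be checked with a few lines of Python. This is only a bookkeeping sketch: `summarize_sessions` is a hypothetical helper, and it does not call Stable Baselines3 or read any saved model.

```python
def summarize_sessions(session_timesteps):
    """Return (steps shown in the last session's log, cumulative steps trained)."""
    if not session_timesteps:
        raise ValueError("at least one training session is required")
    return session_timesteps[-1], sum(session_timesteps)


# First run used --timesteps=100_000; the resumed run used --timesteps=100_000 again.
logged, total = summarize_sessions([100_000, 100_000])
print(f"log shows {logged}, model has trained for {total}")
```

Encoding the cumulative count in the file name, as `model_200_000_total_steps.zip` does above, is one way to keep track of it across sessions.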

env = StableBaselinesGodotEnv(env_path=args.env_path, show_window=True, n_parallel=args.n_parallel, speedup=args.speedup)
### Save periodic checkpoints
You can save periodic checkpoints and later resume training from any checkpoint using the same command-line argument as above (`--resume_model_path`), or run inference on any checkpoint just like with the saved model.
Note that you need to use a unique `experiment_name` or `experiment_dir` for each run so that checkpoints from one run won't overwrite checkpoints from another run.
Alternatively, you can remove the folder containing checkpoints from a previous run if you don't need them anymore.

model = PPO("MultiInputPolicy", env, ent_coef=0.0001, verbose=2, n_steps=32, tensorboard_log="logs/sb3")
model.learn(200000)
For example, train for a total of 2 000 000 steps, saving a checkpoint every 50 000 steps:

print("closing env")
env.close()
```bash
python .\stable_baselines3_example.py --experiment_name=experiment1 --timesteps=2_000_000 --save_checkpoint_frequency=50_000
```

In the above case, checkpoints will be saved to `logs\sb3\experiment1_checkpoints`. The location depends on `--experiment_dir` and `--experiment_name`.
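
Before starting a run, you can check whether a checkpoint folder is already in use. The sketch below assumes the `<experiment_dir>/<experiment_name>_checkpoints` naming pattern and the `logs/sb3` default directory, both inferred from the single example path above rather than taken from the script itself. The helpers `checkpoint_dir` and `has_previous_checkpoints` are hypothetical.

```python
from pathlib import Path


def checkpoint_dir(experiment_name: str, experiment_dir: str = "logs/sb3") -> Path:
    """Build the checkpoint folder path (naming pattern inferred from the example)."""
    return Path(experiment_dir) / f"{experiment_name}_checkpoints"


def has_previous_checkpoints(folder: Path) -> bool:
    """Return True if the folder exists and already contains at least one entry."""
    return folder.is_dir() and any(folder.iterdir())


target = checkpoint_dir("experiment1")
if has_previous_checkpoints(target):
    print(f"{target} is not empty: pick another name or remove the folder")
else:
    print(f"{target} is free to use")
```

Running a check like this before training avoids overwriting checkpoints from an earlier run that reused the same experiment name.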

```
### Run inference on a saved model for 100_000 steps
You can run inference on a model that was previously saved using either `--save_model_path` or `--save_checkpoint_frequency`.
```bash
python .\stable_baselines3_example.py --timesteps=100_000 --resume_model_path=model.zip --inference
```