
Ready for testing 🧪 Multi-policy training support #181

Merged
merged 14 commits into from
May 15, 2024

Conversation

Ivan-267
Collaborator

@Ivan-267 Ivan-267 commented Apr 1, 2024

Adds support for training multiple policies with RLlib.

Plugin PR: edbeeching/godot_rl_agents_plugin#40
Example env PR: edbeeching/godot_rl_agents_examples#30

TODO:

  • Fix the multiple obs spaces case (should be fixed now, but still needs more checking; ONNX export currently doesn't work with multiple discrete obs spaces, as it appears to be configured for a single space, which will need checking at some point as well).

@Ivan-267 Ivan-267 changed the title WIP🚧 Multi-policy training support Ready for testing 🧪 Multi-policy training support Apr 1, 2024
@Ivan-267 Ivan-267 requested a review from edbeeching April 1, 2024 19:30
@Ivan-267
Collaborator Author

Ivan-267 commented Apr 3, 2024

I've done a little testing with some of my previous envs and Jumper Hard with an older plugin version, and it seemed to work both with multiagent set to false in the yaml config (the number of envs per worker may need to be adjusted manually) and set to true (not intended for single-agent envs, since individual agents are deactivated in RLlib after done = true, but it should not cause errors thanks to the compatibility code in GDRLPettingZooWrapper). SB3 seems to work properly after these changes, but so far I've only tested it on a modified version of the multi-agent env (made into a single-agent-compatible version). Further testing is always welcome, especially on Linux and with Sample Factory.
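To illustrate the two modes described above, here is a minimal sketch of how the multiagent flag in the yaml config might be read. The key name `env_is_multiagent` is an assumption for illustration, not necessarily the actual key used by the GDRL rllib example, and the parsing is deliberately dependency-free:

```python
# Hypothetical sketch: the key name "env_is_multiagent" is an assumption,
# not necessarily the real key used by the GDRL rllib config.
EXAMPLE_CONFIG = """
env_is_multiagent: false   # single-agent training; one shared policy
"""

def is_multiagent(config_text: str) -> bool:
    # Minimal parse without a yaml dependency: find the flag and read its value.
    for line in config_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop inline comments
        if line.startswith("env_is_multiagent:"):
            return line.split(":", 1)[1].strip().lower() == "true"
    return False  # default to single-agent when the flag is absent
```

When the flag is false, the env is treated as a standard single-agent env; when true, it goes through the PettingZoo-style multi-agent path.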

LSTM/Attention wrappers work (they show a deprecation warning, so accessing them may change in newer versions of RLlib), but we can't use them for exporting yet since the state data wouldn't be fed in.

One thing I found that doesn't work well is enabling some exploration options with PPO; one that did work was RE3, with Tensorflow rather than Torch set. Curiosity requires discrete or multidiscrete actions, but it didn't seem to work even when I switched the env to discrete actions. I think this might be related to the tuple action space, which may not be supported by some of the exploration implementations.
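A hypothetical illustration of the suspected incompatibility: if Curiosity's space check only accepts Discrete or MultiDiscrete action spaces, an env whose actions are wrapped in an outer Tuple space would be rejected even when the inner space is discrete. The label-string representation below is an assumption for the sketch:

```python
# Hypothetical sketch of an exploration-module space check. The real RLlib
# check operates on gym space objects; labels are used here for simplicity.
def curiosity_supports(space_type: str) -> bool:
    # assumption: Curiosity accepts only these two action space types
    return space_type in ("Discrete", "MultiDiscrete")
```

Under this assumption, switching the env to discrete actions wouldn't help, because the outer Tuple wrapper is what the check sees.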

Warning

Edit: With the current script, the ONNX exported from RLlib doesn't output just action means like our SB3 setup, so the output size is doubled, and an exported ONNX model with more than one action won't work correctly. I'm not yet sure how to solve this so that ONNX export works from both SB3 and RLlib despite the different output sizes.

Edit2: I've just updated the plugin to handle the case above.
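To make the size mismatch concrete: for continuous actions, an RLlib-style export emits distribution parameters (means followed by log-stds), so its output is twice as wide as an SB3-style export that emits only the means. A hypothetical consumer-side workaround (the function name is made up for this sketch) could be:

```python
# Hypothetical helper, not part of the plugin: keep only the action means
# when the policy output is twice as wide as the action count.
def extract_action_means(policy_output, n_actions):
    if len(policy_output) == 2 * n_actions:
        # RLlib-style output: [means..., log_stds...]; drop the log-stds
        return policy_output[:n_actions]
    # SB3-style output: already just the means
    return policy_output
```

This mirrors the kind of case handling the plugin update mentioned in Edit2 would need, whatever form it actually takes there.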

Owner

@edbeeching edbeeching left a comment


LGTM pending review of other PRs


if __name__ == "__main__":
    parser = argparse.ArgumentParser(allow_abbrev=False)
    parser.add_argument("--config_file", default="rllib_config.yaml", type=str, help="The yaml config file")
Owner


I think this should be examples/rllib_config.yaml

Collaborator Author


I usually call the example from within the examples folder, so the default was based on my usage. If calling it from the GDRL repository root directly, then it should be changed.

If someone installs GDRL using pip install and then just downloads the example file and config file, they might not have the entire repository, but I'm not sure how common that is.

I'll leave this up to you; I can definitely change the default.
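One way to sidestep the working-directory question entirely (a sketch, not what the PR does) is to resolve the default config path relative to the script's own location. The helper below is hypothetical; in the actual script you would pass `__file__`:

```python
import os

# Hypothetical alternative: resolve the default config path relative to the
# script's directory, so the example works whether it is invoked from the
# repo root or from within the examples folder.
def default_config_path(script_path):
    script_dir = os.path.dirname(os.path.abspath(script_path))
    return os.path.join(script_dir, "rllib_config.yaml")
```

The resulting path could then be used as the `default=` value of the `--config_file` argument.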

Ivan-267 added 3 commits May 9, 2024 21:46
- Adds support for exporting envs with multidiscrete actions with sb3
- Multiple obs spaces onnx export support (for sb3) still needs to be worked on in the future
Also removes the previously removed init variables from `tune.register_env()`
Updates rllib doc to include the new process.
@Ivan-267 Ivan-267 merged commit 39852ac into main May 15, 2024
13 checks passed
@Ivan-267 Ivan-267 deleted the multiagent_experimental branch May 15, 2024 05:24