-
Notifications
You must be signed in to change notification settings - Fork 123
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proposal: update rllib and pettingzoo examples to current release versions #113
Comments
That sounds very useful thank you! I don't know the Ray stuff etc that well, so @duenez is best to comment on any details. |
Hello Elliot, That would be fantastic! Please let me know if there's anything I can do to help. |
Appreciate it, will reach out if I run into any issues. I got a good bit of progress today, although a major blocker to updating the entire pettingzoo tutorial is that stable-baselines3 doesn't officially support gymnasium yet (pettingzoo has migrated as of October 2022). There's some open pull requests which may get merged in the future (stable-baselines3 stable-baselines3-contrib but the only way to get it working currently is manually installing the feature branch: Edit: looks like there's an API incompatibility with pettingzoo and the sb3 gymnasium support feature branch (link) so the pettingzoo tutorial will have to wait until that is fixed. I got most of the pettingzoo code adapted though, I chose to use the SB3's frame stacking functionality rather than supersuit because the latter required casting types to float, stacking frames, and then casting back to uint8 for SB3. I've also heard that supersuit will be closed down and its functionality migrated directly into gymnasium and pettingzoo, so I figure it's best not to rely on it too much if possible. My changes so far: elliottower@64b38e9 Luckily ray/rllib 2.3.0 does have gymnasium support so that shouldn't be a blocker, planning to work on that tomorrow. |
One minor issue I had with this repo was running the pylint script, didn't find any documentation on it and for some reason running I'm currently running the pytest suite but it seems to be going very slowly and says it has 4787 items. Edit: took 34 minutes 30 seconds on a 2019 intel mbp. The .sh install scripts worked great though and the tests ran pretty quickly there (less than 5 mins iirc) |
Yes, some of our tests are very slow (the full substrate tests). I believe we don't run all the tests in the install.sh script. |
Is there a reason the pettingzoo env is defined with a separate superclass class _ParallelEnv? I got some issues when trying to use wrapper functions from pettingzoo/gymnasium/SB3 because the base class when you do env.unwrapped is _ParallelEnv. Moving the EzPickle init line into the regular class (which is of the correct type parallel pettingzoo env) led to an error saying it can’t load the lab2d object from pickle. I don’t have experience with lab2d or the EzPickle function, imagine the specific definition of this ParallelEnv with the two superclasses (one for the env one for EzPickle) is what allows it to load properly? Edit: Nevermind, I got the EzPickle component to work by restructuring the code and using wrappers. Will post a PR once I'm finished. |
Any idea on how to deal with these warnings? Getting them when running pettingzoo API tests, thinking they may be corner cases that the pettingzoo wrapper doesn't check for @jagapiou @duenez
Also PR is up: #117, running final tests and will then make a PR on shimmy with melting pot compatibility. I think using shimmy is best as it has comprehensive tests and is used for other API conversions, so I can update this PR to use shimmy once that gets added (as opposed to |
Those warnings are not critical. They happen when a substrate doesn't
define a scene, it when their ASCII map contains a character that is not
present in the prefab map. Which substrate is this for? I could just add
those and fix the warnings.
Shimmy seems like a reasonable approach to unify APIs. Thanks.
Finally, regarding the EzPickle, this is needed, I believe, to serialise
the environment. I'm particular, the EzPickle class has a __set_state__ and
__get_state__ pair of functions (
https://github.com/openai/gym/blob/master/gym/utils/ezpickle.py) that I
imagine allow for saving the environment. If you can modernise the code and
not need that, it should be fine.
…On Sat, 4 Mar 2023, 00:49 Elliot Tower, ***@***.***> wrote:
Any idea on how to deal with these warnings? Getting them when running
pettingzoo API tests, thinking they may be corner cases that the pettingzoo
wrapper doesn't check for @jagapiou <https://github.com/jagapiou> @duenez
<https://github.com/duenez>
W
@/Users/elliottower/Documents/GitHub/meltingpot//meltingpot/lua/modules/prefab_utils.lua:83]
Character x not found in the charPrefabMap. Ignoring.
W
@/Users/elliottower/Documents/GitHub/meltingpot//meltingpot/lua/modules/base_simulation.lua:225]
GameObject 'scene' did not have a Transform component, but explicitly
specifying one is strongly preferred. Using a default.
W
@/Users/elliottower/Documents/GitHub/meltingpot//meltingpot/lua/modules/base_simulation.lua:219]
GameObject 'scene' did not have a StateManager component, but explicitly
specifying one is strongly preferred. Using a default.
Also PR is up: #117 <#117>,
running final tests and will then make a PR on shimmy
<https://github.com/Farama-Foundation/Shimmy> with melting pot
compatibility. I think using shimmy is best as it has comprehensive tests
and is used for other API conversions, so I can update this PR to use
shimmy once that gets added (as opposed to examples/pettingzoo/utils.py)
—
Reply to this email directly, view it on GitHub
<#113 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAY7XEO4XO7EPPHRXV6CDJ3W2KGR5ANCNFSM6AAAAAAVLTHO6M>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
Will get the list of substrates later today when I rerun the tests. @duenez Do you happen to know if it’s possible to do rendering using PIL or something other than matplotlib? Not sure exactly how the rendering works internally but on the shimmy PR they raised concerns about not wanting to have to import matplotlib (most farama code uses pygame afaik and/or PIL, I think it might be a speed concern?) And fwiw the pickle code is still used in other farama games like Atari and their devs said it looks fine. |
Yes, of course. The observation is produced as a raw numpy array of height X width X channels (3, RGB). Si you can render this to anything you want. One possibility is doing imshow in matplotlib, but equally possible is using PIL. |
@duenez thanks, I ended up searching through the code more and finding an example using pygame rendering, which is what is used by all other pettingzoo envs, and mirrored that functionality in the shimmy wrapper. |
@duenez is it possible to reset the environment using a specific seed rather than re-initializing a new environment with a specified seed? The new gym API requires that the reset() function take a seed argument, I've been trying to adapt the code but it doesn't seem to be possible without initializing the environment again, which I also have had trouble figuring out how to do. The only python file I see using seeding is I think the correct code should be something like this, but it says that the config is missing they key
Have been having a lot of difficulty understanding the inner workings of this code and if it's even possible to do what I want to do. Maybe it's best to change the way the reset() function works internally to allow a seed to be specified? |
You need to get the factory: config = meltingpot.python.substrate.get_config(substrate_name)
factory = meltingpot.python.substrate.get_factory_from_config(config)
env = factory.build(config.default_player_roles) If you need to explicitly set the seed, we currently don't expose it in the builder through the canonical way to build substrates. The reason we need a factory is because now the config doesn't need to know the number of players ahead of time. This means that we must have an intermediate step when we have the player roles to know what we need to build. If this is critical, we can think about piping a seed through. What do you think? |
Thanks for the quick reply, makes sense about the number of players. So I take it using the factory doesn’t allow seeding either? It’s a core functionality of the current gymnasium and pettingzoo APIs so if it’s possible to make the reset method take a seed argument that would be awesome. In gymnasium/pettingzoo you have to call reset() on the env before using it and set the seed that way, but if your code doesn’t do that maybe it’s better to pipe the seed through the factory build method and I can make the reset method rebuild the substrate using that. We are also planning to make a shimmy wrapper for lab2d and gymnasium, do you happen to know if that would run into the same issue? |
It depends on the shimmy API. We usually don't have a way to reset substrates without rebuilding the whole thing anyway, because some stochastic choices occur at the creation stage. This is why we have a reset wrapper that does this: https://github.com/deepmind/meltingpot/blob/7de41d2db0e5eca31107312d405e20ff3a7da39e/meltingpot/python/utils/substrates/wrappers/reset_wrapper.py |
Makes sense, I did some more testing and saw that the underlying dm_env code defines _rng to be a certain seed so I tried setting env._rng in the shimmy wrapper but then noticed what you mentioned about the stochastic environment creation. So is it possible to modify the reset wrapper to take in a seed? If you could help out with adding support for seeding that it would be greatly appreciated. I could look into it myself as well (and potentially make a PR) but I’d need a better idea of where to look and which files to modify. Might be simpler for someone more familiar with the repo though. |
I'll take a look tomorrow.
…On Wed, 8 Mar 2023, 17:16 Elliot Tower, ***@***.***> wrote:
It depends on the shimmy API. We usually don't have a way to reset
substrates without rebuilding the whole thing anyway, because some
stochastic choices occur at the creation stage. This is why we have a reset
wrapper that does this:
https://github.com/deepmind/meltingpot/blob/7de41d2db0e5eca31107312d405e20ff3a7da39e/meltingpot/python/utils/substrates/wrappers/reset_wrapper.py
Makes sense, I did some more testing and saw that the underlying dm_env
code defines _rng to be a certain seed so I tried setting env._rng in the
shimmy wrapper but then noticed what you mentioned about the stochastic
environment creation. So is it possible to modify the reset wrapper to take
in a seed? If you could help out with adding support for seeding that it
would be greatly appreciated.
I could look into it myself as well (and potentially make a PR) but I’d
need a better idea of where to look and which files to modify. Might be
simpler for someone more familiar with the repo though.
—
Reply to this email directly, view it on GitHub
<#113 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAY7XENH7ZTHPYD2KCTSROTW3C5GJANCNFSM6AAAAAAVLTHO6M>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
Appreciate it |
Fundamentally, there's a API incompatibility here: a gym
But that won't help with Scenarios, and they'd need a seed to also be passed to the background bots when building them. There may also be other sources of randomness (e.g. calls to random in Wrappers) or other sources of non determinism (e.g. python set iteration order varies on each process) that might make things hard. So we'd need a bunch of tests of determinism. Finally, rather than adding an optional So it's not a small job. Alternatively, is there a quick hack?
|
Sorry update here: it's not currently possible to change the substrate seed. builder takes an |
I have locally updated the rllib example to use ray==2.5.1 instead of 2.0.0, should I submit a pull request? |
Amazing, yes please! |
Just fwiw when I get a chance I’m planning to do a PR updating the pettingzoo example to use shimmy and current pettingzoo, I’ve created official SB3 tutorials for pettingzoo in the most recent release so it would be cool to show this on our docs site as another example using SB3. Also, once RLlib merges my PR (they have some other blockers but it should be soon) RLlib will be fully compatible with current gymnasium/pettingzoo, so users will be able to do RLlib through pettingzoo if they want. well as directly like in these tutorials. We have a full CleanRL training script for Atari games that I imagine could also be similarly adapted to work on melting pot using a pettingzoo wrapper. I won’t have time to fully adapt it but thinking when I do at least update the pettingzoo tutorial I can mention something along the lines of “this enables you to use any sort of training framework as shown in our tutorials” (we even have one using LangChain, though that’s only suitable for games like chess, I would imagine LLMs would do poorly on visual input like this) |
Fantastic, let me know if there's anything on our side that's needed. |
FYI: @dimonenka's PR updating rllib: #153 |
Hi, I found a minor typo in the file "example/pettingzoo/sb3_train.py" on line 86. The environment name should be |
Hey all, I’ve been planning on it for a while but finally have time to make a PR updating this tutorial now that PettingZoo has official SB3 tutorials, was thinking about other possible tutorial options such as AgileRL, which now supports PettingZoo directly. Also would be nice to demonstrate the Shimmy conversion wrapper I made, which was based on this example but is more full featured and tested against every underlying environment in our CI, supports serialization via pickle, etc. Maybe @duenez has thoughts on the best way to proceed? |
It looks like I'm late to the party! Is it now possible to make commons_harves__open deterministic? The 'wired in' hyperlink seemed like it was useful, but it looks like that page has been taken down at this point. |
It should be possible to make the substrates deterministic. You can see how the seed is set here: meltingpot/meltingpot/python/utils/substrates/builder.py Lines 176 to 189 in 7de41d2
It should work if you just change that logic to pass a fixed seed. |
You might need to use a custom builder, the raw builder in |
Actually, that's where you were doing this. Don't use the zero seed, because that means to resample a seed. Use a fixed, non-zero one (like 42 :) ). If that fails, try setting you python random seed too, as some objects are built on the python side, but I don't think avatars would. |
I'm planning to do some testing with the newly released ray 2.3.0 and I noticed the pettingzoo example is 9 months old and still uses gym rather than gymnasium, both should probably be updated. Will create a PR if I can get them to work but figure I'd create an issue first. Appreciate the repo, very excited to use it.
The text was updated successfully, but these errors were encountered: