Add envpool support #307
For DeepMind Control Suite, here are the hyperparameters as I understand them from the paper, which could serve as defaults:

AcrobotSwingup-v1: &dmcs-defaults
  policy: 'MlpPolicy'
  n_timesteps: !!float 1e8
  batch_size: 64
  policy_kwargs: "dict(net_arch=dict(pi=[300, 200], qf=[400, 300]))"
  learning_rate: !!float 1e-4
  gamma: 0.99
  noise_type: 'ornstein-uhlenbeck'
  noise_std: 0.3
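
To make the entry above concrete, here is a minimal, standard-library-only sketch of how its values might be interpreted in Python. It is not the zoo's actual loading code: the `OUNoise` class is a hypothetical stand-in for an Ornstein-Uhlenbeck action-noise process, and `theta`, `dt`, `seed`, and `n_actions=1` are assumed placeholder values that do not come from the config.

```python
import random

# Hyperparameters copied from the config entry above, as plain Python values.
hyperparams = {
    "policy": "MlpPolicy",
    "n_timesteps": float("1e8"),
    "batch_size": 64,
    "policy_kwargs": "dict(net_arch=dict(pi=[300, 200], qf=[400, 300]))",
    "learning_rate": float("1e-4"),
    "gamma": 0.99,
    "noise_type": "ornstein-uhlenbeck",
    "noise_std": 0.3,
}

# policy_kwargs is a string holding a Python expression; evaluating it
# yields the nested dict. Only do this with configs you trust.
policy_kwargs = eval(hyperparams["policy_kwargs"])


class OUNoise:
    """Hypothetical minimal Ornstein-Uhlenbeck process (Euler discretisation, zero mean)."""

    def __init__(self, n_actions, sigma, theta=0.15, dt=1e-2, seed=0):
        # theta, dt and seed are assumed values, not taken from the config.
        self.sigma, self.theta, self.dt = sigma, theta, dt
        self.rng = random.Random(seed)
        self.state = [0.0] * n_actions

    def __call__(self):
        # x <- x - theta * x * dt + sigma * sqrt(dt) * N(0, 1)
        self.state = [
            x - self.theta * x * self.dt + self.sigma * self.dt ** 0.5 * self.rng.gauss(0.0, 1.0)
            for x in self.state
        ]
        return list(self.state)


# n_actions=1 is a placeholder; use the action dimension of the actual task.
noise = OUNoise(n_actions=1, sigma=hyperparams["noise_std"])
samples = [noise()[0] for _ in range(5)]
```

Successive samples are temporally correlated, which is the usual reason for choosing this noise type over independent Gaussian noise in continuous-control exploration.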
Hmm, 1e8 sounds like too much to me (I would prefer to stick to 1e6), and I would try the MuJoCo defaults first.
I agree, most tasks can be solved with much less interaction.
Fujimoto et al. (2018) is actually the
Side note: mypy now hangs with Python 3.7 for some reason (it doesn't happen with other versions), and I cannot reproduce it locally.
@qgallouedec same as openai/gym#3176, no?
Closing in favor of #355
Description
Motivation and Context
closes #241
Types of changes
Checklist:
make format (required)
make check-codestyle and make lint (required)
make pytest and make type both pass. (required)
Note: we are using a maximum length of 127 characters per line