A multi-GPU implementation of the A3C algorithm from Asynchronous Methods for Deep Reinforcement Learning.
Results of the same code trained on 47 different Atari games were uploaded to OpenAI Gym; you can see them on my Gym page. Most of them are the best reproducible results on Gym.
```
CUDA_VISIBLE_DEVICES=0 ./train-atari.py --env Breakout-v0
```
The speed is about 6~10 iterations/s on 1 GPU plus 12+ CPU cores. In each iteration it trains on a batch of 128 new states. The network architecture is larger than what's used in the original paper.
The pre-trained models are all trained with 4 GPUs for about 2 days. But on simple games like Breakout, you can get good performance within several hours. Also note that multi-GPU doesn't give an obvious speedup here, because the bottleneck in this implementation is not computation but data.
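For reference, the core update in A3C is an n-step advantage actor-critic step. Below is a plain-numpy sketch of how the policy-gradient, value, and entropy terms of the loss are formed for one trajectory segment. This is only an illustration of the objective, not the TensorFlow code in train-atari.py; the function names and the hyperparameters `gamma` and `entropy_beta` are assumptions, and the 128-state batching is simplified away.

```python
import numpy as np

def nstep_returns(rewards, bootstrap_value, gamma=0.99):
    """Discounted n-step returns R_t = r_t + gamma * R_{t+1},
    bootstrapped with the value estimate V(s_T) at the end of the segment."""
    returns = np.zeros(len(rewards))
    running = bootstrap_value
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns

def a3c_losses(action_probs, taken_actions, values, rewards,
               bootstrap_value, gamma=0.99, entropy_beta=0.01):
    """Scalar policy / value / entropy terms of the A3C loss for one segment.

    action_probs : (T, num_actions) softmax outputs of the policy head
    taken_actions: (T,) indices of the actions actually taken
    values       : (T,) value-head predictions V(s_t)
    rewards      : (T,) rewards observed after each action
    """
    returns = nstep_returns(rewards, bootstrap_value, gamma)
    advantages = returns - values              # A_t = R_t - V(s_t)

    # log pi(a_t | s_t) for the actions that were taken
    logp = np.log(action_probs[np.arange(len(rewards)), taken_actions] + 1e-8)
    # In a real implementation the advantages are treated as constants
    # (no gradient flows through them into the value head).
    policy_loss = -np.sum(logp * advantages)
    value_loss = np.sum((returns - values) ** 2)
    entropy = -np.sum(action_probs * np.log(action_probs + 1e-8))

    return policy_loss, value_loss, -entropy_beta * entropy

# Tiny usage example with fake data: a 3-step segment and 4 actions.
probs = np.full((3, 4), 0.25)
print(a3c_losses(probs, np.array([0, 2, 1]), np.array([0.5, 0.4, 0.3]),
                 np.array([0.0, 0.0, 1.0]), bootstrap_value=0.0))
```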
Some practical notes:
- On machines without huge memory, enabling tcmalloc may keep training throughput more stable.
- Occasionally, processes may not get terminated completely. It is suggested to use `systemd-run` to run any multi-process Python program, so that the task gets a dedicated cgroup.
- Training at a significantly slower speed (e.g. on CPU) will result in a very bad score, probably because of async issues.
- Download models from the model zoo, then run one with:
```
ENV=Breakout-v0; ./run-atari.py --load "$ENV".tfmodel --env "$ENV" --episode 100 --output output_dir
```
Models are available for the following Atari environments (click to watch videos of my agent):
- AirRaid (this one is flickering due to gym settings)
- Alien
- Amidar
- Assault
- Asterix
- Asteroids
- Atlantis
- BankHeist
- BattleZone
- BeamRider
- Berzerk
- Breakout
- Carnival
- Centipede
- ChopperCommand
- CrazyClimber
- DemonAttack
- DoubleDunk
- ElevatorAction
- FishingDerby
- Frostbite
- Gopher
- Gravitar
- IceHockey
- Jamesbond
- JourneyEscape
- Kangaroo
- Krull
- KungFuMaster
- MsPacman
- NameThisGame
- Phoenix
- Pong
- Pooyan
- Qbert
- Riverraid
- RoadRunner
- Robotank
- Seaquest
- SpaceInvaders
- StarGunner
- Tennis
- Tutankham
- UpNDown
- VideoPinball
- WizardOfWor
- Zaxxon
Note that Atari game settings in gym are quite different from those in the DeepMind papers, so the scores are not directly comparable. The most notable differences are:
- In gym, each action is randomly repeated 2~4 times.
- In gym, inputs are RGB instead of greyscale.
- In gym, an episode is limited to 10000 steps.
- The action space also seems to be different.
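If you want to inspect these gym settings yourself, a minimal check looks like the sketch below. It assumes a gym installation with the Atari environments and uses the classic "Breakout-v0" id from this README; the exact values may differ across gym versions.

```python
import gym

env = gym.make('Breakout-v0')
print(env.observation_space.shape)           # (210, 160, 3): RGB frames, not greyscale
print(env.action_space)                      # Discrete(n); n differs from DeepMind's setup
print(env.unwrapped.get_action_meanings())   # e.g. ['NOOP', 'FIRE', 'RIGHT', 'LEFT', ...]
print(env.unwrapped.frameskip)               # typically (2, 5): each action repeated 2~4 times
```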
Also see the DQN implementation here.