Reinforcement learning with Generative Neural Networks

This code trains a reinforcement learning agent using the derivative of a generative recurrent neural network (RNN) that jointly models the environment and the reward. Run "example.py" to see it working. The code requires Chainer and NumPy to be installed.

At a high level, the code works as follows:

1. The agent RNN is initialized, and the probability of the agent outputting a random action is set to 1.0.
2. The agent acts in the environment, generating data about the environment.
3. The collected data is split evenly into training and validation parts.
4. Two separate generative RNNs are trained, one on the training part and one on the validation part of the data. Either of these generative RNNs can be viewed as a differentiable model of the environment.
5. The agent is trained to optimize the average reward on the training environment using gradient descent through the outputs of the training environment GAN. Agent training stops when performance on the validation GAN starts to decrease.
6. The probability of the agent outputting a random action is decreased. Repeat from step 2 until the termination criterion is met (a fixed number of iterations).
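
The loop above can be sketched in a few lines of NumPy. This is only a toy illustration under strong assumptions, not the code in gan_rl_fitter.py: the generative RNNs are replaced by least-squares reward models, the agent is a single weight `w`, the environment is a hypothetical one-step task whose reward is highest when the action equals twice the state, and the random-action probability is decreased linearly. All function names are made up for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_reward(s, a):
    # Hypothetical toy environment: reward is highest when a == 2 * s.
    return -(a - 2.0 * s) ** 2

def features(s, a):
    # Quadratic features, enough to represent the toy reward exactly.
    return np.stack([a * a, a * s, s * s, a, s, np.ones_like(s)], axis=1)

def fit_model(s, a, r):
    # Stand-in for training a generative RNN: a least-squares reward model.
    theta, *_ = np.linalg.lstsq(features(s, a), r, rcond=None)
    return theta

def model_reward(theta, s, a):
    return features(s, a) @ theta

def model_reward_grad_a(theta, s, a):
    # Derivative of the modelled reward with respect to the action.
    return 2.0 * theta[0] * a + theta[1] * s + theta[3]

def collect(w, eps, n=200):
    # With probability eps the agent acts randomly, otherwise it uses a = w * s.
    s = rng.uniform(-1.0, 1.0, size=n)
    random_a = rng.uniform(-3.0, 3.0, size=n)
    a = np.where(rng.random(n) < eps, random_a, w * s)
    return s, a, true_reward(s, a)

def train(iterations=5, lr=0.1, max_steps=500):
    w, eps = 0.0, 1.0                                    # step 1
    for _ in range(iterations):
        s, a, r = collect(w, eps)                        # step 2
        half = len(s) // 2                               # step 3
        s_tr, s_va = s[:half], s[half:]
        train_model = fit_model(s_tr, a[:half], r[:half])  # step 4
        valid_model = fit_model(s_va, a[half:], r[half:])
        best_w = w
        best_valid = model_reward(valid_model, s_va, w * s_va).mean()
        for _ in range(max_steps):                       # step 5
            # Chain rule: d reward / d w = (d reward / d a) * (d a / d w), with a = w * s.
            grad = (model_reward_grad_a(train_model, s_tr, w * s_tr) * s_tr).mean()
            w = w + lr * grad                            # ascent on modelled reward
            valid = model_reward(valid_model, s_va, w * s_va).mean()
            if valid < best_valid:
                # Validation performance dropped: restore the best weight and stop.
                w = best_w
                break
            best_w, best_valid = w, valid
        eps = max(0.0, eps - 1.0 / iterations)           # step 6 (assumed linear decay)
    return w

if __name__ == "__main__":
    print("learned weight:", train())
```

In this toy setup the learned weight should end up close to 2, the value that maximizes the true reward. The real code differs in that the model is recurrent, is trained with Chainer, and predicts the environment observations as well as the reward.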

Important: this is a work in progress, so expect bugs and changes.
