OpenAI Taxi-v2 task solution

Getting Started

The description of the environment is described in subsection 3.1 of this paper. You can verify that the description in the paper matches the OpenAI Gym environment by peeking at the code here.

Instructions

The repository contains three files:

agent.py: Reinforcement learning agent using Expected Sarsa is coded here.
monitor.py: The interact function tests how well the agent learns from interaction with the environment.
main.py: Run this file in the terminal to check the performance of the agent.

Begin by running the following command in the terminal:

python main.py

When you run main.py, the agent specified in agent.py interacts with the environment for 20,000 episodes. The details of the interaction are specified in monitor.py, which returns two variables: avg_rewards and best_avg_reward.

avg_rewards is a deque where avg_rewards[i] is the average (undiscounted) return collected by the agent from episodes i+1 to episode i+100, inclusive. So, for instance, avg_rewards[0] is the average return collected by the agent over the first 100 episodes.
best_avg_reward is the largest entry in avg_rewards. This is the final score that determines how well the agent performed in the task.

OpenAI Gym defines "solving" this task as getting average return of 9.7 over 100 consecutive trials.

Name		Name	Last commit message	Last commit date
Latest commit History 7 Commits
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
agent.py		agent.py
main.py		main.py
monitor.py		monitor.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

OpenAI Taxi-v2 task solution

Getting Started

Instructions

About

Releases

Packages

Languages

License

jcarlosgm30/OpenAI_Taxi-v2_solution

Folders and files

Latest commit

History

Repository files navigation

OpenAI Taxi-v2 task solution

Getting Started

Instructions

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages