Read about the environment here and in the included report. To obtain the results presented at the bottom of this file, an agent was trained to explore an environment with a 32-dimensional vector representation. The components of the vector describe the agent's location, orientation, and ray-based perception of the part of the environment visible to it. The agent can take four actions: move forward, move backward, turn left, and turn right.
Agents explore a large, square, bounded environment filled with blue and yellow bananas. Agents receive a reward of +1 for collecting a yellow banana and a reward of -1 for collecting a blue banana.
The environment is considered solved when the agent achieves an average reward of +13 over 100 consecutive episodes.
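For orientation, the episode loop and the solved check look roughly like the sketch below. It assumes the `unityagents` wrapper installed by the setup script and the environment binary path described later in this file; the random policy is a placeholder for the trained agent.

```python
from collections import deque

import numpy as np
from unityagents import UnityEnvironment

# Assumed path to the unzipped Linux binary (see "unity_environments" below)
env = UnityEnvironment(file_name="unity_environments/Banana_Linux/Banana.x86_64")
brain_name = env.brain_names[0]

scores_window = deque(maxlen=100)  # scores from the most recent 100 episodes
for episode in range(2000):
    env_info = env.reset(train_mode=True)[brain_name]
    state = env_info.vector_observations[0]  # the vector observation
    score, done = 0.0, False
    while not done:
        action = np.random.randint(4)        # placeholder for the agent's policy
        env_info = env.step(action)[brain_name]
        state = env_info.vector_observations[0]
        score += env_info.rewards[0]         # +1 yellow banana, -1 blue banana
        done = env_info.local_done[0]
    scores_window.append(score)
    # Solved condition: average score of +13 over the last 100 episodes
    if len(scores_window) == 100 and np.mean(scores_window) >= 13.0:
        print(f"Solved in {episode + 1} episodes")
        break
env.close()
```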
The `setup_env.sh` script can be used to create and configure an environment with all the dependencies required to run the included code. The script assumes `conda` is available and takes one argument: the name to give the created conda environment. E.g. `./setup_env.sh drlnd` will create a conda environment named `drlnd` in which all the included code and programs should run.
In addition, a number of pre-built Unity environments are required. They can be obtained by running the included `./get_unity_envs.sh` script. Alternatively, they can be downloaded manually from:
- https://s3-us-west-1.amazonaws.com/udacity-drlnd/P1/Banana/Banana_Linux.zip
- https://s3-us-west-1.amazonaws.com/udacity-drlnd/P1/Banana/Banana_Linux_NoVis.zip
- https://s3-us-west-1.amazonaws.com/udacity-drlnd/P1/Banana/VisualBanana_Linux.zip
The above environments should be downloaded to a `unity_environments` directory and unzipped there.
The `navigation.py` file exposes a CLI built with `click`; a sketch of how such a CLI is wired up follows the command list below. There are three commands available:
```
python ./navigation.py --help
Usage: navigation.py [OPTIONS] COMMAND [ARGS]...

Options:
  --help  Show this message and exit.

Commands:
  plot-results  Plot results from scores stored in the results directory.
  run           Program that initialises an agent using saved DQN weights...
  train         Program to train an agent using a training strategy...
```
- For training an agent: `python ./navigation.py train`
- For running a trained agent: `python ./navigation.py run`
- For plotting saved results: `python ./navigation.py plot-results`
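The sketch below shows roughly how a `click` CLI with these three commands is put together. It is illustrative only; the options and internals of the real `navigation.py` differ (see the `--help` output below).

```python
import click

@click.group()
def cli():
    """Entry point grouping the train / run / plot-results commands."""

@cli.command()
@click.option("--n-episodes", type=int, default=2000,
              help="Number of episodes after which training will terminate.")
def train(n_episodes):
    click.echo(f"Training for up to {n_episodes} episodes...")

@cli.command()
def run():
    click.echo("Running a trained agent...")

@cli.command(name="plot-results")
def plot_results():
    click.echo("Plotting saved scores...")

if __name__ == "__main__":
    cli()
```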
Train an agent using the `train` command:
```
python ./navigation.py train --help
Usage: navigation.py train [OPTIONS]

Options:
  --learning-strategy [DQN|DDQN]  Train the agent using DQN or DDQN.
  --update-type [soft|hard]       Use soft updates or hard updates for
                                  'fixed-Q' TD targets.
  --n-episodes INTEGER            Number of episodes after which training
                                  will terminate.
  --headless                      Train the agent using the headless
                                  environment.
  --keep-training                 Continue training the agent up to
                                  n-episodes after the solved condition is
                                  met.
  --checkpoint TEXT               Path to a previously trained Agent's
                                  PyTorch checkpoint; if specified, the
                                  Agent's network will be initialised using
                                  the weights therein.
  --help                          Show this message and exit.
```
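The `--learning-strategy` and `--update-type` options map onto two standard DQN variations. DDQN decouples action selection (online network) from action evaluation (target network), while a soft update blends a small fraction `tau` of the online weights into the target network rather than copying them wholesale. A minimal PyTorch sketch of both ideas; the function and network names here are illustrative, not taken from the included code:

```python
import torch

def td_targets(rewards, next_states, dones, gamma,
               qnetwork_local, qnetwork_target, ddqn=True):
    """Fixed-Q TD targets; rewards and dones are float tensors of shape (batch, 1)."""
    with torch.no_grad():
        if ddqn:
            # DDQN: the online network picks the action, the target network scores it
            best_actions = qnetwork_local(next_states).argmax(dim=1, keepdim=True)
            q_next = qnetwork_target(next_states).gather(1, best_actions)
        else:
            # DQN: the target network both picks and scores the action
            q_next = qnetwork_target(next_states).max(dim=1, keepdim=True)[0]
    return rewards + gamma * q_next * (1 - dones)

def soft_update(local, target, tau):
    # Move the target network a small step towards the online network
    for t_param, l_param in zip(target.parameters(), local.parameters()):
        t_param.data.copy_(tau * l_param.data + (1.0 - tau) * t_param.data)

def hard_update(local, target):
    # Overwrite the target network with the online network's weights
    target.load_state_dict(local.state_dict())
```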
To train the agent for a maximum of 2000 episodes using DDQN and soft fixed-Q target network updates, while allowing training to terminate early if the environment is solved, run:

```
python ./navigation.py train --learning-strategy DDQN --update-type soft --n-episodes 2000
```
See a trained agent navigate the environment by running the following command. With no options specified, the agent will be initialised using the `solved_weights.pth` checkpoint included in the `./checkpoints` directory.
```
python ./navigation.py run --help
Usage: navigation.py run [OPTIONS]

Options:
  --checkpoint-path TEXT  Path to a checkpoints file with which to obtain
                          learned DQN weights.
  --help                  Show this message and exit.
```

```
python ./navigation.py run
```
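Under the hood, `run` amounts to restoring a state dict and acting greedily. A minimal sketch, assuming a simple fully connected `QNetwork`; the architecture below is a placeholder, and the real network in the included code must match the shapes stored in `solved_weights.pth`:

```python
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    # Illustrative architecture only; must match the checkpointed network
    def __init__(self, state_size=32, action_size=4, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_size, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, action_size),
        )

    def forward(self, x):
        return self.net(x)

qnetwork = QNetwork()
qnetwork.load_state_dict(torch.load("checkpoints/solved_weights.pth"))
qnetwork.eval()

state = torch.zeros(1, 32)  # placeholder for a real observation
with torch.no_grad():
    action = qnetwork(state).argmax(dim=1).item()  # greedy action
```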
For a more thorough investigation and presentation of results, and a discussion of training strategies, see `Report.md`.
The image below shows a comparison of the scores (averaged over 100 episodes) of agents trained with a number of different strategies.
The video below shows a trained agent navigating the environment. The scores obtained by this agent during training are shown by the green curve in the image above.
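For reference, a 100-episode rolling average like the curves in the comparison image can be computed with a few lines of NumPy and Matplotlib. This is a sketch only: the scores file name is an assumption, and the actual `plot-results` command reads whatever is stored in the results directory.

```python
import matplotlib.pyplot as plt
import numpy as np

scores = np.load("results/scores.npy")  # assumed file name
window = 100
rolling = np.convolve(scores, np.ones(window) / window, mode="valid")

plt.plot(rolling, label="average score (100 episodes)")
plt.axhline(13, linestyle="--", label="solved threshold")
plt.xlabel("Episode")
plt.ylabel("Score")
plt.legend()
plt.show()
```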