This project implements the Deep Q-Learning algorithm for a simulated lunar landing. The objective is to train an AI agent capable of landing a lunar module safely on the moon's surface. The implementation is built on the Gymnasium library and uses PyTorch for the neural network architecture.
- Python 3.x
- Gymnasium
- PyTorch
- NumPy
To install the required packages, you can use the following commands:
pip install gymnasium
pip install "gymnasium[atari, accept-rom-license]"
apt-get install -y swig
pip install "gymnasium[box2d]"
pip install torch
pip install numpy
The project consists of the following sections:
- Installing the required packages and importing the libraries: This part includes commands to install necessary libraries and packages.
- Building the AI: This part involves creating the architecture of the neural network.
- Training the AI: This part covers the training loop, including how the agent interacts with the environment, stores experiences, and updates its knowledge through training.
- Evaluation: This part evaluates the performance of the trained model.
To run the project, execute the Python script:
python deep_q_learning_for_lunar_landing.py
Ensure you have all the required libraries installed as mentioned in the installation section.
Deep Q-Learning is a reinforcement learning algorithm that uses a neural network to approximate the Q-value function. The key components include:
- Q-Value: Represents the expected cumulative (discounted) future reward for taking an action in a given state.
- Experience Replay: Stores the agent's experiences to break the correlation between consecutive samples.
- Target Network: A periodically updated copy of the main network that stabilizes training by keeping the Q-value targets fixed between updates.
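The experience replay component described above can be sketched as a small buffer class (the capacity and batch size shown here are illustrative defaults, not values taken from the project):

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-size buffer storing (state, action, reward, next_state, done) tuples."""

    def __init__(self, capacity=100_000):
        self.memory = deque(maxlen=capacity)  # oldest experiences are discarded first

    def push(self, state, action, reward, next_state, done):
        self.memory.append((state, action, reward, next_state, done))

    def sample(self, batch_size=64):
        # Uniform random sampling breaks the correlation between consecutive steps
        return random.sample(self.memory, batch_size)

    def __len__(self):
        return len(self.memory)
```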
The neural network consists of:
- An input layer that takes the state representation.
- One or more fully connected hidden layers with nonlinear activation functions (e.g., ReLU).
- An output layer that outputs Q-values for each action.
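A network matching the description above might look like the following in PyTorch. Gymnasium's Lunar Lander environment has an 8-dimensional state and 4 discrete actions; the hidden-layer size here is illustrative:

```python
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    """Maps a state vector to one Q-value per action."""

    def __init__(self, state_size=8, action_size=4, hidden_size=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_size, hidden_size),   # input layer -> first hidden layer
            nn.ReLU(),
            nn.Linear(hidden_size, hidden_size),  # second hidden layer
            nn.ReLU(),
            nn.Linear(hidden_size, action_size),  # one Q-value per action
        )

    def forward(self, state):
        return self.net(state)
```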
- Initialize the environment and the network.
- Interact with the environment to collect experiences.
- Store experiences in a replay buffer.
- Sample a mini-batch of experiences from the replay buffer.
- Calculate the loss from the difference between the predicted Q-values and the target Q-values (reward plus the discounted maximum Q-value of the next state, computed with the target network).
- Backpropagate the loss and update the network weights.
- Periodically update the target network.
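The loss and weight-update steps in the loop above can be sketched as a single function. This is a minimal sketch, assuming `QNetwork`-style models with one output per action; the discount factor shown is an illustrative default:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def dqn_update(local_net, target_net, optimizer, batch, gamma=0.99):
    """One Deep Q-Learning step on a mini-batch sampled from the replay buffer."""
    states, actions, rewards, next_states, dones = batch

    # Q-values the local network currently predicts for the actions actually taken
    q_pred = local_net(states).gather(1, actions)

    # Target: reward + discounted best next-state Q-value (zero if the episode ended)
    with torch.no_grad():
        q_next = target_net(next_states).max(dim=1, keepdim=True).values
        q_target = rewards + gamma * q_next * (1 - dones)

    loss = F.mse_loss(q_pred, q_target)  # TD error
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```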
After training, the AI agent should be able to land the lunar module safely. The performance can be evaluated based on the average reward per episode and the number of successful landings.
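The average-reward evaluation could be implemented along these lines. The helper below is illustrative; it assumes a Gymnasium-style environment API (`reset` returning `(state, info)`, `step` returning `(state, reward, terminated, truncated, info)`) and acts greedily, with no exploration:

```python
import torch

def evaluate(env, net, episodes=10):
    """Run greedy episodes and report the average total reward per episode."""
    total = 0.0
    for _ in range(episodes):
        state, _ = env.reset()
        done = False
        while not done:
            with torch.no_grad():
                q = net(torch.as_tensor(state, dtype=torch.float32).unsqueeze(0))
            action = int(q.argmax(dim=1).item())  # pick the highest-Q action
            state, reward, terminated, truncated, _ = env.step(action)
            total += reward
            done = terminated or truncated
    return total / episodes
```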