This repository contains PyTorch implementations of Q-learning and policy gradient for the OpenAI Gym CarRacing-v0 environment (https://gym.openai.com/envs/CarRacing-v0/).
The environment is described in more detail in its source file: https://github.com/openai/gym/blob/master/gym/envs/box2d/car_racing.py
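
CarRacing-v0 takes a continuous action `[steering, gas, brake]`, with steering in [-1, 1] and gas and brake in [0, 1], while Q-learning needs a finite set of actions to choose from. The sketch below shows one possible way to build such a set. The grid values, the rule that skips gas and brake together, and the names `STEERING`, `GAS`, `BRAKE`, `build_discrete_actions` and `ACTIONS` are illustrative assumptions, not necessarily what this repository uses.

```python
from itertools import product

# CarRacing-v0 actions are [steering, gas, brake]:
# steering in [-1, 1], gas in [0, 1], brake in [0, 1].
# The grid values below are hypothetical choices for illustration.
STEERING = (-1.0, 0.0, 1.0)
GAS = (0.0, 1.0)
BRAKE = (0.0, 0.8)


def build_discrete_actions():
    """Return (steering, gas, brake) triples, skipping gas and brake together."""
    actions = []
    for steer, gas, brake in product(STEERING, GAS, BRAKE):
        if gas > 0.0 and brake > 0.0:
            continue  # accelerating while braking is assumed to be useless
        actions.append((steer, gas, brake))
    return actions


ACTIONS = build_discrete_actions()
```

A Q-network would then output one value per entry of `ACTIONS`, and the index of the largest value selects the triple passed to `env.step`.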
Tested with:
- Ubuntu 18.04
- Nvidia RTX 2070 card
- Cuda 10.2
- CuDNN 7.6.5
pip install -r requirements.txt
./play.sh
TODO:
- Describe action spaces
- Add more insights (e.g. increasing the experience buffer size helps)
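
For readers unfamiliar with the experience buffer mentioned above, here is a minimal sketch of a fixed-capacity replay buffer. The class name `ReplayBuffer`, its methods and the transition layout are assumptions for illustration; the buffer in this repository may be organised differently.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of transitions; the oldest entries are dropped first."""

    def __init__(self, capacity):
        # deque with maxlen discards the oldest item once capacity is reached
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement, returned as five parallel tuples
        batch = random.sample(list(self.buffer), batch_size)
        return tuple(zip(*batch))

    def __len__(self):
        return len(self.buffer)
```

The `capacity` argument is the knob the insight refers to: a larger value keeps older transitions available for sampling for longer.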
TODO
TODO
Models, configurations and outputs are available on Google Drive.
Model name | Main configuration improvement(s) |
---|---|
model_basic_openai_stop_expl | Stop after 50 consecutive negative rewards |
(*) Around 100 test runs are needed for reliable results.
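
The stopping rule in the table can be sketched as a small counter. This is only an illustration of the rule as stated: the class name `NegativeRewardStopper`, the `patience` parameter and the decision to reset the streak on any non-negative reward are assumptions, and what exactly gets stopped when the rule fires depends on the configuration.

```python
class NegativeRewardStopper:
    """Signal a stop after `patience` consecutive negative rewards."""

    def __init__(self, patience=50):
        self.patience = patience
        self.streak = 0  # current run of consecutive negative rewards

    def update(self, reward):
        """Record one step's reward; return True once the streak reaches patience."""
        if reward < 0:
            self.streak += 1
        else:
            self.streak = 0  # assumed: any non-negative reward breaks the streak
        return self.streak >= self.patience

    def reset(self):
        """Clear the streak, e.g. when a new episode starts."""
        self.streak = 0
```

In a training loop, `update(reward)` would be called after every environment step and `reset()` at the start of each episode.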
TODO