Hints
Glen edited this page Dec 10, 2015
·
2 revisions
This page collects general hints for using neural networks as RL models.
- It is best to make the samples as IID as possible.
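One common way to bring training samples closer to IID is to store transitions in a buffer and draw minibatches from it uniformly at random, rather than training on consecutive, highly correlated steps. This page does not name that technique, so the following is only a minimal sketch under that assumption; `ReplayBuffer`, `add`, and `sample` are hypothetical names chosen for illustration.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of transitions, sampled uniformly at random (hypothetical helper)."""

    def __init__(self, capacity):
        # A deque with maxlen silently drops the oldest transition when full.
        self.buffer = deque(maxlen=capacity)

    def add(self, transition):
        # A transition is assumed to be a tuple such as (s, a, r, s_next).
        self.buffer.append(transition)

    def sample(self, batch_size):
        # Uniform sampling without replacement breaks up the temporal
        # correlation between consecutive transitions.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)
```

Sampling this way does not make the data truly IID, since the stored transitions still come from a changing policy, but it removes most of the step-to-step correlation within a minibatch.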
Gradient descent works well for many applications, but it can run into problems when fitting an action value function. Direct gradient descent moves the action value function Q(s_t, a_t) closer to r_{t+1} + gamma * Q(s_{t+1}, a_{t+1}), but it also moves gamma * Q(s_{t+1}, a_{t+1}) closer to Q(s_t, a_t) - r_{t+1}. I have found that this can result in very poor behaviour during optimization: the landscape tends to get very flat, and optimization gets slower and slower.
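The two-sided effect described above can be seen in a small example. The sketch below assumes a linear action value function Q(s, a) = w . phi(s, a) and the squared TD error 0.5 * delta^2 as the loss; both are assumptions made for illustration, as are the names `td_gradients`, `phi`, and `phi_next`. It compares the full gradient, which also differentiates through the bootstrapped target, with a gradient that treats the target as a constant (often called a semi-gradient).

```python
import numpy as np


def td_gradients(w, phi, phi_next, reward, gamma):
    """Return (full_gradient, semi_gradient) of 0.5 * delta**2 for linear Q = w . phi.

    phi      : features of (s_t, a_t)
    phi_next : features of (s_{t+1}, a_{t+1})
    """
    # TD error: target minus current estimate.
    delta = reward + gamma * (w @ phi_next) - (w @ phi)
    # Full gradient: differentiates through Q(s_t, a_t) AND the target,
    # so a descent step also changes Q(s_{t+1}, a_{t+1}).
    full = delta * (gamma * phi_next - phi)
    # Semi-gradient: the target is held fixed, so only Q(s_t, a_t) moves.
    semi = -delta * phi
    return full, semi


w = np.zeros(2)
phi = np.array([1.0, 0.0])       # features for (s_t, a_t)
phi_next = np.array([0.0, 1.0])  # features for (s_{t+1}, a_{t+1})
full, semi = td_gradients(w, phi, phi_next, reward=1.0, gamma=0.9)

lr = 0.1
print("full step:", w - lr * full)  # raises Q(s_t, a_t) and lowers Q(s_{t+1}, a_{t+1})
print("semi step:", w - lr * semi)  # raises Q(s_t, a_t) only
```

With the full gradient, the weight tied to the next state-action pair is pushed down even though nothing was learned about its value; holding the target fixed avoids that side effect.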