Glen edited this page Dec 10, 2015 · 2 revisions

Intro

This page collects general hints for using neural networks as models in reinforcement learning (RL).

Hints

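  1. Make the training samples as close to independent and identically distributed (IID) as possible. Consecutive transitions from the same trajectory are strongly correlated, and training on them in order works against this.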
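
One common way to get closer to IID samples is to store transitions and draw training batches from them at random. The following is a minimal sketch of that idea, not a method this page prescribes; the class name `ReplayBuffer`, its methods, and the transition layout `(state, action, reward, next_state, done)` are all assumptions made for illustration.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of transitions, sampled uniformly at random.

    Hypothetical helper for illustration only.
    """

    def __init__(self, capacity, seed=None):
        # A bounded deque drops the oldest transition once capacity is reached.
        self.buffer = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement breaks up the temporal
        # correlation between consecutive transitions.
        return self.rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)
```

A training loop would call `add` after every environment step and train on `sample(batch_size)` once the buffer holds at least `batch_size` transitions. Sampled batches are still not truly IID, since the stored data depends on the policy that collected it, but they are much less correlated than consecutive steps.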

Gradient descent vs gradient TD updates

Gradient descent works well for many applications, but it can run into problems when fitting an action-value function. Direct gradient descent on the TD error moves the action-value function Q(s_t, a_t) closer to r_{t+1} + gamma * Q(s_{t+1}, a_{t+1}), but it also moves gamma * Q(s_{t+1}, a_{t+1}) closer to Q(s_t, a_t) - r_{t+1}. I have found that this can result in very poor behaviour during optimization: the loss landscape tends to get very flat, and optimization gets slower and slower.
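
To make the difference concrete, here is a small sketch comparing the two kinds of update on a single transition. It assumes a linear action-value function Q(s, a) = w · phi(s, a) and the squared TD error as the loss; the function names and the feature vectors are hypothetical, and the second update is the standard semi-gradient TD update (the target is treated as a constant), which may not be exactly the variant the heading above has in mind.

```python
def q_value(w, phi):
    """Linear action value: dot product of weights and features."""
    return sum(wi * pi for wi, pi in zip(w, phi))


def td_error(w, phi, reward, phi_next, gamma):
    """delta = r + gamma * Q(s', a') - Q(s, a)."""
    return reward + gamma * q_value(w, phi_next) - q_value(w, phi)


def direct_gradient_step(w, phi, reward, phi_next, gamma, lr):
    """Gradient descent on 0.5 * delta**2 with respect to all of w.

    The gradient flows through both Q(s, a) and Q(s', a'), so the
    next-state value is also pulled towards Q(s, a) - r.
    """
    delta = td_error(w, phi, reward, phi_next, gamma)
    return [wi + lr * delta * (p - gamma * pn)
            for wi, p, pn in zip(w, phi, phi_next)]


def semi_gradient_step(w, phi, reward, phi_next, gamma, lr):
    """TD update that treats r + gamma * Q(s', a') as a fixed target.

    Only Q(s, a) is moved; the next-state features get no gradient.
    """
    delta = td_error(w, phi, reward, phi_next, gamma)
    return [wi + lr * delta * p for wi, p in zip(w, phi)]
```

For example, with `w = [0.0, 0.0]`, `phi = [1.0, 0.0]`, `phi_next = [0.0, 1.0]`, `reward = 1.0`, `gamma = 0.9` and `lr = 0.1`, both updates raise the first weight to 0.1, but the direct update also pushes the second weight (the one that determines the next-state value) down to about -0.09, while the semi-gradient update leaves it at 0.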
