cgao3 commented Oct 22, 2020

  1. The description of the stochastic deep sea environment says "adds N(0,1) noise to the end of states of the chain", but at line 125, noisy rewards are only added when "column" is either 0 or "_size - 1".
  2. The description of the stochastic deep sea environment says "act right with 1 - 1/N moves agent to right", but at line 121, when the agent is at cell "(_size - 1, _size - 1)" and acts right, there is no such stochasticity.

This pull request fixes these two inconsistencies.
Without these fixes, the expected value under the optimal policy is more complicated; with them, it is simply
(1 - 1/N)^N * 0.99 + (-0.01 + E[norm(0,1)]) * (1 - (1 - 1/N)^N)
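
As a quick sanity check of the expression above, here is a minimal Python sketch that evaluates it for a given N. The function name `expected_optimal_value` and the `noise_mean` parameter are hypothetical and not part of the repository; the sketch assumes E[norm(0,1)] = 0 and takes the 0.99 and -0.01 terms directly from the expression.

```python
def expected_optimal_value(n: int, noise_mean: float = 0.0) -> float:
    """Evaluate (1 - 1/N)^N * 0.99 + (-0.01 + E[noise]) * (1 - (1 - 1/N)^N).

    n          -- the N in the expression (assumed to be a positive integer)
    noise_mean -- E[norm(0,1)], assumed to be 0 for zero-mean noise
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    # Probability that all N "right" actions succeed, each with prob. 1 - 1/N.
    p_all_right = (1.0 - 1.0 / n) ** n
    # First term: value when every step succeeds.
    # Second term: value otherwise, including the mean of the added noise.
    return p_all_right * 0.99 + (-0.01 + noise_mean) * (1.0 - p_all_right)


if __name__ == "__main__":
    for size in (2, 5, 10, 20):
        print(size, expected_optimal_value(size))
```

For N = 2, (1 - 1/2)^2 = 0.25, so the expression evaluates to 0.25 * 0.99 - 0.01 * 0.75 = 0.24.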


google-cla bot commented Oct 22, 2020

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed (or fixed any issues), please reply here with @googlebot I signed it! and we'll verify it.



google-cla bot added the cla: no label Oct 22, 2020

cgao3 commented Oct 22, 2020

@googlebot I signed it!


google-cla bot added cla: yes and removed cla: no labels Oct 22, 2020