---
title: "Robust Reinforcement Learning using Least Squares Policy Iteration with Provable Performance Guarantees"
abstract: "This paper addresses the problem of model-free reinforcement learning for Robust Markov Decision Processes (RMDPs) with large state spaces. The goal of the RMDP framework is to find a policy that is robust against parameter uncertainties arising from the mismatch between the simulator model and real-world settings. We first propose the Robust Least Squares Policy Evaluation algorithm, which is a multi-step online model-free learning algorithm for policy evaluation. We prove the convergence of this algorithm using stochastic approximation techniques. We then propose the Robust Least Squares Policy Iteration (RLSPI) algorithm for learning the optimal robust policy. We also give a general weighted Euclidean norm bound on the error (closeness to optimality) of the resulting policy. Finally, we demonstrate the performance of our RLSPI algorithm on some benchmark problems from OpenAI Gym."
layout: inproceedings
series: Proceedings of Machine Learning Research
publisher: PMLR
issn: 2640-3498
id: badrinath21a
month: 0
tex_title: "Robust Reinforcement Learning using Least Squares Policy Iteration with Provable Performance Guarantees"
firstpage: 511
lastpage: 520
page: 511-520
order: 511
cycles: false
bibtex_author: Badrinath, Kishan Panaganti and Kalathil, Dileep
author:
- given: Kishan Panaganti
  family: Badrinath
- given: Dileep
  family: Kalathil
date: 2021-07-01
address:
container-title: Proceedings of the 38th International Conference on Machine Learning
volume: 139
genre: inproceedings
issued:
  date-parts:
  - 2021
  - 7
  - 1
pdf: http://proceedings.mlr.press/v139/badrinath21a/badrinath21a.pdf
extras:
- label: Supplementary PDF
  link: http://proceedings.mlr.press/v139/badrinath21a/badrinath21a-supp.pdf
---

2021-07-01-badrinath21a.md
