2021-07-01-arumugam21a.md

title: 'Deciding What to Learn: A Rate-Distortion Approach'
abstract: 'Agents that learn to select optimal actions represent a prominent focus of the sequential decision-making literature. In the face of a complex environment or constraints on time and resources, however, aiming to synthesize such an optimal policy can become infeasible. These scenarios give rise to an important trade-off between the information an agent must acquire to learn and the sub-optimality of the resulting policy. While an agent designer has a preference for how this trade-off is resolved, existing approaches further require that the designer translate these preferences into a fixed learning target for the agent. In this work, leveraging rate-distortion theory, we automate this process such that the designer need only express their preferences via a single hyperparameter and the agent is endowed with the ability to compute its own learning targets that best achieve the desired trade-off. We establish a general bound on expected discounted regret for an agent that decides what to learn in this manner along with computational experiments that illustrate the expressiveness of designer preferences and even show improvements over Thompson sampling in identifying an optimal policy.'
layout: inproceedings
series: Proceedings of Machine Learning Research
publisher: PMLR
issn: 2640-3498
id: arumugam21a
month: 0
tex_title: 'Deciding What to Learn: A Rate-Distortion Approach'
firstpage: 373
lastpage: 382
page: 373-382
order: 373
cycles: false
bibtex_author: Arumugam, Dilip and Van Roy, Benjamin
author:
- given: Dilip
  family: Arumugam
- given: Benjamin
  family: Van Roy
date: 2021-07-01
address:
container-title: Proceedings of the 38th International Conference on Machine Learning
volume: 139
genre: inproceedings
issued:
  date-parts:
  - 2021
  - 7
  - 1
pdf:
extras: