Reinforcement Learning and Planning · Exploration

A Meta-MDP Approach to Exploration for Lifelong Reinforcement Learning
    Francisco Garcia · Philip Thomas

Limiting Extrapolation in Linear Approximate Value Iteration
    Andrea Zanette · Alessandro Lazaric · Mykel J Kochenderfer · Emma Brunskill

Propagating Uncertainty in Reinforcement Learning via Wasserstein Barycenters
    Alberto Maria Metelli · Amarildo Likmeta · Marcello Restelli

Provably Efficient Q-Learning with Low Switching Cost
    Yu Bai · Tengyang Xie · Nan Jiang · Yu-Xiang Wang

Regret Bounds for Learning State Representations in Reinforcement Learning
    Ronald Ortner · Matteo Pirotta · Alessandro Lazaric · Ronan Fruit · Odalric-Ambrym Maillard

Safe Exploration for Interactive Machine Learning
    Matteo Turchetta · Felix Berkenkamp · Andreas Krause

Successor Uncertainties: Exploration and Uncertainty in Temporal Difference Learning
    David Janz · Jiri Hron · Przemysław Mazur · Katja Hofmann · José Miguel Hernández-Lobato · Sebastian Tschiatschek

Almost Horizon-Free Structure-Aware Best Policy Identification with a Generative Model
    Andrea Zanette · Mykel J Kochenderfer · Emma Brunskill

Better Exploration with Optimistic Actor Critic
    Kamil Ciosek · Quan Vuong · Robert Loftin · Katja Hofmann

Provably Efficient Q-learning with Function Approximation via Distribution Shift Error Checking Oracle
    Simon Du · Yuping Luo · Ruosong Wang · Hanrui Zhang

Explicit Planning for Efficient Exploration in Reinforcement Learning
    Liangpeng Zhang · Ke Tang · Xin Yao

Exploration Bonus for Regret Minimization in Discrete and Continuous Average Reward MDPs
    Jian Qian · Ronan Fruit · Matteo Pirotta · Alessandro Lazaric

Information-Theoretic Confidence Bounds for Reinforcement Learning
    Xiuyuan Lu · Benjamin Van Roy

Worst-Case Regret Bounds for Exploration via Randomized Value Functions
    Daniel Russo