Beyond Confidence Regions: Tight Bayesian Ambiguity Sets for Robust MDPs | Marek Petrik · Reazul Hasan Russel |
Correlation Priors for Reinforcement Learning | Bastian Alt · Adrian Šošić · Heinz Koeppl |
Explicit Explore-Exploit Algorithms in Continuous State Spaces | Mikael Henaff |
Learning to Predict Without Looking Ahead: World Models Without Forward Prediction | Daniel Freeman · David Ha · Luke Metz |
Mapping State Space using Landmarks for Universal Goal Reaching | Zhiao Huang · Hao Su · Fangchen Liu |
Regularizing Trajectory Optimization with Denoising Autoencoders | Rinu Boney · Norman Di Palo · Mathias Berglund · Alexander Ilin · Juho Kannala · Antti Rasmus · Harri Valpola |
Tight Regret Bounds for Model-Based Reinforcement Learning with Greedy Policies | Yonathan Efroni · Nadav Merlis · Mohammad Ghavamzadeh · Shie Mannor |
When to Trust Your Model: Model-Based Policy Optimization | Michael Janner · Justin Fu · Marvin Zhang · Sergey Levine |
When to use parametric models in reinforcement learning? | Hado van Hasselt · Matteo Hessel · John Aslanides |