Reinforcement Learning and Planning · Markov Decision Processes

Title | Authors
A Family of Robust Stochastic Operators for Reinforcement Learning | Yingdong Lu · Mark Squillante · Chai Wah Wu
A Unified Bellman Optimality Principle Combining Reward Maximization and Empowerment | Felix Leibfried · Sergio Pascual-Díaz · Jordi Grau-Moya
Finite-Sample Analysis for SARSA with Linear Function Approximation | Shaofeng Zou · Tengyu Xu · Yingbin Liang
Maximum Expected Hitting Cost of a Markov Decision Process and Informativeness of Rewards | Falcon Dai · Matthew Walter
Non-Asymptotic Gap-Dependent Regret Bounds for Tabular MDPs | Max Simchowitz · Kevin Jamieson
Regret Minimization for Reinforcement Learning with Vectorial Feedback and Complex Objectives | Wang Chi Cheung
Sampling Networks and Aggregate Simulation for Online POMDP Planning | Hao (Jackson) Cui · Roni Khardon
Two Time-scale Off-Policy TD Learning: Non-asymptotic Analysis over Markovian Samples | Tengyu Xu · Shaofeng Zou · Yingbin Liang
Value Function in Frequency Domain and the Characteristic Value Iteration Algorithm | Amir-massoud Farahmand