14 Results
Poster | Thu 18:30 | Online Target Q-learning with Reverse Experience Replay: Efficiently finding the Optimal Policy for Linear MDPs | Naman Agarwal · Syomantak Chaudhuri · Prateek Jain · Dheeraj Nagaraj · Praneeth Netrapalli
Poster | Mon 18:30 | Iterated Reasoning with Mutual Information in Cooperative and Byzantine Decentralized Teaming | Sachin Konan · Esmaeil Seraj · Matthew Gombolay
Poster | Tue 10:30 | Continuously Discovering Novel Strategies via Reward-Switching Policy Optimization | Zihan Zhou · Wei Fu · Bingliang Zhang · Yi Wu
Spotlight | Mon 2:30 | Constrained Policy Optimization via Bayesian World Models | Yarden As · Ilnura Usmanova · Sebastian Curi · Andreas Krause
Poster | Mon 2:30 | Constrained Policy Optimization via Bayesian World Models | Yarden As · Ilnura Usmanova · Sebastian Curi · Andreas Krause
Poster | Wed 18:30 | Gradient Information Matters in Policy Optimization by Back-propagating through Model | Chongchong Li · Yue Wang · Wei Chen · Yuting Liu · Zhi-Ming Ma · Tie-Yan Liu
Poster | Thu 10:30 | Mirror Descent Policy Optimization | Manan Tomar · Lior Shani · Yonathan Efroni · Mohammad Ghavamzadeh
Poster | Wed 18:30 | Efficient Learning of Safe Driving Policy via Human-AI Copilot Optimization | Quanyi Li · Zhenghao Peng · Bolei Zhou
Poster | Tue 18:30 | Pareto Policy Pool for Model-based Offline Reinforcement Learning | Yijun Yang · Jing Jiang · Tianyi Zhou · Jie Ma · Yuhui Shi
Poster | Mon 10:30 | Actor-critic is implicitly biased towards high entropy optimal policies | Yuzheng Hu · Ziwei Ji · Matus Telgarsky
Poster | Tue 18:30 | Actor-Critic Policy Optimization in a Large-Scale Imperfect-Information Game | Haobo Fu · Weiming Liu · Shuang Wu · Yijia Wang · Tao Yang · Kai Li · Junliang Xing · Bin Li · Bo Ma · QIANG FU · Yang Wei
Poster | Thu 10:30 | Bregman Gradient Policy Optimization | Feihu Huang · Shangqian Gao · Heng Huang