

Poster in Workshop: Workshop on Agent Learning in Open-Endedness

Generalization Games for Reinforcement Learning

Manfred Diaz · Charlie Gauthier · Glen Berseth · Liam Paull


Abstract:

Many subfields have emerged in reinforcement learning (RL) to understand how distributions of training tasks affect an RL agent's ability to transfer learned experiences to one or more evaluation tasks. While the field is extensive and ever-growing, recent research has underlined that the variability among the different methods is not especially significant. We leverage this intuition to demonstrate that current methods for generalization in RL are specializations of a general framework. We obtain the fundamental aspects of this formulation by rebuilding a Markov Decision Process (MDP) from the ground up, resurfacing the game-theoretic framework of games against nature. The two-player game that arises from treating nature as a full player in this formulation explains how existing approaches rely on learned and randomized dynamics and initial state distributions. We develop this result further by drawing inspiration from mechanism design theory to introduce the role of a principal as a third player that can modify the payoff functions of the decision-making agent and nature. The main contribution of our work is the complete description of the Generalization Games for Reinforcement Learning, a multiagent, multiplayer, game-theoretic formal approach to studying generalization methods in RL. The games induced by playing against the principal extend our framework to explain how learned and randomized reward functions induce generalization in RL agents. We offer a preliminary ablation experiment on the different components of the framework and demonstrate that a simpler composition of the objectives we introduce for each player leads to zero-shot generalization performance that is comparable to, and in some cases superior to, that of state-of-the-art methods, while requiring almost two orders of magnitude fewer samples.
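
The three roles described in the abstract can be illustrated with a toy matrix game. The sketch below is not the authors' formulation: the payoff table, the function names, and the additive bonus used for the principal are all assumptions made for illustration. It only shows how an agent's preferred policy can change depending on whether nature randomizes its choice, plays adversarially, or has its game reshaped by a principal.

```python
import numpy as np

# Hypothetical payoff table (an assumption, not from the paper):
# rows are agent policies, columns are nature's choices
# (for example, different dynamics or initial-state distributions).
payoff = np.array([
    [5.0, 0.0, 4.0],
    [2.0, 2.0, 2.0],
    [0.0, 4.0, 1.0],
])

def best_response_to_random_nature(payoff, nature_probs):
    """Best agent policy when nature samples its choice from fixed probabilities."""
    expected = payoff @ nature_probs          # expected payoff of each policy
    return int(np.argmax(expected)), float(expected.max())

def maximin_policy(payoff):
    """Best agent policy when nature adversarially picks the worst column."""
    worst_case = payoff.min(axis=1)           # worst payoff of each policy
    return int(np.argmax(worst_case)), float(worst_case.max())

def apply_principal(payoff, bonus):
    """A principal modifies the agent's payoffs; here, by adding a bonus table."""
    return payoff + bonus

uniform = np.full(payoff.shape[1], 1.0 / payoff.shape[1])
random_choice = best_response_to_random_nature(payoff, uniform)
robust_choice = maximin_policy(payoff)

# Hypothetical principal that rewards the third policy under every choice of nature.
bonus = np.zeros_like(payoff)
bonus[2, :] = 3.0
shaped_choice = maximin_policy(apply_principal(payoff, bonus))

print("vs. randomized nature:", random_choice)
print("vs. adversarial nature:", robust_choice)
print("after principal's bonus:", shaped_choice)
```

In this toy table the agent prefers the first policy against uniformly randomized nature, the second against adversarial nature, and the third once the principal's bonus is applied, which loosely mirrors how randomized dynamics, robust objectives, and modified rewards each steer what the agent learns.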
