Poster in Workshop: Gamification and Multiagent Solutions
Global Convergence of Multi-Agent Policy Gradient in Markov Potential Games
Stefanos Leonardos · Will Overman · Ioannis Panageas · Georgios Piliouras
Potential games are one of the most important and widely studied classes of normal-form games. They define the archetypal setting of multi-agent coordination in which all agents' utilities are perfectly aligned via a common potential function. Can we embed this intuitive framework in the setting of Markov games? What are the similarities and differences between multi-agent coordination with and without state dependence? To answer these questions, we study a natural class of Markov Potential Games (MPGs) that generalizes prior attempts to capture complex, stateful multi-agent coordination. Counter-intuitively, insights from normal-form potential games do not carry over: an MPG may contain state-games that are zero-sum, and, conversely, a Markov game in which every state-game is a potential game is not necessarily an MPG. Nevertheless, MPGs retain standard desirable properties such as the existence of deterministic Nash policies. In our main result, we prove convergence of independent policy gradient and its stochastic counterpart to Nash policies at a rate that is polynomial in the approximation error, by adapting single-agent gradient-domination properties to the multi-agent setting. This answers questions on the convergence of finite-sample, independent policy gradient methods beyond settings of purely conflicting or purely common interests.
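To make the abstract's objects concrete, the following is a minimal LaTeX sketch of the MPG condition and of the independent policy gradient update the result concerns. The notation is illustrative rather than taken from the paper: Φ denotes the potential, V_i agent i's value function, μ the initial-state distribution, η the step size, and Δ(A_i)^S agent i's directly parameterized policy set.

```latex
\documentclass{article}
\usepackage{amsmath}
\begin{document}
% Markov Potential Game (MPG): there exists a potential function \Phi such
% that any unilateral change of agent i's policy shifts agent i's value by
% exactly as much as it shifts \Phi, for every fixed profile \pi_{-i} of the
% other agents and every initial-state distribution \mu:
\[
  \Phi(\pi_i, \pi_{-i}; \mu) - \Phi(\pi_i', \pi_{-i}; \mu)
  \;=\;
  V_i(\pi_i, \pi_{-i}; \mu) - V_i(\pi_i', \pi_{-i}; \mu).
\]
% Independent (projected) policy gradient with direct parameterization:
% each agent ascends its own value function and projects back onto its own
% product of simplices, without coordinating with the other agents:
\[
  \pi_i^{(t+1)}
  \;=\;
  \operatorname{Proj}_{\Delta(A_i)^{S}}\!\Big[
    \pi_i^{(t)} + \eta\, \nabla_{\pi_i} V_i\big(\pi^{(t)}; \mu\big)
  \Big].
\]
\end{document}
```

Under the MPG condition, every agent's individual ascent direction is also an ascent direction for the common potential Φ, which is the mechanism that lets single-agent gradient-domination arguments carry over to this decentralized update.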