ICLR Poster Identifying Policy Gradient Subspaces

Jan Schneider · Pierre Schumacher · Simon Guist · Le Chen · Daniel Haeufle · Bernhard Schoelkopf · Dieter Büchler

Halle B

Abstract:

Policy gradient methods hold great potential for solving complex continuous control tasks. Still, their training efficiency can be improved by exploiting structure within the optimization problem. Recent work indicates that supervised learning can be accelerated by leveraging the fact that gradients lie in a low-dimensional and slowly changing subspace. In this paper, we demonstrate the existence of such gradient subspaces for policy gradient algorithms despite the continuously changing data distribution inherent to reinforcement learning. Our findings reveal promising directions for more efficient reinforcement learning, e.g., through improving parameter-space exploration or enabling second-order optimization.
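
As an illustration of what it means for gradients to lie in a low-dimensional subspace, the following minimal sketch measures how much of a gradient's squared norm falls inside a candidate subspace. It rests on assumptions that are not stated in the abstract: the subspace is taken to be the span of the top-k eigenvectors of the loss Hessian, the objective is a toy quadratic rather than a policy gradient loss, and the names `subspace_fraction` and `top_k_eigvecs` are hypothetical helpers introduced only for this example.

```python
import numpy as np


def top_k_eigvecs(hessian, k):
    """Return the eigenvectors of a symmetric matrix with the k largest eigenvalues.

    Assumption for this sketch: the "gradient subspace" is spanned by these vectors.
    """
    # eigh returns eigenvalues in ascending order, eigenvectors as columns.
    _, eigvecs = np.linalg.eigh(hessian)
    return eigvecs[:, -k:]


def subspace_fraction(gradient, basis):
    """Fraction of the squared gradient norm captured by span(basis).

    `basis` must have orthonormal columns, shape (n_params, k).
    """
    coeffs = basis.T @ gradient  # coordinates of the gradient in the subspace
    return float(coeffs @ coeffs / (gradient @ gradient))


# Toy quadratic loss 0.5 * theta^T H theta with a few dominant curvature directions.
hessian = np.diag([100.0, 50.0, 1.0, 0.1, 0.01])
theta = np.ones(5)
gradient = hessian @ theta  # gradient of the quadratic at theta

basis = top_k_eigvecs(hessian, k=2)
fraction = subspace_fraction(gradient, basis)
print(f"fraction of gradient in 2-dim subspace: {fraction:.5f}")
```

A fraction close to 1 means that a 2-dimensional subspace of the 5-dimensional parameter space captures almost all of the gradient. In a reinforcement learning setting, the gradient and curvature would have to be estimated from sampled data, which this toy example does not attempt.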
