

Poster in Workshop: Physics for Machine Learning

Stationary Deep Reinforcement Learning with Quantum K-spin Hamiltonian Regularization

Xiao-Yang Liu · Zechu Li · Shixun Wu · Xiaodong Wang


Abstract:

Instability is a major issue of deep reinforcement learning (DRL) algorithms: performance varies widely across multiple runs. It is mainly caused by the existence of many local minima and is worsened by the fact that Bellman's equation admits multiple fixed points. As a fix, we propose a quantum K-spin Hamiltonian regularization term (called the H-term) that helps a policy network converge to a high-quality local minimum. First, we take a quantum perspective, modeling a policy as a K-spin Ising model and employing a Hamiltonian to measure the energy of a policy. Then, we derive a novel Hamiltonian policy gradient theorem and design a generic actor-critic algorithm that uses the H-term to regularize the policy network. Finally, the proposed method reduces the variance of cumulative rewards by 65.2% to 85.6% on six MuJoCo tasks, compared with existing algorithms over 20 runs.
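The abstract describes adding a Hamiltonian-based energy term to the policy loss. The sketch below illustrates the general shape of such a regularizer, not the paper's actual formulation: the specific K-spin coupling (a product over K consecutive action log-probabilities), the discount weighting, and the coefficient `lam` are all illustrative assumptions, since the abstract does not give the exact Hamiltonian.

```python
import numpy as np

def h_term(rewards, log_probs, K, gamma=0.99):
    """Hypothetical K-spin Hamiltonian-style energy of a trajectory.

    For each window of K consecutive steps, couples the K action
    log-probabilities multiplicatively (an Ising K-spin analogue) and
    weights the coupling by the discounted reward at the window start.
    The exact form used in the paper may differ.
    """
    T = len(rewards)
    energy = 0.0
    for t in range(T - K + 1):
        coupling = np.prod(log_probs[t:t + K])  # K-spin interaction term
        energy += (gamma ** t) * rewards[t] * coupling
    return -energy  # lower energy corresponds to a better policy

def regularized_loss(pg_loss, rewards, log_probs, K=3, lam=0.1):
    """Policy-gradient loss plus the H-term regularizer (lam is hypothetical)."""
    return pg_loss + lam * h_term(rewards, log_probs, K)
```

In an actor-critic loop, the regularizer would simply be added to the usual policy loss before backpropagation, steering the policy network toward low-energy (high-quality) local minima.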
