

Spotlight

Prompt Gradient Projection for Continual Learning

Jingyang Qiao · Zhizhong Zhang · Xin Tan · Chengwei Chen · Yanyun Qu · Yong Peng · Yuan Xie


Abstract:

Prompt-tuning has demonstrated impressive performance in continual learning by querying relevant prompts when training on novel classes. Forgetting is thereby reduced, since this instance-wise query mechanism selects and updates only the relevant prompts. In this paper, we further integrate prompt-tuning with the gradient projection approach. Our observation is twofold: prompt-tuning removes the need for a task identifier in the gradient projection method, while gradient projection provides theoretical guarantees against forgetting for prompt-tuning. This inspires a new prompt gradient projection approach (PGP) for continual learning. In PGP, we deduce the orthogonality condition for the prompt gradient via the self-attention mechanism in the vision transformer. The condition equations are then solved by conducting Singular Value Decomposition (SVD) on an element-wise sum space of the input space and the prompt space. We validate our method on diverse datasets, and experiments demonstrate its effectiveness in reducing forgetting in class-incremental, online class-incremental, and task-incremental settings.
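The core mechanism described above can be illustrated with a minimal sketch: SVD extracts a basis for the subspace spanned by features of previous tasks (here, a stand-in for the element-wise sum space of input and prompt features), and new prompt gradients are projected to be orthogonal to that basis. The function names, the energy threshold, and the use of a plain feature matrix are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def get_old_task_basis(feature_matrix, energy_threshold=0.97):
    """Return an orthonormal basis for the subspace of past-task features.

    `feature_matrix` is a (d, n) stand-in for the element-wise sum space;
    the 0.97 energy threshold is an assumed hyperparameter.
    """
    U, S, _ = np.linalg.svd(feature_matrix, full_matrices=False)
    # Keep the leading singular vectors capturing `energy_threshold`
    # of the total spectral energy.
    energy = np.cumsum(S ** 2) / np.sum(S ** 2)
    k = int(np.searchsorted(energy, energy_threshold)) + 1
    return U[:, :k]  # shape (d, k), orthonormal columns

def project_gradient(grad, basis):
    """Remove the gradient component lying in the old-task subspace,
    so the update is orthogonal to previously learned directions."""
    return grad - basis @ (basis.T @ grad)
```

Because the retained singular vectors are orthonormal, the projected gradient has zero component along every basis direction, which is the property that gradient projection methods rely on to bound interference with earlier tasks.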
