Poster
in
Workshop: Mathematical and Empirical Understanding of Foundation Models (ME-FoMo)
Retrieval of Soft Prompt Enhances Zero-Shot Task Generalization
Seonghyeon Ye · Joel Jang · Doyoung Kim · Yongrae Jo · Minjoon Seo
Keywords: [ zero-shot language models ] [ natural language processing ] [ large language models ]
During zero-shot inference with language models (LMs), hard prompts alone may not fully describe the target task. In this paper, we explore how retrieving soft prompts obtained through prompt tuning can assist hard prompts in zero-shot task generalization. Specifically, we train a soft prompt embedding for each prompt through prompt tuning, store samples of the training instances (hard prompt + input instance) mapped to their prompt embeddings, and at inference retrieve the prompt embedding of the stored training instance closest to the query instance. Results show that this simple approach enhances the performance of T0 on unseen tasks, outperforming it on 10 out of 11 datasets and improving its mean accuracy on the BIG-bench benchmark by 2.39 percentage points, while adding only 0.007% additional parameters. Furthermore, interpolating multiple retrieved embeddings and applying variance-based ranking further improve accuracy and robustness to different evaluation prompts, widening the performance gap.
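The retrieve-then-prepend procedure described above can be sketched as a nearest-neighbor lookup over stored instance encodings. The class below is a minimal illustration, not the authors' implementation: the toy encodings, the cosine-similarity retrieval, and the similarity-weighted interpolation of the top-k embeddings are all assumptions made for the sake of a self-contained example.

```python
import numpy as np

class SoftPromptRetriever:
    """Illustrative sketch: store (training-instance encoding -> soft
    prompt embedding) pairs, then retrieve the embedding of the nearest
    stored instance for a query. Names and encoder are hypothetical."""

    def __init__(self):
        self.keys = []    # encodings of training instances (hard prompt + input)
        self.values = []  # soft prompt embeddings learned via prompt tuning

    def add(self, instance_encoding, prompt_embedding):
        self.keys.append(np.asarray(instance_encoding, dtype=float))
        self.values.append(np.asarray(prompt_embedding, dtype=float))

    def retrieve(self, query_encoding, top_k=1):
        # Cosine similarity between the query and all stored encodings.
        q = np.asarray(query_encoding, dtype=float)
        keys = np.stack(self.keys)
        sims = keys @ q / (np.linalg.norm(keys, axis=1) * np.linalg.norm(q) + 1e-9)
        idx = np.argsort(-sims)[:top_k]
        # Interpolation of multiple embeddings: similarity-weighted mean.
        weights = sims[idx] / sims[idx].sum()
        return np.average(
            np.stack([self.values[i] for i in idx]), axis=0, weights=weights
        )
```

At inference, the retrieved embedding would be prepended to the input embeddings of the frozen LM alongside the hard prompt; with `top_k > 1`, the weighted average corresponds to the interpolation variant mentioned above.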