ICLR Poster A Non-monotonic Self-terminating Language Model

In-Person Poster presentation / poster accept

Eugene Choi · Kyunghyun Cho · Cheolhyoung Lee

MH1-2-3-4 #36

Keywords: [ language modeling ] [ consistency ] [ non-terminating sequences ] [ sequence completion ] [ self-terminating ] [ decoding ] [ Applications ]

Abstract: Recent large-scale neural autoregressive sequence models have shown impressive performance on a variety of natural language generation tasks. However, their generated sequences often exhibit degenerate properties such as non-termination, undesirable repetition, and premature termination when generated with decoding algorithms such as greedy search, beam search, top-$k$ sampling, and nucleus sampling. In this paper, we focus on the problem of non-terminating sequences resulting from an incomplete decoding algorithm. We first define an incomplete probable decoding algorithm, a class that includes greedy search, top-$k$ sampling, and nucleus sampling and that extends the incomplete decoding algorithm originally put forward by Welleck et al. (2020). We then propose a non-monotonic self-terminating language model, which significantly relaxes the constraint of monotonically increasing termination probability in the self-terminating language model originally proposed by Welleck et al. (2020), to address the issue of non-terminating sequences when using incomplete probable decoding algorithms. We prove that our proposed model prevents non-terminating sequences not only when using incomplete probable decoding algorithms but also when using beam search. We empirically validate our model on sequence completion tasks with various architectures.
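
As a toy illustration of why a decoding algorithm such as top-$k$ sampling can produce non-terminating sequences, the sketch below filters a made-up next-token distribution. It is not the authors' code: the vocabulary, the probabilities, the `EOS` index, and the helper `top_k_filter` are all hypothetical. The point is only that when the end-of-sequence token falls outside the top $k$, it receives probability zero at that step and cannot be selected.

```python
def top_k_filter(probs, k):
    """Keep the k most probable tokens, zero out the rest, and renormalize."""
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    total = sum(probs[i] for i in top)
    return [probs[i] / total if i in top else 0.0 for i in range(len(probs))]


# Hypothetical next-token distribution over a 4-token vocabulary.
# Index 0 plays the role of the end-of-sequence token (an assumption
# made for this example only).
EOS = 0
probs = [0.05, 0.40, 0.30, 0.25]

filtered = top_k_filter(probs, k=2)

# The end-of-sequence token is not among the 2 most probable tokens,
# so after filtering it has probability 0 at this step.
eos_prob_after_filter = filtered[EOS]
```

If the end-of-sequence token stays outside the top $k$ at every step, decoding never terminates, even though the model assigns it nonzero probability each time.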
