

Poster

SPTNet: An Efficient Alternative Framework for Generalized Category Discovery with Spatial Prompt Tuning

Hongjun Wang · Sagar Vaze · Kai Han

Halle B
Tue 7 May 7:30 a.m. PDT — 9:30 a.m. PDT

Abstract:

Generalized Category Discovery (GCD) aims to classify unlabelled images from both seen and unseen classes by transferring knowledge from a set of labelled seen-class images. A key theme in existing GCD approaches is adapting large-scale pretrained models to the GCD task. An alternative perspective, however, is to adapt the data representation itself for better alignment with the pretrained model. As such, in this paper, we introduce a two-stage adaptation approach termed SPTNet, which iteratively optimizes model parameters (i.e., model finetuning) and data parameters (i.e., prompt learning). Furthermore, we propose a novel spatial prompt tuning method (SPT) that considers the spatial property of image data, enabling the method to better focus on object parts, which can transfer between seen and unseen classes. We thoroughly evaluate SPTNet on standard benchmarks and demonstrate that it outperforms existing GCD methods. Notably, our method achieves an average accuracy of 61.4% on the Semantic Shift Benchmark (SSB), surpassing prior state-of-the-art methods by approximately 10%. The improvement is particularly remarkable because our method introduces extra parameters amounting to only 0.042% of those in the backbone architecture.
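
The alternation between data parameters and model parameters described in the abstract can be illustrated with a toy sketch. This is not the SPTNet implementation: the scalar model `w`, the additive input offset `p` standing in for a learned prompt, the squared-error loss, and all function names and hyperparameters are assumptions chosen only to show the two-stage schedule, in which one group of parameters is frozen while the other is updated.

```python
# Toy alternating optimization (illustrative assumption, not the authors' code).
# "Model parameter": scalar weight w.  "Data parameter": additive prompt p on the input.
# Prediction is w * (x + p); the loss is mean squared error.

def loss(w, p, xs, ys):
    return sum((w * (x + p) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def grad_w(w, p, xs, ys):
    # Derivative of the loss with respect to the model weight w.
    return sum(2 * (w * (x + p) - y) * (x + p) for x, y in zip(xs, ys)) / len(xs)

def grad_p(w, p, xs, ys):
    # Derivative of the loss with respect to the prompt p.
    return sum(2 * (w * (x + p) - y) * w for x, y in zip(xs, ys)) / len(xs)

def alternate(xs, ys, w=0.5, p=0.0, rounds=50, steps=5, lr=0.05):
    history = [loss(w, p, xs, ys)]
    for _ in range(rounds):
        # Stage 1: freeze the model, update only the prompt.
        for _ in range(steps):
            p -= lr * grad_p(w, p, xs, ys)
        # Stage 2: freeze the prompt, update only the model.
        for _ in range(steps):
            w -= lr * grad_w(w, p, xs, ys)
        history.append(loss(w, p, xs, ys))
    return w, p, history

# Synthetic data generated from y = 2 * (x + 1), so the loss is zero at w = 2, p = 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [2.0 * (x + 1.0) for x in xs]
w, p, history = alternate(xs, ys)
print(f"w={w:.3f} p={p:.3f} initial loss={history[0]:.3f} final loss={history[-1]:.6f}")
```

In this toy setting the prompt is a single scalar, so it adds one parameter on top of the model; the abstract's point about a small parameter overhead is analogous, though the actual prompts in SPT are spatial and attached to image data.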
