Poster in Workshop: Socially Responsible Machine Learning

Towards Data-Free Model Stealing in a Hard Label Setting

Sunandini Sanyal · Sravanti Addepalli · Venkatesh Babu Radhakrishnan


Abstract:

Machine learning models deployed as a service (MLaaS) are often susceptible to model stealing attacks. While existing works demonstrate near-perfect performance using the softmax predictions of the classification network, most APIs allow access to only the top-1 labels. In this work, we show that it is indeed possible to steal Machine Learning models by accessing only top-1 predictions (Hard Label setting), without access to model gradients (Black-Box setting) or even the training dataset (Data-Free setting), within a low query budget. We propose a novel GAN-based framework that trains the student and generator in tandem to steal the model effectively, while utilizing gradients of the clone network as a proxy to the victim’s gradients. We overcome the large query costs by utilizing publicly available (potentially unrelated) datasets as a weak image prior. We additionally show that even in the absence of such data, it is possible to achieve state-of-the-art results within a low query budget using synthetically crafted samples. We are the first to demonstrate the scalability of Model Stealing on a 100-class dataset.
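To make the tandem training loop concrete, the sketch below illustrates one possible iteration of this style of attack in the hard-label setting: a generator proposes query images, the victim returns only top-1 labels, the student (clone) is fit to those labels with cross-entropy, and the student’s own gradients stand in for the inaccessible victim gradients when updating the generator. The entropy-maximizing generator objective, model names, and hyperparameters here are illustrative assumptions, not the authors’ released implementation.

```python
# Minimal PyTorch-style sketch of data-free model stealing with hard labels.
# `victim`, `student`, and `generator` are assumed callables; the generator
# loss (maximize student-prediction entropy) is a stand-in proxy objective.
import torch
import torch.nn.functional as F

def steal_step(victim, student, generator, s_opt, g_opt,
               z_dim=100, batch=64, device="cpu"):
    # --- Generator step: craft queries near the student's decision boundary,
    # using the student's gradients as a proxy for the victim's gradients.
    z = torch.randn(batch, z_dim, device=device)
    x = generator(z)
    probs = F.softmax(student(x), dim=1)
    entropy = -(probs * probs.clamp_min(1e-8).log()).sum(dim=1).mean()
    g_loss = -entropy  # maximize entropy => low-confidence student predictions
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()

    # --- Student step: query the victim for top-1 (hard) labels only and
    # fit the student with standard cross-entropy on those labels.
    with torch.no_grad():
        x_q = generator(torch.randn(batch, z_dim, device=device))
        y_hard = victim(x_q).argmax(dim=1)  # top-1 labels, no softmax scores
    s_loss = F.cross_entropy(student(x_q), y_hard)
    s_opt.zero_grad()
    s_loss.backward()
    s_opt.step()
    return g_loss.item(), s_loss.item()
```

In practice, the batch of generated queries could be mixed with (or replaced by) samples from a publicly available, potentially unrelated dataset to serve as the weak image prior mentioned above and reduce the number of victim queries.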
