Poster in Workshop: Multimodal Representation Learning (MRL): Perks and Pitfalls
The Role of Pre-training Data in Transfer Learning
Rahim Entezari · Mitchell Wortsman · Olga Saukh · Moein Shariatnia · Hanie Sedghi · Ludwig Schmidt
Keywords: [ transfer learning ] [ LAION ] [ data curation ] [ self-supervised learning ] [ pre-training ] [ CLIP ] [ supervised learning ]
We explore which pre-training dataset should be used to achieve the best transfer learning performance. We investigate the impact of pre-training on few-shot and full fine-tuning performance using 7 pre-training datasets and 9 downstream datasets. Through extensive controlled experiments, we find that the choice of pre-training dataset is essential for few-shot transfer, but that its role decreases as more data is made available for fine-tuning. Additionally, we explore the role of data curation and examine the trade-offs between label noise and the size of the pre-training dataset. We find that using 2000× more pre-training data from LAION can match the performance of supervised ImageNet pre-training.
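To make the few-shot transfer protocol concrete, below is a minimal sketch of one common instantiation: extract features from a frozen pre-trained CLIP image encoder and fit a linear probe on k labeled examples per class. The model name ("ViT-B-32"), the LAION pre-training tag ("laion400m_e32"), and CIFAR-10 as the downstream task are illustrative assumptions, not necessarily the paper's exact setup.

```python
# Sketch: k-shot linear-probe evaluation of a pre-trained image encoder.
# Assumptions (not from the paper): open_clip ViT-B-32 trained on
# LAION-400M, CIFAR-10 as the downstream dataset.
import numpy as np
import torch
import open_clip
from sklearn.linear_model import LogisticRegression
from torchvision.datasets import CIFAR10

model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="laion400m_e32")
model.eval()

def encode(dataset, indices):
    """Encode a subset of images with the frozen pre-trained encoder."""
    feats = []
    with torch.no_grad():
        for i in indices:
            img, _ = dataset[i]  # PIL image, label
            feats.append(model.encode_image(preprocess(img).unsqueeze(0)))
    return torch.cat(feats).numpy()

train = CIFAR10(root=".", train=True, download=True)
test = CIFAR10(root=".", train=False, download=True)

k = 5  # shots per class
labels = np.array(train.targets)
# Take the first k examples of each of the 10 classes.
few_shot_idx = np.concatenate(
    [np.flatnonzero(labels == c)[:k] for c in range(10)])

X_train = encode(train, few_shot_idx)
y_train = labels[few_shot_idx]
test_idx = np.arange(1000)  # subsample the test set for speed
X_test = encode(test, test_idx)
y_test = np.array(test.targets)[test_idx]

# Linear probe on frozen features; accuracy reflects transfer quality
# of the chosen pre-training dataset at this shot count.
probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(f"{k}-shot linear-probe accuracy: {probe.score(X_test, y_test):.3f}")
```

Swapping the `pretrained` tag (or the checkpoint entirely) while holding the downstream data and probe fixed is what isolates the effect of the pre-training dataset in experiments like these.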