Invited Talk in Workshop: Mathematical and Empirical Understanding of Foundation Models (ME-FoMo)
Invited Talk (Yann Dauphin): Leveraging Multiple Models and Multiple Tasks
Yann Dauphin
Abstract: In recent years, there has been a surge in the number of trained models and datasets shared online. In this talk, we will investigate methods for leveraging this trend. First, we will show that ensembles whose members diverge more in training methodology display categorically different generalization behavior, producing increasingly uncorrelated errors. These models specialize in subdomains of the data, leading to higher ensemble performance: with just two models (each with 76.5% ImageNet accuracy), we can create ensembles reaching 83.4% (a boost of nearly 7 points). Second, we will discuss a method for making use of auxiliary tasks via an algorithm called ATTITTUD. This approach allows fine-grained resolution of conflicts between the gradients of the auxiliary task and the primary task. We will show that it produces significant improvements on benchmark tasks such as CheXpert.
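The first result builds on plain output-space ensembling. Below is a minimal PyTorch sketch of that idea; the tiny stand-in models, input shape, and class count are placeholders of mine, not the ImageNet networks from the talk. The ensemble averages each member's predicted class probabilities, which is exactly where uncorrelated errors pay off.

```python
import torch
import torch.nn as nn

# Stand-ins for two networks trained with diverging methodologies
# (in the talk: real ImageNet models, each at 76.5% accuracy).
model_a = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
model_b = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))

def ensemble_predict(models, x):
    """Average the members' softmax outputs; when members make
    uncorrelated errors, the average beats any single model."""
    with torch.no_grad():
        probs = [torch.softmax(m(x), dim=-1) for m in models]
    return torch.stack(probs).mean(dim=0)  # (batch, num_classes)

x = torch.randn(8, 3, 32, 32)  # dummy batch of 32x32 RGB images
predictions = ensemble_predict([model_a, model_b], x).argmax(dim=-1)
```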
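The second result concerns what to do when the auxiliary task's gradient disagrees with the primary task's. The sketch below illustrates the general idea only: it is a simple single-direction projection heuristic of mine, not the actual ATTITTUD update, whose decomposition of the auxiliary gradient is finer-grained. The function name and the drop-the-conflicting-component rule are assumptions for illustration.

```python
import torch

def combine_gradients(g_primary, g_aux, eps=1e-12):
    """Sketch of gradient-conflict resolution on flattened gradient
    vectors (NOT the exact ATTITTUD algorithm). The auxiliary gradient
    is split into a component along the primary gradient plus an
    orthogonal remainder; the component opposing the primary task is
    dropped, while aligned and neutral components are kept."""
    coef = torch.dot(g_aux, g_primary) / (torch.dot(g_primary, g_primary) + eps)
    parallel = coef * g_primary    # part of g_aux along g_primary
    orthogonal = g_aux - parallel  # "neutral" part, orthogonal to g_primary
    if coef < 0:                   # conflict: auxiliary opposes primary
        parallel = torch.zeros_like(parallel)
    return g_primary + parallel + orthogonal

# Toy usage with random flattened gradients.
g_p, g_a = torch.randn(1000), torch.randn(1000)
update = combine_gradients(g_p, g_a)
```

The "fine-grained" part of the talk's method is that conflicts are resolved over more than the single direction used above, so helpful, harmful, and neutral components of the auxiliary update can each be handled separately.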
Bio: Yann N. Dauphin is a machine learning researcher at Google Research working on understanding the fundamentals of deep learning algorithms and leveraging that understanding in various applications. He has published seminal work on understanding the loss surfaces of neural nets. Prior to joining Google in 2019, he was a researcher at Facebook AI Research from 2015 to 2018, where his work led to award-winning scientific publications and helped improve automatic translation on Facebook.com. He completed his PhD at the University of Montreal under the supervision of Prof. Yoshua Bengio. During that time, he and his team won international machine learning competitions such as the Unsupervised Transfer Learning Challenge in 2013.