
Invited talk in Workshop: Physics for Machine Learning

Scaling laws for deep neural networks: driving theory and understanding through experimental insights

Yasaman Bahri


Abstract:

It has been observed empirically that the performance of deep neural networks often follows a power law as simple scaling variables, such as the amount of training data or the number of model parameters, are varied. We would like to understand the origins of these empirical observations. We take a physicist's approach to this question through the pillars of exactly solvable models, perturbation theory, and empirically motivated assumptions about natural data. By starting from a simple, controlled theoretical setting, testing our predictions against experiments, and extrapolating to more realistic settings, we can propose a natural classification of scaling regimes driven by different underlying mechanisms.
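For readers unfamiliar with the form of such empirical fits, the sketch below (not taken from the talk; all values and the specific functional form are hypothetical) illustrates how a power law with an irreducible loss floor, L(D) = a * D^(-alpha) + c, can be fit to loss-versus-dataset-size measurements of the kind scaling-law studies rely on.

```python
# Minimal sketch, assuming a power-law-plus-floor fit; data are hypothetical.
import numpy as np
from scipy.optimize import curve_fit

def power_law(D, a, alpha, c):
    """Loss as a function of dataset size D: a * D^-alpha plus an irreducible floor c."""
    return a * np.power(D, -alpha) + c

# Hypothetical measurements (dataset sizes and observed test losses).
D = np.array([1e3, 3e3, 1e4, 3e4, 1e5, 3e5])
loss = np.array([2.10, 1.62, 1.31, 1.12, 0.98, 0.91])

# Fit the amplitude a, exponent alpha, and loss floor c.
(a, alpha, c), _ = curve_fit(power_law, D, loss, p0=[10.0, 0.3, 0.5])
print(f"fitted exponent alpha = {alpha:.3f}, irreducible loss c = {c:.3f}")
```

The fitted exponent alpha is the quantity whose origin the talk seeks to explain in terms of different underlying mechanisms.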
