Invited Talk (live)
in
Workshop: GroundedML: Anchoring Machine Learning in Classical Algorithmic Theory
Deep learning theory vs traditional theory of algorithms
Sanjeev Arora
Having spent a significant part of my career in traditional algorithm design and complexity analysis before switching to the theory of machine learning (especially deep learning), I will survey differences between the two fields. An important one is that machine learning often concerns a "ghost" objective (test loss, or performance on new downstream tasks) that was unavailable during training. "Goodness" of the solution is undefined a priori, and theory necessarily has to create the notion of goodness as part of the analysis. It is increasingly clear that the key missing piece here is a better understanding of the trajectory of model parameters during training. For instance, classic notions of "size"/"capacity"/"representation power," or even the "goal" of the training, turn out to have no independent meaning without understanding the trajectory. This will be illustrated using several interesting theoretical developments as well as experiments from recent years, leading to an agenda for future work.