

Invited Talk + Q&A in Workshop: Pitfalls of limited data and computation for Trustworthy ML

What Neural Networks Memorize and Why (Vitaly Feldman)


Abstract:

Deep learning algorithms tend to fit the entire training dataset, thereby memorizing even noisy labels. In addition, complex models have been shown to memorize entire input examples, including seemingly irrelevant information (for example, social security numbers from text). This puzzling propensity to memorize seemingly useless data is not explained by existing theories of machine learning. We provide simple conceptual explanations and theoretical models demonstrating that memorization of labels and training examples is necessary for achieving close-to-optimal generalization error when learning from long-tailed data distributions. This holds despite the fact that most of that information is ultimately irrelevant to the learning task at hand. Our results allow us to quantify the cost of limiting memorization in learning and explain the disparate effects that privacy and model compression have on different subpopulations. Finally, we demonstrate the utility of memorization and support our explanation empirically. These results rely on a new technique for efficiently estimating the memorization and influence of individual training data points.
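
The abstract mentions a technique for efficiently estimating memorization and influence of training data points but does not describe it here. The sketch below is a minimal, illustrative subsampling-style estimator of a per-example memorization score (accuracy on an example when it appears in the training set minus accuracy when it is held out), not the speaker's implementation; `train_predict`, the parameter values, and all names are placeholders assumed for the example.

```python
# Minimal sketch (illustrative only) of a subsampling estimator for a
# per-example memorization score:
#   mem(i) ~= P[model trained WITH example i labels x_i correctly]
#           - P[model trained WITHOUT example i labels x_i correctly],
# averaged over many models trained on random subsets of the data.
import numpy as np

def estimate_memorization(X, y, train_predict, n_trials=100,
                          subset_frac=0.7, seed=None):
    """Estimate memorization scores for each training example.

    X, y           : numpy arrays holding the training examples and labels.
    train_predict  : callable (X_train, y_train, X_eval) -> predicted labels;
                     a placeholder for whatever learning algorithm is studied.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    correct_in = np.zeros(n)   # correct predictions when example i was in the subset
    count_in = np.zeros(n)
    correct_out = np.zeros(n)  # correct predictions when example i was left out
    count_out = np.zeros(n)

    for _ in range(n_trials):
        subset = rng.random(n) < subset_frac            # random training subset
        preds = train_predict(X[subset], y[subset], X)  # evaluate on all examples
        hits = (preds == y).astype(float)
        correct_in[subset] += hits[subset]
        count_in[subset] += 1
        correct_out[~subset] += hits[~subset]
        count_out[~subset] += 1

    # Gap between accuracy on x_i with and without x_i in the training set;
    # entries with no trials on one side become NaN.
    with np.errstate(invalid="ignore", divide="ignore"):
        return correct_in / count_in - correct_out / count_out
```

In practice, `train_predict` would wrap the training pipeline being analyzed, and the same per-trial predictions could be reused to estimate the influence of one training example on another.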
