Poster
Calibrated Chaos: Variance Between Runs of Neural Network Training is Harmless and Inevitable
Keller Jordan
Halle B
Typical neural network trainings have substantial variance in test-set performance between repeated runs, impeding hyperparameter comparison and training reproducibility. We present the following results towards understanding this variation. (1) We demonstrate that, despite having significant variance on their test-sets, standard CIFAR-10 and ImageNet trainings have very little variance in their performance on the test-distributions from which their test-sets are sampled, suggesting that variance is less of a practical issue than previously thought. (2) We present a simplifying statistical assumption which closely approximates the structure of the test-set accuracy distribution. (3) We prove that test-set variance is unavoidable given the observation that ensembles of independently trained networks are well-calibrated. (4) We conduct preliminary studies of distribution shift, fine-tuning, data augmentation, and learning rate through the lens of variance between runs.
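To make the notion of between-run variance concrete, below is a minimal NumPy sketch of a toy model in which each test example is classified correctly with a fixed, independent per-example probability (the probabilities, the Beta prior, and the set sizes are all illustrative assumptions, not the authors' construction or data). It compares the empirical standard deviation of test-set accuracy across simulated "runs" to the independent-Bernoulli prediction, showing how nontrivial test-set variance can arise even when per-example accuracies are held fixed.

```python
# Illustrative sketch only: a toy independent-Bernoulli model of test-set accuracy,
# not the paper's exact procedure or data.
import numpy as np

rng = np.random.default_rng(0)

n_examples = 10_000   # hypothetical test-set size
n_runs = 1_000        # number of simulated training runs

# Hypothetical per-example probabilities of correct classification
# (e.g., as might be estimated from a large ensemble of independent runs).
p = rng.beta(8.0, 1.0, size=n_examples)   # mean accuracy around 0.89

# Each simulated run: one independent Bernoulli outcome per example, then average.
outcomes = rng.random((n_runs, n_examples)) < p
run_accuracies = outcomes.mean(axis=1)

empirical_std = run_accuracies.std()
predicted_std = np.sqrt(np.sum(p * (1.0 - p))) / n_examples

print(f"mean test-set accuracy          : {run_accuracies.mean():.4f}")
print(f"empirical std between runs      : {empirical_std:.4f}")
print(f"independent-Bernoulli prediction: {predicted_std:.4f}")
```

In this toy setting the two standard deviations agree closely, which is the flavor of structure a simplifying statistical assumption on the test-set accuracy distribution can capture.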