In-Person Poster presentation / poster accept
REPAIR: REnormalizing Permuted Activations for Interpolation Repair
Keller Jordan · Hanie Sedghi · Olga Saukh · Rahim Entezari · Behnam Neyshabur
MH1-2-3-4 #64
Keywords: [ mode connectivity ] [ loss landscape ] [ invariance ] [ barrier ] [ permutation ] [ deep learning ] [ Deep Learning and representational learning ]
In this paper we empirically investigate the conjecture of Entezari et al. (2021), which states that if permutation invariance is taken into account, then there should be no loss barrier to the linear interpolation between SGD solutions. We conduct our investigation using standard computer vision architectures trained on CIFAR-10 and ImageNet. First, we observe a general phenomenon in which interpolated deep networks suffer a collapse in the variance of their activations. We demonstrate that an appropriate rescaling of the pre-activations of the interpolated networks ameliorates this problem and significantly reduces the barrier. Second, by combining this with an algorithm for finding permutations based on maximizing correlations between the activations of matched neurons, we are able to reduce the interpolation barrier for a standard ResNet18 trained on CIFAR-10 to 1.5% absolute test error. We explore the interaction between our method and the choice of normalization layer, and demonstrate its robustness across a variety of architectures and training sets.
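
The two ingredients described in the abstract (correlation-based neuron matching and rescaling of pre-activation statistics) can be illustrated on a single toy linear layer. The sketch below is not the authors' released code: all names (W_a, preacts, repair scale, etc.) are illustrative, it works on pre-activations of one layer only, and it estimates statistics directly on a data batch, whereas the full method is applied layer by layer across a deep network.

import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(0)

# Toy setup: one linear layer from each of two "trained" networks (A and B)
# and a batch of input data.
d_in, d_out, n = 32, 16, 1024
X = rng.normal(size=(n, d_in))
W_a, b_a = rng.normal(size=(d_out, d_in)), rng.normal(size=d_out)
W_b, b_b = rng.normal(size=(d_out, d_in)), rng.normal(size=d_out)

def preacts(W, b):
    return X @ W.T + b                                  # pre-activations on the data

# Step 1: align B's units to A's by maximizing activation correlations
# (Hungarian assignment on the cross-correlation matrix).
corr = np.corrcoef(preacts(W_a, b_a).T, preacts(W_b, b_b).T)[:d_out, d_out:]
_, perm = linear_sum_assignment(corr, maximize=True)
W_b, b_b = W_b[perm], b_b[perm]

# Step 2: naive weight interpolation of the aligned layers.
alpha = 0.5
W_i = (1 - alpha) * W_a + alpha * W_b
b_i = (1 - alpha) * b_a + alpha * b_b

def stats(W, b):
    z = preacts(W, b)
    return z.mean(axis=0), z.std(axis=0)

mu_a, sd_a = stats(W_a, b_a)
mu_b, sd_b = stats(W_b, b_b)
mu_i, sd_i = stats(W_i, b_i)

# Step 3: REPAIR-style rescaling. Each interpolated unit's pre-activation
# mean/std is reset to the interpolation of the endpoint statistics, which
# counteracts the variance collapse described in the abstract.
mu_goal = (1 - alpha) * mu_a + alpha * mu_b
sd_goal = (1 - alpha) * sd_a + alpha * sd_b
scale = sd_goal / sd_i
W_rep = W_i * scale[:, None]
b_rep = (b_i - mu_i) * scale + mu_goal

print("std ratio before repair:", float((sd_i / sd_goal).mean()))
print("std ratio after  repair:", float((stats(W_rep, b_rep)[1] / sd_goal).mean()))

In a real multi-layer network, permuting one layer's output units also requires permuting the next layer's input weights, and the goal statistics are typically estimated with forward passes over training data (for example via temporarily inserted normalization layers) rather than in closed form as above.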