Oral in Workshop: What do we need for successful domain generalization?
Avoiding Catastrophic Referral Failures In Medical Images Under Domain Shift
Developing robust approaches for domain generalization is critical for the real-world deployment of deep learning models. Here, we address a particular domain generalization challenge: selective classification for automated medical image diagnosis. In this setting, models must learn to abstain from making predictions when label confidence is low, especially when tested on samples that deviate significantly from the training set (covariate shift). Using the example of diabetic retinopathy detection, we show that even state-of-the-art deep learning models, including Bayesian networks, fail during selective classification under covariate shift. Bayesian estimates of predictive uncertainty do not generalize well under covariate shift, yielding catastrophic performance drops during referral. We identify the source of these failures and propose several post hoc referral solutions that enable reliable selective classification under covariate shift.
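To make the referral setting concrete, the following is a minimal sketch of selective classification for a binary classifier. It is an illustration under stated assumptions, not the method evaluated in this work: it assumes predictive entropy as the uncertainty score, a 0.5 decision threshold, and toy probabilities and labels invented for the example. The function names `predictive_entropy` and `accuracy_after_referral` are hypothetical.

```python
import numpy as np


def predictive_entropy(p):
    """Binary predictive entropy (in nats) of positive-class probabilities p."""
    p = np.clip(np.asarray(p, dtype=float), 1e-12, 1 - 1e-12)
    return -(p * np.log(p) + (1 - p) * np.log(1 - p))


def accuracy_after_referral(probs, labels, referral_rate):
    """Refer the most uncertain fraction of cases and score the remainder.

    Assumption: uncertainty is measured by predictive entropy, and retained
    cases are classified by thresholding the probability at 0.5.
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    n_refer = int(round(referral_rate * len(probs)))
    # Sort from most certain to most uncertain, then keep the confident cases.
    order = np.argsort(predictive_entropy(probs), kind="stable")
    keep = order[: len(probs) - n_refer]
    if len(keep) == 0:
        return float("nan")  # everything was referred
    preds = (probs[keep] >= 0.5).astype(int)
    return float(np.mean(preds == labels[keep]))


# Toy data (invented for illustration): two borderline cases are misclassified.
probs = np.array([0.95, 0.05, 0.55, 0.45, 0.90, 0.60])
labels = np.array([1, 0, 0, 1, 1, 1])

acc_no_referral = accuracy_after_referral(probs, labels, 0.0)
acc_one_third = accuracy_after_referral(probs, labels, 1 / 3)
print(acc_no_referral, acc_one_third)
```

In this toy case the errors coincide with the highest-entropy predictions, so referring them raises accuracy on the retained cases. The failure described above corresponds to the opposite situation: if uncertainty stops tracking errors under covariate shift, referral removes the wrong cases and accuracy on the retained set can fall as the referral rate grows.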