Poster in Workshop: Pitfalls of limited data and computation for Trustworthy ML
Feature-Interpretable Real Concept Drift Detection
Pranoy Panda · Vineeth Balasubramanian · Gaurav Sinha
Classifiers deployed in production degrade in performance due to changes in the posterior distribution, a phenomenon referred to as real concept drift. Knowledge of such distribution shifts is helpful for two main reasons: (i) it helps retain classifier performance over time by indicating when to retrain; and (ii) it helps us understand how the relationship between input features and output labels has shifted, which can be valuable for business analytics (e.g., understanding a change in demand helps manage inventory) or scientific study (e.g., understanding virus behavior across changing demographics helps distribute drugs better). An interpretable real concept drift detection method is ideal for achieving this knowledge. Existing interpretable methods in this space track only covariate shift and are therefore insensitive to the optimal decision boundary (the true posterior distribution) and vulnerable to benign drifts in streaming data. Our work addresses this issue by proposing an interpretable method that leverages gradients of a classifier in a feature-wise hypothesis-testing framework to detect real concept drift. We also extend our method to a more realistic unsupervised setting where labels are not available to detect drift. Our experiments on various datasets show that the proposed method outperforms existing interpretable methods and performs on par with state-of-the-art supervised drift detection methods with respect to average model classification accuracy. Qualitatively, our method identifies the features relevant to the drift in the USENET2 dataset, thereby providing interpretability alongside accurate drift detection.
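To make the core idea concrete, below is a minimal sketch of how a gradient-based, feature-wise hypothesis test for drift might look. It is not the authors' implementation: the choice of a two-sample Kolmogorov–Smirnov test per feature, the Bonferroni correction, the PyTorch classifier, and all function names and thresholds are illustrative assumptions.

```python
# Illustrative sketch (not the paper's method): compare per-feature gradient
# statistics between a reference window and a current window of streaming data
# using a two-sample Kolmogorov-Smirnov test per feature.
import numpy as np
import torch
from scipy.stats import ks_2samp


def feature_gradients(model, X, y, loss_fn):
    """Per-sample gradients of the loss w.r.t. each input feature."""
    model.zero_grad()
    X = X.clone().detach().requires_grad_(True)
    loss = loss_fn(model(X), y)
    loss.backward()
    return X.grad.detach().cpu().numpy()     # shape: (n_samples, n_features)


def detect_drift(model, ref_batch, cur_batch, loss_fn, alpha=0.01):
    """Flag drift if any feature's gradient distribution shifts significantly.

    Returns (drift_detected, per-feature p-values), so the features driving
    the alarm can be inspected -- this is where interpretability comes from
    in this sketch.
    """
    g_ref = feature_gradients(model, *ref_batch, loss_fn)
    g_cur = feature_gradients(model, *cur_batch, loss_fn)
    n_features = g_ref.shape[1]
    p_values = np.array([
        ks_2samp(g_ref[:, j], g_cur[:, j]).pvalue for j in range(n_features)
    ])
    # Bonferroni correction across the feature-wise tests.
    drift_detected = bool((p_values < alpha / n_features).any())
    return drift_detected, p_values
```

Because the test statistic is computed from loss gradients rather than from the input distribution alone, a shift that leaves the covariates unchanged but moves the decision boundary can still raise an alarm, which is the distinction between real concept drift and benign covariate drift highlighted in the abstract.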