Track: Oral 7C

Fri 10 May 1:00 - 1:15 PDT

Less is More: Fewer Interpretable Region via Submodular Subset Selection

Ruoyu Chen · Hua Zhang · Siyuan Liang · Jingzhi Li · Xiaochun Cao

Image attribution algorithms aim to identify the regions most relevant to a model's decisions. Although existing attribution methods can effectively assign importance to target elements, they face two challenges: 1) they generate inaccurate small regions, which misleads the direction of correct attribution, and 2) they cannot produce good attribution results for samples with wrong predictions. To address these challenges, this paper re-models image attribution as a submodular subset selection problem, aiming to enhance model interpretability using fewer regions. To address the lack of attention to local regions, we construct a novel submodular function to discover more accurate fine-grained interpretation regions. To enhance the attribution effect for all samples, we also impose four constraints on the selection of sub-regions, i.e., confidence, effectiveness, consistency, and collaboration scores, to assess the importance of various subsets. Moreover, we theoretically analyze the link between the validity of the submodular function and the four constraints. Extensive experiments show that the proposed method outperforms SOTA methods on two face datasets (Celeb-A and VGG-Face2) and one fine-grained dataset (CUB-200-2011). For correctly predicted samples, the proposed method improves the Deletion and Insertion scores by an average of 4.9% and 2.5%, respectively, relative to HSIC-Attribution. For incorrectly predicted samples, our method achieves gains of 81.0% and 18.4% over HSIC-Attribution in the average highest confidence and Insertion score, respectively.
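As a rough illustration of what submodular subset selection looks like in general, the sketch below runs the standard greedy procedure over a toy coverage function. It is not the paper's scoring function: the names `greedy_select`, `coverage_score`, and `COVERAGE` are hypothetical, and the coverage sets merely stand in for the confidence, effectiveness, consistency, and collaboration scores mentioned in the abstract.

```python
from typing import Callable, FrozenSet, List

def greedy_select(n_regions: int, k: int,
                  score: Callable[[FrozenSet[int]], float]) -> List[int]:
    """Greedily pick up to k region indices, adding the largest marginal gain each step."""
    selected: List[int] = []
    for _ in range(min(k, n_regions)):
        current = score(frozenset(selected))
        best_gain, best_idx = float("-inf"), -1
        for i in range(n_regions):
            if i in selected:
                continue
            # Marginal gain of adding region i to the current subset.
            gain = score(frozenset(selected + [i])) - current
            if gain > best_gain:
                best_gain, best_idx = gain, i
        selected.append(best_idx)
    return selected

# Toy stand-in (assumption): each region "covers" some hypothetical evidence ids.
# Set coverage is a classic monotone submodular function.
COVERAGE = [{0, 1, 2}, {2, 3}, {3, 4, 5, 6}, {0, 6}]

def coverage_score(subset) -> float:
    covered = set()
    for i in subset:
        covered |= COVERAGE[i]
    return float(len(covered))

order = greedy_select(len(COVERAGE), 3, coverage_score)
print(order)  # regions in the order they were selected
```

In this toy setting the first two selected regions already cover every evidence id, which mirrors the "fewer regions" idea: the ordering produced by the greedy loop ranks regions by how much each one adds given what was already chosen.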

Fri 10 May 1:15 - 1:30 PDT

On the Joint Interaction of Models, Data, and Features

Yiding Jiang · Christina Baek · J Kolter

Learning features from data is one of the defining characteristics of deep learning, but our theoretical understanding of the role features play in deep learning is still rudimentary. To address this gap, we introduce a new tool, the interaction tensor, for empirically analyzing the interaction between data and model through features. With the interaction tensor, we make several key observations about how features are distributed in data and how models with different random seeds learn different features. Based on these observations, we propose a conceptual framework for feature learning. Under this framework, the expected accuracy for a single hypothesis and agreement for a pair of hypotheses can both be derived in closed-form. We demonstrate that the proposed framework can explain empirically observed phenomena, including the recently discovered Generalization Disagreement Equality (GDE) that allows for estimating the generalization error with only unlabeled data. Further, our theory also provides explicit construction of natural data distributions that break the GDE. Thus, we believe this work provides valuable new insight into our understanding of feature learning.
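To make the GDE-based estimate concrete, here is a minimal sketch assuming the usual reading of GDE: the rate at which two models trained with different random seeds disagree on unlabeled inputs is used as a proxy for test error. The function name `disagreement_rate` and the prediction lists are hypothetical, and, as the abstract notes, there are data distributions for which this equality does not hold.

```python
from typing import Sequence

def disagreement_rate(preds_a: Sequence[int], preds_b: Sequence[int]) -> float:
    """Fraction of inputs on which two models predict different labels."""
    if len(preds_a) != len(preds_b) or len(preds_a) == 0:
        raise ValueError("prediction lists must be non-empty and of equal length")
    return sum(a != b for a, b in zip(preds_a, preds_b)) / len(preds_a)

# Hypothetical predictions from two seeds on the same unlabeled inputs.
preds_seed0 = [0, 1, 1, 2, 0, 2, 1, 0]
preds_seed1 = [0, 1, 2, 2, 0, 1, 1, 0]

# No ground-truth labels are used anywhere in this estimate.
estimated_error = disagreement_rate(preds_seed0, preds_seed1)
print(estimated_error)
```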