Should Counterfactual Explanations Always Be Data Instances?

Title: Should Counterfactual Explanations Always Be Data Instances?

Authors: Junqi Jiang, Antonio Rago and Francesca Toni

Conference: XLoKR 2022

Tags: counterfactual explanations, machine learning classifiers and relation-based explanations

Abstract:

Counterfactual explanations (CEs) are an increasingly popular way of explaining machine learning classifiers. Predominantly, they amount to data instances pointing to potential changes to the inputs that would lead to alternative outputs. In this position paper we question the widespread assumption that CEs should always be data instances, and argue instead that in some cases they may be better understood in terms of special types of relations between input features and classification variables. We illustrate how a special type of these relations, amounting to critical influences, can characterise and guide the search for data instances deemed suitable as CEs. These relations also provide compact indications of which input features, rather than their specific values in data instances, have counterfactual value.