ContrastiveExplanation (Foil Trees)
Contrastive Explanation explains why an instance had its current outcome (the fact) rather than a targeted outcome of interest (the foil). These counterfactual explanations are limited to the features that distinguish fact from foil, and disregard irrelevant features. The Python package ContrastiveExplanation implements this idea.
Initialization: The optimizer is defined internally in TensorFlow (TF). We first load our MNIST classifier and the (optional) auto-encoder. The example below uses Keras or TF models.
Explanation: We can finally explain the instance:
explanation = cem.explain(X)
The explain method returns an Explanation object with the following attributes:
X: original instance
X_pred: predicted class of original instance
PN or PP: Pertinent Negative or Pertinent Positive
PN_pred or PP_pred: predicted class of PN or PP
Numerical Gradients: So far, the whole optimization problem could be defined within the internal TF graph, making autodiff possible. It is possible, however, that we do not have access to the model architecture and weights, and are only given a predict function that returns probabilities for each class. In that case the gradients have to be estimated numerically from calls to the predict function.
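To illustrate the black-box setting, here is a minimal sketch of how a gradient can be estimated from a predict function alone, using central finite differences. This is not the library's implementation: the `predict` function is a hypothetical toy model, and `numerical_gradient`, `class_idx`, and `eps` are names introduced here for illustration only.

```python
import math
from typing import Callable, List, Sequence


def predict(x: Sequence[float]) -> List[float]:
    """Hypothetical black-box model: softmax over two fixed linear scores."""
    scores = [2.0 * x[0] - x[1], -x[0] + 0.5 * x[1]]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def numerical_gradient(
    predict_fn: Callable[[Sequence[float]], List[float]],
    x: Sequence[float],
    class_idx: int,
    eps: float = 1e-4,
) -> List[float]:
    """Central-difference estimate of d predict_fn(x)[class_idx] / dx."""
    grad = []
    for i in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] += eps   # perturb feature i upwards
        x_minus[i] -= eps  # perturb feature i downwards
        diff = predict_fn(x_plus)[class_idx] - predict_fn(x_minus)[class_idx]
        grad.append(diff / (2.0 * eps))
    return grad


# Only predict() is queried; no weights or architecture are needed.
x0 = [0.3, -0.2]
grad_class0 = numerical_gradient(predict, x0, class_idx=0)
```

Each gradient estimate costs two predict calls per feature, so for image-sized inputs such as MNIST a practical implementation would likely batch the perturbed inputs or perturb groups of features together. The step size `eps` trades truncation error against floating-point noise and usually needs tuning per model.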